fi-lookup-mcp
Resolves messy financial institution records against canonical regulatory identifiers from FDIC, NCUA, and FFIEC public datasets.
README
fi-lookup-mcp
A personal portfolio project demonstrating a tool-use, reconciliation, and lineage-tracing pattern over public regulatory data, implemented as a local MCP (Model Context Protocol) server. It speaks stdio, so it works with any MCP host — Claude Code (CLI) and Claude Desktop are both supported.
Built by Nelson Anievas, with development assisted by Claude Code. Public data only — no proprietary or employer systems involved.
What It Does
This server exposes 11 tools that allow an AI agent to resolve, enrich, and track the history of US financial institution records using canonical regulatory identifiers from FDIC, NCUA, and FFIEC public datasets.
The server handles three distinct patterns:
- Reconciliation: given a dirty external record (e.g.
"Mtn America FCU, Sandy UT"), return ranked candidate matches with confidence scores and match reasons - Lineage tracing: given an RSSD ID, return the full merger, acquisition, rebrand, and consolidation history — predecessors, successors, parent company, and subsidiaries — with real names resolved across 223,750 active and historical institutions
- Change feed: return all transformation events (mergers, failures, rebrands, splits) within a configurable lookback window, filterable by institution type, event type, and state — for dataset maintenance and regulatory monitoring
Tools
search_institutions
Free-text name search across all FDIC banks and NCUA credit unions. Supports filtering by institution type and state. Returns ranked candidates with fuzzy match scores.
get_institution_profile
Full regulatory profile lookup by any identifier — FDIC cert, NCUA charter number, or RSSD ID. Returns all available metadata including regulator, charter type, ABA routing number, deposit account count, and web address.
reconcile_institution
The centerpiece reconciliation tool. Takes a messy external record (name, optional city/state/identifier) and returns ranked candidate matches, each with a confidence score (0–1) and human-readable match reasons.
Scoring blends:
- Name similarity (0.6 weight): token-set ratio + Jaro-Winkler, with abbreviation expansion (FCU → federal credit union, Mtn → mountain, N.A. → national association)
- Geographic agreement (0.4 weight): state match (0.6) + city match (0.4)
- Exact identifier override: if a cert, charter, or RSSD is provided and matches, confidence is set to 1.0
crosswalk_identifiers
Translates between FDIC cert, NCUA charter number, and RSSD ID. Explains regulatory boundaries (e.g. why a credit union has no FDIC cert).
get_institution_history
Returns the full merger, acquisition, and rebrand lineage for any institution by RSSD ID. Resolves real names for both active and defunct predecessor/successor institutions using a 223,750-record historical name lookup built from FFIEC NIC active and closed attributes files. Includes parent company and subsidiary relationships.
Example output for JPMorgan Chase (RSSD 852218): 52 predecessors including Washington Mutual (FDIC-assisted, 2008), Bank One (merger, 2004), and Bear Stearns entities — all with resolved names and dates.
get_recent_changes
A configurable regulatory change feed built from FFIEC NIC Transformations data. Returns mergers, failures, rebrands, splits, and other structural events within a lookback window. Useful for identifying institutions that have changed status and may need dataset updates.
Each event carries the full metadata of both the predecessor and successor (name, type, regulator, city/state, FDIC cert / NCUA charter, ABA routing, deposit accounts, web address). For every predecessor with a portal on record, the tool also fetches its home/login URL and classifies whether it is still operating independently or has been consumed by the acquirer:
independent_portal_live— still served on its own domainconsumed_by_acquirer— redirects to the acquirer's domainredirects_elsewhere— redirects to a third domain (rebrand/division site)unreachable— portal did not respond (likely retired)
Portal checks run concurrently and are reported in a portal_summary tally. Lookups use a one-time RSSD index (O(1)), so the data-only path is near-instant; portal checks are the only network cost and can be tuned or disabled.
Parameters:
days: lookback window (default 365, max 3650)institution_type:"bank","cu", or"all"event_type:"merger","failure","split","rebrand", or"all"state: optional 2-letter state filtercheck_portals: fetch and classify predecessor portals (defaulttrue; setfalsefor an instant data-only feed)max_portal_checks: cap on portals fetched, most-recent first (default 50)
get_top_institutions
Returns the top N institutions ranked by deposit account count, with individual and cumulative market share percentages. Supports filtering by institution type.
export_institutions
Exports the full institution dataset to a CSV file with configurable filters, sorting, and market share calculations.
list_institutions
General-purpose browse/query tool over the complete FDIC + NCUA dataset, exposing all 21 metadata fields per institution (every other tool returns a trimmed projection). One tool that is searchable, filterable, sortable, and exportable:
- Search: case-insensitive substring across any subset of fields (
search_fields, or"all") - Filter: institution type; state (accepts
UTorUtah); min/max deposit accounts;has_routing,has_rssd,has_history - Sort: any field, ascending or descending (numeric fields sort numerically)
- Page:
limit/offsetwithhas_more/next_offsetfor inline browsing;fieldsprojects a subset - Export: set
export_pathto write all matched rows (not just the page) tocsvorjson; bare filenames default under~/Desktop, written atomically
Fields: name, type, source, regulator, city, state, fdic_cert, ncua_charter, rssdid, aba_routing, deposit_accounts, total_assets, web_address, charter_type, charter_type_desc, inst_category, parent_rssd, predecessor_count, successor_count, subsidiary_count.
refresh_cache
Rebuilds the local data snapshot from scratch — re-fetches FDIC data from the BankFind API (latest quarter auto-discovered), auto-downloads the newest NCUA quarterly ZIP, and re-reads the local FFIEC ZIPs. Runs the full NIC enrichment pipeline. Reports the data_as_of date for each source.
refresh_if_changed
Cost-effective conditional refresh: fingerprints all sources (FFIEC ZIP content hashes + latest FDIC/NCUA quarter) and rebuilds only when something actually changed, otherwise skips the expensive reprocessing and returns changed: false. This is the tool the monthly scheduler runs — see Scheduled updates.
Data Sources
All data is public regulatory data. No licensed or proprietary sources.
| Source | Data | Refresh |
|---|---|---|
| FDIC BankFind API | ~4,274 active banks: name, location, cert, RSSD, web address | API call |
| FDIC Financials API | Deposit account counts from most recent quarter | API call |
| NCUA Quarterly ZIP | ~4,336 active credit unions; deposit counts from FS220A; web addresses from FS220D | Manual download |
| FFIEC NIC Active Attributes | ABA primary routing numbers; joined via RSSD/cert/charter | Manual download |
| FFIEC NIC Closed Attributes | Historical institution names for 161,950 defunct entities | Manual download |
| FFIEC NIC Transformations | 59,071 merger/acquisition/rebrand/failure events | Manual download |
| FFIEC NIC Relationships | Parent/subsidiary/branch ownership structure | Manual download |
Total universe: 8,610 active institutions + 223,750 name-resolved historical records
Architecture
Claude Code / Claude Desktop (any MCP host)
|
| MCP stdio transport
v
server.py (FastMCP 3.4.2)
|
+-- search_institutions
+-- get_institution_profile
+-- reconcile_institution --> reconciler.py
+-- crosswalk_identifiers
+-- get_institution_history --> nic_names lookup (223,750 records)
+-- get_recent_changes --> CSV_TRANSFORMATIONS.zip
+-- get_top_institutions
+-- export_institutions
+-- list_institutions --> full dataset: search / filter / sort / export
+-- refresh_cache --> full rebuild (FDIC live + NCUA auto-download + FFIEC)
+-- refresh_if_changed --> conditional rebuild (monthly launchd job)
|
v
data_loader.py nic_loader.py
| |
+-- cache/fdic_institutions.json (NIC-enriched)
+-- cache/ncua_institutions.json (NIC-enriched)
+-- cache/call-report-data-*.zip
+-- cache/CSV_ATTRIBUTES_ACTIVE.zip
+-- cache/CSV_ATTRIBUTES_CLOSED.zip
+-- cache/CSV_TRANSFORMATIONS.zip
+-- cache/CSV_RELATIONSHIPS.zip
Key design decisions:
- Local cache first: runs fully offline after initial build; warm start skips live API calls
- NIC enrichment at save time: predecessor/successor/parent/subsidiary fields are written into the JSON cache so subsequent warm starts load enriched data instantly
- Atomic cache writes:
.tmprename pattern prevents corruption on interrupted writes - Stderr-only logging: never pollutes the MCP stdio JSON channel
- Abbreviation-aware normalization: improves recall on dirty external records
Local Data Setup
The cache/ directory is not committed to Git — populate it manually before first run.
Required downloads
| File | Source |
|---|---|
cache/CSV_ATTRIBUTES_ACTIVE.zip |
FFIEC NIC Data Download — Active Attributes |
cache/CSV_ATTRIBUTES_CLOSED.zip |
FFIEC NIC Data Download — Closed Attributes |
cache/CSV_TRANSFORMATIONS.zip |
FFIEC NIC Data Download — Transformations |
cache/CSV_RELATIONSHIPS.zip |
FFIEC NIC Data Download — Relationships |
FDIC is fetched live from the FDIC BankFind API (latest quarter auto-discovered) and NCUA quarterly ZIPs are now auto-downloaded — neither needs a manual download. Only the four FFIEC NIC ZIPs above must be placed in cache/ by hand, because FFIEC's bulk download is gated against scripted requests.
Scheduled updates
Each record carries a data_as_of date, and the snapshot keeps itself current with a cost-aware refresh strategy:
- FDIC / NCUA — auto-fetch the newest published quarter on every refresh.
- FFIEC — refreshed by dropping new ZIPs into
cache/(the bulk download is 403-gated to scripts, so it can't be auto-pulled). A content hash detects the change. refresh_if_changedrebuilds only when a source actually changed; a no-op run does cheap fingerprint checks (~0.3s CPU) and skips the expensive NIC reprocessing.
A monthly launchd job runs scheduled_refresh.py (which calls refresh_if_changed) at 03:00 on the 1st, logging to cache/refresh.log:
# Install / reload the monthly agent
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.fi-lookup.monthly-refresh.plist
# Run it once on demand
launchctl kickstart -k gui/$(id -u)/com.fi-lookup.monthly-refresh
# Remove it
launchctl bootout gui/$(id -u)/com.fi-lookup.monthly-refresh
Recommended cadence: monthly (bump to weekly only if you depend on the merger change-feed being current within days). The guard makes extra runs nearly free, so erring toward more frequent checks costs little.
Setup
Prerequisites
- Python 3.11+
- An MCP host — Claude Code (CLI) or Claude Desktop
Install
git clone https://github.com/nlsnnvas/fi-lookup-mcp.git
cd fi-lookup-mcp
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Download manual data files
Download the five ZIPs listed in the table above and place them in cache/. FFIEC files are available at ffiec.gov/npw/FinancialReport/DataDownload.
Build the data snapshot
python -c "import asyncio; from data_loader import build_snapshot; asyncio.run(build_snapshot())"
This fetches FDIC data live, reads all local ZIPs, runs NIC enrichment, and writes the JSON cache. Expect 2–3 minutes on first run.
Connect to an MCP host
Claude Code (CLI) — register the server with the venv interpreter:
claude mcp add fi-lookup -- "$(pwd)/.venv/bin/python" "$(pwd)/server.py"
Verify it loaded with claude mcp list, then the tools are available in any claude session in that scope.
Claude Desktop:
fastmcp install claude-desktop server.py --name "fi-lookup"
Then restart Claude Desktop.
Web dashboard (FI Explorer)
A local web UI over the same dataset and tools — no MCP client required. Built with Starlette + uvicorn (both ship with FastMCP, so no extra dependencies).
python web_app.py # serves http://127.0.0.1:8765
python web_app.py --port 9000 # custom port
Four tabs:
- Browse — searchable / filterable / sortable table over all institutions with every metadata field, plus CSV/JSON export (wraps
list_institutions) - Profile & Lineage — enter an RSSD ID for merger/acquisition lineage: predecessors, successors, parent, subsidiaries (wraps
get_institution_history) - Recent Changes — merger/failure/rebrand/split feed with optional portal verification, independent-vs-consumed (wraps
get_recent_changes) - Reconcile — paste a messy record for ranked candidate matches with confidence scores (wraps
reconcile_institution)
It is read-only and bound to 127.0.0.1 (localhost only) by default. It has no authentication, so do not expose it to a network or the internet as-is — see the note below. To reach it from another machine on a trusted LAN for a quick demo, run python web_app.py --host 0.0.0.0 and connect to http://<this-machine-ip>:8765; for anything beyond that, add authentication and serve it behind a proxy/tunnel.
Example Interactions
Reconciliation:
"I have a vendor row that says 'Mtn America FCU, Sandy UT' — what is it?"
reconcile_institution scores ~8,610 institutions and returns Mountain America Credit Union (NCUA #24692) at 0.984 confidence, with ABA routing, deposit account count, and charter type.
Lineage tracing:
"What is the full acquisition history of Bank of America?"
get_institution_history returns 117 predecessor institutions going back to 1960, including the 1998 BankAmerica merger, the 2008 Countrywide acquisition, and the 2009 Merrill Lynch absorption — all with resolved names and dates.
Change feed:
"What bank failures and mergers happened in the last 90 days?"
get_recent_changes returns 108 events grouped by type: 1 FDIC-assisted failure, 107 mergers — including Meadows Bank absorbed by AMERICA FIRST Credit Union and two bank-to-credit-union conversions.
Why This Pattern Matters
Financial institution data is notoriously messy. The patterns here are directly applicable to:
- Matching vendor/counterparty records to a canonical institution master
- Tracing merger lineage for compliance, KYC, or data governance
- Building regulatory change feeds for dataset maintenance automation
- Enriching internal datasets with public regulatory metadata
- Onboarding automation that maps free-text institution names to stable IDs
This project re-expresses reconciliation and lineage patterns from production AI agent work, using only public data.
Stack
- Python 3.11
- FastMCP 3.4.2
- rapidfuzz (fuzzy string matching)
- httpx (async HTTP)
- Claude Code / Claude Desktop (MCP host)
Framing Note
This is a tool-use, reconciliation, and lineage-tracing pattern — not RAG. The model calls structured tools that execute deterministic scoring and lookup logic against a pre-built regulatory snapshot and return ranked, explainable results. The NIC enrichment pipeline runs at startup and writes enriched data to the JSON cache, so subsequent tool calls are fast in-memory lookups.
推荐服务器
Baidu Map
百度地图核心API现已全面兼容MCP协议,是国内首家兼容MCP协议的地图服务商。
Playwright MCP Server
一个模型上下文协议服务器,它使大型语言模型能够通过结构化的可访问性快照与网页进行交互,而无需视觉模型或屏幕截图。
Magic Component Platform (MCP)
一个由人工智能驱动的工具,可以从自然语言描述生成现代化的用户界面组件,并与流行的集成开发环境(IDE)集成,从而简化用户界面开发流程。
Audiense Insights MCP Server
通过模型上下文协议启用与 Audiense Insights 账户的交互,从而促进营销洞察和受众数据的提取和分析,包括人口统计信息、行为和影响者互动。
VeyraX
一个单一的 MCP 工具,连接你所有喜爱的工具:Gmail、日历以及其他 40 多个工具。
graphlit-mcp-server
模型上下文协议 (MCP) 服务器实现了 MCP 客户端与 Graphlit 服务之间的集成。 除了网络爬取之外,还可以将任何内容(从 Slack 到 Gmail 再到播客订阅源)导入到 Graphlit 项目中,然后从 MCP 客户端检索相关内容。
Kagi MCP Server
一个 MCP 服务器,集成了 Kagi 搜索功能和 Claude AI,使 Claude 能够在回答需要最新信息的问题时执行实时网络搜索。
e2b-mcp-server
使用 MCP 通过 e2b 运行代码。
Neon MCP Server
用于与 Neon 管理 API 和数据库交互的 MCP 服务器
Exa MCP Server
模型上下文协议(MCP)服务器允许像 Claude 这样的 AI 助手使用 Exa AI 搜索 API 进行网络搜索。这种设置允许 AI 模型以安全和受控的方式获取实时的网络信息。