MCP 服务器

fi-lookup-mcp

Resolves messy financial institution records against canonical regulatory identifiers from FDIC, NCUA, and FFIEC public datasets.

README

fi-lookup-mcp

A personal portfolio project demonstrating a tool-use, reconciliation, and lineage-tracing pattern over public regulatory data, implemented as a local MCP (Model Context Protocol) server. It speaks stdio, so it works with any MCP host — Claude Code (CLI) and Claude Desktop are both supported.

Built by Nelson Anievas, with development assisted by Claude Code. Public data only — no proprietary or employer systems involved.

What It Does

This server exposes 11 tools that allow an AI agent to resolve, enrich, and track the history of US financial institution records using canonical regulatory identifiers from FDIC, NCUA, and FFIEC public datasets.

The server handles three distinct patterns:

Reconciliation: given a dirty external record (e.g. "Mtn America FCU, Sandy UT"), return ranked candidate matches with confidence scores and match reasons
Lineage tracing: given an RSSD ID, return the full merger, acquisition, rebrand, and consolidation history — predecessors, successors, parent company, and subsidiaries — with real names resolved across 223,750 active and historical institutions
Change feed: return all transformation events (mergers, failures, rebrands, splits) within a configurable lookback window, filterable by institution type, event type, and state — for dataset maintenance and regulatory monitoring

Tools

`search_institutions`

Free-text name search across all FDIC banks and NCUA credit unions. Supports filtering by institution type and state. Returns ranked candidates with fuzzy match scores.

`get_institution_profile`

Full regulatory profile lookup by any identifier — FDIC cert, NCUA charter number, or RSSD ID. Returns all available metadata including regulator, charter type, ABA routing number, deposit account count, and web address.

`reconcile_institution`

The centerpiece reconciliation tool. Takes a messy external record (name, optional city/state/identifier) and returns ranked candidate matches, each with a confidence score (0–1) and human-readable match reasons.

Scoring blends:

Name similarity (0.6 weight): token-set ratio + Jaro-Winkler, with abbreviation expansion (FCU → federal credit union, Mtn → mountain, N.A. → national association)
Geographic agreement (0.4 weight): state match (0.6) + city match (0.4)
Exact identifier override: if a cert, charter, or RSSD is provided and matches, confidence is set to 1.0

`crosswalk_identifiers`

Translates between FDIC cert, NCUA charter number, and RSSD ID. Explains regulatory boundaries (e.g. why a credit union has no FDIC cert).

`get_institution_history`

Returns the full merger, acquisition, and rebrand lineage for any institution by RSSD ID. Resolves real names for both active and defunct predecessor/successor institutions using a 223,750-record historical name lookup built from FFIEC NIC active and closed attributes files. Includes parent company and subsidiary relationships.

Example output for JPMorgan Chase (RSSD 852218): 52 predecessors including Washington Mutual (FDIC-assisted, 2008), Bank One (merger, 2004), and Bear Stearns entities — all with resolved names and dates.

`get_recent_changes`

A configurable regulatory change feed built from FFIEC NIC Transformations data. Returns mergers, failures, rebrands, splits, and other structural events within a lookback window. Useful for identifying institutions that have changed status and may need dataset updates.

Each event carries the full metadata of both the predecessor and successor (name, type, regulator, city/state, FDIC cert / NCUA charter, ABA routing, deposit accounts, web address). For every predecessor with a portal on record, the tool also fetches its home/login URL and classifies whether it is still operating independently or has been consumed by the acquirer:

independent_portal_live — still served on its own domain
consumed_by_acquirer — redirects to the acquirer's domain
redirects_elsewhere — redirects to a third domain (rebrand/division site)
unreachable — portal did not respond (likely retired)

Portal checks run concurrently and are reported in a portal_summary tally. Lookups use a one-time RSSD index (O(1)), so the data-only path is near-instant; portal checks are the only network cost and can be tuned or disabled.

Parameters:

days: lookback window (default 365, max 3650)
institution_type: "bank", "cu", or "all"
event_type: "merger", "failure", "split", "rebrand", or "all"
state: optional 2-letter state filter
check_portals: fetch and classify predecessor portals (default true; set false for an instant data-only feed)
max_portal_checks: cap on portals fetched, most-recent first (default 50)

`get_top_institutions`

Returns the top N institutions ranked by deposit account count, with individual and cumulative market share percentages. Supports filtering by institution type.

`export_institutions`

Exports the full institution dataset to a CSV file with configurable filters, sorting, and market share calculations.

`list_institutions`

General-purpose browse/query tool over the complete FDIC + NCUA dataset, exposing all 21 metadata fields per institution (every other tool returns a trimmed projection). One tool that is searchable, filterable, sortable, and exportable:

Search: case-insensitive substring across any subset of fields (search_fields, or "all")
Filter: institution type; state (accepts UT or Utah); min/max deposit accounts; has_routing, has_rssd, has_history
Sort: any field, ascending or descending (numeric fields sort numerically)
Page: limit/offset with has_more/next_offset for inline browsing; fields projects a subset
Export: set export_path to write all matched rows (not just the page) to csv or json; bare filenames default under ~/Desktop, written atomically

Fields: name, type, source, regulator, city, state, fdic_cert, ncua_charter, rssdid, aba_routing, deposit_accounts, total_assets, web_address, charter_type, charter_type_desc, inst_category, parent_rssd, predecessor_count, successor_count, subsidiary_count.

`refresh_cache`

Rebuilds the local data snapshot from scratch — re-fetches FDIC data from the BankFind API (latest quarter auto-discovered), auto-downloads the newest NCUA quarterly ZIP, and re-reads the local FFIEC ZIPs. Runs the full NIC enrichment pipeline. Reports the data_as_of date for each source.

`refresh_if_changed`

Cost-effective conditional refresh: fingerprints all sources (FFIEC ZIP content hashes + latest FDIC/NCUA quarter) and rebuilds only when something actually changed, otherwise skips the expensive reprocessing and returns changed: false. This is the tool the monthly scheduler runs — see Scheduled updates.

Data Sources

All data is public regulatory data. No licensed or proprietary sources.

Source	Data	Refresh
FDIC BankFind API	~4,274 active banks: name, location, cert, RSSD, web address	API call
FDIC Financials API	Deposit account counts from most recent quarter	API call
NCUA Quarterly ZIP	~4,336 active credit unions; deposit counts from FS220A; web addresses from FS220D	Manual download
FFIEC NIC Active Attributes	ABA primary routing numbers; joined via RSSD/cert/charter	Manual download
FFIEC NIC Closed Attributes	Historical institution names for 161,950 defunct entities	Manual download
FFIEC NIC Transformations	59,071 merger/acquisition/rebrand/failure events	Manual download
FFIEC NIC Relationships	Parent/subsidiary/branch ownership structure	Manual download

Total universe: 8,610 active institutions + 223,750 name-resolved historical records

Architecture

Claude Code / Claude Desktop (any MCP host)

| MCP stdio transport

server.py (FastMCP 3.4.2)

+-- search_institutions

+-- get_institution_profile

+-- reconcile_institution --> reconciler.py

+-- crosswalk_identifiers

+-- get_institution_history --> nic_names lookup (223,750 records)

+-- get_recent_changes --> CSV_TRANSFORMATIONS.zip

+-- get_top_institutions

+-- export_institutions

+-- list_institutions --> full dataset: search / filter / sort / export

+-- refresh_cache --> full rebuild (FDIC live + NCUA auto-download + FFIEC)

+-- refresh_if_changed --> conditional rebuild (monthly launchd job)

data_loader.py nic_loader.py

| |

+-- cache/fdic_institutions.json (NIC-enriched)

+-- cache/ncua_institutions.json (NIC-enriched)

+-- cache/call-report-data-*.zip

+-- cache/CSV_ATTRIBUTES_ACTIVE.zip

+-- cache/CSV_ATTRIBUTES_CLOSED.zip

+-- cache/CSV_TRANSFORMATIONS.zip

+-- cache/CSV_RELATIONSHIPS.zip

Key design decisions:

Local cache first: runs fully offline after initial build; warm start skips live API calls
NIC enrichment at save time: predecessor/successor/parent/subsidiary fields are written into the JSON cache so subsequent warm starts load enriched data instantly
Atomic cache writes: .tmp rename pattern prevents corruption on interrupted writes
Stderr-only logging: never pollutes the MCP stdio JSON channel
Abbreviation-aware normalization: improves recall on dirty external records

Local Data Setup

The cache/ directory is not committed to Git — populate it manually before first run.

Required downloads

File	Source
`cache/CSV_ATTRIBUTES_ACTIVE.zip`	FFIEC NIC Data Download — Active Attributes
`cache/CSV_ATTRIBUTES_CLOSED.zip`	FFIEC NIC Data Download — Closed Attributes
`cache/CSV_TRANSFORMATIONS.zip`	FFIEC NIC Data Download — Transformations
`cache/CSV_RELATIONSHIPS.zip`	FFIEC NIC Data Download — Relationships

FDIC is fetched live from the FDIC BankFind API (latest quarter auto-discovered) and NCUA quarterly ZIPs are now auto-downloaded — neither needs a manual download. Only the four FFIEC NIC ZIPs above must be placed in cache/ by hand, because FFIEC's bulk download is gated against scripted requests.

Scheduled updates

Each record carries a data_as_of date, and the snapshot keeps itself current with a cost-aware refresh strategy:

FDIC / NCUA — auto-fetch the newest published quarter on every refresh.
FFIEC — refreshed by dropping new ZIPs into cache/ (the bulk download is 403-gated to scripts, so it can't be auto-pulled). A content hash detects the change.
refresh_if_changed rebuilds only when a source actually changed; a no-op run does cheap fingerprint checks (~0.3s CPU) and skips the expensive NIC reprocessing.

A monthly launchd job runs scheduled_refresh.py (which calls refresh_if_changed) at 03:00 on the 1st, logging to cache/refresh.log:

# Install / reload the monthly agent
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.fi-lookup.monthly-refresh.plist

# Run it once on demand
launchctl kickstart -k gui/$(id -u)/com.fi-lookup.monthly-refresh

# Remove it
launchctl bootout gui/$(id -u)/com.fi-lookup.monthly-refresh

Recommended cadence: monthly (bump to weekly only if you depend on the merger change-feed being current within days). The guard makes extra runs nearly free, so erring toward more frequent checks costs little.

Setup

Prerequisites

Python 3.11+
An MCP host — Claude Code (CLI) or Claude Desktop

Install

git clone https://github.com/nlsnnvas/fi-lookup-mcp.git
cd fi-lookup-mcp
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Download manual data files

Download the five ZIPs listed in the table above and place them in cache/. FFIEC files are available at ffiec.gov/npw/FinancialReport/DataDownload.

Build the data snapshot

python -c "import asyncio; from data_loader import build_snapshot; asyncio.run(build_snapshot())"

This fetches FDIC data live, reads all local ZIPs, runs NIC enrichment, and writes the JSON cache. Expect 2–3 minutes on first run.

Connect to an MCP host

Claude Code (CLI) — register the server with the venv interpreter:

claude mcp add fi-lookup -- "$(pwd)/.venv/bin/python" "$(pwd)/server.py"

Verify it loaded with claude mcp list, then the tools are available in any claude session in that scope.

Claude Desktop:

fastmcp install claude-desktop server.py --name "fi-lookup"

Then restart Claude Desktop.

Web dashboard (FI Explorer)

A local web UI over the same dataset and tools — no MCP client required. Built with Starlette + uvicorn (both ship with FastMCP, so no extra dependencies).

python web_app.py                 # serves http://127.0.0.1:8765
python web_app.py --port 9000     # custom port

Four tabs:

Browse — searchable / filterable / sortable table over all institutions with every metadata field, plus CSV/JSON export (wraps list_institutions)
Profile & Lineage — enter an RSSD ID for merger/acquisition lineage: predecessors, successors, parent, subsidiaries (wraps get_institution_history)
Recent Changes — merger/failure/rebrand/split feed with optional portal verification, independent-vs-consumed (wraps get_recent_changes)
Reconcile — paste a messy record for ranked candidate matches with confidence scores (wraps reconcile_institution)

It is read-only and bound to 127.0.0.1 (localhost only) by default. It has no authentication, so do not expose it to a network or the internet as-is — see the note below. To reach it from another machine on a trusted LAN for a quick demo, run python web_app.py --host 0.0.0.0 and connect to http://<this-machine-ip>:8765; for anything beyond that, add authentication and serve it behind a proxy/tunnel.

Example Interactions

Reconciliation:

"I have a vendor row that says 'Mtn America FCU, Sandy UT' — what is it?"

reconcile_institution scores ~8,610 institutions and returns Mountain America Credit Union (NCUA #24692) at 0.984 confidence, with ABA routing, deposit account count, and charter type.

Lineage tracing:

"What is the full acquisition history of Bank of America?"

get_institution_history returns 117 predecessor institutions going back to 1960, including the 1998 BankAmerica merger, the 2008 Countrywide acquisition, and the 2009 Merrill Lynch absorption — all with resolved names and dates.

Change feed:

"What bank failures and mergers happened in the last 90 days?"

get_recent_changes returns 108 events grouped by type: 1 FDIC-assisted failure, 107 mergers — including Meadows Bank absorbed by AMERICA FIRST Credit Union and two bank-to-credit-union conversions.

Why This Pattern Matters

Financial institution data is notoriously messy. The patterns here are directly applicable to:

Matching vendor/counterparty records to a canonical institution master
Tracing merger lineage for compliance, KYC, or data governance
Building regulatory change feeds for dataset maintenance automation
Enriching internal datasets with public regulatory metadata
Onboarding automation that maps free-text institution names to stable IDs

This project re-expresses reconciliation and lineage patterns from production AI agent work, using only public data.

Stack

Python 3.11
FastMCP 3.4.2
rapidfuzz (fuzzy string matching)
httpx (async HTTP)
Claude Code / Claude Desktop (MCP host)

Framing Note

This is a tool-use, reconciliation, and lineage-tracing pattern — not RAG. The model calls structured tools that execute deterministic scoring and lookup logic against a pre-built regulatory snapshot and return ranked, explainable results. The NIC enrichment pipeline runs at startup and writes enriched data to the JSON cache, so subsequent tool calls are fast in-memory lookups.

fi-lookup-mcp

README

fi-lookup-mcp

What It Does

Tools

search_institutions

get_institution_profile

reconcile_institution

crosswalk_identifiers

get_institution_history

get_recent_changes

get_top_institutions

export_institutions

list_institutions

refresh_cache

refresh_if_changed

Data Sources

Architecture

Local Data Setup

Required downloads

Scheduled updates

Setup

Prerequisites

Install

Download manual data files

Build the data snapshot

Connect to an MCP host

Web dashboard (FI Explorer)

Example Interactions

Why This Pattern Matters

Stack

Framing Note

推荐服务器

`search_institutions`

`get_institution_profile`

`reconcile_institution`

`crosswalk_identifiers`

`get_institution_history`

`get_recent_changes`

`get_top_institutions`

`export_institutions`

`list_institutions`

`refresh_cache`

`refresh_if_changed`