# BrainLayer
Persistent memory and knowledge graph for AI agents — 9 MCP tools, real-time indexing hooks, and a native macOS daemon for always-on recall across every conversation.
224,000+ chunks indexed · 1,002 Python + 28 Swift tests · Real-time indexing hooks · 9 MCP tools · BrainBar daemon (209KB) · Zero cloud dependencies
Your AI agent forgets everything between sessions. Every architecture decision, every debugging session, every preference you've expressed — gone. You repeat yourself constantly.
BrainLayer fixes this. It's a local-first memory layer that gives any MCP-compatible AI agent the ability to remember, think, and recall across conversations. Includes BrainBar — a 209KB native macOS daemon that provides always-on memory access.
- "What approach did I use for auth last month?" → `brain_search`
- "Show me everything about this file's history" → `brain_recall`
- "What was I working on yesterday?" → `brain_recall`
- "Remember this decision for later" → `brain_store`
- "Ingest this meeting transcript" → `brain_digest`
- "What do we know about this person?" → `brain_get_person`
- "Look up the Domica project entity" → `brain_entity`
## Quick Start

```bash
pip install brainlayer
brainlayer init    # Interactive setup wizard
brainlayer index   # Index your Claude Code conversations
```
Then add to your editor's MCP config:
**Claude Code** (`~/.claude.json`):

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

<details>
<summary>Other editors (Cursor, Zed, VS Code)</summary>

**Cursor** (MCP settings):

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

**Zed** (`settings.json`):

```json
{
  "context_servers": {
    "brainlayer": {
      "command": { "path": "brainlayer-mcp" }
    }
  }
}
```

**VS Code** (`.vscode/mcp.json`):

```json
{
  "servers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

</details>
That's it. Your agent now has persistent memory across every conversation.
## Architecture

```mermaid
graph LR
    A["Claude Code / Cursor / Zed"] -->|MCP| B["BrainLayer MCP Server<br/>9 tools"]
    B --> C["Hybrid Search<br/>semantic + keyword (RRF)"]
    C --> D["SQLite + sqlite-vec<br/>single .db file"]
    B --> KG["Knowledge Graph<br/>entities + relations"]
    KG --> D
    E["Claude Code JSONL<br/>conversations"] --> F["Pipeline"]
    F -->|extract → classify → chunk → embed| D
    G["Local LLM<br/>Ollama / MLX"] -->|enrich| D
    H["Real-time Hooks"] -->|live per-message| D
    I["BrainBar<br/>macOS daemon"] -->|Unix socket MCP| B
```
Everything runs locally. No cloud accounts, no API keys, no Docker, no database servers.
| Component | Implementation |
|---|---|
| Storage | SQLite + sqlite-vec (single .db file, WAL mode) |
| Embeddings | bge-large-en-v1.5 via sentence-transformers (1024 dims, runs on CPU/MPS) |
| Search | Hybrid: vector similarity + FTS5 keyword, merged with Reciprocal Rank Fusion |
| Enrichment | Local LLM via Ollama or MLX — 10-field metadata per chunk |
| MCP Server | stdio-based, MCP SDK v1.26+, compatible with any MCP client |
| Clustering | Leiden + UMAP for brain graph visualization (optional) |
| BrainBar | Native macOS daemon (209KB Swift binary) — always-on MCP over Unix socket |
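The hybrid search row above merges the vector and FTS5 keyword result lists with Reciprocal Rank Fusion. A minimal standalone sketch of the merge step (not BrainLayer's actual code; the `k=60` constant and the chunk IDs are illustrative):

```python
def rrf_merge(rankings, k=60):
    """Merge several ranked result lists with Reciprocal Rank Fusion.

    Each document's score is the sum of 1 / (k + rank) over every list
    it appears in, so items ranked well by both searchers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["chunk_42", "chunk_7", "chunk_99"]   # semantic ranking
keyword_hits = ["chunk_7", "chunk_13", "chunk_42"]  # FTS5 ranking
merged = rrf_merge([vector_hits, keyword_hits])
# chunk_7 ranks first: it appears near the top of both lists
```

Because RRF only uses ranks, it sidesteps the problem of comparing cosine similarities against BM25-style keyword scores directly.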
## MCP Tools (9)

### Core (4)

| Tool | Description |
|---|---|
| `brain_search` | Semantic search — unified search across query, file_path, chunk_id, filters. |
| `brain_store` | Persist memories — ideas, decisions, learnings, mistakes. Auto-type/auto-importance. |
| `brain_recall` | Proactive retrieval — current context, sessions, session summaries. |
| `brain_tags` | Browse and filter by tag — discover what's in memory without a search query. |
### Knowledge Graph (5)

| Tool | Description |
|---|---|
| `brain_digest` | Ingest raw content — entity extraction, relations, sentiment, action items. |
| `brain_entity` | Look up entities in the knowledge graph — type, relations, evidence. |
| `brain_expand` | Expand a chunk_id with N surrounding chunks for full context. |
| `brain_update` | Update, archive, or merge existing memories. |
| `brain_get_person` | Person lookup — entity details, interactions, preferences (~200-500ms). |
### Backward Compatibility

All 14 old `brainlayer_*` tool names still work as aliases.
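Because the server speaks standard MCP over stdio, any client invokes these tools with an ordinary `tools/call` request. A sketch of what a `brain_search` call might look like on the wire (the `limit` argument name is illustrative, not taken from BrainLayer's published schema):

```python
import json

# JSON-RPC 2.0 request as an MCP client would send it to brainlayer-mcp.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "brain_search",
        "arguments": {"query": "auth approach last month", "limit": 5},
    },
}
wire_message = json.dumps(request)  # one line of JSON per stdio message
```

In practice your editor's MCP client builds this message for you; the shape matters only if you are wiring up a custom client.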
## Enrichment
BrainLayer enriches each chunk with 10 structured metadata fields using a local LLM:
| Field | Example |
|---|---|
| `summary` | "Debugging Telegram bot message drops under load" |
| `tags` | "telegram, debugging, performance" |
| `importance` | 8 (architectural decision) vs 2 (directory listing) |
| `intent` | debugging, designing, implementing, configuring, deciding, reviewing |
| `primary_symbols` | "TelegramBot, handleMessage, grammy" |
| `resolved_query` | "How does the Telegram bot handle rate limiting?" |
| `epistemic_level` | hypothesis, substantiated, validated |
| `version_scope` | "grammy 1.32, Node 22" |
| `debt_impact` | introduction, resolution, none |
| `external_deps` | "grammy, Supabase, Railway" |
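Put together, an enriched chunk might carry metadata like the following (a hypothetical record assembled from the example column above; BrainLayer's internal field types may differ, e.g. lists vs comma-joined strings):

```python
# One enriched chunk's metadata: all ten fields from the table above.
chunk_metadata = {
    "summary": "Debugging Telegram bot message drops under load",
    "tags": ["telegram", "debugging", "performance"],
    "importance": 8,                       # 1-10 scale implied by the examples
    "intent": "debugging",
    "primary_symbols": ["TelegramBot", "handleMessage", "grammy"],
    "resolved_query": "How does the Telegram bot handle rate limiting?",
    "epistemic_level": "substantiated",
    "version_scope": "grammy 1.32, Node 22",
    "debt_impact": "none",
    "external_deps": ["grammy", "Supabase", "Railway"],
}
```

Fields like `importance` and `intent` are what let `brain_recall` filter by decisions versus routine output instead of ranking on text similarity alone.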
Three enrichment backends are available (auto-detected in the order MLX → Ollama → Groq; override via `BRAINLAYER_ENRICH_BACKEND`):
| Backend | Best for | Speed |
|---|---|---|
| Groq (cloud) | When local LLMs are unavailable | ~1-2s/chunk |
| MLX (Apple Silicon) | M1/M2/M3 Macs (preferred) | 21-87% faster than Ollama |
| Ollama | Any platform | ~1s/chunk (short), ~13s (long) |
```bash
brainlayer enrich   # Default backend (auto-detects)
BRAINLAYER_ENRICH_BACKEND=groq brainlayer enrich --batch-size=100
```
## Why BrainLayer?

| | BrainLayer | Mem0 | Zep/Graphiti | Letta | LangChain Memory |
|---|---|---|---|---|---|
| MCP native | 9 tools | 1 server | 1 server | No | No |
| Think / Recall | Yes | No | No | No | No |
| Local-first | SQLite | Cloud-first | Cloud-only | Docker+PG | Framework |
| Zero infra | `pip install` | API key | API key | Docker | Multiple deps |
| Multi-source | 7 sources | API only | API only | API only | API only |
| Enrichment | 10 fields | Basic | Temporal | Self-write | None |
| Session analysis | Yes | No | No | No | No |
| Real-time | Per-message hooks | No | No | No | No |
| Open source | Apache 2.0 | Apache 2.0 | Source-available | Apache 2.0 | MIT |
BrainLayer is the only memory layer that:
- Thinks before answering — categorizes past knowledge by intent (decisions, bugs, patterns) instead of raw search results
- Runs on a single file — no database servers, no Docker, no cloud accounts
- Works with every MCP client — 9 tools, instant integration, zero SDK
- Ships a knowledge graph — entities, relations, and person lookup across all indexed data
## CLI Reference

```bash
brainlayer init              # Interactive setup wizard
brainlayer index             # Index new conversations
brainlayer search "query"    # Semantic + keyword search
brainlayer enrich            # Run LLM enrichment on new chunks
brainlayer enrich-sessions   # Session-level analysis (decisions, learnings)
brainlayer stats             # Database statistics
brainlayer brain-export      # Generate brain graph JSON
brainlayer export-obsidian   # Export to Obsidian vault
brainlayer dashboard         # Interactive TUI dashboard
```
## Configuration
All configuration is via environment variables:
| Variable | Default | Description |
|---|---|---|
| `BRAINLAYER_DB` | `~/.local/share/brainlayer/brainlayer.db` | Database file path |
| `BRAINLAYER_ENRICH_BACKEND` | auto-detect (MLX → Ollama → Groq) | Enrichment LLM backend (`mlx`, `ollama`, or `groq`) |
| `BRAINLAYER_ENRICH_MODEL` | `glm-4.7-flash` | Ollama model name |
| `BRAINLAYER_MLX_MODEL` | `mlx-community/Qwen2.5-Coder-14B-Instruct-4bit` | MLX model identifier |
| `BRAINLAYER_OLLAMA_URL` | `http://127.0.0.1:11434/api/generate` | Ollama API endpoint |
| `BRAINLAYER_MLX_URL` | `http://127.0.0.1:8080/v1/chat/completions` | MLX server endpoint |
| `BRAINLAYER_STALL_TIMEOUT` | `300` | Seconds before killing a stuck enrichment chunk |
| `BRAINLAYER_HEARTBEAT_INTERVAL` | `25` | Log progress every N chunks during enrichment |
| `BRAINLAYER_SANITIZE_EXTRA_NAMES` | (empty) | Comma-separated names to redact from indexed content |
| `BRAINLAYER_SANITIZE_USE_SPACY` | `true` | Use spaCy NER for PII detection |
| `GROQ_API_KEY` | (unset) | Groq API key for cloud enrichment backend |
| `BRAINLAYER_GROQ_URL` | `https://api.groq.com/openai/v1/chat/completions` | Groq API endpoint |
| `BRAINLAYER_GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq model for enrichment |
## Optional Extras

```bash
pip install "brainlayer[brain]"    # Brain graph visualization (Leiden + UMAP) + FAISS
pip install "brainlayer[cloud]"    # Cloud backfill (Gemini Batch API)
pip install "brainlayer[youtube]"  # YouTube transcript indexing
pip install "brainlayer[ast]"      # AST-aware code chunking (tree-sitter)
pip install "brainlayer[kg]"       # GLiNER entity extraction (209M params, EN+HE)
pip install "brainlayer[style]"    # ChromaDB vector store (alternative backend)
pip install "brainlayer[dev]"      # Development: pytest, ruff
```
## Data Sources

BrainLayer can index conversations from multiple sources:

| Source | Format | Indexer |
|---|---|---|
| Claude Code | JSONL (`~/.claude/projects/`) | `brainlayer index` |
| Claude Desktop | JSON export | `brainlayer index --source desktop` |
| WhatsApp | Exported `.txt` chat | `brainlayer index --source whatsapp` |
| YouTube | Transcripts via yt-dlp | `brainlayer index --source youtube` |
| Codex CLI | JSONL (`~/.codex/sessions`) | `brainlayer ingest-codex` |
| Markdown | Any `.md` files | `brainlayer index --source markdown` |
| Manual | Via MCP tool | `brain_store` |
| Real-time | Claude Code hooks | Live per-message indexing (zero-lag) |
## Testing

```bash
pip install -e ".[dev]"
pytest tests/                        # Full suite (1,002 Python tests)
pytest tests/ -m "not integration"   # Unit tests only (fast)
ruff check src/                      # Linting
# BrainBar (Swift): 28 tests via Xcode
```
## Roadmap
See docs/roadmap.md for planned features including boot context loading, compact search, pinned memories, and MCP Registry listing.
## Contributing
Contributions welcome! See CONTRIBUTING.md for dev setup, testing, and PR guidelines.
## License
Apache 2.0 — see LICENSE.
## Origin
BrainLayer was originally developed as "Zikaron" (Hebrew: memory) inside a personal AI agent ecosystem. It was extracted into a standalone project because every developer deserves persistent AI memory — not just the ones building their own agent systems.