# BrainLayer
Persistent memory and knowledge graph for AI agents — 9 MCP tools, real-time indexing hooks, and a native macOS daemon for always-on recall across every conversation.
224,000+ chunks indexed · 1,002 Python + 28 Swift tests · Real-time indexing hooks · 9 MCP tools · BrainBar daemon (209KB) · Zero cloud dependencies
Your AI agent forgets everything between sessions. Every architecture decision, every debugging session, every preference you've expressed — gone. You repeat yourself constantly.
BrainLayer fixes this. It's a local-first memory layer that gives any MCP-compatible AI agent the ability to remember, think, and recall across conversations. Includes BrainBar — a 209KB native macOS daemon that provides always-on memory access.
- "What approach did I use for auth last month?" → `brain_search`
- "Show me everything about this file's history" → `brain_recall`
- "What was I working on yesterday?" → `brain_recall`
- "Remember this decision for later" → `brain_store`
- "Ingest this meeting transcript" → `brain_digest`
- "What do we know about this person?" → `brain_get_person`
- "Look up the Domica project entity" → `brain_entity`
## Quick Start

```bash
pip install brainlayer
brainlayer init    # Interactive setup wizard
brainlayer index   # Index your Claude Code conversations
```
Then add to your editor's MCP config:
**Claude Code** (`~/.claude.json`):

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

<details>
<summary>Other editors (Cursor, Zed, VS Code)</summary>

**Cursor** (MCP settings):

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

**Zed** (`settings.json`):

```json
{
  "context_servers": {
    "brainlayer": {
      "command": { "path": "brainlayer-mcp" }
    }
  }
}
```

**VS Code** (`.vscode/mcp.json`):

```json
{
  "servers": {
    "brainlayer": {
      "command": "brainlayer-mcp"
    }
  }
}
```

</details>
That's it. Your agent now has persistent memory across every conversation.
## Architecture

```mermaid
graph LR
    A["Claude Code / Cursor / Zed"] -->|MCP| B["BrainLayer MCP Server<br/>9 tools"]
    B --> C["Hybrid Search<br/>semantic + keyword (RRF)"]
    C --> D["SQLite + sqlite-vec<br/>single .db file"]
    B --> KG["Knowledge Graph<br/>entities + relations"]
    KG --> D
    E["Claude Code JSONL<br/>conversations"] --> F["Pipeline"]
    F -->|extract → classify → chunk → embed| D
    G["Local LLM<br/>Ollama / MLX"] -->|enrich| D
    H["Real-time Hooks"] -->|live per-message| D
    I["BrainBar<br/>macOS daemon"] -->|Unix socket MCP| B
```
Everything runs locally. No cloud accounts, no API keys, no Docker, no database servers.
| Component | Implementation |
|---|---|
| Storage | SQLite + sqlite-vec (single .db file, WAL mode) |
| Embeddings | bge-large-en-v1.5 via sentence-transformers (1024 dims, runs on CPU/MPS) |
| Search | Hybrid: vector similarity + FTS5 keyword, merged with Reciprocal Rank Fusion |
| Enrichment | Local LLM via Ollama or MLX — 10-field metadata per chunk |
| MCP Server | stdio-based, MCP SDK v1.26+, compatible with any MCP client |
| Clustering | Leiden + UMAP for brain graph visualization (optional) |
| BrainBar | Native macOS daemon (209KB Swift binary) — always-on MCP over Unix socket |
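The hybrid search row above merges the vector and FTS5 keyword result lists with Reciprocal Rank Fusion. A minimal standalone sketch of the merge step (not BrainLayer's actual code; the `k=60` constant and the chunk IDs are illustrative):

```python
def rrf_merge(rankings, k=60):
    """Merge several ranked result lists with Reciprocal Rank Fusion.

    Each document's score is the sum of 1 / (k + rank) over every list
    it appears in, so items ranked well by both searchers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["chunk_42", "chunk_7", "chunk_99"]   # semantic ranking
keyword_hits = ["chunk_7", "chunk_13", "chunk_42"]  # FTS5 ranking
merged = rrf_merge([vector_hits, keyword_hits])
# chunk_7 ranks first: it appears near the top of both lists
```

Because RRF only uses ranks, it sidesteps the problem of comparing cosine similarities against BM25-style keyword scores directly.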
## MCP Tools (9)

### Core (4)

| Tool | Description |
|---|---|
| `brain_search` | Semantic search — unified search across query, file_path, chunk_id, filters. |
| `brain_store` | Persist memories — ideas, decisions, learnings, mistakes. Auto-type/auto-importance. |
| `brain_recall` | Proactive retrieval — current context, sessions, session summaries. |
| `brain_tags` | Browse and filter by tag — discover what's in memory without a search query. |
### Knowledge Graph (5)

| Tool | Description |
|---|---|
| `brain_digest` | Ingest raw content — entity extraction, relations, sentiment, action items. |
| `brain_entity` | Look up entities in the knowledge graph — type, relations, evidence. |
| `brain_expand` | Expand a chunk_id with N surrounding chunks for full context. |
| `brain_update` | Update, archive, or merge existing memories. |
| `brain_get_person` | Person lookup — entity details, interactions, preferences (~200-500ms). |
### Backward Compatibility

All 14 old `brainlayer_*` tool names still work as aliases.
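Because the server speaks standard MCP over stdio, any client invokes these tools with an ordinary `tools/call` request. A sketch of what a `brain_search` call might look like on the wire (the `limit` argument name is illustrative, not taken from BrainLayer's published schema):

```python
import json

# JSON-RPC 2.0 request as an MCP client would send it to brainlayer-mcp.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "brain_search",
        "arguments": {"query": "auth approach last month", "limit": 5},
    },
}
wire_message = json.dumps(request)  # one line of JSON per stdio message
```

In practice your editor's MCP client builds this message for you; the shape matters only if you are wiring up a custom client.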
## Enrichment
BrainLayer enriches each chunk with 10 structured metadata fields using a local LLM:
| Field | Example |
|---|---|
| `summary` | "Debugging Telegram bot message drops under load" |
| `tags` | "telegram, debugging, performance" |
| `importance` | 8 (architectural decision) vs 2 (directory listing) |
| `intent` | debugging, designing, implementing, configuring, deciding, reviewing |
| `primary_symbols` | "TelegramBot, handleMessage, grammy" |
| `resolved_query` | "How does the Telegram bot handle rate limiting?" |
| `epistemic_level` | hypothesis, substantiated, validated |
| `version_scope` | "grammy 1.32, Node 22" |
| `debt_impact` | introduction, resolution, none |
| `external_deps` | "grammy, Supabase, Railway" |
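Put together, an enriched chunk might carry metadata like the following (a hypothetical record assembled from the example column above; BrainLayer's internal field types may differ, e.g. lists vs comma-joined strings):

```python
# One enriched chunk's metadata: all ten fields from the table above.
chunk_metadata = {
    "summary": "Debugging Telegram bot message drops under load",
    "tags": ["telegram", "debugging", "performance"],
    "importance": 8,                       # 1-10 scale implied by the examples
    "intent": "debugging",
    "primary_symbols": ["TelegramBot", "handleMessage", "grammy"],
    "resolved_query": "How does the Telegram bot handle rate limiting?",
    "epistemic_level": "substantiated",
    "version_scope": "grammy 1.32, Node 22",
    "debt_impact": "none",
    "external_deps": ["grammy", "Supabase", "Railway"],
}
```

Fields like `importance` and `intent` are what let `brain_recall` filter by decisions versus routine output instead of ranking on text similarity alone.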
Three enrichment backends are available (auto-detected in the order MLX → Ollama → Groq; override via `BRAINLAYER_ENRICH_BACKEND`):
| Backend | Best for | Speed |
|---|---|---|
| Groq (cloud) | When local LLMs are unavailable | ~1-2s/chunk |
| MLX (Apple Silicon) | M1/M2/M3 Macs (preferred) | 21-87% faster than Ollama |
| Ollama | Any platform | ~1s/chunk (short), ~13s (long) |
```bash
brainlayer enrich   # Default backend (auto-detects)
BRAINLAYER_ENRICH_BACKEND=groq brainlayer enrich --batch-size=100
```
## Why BrainLayer?

| | BrainLayer | Mem0 | Zep/Graphiti | Letta | LangChain Memory |
|---|---|---|---|---|---|
| MCP native | 9 tools | 1 server | 1 server | No | No |
| Think / Recall | Yes | No | No | No | No |
| Local-first | SQLite | Cloud-first | Cloud-only | Docker+PG | Framework |
| Zero infra | `pip install` | API key | API key | Docker | Multiple deps |
| Multi-source | 7 sources | API only | API only | API only | API only |
| Enrichment | 10 fields | Basic | Temporal | Self-write | None |
| Session analysis | Yes | No | No | No | No |
| Real-time | Per-message hooks | No | No | No | No |
| Open source | Apache 2.0 | Apache 2.0 | Source-available | Apache 2.0 | MIT |
BrainLayer is the only memory layer that:
- Thinks before answering — categorizes past knowledge by intent (decisions, bugs, patterns) instead of raw search results
- Runs on a single file — no database servers, no Docker, no cloud accounts
- Works with every MCP client — 9 tools, instant integration, zero SDK
- Ships a knowledge graph — entities, relations, and person lookup across all indexed data
## CLI Reference

```bash
brainlayer init              # Interactive setup wizard
brainlayer index             # Index new conversations
brainlayer search "query"    # Semantic + keyword search
brainlayer enrich            # Run LLM enrichment on new chunks
brainlayer enrich-sessions   # Session-level analysis (decisions, learnings)
brainlayer stats             # Database statistics
brainlayer brain-export      # Generate brain graph JSON
brainlayer export-obsidian   # Export to Obsidian vault
brainlayer dashboard         # Interactive TUI dashboard
```
## Configuration
All configuration is via environment variables:
| Variable | Default | Description |
|---|---|---|
| `BRAINLAYER_DB` | `~/.local/share/brainlayer/brainlayer.db` | Database file path |
| `BRAINLAYER_ENRICH_BACKEND` | auto-detect (MLX → Ollama → Groq) | Enrichment LLM backend (`mlx`, `ollama`, or `groq`) |
| `BRAINLAYER_ENRICH_MODEL` | `glm-4.7-flash` | Ollama model name |
| `BRAINLAYER_MLX_MODEL` | `mlx-community/Qwen2.5-Coder-14B-Instruct-4bit` | MLX model identifier |
| `BRAINLAYER_OLLAMA_URL` | `http://127.0.0.1:11434/api/generate` | Ollama API endpoint |
| `BRAINLAYER_MLX_URL` | `http://127.0.0.1:8080/v1/chat/completions` | MLX server endpoint |
| `BRAINLAYER_STALL_TIMEOUT` | `300` | Seconds before killing a stuck enrichment chunk |
| `BRAINLAYER_HEARTBEAT_INTERVAL` | `25` | Log progress every N chunks during enrichment |
| `BRAINLAYER_SANITIZE_EXTRA_NAMES` | (empty) | Comma-separated names to redact from indexed content |
| `BRAINLAYER_SANITIZE_USE_SPACY` | `true` | Use spaCy NER for PII detection |
| `GROQ_API_KEY` | (unset) | Groq API key for cloud enrichment backend |
| `BRAINLAYER_GROQ_URL` | `https://api.groq.com/openai/v1/chat/completions` | Groq API endpoint |
| `BRAINLAYER_GROQ_MODEL` | `llama-3.3-70b-versatile` | Groq model for enrichment |
## Optional Extras

```bash
pip install "brainlayer[brain]"    # Brain graph visualization (Leiden + UMAP) + FAISS
pip install "brainlayer[cloud]"    # Cloud backfill (Gemini Batch API)
pip install "brainlayer[youtube]"  # YouTube transcript indexing
pip install "brainlayer[ast]"      # AST-aware code chunking (tree-sitter)
pip install "brainlayer[kg]"       # GLiNER entity extraction (209M params, EN+HE)
pip install "brainlayer[style]"    # ChromaDB vector store (alternative backend)
pip install "brainlayer[dev]"      # Development: pytest, ruff
```
## Data Sources

BrainLayer can index conversations from multiple sources:

| Source | Format | Indexer |
|---|---|---|
| Claude Code | JSONL (`~/.claude/projects/`) | `brainlayer index` |
| Claude Desktop | JSON export | `brainlayer index --source desktop` |
| WhatsApp | Exported `.txt` chat | `brainlayer index --source whatsapp` |
| YouTube | Transcripts via yt-dlp | `brainlayer index --source youtube` |
| Codex CLI | JSONL (`~/.codex/sessions`) | `brainlayer ingest-codex` |
| Markdown | Any `.md` files | `brainlayer index --source markdown` |
| Manual | Via MCP tool | `brain_store` |
| Real-time | Claude Code hooks | Live per-message indexing (zero-lag) |
## Testing

```bash
pip install -e ".[dev]"
pytest tests/                        # Full suite (1,002 Python tests)
pytest tests/ -m "not integration"   # Unit tests only (fast)
ruff check src/                      # Linting
# BrainBar (Swift): 28 tests via Xcode
```
## Roadmap
See docs/roadmap.md for planned features including boot context loading, compact search, pinned memories, and MCP Registry listing.
## Contributing
Contributions welcome! See CONTRIBUTING.md for dev setup, testing, and PR guidelines.
## License
Apache 2.0 — see LICENSE.
## Origin
BrainLayer was originally developed as "Zikaron" (Hebrew: memory) inside a personal AI agent ecosystem. It was extracted into a standalone project because every developer deserves persistent AI memory — not just the ones building their own agent systems.