Engram
Provides persistent, local-first AI memory across sessions via MCP tools for storing, searching, and retrieving context from past interactions.
<img width="1408" height="768" alt="Gemini_Generated_Image_9nr9z59nr9z59nr9" src="https://github.com/user-attachments/assets/e04422f2-1974-48c2-8568-238bc2641bdf" />
The AI memory layer that never forgets.
Engram is a local-first, open-source AI memory system. It solves one problem: AI sessions end, but your work doesn't. Every decision, debugging session, and architecture choice you worked through with an AI disappears the moment the conversation closes. Engram makes that history permanent and searchable, and loads its cold-start context in about 170 tokens.
Everything lives on your machine. No cloud. No API key required to run.
Benchmark Targets
| Benchmark | Metric | Target |
|---|---|---|
| LongMemEval | Single-session QA | ≥ 0.68 F1 |
| LongMemEval | Multi-session QA | ≥ 0.61 F1 |
| LoCoMo | Entity recall | ≥ 0.72 |
| LoCoMo | Event recall | ≥ 0.69 |
| ES compression | Factual paragraphs | 8–10× |
| ES compression | Code-heavy content | 4–6× |
| ES compression | Mixed | ~6× |
| Cold-start context | L0 + L1 tokens | ≤ 170 |
| Search latency p99 | ChromaDB 100k | < 200ms |
| Search latency p99 | FAISS 100k | < 50ms |
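The latency targets above are stated as p99 values. As a rough illustration of how such a figure could be computed from raw timings, here is a minimal sketch using the nearest-rank method. The helper names `percentile` and `time_queries` are hypothetical and not part of Engram, and the benchmark harness Engram actually uses may differ.

```python
import math
import time
from typing import Callable, List, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def time_queries(search: Callable[[str], object], queries: List[str]) -> List[float]:
    """Run each query once and return per-query latency in milliseconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


if __name__ == "__main__":
    # Stand-in search function; replace with a real search call to measure it.
    timings = time_queries(lambda q: q.upper(), ["auth migration"] * 200)
    print(f"p99 = {percentile(timings, 99):.4f} ms")
```

With 100k indexed items, the targets in the table would correspond to `percentile(timings, 99)` staying under 200 for ChromaDB and under 50 for FAISS.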
Quick Start
```bash
# Install
pip install engram

# Or with optional backends
pip install "engram[faiss]"      # FAISS speed backend
pip install "engram[sqlitevec]"  # zero-dependency fallback
pip install "engram[all]"        # everything

# Initialise your memory château
engram init ~/myproject

# Mine a project directory
engram mine ~/myproject --wing myapp

# Mine a conversation export
engram mine ~/Downloads/claude-export --mode convos --wing myapp

# Search
engram search "auth migration decisions" --wing myapp

# Load cold-start context (~170 tokens)
engram wake-up
```
Memory Château Architecture
```text
Wing (person or project)
└── Room (named topic: "auth-migration", "ci-pipeline")
    ├── Hall (memory type: facts | events | discoveries | preferences | advice)
    │   ├── Closet (ES-compressed summary, fast AI read)
    │   └── Drawer (verbatim original, never summarised)
    └── Tunnel (cross-wing link when same room spans multiple wings)
```
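To make the hierarchy concrete, the following is a minimal sketch of how it could be modelled with dataclasses. It is an illustration only: the class and function names are assumptions and are not taken from `engram/palace.py`, whose real data model may look different.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Memory types listed for a Hall in the diagram above.
HALL_TYPES = ("facts", "events", "discoveries", "preferences", "advice")


@dataclass
class Drawer:
    text: str  # verbatim original, never summarised


@dataclass
class Closet:
    summary: str  # ES-compressed summary


@dataclass
class Hall:
    memory_type: str
    closets: List[Closet] = field(default_factory=list)
    drawers: List[Drawer] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.memory_type not in HALL_TYPES:
            raise ValueError(f"unknown memory type: {self.memory_type}")


@dataclass
class Room:
    name: str
    halls: Dict[str, Hall] = field(default_factory=dict)

    def hall(self, memory_type: str) -> Hall:
        """Return the hall for a memory type, creating it on first use."""
        if memory_type not in self.halls:
            self.halls[memory_type] = Hall(memory_type)
        return self.halls[memory_type]


@dataclass
class Wing:
    name: str
    rooms: Dict[str, Room] = field(default_factory=dict)

    def room(self, name: str) -> Room:
        """Return the named room, creating it on first use."""
        return self.rooms.setdefault(name, Room(name))


def tunnels(wings: List[Wing]) -> Dict[str, List[str]]:
    """Map each room name that appears in more than one wing to those wings."""
    seen: Dict[str, List[str]] = {}
    for wing in wings:
        for room_name in wing.rooms:
            seen.setdefault(room_name, []).append(wing.name)
    return {room: names for room, names in seen.items() if len(names) > 1}
```

In this sketch a Tunnel is derived rather than stored: any room name shared by two or more wings counts as a cross-wing link.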
Memory Layers
| Layer | Content | Size | When Loaded |
|---|---|---|---|
| L0 | Identity — who is this AI? | ~50 tokens | Always |
| L1 | Critical facts in ES | ~120 tokens | Always |
| L2 | Room recall — current project | On demand | When topic arises |
| L3 | Deep semantic search | On demand | When explicitly queried |
Total cold-start context: ~170 tokens (L0 + L1 only).
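The budget arithmetic (about 50 tokens for L0 plus about 120 for L1) can be sketched as below. This is a hypothetical illustration rather than the code in `engram/layers.py`: it approximates tokens by whitespace-separated words, and it assumes the L1 facts arrive already sorted by priority.

```python
from typing import List


def approx_tokens(text: str) -> int:
    """Crude token estimate: count whitespace-separated words."""
    return len(text.split())


def build_cold_start(identity: str, facts: List[str],
                     l0_budget: int = 50, l1_budget: int = 120) -> str:
    """Assemble L0 (identity) and L1 (critical facts) within their budgets."""
    # L0: truncate the identity text to its budget.
    l0 = " ".join(identity.split()[:l0_budget])

    # L1: take facts in priority order until the next one would overflow.
    l1_lines: List[str] = []
    used = 0
    for fact in facts:
        cost = approx_tokens(fact)
        if used + cost > l1_budget:
            break
        l1_lines.append(fact)
        used += cost

    return "\n".join([l0] + l1_lines)


if __name__ == "__main__":
    context = build_cold_start(
        "You are the memory assistant for the myapp project.",
        ["auth: migrating sessions to JWT", "ci: pipeline runs on every PR"],
    )
    print(context)
    print(approx_tokens(context), "approx tokens")
```

A real tokenizer would give different counts than the word-based estimate, so the budgets here are indicative only.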
Using with Claude via MCP
Start the MCP server:
```bash
python -m engram.mcp_server
```
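Add the server to Claude Desktop's config file, ~/.config/claude/claude_desktop_config.json (the location varies by platform; on macOS it is typically ~/Library/Application Support/Claude/claude_desktop_config.json):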
```json
{
  "mcpServers": {
    "engram": {
      "command": "python",
      "args": ["-m", "engram.mcp_server"]
    }
  }
}
```
Claude will then have access to all 22 engram_* tools including engram_wake_up,
engram_search, engram_add_memory, engram_kg_add, engram_replay, and more.
See examples/mcp_setup.md for the full tool list.
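If the config file already lists other MCP servers, the `engram` entry needs to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge using only the standard library. The function name `register_engram` is hypothetical and is not something Engram ships.

```python
import json
from pathlib import Path


def register_engram(config_path: Path) -> dict:
    """Add or overwrite the "engram" entry under mcpServers, keeping other entries."""
    config = {}
    if config_path.exists():
        raw = config_path.read_text(encoding="utf-8").strip()
        config = json.loads(raw) if raw else {}

    servers = config.setdefault("mcpServers", {})
    servers["engram"] = {
        "command": "python",
        "args": ["-m", "engram.mcp_server"],
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call (adjust the path to wherever your client keeps its config):
# register_engram(Path.home() / ".config" / "claude" / "claude_desktop_config.json")
```

Restart the client after editing the file so it picks up the new server entry.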
Using with Local Models
Engram works entirely offline. The vector backends use local embeddings:
```python
from engram.backends import get_backend
from engram.palace import Palace
from engram.searcher import Searcher

palace = Palace()
backend = get_backend("chromadb")  # or "faiss" / "sqlitevec"
searcher = Searcher(backend, palace)

# Each result is a dict; final_score includes the recency weighting
results = searcher.search("auth migration")
for r in results:
    print(r["text"], r["final_score"])
```
No model API required. ChromaDB computes embeddings locally with its default embedding model, which it typically downloads once on first use.
Full CLI Reference
Setup
```bash
engram init <dir>                                  # guided onboarding + ES bootstrap
```
Mining
```bash
engram mine <dir>                                  # mine project files
engram mine <dir> --mode convos                    # mine conversation exports
engram mine <dir> --mode convos --wing myapp       # tag with a wing
engram mine <dir> --since 2026-01-01               # skip files older than date
engram mine <dir> --plugin obsidian                # use Obsidian vault plugin
engram mine <dir> --plugin notion                  # Notion export
engram mine <dir> --plugin linear                  # Linear issues export
```
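The `--since` flag skips files older than the given date. A plausible way to implement that kind of filter is sketched below. It assumes "older" is judged by file modification time in the local timezone, which the README does not specify, and `files_since` is a hypothetical helper rather than a function from `engram/miner.py`.

```python
from datetime import datetime
from pathlib import Path
from typing import Iterator


def files_since(root: Path, since: str) -> Iterator[Path]:
    """Yield files under root whose modification time is on or after `since` (YYYY-MM-DD)."""
    cutoff = datetime.strptime(since, "%Y-%m-%d").timestamp()
    for path in sorted(root.rglob("*")):
        # Assumption: mtime decides whether a file counts as "older than date".
        if path.is_file() and path.stat().st_mtime >= cutoff:
            yield path


if __name__ == "__main__":
    for path in files_since(Path("."), "2026-01-01"):
        print(path)
```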
Watch Mode
```bash
engram watch <dir>                                 # auto-mine on file changes
engram watch <dir> --wing myapp --mode convos      # tag + conversation mode
```
Search
```bash
engram search "query"
engram search "query" --wing myapp
engram search "query" --room auth-migration
engram search "query" --no-decay                   # disable recency weighting
engram search "query" --results 20                 # max results
```
Memory Stack
```bash
engram wake-up                                     # L0 + L1 context dump (~170 tokens)
engram wake-up --wing myapp                        # wing-scoped L1
engram wake-up --rebuild                           # rebuild L1 from drawers
```
Compression
```bash
engram compress                                    # ES compress all closets
engram compress --wing myapp                       # wing-scoped
engram compress --wing myapp --room auth           # room-scoped
```
Knowledge Graph
```bash
engram kg query "Kai"
engram kg query "Kai" --all                        # include expired triples
engram kg add "Kai" works_on "Orion" --from 2025-06-01
engram kg invalidate "Kai" works_on "Orion" --ended 2026-03-01
engram kg timeline "auth-migration"
```
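The add / invalidate / query commands suggest triples with a validity window. A minimal sketch of that idea on top of SQLite follows. The schema and the `TemporalKG` class are assumptions made for illustration and are not the actual contents of `engram/knowledge_graph.py`. As a simplification, any triple with an end date is treated as expired, regardless of whether that date is in the past.

```python
import sqlite3
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str, Optional[str], Optional[str]]


class TemporalKG:
    """Toy temporal triple store: each triple carries valid_from / valid_to dates."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS triples ("
            "subject TEXT, predicate TEXT, object TEXT, "
            "valid_from TEXT, valid_to TEXT)"
        )

    def add(self, subject: str, predicate: str, obj: str,
            valid_from: Optional[str] = None) -> None:
        self.db.execute(
            "INSERT INTO triples VALUES (?, ?, ?, ?, NULL)",
            (subject, predicate, obj, valid_from),
        )
        self.db.commit()

    def invalidate(self, subject: str, predicate: str, obj: str, ended: str) -> None:
        """Close the validity window of a currently open triple."""
        self.db.execute(
            "UPDATE triples SET valid_to = ? "
            "WHERE subject = ? AND predicate = ? AND object = ? AND valid_to IS NULL",
            (ended, subject, predicate, obj),
        )
        self.db.commit()

    def query(self, entity: str, include_expired: bool = False) -> List[Triple]:
        """Return triples mentioning the entity; expired ones only on request."""
        sql = (
            "SELECT subject, predicate, object, valid_from, valid_to "
            "FROM triples WHERE (subject = ? OR object = ?)"
        )
        if not include_expired:
            sql += " AND valid_to IS NULL"
        return self.db.execute(sql + " ORDER BY valid_from", (entity, entity)).fetchall()
```

In this sketch, `query("Kai")` mirrors `engram kg query "Kai"` and `query("Kai", include_expired=True)` mirrors the `--all` flag.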
Maintenance
```bash
engram conflicts                                   # interactive TUI conflict resolver
engram audit                                       # health check
engram audit --fix                                 # auto-resolve safe issues
engram replay --room auth-migration                # chronological room story
engram replay --room auth-migration --wing myapp
engram status                                      # château overview
engram split <dir>                                 # split concatenated transcripts
engram split <dir> --dry-run
```
Configuration
~/.engram/config.json
```json
{
  "palace_path": "~/.engram/palace",
  "vector_backend": "chromadb",
  "decay_factor": 0.005,
  "decay_max_days": 90,
  "collection_name": "engram_drawers",
  "people_map": {}
}
```
| Key | Default | Description |
|---|---|---|
| palace_path | ~/.engram/palace | Root of the château filesystem |
| vector_backend | chromadb | One of chromadb, faiss, sqlitevec |
| decay_factor | 0.005 | Recency boost per day: score × (1 + factor × days), where days = max(0, decay_max_days − age_days) |
| decay_max_days | 90 | Age in days at which the recency boost reaches zero |
| collection_name | engram_drawers | ChromaDB collection name |
~/.engram/identity.txt — plain text, becomes your L0 context.
~/.engram/wing_config.json — generated by engram init.
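A config file only needs to list the keys it overrides if the loader fills in the defaults shown above. The sketch below illustrates one way to do that. `load_config` is a hypothetical helper and not necessarily how `engram/config.py` behaves. The backend check and the `~` expansion are assumptions added for the example.

```python
import copy
import json
from pathlib import Path

# Defaults copied from the config.json example above.
DEFAULTS = {
    "palace_path": "~/.engram/palace",
    "vector_backend": "chromadb",
    "decay_factor": 0.005,
    "decay_max_days": 90,
    "collection_name": "engram_drawers",
    "people_map": {},
}
VALID_BACKENDS = {"chromadb", "faiss", "sqlitevec"}


def load_config(path: Path) -> dict:
    """Merge user overrides from a JSON file over the defaults and validate them."""
    config = copy.deepcopy(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))

    if config["vector_backend"] not in VALID_BACKENDS:
        raise ValueError(f"unknown vector_backend: {config['vector_backend']}")

    # JSON cannot expand "~", so resolve it to an absolute path here.
    config["palace_path"] = str(Path(config["palace_path"]).expanduser())
    return config


if __name__ == "__main__":
    print(load_config(Path.home() / ".engram" / "config.json"))
```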
Module Reference
| File | Description |
|---|---|
| engram/palace.py | Wing/Room/Hall/Closet/Drawer data model |
| engram/config.py | Config loading, ~/.engram/ management |
| engram/shorthand.py | Engram Shorthand (ES) compression dialect |
| engram/knowledge_graph.py | Temporal KG, SQLite backend |
| engram/miner.py | Project file ingest pipeline |
| engram/convo_miner.py | Conversation export ingest (Claude, ChatGPT, Slack) |
| engram/searcher.py | Semantic search + recency weighting |
| engram/layers.py | L0–L3 memory stack |
| engram/watcher.py | FSEvents/inotify watch mode |
| engram/conflict.py | Contradiction detection + TUI resolver |
| engram/audit.py | Memory health audit |
| engram/replay.py | Session/room replay |
| engram/agents.py | Specialist agent diary system |
| engram/palace_graph.py | Room navigation graph |
| engram/onboarding.py | Guided init + ES bootstrap |
| engram/cli.py | Typer CLI entry point |
| engram/mcp_server.py | MCP server (22 tools) |
| engram/backends/base.py | Abstract VectorBackend interface |
| engram/backends/chromadb_backend.py | ChromaDB backend (default) |
| engram/backends/faiss_backend.py | FAISS backend (speed-optimised) |
| engram/backends/sqlitevec_backend.py | sqlite-vec backend (zero-dependency) |
| engram/plugins/obsidian.py | Obsidian vault plugin miner |
| engram/plugins/notion.py | Notion export plugin miner |
| engram/plugins/linear.py | Linear issues plugin miner |
| engram/hooks/engram_save_hook.sh | Claude Code auto-save hook |
| engram/hooks/engram_precompact_hook.sh | Claude Code pre-compact hook |
Engram Shorthand (ES)
ES is a lossless compression dialect that any LLM can read without a decoder.
```python
from engram.shorthand import compress, decompress

text = (
    "The authentication module is a critical component that has a dependency "
    "on the database and is responsible for verifying user credentials."
)
es = compress(text, confidence=4)
# → "auth module:★★ component + dependency db & responsible verifying user credentials [★★★★]"

decompress(es)
# → expands symbols back to natural language

# Code-aware compression
code = "def authenticate(user: str, token: str) -> bool:"
compress(code, is_code=True)
# → "fn:authenticate(user:str,token:str)->bool"

# Diff notation
compress("+add_middleware()\n-manual_verify()", is_diff=True, diff_filename="auth.py")
# → "CHANGE:auth.py add:add_middleware() rm:manual_verify()"
```
Recency Weighting
final_score = semantic_score × (1 + recency_boost)
recency_boost = decay_factor × max(0, decay_max_days − age_days)
Pinned drawers (engram_pin) bypass decay entirely.
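The formula is easy to check by hand. With the default `decay_factor` of 0.005 and `decay_max_days` of 90, a brand-new memory gets a boost of 0.005 × 90 = 0.45, a 30-day-old one gets 0.005 × 60 = 0.30, and anything 90 days or older gets none. The sketch below encodes the same arithmetic. Treating a pinned drawer as if its age were zero is one interpretation of "bypass decay" and is an assumption here, not confirmed behaviour.

```python
def recency_boost(age_days: float, decay_factor: float = 0.005,
                  decay_max_days: float = 90) -> float:
    """recency_boost = decay_factor * max(0, decay_max_days - age_days)."""
    return decay_factor * max(0.0, decay_max_days - age_days)


def final_score(semantic_score: float, age_days: float, pinned: bool = False,
                decay_factor: float = 0.005, decay_max_days: float = 90) -> float:
    """final_score = semantic_score * (1 + recency_boost)."""
    # Assumption: a pinned drawer keeps the full boost, as if it were brand new.
    effective_age = 0.0 if pinned else age_days
    boost = recency_boost(effective_age, decay_factor, decay_max_days)
    return semantic_score * (1.0 + boost)


if __name__ == "__main__":
    for age in (0, 30, 90, 365):
        print(age, round(final_score(0.8, age), 4))
```

Under these defaults the boost only ever raises a score, by at most 45%, so an old but highly relevant memory can still outrank a recent weak match.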
Claude Code Hooks
Copy hooks to ~/.engram/hooks/ then add to ~/.claude/settings.json:
```json
{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Write|Edit",
      "hooks": [{"type": "command", "command": "~/.engram/hooks/engram_save_hook.sh"}]
    }],
    "PreCompact": [{
      "hooks": [{"type": "command", "command": "~/.engram/hooks/engram_precompact_hook.sh"}]
    }]
  }
}
```
Set ENGRAM_WING=myapp and ENGRAM_ROOM=current-task in your environment.
Requirements
- Python 3.9+
- chromadb ≥ 0.4.0
- typer ≥ 0.9.0
- rich ≥ 13.0.0
- watchdog ≥ 3.0.0
- questionary ≥ 2.0.0
- pyyaml ≥ 6.0
Optional:
- faiss-cpu: FAISS backend
- sqlite-vec: sqlite-vec backend
No API key. No internet after install.
Contributing
- Fork the repo
- pip install -e ".[dev]"
- pre-commit install
- Run tests: pytest tests/ -v
- Submit a PR
License
MIT © 2026 Tushae Thomas