Memory Palace
Persistent semantic memory for AI agents, enabling storage, semantic search, knowledge graph connections, and inter-instance messaging across conversations using local models via Ollama.
Persistent semantic memory for AI agents. Store facts, decisions, insights, and context across conversations. Search by meaning, not just keywords. Build a knowledge graph of connected memories. Share across models, instances, and providers.
The Problem
Every AI session starts as a blank slate. Context windows are finite. Sessions end, knowledge dies.
Current solutions are all vendor-locked: ChatGPT's memory only works with OpenAI. Anthropic's projects only work with Claude. Switch providers and you start over. Your accumulated context — decisions, preferences, project history — belongs to the vendor, not to you.
Meanwhile, the industry races to build bigger context windows. 128K. 200K. 1M tokens. But you don't solve human amnesia by giving someone a bigger whiteboard. Memory doesn't belong inside the model — it belongs alongside it.
Memory Palace is a persistent semantic memory layer that any MCP-compatible AI can access. It separates memory from the model, the same way databases separated data from applications decades ago. The context window becomes working memory. Memory Palace is long-term storage. That's how actual brains work.
Quick Start
# Clone and install
git clone https://github.com/jeffpierce/memory-palace.git
cd memory-palace
pip install -e .
# Run setup wizard (detects GPU, downloads models)
python -m setup.first_run
Platform-specific installers are also available: install.bat / install.ps1 (Windows), ./install.sh (macOS/Linux).
See docs/README.md for detailed installation and configuration instructions.
Requirements
- Python 3.10+
- Ollama (local model serving)
- NVIDIA GPU with 4GB+ VRAM (recommended, not required)
Model Selection
Models are auto-detected by the setup wizard. Defaults are chosen to run everywhere:
| VRAM | Embedding | LLM | Quality |
|---|---|---|---|
| Any (CPU ok) | nomic-embed-text | qwen3:1.7b | Good — runs on anything |
| 6-8GB | nomic-embed-text | qwen3:8b | Better reasoning |
| 12GB+ | snowflake-arctic-embed | qwen3:14b | Best extraction quality |
See docs/models.md for the full model guide with VRAM budgets and upgrade paths.
Features
- Semantic Search — Find memories by meaning using local embedding models via Ollama
- Knowledge Graph — Typed, weighted, directional edges between memories with automatic graph context in retrieval
- Centrality-Weighted Ranking — Retrieval scores combine semantic similarity, access frequency, and graph centrality
- Auto-Linking — New memories automatically link to similar existing ones (configurable thresholds)
- Multi-Project Support — Memories can belong to multiple projects simultaneously
- Foundational Memories — Core memories protected from archival and decay
- Code Indexing — Index source files as prose descriptions for natural language code search
- Inter-Instance Messaging — Unified pub/sub messaging between AI instances with channels, priorities, and push notifications via OpenClaw gateway wake
- Transcript Reflection — Automatically extract memories from conversation logs
- Named Databases — Domain partitions (life, work, per-project) with auto-derivation and runtime management
- OpenClaw Native Plugin — 13 tools registered directly with the gateway, zero MCP overhead, real-time pubsub wake
- Multi-Backend — SQLite for personal use, PostgreSQL + pgvector for teams
- Local Processing — All embeddings, extraction, and synthesis run locally via Ollama
- MCP Integration — Works natively with any MCP-compatible client (Claude Desktop, Claude Code, etc.)
- TOON Encoding — Token-efficient structured responses that reduce context window usage
How Does It Compare?
The MCP memory space is active. Here's how Memory Palace stacks up against the most capable alternatives:
| Feature | Memory Palace | Mem0 | Cognee | Memento | Zep/Graphiti | doobidoo |
|---|---|---|---|---|---|---|
| Persistent memory | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Semantic search | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Knowledge graph (typed edges) | ✅ | Partial | ✅ | ✅ | ✅ | ❌ |
| Centrality-weighted ranking | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Multi-instance messaging | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Lifecycle management (audit, dedup, contradictions) | ✅ | Partial | Partial | ❌ | Partial | Partial |
| Semantic code search (prose-based) | ✅ | ❌ | AST-based | ❌ | ❌ | ❌ |
| Transcript extraction | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Fully local (no cloud LLMs) | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| MCP native | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
What's actually different
Semantic code search that isn't grep or AST parsing. code_remember uses a local LLM to generate a prose description of what a source file does, embeds that prose, and stores the raw code separately. When you search "how does authentication work", you're matching against natural language descriptions, not token sequences or syntax trees. The only other tool using this technique is Greptile (cloud SaaS, $30/dev/mo). Everyone else — Cursor, Sourcegraph, Cognee, GitLab — embeds raw code chunks or parses ASTs.
Multi-instance messaging built into the memory layer. Different AI instances (desktop app, code editor, web) can send typed messages to each other through the palace — handoffs, status updates, questions, context sharing. No other MCP memory server has this. Agent orchestration frameworks (A2A, Agent-MCP) exist but they're not memory systems.
Everything runs locally. Embeddings, search synthesis, relationship classification, transcript extraction — all via Ollama on your hardware. Most competitors require cloud LLM calls for at least some operations.
Tools (13)
Core Memory
| Tool | Description |
|---|---|
| memory_set | Store a new memory with optional auto-linking to similar memories |
| memory_recall | Semantic search with centrality-weighted ranking and graph context |
| memory_get | Retrieve memories by ID with optional graph traversal (BFS) |
| memory_recent | Get the last X memories: title-card format by default, verbose on request |
| memory_archive | Archive memories with foundational/centrality protection (soft delete) |
Knowledge Graph
| Tool | Description |
|---|---|
| memory_link | Create a typed, weighted, optionally bidirectional edge between memories |
| memory_unlink | Remove edges between memories |
Standard relationship types: relates_to, derived_from, contradicts, exemplifies, refines, supersedes. Custom types are also supported.
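As a rough illustration of typed, weighted, optionally bidirectional edges and the breadth-first traversal mentioned for memory_get, here is a minimal in-memory sketch. The adjacency-dict layout and the helper names `add_edge` and `bfs_depths` are assumptions made for this example, not the project's actual storage model or API.

```python
from collections import deque

# Relationship types listed above; custom types are allowed too.
STANDARD_TYPES = {
    "relates_to", "derived_from", "contradicts",
    "exemplifies", "refines", "supersedes",
}

def add_edge(adjacency, source, target, relation, weight=1.0, bidirectional=False):
    """Record a typed, weighted edge; optionally mirror it in the other direction."""
    adjacency.setdefault(source, []).append((target, relation, weight))
    if bidirectional:
        adjacency.setdefault(target, []).append((source, relation, weight))

def bfs_depths(adjacency, start, max_depth):
    """Return {memory_id: depth} for everything reachable within max_depth hops."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for target, _relation, _weight in adjacency.get(node, []):
            if target not in depths:
                depths[target] = depths[node] + 1
                queue.append(target)
    return depths

graph = {}
add_edge(graph, 1, 2, "refines", weight=0.9)
add_edge(graph, 2, 3, "supersedes")
add_edge(graph, 3, 4, "relates_to", bidirectional=True)
reachable = bfs_depths(graph, start=1, max_depth=2)
```

With `max_depth=2`, memory 4 is left out because it is three hops from memory 1.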
Messaging
| Tool | Description |
|---|---|
| message | Unified inter-instance messaging: send, get, mark read/unread, subscribe to channels |
Replaces the old handoff_send / handoff_get / handoff_mark_read tools with a single action-based interface supporting channels, priorities (0-10), and pub/sub patterns.
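To make the channel, priority, and read-state ideas concrete, here is a small in-memory sketch of a pub/sub inbox. It is not the message tool's real interface: the `MessageBus` class, its method names, and the choice to treat higher numbers as more urgent are all assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    sender: str
    channel: str
    body: str
    priority: int = 5
    read_by: set = field(default_factory=set)

class MessageBus:
    """Toy pub/sub bus: instances subscribe to channels and poll for messages."""

    def __init__(self):
        self.subscriptions = {}  # channel name -> set of instance ids
        self.messages = []

    def subscribe(self, instance_id, channel):
        self.subscriptions.setdefault(channel, set()).add(instance_id)

    def send(self, sender, channel, body, priority=5):
        if not 0 <= priority <= 10:
            raise ValueError("priority must be between 0 and 10")
        message = Message(sender, channel, body, priority)
        self.messages.append(message)
        return message

    def get(self, instance_id, unread_only=True):
        visible = [
            m for m in self.messages
            if instance_id in self.subscriptions.get(m.channel, set())
            and m.sender != instance_id
            and not (unread_only and instance_id in m.read_by)
        ]
        # Assumption: a higher priority number means more urgent.
        return sorted(visible, key=lambda m: -m.priority)

    def mark_read(self, instance_id, message):
        message.read_by.add(instance_id)

bus = MessageBus()
bus.subscribe("desktop", "handoffs")
bus.send("code", "handoffs", "Tests are green", priority=2)
bus.send("code", "handoffs", "Migration needs review", priority=9)
inbox = bus.get("desktop")
```

Read state is tracked per recipient here, so one instance marking a message read does not hide it from other subscribers.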
Code Indexing
| Tool | Description |
|---|---|
| code_remember_tool | Index a source file into the palace (creates linked prose + code memories) |
Queries hit the prose description via semantic search, then graph traversal retrieves the actual source code. This separation produces far better search results than embedding raw code.
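The two-step lookup described here (match the prose, then follow the edge to the code) can be sketched with a toy index. Everything below is illustrative: `CodeIndex` is a hypothetical class, the `describe` callback stands in for the local LLM, and word overlap stands in for embedding similarity.

```python
def word_overlap(query, text):
    """Jaccard overlap of lowercase words; a stand-in for embedding similarity."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if (q | t) else 0.0

class CodeIndex:
    def __init__(self):
        self.prose = {}  # prose memory id -> natural language description
        self.code = {}   # code memory id -> raw source text
        self.edges = {}  # prose memory id -> linked code memory id

    def index_file(self, path, source, describe):
        """Store a prose description and the raw source as two linked memories."""
        prose_id, code_id = f"prose:{path}", f"code:{path}"
        self.prose[prose_id] = describe(source)
        self.code[code_id] = source
        self.edges[prose_id] = code_id
        return prose_id, code_id

    def search(self, query):
        """Match the query against prose only, then follow the edge to the code."""
        if not self.prose:
            return None
        best = max(self.prose, key=lambda pid: word_overlap(query, self.prose[pid]))
        return self.code[self.edges[best]]

index = CodeIndex()
index.index_file("auth.py", "def login(user): ...",
                 describe=lambda src: "Handles user authentication and login sessions")
index.index_file("invoice.py", "def render(invoice): ...",
                 describe=lambda src: "Renders invoices as PDF files")
hit = index.search("how does authentication work")
```

The query never touches the raw source text: only the prose descriptions are scored, and the code is reached by following the stored link.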
Maintenance
| Tool | Description |
|---|---|
| memory_audit | Health checks: duplicates, stale memories, orphan edges, contradictions, missing embeddings |
| memory_reembed | Regenerate embeddings (backfill missing, refresh stale, re-embed after model change) |
| memory_stats | Overview statistics: counts by type, instance, project, most accessed, recently added |
Processing
| Tool | Description |
|---|---|
| memory_reflect | Extract memories from conversation transcripts (JSONL or TOON format) |
Key Concepts
Graph Context in Retrieval
Both memory_recall and memory_get automatically include depth-1 graph context (incoming/outgoing edges) by default. This shows how memories connect without separate graph traversal calls.
- memory_recall: Graph context for the top N results (default 5, configurable via graph_top_n)
- memory_get: Graph context for ALL requested memories (targeted fetches get full context)
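A depth-1 graph context is just the set of edges that touch a memory. The sketch below shows one way that could be attached to the top N ranked results; the `Edge` dataclass and both helper functions are hypothetical, and only the `graph_top_n` name and its default of 5 come from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source_id: int
    target_id: int
    relation: str
    weight: float = 1.0

def graph_context(memory_id, edges):
    """Depth-1 context: edges leaving and entering a single memory."""
    return {
        "outgoing": [e for e in edges if e.source_id == memory_id],
        "incoming": [e for e in edges if e.target_id == memory_id],
    }

def attach_context(ranked_ids, edges, graph_top_n=5):
    """Attach context only to the first graph_top_n results, in ranked order."""
    return {mid: graph_context(mid, edges) for mid in ranked_ids[:graph_top_n]}

edges = [
    Edge(1, 2, "relates_to"),
    Edge(3, 1, "refines", weight=0.8),
    Edge(2, 3, "supersedes"),
]
context = attach_context([1, 2, 3], edges, graph_top_n=2)
```

A memory_get-style fetch would correspond to calling `graph_context` for every requested ID rather than only the top few.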
Auto-Linking
When storing a new memory, the system automatically finds similar existing memories and creates typed edges:
- Auto-linked (>= 0.75 similarity) — Edges created automatically with LLM-classified relationship types
- Suggested (0.675-0.75 similarity) — Surfaced for human review, no edges created
Configurable per-instance. Can be scoped to same-project only.
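The two thresholds amount to a three-way decision per candidate. A minimal sketch follows, using the threshold values documented above; the function name `classify_candidate`, the returned labels, and the treatment of exactly 0.675 as "suggested" are assumptions.

```python
LINK_THRESHOLD = 0.75      # at or above: create an edge automatically
SUGGEST_THRESHOLD = 0.675  # at or above (but below the link threshold): suggest only

def classify_candidate(similarity,
                       link_threshold=LINK_THRESHOLD,
                       suggest_threshold=SUGGEST_THRESHOLD):
    """Decide what to do with one similar existing memory."""
    if similarity >= link_threshold:
        return "auto_link"
    if similarity >= suggest_threshold:
        return "suggest"
    return "ignore"

# Hypothetical similarity scores for three existing memories.
candidates = {"mem-12": 0.91, "mem-40": 0.70, "mem-77": 0.42}
decisions = {mid: classify_candidate(score) for mid, score in candidates.items()}
```

Raising `link_threshold` trades fewer automatic edges for more items left to human review.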
Multi-Project Memories
Memories can belong to one or more projects simultaneously. Queries can filter by a single project (the memory's project list contains it) or by multiple projects (the union of memories in any of them). Stats count a multi-project memory once under each of its projects, so per-project counts stay accurate.
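The filtering and counting rules can be shown with plain lists. This is a sketch over in-memory dicts, not the project's query layer; the record shape and the function names are assumptions.

```python
from collections import Counter

memories = [
    {"id": 1, "projects": ["palace"]},
    {"id": 2, "projects": ["palace", "website"]},
    {"id": 3, "projects": ["website"]},
    {"id": 4, "projects": ["taxes"]},
]

def filter_by_projects(items, projects):
    """Keep memories that belong to at least one requested project (union)."""
    wanted = set(projects)
    return [m for m in items if wanted & set(m["projects"])]

def count_by_project(items):
    """Count each memory once under every project it belongs to."""
    counts = Counter()
    for m in items:
        counts.update(m["projects"])
    return dict(counts)

single = filter_by_projects(memories, ["palace"])
union = filter_by_projects(memories, ["palace", "taxes"])
per_project = count_by_project(memories)
```

Because memory 2 is counted under both of its projects, the per-project totals can add up to more than the number of memories.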
Centrality-Weighted Ranking
Recall results are ranked by a weighted combination of:
score = (semantic_similarity × 0.7) + (log(access_count + 1) × 0.15) + (in_degree_centrality × 0.15)
Frequently accessed, well-connected memories rank higher than isolated ones at the same similarity score.
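The formula translates directly into a few lines of Python. The weights are the ones given above; the function name is hypothetical, and the natural logarithm is an assumption because the formula does not state a base.

```python
import math

W_SIMILARITY = 0.7
W_ACCESS = 0.15
W_CENTRALITY = 0.15

def recall_score(semantic_similarity, access_count, in_degree_centrality):
    """Weighted sum of similarity, log-damped access count, and in-degree centrality."""
    return (
        semantic_similarity * W_SIMILARITY
        + math.log(access_count + 1) * W_ACCESS  # assumption: natural log
        + in_degree_centrality * W_CENTRALITY
    )

# Two memories with the same similarity but different usage and connectivity.
isolated = recall_score(0.80, access_count=0, in_degree_centrality=0.0)
connected = recall_score(0.80, access_count=9, in_degree_centrality=0.5)
```

The logarithm damps the access term, so going from 0 to 9 accesses adds far more to the score than going from 90 to 99.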
Configuration
Configuration loads from ~/.memory-palace/config.json with environment variable overrides.
| Variable | Description | Default |
|---|---|---|
| MEMORY_PALACE_DATA_DIR | Data directory | ~/.memory-palace |
| MEMORY_PALACE_DATABASE_URL | Database connection URL (overrides config file) | None |
| OLLAMA_HOST | Ollama server URL | http://localhost:11434 |
| MEMORY_PALACE_EMBEDDING_MODEL | Embedding model | Auto-detected |
| MEMORY_PALACE_LLM_MODEL | LLM for synthesis/extraction | Auto-detected |
| MEMORY_PALACE_INSTANCE_ID | Default instance ID | unknown |
| MEMORY_PALACE_NOTIFY_COMMAND | Post-send notification command template | None |
| MEMORY_PALACE_INSTANCE_ROUTES | Instance route map (JSON string) for push notifications | {} |
{
"database": {
"type": "postgres",
"url": "postgresql://localhost:5432/memory_palace"
},
"databases": {
"default": {"type": "postgres", "url": "postgresql://localhost:5432/memory_palace"},
"life": {"type": "postgres", "url": "postgresql://localhost:5432/memory_palace_life"}
},
"default_database": "default",
"ollama_url": "http://localhost:11434",
"embedding_model": null,
"llm_model": null,
"synthesis": {
"enabled": true
},
"auto_link": {
"enabled": true,
"link_threshold": 0.75,
"suggest_threshold": 0.675
},
"toon_output": true,
"extensions": ["mcp_server.extensions.db_manager"],
"instances": ["support", "engineering", "analytics"],
"notify_command": null,
"instance_routes": {
"support": {
"gateway": "http://localhost:18789",
"token": "your-gateway-token-here"
}
}
}
The databases key enables named databases (domain partitions). If absent, the single database key is used. See docs/POSTGRES.md for PostgreSQL setup and docs/architecture.md for backend details.
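The precedence rules (environment over file, named databases over the single database key) can be sketched as two small functions. The pairing of environment variables with config keys below is an assumption inferred from their names, and both function names are hypothetical; the real loader may differ.

```python
from typing import Mapping, Optional

# Assumed pairing of environment variables with config.json keys.
ENV_TO_KEY = {
    "OLLAMA_HOST": "ollama_url",
    "MEMORY_PALACE_EMBEDDING_MODEL": "embedding_model",
    "MEMORY_PALACE_LLM_MODEL": "llm_model",
}

def apply_env_overrides(config: dict, env: Mapping[str, str]) -> dict:
    """Return a copy of config where any set environment variable wins."""
    merged = dict(config)
    for variable, key in ENV_TO_KEY.items():
        value = env.get(variable)
        if value:
            merged[key] = value
    return merged

def resolve_database(config: dict, name: Optional[str] = None) -> dict:
    """Pick a named database if "databases" is present, else the single "database"."""
    databases = config.get("databases")
    if databases:
        return databases[name or config.get("default_database", "default")]
    return config["database"]

file_config = {
    "database": {"type": "postgres", "url": "postgresql://localhost:5432/memory_palace"},
    "databases": {
        "default": {"type": "postgres", "url": "postgresql://localhost:5432/memory_palace"},
        "life": {"type": "postgres", "url": "postgresql://localhost:5432/memory_palace_life"},
    },
    "default_database": "default",
    "ollama_url": "http://localhost:11434",
}
effective = apply_env_overrides(file_config, {"OLLAMA_HOST": "http://gpu-box:11434"})
life_db = resolve_database(effective, "life")
```

In practice `os.environ` would be passed as `env`; taking it as a parameter keeps the function easy to test.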
Architecture
memory-palace/
├── mcp_server/ # MCP server package
│ ├── server.py # Server entry point
│ ├── toon_wrapper.py # TOON response encoding
│ ├── tools/ # 13 core tool implementations
│ └── extensions/ # Extension tools (db_manager, switch_db)
├── memory_palace/ # Core library
│ ├── models_v3.py # SQLAlchemy models (Memory, MemoryEdge, Message)
│ ├── database_v3.py # Named engine registry (SQLite / PostgreSQL)
│ ├── bridge.py # OpenClaw bridge subprocess (NDJSON protocol)
│ ├── embeddings.py # Ollama embedding client
│ ├── llm.py # LLM integration (synthesis, classification)
│ ├── config_v2.py # Configuration with named databases + auto-link
│ ├── services/ # Business logic layer
│ │ ├── memory_service.py # remember, recall, archive, stats
│ │ ├── graph_service.py # link, unlink, traverse
│ │ ├── message_service.py # pub/sub messaging
│ │ ├── maintenance_service.py # audit, reembed, cleanup
│ │ ├── code_service.py # code indexing + retrieval
│ │ └── reflection_service.py # transcript extraction
│ └── migrations/ # Schema migration scripts
├── openclaw_plugin/ # Native OpenClaw plugin (TypeScript)
│ ├── src/index.ts # Plugin registration + session registry + wake dispatch
│ ├── src/bridge.ts # PalaceBridge class (NDJSON over stdin/stdout)
│ └── src/tools/ # 13 tool definitions mapped to bridge methods
├── setup/ # Setup wizard
├── extensions/ # Optional extensions (Moltbook gateway, TOON converter)
├── examples/ # Integration examples and walkthroughs
├── docs/ # Documentation
└── tests/ # Test suite
See docs/architecture.md for the full design vision, knowledge graph details, and scaling roadmap.
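The bridge files in the tree above mention NDJSON over stdin/stdout, which means one JSON object per line. The sketch below shows that framing only; the message fields (`id`, `method`, `params`) are hypothetical and are not taken from the actual bridge protocol.

```python
import io
import json

def write_message(stream, message):
    """Serialize one message as a single line of JSON followed by a newline."""
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()

def read_messages(stream):
    """Yield one decoded object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# Round-trip through an in-memory buffer instead of real stdin/stdout.
buffer = io.StringIO()
write_message(buffer, {"id": 1, "method": "memory_recall", "params": {"query": "auth"}})
write_message(buffer, {"id": 2, "method": "memory_stats", "params": {}})
buffer.seek(0)
decoded = list(read_messages(buffer))
```

Because `json.dumps` escapes newlines inside strings, each message is guaranteed to occupy exactly one line.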
Extensions
Memory Palace includes optional extensions that operate as standalone tools or additional MCP servers:
| Extension | Description | Type |
|---|---|---|
| moltbook-gateway | Standalone MCP server for Moltbook submission with 6 mechanical interlocks | MCP Server |
| toon-converter | CLI + optional MCP server for converting JSONL to TOON format | CLI / MCP Server |
Extensions are independent from the core Memory Palace server and can be used separately.
Examples
| Example | Description |
|---|---|
| agent-prompt.md | Template for adding memory instructions to agent system prompts |
| soul-file.md | Template for integrating memory into character/persona files |
| centrality_weighted_search.py | Python example of centrality-weighted search |
| test_graph_context_mcp.md | Walkthrough for testing graph context via MCP |
| test_maintenance_mcp.md | Walkthrough for testing maintenance tools via MCP |
Documentation
| Document | Description |
|---|---|
| docs/README.md | Detailed installation, configuration, and usage guide |
| docs/OPENCLAW.md | OpenClaw native plugin guide — bridge protocol, pubsub wake, session discovery |
| docs/POSTGRES.md | PostgreSQL setup, named databases, LISTEN/NOTIFY mechanics |
| docs/architecture.md | Design vision, knowledge graph, backends, scaling roadmap |
| docs/models.md | Model selection guide with VRAM budgets |
| docs/use-cases.md | Real-world use cases from personal to enterprise |
| docs/centrality-weighted-retrieval.md | Centrality ranking deep-dive |
| docs/QUICKSTART_CENTRALITY.md | Centrality-weighted retrieval quickstart |
| docs/MAINTENANCE.md | Maintenance design document |
| docs/MAINTENANCE_QUICKREF.md | Maintenance quick reference |
| docs/TESTING_MAINTENANCE.md | Testing maintenance tools guide |
| docs/MIGRATION_2.0.md | v1.0 to v2.0 migration guide |
License