<img src="assets/muninn_banner.jpeg" alt="Muninn — Persistent Memory MCP" width="100%"/>
Muninn
"Muninn flies each day over the world to bring Odin knowledge of what happens." — Prose Edda
Local-first persistent memory infrastructure for coding agents and MCP-compatible tools.
Muninn provides deterministic, explainable memory retrieval with robust transport behavior and production-grade operational controls. Designed for long-running development workflows where continuity, auditability, and measurable quality matter — across sessions, across assistants, and across projects.
🚩 Status
- Current Version: v3.24.0 (Phase 26 complete)
- Stability: Production Beta
- Test Suite: 1422+ passing, 0 failing
What's New in v3.24.0
- Cognitive Architecture (CoALA): Integration of a proactive reasoning loop bridging memory with active decision-making.
- Knowledge Distillation: Background synthesis of episodic memories into structured semantic manuals for long-term wisdom.
- Epistemic Foraging: Active inference-driven search to resolve ambiguities and fill information gaps autonomously.
- Omission Filtering: Automated detection of missing context required for successful task execution.
- Elo-Rated SNIPS Governance: Dynamic memory retention system mapping retrieval success to Elo ratings for usage-driven decay.
Previous Milestones
| Version | Phase | Key Feature |
|---|---|---|
| v3.24.0 | 26 | Cognitive Architecture Complete |
| v3.23.0 | 23 | Elo-Rated SNIPS Governance |
| v3.22.0 | 22 | Temporal Knowledge Graph |
| v3.19.0 | 20 | Multimodal Hive Mind Operations |
| v3.18.3 | 19 | Bulk legacy import, NLI conflict detection, uncapped discovery |
| v3.18.1 | 19 | Scout synthesis, hunt mode |
🚀 Features
Core Memory Engine
- Local-First: Zero cloud dependency — all data stays on your machine
- Multimodal: Native support for Text, Image, Audio, Video, and Sensor data
- 5-Signal Hybrid Retrieval: Dense vector · BM25 lexical · Graph traversal · Temporal relevance · Goal relevance
- Explainable Recall Traces: Per-signal score attribution on every search result
- Bi-Temporal Reasoning: Support for "Valid Time" vs "Transaction Time" via Temporal Knowledge Graph
- Project Isolation: `scope="project"` memories never cross repo boundaries; `scope="global"` memories are always available
- Cross-Session Continuity: Memories survive session ends, assistant switches, and tool restarts
- Bi-Temporal Records: `created_at` (real-world event time) vs `ingested_at` (system intake time)
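The 5-signal retrieval and per-signal recall traces listed above can be pictured as a weighted score fusion. The sketch below is an illustrative assumption: the signal names come from this README, but the weights, normalization, and fusion rule are not specified here and Muninn's real logic may differ.

```python
# Illustrative sketch of 5-signal hybrid scoring with per-signal
# attribution (a "recall trace"). Weights and values are hypothetical.
SIGNALS = ("dense", "bm25", "graph", "temporal", "goal")

def hybrid_score(signal_scores: dict, weights: dict):
    """Combine normalized per-signal scores into one ranking score,
    returning the weighted contributions as an explainable trace."""
    trace = {s: weights[s] * signal_scores.get(s, 0.0) for s in SIGNALS}
    return sum(trace.values()), trace

score, trace = hybrid_score(
    {"dense": 0.82, "bm25": 0.40, "graph": 0.10, "temporal": 0.65, "goal": 0.30},
    {"dense": 0.4, "bm25": 0.2, "graph": 0.1, "temporal": 0.2, "goal": 0.1},
)
```

Returning the `trace` alongside the score is what makes each result explainable: every ranked hit carries its per-signal contributions.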
Memory Lifecycle
- Elo-Rated Governance: Dynamic retention driven by retrieval feedback (SNIPS) and usage statistics
- Consolidation Daemon: Background process for decay, deduplication, promotion, and shadowing — inspired by sleep consolidation
- Zero-Trust Ingestion: Isolated subprocess parsing for PDF/DOCX to neutralize document-based exploits
- ColBERT Multi-Vector: Native Qdrant multi-vector storage for MaxSim scoring
- NL Temporal Query Expansion: Natural-language time phrases ("last week", "before the refactor") parsed into structured time ranges
- Goal Compass: Retrieval signal for project objectives and constraint drift
- NLI Conflict Detection: Transformer-based contradiction detection (`cross-encoder/nli-deberta-v3-small`) for memory integrity
- Bulk Legacy Import: One-click ingestion of all discovered legacy sources (batched, error-isolated) via dashboard or API
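The README names Elo-rated governance but not the exact update rule. As one plausible reading, a standard Elo update keyed to retrieval outcomes could drive usage-based retention; the K-factor, the 1000-point baseline, and the fixed opponent rating below are assumptions for illustration.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Logistic expectation used by standard Elo."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def elo_update(rating: float, outcome: float, expected: float, k: float = 32.0) -> float:
    """Elo update: outcome is 1.0 when the agent marked a retrieved
    memory useful (SNIPS feedback), 0.0 when it rejected it."""
    return rating + k * (outcome - expected)

# A memory at the baseline that keeps winning retrievals drifts upward;
# memories that never win decay toward a pruning threshold.
r = 1000.0
for _ in range(3):
    r = elo_update(r, 1.0, expected_score(r, 1000.0))
```

Under this reading, decay is usage-driven: ratings fall only when memories are retrieved and judged unhelpful, not merely with age.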
Operational Controls
- MCP Transport Hardening: Framed + line JSON-RPC, timeout-window guardrails, protocol negotiation
- Runtime Profile Control: `get_model_profiles` / `set_model_profiles` for dynamic model routing
- Profile Audit Log: Immutable event ledger for profile policy mutations
- Browser Control Center: Web UI for search, ingestion, consolidation, and admin at `http://localhost:42069`
- OpenTelemetry: GenAI semantic convention tracing (feature-gated via `MUNINN_OTEL_ENABLED`)
Multi-Assistant Interop
- Handoff Bundles: Export/import memory checkpoints with checksum verification and idempotent replay
- Legacy Migration: Discover and import memories from prior assistant sessions (JSONL chat history, SQLite state) — uncapped provider limits
- Bulk Import: `POST /ingest/legacy/import-all` ingests all discovered sources in batches of 50 with per-batch error isolation
- Hive Mind Federation: Push-based low-latency memory synchronization across assistant runtimes
- MCP 2025-11 Compliant: Full protocol negotiation, lifecycle gating, schema annotations
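The two properties the handoff bundles above promise, checksum verification and idempotent replay, can be sketched as follows. The bundle shape, field names, and hashing choices are hypothetical; the real export format is not documented in this README.

```python
import hashlib
import json

def checksum(memories: list) -> str:
    """Hash a canonical JSON serialization of the bundle payload."""
    payload = json.dumps(memories, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def import_bundle(store: dict, bundle: dict) -> int:
    """Verify the checksum, then replay idempotently: memories
    already present (by id) are skipped, so re-importing is safe."""
    if checksum(bundle["memories"]) != bundle["checksum"]:
        raise ValueError("handoff bundle checksum mismatch")
    imported = 0
    for m in bundle["memories"]:
        if m["id"] not in store:
            store[m["id"]] = m
            imported += 1
    return imported

mems = [{"id": "m1", "content": "use typed Pydantic models"}]
bundle = {"memories": mems, "checksum": checksum(mems)}
store = {}
first = import_bundle(store, bundle)   # imports the new memory
second = import_bundle(store, bundle)  # replay is a no-op
```

Idempotent replay is what makes checkpoints safe to hand between assistants: importing the same bundle twice cannot duplicate memories.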
Quick Start
```bash
git clone https://github.com/wjohns989/Muninn.git
cd Muninn
pip install -e .
```

Set the auth token (shared between server and MCP wrapper):

```bash
# Windows (persists across sessions)
setx MUNINN_AUTH_TOKEN "your-token-here"

# Linux/macOS
export MUNINN_AUTH_TOKEN="your-token-here"
```

Start the backend:

```bash
python server.py
```

Verify it's running:

```bash
curl http://localhost:42069/health
# {"status":"ok","memory_count":0,...,"backend":"muninn-native"}
```
Runtime Modes
| Mode | Command | Description |
|---|---|---|
| Muninn MCP | `python mcp_wrapper.py` | stdio MCP server for active assistant/IDE sessions |
| Huginn Standalone | `python muninn_standalone.py` | Browser-first UX for direct ingestion/search/admin |
| REST API | `python server.py` | FastAPI backend at `http://localhost:42069` |
| Packaged App | `python scripts/build_standalone.py` | PyInstaller executable (Huginn Control Center) |
All modes use the same memory engine and data directory.
MCP Client Configuration
Claude Code (recommended — bakes auth token into registration):
```bash
claude mcp add -s user muninn \
  -e MUNINN_AUTH_TOKEN="your-token-here" \
  -- python /absolute/path/to/mcp_wrapper.py
```
Generic MCP client (claude_desktop_config.json or equivalent):
```json
{
  "mcpServers": {
    "muninn": {
      "command": "python",
      "args": ["/absolute/path/to/mcp_wrapper.py"],
      "env": {
        "MUNINN_AUTH_TOKEN": "your-token-here"
      }
    }
  }
}
```
Important: Both `server.py` and `mcp_wrapper.py` must share the same `MUNINN_AUTH_TOKEN`. If either process generates a random token (when the env var is unset), all MCP tool calls fail with 401.
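One way to guarantee both processes agree is to generate a single random token and export it before starting either one. The snippet below is a suggestion, not the project's prescribed flow; `openssl rand -hex 32` emits 32 random bytes as 64 hex characters.

```shell
# Generate one shared token (Linux/macOS); start server.py and
# mcp_wrapper.py from shells that inherit this variable.
export MUNINN_AUTH_TOKEN="$(openssl rand -hex 32)"
```

On Windows, persist the same value with `setx MUNINN_AUTH_TOKEN`, and put it in the MCP client's `env` block so the wrapper sees it too.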
MCP Tools
| Tool | Description |
|---|---|
| `add_memory` | Store a memory with optional scope, project, namespace, media_type |
| `search_memory` | Hybrid 5-signal search with media_type filtering and recall traces |
| `get_all_memories` | Paginated memory listing with filters |
| `update_memory` | Update content or metadata of an existing memory |
| `delete_memory` | Remove a memory by ID |
| `set_project_goal` | Set the current project's objective and constraints |
| `get_project_goal` | Retrieve the active project goal |
| `set_project_instruction` | Store a project-scoped rule (`scope="project"` by default) |
| `get_model_profiles` | Get active model routing profiles |
| `set_model_profiles` | Update model routing profiles |
| `get_model_profile_events` | Audit log for profile policy changes |
| `export_handoff` | Export a memory handoff bundle |
| `import_handoff` | Import a handoff bundle (idempotent) |
| `ingest_sources` | Ingest files/folders into memory |
| `discover_legacy_sources` | Find prior assistant session files for migration |
| `ingest_legacy_sources` | Import discovered legacy memories |
| `record_retrieval_feedback` | Submit outcome signal for adaptive calibration |
Python SDK
```python
from muninn import Memory

# Sync client
client = Memory(base_url="http://127.0.0.1:42069", auth_token="your-token-here")

client.add(
    content="Always use typed Pydantic models for API payloads",
    metadata={"project": "muninn", "scope": "project"}
)

results = client.search("API payload patterns", limit=5)
for r in results:
    print(r.content, r.recall_trace)
```
Async client:
```python
import asyncio

from muninn import AsyncMemory

async def main():
    async with AsyncMemory(base_url="http://127.0.0.1:42069", auth_token="your-token-here") as client:
        await client.add(content="...", metadata={})
        results = await client.search("...", limit=5)

asyncio.run(main())
```
REST API
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Server health + memory/vector/graph counts |
| POST | `/add` | Add a memory (supports media_type) |
| POST | `/search` | Hybrid search (supports media_type filtering) |
| GET | `/get_all` | Paginated memory listing |
| PUT | `/update` | Update a memory |
| DELETE | `/delete/{memory_id}` | Delete a memory |
| POST | `/ingest` | Ingest files/folders |
| POST | `/ingest/legacy/discover` | Discover legacy session files |
| POST | `/ingest/legacy/import` | Import selected legacy memories |
| POST | `/ingest/legacy/import-all` | Discover and import ALL legacy sources (batched) |
| GET | `/ingest/legacy/status` | Legacy discovery scheduler status |
| GET | `/ingest/legacy/catalog` | Paginated cached catalog of discovered sources |
| GET | `/profiles/model` | Get model routing profiles |
| POST | `/profiles/model` | Set model routing profiles |
| GET | `/profiles/model/events` | Profile audit log |
| GET | `/profile/user/get` | Get user profile |
| POST | `/profile/user/set` | Update user profile |
| POST | `/handoff/export` | Export handoff bundle |
| POST | `/handoff/import` | Import handoff bundle |
| POST | `/feedback/retrieval` | Submit retrieval feedback |
| GET | `/goal/get` | Get project goal |
| POST | `/goal/set` | Set project goal |
Auth: `Authorization: Bearer <MUNINN_AUTH_TOKEN>` required on all non-health endpoints.
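An authenticated request looks like the following. The exact request-body field names for `/search` are not documented in this README; `query` and `limit` are assumptions here, mirroring the SDK's parameters.

```shell
# Hypothetical /search call with the required bearer token; assumes
# the server is running and MUNINN_AUTH_TOKEN is exported.
curl -s -X POST http://localhost:42069/search \
  -H "Authorization: Bearer $MUNINN_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "API payload patterns", "limit": 5}'
```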
Configuration
Key environment variables:
| Variable | Default | Description |
|---|---|---|
| `MUNINN_AUTH_TOKEN` | random | Shared secret between server and MCP wrapper |
| `MUNINN_SERVER_URL` | `http://localhost:42069` | Backend URL for MCP wrapper |
| `MUNINN_PROJECT_SCOPE_STRICT` | off | `=1` disables cross-project fallback entirely |
| `MUNINN_MCP_SEARCH_PROJECT_FALLBACK` | off | `=1` enables global-scope fallback on empty results |
| `MUNINN_OPERATOR_MODEL_PROFILE` | `balanced` | Default model routing profile |
| `MUNINN_OTEL_ENABLED` | off | `=1` enables OpenTelemetry tracing |
| `MUNINN_OTEL_ENDPOINT` | `http://localhost:4318` | OTLP HTTP endpoint for trace export |
| `MUNINN_CHAINS_ENABLED` | off | `=1` enables graph memory chain detection (PRECEDES/CAUSES edges) |
| `MUNINN_COLBERT_MULTIVEC` | off | `=1` enables native ColBERT multi-vector storage |
| `MUNINN_FEDERATION_ENABLED` | off | `=1` enables P2P memory synchronization |
| `MUNINN_FEDERATION_PEERS` | - | Comma-separated list of peer base URLs |
| `MUNINN_FEDERATION_SYNC_ON_ADD` | off | `=1` enables real-time push-on-add to peers |
| `MUNINN_TEMPORAL_QUERY_EXPANSION` | off | `=1` enables NL time-phrase parsing in search |
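For example, the feature-gated subsystems in the table above are opted into by exporting their flags before launch. Variable names come from the table; the particular combination shown is just an example.

```shell
# Opt in to NL time-phrase parsing and OpenTelemetry tracing.
export MUNINN_TEMPORAL_QUERY_EXPANSION=1
export MUNINN_OTEL_ENABLED=1
export MUNINN_OTEL_ENDPOINT="http://localhost:4318"
```

Then start `python server.py` from the same shell so the process inherits these settings.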
Evaluation & Quality Gates
Muninn includes an evaluation toolchain for measurable quality enforcement:
```bash
# Run full benchmark dev-cycle
python -m eval.ollama_local_benchmark dev-cycle

# Check phase hygiene gates
python -m eval.phase_hygiene

# Emit SOTA+ signed verdict artifact
python -m eval.ollama_local_benchmark sota-verdict \
  --longmemeval-report path/to/lme_report.json \
  --min-longmemeval-ndcg 0.60 \
  --min-longmemeval-recall 0.65 \
  --signing-key "$SOTA_SIGNING_KEY"

# Run LongMemEval adapter selftest (no server needed)
python eval/longmemeval_adapter.py --selftest

# Run StructMemEval adapter selftest (no server needed)
python eval/structmemeval_adapter.py --selftest

# Run StructMemEval against a live server
python eval/structmemeval_adapter.py \
  --dataset path/to/structmemeval.jsonl \
  --server-url http://localhost:42069 \
  --auth-token "$MUNINN_AUTH_TOKEN"
```
Metrics tracked: nDCG@k, Recall@k, MRR@k, Exact Match, token-F1, p50/p95 latency, significance testing (Bonferroni/BH correction), effect-size analysis.
The `sota-verdict` command emits a signed JSON artifact with `commit_sha`, SHA256 file hashes, and an HMAC-SHA256 `promotion_signature` — enabling auditable, commit-pinned SOTA+ evidence.
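The signing scheme can be sketched with the standard library. The field names and canonical serialization below are illustrative assumptions, not Muninn's actual artifact format; only the HMAC-SHA256 construction is taken from the text above.

```python
import hashlib
import hmac
import json

def sign_verdict(artifact: dict, signing_key: bytes) -> str:
    """HMAC-SHA256 over a canonical (sorted-keys) JSON serialization.
    Anyone holding the signing key can recompute and verify this."""
    canonical = json.dumps(artifact, sort_keys=True).encode()
    return hmac.new(signing_key, canonical, hashlib.sha256).hexdigest()

# Hypothetical commit-pinned artifact: a commit SHA plus a SHA256
# hash of the benchmark report file ties the verdict to evidence.
artifact = {
    "commit_sha": "abc123",
    "report_sha256": hashlib.sha256(b"report file bytes").hexdigest(),
    "ndcg": 0.61,
}
key = b"example-signing-key"
sig = sign_verdict(artifact, key)

# Verification recomputes the HMAC and compares in constant time.
assert hmac.compare_digest(sig, sign_verdict(artifact, key))
```

Because the artifact embeds file hashes and the signature covers the whole artifact, tampering with either the report or the claimed scores invalidates the signature.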
Data & Security
- Default data dir: `~/.local/share/AntigravityLabs/muninn/` (Linux/macOS) · `%LOCALAPPDATA%\AntigravityLabs\muninn\` (Windows)
- Storage: SQLite (metadata) + Qdrant (vectors) + KuzuDB (memory chains graph)
- No cloud dependency: All data local by default
- Auth: Bearer token required on all API calls; token shared via env var
- Namespace isolation: `user_id` + `namespace` + `project` boundaries enforced at every retrieval layer
Documentation Index
| Document | Description |
|---|---|
| `SOTA_PLUS_PLAN.md` | Active development phases and roadmap |
| `HANDOFF.md` | Operational setup, auth flow, known issues |
| `docs/ARCHITECTURE.md` | System architecture deep-dive |
| `docs/MUNINN_COMPREHENSIVE_ROADMAP.md` | Full feature roadmap (v3.1→v3.3+) |
| `docs/AGENT_CONTINUATION_RUNBOOK.md` | How to resume development across sessions |
| `docs/PYTHON_SDK.md` | Python SDK reference |
| `docs/INGESTION_PIPELINE.md` | Ingestion pipeline internals |
| `docs/OTEL_GENAI_OBSERVABILITY.md` | OpenTelemetry integration guide |
| `docs/PLAN_GAP_EVALUATION.md` | Gap analysis against SOTA memory systems |
Licensing
- Code: Apache License 2.0 (`LICENSE`)
- Third-party dependency licenses remain with their respective owners
- Attribution: See `NOTICE`