qualixar/superlocalmemory
Persistent AI memory MCP server with 4-channel retrieval (semantic, BM25, entity graph, temporal). 74.8% on LoCoMo benchmark with zero cloud dependency. Works with Claude, Cursor, VS Code Copilot, and 17+ AI tools. EU AI Act compliant. MIT license.
<p align="center"> <img src="https://superlocalmemory.com/assets/logo-mark.png" alt="SuperLocalMemory" width="200"/> </p>
<h1 align="center">SuperLocalMemory V3</h1> <p align="center"><strong>The first local-only AI memory to break 74% retrieval on LoCoMo.<br/>No cloud. No APIs. No data leaves your machine.</strong></p>
<p align="center"> <code>+10.6pp vs Mem0 (zero cloud)</code> · <code>85% Open-Domain (best of any system)</code> · <code>EU AI Act Ready</code> </p>
<p align="center"> <a href="https://arxiv.org/abs/2603.14588"><img src="https://img.shields.io/badge/arXiv-2603.14588-b31b1b?style=for-the-badge&logo=arxiv&logoColor=white" alt="arXiv Paper"/></a> <a href="https://pypi.org/project/superlocalmemory/"><img src="https://img.shields.io/pypi/v/superlocalmemory?style=for-the-badge&logo=pypi&logoColor=white" alt="PyPI"/></a> <a href="https://www.npmjs.com/package/superlocalmemory"><img src="https://img.shields.io/npm/v/superlocalmemory?style=for-the-badge&logo=npm&logoColor=white" alt="npm"/></a> <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-green?style=for-the-badge" alt="MIT License"/></a> <a href="#eu-ai-act-compliance"><img src="https://img.shields.io/badge/EU_AI_Act-Compliant-brightgreen?style=for-the-badge" alt="EU AI Act"/></a> <a href="https://superlocalmemory.com"><img src="https://img.shields.io/badge/Web-superlocalmemory.com-ff6b35?style=for-the-badge" alt="Website"/></a> </p>
Why SuperLocalMemory?
Every major AI memory system — Mem0, Zep, Letta, EverMemOS — sends your data to cloud LLMs for core operations. That means latency on every query, cost on every interaction, and after August 2, 2026, a compliance problem under the EU AI Act.
SuperLocalMemory V3 takes a different approach: mathematics instead of cloud compute. Three techniques from differential geometry, algebraic topology, and stochastic analysis replace the work that other systems need LLMs to do — similarity scoring, contradiction detection, and lifecycle management. The result is an agent memory that runs entirely on your machine, on CPU, with no API keys, and still outperforms funded alternatives.
The numbers (evaluated on LoCoMo, the standard long-conversation memory benchmark):
| System | Score | Cloud Required | Open Source | Funding |
|---|---|---|---|---|
| EverMemOS | 92.3% | Yes | No | — |
| Hindsight | 89.6% | Yes | No | — |
| SLM V3 Mode C | 87.7% | Optional | Yes (MIT) | $0 |
| Zep v3 | 85.2% | Yes | Deprecated | $35M |
| SLM V3 Mode A | 74.8% | No | Yes (MIT) | $0 |
| Mem0 | 64.2% | Yes | Partial | $24M |
Mode A scores 74.8% with zero cloud dependency, outperforming Mem0 (64.2%) by 10.6 percentage points without a single API call. On open-domain questions, Mode A scores 85.0%, the highest of any system in the evaluation, including cloud-powered ones. Mode C reaches 87.7%, in the same range as the cloud-based systems in the table.
Mathematical layers contribute +12.7 percentage points on average across 6 conversations (n=832 questions), with up to +19.9pp on the most challenging dialogues. This isn't more compute — it's better math.
Upgrading from V2 (2.8.6)? V3 is a complete architectural reinvention: new mathematical engine, new retrieval pipeline, new storage schema. Your existing data is preserved but requires migration. After installing V3, run `slm migrate` to upgrade your data. Read the Migration Guide before upgrading. A backup is created automatically.
Quick Start
Install via npm (recommended)
npm install -g superlocalmemory
slm setup # Choose mode (A/B/C)
slm warmup # Pre-download embedding model (~500MB, optional)
Install via pip
pip install superlocalmemory
First Use
slm remember "Alice works at Google as a Staff Engineer"
slm recall "What does Alice do?"
slm status
MCP Integration (Claude, Cursor, Windsurf, VS Code, etc.)
{
  "mcpServers": {
    "superlocalmemory": {
      "command": "slm",
      "args": ["mcp"]
    }
  }
}
24 MCP tools available. Works with 17+ AI tools, including Claude Code, Cursor, Windsurf, VS Code Copilot, Continue, Cody, ChatGPT Desktop, Gemini CLI, JetBrains, and Zed.
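Most clients already have other MCP servers configured, so the entry above usually has to be merged into an existing file rather than pasted over it. The Python sketch below shows one way to do that merge on an in-memory dictionary. The helper name `add_slm_server` and the sample `other` server are illustrative assumptions, and the location of the config file depends on the client and is not covered here.

```python
import json


def add_slm_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the SuperLocalMemory entry added.

    Hypothetical helper: only the "mcpServers" layout and the
    command/args values come from the README snippet above.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["superlocalmemory"] = {"command": "slm", "args": ["mcp"]}
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already lists another (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_slm_server(existing)
print(json.dumps(updated, indent=2))
```

The function copies the dictionaries it changes, so the original config object is left untouched and can be kept as a backup.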
Three Operating Modes
| Mode | What | Cloud? | EU AI Act | Best For |
|---|---|---|---|---|
| A | Local Guardian | None | Compliant | Privacy-first, air-gapped, enterprise |
| B | Smart Local | Local only (Ollama) | Compliant | Better answers, data stays local |
| C | Full Power | Cloud LLM | Partial | Maximum accuracy, research |
slm mode a # Zero-cloud (default)
slm mode b # Local Ollama
slm mode c # Cloud LLM
Mode A is the only agent memory that operates with zero cloud dependency while achieving competitive retrieval accuracy on a standard benchmark. All data stays on your device. No API keys. No GPU. Runs on 2 vCPUs + 4GB RAM.
Architecture
Query ──► Strategy Classifier ──► 4 Parallel Channels:
                                    ├── Semantic (Fisher-Rao geodesic distance)
                                    ├── BM25 (keyword matching)
                                    ├── Entity Graph (spreading activation, 3 hops)
                                    └── Temporal (date-aware retrieval)
                                              │
                                      RRF Fusion (k=60)
                                              │
                               Scene Expansion + Bridge Discovery
                                              │
                                   Cross-Encoder Reranking
                                              │
                          ◄── Top-K Results with channel scores
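Reciprocal rank fusion (RRF) merges ranked lists by giving each item the sum, over channels, of 1 / (k + rank). The sketch below applies that standard formula with k=60 to four made-up channel outputs. It assumes each channel returns an ordered list of memory IDs; the function name, the sample IDs, and the tie-breaking rule are illustrative and not taken from the SuperLocalMemory codebase.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse ranked lists with reciprocal rank fusion.

    rankings: dict mapping channel name -> list of item IDs, best first.
    Returns (item_id, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so output is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical outputs of the four retrieval channels.
channels = {
    "semantic": ["m1", "m2", "m3"],
    "bm25": ["m2", "m1", "m4"],
    "entity_graph": ["m2", "m5"],
    "temporal": ["m3", "m2"],
}
fused = rrf_fuse(channels)
for item_id, score in fused:
    print(f"{item_id}: {score:.5f}")
```

Because RRF uses only ranks, channels with incomparable raw scores (BM25 scores, geodesic distances, graph activations) can be combined without normalization.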
Mathematical Foundations
Three novel contributions replace cloud LLM dependency with mathematical guarantees:
- Fisher-Rao Retrieval Metric: Similarity scoring derived from the Fisher information structure of diagonal Gaussian families. Graduated ramp from cosine to geodesic distance over the first 10 accesses. The first application of information geometry to agent memory retrieval.
- Sheaf Cohomology for Consistency: Algebraic topology detects contradictions by computing coboundary norms on the knowledge graph. The first algebraic guarantee for contradiction detection in agent memory.
- Riemannian Langevin Lifecycle: Memory positions evolve on the Poincaré ball via discretized Langevin SDE. Frequently accessed memories stay active; neglected memories self-archive. No hardcoded thresholds.
These three layers collectively yield +12.7pp average improvement over the engineering-only baseline, with the Fisher metric alone contributing +10.8pp on the hardest conversations.
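For intuition on the first layer: the Fisher-Rao distance between two univariate Gaussians has a closed form, and for diagonal Gaussians the per-dimension distances combine in quadrature. The sketch below implements that textbook formula. How SuperLocalMemory obtains means and variances from embeddings is not described in this README, and the linear ramp in `geodesic_weight` is only one assumed reading of "graduated ramp over the first 10 accesses".

```python
import math


def fisher_rao_distance(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao geodesic distance between two diagonal Gaussians.

    Each argument is a sequence of per-dimension means or standard
    deviations (standard deviations must be positive).
    """
    squared = 0.0
    for m1, s1, m2, s2 in zip(mu1, sigma1, mu2, sigma2):
        # Univariate closed form: sqrt(2) * acosh(1 + ratio).
        ratio = ((m1 - m2) ** 2 / 2.0 + (s1 - s2) ** 2) / (2.0 * s1 * s2)
        squared += 2.0 * math.acosh(1.0 + ratio) ** 2
    return math.sqrt(squared)


def geodesic_weight(access_count, ramp=10):
    """Assumed linear ramp: 0.0 means pure cosine, 1.0 means pure geodesic."""
    return min(max(access_count, 0) / ramp, 1.0)


# Same mean, standard deviation scaled by e: distance is sqrt(2) * ln(e).
print(fisher_rao_distance([0.0], [1.0], [0.0], [math.e]))
print(geodesic_weight(5))
```

Unlike cosine similarity, this distance takes per-dimension uncertainty into account: two memories with the same mean but different variances are not treated as identical.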
Benchmarks
Evaluated on LoCoMo — 10 multi-session conversations, 1,986 total questions, 4 scored categories.
Mode A (Zero-Cloud, 10 Conversations, 1,276 Questions)
| Category | Score | vs. Mem0 (64.2% aggregate) |
|---|---|---|
| Single-Hop | 72.0% | +3.0pp |
| Multi-Hop | 70.3% | +8.6pp |
| Temporal | 80.0% | +21.7pp |
| Open-Domain | 85.0% | +35.0pp |
| Aggregate | 74.8% | +10.6pp |
Mode A achieves 85.0% on open-domain questions — the highest of any system in the evaluation, including cloud-powered ones.
Math Layer Impact (6 Conversations, n=832)
| Conversation | With Math | Without | Delta |
|---|---|---|---|
| Easiest | 78.5% | 71.2% | +7.3pp |
| Hardest | 64.2% | 44.3% | +19.9pp |
| Average | 71.7% | 58.9% | +12.7pp |
Mathematical layers help most where heuristic methods struggle — the harder the conversation, the bigger the improvement.
Ablation (What Each Component Contributes)
| Removed | Impact |
|---|---|
| Cross-encoder reranking | -30.7pp |
| Fisher-Rao metric | -10.8pp |
| All math layers | -7.6pp |
| BM25 channel | -6.5pp |
| Sheaf consistency | -1.7pp |
| Entity graph | -1.0pp |
Full ablation details in the Wiki.
EU AI Act Compliance
The EU AI Act (Regulation 2024/1689) takes full effect August 2, 2026. Every AI memory system that sends personal data to cloud LLMs for core operations has a compliance question to answer.
| Requirement | Mode A | Mode B | Mode C |
|---|---|---|---|
| Data sovereignty (Art. 10) | Pass | Pass | Requires DPA |
| Right to erasure (GDPR Art. 17) | Pass | Pass | Pass |
| Transparency (Art. 13) | Pass | Pass | Pass |
| No network calls during memory ops | Yes | Yes | No |
To the best of our knowledge, no existing agent memory system addresses EU AI Act compliance. Modes A and B pass all checks by architectural design — no personal data leaves the device during any memory operation.
Built-in compliance tools: GDPR Article 15/17 export + complete erasure, tamper-proof SHA-256 audit chain, data provenance tracking, ABAC policy enforcement.
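A hash-chain audit trail links each entry to the SHA-256 digest of the previous one, so editing any past entry invalidates every later link. The sketch below shows the general idea using only the Python standard library. The entry layout, the genesis value, and the function names are assumptions for illustration, not the format SuperLocalMemory stores.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumed)


def _digest(prev_hash, event):
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Recompute every link; return False on the first mismatch."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"op": "remember", "agent": "cli"})
append_entry(log, {"op": "forget", "agent": "cli"})
print(verify_chain(log))
```

A chain like this makes tampering detectable rather than impossible: anyone holding the latest hash can check that earlier entries have not been altered.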
Web Dashboard
slm dashboard # Opens at http://localhost:8765
<details open> <summary><strong>Dashboard Screenshots</strong> (click to collapse)</summary> <p align="center"><img src="docs/screenshots/01-dashboard-main.png" alt="Dashboard" width="600"/></p> <p align="center"> <img src="docs/screenshots/02-knowledge-graph.png" alt="Graph" width="190"/> <img src="docs/screenshots/03-math-health.png" alt="Math" width="190"/> <img src="docs/screenshots/05-trust-dashboard.png" alt="Trust" width="190"/> </p> <p align="center"> <img src="docs/screenshots/04-recall-lab.png" alt="Recall" width="190"/> <img src="docs/screenshots/06-settings.png" alt="Settings" width="190"/> <img src="docs/screenshots/07-memories-blurred.png" alt="Memories" width="190"/> </p> </details>
17 tabs: Dashboard, Recall Lab, Knowledge Graph, Memories, Trust Scores, Math Health, Compliance, Learning, IDE Connections, Settings, and more. Runs locally — no data leaves your machine.
Features
Retrieval
- 4-channel hybrid: Semantic (Fisher-Rao) + BM25 + Entity Graph + Temporal
- RRF fusion + cross-encoder reranking
- Agentic sufficiency verification (auto-retry on weak results)
- Adaptive ranking with LightGBM (learns from usage)
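The entity-graph channel is described above as spreading activation over 3 hops. A minimal sketch of that general technique follows, assuming a plain adjacency dictionary and a fixed per-hop decay of 0.5. The decay value, the graph contents, and the function name are hypothetical and not taken from the project.

```python
def spread_activation(graph, seeds, hops=3, decay=0.5):
    """Propagate activation outward from seed entities for a fixed number of hops.

    graph: dict mapping node -> iterable of neighbor nodes.
    Returns a dict of node -> highest activation reached.
    """
    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(hops):
        candidates = {}
        for node, energy in frontier.items():
            for neighbor in graph.get(node, ()):
                passed = energy * decay
                if passed > candidates.get(neighbor, 0.0):
                    candidates[neighbor] = passed
        # Only nodes whose activation improved keep spreading.
        frontier = {}
        for node, energy in candidates.items():
            if energy > activation.get(node, 0.0):
                activation[node] = energy
                frontier[node] = energy
    return activation


# Hypothetical entity graph built from stored memories.
entity_graph = {
    "alice": ["google"],
    "google": ["alice", "mountain_view"],
    "mountain_view": ["google", "california"],
    "california": ["mountain_view", "usa"],
    "usa": ["california"],
}
reached = spread_activation(entity_graph, ["alice"])
print(reached)
```

With three hops, a query about `alice` reaches `california` but not `usa`, which is how the hop limit bounds the amount of loosely related context pulled in.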
Intelligence
- 11-step ingestion pipeline (entity resolution, fact extraction, emotional tagging, scene building)
- Automatic contradiction detection via sheaf cohomology
- Self-organizing memory lifecycle (no hardcoded thresholds)
- Behavioral pattern detection and outcome tracking
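To give a feel for coboundary-based contradiction detection, the sketch below treats each fact as a vector on a graph node and each edge as a pair of restriction maps into a shared comparison space. The coboundary on an edge is the difference of the two restricted vectors, and a large norm flags disagreement. This is a generic cellular-sheaf illustration under those assumptions; the vectors, maps, and names are invented and do not reflect how SuperLocalMemory builds its sheaf.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def coboundary_norms(node_vectors, edges):
    """Return the coboundary norm for each edge.

    edges: list of (u, v, restriction_u, restriction_v), where each
    restriction is a matrix mapping a node vector into the edge space.
    """
    norms = {}
    for u, v, restrict_u, restrict_v in edges:
        on_edge_u = matvec(restrict_u, node_vectors[u])
        on_edge_v = matvec(restrict_v, node_vectors[v])
        diff = [b - a for a, b in zip(on_edge_u, on_edge_v)]
        norms[(u, v)] = math.sqrt(sum(d * d for d in diff))
    return norms


identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical 2-d encodings of three statements about the same attribute.
facts = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
edges = [("a", "b", identity, identity), ("a", "c", identity, identity)]
norms = coboundary_norms(facts, edges)
print(norms)
```

Edge `("a", "b")` has norm 0 (the two facts agree), while edge `("a", "c")` has a nonzero norm and would be a candidate contradiction.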
Trust & Security
- Bayesian Beta-distribution trust scoring (per-agent, per-fact)
- Trust gates (block low-trust agents from writing/deleting)
- ABAC (Attribute-Based Access Control) with DB-persisted policies
- Tamper-proof hash-chain audit trail (SHA-256 linked entries)
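Beta-distribution trust scoring can be sketched in a few lines: each agent starts from a uniform prior Beta(1, 1), every confirmed or rejected contribution increments one of the two parameters, and the posterior mean serves as the trust score. The class below is a generic illustration; the prior, the 0.3 gate threshold, and all names are assumptions rather than SuperLocalMemory's actual parameters.

```python
from dataclasses import dataclass


@dataclass
class BetaTrust:
    """Trust score as the mean of a Beta(alpha, beta) posterior."""

    alpha: float = 1.0  # prior pseudo-count of good outcomes
    beta: float = 1.0   # prior pseudo-count of bad outcomes

    def update(self, success: bool) -> None:
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def may_write(trust: BetaTrust, threshold: float = 0.3) -> bool:
    """Hypothetical trust gate: block writes from agents below the threshold."""
    return trust.mean >= threshold


reliable = BetaTrust()
for outcome in (True, True, True, False):
    reliable.update(outcome)

unreliable = BetaTrust()
for outcome in (False, False, False, False):
    unreliable.update(outcome)

print(round(reliable.mean, 3), may_write(reliable))
print(round(unreliable.mean, 3), may_write(unreliable))
```

Starting from a prior rather than a raw success ratio keeps a brand-new agent at 0.5 instead of swinging to 0 or 1 after a single observation.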
Infrastructure
- 17-tab web dashboard with real-time visualization
- 17+ IDE integrations (Claude, Cursor, Windsurf, VS Code, JetBrains, Zed, etc.)
- 24 MCP tools + 6 MCP resources
- Profile isolation (independent memory spaces)
- 1400+ tests, MIT license, cross-platform (Mac/Linux/Windows)
- CPU-only — no GPU required
CLI Reference
| Command | What It Does |
|---|---|
| `slm remember "..."` | Store a memory |
| `slm recall "..."` | Search memories |
| `slm forget "..."` | Delete matching memories |
| `slm trace "..."` | Recall with per-channel score breakdown |
| `slm status` | System status |
| `slm health` | Math layer health (Fisher, Sheaf, Langevin) |
| `slm mode a/b/c` | Switch operating mode |
| `slm setup` | Interactive first-time wizard |
| `slm warmup` | Pre-download embedding model |
| `slm migrate` | V2 to V3 migration |
| `slm dashboard` | Launch web dashboard |
| `slm mcp` | Start MCP server (for IDE integration) |
| `slm connect` | Configure IDE integrations |
| `slm profile list/create/switch` | Profile management |
Research Papers
V3: Information-Geometric Foundations
SuperLocalMemory V3: Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory. Varun Pratap Bhardwaj (2026). arXiv:2603.14588 · Zenodo DOI: 10.5281/zenodo.19038659
V2: Architecture & Engineering
SuperLocalMemory: A Structured Local Memory Architecture for Persistent AI Agent Context. Varun Pratap Bhardwaj (2026). arXiv:2603.02240 · Zenodo DOI: 10.5281/zenodo.18709670
Cite This Work
@article{bhardwaj2026slmv3,
  title={Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory},
  author={Bhardwaj, Varun Pratap},
  journal={arXiv preprint arXiv:2603.14588},
  year={2026},
  url={https://arxiv.org/abs/2603.14588}
}
Prerequisites
| Requirement | Version | Why |
|---|---|---|
| Node.js | 14+ | npm package manager |
| Python | 3.11+ | V3 engine runtime |
All Python dependencies install automatically during npm install. If anything fails, the installer shows exact fix commands. BM25 keyword search works even without embeddings — you're never fully blocked.
| Component | Size | When |
|---|---|---|
| Core libraries (numpy, scipy, networkx) | ~50MB | During install |
| Search engine (sentence-transformers, torch) | ~200MB | During install |
| Embedding model (nomic-embed-text-v1.5, 768d) | ~500MB | First use or slm warmup |
Contributing
See CONTRIBUTING.md for guidelines and the Wiki for detailed documentation.
License
MIT License. See LICENSE.
Attribution
Part of Qualixar · Author: Varun Pratap Bhardwaj
<p align="center"> <sub>Built with mathematical rigor. Not in the race — here to help everyone build better AI memory systems.</sub> </p>