memory-mcp

Persistent, self-organizing semantic memory for AI agents — served as an MCP server.

License: MIT Docker Python 3.11+


What is this?

memory-mcp is a Model Context Protocol server that gives AI agents durable, searchable memory backed by PostgreSQL and pgvector. Drop it into any MCP-compatible client (Claude Code, Cursor, Windsurf, etc.) and your agent gains the ability to remember, retrieve, and reason over information across sessions — without you managing any schema or storage logic.

What it does autonomously:

  • Chunks and embeds incoming text
  • Categorizes memories into a hierarchical taxonomy (ltree dot-paths)
  • Deduplicates against existing memories and resolves conflicts
  • Synthesizes a System Primer — a compressed, always-current summary of everything it knows — and surfaces it at session start
  • Expires stale memories via TTL and prompts for verification of aging facts
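As a rough illustration of the chunking and deduplication gates described above, here is a sketch with hypothetical helper names — the real pipeline uses LLM-based section extraction and OpenAI embeddings, not this naive splitter:

```python
# Illustrative sketch of the ingestion gates. chunk_text and is_duplicate
# are hypothetical stand-ins; the actual server extracts sections with an
# LLM and embeds them via the OpenAI API.
from math import sqrt

MIN_SECTION_LENGTH = 100   # chunks shorter than this are dropped
DUP_THRESHOLD = 0.95       # cosine similarity at or above this marks a duplicate

def chunk_text(text: str) -> list[str]:
    """Naive stand-in: split on blank lines, keep sufficiently long sections."""
    sections = [s.strip() for s in text.split("\n\n")]
    return [s for s in sections if len(s) >= MIN_SECTION_LENGTH]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def is_duplicate(new_vec: list[float], existing: list[list[float]]) -> bool:
    """A new chunk is a duplicate if any stored embedding is too similar."""
    return any(cosine(new_vec, v) >= DUP_THRESHOLD for v in existing)
```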

Why memory-mcp?

| | memory-mcp | Simple vector DB | LangChain / LlamaIndex memory |
|---|---|---|---|
| Schema management | Automatic | Manual | Manual |
| Deduplication | Semantic + LLM | None | None |
| Taxonomy | Auto-assigned ltree | None | None |
| Session bootstrap | System Primer | Manual RAG | Manual |
| Conflict resolution | LLM-evaluated | None | None |
| Ephemeral context | Built-in (TTL store) | No | No |
| Self-hostable | Yes (Docker) | Varies | No |
| MCP-native | Yes | No | No |

Architecture

AI Agent (Claude Code / Cursor / Windsurf)
        │  HTTP (MCP — Streamable HTTP)
        ▼
┌──────────────────────────────────────────┐
│              server.py                    │
│  ┌─────────────────┐ ┌─────────────────┐ │
│  │ Production MCP  │ │   Admin MCP     │ │
│  │   :8766/mcp     │ │   :8767/mcp     │ │
│  └────────┬────────┘ └────────┬────────┘ │
│           │  tools/           │           │
│  ┌────────▼──────────────────▼────────┐  │
│  │  ingestion · search · context      │  │
│  │  crud · admin_tools · context_store│  │
│  └────────────────┬───────────────────┘  │
│                   │                       │
│  ┌────────────────▼───────────────────┐  │
│  │         Background Workers          │  │
│  │  Ingestion Queue · TTL Daemon       │  │
│  │  System Primer Auto-Regeneration    │  │
│  └────────────────┬───────────────────┘  │
└───────────────────┼──────────────────────┘
                    │  asyncpg
                    ▼
         PostgreSQL + pgvector
         ┌─────────────────┐
         │ memories        │  chunks, embeddings, ltree paths
         │ memory_edges    │  sequence_next, relates_to, supersedes
         │ ingestion_staging│ async job queue
         │ context_store   │  ephemeral TTL store
         └─────────────────┘
                    │
         ┌──────────▼──────────┐
         │  Backup Service     │  pg_dump → private GitHub repo
         └─────────────────────┘

Two servers, one process:

  • Production (:8766) — tools safe for the agent to call freely
  • Admin (:8767) — superset including destructive tools (delete, prune, bulk-move). Point your agent at production; use admin for maintenance.
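The memory_edges table in the diagram above drives document reconstruction: fetch_document conceptually walks sequence_next edges from a starting chunk. A sketch with hypothetical in-memory dicts standing in for the PostgreSQL tables (the real server issues asyncpg queries):

```python
# Sketch of sequence_next traversal. contents maps memory ID -> chunk text;
# sequence_next maps memory ID -> the next chunk's ID. Both stand in for
# the memories and memory_edges tables.

def fetch_document(start_id: str,
                   contents: dict[str, str],
                   sequence_next: dict[str, str]) -> str:
    """Walk the sequence_next chain from start_id, concatenating chunks."""
    parts, current, seen = [], start_id, set()
    while current is not None and current not in seen:
        seen.add(current)              # guard against accidental cycles
        parts.append(contents[current])
        current = sequence_next.get(current)
    return "\n".join(parts)
```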

Quickstart (Docker)

Prerequisites: Docker + Docker Compose, an OpenAI API key.

# 1. Clone
git clone https://github.com/isaacriehm/memory-mcp.git
cd memory-mcp

# 2. Configure
cp .env.example .env
$EDITOR .env   # set OPENAI_API_KEY and DB_PASSWORD at minimum

# 3. Start
docker compose up -d

# Production MCP endpoint: http://localhost:8766/mcp
# Admin MCP endpoint:      http://localhost:8767/mcp

To rebuild after code changes:

docker compose up -d --build memory-api

Connecting to an MCP Client

Claude Code

Add to your project's .claude/settings.json or ~/.claude/settings.json:

{
  "mcpServers": {
    "memory": {
      "type": "http",
      "url": "http://localhost:8766/mcp"
    }
  }
}

Or via the CLI:

claude mcp add memory --transport http http://localhost:8766/mcp

Then add this instruction to your CLAUDE.md so the agent always bootstraps memory at session start:

## Memory
At the start of every session, call `initialize_context` before anything else.
This returns your System Primer — your identity, current knowledge taxonomy, and retrieval guide.
Always consult it before answering questions about prior context.

Cursor / Windsurf

Add to your MCP settings (.cursor/mcp.json or equivalent):

{
  "mcpServers": {
    "memory": {
      "url": "http://localhost:8766/mcp"
    }
  }
}

MCP Tools

Production Tools (:8766)

| Tool | Description |
|---|---|
| `initialize_context` | Call first every session. Returns the System Primer plus verification prompts for aging memories. |
| `memorize_context` | Ingest raw text. Automatically chunks, embeds, categorizes, and deduplicates. Supports `ttl_days`. |
| `check_ingestion_status` | Poll an async ingestion job by `job_id`. Returns `pending`, `processing`, `complete`, or `failed`. |
| `search_memory` | Hybrid vector + BM25 search with Reciprocal Rank Fusion. Filter by `category_path`. |
| `list_categories` | Return all occupied taxonomy paths with memory counts. |
| `explore_taxonomy` | Drill into a collapsed `[+N more]` branch from `list_categories`. |
| `fetch_document` | Reconstruct a full document by following `sequence_next` edges from a memory ID. |
| `trace_history` | Inspect the full supersession chain (oldest → newest) for a memory. |
| `confirm_memory_validity` | Confirm an aging memory is still accurate. Advances its `verify_after` date. |
| `update_memory` | Rewrite a memory's content in place (preserves identity, edges, history). |
| `set_context` | Write a key/value pair to the ephemeral context store with a TTL. |
| `get_context` | Retrieve an ephemeral context entry by key. |
| `list_context_keys` | List active (non-expired) context keys, optionally filtered by scope. |
| `delete_context` | Explicitly delete a context entry before its TTL expires. |
| `extend_context_ttl` | Push a context entry's expiry forward by N hours. |
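search_memory's Reciprocal Rank Fusion can be illustrated with the standard RRF formula, score(d) = Σ 1/(k + rank(d)), summed over the vector and BM25 rankings. A minimal sketch — k = 60 is the conventional default from the original RRF paper, an assumption here, not a documented server constant:

```python
# Reciprocal Rank Fusion sketch: merge two ranked lists of document IDs.
# Documents that rank well in both lists accumulate the highest scores.

def rrf_merge(vector_ranked: list[str],
              bm25_ranked: list[str],
              k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in (vector_ranked, bm25_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```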

Admin-Only Tools (:8767)

| Tool | Description |
|---|---|
| `delete_memory` | Hard-delete a memory by ID (cascades to edges). |
| `prune_history` | Batch-delete superseded memories older than N days. |
| `export_memories` | Export all active memories to JSON. |
| `recategorize_memory` | Move a single memory to a new taxonomy path. |
| `bulk_move_category` | Move an entire taxonomy branch (e.g. `old.prefix` → `new.prefix`). |
| `update_memory_metadata` | Patch a memory's metadata JSONB in place. |
| `run_diagnostics` | Report on pool health, memory counts, and ingestion queue depth. |
| `get_ingestion_stats` | Break down ingestion jobs by status. |
| `flush_staging` | Clear all completed/failed staging jobs immediately. |

Taxonomy

Memories are organized into a dot-path hierarchy using PostgreSQL ltree. The system assigns paths automatically during ingestion. You can override with recategorize_memory or bulk_move_category.

Example paths:

user.profile.personal
user.health.medical
projects.myapp.architecture
projects.myapp.decisions
organizations.acme.business
concepts.ai.behavior
reference.system.primer     ← auto-generated System Primer lives here

Search is subtree-aware — passing category_path: "projects.myapp" returns everything under that branch.
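Subtree matching mirrors what PostgreSQL's ltree does natively with the `<@` operator: a dot-path is inside a subtree if it equals the prefix or extends it by at least one label. The same rule in plain Python, for illustration:

```python
# Sketch of ltree-style subtree matching on dot-paths. The server delegates
# this to ltree's <@ operator in SQL; this is just the equivalent rule.

def in_subtree(path: str, prefix: str) -> bool:
    return path == prefix or path.startswith(prefix + ".")
```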


System Primer

initialize_context returns a synthesized summary stored at reference.system.primer. It includes:

  • A compressed user/agent profile
  • The full taxonomy tree with memory counts
  • Retrieval guidance

The primer auto-regenerates in the background when ≥10 new memories are ingested or when the previous primer is older than 1 hour. You can force regeneration via the admin tool synthesize_system_primer.
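The regeneration condition reduces to a simple predicate over the two triggers named above (the 1-hour figure corresponds to the `PRIMER_UPDATE_MAX_AGE_S` default of 3600 seconds):

```python
# Sketch of the System Primer auto-regeneration trigger: regenerate when
# enough new memories have arrived OR the current primer is too old.

NEW_MEMORY_THRESHOLD = 10        # memories ingested since the last primer
PRIMER_UPDATE_MAX_AGE_S = 3600   # 1 hour, the documented default

def should_regenerate(new_memories: int, primer_age_s: float) -> bool:
    return (new_memories >= NEW_MEMORY_THRESHOLD
            or primer_age_s > PRIMER_UPDATE_MAX_AGE_S)
```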


Environment Variables

Copy .env.example to .env and fill in your values.

Required

| Variable | Description |
|---|---|
| `DATABASE_URL` | PostgreSQL connection string (e.g. `postgresql://user:pass@localhost:5432/memory`) |
| `OPENAI_API_KEY` | OpenAI API key for embeddings and LLM calls |
| `DB_PASSWORD` | PostgreSQL password (used by Docker Compose) |

Optional — Models & Embeddings

| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `text-embedding-3-small` | OpenAI embedding model |
| `EXTRACT_MODEL` | `gpt-5-mini` | LLM for semantic section extraction and categorization |
| `CONFLICT_MODEL` | `gpt-5-nano` | LLM for conflict/dedup evaluation |
| `EMBED_DIM` | `1536` | Embedding vector dimension (must match the model) |

Optional — Search & Limits

| Variable | Default | Description |
|---|---|---|
| `DEFAULT_SEARCH_LIMIT` | `10` | Default result count for `search_memory` |
| `DEFAULT_LIST_LIMIT` | `50` | Default result count for `list_categories` |
| `DUP_THRESHOLD` | `0.95` | Cosine similarity threshold for deduplication |
| `CONFLICT_THRESHOLD` | `0.55` | Similarity threshold for conflict detection |
| `RELATES_TO_THRESHOLD` | `0.65` | Similarity threshold for `relates_to` edge creation |
| `MIN_SECTION_LENGTH` | `100` | Minimum character length for a chunk to be stored |
| `MAX_TAXONOMY_PATHS` | `40` | Maximum taxonomy paths assigned per ingestion |

Optional — OpenAI & Concurrency

| Variable | Default | Description |
|---|---|---|
| `OPENAI_TIMEOUT_S` | `60` | Per-request OpenAI timeout in seconds |
| `OPENAI_MAX_RETRIES` | `5` | Exponential-backoff retry limit |
| `MAX_CONCURRENT_API_CALLS` | `5` | Semaphore limit for parallel OpenAI requests |
| `EXTRACT_REASONING` | `low` | Reasoning effort for the extraction LLM |
| `CONFLICT_REASONING` | `minimal` | Reasoning effort for the conflict LLM |
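`OPENAI_MAX_RETRIES` and `MAX_CONCURRENT_API_CALLS` typically combine as a semaphore-guarded exponential-backoff loop. A minimal sketch under that assumption — the server's actual retry wrapper around the OpenAI SDK is not reproduced here:

```python
# Sketch: cap concurrent API calls with a semaphore and retry failures
# with exponential backoff plus a little jitter.
import asyncio
import random

MAX_CONCURRENT_API_CALLS = 5
OPENAI_MAX_RETRIES = 5

_semaphore = asyncio.Semaphore(MAX_CONCURRENT_API_CALLS)

async def call_with_backoff(fn, *args, base_delay: float = 0.5):
    """Run fn under the concurrency semaphore, retrying on any exception."""
    async with _semaphore:
        for attempt in range(OPENAI_MAX_RETRIES):
            try:
                return await fn(*args)
            except Exception:
                if attempt == OPENAI_MAX_RETRIES - 1:
                    raise  # out of retries: surface the error
                # Delay doubles each attempt: base, 2*base, 4*base, ...
                await asyncio.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
```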

Optional — Database

| Variable | Default | Description |
|---|---|---|
| `PG_POOL_MIN` | `1` | asyncpg minimum pool connections |
| `PG_POOL_MAX` | `10` | asyncpg maximum pool connections |
| `STAGING_RETENTION_DAYS` | `7` | Days to retain completed/failed staging jobs |

Optional — Server

| Variable | Default | Description |
|---|---|---|
| `PRODUCTION_PORT` | `8766` | Production MCP server port |
| `ADMIN_PORT` | `8767` | Admin MCP server port |
| `MCP_TRANSPORT` | `streamable-http` | FastMCP transport mode |
| `FASTMCP_JSON_RESPONSE` | — | Set to `1` to force JSON responses |
| `LOG_LEVEL` | `INFO` | `DEBUG` / `INFO` / `WARNING` |

Optional — System Primer

| Variable | Default | Description |
|---|---|---|
| `PRIMER_UPDATE_MAX_AGE_S` | `3600` | Maximum seconds before automatic primer regeneration |

Optional — Context Store

| Variable | Default | Description |
|---|---|---|
| `CONTEXT_DEFAULT_TTL_HOURS` | `24` | Default TTL for context store entries |
| `CONTEXT_MAX_VALUE_LENGTH` | `50000` | Maximum character length for context values |
| `CONTEXT_MAX_KEY_LENGTH` | `200` | Maximum character length for context keys |

Optional — Backup Service

Variable Description
GITHUB_PAT GitHub Personal Access Token with repo scope
GITHUB_BACKUP_REPO Target repo in owner/repo format
BACKUP_INTERVAL_SECONDS Seconds between backups (default: 21600 = 6 hours)

Running Locally (Development)

Requirements: Python 3.11+, PostgreSQL with pgvector.

# Create and activate virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure
cp .env.example .env
$EDITOR .env

# Start the server
python -m server
# Production: http://0.0.0.0:8766
# Admin:      http://0.0.0.0:8767

Backup Service

The backup/ directory contains a containerized PostgreSQL backup job that:

  1. Runs pg_dump on the configured interval (default: every 6 hours)
  2. Commits the dump to a private GitHub repository

The backup service starts automatically with docker compose up. Set GITHUB_PAT and GITHUB_BACKUP_REPO in your .env to enable it. If those variables are unset, the service will error on startup — remove the memory-backup service from docker-compose.yml if you don't need backups.


CLI Scripts

Standalone scripts in scripts/ (require DATABASE_URL in environment):

# Export all memories to a timestamped JSON file
python scripts/export_memories.py

# Generate an interactive graph visualization
python scripts/visualize_memories.py
open memory_map.html

Contributing

See CONTRIBUTING.md.

License

MIT
