memory-mcp

A self-organizing, persistent semantic memory layer that enables AI agents to store, categorize, and retrieve information using hybrid vector and keyword search. It features autonomous chunking, deduplication, and hierarchical taxonomy management through a PostgreSQL-backed MCP server.

Persistent, self-organizing semantic memory for AI agents — served as an MCP server.

What is this?

memory-mcp is a Model Context Protocol server that gives AI agents durable, searchable memory backed by PostgreSQL and pgvector. Drop it into any MCP-compatible client (Claude Code, Cursor, Windsurf, etc.) and your agent gains the ability to remember, retrieve, and reason over information across sessions — without you managing any schema or storage logic.

What it does autonomously:

Chunks and embeds incoming text
Categorizes memories into a hierarchical taxonomy (ltree dot-paths)
Deduplicates against existing memories and resolves conflicts
Synthesizes a System Primer — a compressed, always-current summary of everything it knows — and surfaces it at session start
Expires stale memories via TTL and prompts for verification of aging facts

Why memory-mcp?

Feature	memory-mcp	Simple vector DB	LangChain / LlamaIndex memory
Schema management	Automatic	Manual	Manual
Deduplication	Semantic + LLM	None	None
Taxonomy	Auto-assigned ltree	None	None
Session bootstrap	System Primer	Manual RAG	Manual
Conflict resolution	LLM-evaluated	None	None
Ephemeral context	Built-in (TTL store)	No	No
Self-hostable	Yes (Docker)	Varies	No
MCP-native	Yes	No	No

Architecture

AI Agent (Claude Code / Cursor / Windsurf)
        │  HTTP (MCP — Streamable HTTP)
        ▼
┌──────────────────────────────────────────┐
│              server.py                    │
│  ┌─────────────────┐ ┌─────────────────┐ │
│  │ Production MCP  │ │   Admin MCP     │ │
│  │   :8766/mcp     │ │   :8767/mcp     │ │
│  └────────┬────────┘ └────────┬────────┘ │
│           │  tools/           │           │
│  ┌────────▼──────────────────▼────────┐  │
│  │  ingestion · search · context      │  │
│  │  crud · admin_tools · context_store│  │
│  └────────────────┬───────────────────┘  │
│                   │                       │
│  ┌────────────────▼───────────────────┐  │
│  │         Background Workers          │  │
│  │  Ingestion Queue · TTL Daemon       │  │
│  │  System Primer Auto-Regeneration    │  │
│  └────────────────┬───────────────────┘  │
└───────────────────┼──────────────────────┘
                    │  asyncpg
                    ▼
         PostgreSQL + pgvector
         ┌─────────────────┐
         │ memories        │  chunks, embeddings, ltree paths
         │ memory_edges    │  sequence_next, relates_to, supersedes
         │ ingestion_staging│ async job queue
         │ context_store   │  ephemeral TTL store
         └─────────────────┘
                    │
         ┌──────────▼──────────┐
         │  Backup Service     │  pg_dump → private GitHub repo
         └─────────────────────┘

Two servers, one process:

Production (:8766) — tools safe for the agent to call freely
Admin (:8767) — superset including destructive tools (delete, prune, bulk-move). Point your agent at production; use admin for maintenance.

Quickstart (Docker)

Prerequisites: Docker + Docker Compose, an OpenAI API key.

# 1. Clone
git clone https://github.com/isaacriehm/memory-mcp.git
cd memory-mcp

# 2. Configure
cp .env.example .env
$EDITOR .env   # set OPENAI_API_KEY and DB_PASSWORD at minimum

# 3. Start
docker compose up -d

# Production MCP endpoint: http://localhost:8766/mcp
# Admin MCP endpoint:      http://localhost:8767/mcp

To rebuild after code changes:

docker compose up -d --build memory-api

Connecting to an MCP Client

Claude Code

Add to your project's .claude/settings.json or ~/.claude/settings.json:

{
  "mcpServers": {
    "memory": {
      "type": "http",
      "url": "http://localhost:8766/mcp"
    }
  }
}

Or via the CLI:

claude mcp add memory --transport http http://localhost:8766/mcp

Then add this instruction to your CLAUDE.md so the agent always bootstraps memory at session start:

## Memory
At the start of every session, call `initialize_context` before anything else.
This returns your System Primer — your identity, current knowledge taxonomy, and retrieval guide.
Always consult it before answering questions about prior context.

Cursor / Windsurf

Add to your MCP settings (.cursor/mcp.json or equivalent):

{
  "mcpServers": {
    "memory": {
      "url": "http://localhost:8766/mcp"
    }
  }
}

MCP Tools

Production Tools (`:8766`)

Tool	Description
`initialize_context`	Call first every session. Returns the System Primer + verification prompts for aging memories.
`memorize_context`	Ingest raw text. Automatically chunks, embeds, categorizes, and deduplicates. Supports `ttl_days`.
`check_ingestion_status`	Poll async ingestion job by `job_id`. Returns `pending`, `processing`, `complete`, or `failed`.
`search_memory`	Hybrid vector + BM25 search with Reciprocal Rank Fusion. Filter by `category_path`.
`list_categories`	Return all occupied taxonomy paths with memory counts.
`explore_taxonomy`	Drill into a collapsed `[+N more]` branch from `list_categories`.
`fetch_document`	Reconstruct a full document by following `sequence_next` edges from a memory ID.
`trace_history`	Inspect the full supersession chain (oldest → newest) for a memory.
`confirm_memory_validity`	Confirm an aging memory is still accurate. Advances its `verify_after` date.
`update_memory`	Rewrite a memory's content in-place (preserves identity, edges, history).
`set_context`	Write a key/value pair to the ephemeral context store with a TTL.
`get_context`	Retrieve an ephemeral context entry by key.
`list_context_keys`	List active (non-expired) context keys, optionally filtered by scope.
`delete_context`	Explicitly delete a context entry before its TTL expires.
`extend_context_ttl`	Push a context entry's expiry forward by N hours.

Admin-Only Tools (`:8767`)

Tool	Description
`delete_memory`	Hard-delete a memory by ID (cascades edges).
`prune_history`	Batch-delete superseded memories older than N days.
`export_memories`	Export all active memories to JSON.
`recategorize_memory`	Move a single memory to a new taxonomy path.
`bulk_move_category`	Move an entire taxonomy branch (e.g. `old.prefix` → `new.prefix`).
`update_memory_metadata`	Patch a memory's metadata JSONB in-place.
`run_diagnostics`	Report on pool health, memory counts, ingestion queue depth.
`get_ingestion_stats`	Breakdown of ingestion job statuses.
`flush_staging`	Clear all completed/failed staging jobs immediately.

Taxonomy

Memories are organized into a dot-path hierarchy using PostgreSQL ltree. The system assigns paths automatically during ingestion. You can override an assigned path with the admin-only tools recategorize_memory (single memory) or bulk_move_category (entire branch).

Example paths:

user.profile.personal
user.health.medical
projects.myapp.architecture
projects.myapp.decisions
organizations.acme.business
concepts.ai.behavior
reference.system.primer     ← auto-generated System Primer lives here

Search is subtree-aware — passing category_path: "projects.myapp" returns everything under that branch.
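
To illustrate what subtree-aware matching means, the sketch below mirrors the semantics of the ltree descendant test (`path <@ ancestor` in PostgreSQL) in plain Python. It compares whole labels rather than raw string prefixes. The function name `in_subtree` is hypothetical, and the real server presumably performs this filter in SQL.

```python
def in_subtree(path: str, ancestor: str) -> bool:
    """Return True if `path` equals `ancestor` or lies beneath it.

    Comparison is label by label, so "projects.myapp2" is NOT treated
    as a descendant of "projects.myapp".
    """
    path_labels = path.split(".")
    ancestor_labels = ancestor.split(".")
    return path_labels[: len(ancestor_labels)] == ancestor_labels


paths = [
    "projects.myapp.architecture",
    "projects.myapp.decisions",
    "projects.myapp2.notes",
    "user.profile.personal",
]
matches = [p for p in paths if in_subtree(p, "projects.myapp")]
print(matches)
```

A naive `str.startswith` check would wrongly include `projects.myapp2.notes`, which is why the comparison is done on split labels.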

System Primer

initialize_context returns a synthesized summary stored at reference.system.primer. It includes:

A compressed user/agent profile
The full taxonomy tree with memory counts
Retrieval guidance

The primer auto-regenerates in the background when ≥10 new memories are ingested or when the previous primer is older than 1 hour. You can force regeneration via the admin tool synthesize_system_primer.
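
The regeneration rule above can be sketched as a simple predicate. This is only an illustration of the stated condition (at least 10 new memories, or a primer older than one hour); the function name, its parameters, and the use of Unix timestamps are assumptions, not the server's actual implementation.

```python
import time

NEW_MEMORY_TRIGGER = 10         # from the text: ">= 10 new memories"
PRIMER_MAX_AGE_S = 3600         # from the text: "older than 1 hour"


def primer_is_stale(new_memories, primer_created_at, now=None):
    """Decide whether the System Primer should be regenerated.

    `new_memories` counts memories ingested since the last primer;
    `primer_created_at` and `now` are Unix timestamps in seconds.
    """
    now = time.time() if now is None else now
    age_s = now - primer_created_at
    return new_memories >= NEW_MEMORY_TRIGGER or age_s > PRIMER_MAX_AGE_S


# Ten new memories trigger regeneration even for a fresh primer.
print(primer_is_stale(10, primer_created_at=1000.0, now=1001.0))
# A quiet system still regenerates once the primer is over an hour old.
print(primer_is_stale(3, primer_created_at=1000.0, now=1000.0 + 3601))
```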

Environment Variables

Copy .env.example to .env and fill in your values.

Required

Variable	Description
`DATABASE_URL`	PostgreSQL connection string (e.g. `postgresql://user:pass@localhost:5432/memory`)
`OPENAI_API_KEY`	OpenAI API key for embeddings and LLM calls
`DB_PASSWORD`	PostgreSQL password (used by Docker Compose)

Optional — Models & Embeddings

Variable	Default	Description
`EMBEDDING_MODEL`	`text-embedding-3-small`	OpenAI embedding model
`EXTRACT_MODEL`	`gpt-5-mini`	LLM for semantic section extraction and categorization
`CONFLICT_MODEL`	`gpt-5-nano`	LLM for conflict/dedup evaluation
`EMBED_DIM`	`1536`	Embedding vector dimension (must match model)

Optional — Search & Limits

Variable	Default	Description
`DEFAULT_SEARCH_LIMIT`	`10`	Default result count for `search_memory`
`DEFAULT_LIST_LIMIT`	`50`	Default result count for `list_categories`
`DUP_THRESHOLD`	`0.95`	Cosine similarity threshold for deduplication
`CONFLICT_THRESHOLD`	`0.55`	Similarity threshold for conflict detection
`RELATES_TO_THRESHOLD`	`0.65`	Similarity threshold for `relates_to` edge creation
`MIN_SECTION_LENGTH`	`100`	Minimum character length for a chunk to be stored
`MAX_TAXONOMY_PATHS`	`40`	Max taxonomy paths assigned per ingestion
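
To make the similarity thresholds more concrete, here is a minimal sketch of a cosine-similarity duplicate check using the default `DUP_THRESHOLD` of 0.95. The helper names, the tiny two-dimensional vectors, and the choice of `>=` for the comparison are assumptions for illustration. The README does not say how the conflict and `relates_to` thresholds interact with deduplication, so they are left out here.

```python
import math

DUP_THRESHOLD = 0.95  # default from the table above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def is_duplicate(new_vec, existing_vec, threshold=DUP_THRESHOLD):
    """Assumed rule: similarity at or above the threshold is a duplicate."""
    return cosine_similarity(new_vec, existing_vec) >= threshold


print(is_duplicate([1.0, 0.0], [1.0, 0.1]))  # nearly parallel vectors
print(is_duplicate([1.0, 0.0], [1.0, 1.0]))  # 45 degrees apart
```

Real embeddings from `text-embedding-3-small` have 1536 dimensions (`EMBED_DIM`), but the arithmetic is the same.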

Optional — OpenAI & Concurrency

Variable	Default	Description
`OPENAI_TIMEOUT_S`	`60`	Per-request OpenAI timeout in seconds
`OPENAI_MAX_RETRIES`	`5`	Exponential-backoff retry limit
`MAX_CONCURRENT_API_CALLS`	`5`	Semaphore for parallel OpenAI requests
`EXTRACT_REASONING`	`low`	Reasoning effort for extraction LLM
`CONFLICT_REASONING`	`minimal`	Reasoning effort for conflict LLM
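
The concurrency and retry settings describe a common pattern: a semaphore caps the number of in-flight requests, and failed calls are retried with exponential backoff. The sketch below shows that pattern with `asyncio`. The function `call_with_retry`, the `base_delay` parameter, the jitter formula, and the reading of `OPENAI_MAX_RETRIES` as "retries after the first attempt" are all assumptions, not the project's actual code.

```python
import asyncio
import random

MAX_CONCURRENT_API_CALLS = 5  # default from the table above
OPENAI_MAX_RETRIES = 5        # default from the table above


async def call_with_retry(fn, semaphore, max_retries=OPENAI_MAX_RETRIES,
                          base_delay=0.5):
    """Await `fn()` under a semaphore, retrying failures with backoff."""
    for attempt in range(max_retries + 1):
        try:
            # Hold the semaphore only while the request is in flight,
            # not while sleeping between attempts.
            async with semaphore:
                return await fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Delay doubles each attempt; random jitter spreads retries out.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            await asyncio.sleep(delay)


async def demo():
    attempts = {"n": 0}

    async def flaky():
        # Hypothetical stand-in for an API call that fails twice.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise RuntimeError("transient failure")
        return "ok"

    semaphore = asyncio.Semaphore(MAX_CONCURRENT_API_CALLS)
    result = await call_with_retry(flaky, semaphore, base_delay=0.01)
    return result, attempts["n"]


print(asyncio.run(demo()))
```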

Optional — Database

Variable	Default	Description
`PG_POOL_MIN`	`1`	asyncpg minimum pool connections
`PG_POOL_MAX`	`10`	asyncpg maximum pool connections
`STAGING_RETENTION_DAYS`	`7`	Days to retain completed/failed staging jobs

Optional — Server

Variable	Default	Description
`PRODUCTION_PORT`	`8766`	Production MCP server port
`ADMIN_PORT`	`8767`	Admin MCP server port
`MCP_TRANSPORT`	`streamable-http`	FastMCP transport mode
`FASTMCP_JSON_RESPONSE`	—	Set to `1` to force JSON responses
`LOG_LEVEL`	`INFO`	`DEBUG` / `INFO` / `WARNING`

Optional — System Primer

Variable	Default	Description
`PRIMER_UPDATE_MAX_AGE_S`	`3600`	Max seconds before auto primer regeneration

Optional — Context Store

Variable	Default	Description
`CONTEXT_DEFAULT_TTL_HOURS`	`24`	Default TTL for context store entries
`CONTEXT_MAX_VALUE_LENGTH`	`50000`	Max character length for context values
`CONTEXT_MAX_KEY_LENGTH`	`200`	Max character length for context keys
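
As an illustration of how these limits and TTLs fit together, here is a small in-memory sketch of a TTL key/value store. The real context store lives in the PostgreSQL `context_store` table, so the class `TTLStore`, its method names, and the injectable clock are hypothetical and only model the behaviour described for `set_context`, `get_context`, and `extend_context_ttl`.

```python
import time

CONTEXT_DEFAULT_TTL_HOURS = 24    # defaults from the table above
CONTEXT_MAX_VALUE_LENGTH = 50_000
CONTEXT_MAX_KEY_LENGTH = 200


class TTLStore:
    """In-memory stand-in for an ephemeral key/value store with expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so expiry can be tested
        self._entries = {}    # key -> (value, expires_at)

    def set(self, key, value, ttl_hours=CONTEXT_DEFAULT_TTL_HOURS):
        if len(key) > CONTEXT_MAX_KEY_LENGTH:
            raise ValueError("key exceeds CONTEXT_MAX_KEY_LENGTH")
        if len(value) > CONTEXT_MAX_VALUE_LENGTH:
            raise ValueError("value exceeds CONTEXT_MAX_VALUE_LENGTH")
        self._entries[key] = (value, self._clock() + ttl_hours * 3600)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily drop expired entries
            return None
        return value

    def extend(self, key, hours):
        """Push the expiry forward by `hours`; False if missing or expired."""
        value = self.get(key)
        if value is None:
            return False
        _, expires_at = self._entries[key]
        self._entries[key] = (value, expires_at + hours * 3600)
        return True


store = TTLStore()
store.set("current_task", "refactor ingestion queue", ttl_hours=1)
print(store.get("current_task"))
```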

Optional — Backup Service

Variable	Description
`GITHUB_PAT`	GitHub Personal Access Token with `repo` scope
`GITHUB_BACKUP_REPO`	Target repo in `owner/repo` format
`BACKUP_INTERVAL_SECONDS`	Seconds between backups (default: `21600` = 6 hours)

Running Locally (Development)

Requirements: Python 3.11+, PostgreSQL with pgvector.

# Create and activate virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure
cp .env.example .env
$EDITOR .env

# Start the server
python -m server
# Production: http://0.0.0.0:8766
# Admin:      http://0.0.0.0:8767

Backup Service

The backup/ directory contains a containerized PostgreSQL backup job that:

Runs pg_dump on the configured interval (default: every 6 hours)
Commits the dump to a private GitHub repository

The backup service starts automatically with docker compose up and requires GITHUB_PAT and GITHUB_BACKUP_REPO to be set in your .env. If those variables are unset, the service fails on startup; remove the memory-backup service from docker-compose.yml if you don't need backups.

CLI Scripts

Standalone scripts in scripts/ (require DATABASE_URL in environment):

# Export all memories to a timestamped JSON file
python scripts/export_memories.py

# Generate an interactive graph visualization
python scripts/visualize_memories.py
open memory_map.html

Contributing

See CONTRIBUTING.md.

License

MIT
