CPersona

Persistent AI memory server with 3-layer hybrid search (vector + FTS5 + keyword) merged via Reciprocal Rank Fusion, confidence scoring, episodic and profile memory, and 16 tools. Zero LLM dependency. Works standalone with Claude Desktop and Claude Code. MIT licensed.

<div align="center">

cpersona

MCP Memory Server

Give Claude persistent memory across sessions. Single SQLite file. 16 tools. Zero LLM dependency.

License: MIT Python Tests

Quick Start · Features · Architecture · All Tools · Zenn Book (JP)

</div>


Standalone repository — This is the standalone version for use with Claude Desktop, Claude Code, and any MCP client. If you are a ClotoCore user, use the version in cloto-mcp-servers instead.

The Problem

Claude forgets everything between sessions. Every conversation starts from zero — no context about your project, your preferences, or what you discussed yesterday.

cpersona fixes this. It's an MCP server that stores memories in a local SQLite file and retrieves them through hybrid search. Claude remembers you.

Quick Start

Prerequisites: Python 3.10+, Git

git clone https://github.com/Cloto-dev/cpersona.git
cd cpersona
python -m venv .venv

# Windows
.venv\Scripts\activate
# macOS / Linux
# source .venv/bin/activate

pip install .

Claude Desktop — add to claude_desktop_config.json:

{
  "mcpServers": {
    "embedding": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/servers/embedding/server.py"],
      "env": {
        "EMBEDDING_PROVIDER": "onnx_jina_v5_nano",
        "EMBEDDING_HTTP_PORT": "8401"
      }
    },
    "cpersona": {
      "command": "/path/to/.venv/bin/python",
      "args": ["/path/to/cpersona/server.py"],
      "env": {
        "CPERSONA_DB_PATH": "/home/you/.claude/cpersona.db",
        "CPERSONA_EMBEDDING_MODE": "http",
        "CPERSONA_EMBEDDING_URL": "http://127.0.0.1:8401/embed"
      }
    }
  }
}

Windows: use .venv/Scripts/python.exe and C:/Users/you/.claude/cpersona.db

Claude Code:

claude mcp add-json embedding '{"type":"stdio","command":"/path/to/.venv/bin/python","args":["/path/to/servers/embedding/server.py"],"env":{"EMBEDDING_PROVIDER":"onnx_jina_v5_nano","EMBEDDING_HTTP_PORT":"8401"}}' -s user

claude mcp add-json cpersona '{"type":"stdio","command":"/path/to/.venv/bin/python","args":["/path/to/cpersona/server.py"],"env":{"CPERSONA_DB_PATH":"/home/you/.claude/cpersona.db","CPERSONA_EMBEDDING_MODE":"http","CPERSONA_EMBEDDING_URL":"http://127.0.0.1:8401/embed"}}' -s user

That's it. Claude now has persistent memory. Ask it to store something and recall it in a later session.

Features

Hybrid Search — Three independent retrieval strategies run in parallel and merge results via Reciprocal Rank Fusion (RRF):

| Layer | Method | Strength |
| --- | --- | --- |
| Vector | Cosine similarity (jina-v5-nano, 768d) | Semantic meaning |
| FTS5 | SQLite full-text search with trigram tokenizer | Exact terms, names, IDs |
| Keyword | Fallback pattern matching | Edge cases, partial matches |
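
As a rough illustration of how rank fusion combines the three layers, here is a minimal sketch of Reciprocal Rank Fusion. It is not taken from server.py: the function name `rrf_merge` and the memory IDs are hypothetical, and the only value borrowed from this README is the default smoothing parameter of 60 (`CPERSONA_RRF_K`).

```python
from collections import defaultdict


def rrf_merge(rankings, k=60):
    """Fuse several best-first ranked lists of IDs into one scored list.

    Each ID earns 1 / (k + rank) from every list it appears in, so items
    that several retrieval layers agree on rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep output deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-layer results (memory IDs, best match first).
vector_hits = ["m3", "m1", "m7"]
fts_hits = ["m1", "m9"]
keyword_hits = ["m1", "m3"]

merged = rrf_merge([vector_hits, fts_hits, keyword_hits])
for doc_id, score in merged:
    print(f"{doc_id}: {score:.4f}")
```

Because only ranks are used, the layers do not need comparable raw scores, which is presumably why rank fusion suits a mix of cosine similarity, FTS5 ranking, and pattern matching.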

Memory Types:

  • Declarative memory — Individual facts, decisions, instructions stored via store
  • Episodic memory — Conversation summaries archived via archive_episode
  • Profile memory — Accumulated user/project attributes via update_profile

Confidence Scoring — Each recalled memory gets a confidence score combining:

  • Cosine similarity (semantic relevance)
  • Dynamic time decay (adapts to corpus time range — a 1-year-old corpus and a 1-day-old corpus use different decay curves)
  • Recall boost (frequently useful memories surface more easily, with natural fade-out)
  • Completion factor (resolved topics decay faster)
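
The README does not give the scoring formula, so the following is only a sketch of how these four factors could be combined. Every constant and name here (`confidence`, the half-life rule, the doubling for resolved topics, the logarithmic boost) is an assumption chosen for illustration, and the fade-out of the recall boost over time is left out for brevity.

```python
import math


def confidence(cosine, age_days, corpus_span_days, recall_count, resolved):
    """Illustrative confidence score in [0, 1]; not cpersona's actual formula."""
    # Assumption: the decay half-life scales with the corpus time range,
    # so a long-lived corpus forgets more slowly than a one-day corpus.
    half_life = max(corpus_span_days / 2.0, 1.0)
    rate = math.log(2) / half_life
    if resolved:
        rate *= 2.0  # assumption: resolved topics decay twice as fast
    decay = math.exp(-rate * age_days)
    # Assumption: recall boost grows with diminishing returns.
    boost = 1.0 + 0.1 * math.log1p(recall_count)
    return max(0.0, min(1.0, cosine * decay * boost))


# The same 30-day-old memory, open vs. resolved, in a one-year corpus.
print(confidence(0.8, 30, 365, recall_count=3, resolved=False))
print(confidence(0.8, 30, 365, recall_count=3, resolved=True))
```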

Zero LLM Dependency — cpersona is a pure data server. It never calls an LLM internally. All summarization and extraction is performed by the calling agent. This means zero API costs from cpersona itself, deterministic behavior, and no hidden latency.

Additional capabilities:

  • Agent namespace isolation — multiple agents share one DB without interference
  • Background task queue — DB-persisted, crash-recoverable async processing
  • JSONL export/import — full memory portability between environments
  • Agent-to-agent memory merge — atomic copy/move with deduplication
  • Auto-calibration — statistical threshold tuning via null distribution z-score (no labels needed)
  • Health check — 15 automated detections with auto-repair (contamination, duplicates, FTS desync, invalid data, stale tasks)
  • stdio + Streamable HTTP transport
  • Single-file SQLite — no external database required
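
One way to read the auto-calibration bullet: sample similarities between unrelated pairs of stored embeddings to estimate a null distribution, then place the threshold a few standard deviations above its mean. The sketch below follows that reading. The function names, the default z of 2.0, the sample count, and the random test vectors are all assumptions, not details of `calibrate_threshold` itself.

```python
import math
import random
import statistics


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def calibrate_threshold(embeddings, z=2.0, samples=500, seed=0):
    """Estimate a similarity cutoff from random pairs; no relevance labels needed."""
    rng = random.Random(seed)
    # Random pairs are assumed to be unrelated, so their similarities
    # approximate the "no match" (null) distribution.
    null_sims = [cosine(*rng.sample(embeddings, 2)) for _ in range(samples)]
    mean = statistics.fmean(null_sims)
    std = statistics.pstdev(null_sims)
    return mean + z * std


# Stand-in data: 50 random 32-dimensional vectors instead of real embeddings.
data_rng = random.Random(1)
embeddings = [[data_rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(50)]
threshold = calibrate_threshold(embeddings)
print(f"similarity threshold: {threshold:.3f}")
```

A higher z trades recall for precision: fewer chance matches pass the vector layer, at the cost of dropping weaker but genuine ones.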

Architecture

                         ┌─────────────────────────────────────┐
                         │            MCP Host                 │
                         │   (Claude Desktop / Claude Code)    │
                         └──────────────┬──────────────────────┘
                                        │ MCP (JSON-RPC)
                         ┌──────────────▼──────────────────────┐
                         │           cpersona                  │
                         │         (server.py)                 │
                         │                                     │
                         │  ┌─────────┐  ┌─────────┐          │
                         │  │  store   │  │ recall  │  ...     │
                         │  └────┬────┘  └────┬────┘          │
                         │       │             │               │
                         │  ┌────▼─────────────▼────────────┐  │
                         │  │         SQLite DB              │  │
                         │  │                                │  │
                         │  │  memories    (content + embed) │  │
                         │  │  episodes    (summaries)       │  │
                         │  │  profiles    (attributes)      │  │
                         │  │  memories_fts (FTS5 index)     │  │
                         │  │  episodes_fts (FTS5 index)     │  │
                         │  │  task_queue   (async jobs)     │  │
                         │  └────────────────────────────────┘  │
                         │                                      │
                         └──────────────┬───────────────────────┘
                                        │ HTTP
                         ┌──────────────▼──────────────────────┐
                         │       Embedding Server              │
                         │  (jina-v5-nano ONNX, 768d)          │
                         └─────────────────────────────────────┘

Recall flow (RRF mode):

Query → ┌── Vector search (cosine similarity)  ──┐
        ├── FTS5 search (episodes + memories)    ──┼── RRF merge → Confidence scoring → Top-K
        └── Keyword fallback                     ──┘

Benchmarks

Tested on LMEB (Long-term Memory Evaluation Benchmark), a suite of 22 evaluation tasks measuring memory retrieval quality:

| Embedding Model | Params | Dimensions | Mean NDCG@10 |
| --- | --- | --- | --- |
| MiniLM-L6-v2 | 22M | 384 | 36.88 |
| e5-small | 33M | 384 | 46.36 |
| jina-v5-nano | 33M | 768 | 54.14 |

jina-v5-nano improves mean NDCG@10 by about 47% relative to the MiniLM-L6-v2 baseline (54.14 vs. 36.88).
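
For readers unfamiliar with the metric, here is a small sketch of NDCG@10 using the common linear-gain definition, followed by the arithmetic behind the relative improvement. The helper names are hypothetical, LMEB's own evaluation code may differ in detail, and the sketch assumes the relevance list covers every relevant candidate so the ideal ordering can be derived from it.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (best possible) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance of each returned result, in ranked order (1 = relevant).
print(ndcg_at_k([1, 0, 0]))  # relevant item ranked first
print(ndcg_at_k([0, 1]))     # relevant item ranked second

# Relative improvement of jina-v5-nano over MiniLM-L6-v2, from the table above.
baseline, best = 36.88, 54.14
improvement_pct = (best - baseline) / baseline * 100
print(f"{improvement_pct:.1f}%")
```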

All Tools

| Tool | Description |
| --- | --- |
| store | Store a message in agent memory |
| recall | Recall relevant memories (vector + FTS5 + keyword, RRF merge) |
| get_profile | Get current agent profile |
| update_profile | Save pre-computed agent profile |
| archive_episode | Archive conversation episode with summary and keywords |
| list_memories | List recent memories |
| list_episodes | List archived episodes |
| delete_memory | Delete a single memory (ownership enforced) |
| delete_episode | Delete a single episode (ownership enforced) |
| delete_agent_data | Delete all data for an agent |
| calibrate_threshold | Auto-calibrate vector search threshold via z-score |
| export_memories | Export to JSONL (memories, episodes, profiles) |
| import_memories | Import from JSONL (idempotent via msg_id dedup) |
| merge_memories | Merge one agent's data into another (atomic, with dedup) |
| get_queue_status | Background task queue status |
| check_health | 15-point database health check with auto-repair |
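
MCP hosts invoke these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for `store`. The envelope follows the MCP specification, but the argument names `agent_id` and `content` are guesses for illustration only; check the schema the server reports via `tools/list` for the real parameters.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build an MCP JSON-RPC 2.0 request that invokes a named tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: the real `store` schema may use different names.
req = tool_call_request(
    1, "store", {"agent_id": "demo", "content": "User prefers tabs over spaces"}
)
print(json.dumps(req, indent=2))
```

In normal use the host (Claude Desktop or Claude Code) builds these requests itself; the sketch is mainly useful when testing the server from a script.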

Configuration

All settings via environment variables with sensible defaults:

| Variable | Default | Description |
| --- | --- | --- |
| CPERSONA_DB_PATH | ./cpersona.db | SQLite database path |
| CPERSONA_EMBEDDING_MODE | http | Embedding mode (http or disabled) |
| CPERSONA_EMBEDDING_URL | http://127.0.0.1:8401/embed | Embedding server URL |
| CPERSONA_VECTOR_SEARCH_MODE | remote | Vector search mode |
| CPERSONA_SEARCH_MODE | rrf | Search strategy (rrf or cascade) |
| CPERSONA_RRF_K | 60 | RRF smoothing parameter |
| CPERSONA_CONFIDENCE_ENABLED | false | Include confidence metadata in results |
| CPERSONA_AUTO_CALIBRATE | false | Auto-calibrate on startup |
| CPERSONA_TASK_QUEUE_ENABLED | false | Enable background task queue |
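
To make the defaults concrete, here is a sketch of reading these variables in Python. The variable names and defaults come from the table above; the `Settings` class, `load_settings`, and the set of strings accepted as "true" are assumptions and may not match how server.py parses its environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_path: str
    embedding_mode: str
    embedding_url: str
    vector_search_mode: str
    search_mode: str
    rrf_k: int
    confidence_enabled: bool
    auto_calibrate: bool
    task_queue_enabled: bool


def _flag(env, name):
    # Assumption: these spellings count as true; anything else is false.
    return env.get(name, "false").strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read cpersona settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        db_path=env.get("CPERSONA_DB_PATH", "./cpersona.db"),
        embedding_mode=env.get("CPERSONA_EMBEDDING_MODE", "http"),
        embedding_url=env.get("CPERSONA_EMBEDDING_URL", "http://127.0.0.1:8401/embed"),
        vector_search_mode=env.get("CPERSONA_VECTOR_SEARCH_MODE", "remote"),
        search_mode=env.get("CPERSONA_SEARCH_MODE", "rrf"),
        rrf_k=int(env.get("CPERSONA_RRF_K", "60")),
        confidence_enabled=_flag(env, "CPERSONA_CONFIDENCE_ENABLED"),
        auto_calibrate=_flag(env, "CPERSONA_AUTO_CALIBRATE"),
        task_queue_enabled=_flag(env, "CPERSONA_TASK_QUEUE_ENABLED"),
    )


defaults = load_settings({})
custom = load_settings({"CPERSONA_RRF_K": "20", "CPERSONA_CONFIDENCE_ENABLED": "true"})
print(defaults.db_path, defaults.rrf_k, custom.rrf_k, custom.confidence_enabled)
```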

Stats

  • ~3,000 LOC Python (single file, server.py)
  • 117 tests across 12 test modules
  • Schema v7 (auto-migrating)
  • MIT License

Works With

cpersona is an MCP server, so it works with any MCP-compatible host, including Claude Desktop and Claude Code.

Part of ClotoCore

cpersona is the memory layer of ClotoCore, an open-source AI agent platform written in Rust. While cpersona is fully standalone (MIT license), it was designed to give AI agents persistent, searchable memory within the ClotoCore ecosystem.

License

MIT: free to use from any MCP host without restriction.
