Mnemo

Provides AI assistants with extended memory by loading large codebases, documentation sites, PDFs, and GitHub repos into Gemini's context cache, so they can be queried with full recall and without complex RAG pipelines.

Extended memory for AI assistants via Gemini context caching.

Mnemo (Greek: memory) gives AI assistants like Claude access to large codebases, documentation sites, PDFs, and more by leveraging Gemini's 1M token context window and context caching features.

Why Mnemo?

Instead of complex RAG pipelines with embeddings and retrieval, Mnemo takes a simpler approach:

  • Load your entire codebase into Gemini's context cache
  • Query it with natural language
  • Let Claude orchestrate while Gemini holds the context

This gives you:

  • Perfect recall - no chunking or retrieval means no lost context
  • Lower latency - cached context is served quickly
  • Cost savings - cached tokens cost 75-90% less than regular input tokens
  • Simplicity - no vector databases, embeddings, or complex retrieval logic

What Can Mnemo Load?

Source                      Local Server    Worker
GitHub repos (public)
GitHub repos (private)
Any URL (docs, articles)
PDF documents
JSON APIs
Local files/directories
Multi-page crawls           ✅ unlimited    ✅ 40 pages max

Deployment Options

Mnemo can be deployed in three ways depending on your needs.

Option 1: Local Server (Development & Full Features)

Best for development and when you need to load local files.

# Clone and install
git clone https://github.com/logos-flux/mnemo
cd mnemo
bun install

# Set your Gemini API key
export GEMINI_API_KEY=your_key_here

# Start the server
bun run dev

Claude Code MCP config:

{
  "mcpServers": {
    "mnemo": {
      "type": "http",
      "url": "http://localhost:8080/mcp"
    }
  }
}

Option 2: Self-Hosted Cloudflare Worker (Recommended for Claude.ai)

Deploy to your own Cloudflare account. You control your data and costs.

Prerequisites: a Cloudflare account, a Gemini API key, and Bun (the commands below run Wrangler through bunx).

# Clone the repo and enter the Worker package
git clone https://github.com/logos-flux/mnemo
cd mnemo/packages/cf-worker

# Configure secrets
bunx wrangler secret put GEMINI_API_KEY
bunx wrangler secret put MNEMO_AUTH_TOKEN  # Optional but recommended

# Create D1 database
bunx wrangler d1 create mnemo-cache

# Deploy
bunx wrangler deploy

Claude.ai MCP config:

{
  "mcpServers": {
    "mnemo": {
      "type": "http",
      "url": "https://mnemo.<your-subdomain>.workers.dev/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_AUTH_TOKEN"
      }
    }
  }
}

Why use this? Claude.ai can't connect to localhost. The Worker gives you an external endpoint that Claude.ai can reach.


Option 3: Managed Hosting (VIP)

Don't want to manage infrastructure? We offer fully managed Mnemo hosting for select clients.

Includes:

  • Dedicated Worker deployment
  • Priority support
  • Custom domain
  • Usage monitoring

Contact: lf@logosflux.io for pricing and availability.


Usage Examples

# Load a GitHub repo
curl -X POST http://localhost:8080/tools/context_load \
  -H "Content-Type: application/json" \
  -d '{"source": "https://github.com/honojs/hono", "alias": "hono"}'

# Load a documentation site (crawls up to token target)
curl -X POST http://localhost:8080/tools/context_load \
  -H "Content-Type: application/json" \
  -d '{"source": "https://hono.dev/docs", "alias": "hono-docs"}'

# Load a PDF
curl -X POST http://localhost:8080/tools/context_load \
  -H "Content-Type: application/json" \
  -d '{"source": "https://arxiv.org/pdf/2303.08774.pdf", "alias": "gpt4-paper"}'

# Load a private repo (with GitHub token)
curl -X POST http://localhost:8080/tools/context_load \
  -H "Content-Type: application/json" \
  -d '{"source": "https://github.com/owner/private-repo", "alias": "private", "githubToken": "ghp_xxx"}'

# Load multiple sources into one cache
curl -X POST http://localhost:8080/tools/context_load \
  -H "Content-Type: application/json" \
  -d '{"sources": ["https://github.com/owner/repo", "https://docs.example.com"], "alias": "combined"}'

# Query the cache
curl -X POST http://localhost:8080/tools/context_query \
  -H "Content-Type: application/json" \
  -d '{"alias": "hono", "query": "How do I add middleware?"}'

# List active caches
curl -X POST http://localhost:8080/tools/context_list \
  -H "Content-Type: application/json" -d '{}'

# Get usage stats with cost tracking
curl -X POST http://localhost:8080/tools/context_stats \
  -H "Content-Type: application/json" -d '{}'

# Evict when done
curl -X POST http://localhost:8080/tools/context_evict \
  -H "Content-Type: application/json" \
  -d '{"alias": "hono"}'
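
The curl calls above all share one shape: a JSON POST to /tools/<tool name>. A minimal Python sketch of that pattern follows, using only the standard library. The helper names `build_tool_request` and `call_tool` are hypothetical, and decoding the response as JSON is an assumption, since the response format is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # local server address used in the curl examples


def build_tool_request(tool, payload, base_url=BASE_URL, auth_token=None):
    """Build (but do not send) a POST request to /tools/<tool>."""
    headers = {"Content-Type": "application/json"}
    if auth_token:
        # Only needed when the server has MNEMO_AUTH_TOKEN configured.
        headers["Authorization"] = f"Bearer {auth_token}"
    return urllib.request.Request(
        url=f"{base_url}/tools/{tool}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def call_tool(tool, payload, **kwargs):
    """Send the request and decode the body, assuming the server replies with JSON."""
    request = build_tool_request(tool, payload, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a local server running, `call_tool("context_load", {"source": "https://github.com/honojs/hono", "alias": "hono"})` followed by `call_tool("context_query", {"alias": "hono", "query": "How do I add middleware?"})` would mirror the first and sixth curl examples.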

CLI

# Start server
mnemo serve

# Start MCP stdio transport (for Claude Desktop)
mnemo stdio

# Load a project
mnemo load ./my-project my-proj

# Query
mnemo query my-proj "What's the main entry point?"

# List caches
mnemo list

# Remove cache
mnemo evict my-proj

MCP Tools

Tool              Description
context_load      Load GitHub repos, URLs, PDFs, or local dirs into Gemini cache
context_query     Query a cached context with natural language
context_list      List all active caches with token counts and expiry
context_evict     Remove a cache
context_stats     Get usage statistics with cost tracking
context_refresh   Reload a cache with fresh content
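
Over the /mcp endpoint, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` messages rather than the REST-style /tools/* routes. The sketch below only builds such a message. The `tools_call` helper is hypothetical, and a real MCP client library would also handle the initialization handshake and the transport, which are omitted here.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests carry a unique id


def tools_call(name, arguments):
    """Build an MCP `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


message = tools_call("context_query", {"alias": "hono", "query": "How do I add middleware?"})
print(json.dumps(message, indent=2))
```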

context_load Parameters

Parameter           Description
source              Single source: GitHub URL, any URL, or local path
sources             Multiple sources to combine into one cache
alias               Friendly name for this cache (1-64 chars)
ttl                 Time to live in seconds (60-86400, default 3600)
githubToken         GitHub token for private repos
systemInstruction   Custom system prompt for queries
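
The documented bounds can be checked client-side before a request is sent. The following is a sketch, not Mnemo's own validation: the function name is hypothetical, and treating `alias` as required and `source`/`sources` as mutually exclusive are assumptions that the parameter table does not state.

```python
MIN_TTL, MAX_TTL, DEFAULT_TTL = 60, 86400, 3600  # seconds, from the parameter table
MAX_ALIAS_LEN = 64


def validate_context_load(params):
    """Return a list of problems in a context_load payload (empty if it looks valid)."""
    problems = []

    # Assumption: exactly one of 'source' / 'sources' should be supplied.
    if bool(params.get("source")) == bool(params.get("sources")):
        problems.append("provide exactly one of 'source' or 'sources'")

    # Assumption: 'alias' is required; the table only gives its length bounds.
    alias = params.get("alias")
    if not isinstance(alias, str) or not 1 <= len(alias) <= MAX_ALIAS_LEN:
        problems.append("'alias' must be a string of 1-64 characters")

    # 'ttl' is optional and falls back to the documented default.
    ttl = params.get("ttl", DEFAULT_TTL)
    if isinstance(ttl, bool) or not isinstance(ttl, int) or not MIN_TTL <= ttl <= MAX_TTL:
        problems.append("'ttl' must be an integer between 60 and 86400 seconds")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a payload at once.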

Configuration

Variable           Description                          Default
GEMINI_API_KEY     Your Gemini API key                  Required
MNEMO_PORT         Server port (local only)             8080
MNEMO_DIR          Data directory (local only)          ~/.mnemo
MNEMO_AUTH_TOKEN   Auth token for protected endpoints   None
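
To illustrate how these variables and defaults fit together, here is a sketch that reads them into a typed object. `MnemoConfig` and `load_config` are hypothetical names used for illustration only. Mnemo itself runs on Bun, so this is not its actual loader.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class MnemoConfig:
    gemini_api_key: str
    port: int
    data_dir: Path
    auth_token: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> MnemoConfig:
    """Read the documented variables, applying the documented defaults."""
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise ValueError("GEMINI_API_KEY is required")
    return MnemoConfig(
        gemini_api_key=api_key,
        port=int(env.get("MNEMO_PORT", "8080")),
        data_dir=Path(env.get("MNEMO_DIR", "~/.mnemo")).expanduser(),
        auth_token=env.get("MNEMO_AUTH_TOKEN") or None,  # unset or empty means no auth
    )
```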

Authentication

When MNEMO_AUTH_TOKEN is configured, the /mcp and /tools/* endpoints require authentication:

# Set auth token (Workers)
bunx wrangler secret put MNEMO_AUTH_TOKEN

# Requests must include header:
Authorization: Bearer your-token-here

Public endpoints (no auth required):

  • GET /health - Health check
  • GET / - Service info
  • GET /tools - List available tools
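
The rule described in this section can be modeled as a small predicate. This sketch is an illustration of the documented behavior, not Mnemo's implementation. The function names are hypothetical, and treating paths that the section does not mention as unprotected is an assumption.

```python
import hmac

# Endpoints listed as public, keyed by (method, path).
PUBLIC_ENDPOINTS = {("GET", "/health"), ("GET", "/"), ("GET", "/tools")}


def requires_auth(method, path, token_configured):
    """True if a request needs the Authorization header."""
    if not token_configured:
        return False  # auth only applies when MNEMO_AUTH_TOKEN is set
    if (method.upper(), path) in PUBLIC_ENDPOINTS:
        return False
    # Assumption: any path not named in this section is left unprotected.
    return path == "/mcp" or path.startswith("/tools/")


def is_authorized(method, path, headers, expected_token):
    """Check a request's headers against the configured token."""
    if not requires_auth(method, path, bool(expected_token)):
        return True
    supplied = headers.get("Authorization", "")
    expected = f"Bearer {expected_token}"
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

Note that GET /tools (the tool listing) is public while POST /tools/context_query is protected, so the check depends on both method and path.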

Costs

Gemini API usage is billed regardless of deployment option. Mnemo uses Gemini's context caching, which bills cached input tokens at a steep discount to regular input and charges separately for cache storage by the hour:

Resource        Cost
Cache storage   ~$4.50 per 1M tokens per hour
Cached input    75-90% discount vs regular input
Regular input   ~$0.075 per 1M tokens (Flash)

Example: 100K token codebase cached for 1 hour with 10 queries ≈ $0.47
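
The example figure can be reproduced from the rates listed above. The sketch below assumes the 75% end of the discount range, and it leaves out output tokens and the one-time cost of the initial load, neither of which is priced in this section. Actual Gemini pricing varies by model and may have changed, so treat the constants as the README's approximations.

```python
STORAGE_PER_M_TOKENS_HOUR = 4.50  # USD, approximate rate from the table
REGULAR_INPUT_PER_M_TOKENS = 0.075  # USD, approximate Flash rate from the table


def cached_cost(tokens, hours, queries, discount=0.75):
    """Storage for the cache lifetime plus discounted input on every query."""
    millions = tokens / 1_000_000
    storage = millions * STORAGE_PER_M_TOKENS_HOUR * hours
    cached_input = millions * REGULAR_INPUT_PER_M_TOKENS * (1 - discount) * queries
    return storage + cached_input


def uncached_cost(tokens, queries):
    """Input cost if the full context were resent at the regular rate each time."""
    return tokens / 1_000_000 * REGULAR_INPUT_PER_M_TOKENS * queries


print(f"${cached_cost(100_000, hours=1, queries=10):.2f}")  # prints $0.47
```

Here storage accounts for about $0.45 and cached input for about $0.02. With these rates the hourly storage charge dominates, so the per-token discount matters most at high query volumes within a cache's lifetime.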

Cloudflare costs (self-hosted):

  • Workers: Free tier includes 100K requests/day
  • D1: Free tier includes 5M reads/day
  • Likely $0 for moderate usage

Architecture

┌─────────────────────────────────────────────────────────────┐
│                         Mnemo                                │
├─────────────────────────────────────────────────────────────┤
│  MCP Tools                                                   │
│  • context_load    - Load into Gemini cache                 │
│  • context_query   - Query cached context                   │
│  • context_list    - Show active caches                     │
│  • context_evict   - Remove cache                           │
│  • context_stats   - Token usage, costs                     │
│  • context_refresh - Reload cache                           │
├─────────────────────────────────────────────────────────────┤
│  Adapters (v0.2)                                             │
│  • GitHub repos (via API)                                   │
│  • URL loading (HTML, PDF, JSON, text)                      │
│  • Token-targeted crawling                                  │
│  • robots.txt compliance                                    │
├─────────────────────────────────────────────────────────────┤
│  Packages                                                    │
│  • @mnemo/core      - Gemini client, loaders, adapters      │
│  • @mnemo/mcp-server - MCP protocol handling                │
│  • @mnemo/cf-worker - Cloudflare Workers deployment         │
│  • @mnemo/local     - Bun-based local server                │
└─────────────────────────────────────────────────────────────┘

License

MIT

Credits

Built by Logos Flux | Voltage Labs
