# MCP RAG Server
An MCP (Model Context Protocol) server that exposes RAG capabilities to Claude Code and other MCP clients.
This is a standalone extraction from my production portfolio site. See it in action at danmonteiro.com.
## The Problem
You're using Claude Code but:
- **No access to your documents** — Claude can't search your knowledge base
- **Context is manual** — you're copy-pasting relevant docs into prompts
- **RAG is disconnected** — your vector database isn't accessible to AI tools
- **Integration is custom** — every project builds its own RAG bridge
## The Solution
MCP RAG Server provides:
- **Standard MCP interface** — works with Claude Code, Claude Desktop, and any MCP client
- **Full RAG pipeline** — hybrid search, query expansion, semantic chunking built-in
- **Simple tools** — `rag_query`, `rag_search`, `index_document`, `get_stats`
- **Zero config** — point at ChromaDB and go
```
# In Claude Code, after configuring the server:
"Search my knowledge base for articles about RAG architecture"

# Claude automatically uses the rag_query tool and gets relevant context
```
## Results
From production usage:
| Without MCP RAG | With MCP RAG |
|---|---|
| Manual context copy-paste | Automatic retrieval |
| No document search | Hybrid search built-in |
| Static knowledge | Live vector database |
| Custom integration per project | Standard MCP protocol |
## Design Philosophy

### Why MCP?
MCP (Model Context Protocol) standardizes how AI applications connect to external tools:
```
┌──────────────┐      MCP Protocol       ┌──────────────┐
│  MCP Client  │◀───────────────────────▶│  MCP Server  │
│ (Claude Code)│                         │ (This repo)  │
└──────────────┘                         └──────┬───────┘
                                                │
                                         ┌──────▼───────┐
                                         │ RAG Pipeline │
                                         │  (ChromaDB)  │
                                         └──────────────┘
```
Instead of building custom integrations, MCP provides a universal interface that any MCP-compatible client can use.
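Under the hood, each tool invocation is a JSON-RPC 2.0 message. As a sketch (the message shape follows the MCP spec; the question and `topK` values are illustrative), a client calling `rag_query` sends something like:

```typescript
// Shape of an MCP `tools/call` request, shown as a TypeScript literal.
// Field names follow the MCP spec; the arguments are made up for illustration.
const toolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "rag_query",
    arguments: { question: "What is hybrid search?", topK: 5 },
  },
};
```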
## Tools Exposed
| Tool | Description |
|---|---|
| `rag_query` | Query with hybrid search, returns formatted context |
| `rag_search` | Raw similarity search, returns chunks with scores |
| `index_document` | Add a single document |
| `index_documents_batch` | Batch index multiple documents |
| `delete_by_source` | Delete all docs from a source |
| `get_stats` | Collection statistics |
| `clear_collection` | Clear all data (requires confirmation) |
## Quick Start

### 1. Prerequisites
```bash
# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma

# Set OpenAI API key (for embeddings)
export OPENAI_API_KEY="sk-..."
```
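To sanity-check that ChromaDB is reachable before starting the server, a quick heartbeat request works. This is a sketch using Node 18+'s global `fetch`; the `/api/v1/heartbeat` path is ChromaDB's v1 endpoint and may differ on newer Chroma releases:

```typescript
// Minimal ChromaDB connectivity check (Node 18+; run as an ES module, e.g. check.mjs).
const res = await fetch("http://localhost:8000/api/v1/heartbeat");
console.log(res.ok ? "ChromaDB is up" : `Unexpected status: ${res.status}`);
```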
### 2. Install & Build
```bash
git clone https://github.com/0xrdan/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build
```
### 3. Configure Claude Code

Add to your Claude Code MCP configuration (`~/.claude/mcp.json` or project `.mcp.json`):
```json
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "CHROMA_URL": "http://localhost:8000",
        "CHROMA_COLLECTION": "my_knowledge_base"
      }
    }
  }
}
```
### 4. Use in Claude Code
```bash
# Restart Claude Code to load the server
claude

# Now Claude has access to RAG tools:
"Index this document into my knowledge base: [paste content]"
"Search for information about transformer architectures"
"What do my docs say about error handling?"
```
## API Reference

### rag_query
Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.
```typescript
// Input
{
  question: string;    // Required: the question to search for
  topK?: number;       // Optional: number of results (default: 5)
  threshold?: number;  // Optional: min similarity 0-1 (default: 0.5)
  filters?: object;    // Optional: metadata filters
}

// Output
{
  context: string;     // Formatted context for LLM
  chunks: [{
    content: string;
    score: number;
    metadata: object;
  }];
  stats: {
    totalChunks: number;
    avgSimilarity: number;
  };
}
```
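Outside of Claude Code, any MCP client can call this tool directly. A sketch using the official TypeScript SDK (`@modelcontextprotocol/sdk`); the server path and client metadata are placeholders:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the server over stdio and invoke rag_query.
const transport = new StdioClientTransport({
  command: "node",
  args: ["/path/to/mcp-rag-server/dist/server.js"],
});
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

const result = await client.callTool({
  name: "rag_query",
  arguments: { question: "How does hybrid search work?", topK: 3 },
});
console.log(result.content); // Tool results arrive as MCP content blocks
```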
### rag_search
Raw similarity search without context formatting.
```typescript
// Input
{
  query: string;    // Required: search query
  topK?: number;    // Optional: number of results (default: 10)
  filters?: object; // Optional: metadata filters
}

// Output: Array of chunks with scores
```
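The `filters` object is passed through to the vector store as a metadata filter; with ChromaDB it behaves like a `where` clause over chunk metadata. A sketch (the metadata keys are assumptions about your own documents):

```typescript
// Hypothetical rag_search arguments: restrict results to a single source.
const searchArgs = {
  query: "retry strategies",
  topK: 5,
  filters: { source: "engineering-handbook" },
};
```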
### index_document
Add a document to the knowledge base.
```typescript
// Input
{
  id: string;        // Required: unique identifier
  title: string;     // Required: document title
  content: string;   // Required: document content
  source: string;    // Required: source identifier
  category?: string; // Optional: category
  tags?: string[];   // Optional: tags array
}

// Output
{
  success: boolean;
  documentId: string;
  chunksIndexed: number;
}
```
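For example, a blog post might be indexed like this (all values are hypothetical):

```typescript
// Hypothetical index_document arguments following the schema above.
const doc = {
  id: "post-042",
  title: "Designing Retry Logic",
  content: "...full document text...",
  source: "blog",
  category: "engineering",
  tags: ["reliability", "patterns"],
};
```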
### get_stats
Get collection statistics.
```typescript
// Output
{
  totalChunks: number;
  totalDocuments: number;
  // ... other stats from RAG pipeline
}
```
## Configuration

### Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENAI_API_KEY` | Yes | - | OpenAI API key for embeddings |
| `CHROMA_URL` | No | `http://localhost:8000` | ChromaDB URL |
| `CHROMA_COLLECTION` | No | `mcp_knowledge_base` | Collection name |
| `EMBEDDING_MODEL` | No | `text-embedding-3-large` | Embedding model |
| `EMBEDDING_DIMENSIONS` | No | Native | Reduced embedding dimensions |
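At startup the server resolves these values roughly as follows. This is a sketch of the documented defaults, not the actual logic in `src/server.ts`:

```typescript
// Sketch: environment resolution using the defaults from the table above.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`${name} is required`);
  return value;
}

const config = {
  openaiApiKey: requireEnv("OPENAI_API_KEY"),
  chromaUrl: process.env.CHROMA_URL ?? "http://localhost:8000",
  collection: process.env.CHROMA_COLLECTION ?? "mcp_knowledge_base",
  embeddingModel: process.env.EMBEDDING_MODEL ?? "text-embedding-3-large",
};
```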
## Project Structure
```
mcp-rag-server/
├── src/
│   ├── server.ts               # Main MCP server implementation
│   └── index.ts                # Exports
├── mcp-config.example.json     # Example Claude Code configuration
├── package.json
└── README.md
```
## Advanced Usage

### Programmatic Server Creation
```typescript
import { createServer } from 'mcp-rag-server';

const server = await createServer({
  vectorDB: {
    host: 'http://custom-chroma:8000',
    collectionName: 'my_collection',
  },
  rag: {
    topK: 10,
    enableHybridSearch: true,
  },
});
```
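The returned instance still needs a transport before it can serve requests. Assuming `createServer` returns a standard MCP `Server` from `@modelcontextprotocol/sdk`, wiring it to stdio would look like:

```typescript
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

// Serve MCP requests over stdio (assumes `server` is an MCP SDK Server instance).
await server.connect(new StdioServerTransport());
```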
### Using with Claude Desktop
Same configuration works with Claude Desktop's MCP support. On macOS, add it to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"]
    }
  }
}
```

As with Claude Code, include the `env` block if the server needs `OPENAI_API_KEY` or non-default ChromaDB settings.
## Part of the Context Continuity Stack
This repo exposes context continuity as a protocol-level capability — giving any MCP client access to persistent semantic memory.
| Layer | Role | This Repo |
|---|---|---|
| Intra-session | Short-term memory | — |
| Document-scoped | Injected content | — |
| Retrieved | Long-term semantic memory via MCP | mcp-rag-server |
| Progressive | Staged responses | — |
MCP RAG Server bridges the gap between vector databases and AI assistants. Instead of building custom integrations, any MCP-compatible tool (Claude Code, Claude Desktop, custom clients) gets instant access to your knowledge base.
Related repos:
- rag-pipeline — The underlying RAG implementation
- mcp-client-example — Reference client for connecting to this server
- chatbot-widget — Session cache, Research Mode, conversation export
- ai-orchestrator — Multi-model LLM routing
## Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feat/add-new-tool`)
- Make changes with semantic commits
- Open a PR with a clear description
## License
MIT License - see LICENSE for details.
## Acknowledgments
Built with Claude Code.
Co-Authored-By: Claude <noreply@anthropic.com>