MCP RAG System

A comprehensive Retrieval-Augmented Generation (RAG) system built using the Model Context Protocol (MCP) for storing, processing, and searching PDF documents.

Features

🔧 Tools

  • upload_pdf: Upload and process PDF files with automatic text extraction and chunking
  • search_documents: Semantic search across all uploaded documents using vector embeddings
  • list_documents: View all uploaded documents and their metadata
  • delete_document: Remove documents and their associated chunks from the system
  • get_rag_stats: Get comprehensive statistics about the RAG system

📦 Resources

  • rag://documents: List all documents in the system
  • rag://document/{document_id}: Get full content of a specific document
  • rag://stats: Get system statistics

💬 Prompts

  • rag_query_prompt: Generate prompts for RAG-based question answering
  • document_summary_prompt: Create document summarization prompts
  • search_suggestions_prompt: Generate better search query suggestions

Installation

  1. Install dependencies:

    pip install -r requirements.txt
    
  2. Download required models: The system will automatically download the sentence-transformers model on first use.

Usage

Starting the Server

python mcp_server.py

The server will start on http://localhost:8000 with SSE (Server-Sent Events) transport.

Using the Client

Demo Mode

python mcp_client.py
# Choose option 1 for demo mode

Interactive Mode

python mcp_client.py
# Choose option 2 for interactive mode

Available commands in interactive mode:

  • upload - Upload a PDF file
  • search - Search documents with a query
  • list - List all uploaded documents
  • stats - Show system statistics
  • quit - Exit the client

Example Workflow

  1. Upload a PDF:

    # Via tool call
    result = await session.call_tool("upload_pdf", arguments={
        "file_path": "/path/to/document.pdf",
        "document_name": "My Research Paper"
    })
    
  2. Search documents:

    # Via tool call
    result = await session.call_tool("search_documents", arguments={
        "query": "machine learning applications",
        "top_k": 5
    })
    
  3. Use RAG prompt:

    # Get search results first, then use in prompt
    prompt = await session.get_prompt("rag_query_prompt", arguments={
        "query": "What are the key findings?",
        "context_chunks": search_results_text
    })
    
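
The three steps connect naturally: the chunks returned by `search_documents` are joined into the `context_chunks` string passed to `rag_query_prompt`. A minimal sketch of that glue step (the exact result format of `search_documents` is an assumption here; adapt the field names to what the tool actually returns):

```python
def build_context(chunks):
    """Join ranked search hits into one context string for the RAG prompt.

    `chunks` as a list of dicts with "text" and "score" keys is an
    illustrative assumption, not the tool's documented schema.
    """
    lines = []
    for rank, chunk in enumerate(chunks, start=1):
        lines.append(f"[{rank}] (score {chunk['score']:.2f}) {chunk['text']}")
    return "\n\n".join(lines)

hits = [
    {"text": "Transformers dominate NLP benchmarks.", "score": 0.83},
    {"text": "CNNs remain strong for vision tasks.", "score": 0.71},
]
context = build_context(hits)
print(context)
```

The ranked, score-annotated layout helps the downstream model weigh sources when answering.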

System Architecture

Document Processing Pipeline

  1. PDF Upload → Text extraction using PyMuPDF/PyPDF2
  2. Text Chunking → Split into overlapping chunks (1000 chars, 200 overlap)
  3. Embedding Generation → Create vector embeddings using SentenceTransformers
  4. Storage → Store in FAISS index with metadata

Storage Structure

rag_storage/
├── documents/          # Original extracted text
├── chunks/            # Individual text chunks
├── embeddings/        # Numpy arrays of embeddings
├── faiss_index.bin    # FAISS vector index
└── metadata.json      # Document and chunk metadata
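
A plausible shape for `metadata.json`, mapping documents to their chunks and chunks to their FAISS row ids. The field names below are illustrative, not the system's exact schema:

```python
import json
import os
import tempfile

# Illustrative metadata layout; the real keys in mcp_server.py may differ.
metadata = {
    "documents": {
        "doc-001": {
            "name": "My Research Paper",
            "num_chunks": 12,
        }
    },
    "chunks": {
        "doc-001_chunk_0": {"document_id": "doc-001", "faiss_id": 0},
    },
}

storage_dir = tempfile.mkdtemp()
path = os.path.join(storage_dir, "metadata.json")
with open(path, "w") as f:
    json.dump(metadata, f, indent=2)

with open(path) as f:
    loaded = json.load(f)
print(loaded["documents"]["doc-001"]["num_chunks"])  # 12
```

Keeping the chunk-to-FAISS-id mapping in one JSON file is what makes `delete_document` able to find and drop all of a document's vectors.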

Vector Search

  • Model: all-MiniLM-L6-v2 (384-dimensional embeddings)
  • Index: FAISS IndexFlatIP (Inner Product similarity)
  • Search: Cosine similarity for semantic matching (inner product over L2-normalized embeddings)
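
`IndexFlatIP` computes raw inner products, so cosine similarity comes from L2-normalizing the embeddings before indexing (FAISS provides `faiss.normalize_L2` for this). The equivalence, shown in plain Python with no dependencies:

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return inner_product(a, b) / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    )

a, b = [1.0, 2.0, 3.0], [2.0, 0.0, 1.0]
# After normalization, the inner product equals cosine similarity.
assert abs(inner_product(normalize(a), normalize(b)) - cosine(a, b)) < 1e-9
```

This is why normalization must happen on both the stored chunks and the incoming query vector; skipping either side turns the scores into unnormalized dot products.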

Configuration

Chunk Settings

Modify in mcp_server.py:

def _create_text_chunks(text: str, chunk_size: int = 1000, overlap: int = 200):
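
With these defaults, consecutive chunks share 200 characters, so sentences cut at a boundary still appear whole in at least one chunk. A self-contained sketch of one way such a chunker could work (the actual implementation in `mcp_server.py` may differ, e.g. by preferring sentence boundaries):

```python
def create_text_chunks(text, chunk_size=1000, overlap=200):
    """Split text into overlapping character chunks (illustrative sketch)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # how far each chunk's start advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = create_text_chunks("x" * 2500, chunk_size=1000, overlap=200)
print(len(chunks))      # 3 chunks: chars 0-1000, 800-1800, 1600-2500
print(len(chunks[-1]))  # 900
```

Larger chunks give each embedding more context but dilute its specificity; smaller chunks sharpen retrieval at the cost of more vectors to store and search.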

Embedding Model

Change the model in RAGSystem.__init__():

self.embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

Storage Location

Set custom storage directory:

rag_system = RAGSystem(storage_dir="custom_rag_storage")

API Reference

Tools

upload_pdf

  • Parameters: file_path (str), document_name (optional str)
  • Returns: Document ID, chunk count, success status

search_documents

  • Parameters: query (str), top_k (optional int, default 5)
  • Returns: Ranked list of relevant chunks with scores

list_documents

  • Parameters: None
  • Returns: List of all documents with metadata

delete_document

  • Parameters: document_id (str)
  • Returns: Success status and confirmation message

get_rag_stats

  • Parameters: None
  • Returns: System statistics (documents, chunks, storage size)

Resources

rag://documents

Returns formatted list of all documents in the system.

rag://document/{document_id}

Returns full text content of specified document with metadata header.

rag://stats

Returns formatted system statistics.

Prompts

rag_query_prompt

  • Parameters: query (str), context_chunks (str)
  • Returns: Structured prompt for RAG-based QA

document_summary_prompt

  • Parameters: document_content (str)
  • Returns: Prompt for document summarization

search_suggestions_prompt

  • Parameters: query (str), available_documents (str)
  • Returns: Prompt for generating better search queries

Performance Considerations

Memory Usage

  • Embeddings: ~1.5KB per chunk (384 float32 values)
  • FAISS index: Scales linearly with number of chunks
  • Text storage: Depends on document size and chunking
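
The per-chunk figure follows directly from the embedding dimensionality: 384 float32 values at 4 bytes each.

```python
dim = 384               # all-MiniLM-L6-v2 embedding size
bytes_per_float32 = 4
per_chunk = dim * bytes_per_float32
print(per_chunk)        # 1536 bytes, i.e. ~1.5 KB per chunk

# A 10,000-chunk corpus therefore needs roughly 15 MB for raw embeddings.
print(per_chunk * 10_000 / 1e6)  # 15.36 MB
```
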

Search Speed

  • FAISS IndexFlatIP: O(n) search time
  • For large collections, consider IndexIVFFlat or IndexHNSWFlat

Optimization Tips

  1. Batch uploads for multiple documents
  2. Adjust chunk size based on document type
  3. Use GPU with faiss-gpu for large datasets
  4. Implement caching for frequent queries
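
Tip 4 can start as simply as memoizing the query-to-results mapping with `functools.lru_cache`; a sketch under the assumption that embedding plus FAISS lookup is the expensive step (note the cache must be cleared whenever documents are added or deleted, which a real integration would handle explicitly):

```python
from functools import lru_cache

@lru_cache(maxsize=256)
def cached_search(query: str, top_k: int = 5):
    # Stand-in for the real embedding + FAISS lookup; returns a tuple
    # because lru_cache requires hashable arguments and benefits from
    # immutable results.
    return tuple(f"chunk for {query!r} #{i}" for i in range(top_k))

cached_search("machine learning", 3)    # computed
cached_search("machine learning", 3)    # served from cache
print(cached_search.cache_info().hits)  # 1
```

Call `cached_search.cache_clear()` after any `upload_pdf` or `delete_document` so stale results are never served.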

Troubleshooting

Common Issues

  1. PDF text extraction fails:

    • Ensure PDF is not password-protected
    • Try different PDF files to isolate the issue
    • Check PyMuPDF and PyPDF2 installation
  2. Memory errors with large documents:

    • Reduce chunk size
    • Process documents in batches
    • Monitor system memory usage
  3. Search returns no results:

    • Verify documents are uploaded successfully
    • Check query similarity to document content
    • Try broader search terms
  4. Server connection issues:

    • Ensure server is running on correct port
    • Check firewall settings
    • Verify MCP client configuration

Debug Mode

Enable detailed logging by modifying the server:

import logging
logging.basicConfig(level=logging.DEBUG)

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Submit a pull request

License

This project is licensed under the MIT License.
