Wikidata MCP Server

Connects LLMs to Wikidata's structured knowledge base using a hybrid architecture that optimizes for both fast entity searches and complex relational queries. It provides tools for entity and property retrieval, metadata lookups, and direct SPARQL execution to ground AI responses in verified data.

Wikidata MCP Server - Optimized Hybrid Architecture

A Model Context Protocol (MCP) server with Server-Sent Events (SSE) transport that connects Large Language Models to Wikidata's structured knowledge base. Features an optimized hybrid architecture that balances speed, accuracy, and verifiability by using fast basic tools for simple queries and advanced orchestration only for complex temporal/relational queries.

Architecture Highlights

  • 🚀 Fast Basic Tools: 140-250ms for simple entity/property searches
  • 🧠 Advanced Orchestration: 1-11s for complex temporal queries (when needed)
  • ⚡ 50x Performance Difference: Empirically measured and optimized
  • 🔄 Hybrid Approach: Right tool for each query type
  • 🛡️ Graceful Degradation: Works with or without Vector DB API key
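The routing decision behind the hybrid approach can be sketched as follows. The `route_query` helper and its keyword heuristic are illustrative assumptions for this README, not the server's actual dispatch logic:

```python
import re

# Illustrative sketch of hybrid routing: temporal/relational phrasing goes
# to the slow orchestrator, everything else to the fast basic tools.
# The pattern list below is an assumption, not the server's real heuristic.
TEMPORAL_PATTERNS = [
    r"\blast \d+\b", r"\brecent\b", r"\bcurrent\b",
    r"\bformer\b", r"\bbetween \d{4} and \d{4}\b",
]

def route_query(query: str) -> str:
    """Return which tool family should handle the query."""
    lowered = query.lower()
    if any(re.search(p, lowered) for p in TEMPORAL_PATTERNS):
        return "query_wikidata_complex"   # 1-11s orchestration path
    return "search_wikidata_entity"       # 140-250ms basic path

print(route_query("last 3 popes"))       # query_wikidata_complex
print(route_query("Douglas Adams"))      # search_wikidata_entity
```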

MCP Tools

Basic Tools (Fast & Reliable)

  • search_wikidata_entity: Find entities by name (140-250ms)
  • search_wikidata_property: Find properties by name (~200ms)
  • get_wikidata_metadata: Entity labels, descriptions (~200ms)
  • get_wikidata_properties: All entity properties (~200ms)
  • execute_wikidata_sparql: Direct SPARQL queries (~200ms)
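Entity and property search on Wikidata is conventionally backed by the public `wbsearchentities` MediaWiki API. The sketch below builds such a request URL without sending it; the `build_entity_search` wrapper is a hypothetical illustration of what the basic search tools presumably do, while the endpoint and parameters are Wikidata's real API:

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def build_entity_search(query: str, language: str = "en", limit: int = 5) -> str:
    """Build a wbsearchentities request URL of the kind
    search_wikidata_entity is assumed to issue under the hood."""
    params = {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "format": "json",
        "limit": limit,
        "type": "item",   # use type=property for property searches
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"

print(build_entity_search("Douglas Adams"))
```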

Advanced Tool (Complex Queries)

  • query_wikidata_complex: Temporal/relational queries (1-11s)
    • ✅ "last 3 popes", "recent presidents of France"
    • ❌ Simple entity searches (use basic tools instead)
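A query like "last 3 popes" ultimately needs temporal SPARQL of roughly the following shape. The identifiers are real Wikidata terms (P39 = position held, Q19546 = pope, P580 = start time), but the exact query the orchestrator generates is not documented here, so this is only illustrative:

```python
# Illustrative SPARQL for "last 3 popes" -- the orchestrator's generated
# query may differ. P39 = position held, Q19546 = pope, P580 = start time.
LAST_THREE_POPES = """
SELECT ?person ?personLabel ?start WHERE {
  ?person p:P39 ?stmt .
  ?stmt ps:P39 wd:Q19546 ;
        pq:P580 ?start .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?start)
LIMIT 3
"""
print(LAST_THREE_POPES)
```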

Live Demo

The server is deployed and accessible at https://wikidata-mcp-mirror.onrender.com.

Usage with Claude Desktop

To use this server with Claude Desktop:

  1. Install mcp-remote (if not already installed):

    npm install -g mcp-remote
    
  2. Edit the Claude Desktop configuration file located at:

    ~/Library/Application Support/Claude/claude_desktop_config.json
    
  3. Configure it to use the remote MCP server:

    {
      "mcpServers": {
        "Wikidata MCP": {
          "command": "npx",
          "args": [
            "mcp-remote",
            "https://wikidata-mcp-mirror.onrender.com/mcp"
          ]
        }
      }
    }
    
  4. Restart Claude Desktop

  5. When using Claude, you can now access Wikidata knowledge through the configured MCP server.
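If you prefer to script step 3 rather than edit the file by hand, the entry can be merged into an existing configuration with the standard library. The `add_wikidata_server` helper is a hypothetical convenience, not part of this project:

```python
import json
import os

def add_wikidata_server(config_path: str) -> None:
    """Merge the Wikidata MCP entry from step 3 into an existing
    claude_desktop_config.json (illustrative helper)."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config.setdefault("mcpServers", {})["Wikidata MCP"] = {
        "command": "npx",
        "args": ["mcp-remote", "https://wikidata-mcp-mirror.onrender.com/mcp"],
    }
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
```

This preserves any other servers already configured instead of overwriting the whole file.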

Deployment

Deploying to Render

  1. Create a new Web Service in your Render dashboard
  2. Connect your GitHub repository
  3. Configure the service:
    • Build Command: pip install -e .
    • Start Command: python -m wikidata_mcp.api
  4. Set Environment Variables:
    • Add all variables from .env.example
    • For production, set DEBUG=false
    • Make sure to set a proper WIKIDATA_VECTORDB_API_KEY
  5. Deploy

The service will be available at https://your-service-name.onrender.com

Environment Setup

Prerequisites

  • Python 3.10+
  • Virtual environment tool (venv, conda, etc.)
  • Vector DB API key (for enhanced semantic search)

Environment Variables

Create a .env file in the project root with the following variables:

    # Required for Vector DB integration
    WIKIDATA_VECTORDB_API_KEY=your_key_here

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/wikidata-mcp-mirror.git
    cd wikidata-mcp-mirror
    
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: .\venv\Scripts\activate
    
  3. Install the required dependencies:

    pip install -e .
    
  4. Create a .env file based on .env.example and configure your environment variables:

    cp .env.example .env
    # Edit .env with your configuration
    
  5. Run the application:

    # Development
    python -m wikidata_mcp.api
    
    # Production (with Gunicorn)
    gunicorn --bind 0.0.0.0:8000 --workers 4 --timeout 120 --keep-alive 5 --worker-class uvicorn.workers.UvicornWorker wikidata_mcp.api:app
    The server will start on http://localhost:8000 by default with the following endpoints:

    • GET /health - Health check
    • GET /messages/ - SSE endpoint for MCP communication
    • GET /docs - Interactive API documentation (if enabled)
    • GET /metrics - Prometheus metrics (if enabled)

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 8000 | Port to run the server on |
| WORKERS | 4 | Number of worker processes |
| TIMEOUT | 120 | Worker timeout in seconds |
| KEEPALIVE | 5 | Keep-alive timeout in seconds |
| DEBUG | false | Enable debug mode |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| USE_VECTOR_DB | true | Enable/disable vector DB integration |
| USE_CACHE | true | Enable/disable caching system |
| USE_FEEDBACK | true | Enable/disable feedback system |
| CACHE_TTL_SECONDS | 3600 | Cache time-to-live in seconds |
| CACHE_MAX_SIZE | 1000 | Maximum number of items in cache |
| WIKIDATA_VECTORDB_API_KEY | (none) | API key for the vector DB service |
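Reading these variables with the defaults from the table typically looks like the following. The `env_bool`/`env_int` helpers are an illustrative sketch, not the server's actual configuration loader:

```python
import os

def env_bool(name: str, default: bool) -> bool:
    """Read a boolean env var the way the table above implies
    (illustrative helper, not the server's actual loader)."""
    return os.getenv(name, str(default)).strip().lower() in ("1", "true", "yes")

def env_int(name: str, default: int) -> int:
    """Read an integer env var with a fallback default."""
    return int(os.getenv(name, str(default)))

# Defaults match the table above.
WORKERS = env_int("WORKERS", 4)
DEBUG = env_bool("DEBUG", False)
USE_CACHE = env_bool("USE_CACHE", True)
CACHE_TTL_SECONDS = env_int("CACHE_TTL_SECONDS", 3600)
```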

Running with Docker

  1. Build the Docker image:

    docker build -t wikidata-mcp .
    
  2. Run the container:

    docker run -p 8000:8000 --env-file .env wikidata-mcp
    

Running with Docker Compose

  1. Start the application:

    docker-compose up --build
    
  2. For production, use the production compose file:

    docker-compose -f docker-compose.prod.yml up --build -d
    

Monitoring

The service exposes Prometheus metrics at /metrics when the PROMETHEUS_METRICS environment variable is set to true.

Health Check

curl http://localhost:8000/health

Metrics

curl http://localhost:8000/metrics

Testing

Running Tests

Run the test suite with:

# Run all tests
pytest

# Run specific test file
pytest tests/orchestration/test_query_orchestrator.py -v

# Run with coverage report
pytest --cov=wikidata_mcp tests/

Integration Tests

To test the Vector DB integration, you'll need to set the WIKIDATA_VECTORDB_API_KEY environment variable:

WIKIDATA_VECTORDB_API_KEY=your_key_here pytest tests/orchestration/test_vectordb_integration.py -v

Test Client

You can also test the server using the included test client:

python test_mcp_client.py

Or manually with curl:

# Connect to SSE endpoint
curl -N -H "Accept: text/event-stream" https://wikidata-mcp-mirror.onrender.com/messages/

# Send a message (replace SESSION_ID with the one received from the SSE endpoint)
curl -X POST "https://wikidata-mcp-mirror.onrender.com/messages/?session_id=YOUR_SESSION_ID" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test-client","version":"0.1.0"}},"id":0}'

Deployment on Render.com

This server is configured for deployment on Render.com using the render.yaml file.

Deployment Configuration

  • Build Command: pip install -r requirements.txt
  • Start Command: gunicorn -k uvicorn.workers.UvicornWorker server_sse:app
  • Environment Variables:
    • PORT: 10000
  • Health Check Path: /health

Docker Support

The repository includes a Dockerfile that's used by Render.com for containerized deployment. This allows the server to run in a consistent environment with all dependencies properly installed.

How to Deploy

  1. Fork or clone this repository to your GitHub account
  2. Create a new Web Service on Render.com
  3. Connect your GitHub repository
  4. Render will automatically detect the render.yaml file and configure the deployment
  5. Click "Create Web Service"

After deployment, you can access your server at the URL provided by Render.com.

Architecture

The server is built using:

  • FastAPI: For handling HTTP requests and routing
  • SSE Transport: For bidirectional communication with clients
  • MCP Framework: For implementing the Model Context Protocol
  • Wikidata API: For accessing Wikidata's knowledge base

Key Components

  • server_sse.py: Main server implementation with SSE transport
  • wikidata_api.py: Functions for interacting with Wikidata's API and SPARQL endpoint
  • requirements.txt: Dependencies for the project
  • Dockerfile: Container configuration for Docker deployment on Render
  • render.yaml: Configuration for deployment on Render.com
  • test_mcp_client.py: Test client for verifying server functionality

Available MCP Tools

The server provides the following MCP tools:

  • search_wikidata_entity: Search for entities by name
  • search_wikidata_property: Search for properties by name
  • get_wikidata_metadata: Get entity metadata (label, description)
  • get_wikidata_properties: Get all properties for an entity
  • execute_wikidata_sparql: Execute a SPARQL query
  • find_entity_facts: Search for an entity and find its facts
  • get_related_entities: Find entities related to a given entity

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Based on the Model Context Protocol (MCP) specification
  • Uses Wikidata as the knowledge source
  • Inspired by the MCP examples from the official documentation
