MCP SPARQL Server

Enables AI assistants to query semantic data via SPARQL endpoints, with support for multiple output formats and caching.

<div align="center">

License: AGPL-3.0 | Python: 3.8+

A flexible and powerful SPARQL-enabled server for MCP (Model Context Protocol)

</div>

🌟 Overview

MCP SPARQL Server is a configurable server that connects to any SPARQL endpoint and adds result formatting and caching on top of it. It is built on the FastMCP framework, which implements the Model Context Protocol (MCP), so AI assistants can query semantic data through a standard interface.

✨ Features

  • Universal Endpoint Support: Connect to any SPARQL-compliant endpoint
  • Full SPARQL Support: Execute any valid SPARQL query (SELECT, ASK, CONSTRUCT, DESCRIBE)
  • Intelligent Result Formatting:
    • Standard JSON (compatible with standard SPARQL clients)
    • Simplified JSON (easier to work with in applications)
    • Tabular format (ready for display in UI tables)
  • High-Performance Caching:
    • Multiple cache strategies (LRU, LFU, FIFO)
    • Configurable TTL (time-to-live)
    • Cache management tools
  • Flexible Deployment Options:
    • Run in foreground mode with stdio or HTTP transport
    • Run as a background daemon
    • Deploy as a systemd service
    • HTTP server mode for nginx reverse proxy integration
  • Comprehensive Configuration:
    • Command-line arguments
    • Environment variables
    • No hardcoded values

📋 Requirements

  • Python 3.8 or newer
  • SPARQLWrapper library
  • fastmcp framework
  • pydantic for configuration
  • python-daemon for background execution

🚀 Installation

From Source

# Clone the repository
git clone https://github.com/yet-market/yet-sparql-mcp-server.git
cd yet-sparql-mcp-server

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .

From PyPI

pip install mcp-server-sparql

Using the Installation Script

For a full installation with systemd service setup:

# Download the repository
git clone https://github.com/yet-market/yet-sparql-mcp-server.git
cd yet-sparql-mcp-server

# Run the installation script (as root for systemd service)
sudo ./install.sh

The installer automatically:

  • Sets up virtual environment and dependencies
  • Creates and starts systemd services
  • Provides management scripts for easy control
  • Configures the HTTP service for nginx integration

Quick Management

After installation, use the provided scripts:

# Check status
./status.sh

# Start/stop services
./start.sh [stdio|http]    # Start service (http is default)
./stop.sh [stdio|http]     # Stop service
./logs.sh [stdio|http]     # View logs

# Examples
./start.sh http            # Start HTTP service for nginx
./logs.sh http             # View HTTP service logs
./test.sh                  # Test server functionality
./stop.sh                  # Stop all services

Testing Your Installation

Test your server with the built-in test suite:

# Run comprehensive tests
./test.sh

# Run specific tests
./test.sh endpoint         # Test HTTP connectivity
./test.sh service          # Test service status
./test.sh config           # Test configuration
./test.sh mcp              # Test MCP protocol
./test.sh client           # Test with FastMCP client
./test.sh info             # Show connection info

The test script validates:

  • ✅ Service status and health
  • ✅ HTTP endpoint connectivity
  • ✅ MCP protocol compliance
  • ✅ Configuration validity
  • ✅ SPARQL endpoint reachability
  • ✅ FastMCP client integration

🔍 Usage

Basic Usage (stdio transport)

Start the server by specifying a SPARQL endpoint:

python server.py --endpoint https://dbpedia.org/sparql

HTTP Server Mode (for nginx/web integration)

Run the server as an HTTP server:

# Basic HTTP server
python server.py --transport http --endpoint https://dbpedia.org/sparql

# Custom host and port
python server.py --transport http --host 0.0.0.0 --port 8080 --endpoint https://dbpedia.org/sparql

# Using environment variables
export MCP_TRANSPORT=http
export MCP_HOST=0.0.0.0
export MCP_PORT=8000
export SPARQL_ENDPOINT=https://dbpedia.org/sparql
python server.py

Running as a Daemon

To run the server as a background process (stdio transport):

python server.py --endpoint https://dbpedia.org/sparql --daemon \
  --log-file /var/log/mcp-sparql.log \
  --pid-file /var/run/mcp-sparql.pid

To run the HTTP server as a daemon:

python server.py --transport http --host 0.0.0.0 --port 8000 \
  --endpoint https://dbpedia.org/sparql --daemon \
  --log-file /var/log/mcp-sparql.log \
  --pid-file /var/run/mcp-sparql.pid

Using with Systemd

If installed with systemd support:

  1. Configure your endpoint in the environment file:

    sudo nano /etc/mcp-sparql/env
    
  2. Start the service:

    sudo systemctl start sparql-server
    
  3. Enable on boot:

    sudo systemctl enable sparql-server
    

Client Query Examples

After starting the server, you can use it with any MCP-compatible client or through the FastMCP client:

Using FastMCP Client with stdio transport (Python)

import asyncio
from fastmcp.client import Client, PythonStdioTransport

async def query_server():
    # Connect to the server
    transport = PythonStdioTransport(
        script_path="server.py",
        args=["--endpoint", "https://dbpedia.org/sparql"]
    )
    
    async with Client(transport) as client:
        # Execute a SPARQL query
        result = await client.call_tool("query", {
            "query_string": "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
            "format": "simplified"
        })
        
        print(result[0].text)

asyncio.run(query_server())

Using FastMCP Client with HTTP transport (Python)

import asyncio
from fastmcp.client import Client, HttpTransport

async def query_server():
    # Connect to HTTP server
    transport = HttpTransport("http://localhost:8000")
    
    async with Client(transport) as client:
        # Execute a SPARQL query
        result = await client.call_tool("query", {
            "query_string": "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
            "format": "simplified"
        })
        
        print(result[0].text)

asyncio.run(query_server())

Query with Different Formats

# These snippets run inside an `async with Client(transport) as client:` block
# JSON format (default)
result = await client.call_tool("query", {
    "query_string": "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
    "format": "json"
})

# Tabular format
result = await client.call_tool("query", {
    "query_string": "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
    "format": "tabular"
})

Cache Management

# These snippets run inside an `async with Client(transport) as client:` block
# Get cache statistics
cache_stats = await client.call_tool("cache", {"action": "stats"})

# Clear the cache
cache_clear = await client.call_tool("cache", {"action": "clear"})

⚙️ Configuration

Command-line Arguments

<table>
  <thead>
    <tr> <th>Argument</th> <th>Description</th> <th>Default</th> </tr>
  </thead>
  <tbody>
    <tr> <td><code>--endpoint URL</code></td> <td>SPARQL endpoint URL</td> <td>Required</td> </tr>
    <tr> <td><code>--timeout SECONDS</code></td> <td>Request timeout in seconds</td> <td>30</td> </tr>
    <tr> <td><code>--format FORMAT</code></td> <td>Result format (json, simplified, tabular)</td> <td>json</td> </tr>
    <tr> <td><code>--cache-enabled BOOL</code></td> <td>Enable result caching</td> <td>true</td> </tr>
    <tr> <td><code>--cache-ttl SECONDS</code></td> <td>Cache time-to-live in seconds</td> <td>300</td> </tr>
    <tr> <td><code>--cache-max-size SIZE</code></td> <td>Maximum cache size</td> <td>100</td> </tr>
    <tr> <td><code>--cache-strategy STRATEGY</code></td> <td>Cache replacement strategy (lru, lfu, fifo)</td> <td>lru</td> </tr>
    <tr> <td><code>--pretty-print</code></td> <td>Pretty print JSON output</td> <td>false</td> </tr>
    <tr> <td><code>--include-metadata BOOL</code></td> <td>Include query metadata in results</td> <td>true</td> </tr>
    <tr> <td><code>--daemon</code></td> <td>Run as a background daemon</td> <td>false</td> </tr>
    <tr> <td><code>--log-file FILE</code></td> <td>Log file location when running as a daemon</td> <td>/var/log/mcp-sparql-server.log</td> </tr>
    <tr> <td><code>--pid-file FILE</code></td> <td>PID file location when running as a daemon</td> <td>/var/run/mcp-sparql-server.pid</td> </tr>
    <tr> <td><code>--transport TRANSPORT</code></td> <td>Transport type (stdio or http)</td> <td>stdio</td> </tr>
    <tr> <td><code>--host HOST</code></td> <td>Host to bind HTTP server to</td> <td>localhost</td> </tr>
    <tr> <td><code>--port PORT</code></td> <td>Port to bind HTTP server to</td> <td>8000</td> </tr>
  </tbody>
</table>

Environment Variables

<table>
  <thead>
    <tr> <th>Variable</th> <th>Description</th> <th>Default</th> </tr>
  </thead>
  <tbody>
    <tr> <td><code>SPARQL_ENDPOINT</code></td> <td>SPARQL endpoint URL</td> <td>None (required)</td> </tr>
    <tr> <td><code>SPARQL_TIMEOUT</code></td> <td>Request timeout in seconds</td> <td>30</td> </tr>
    <tr> <td><code>SPARQL_FORMAT</code></td> <td>Default result format</td> <td>json</td> </tr>
    <tr> <td><code>SPARQL_CACHE_ENABLED</code></td> <td>Enable caching</td> <td>true</td> </tr>
    <tr> <td><code>SPARQL_CACHE_TTL</code></td> <td>Cache time-to-live in seconds</td> <td>300</td> </tr>
    <tr> <td><code>SPARQL_CACHE_MAX_SIZE</code></td> <td>Maximum cache size</td> <td>100</td> </tr>
    <tr> <td><code>SPARQL_CACHE_STRATEGY</code></td> <td>Cache replacement strategy</td> <td>lru</td> </tr>
    <tr> <td><code>SPARQL_PRETTY_PRINT</code></td> <td>Pretty print JSON output</td> <td>false</td> </tr>
    <tr> <td><code>SPARQL_INCLUDE_METADATA</code></td> <td>Include query metadata in results</td> <td>true</td> </tr>
    <tr> <td><code>MCP_TRANSPORT</code></td> <td>Transport type (stdio or http)</td> <td>stdio</td> </tr>
    <tr> <td><code>MCP_HOST</code></td> <td>Host to bind HTTP server to</td> <td>localhost</td> </tr>
    <tr> <td><code>MCP_PORT</code></td> <td>Port to bind HTTP server to</td> <td>8000</td> </tr>
  </tbody>
</table>
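
To make the defaults listed above concrete, here is a minimal sketch of how a subset of the environment variables could be resolved into a settings object. The `Settings` dataclass and `load_settings` function are hypothetical names used only for illustration; the project's own `config.py` is based on pydantic and may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else counts as false."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; field defaults mirror the documented defaults.
    endpoint: str
    timeout: int = 30
    format: str = "json"
    cache_enabled: bool = True
    cache_ttl: int = 300
    cache_max_size: int = 100
    cache_strategy: str = "lru"
    transport: str = "stdio"
    host: str = "localhost"
    port: int = 8000


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping of variables (os.environ by default)."""
    env = os.environ if env is None else env
    endpoint = env.get("SPARQL_ENDPOINT")
    if not endpoint:
        raise ValueError("SPARQL_ENDPOINT is required")
    settings = Settings(
        endpoint=endpoint,
        timeout=int(env.get("SPARQL_TIMEOUT", "30")),
        format=env.get("SPARQL_FORMAT", "json"),
        cache_enabled=_as_bool(env.get("SPARQL_CACHE_ENABLED", "true")),
        cache_ttl=int(env.get("SPARQL_CACHE_TTL", "300")),
        cache_max_size=int(env.get("SPARQL_CACHE_MAX_SIZE", "100")),
        cache_strategy=env.get("SPARQL_CACHE_STRATEGY", "lru"),
        transport=env.get("MCP_TRANSPORT", "stdio"),
        host=env.get("MCP_HOST", "localhost"),
        port=int(env.get("MCP_PORT", "8000")),
    )
    # Reject values outside the documented choices.
    if settings.format not in {"json", "simplified", "tabular"}:
        raise ValueError(f"unknown format: {settings.format}")
    if settings.cache_strategy not in {"lru", "lfu", "fifo"}:
        raise ValueError(f"unknown cache strategy: {settings.cache_strategy}")
    if settings.transport not in {"stdio", "http"}:
        raise ValueError(f"unknown transport: {settings.transport}")
    return settings
```

Passing an explicit mapping, for example `load_settings({"SPARQL_ENDPOINT": "https://dbpedia.org/sparql", "MCP_PORT": "8080"})`, makes the resolution easy to test without touching the real environment.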

🌐 Nginx Integration

When using HTTP transport, you can integrate the server with nginx as a reverse proxy:

1. Start the server in HTTP mode

python server.py --transport http --host localhost --port 8000 --endpoint https://your-sparql-endpoint.com/sparql

2. Configure nginx

Add this configuration to your nginx server block:

upstream mcp_sparql {
    server localhost:8000;
    # Add more servers for load balancing if needed
    # server localhost:8001;
    # server localhost:8002;
}

server {
    listen 80;
    server_name your-domain.com;

    location /api/sparql {
        proxy_pass http://mcp_sparql;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Optional: Add CORS headers
        add_header Access-Control-Allow-Origin *;
        add_header Access-Control-Allow-Methods "GET, POST, OPTIONS";
        add_header Access-Control-Allow-Headers "Content-Type, Authorization";
    }
}

3. Production deployment with systemd

Create a systemd service for HTTP mode:

# /etc/systemd/system/mcp-sparql-http.service
[Unit]
Description=MCP SPARQL Server (HTTP)
After=network.target

[Service]
Type=simple
User=mcp-sparql
Group=mcp-sparql
WorkingDirectory=/opt/mcp-sparql
Environment=MCP_TRANSPORT=http
Environment=MCP_HOST=localhost
Environment=MCP_PORT=8000
Environment=SPARQL_ENDPOINT=https://your-sparql-endpoint.com/sparql
ExecStart=/opt/mcp-sparql/venv/bin/python server.py
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Enable and start the service:

sudo systemctl enable mcp-sparql-http
sudo systemctl start mcp-sparql-http

📊 Result Formats

The server supports three different output formats:

1. JSON Format (default)

Returns the standard SPARQL JSON results format with optional metadata.

{
  "head": {
    "vars": ["s", "p", "o"]
  },
  "results": {
    "bindings": [
      {
        "s": { "type": "uri", "value": "http://example.org/resource" },
        "p": { "type": "uri", "value": "http://example.org/property" },
        "o": { "type": "literal", "value": "Example Value" }
      }
    ]
  },
  "metadata": {
    "variables": ["s", "p", "o"],
    "count": 1,
    "query": "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
  }
}

2. Simplified Format

Returns a simplified JSON structure that's easier to work with, converting variable bindings into simple key-value objects.

{
  "type": "SELECT",
  "results": [
    {
      "s": "http://example.org/resource",
      "p": "http://example.org/property",
      "o": "Example Value"
    }
  ],
  "metadata": {
    "variables": ["s", "p", "o"],
    "count": 1,
    "query": "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
  }
}
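
The relationship between the standard and simplified formats can be sketched as a small transformation. This is an illustration of the mapping shown in the two examples above, not the project's `simplified_formatter.py`; the function name `simplify` is hypothetical and the sketch only handles SELECT results.

```python
from typing import Any, Dict, List


def simplify(sparql_json: Dict[str, Any], query: str = "") -> Dict[str, Any]:
    """Flatten standard SPARQL JSON bindings into plain key-value rows."""
    variables: List[str] = sparql_json.get("head", {}).get("vars", [])
    bindings = sparql_json.get("results", {}).get("bindings", [])
    # Each binding maps a variable to {"type": ..., "value": ...};
    # keep only the value and drop the type information.
    rows = [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in bindings
    ]
    return {
        "type": "SELECT",  # assumption: this sketch covers SELECT only
        "results": rows,
        "metadata": {
            "variables": variables,
            "count": len(rows),
            "query": query,
        },
    }
```

Note that flattening discards the `type` field of each binding, so a URI and a literal with the same text become indistinguishable. Use the standard JSON format when that distinction matters.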

3. Tabular Format

Returns results in a tabular format with columns and rows, suitable for table display.

{
  "type": "SELECT",
  "columns": [
    { "name": "s", "label": "s" },
    { "name": "p", "label": "p" },
    { "name": "o", "label": "o" }
  ],
  "rows": [
    [
      "http://example.org/resource",
      "http://example.org/property",
      "Example Value"
    ]
  ],
  "metadata": {
    "variables": ["s", "p", "o"],
    "count": 1,
    "query": "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
  }
}
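
A rough sketch of how the tabular shape can be derived from standard SPARQL JSON follows. The function name `to_tabular` is hypothetical, and filling unbound variables with `None` is an assumption of this sketch rather than documented server behavior.

```python
from typing import Any, Dict, List, Optional


def to_tabular(sparql_json: Dict[str, Any], query: str = "") -> Dict[str, Any]:
    """Turn standard SPARQL JSON into column definitions plus row lists."""
    variables: List[str] = sparql_json.get("head", {}).get("vars", [])
    bindings = sparql_json.get("results", {}).get("bindings", [])
    columns = [{"name": var, "label": var} for var in variables]
    rows: List[List[Optional[str]]] = []
    for binding in bindings:
        # Follow the column order; a variable left unbound (for example by
        # OPTIONAL) is absent from the binding, so emit None (assumption).
        rows.append(
            [binding[var]["value"] if var in binding else None for var in variables]
        )
    return {
        "type": "SELECT",  # assumption: this sketch covers SELECT only
        "columns": columns,
        "rows": rows,
        "metadata": {"variables": variables, "count": len(rows), "query": query},
    }
```

Because every row follows the order of `columns`, a UI can render the result without looking up keys per cell.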

🔄 Cache Strategies

The server supports three cache replacement strategies:

1. LRU (Least Recently Used)

Evicts the least recently accessed items first. This is the default strategy and works well for most scenarios, as it prioritizes keeping recently accessed items in the cache.
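
As an illustration of LRU eviction combined with a TTL, here is a minimal sketch built on `collections.OrderedDict`. The class name, method names, and the injectable `clock` parameter are assumptions made for this example; the project's `lru_cache.py` may be structured differently.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional, Tuple


class LRUCache:
    """Illustrative LRU cache with per-entry expiry (not the project's code)."""

    def __init__(
        self,
        max_size: int = 100,
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock
        # Maps key -> (expiry time, value); order runs from least to most
        # recently used.
        self._items: "OrderedDict[str, Tuple[float, Any]]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on access.
            del self._items[key]
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max_size:
            self._items.popitem(last=False)  # evict the least recently used
```

With `max_size=2`, storing `a` and `b`, reading `a`, and then storing `c` evicts `b`, because `a` was touched more recently.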

2. LFU (Least Frequently Used)

Evicts the least frequently accessed items first. This strategy is good for scenarios where some queries are much more common than others, as it prioritizes keeping frequently accessed items in the cache.
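
The LFU idea can be sketched with a hit counter per key. This is a deliberately simple version with an O(n) eviction scan; the names are hypothetical and the project's `lfu_cache.py` may use a different data structure.

```python
from typing import Any, Dict, Optional


class LFUCache:
    """Illustrative LFU cache (not the project's code)."""

    def __init__(self, max_size: int = 100) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._values: Dict[str, Any] = {}
        self._hits: Dict[str, int] = {}

    def get(self, key: str) -> Optional[Any]:
        if key not in self._values:
            return None
        self._hits[key] += 1  # count every successful read
        return self._values[key]

    def set(self, key: str, value: Any) -> None:
        if key not in self._values and len(self._values) >= self._max_size:
            # Evict the key with the fewest hits; dicts keep insertion order,
            # so ties go to the entry that was inserted first.
            victim = min(self._hits, key=self._hits.__getitem__)
            del self._values[victim]
            del self._hits[victim]
        self._values[key] = value
        self._hits.setdefault(key, 0)
```

A query read many times survives a burst of one-off queries here, whereas under LRU the same burst could push it out.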

3. FIFO (First In First Out)

Evicts the oldest items first, regardless of access patterns. This is the simplest strategy and suits workloads where entries should age out purely by insertion order.
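
For comparison, a FIFO cache can be sketched in a few lines. Reads never change the eviction order. As with the other sketches, the class and method names are hypothetical and this is not the project's `fifo_cache.py`.

```python
from collections import OrderedDict
from typing import Any, Optional


class FIFOCache:
    """Illustrative FIFO cache (not the project's code)."""

    def __init__(self, max_size: int = 100) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        # No reordering on read: access patterns are ignored.
        return self._items.get(key)

    def set(self, key: str, value: Any) -> None:
        if key not in self._items and len(self._items) >= self._max_size:
            self._items.popitem(last=False)  # drop the oldest insertion
        # Overwriting an existing key keeps its original position.
        self._items[key] = value
```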

🔍 Advanced SPARQL Examples

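Any valid SPARQL query form (SELECT, ASK, CONSTRUCT, DESCRIBE) can be sent through the server; what a query returns depends on the data and features of the endpoint behind it. Here are some example queries you can try: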

Basic Triple Pattern

SELECT * WHERE { ?s ?p ?o } LIMIT 10

Filtering by Property Type

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?subject ?label
WHERE {
    ?subject rdf:type rdfs:Class ;
             rdfs:label ?label .
}
LIMIT 10

Using Regular Expressions

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person ?name
WHERE {
    ?person foaf:name ?name .
    FILTER(REGEX(?name, "Smith", "i"))
}
LIMIT 10

Complex Query with Multiple Patterns

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?city ?name ?population ?country ?countryName
WHERE {
    ?city a dbo:City ;
          rdfs:label ?name ;
          dbo:populationTotal ?population ;
          dbo:country ?country .
    ?country rdfs:label ?countryName .
    FILTER(?population > 1000000)
    FILTER(LANG(?name) = 'en')
    FILTER(LANG(?countryName) = 'en')
}
ORDER BY DESC(?population)
LIMIT 10

⚠️ Troubleshooting

Common Issues

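  • Connection refused: Check that the SPARQL endpoint URL is correct and reachable from the machine running the server
  • Query timeout: Increase the timeout with the --timeout option (default: 30 seconds)
  • Memory issues with large result sets: Add a LIMIT clause to your queries or reduce the cache size with --cache-max-size
  • Permission denied for log/pid files: The default locations under /var/log and /var/run usually require elevated privileges; point --log-file and --pid-file at a writable directory or run with appropriate privileges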
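
To rule out the endpoint itself when diagnosing connection problems, a quick check outside the server can help. The sketch below sends a trivial ASK query over plain HTTP GET, as defined by the SPARQL protocol, using only the standard library. The function names are hypothetical, and some endpoints may require POST, authentication, or other headers that this sketch does not cover.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_ask_url(endpoint: str) -> str:
    """Append a trivial ASK query to the endpoint URL as a `query` parameter."""
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urllib.parse.urlencode({"query": "ASK { ?s ?p ?o }"})


def endpoint_is_reachable(endpoint: str, timeout: float = 30.0) -> bool:
    """Return True if the endpoint answers the ASK query with a SPARQL JSON result."""
    request = urllib.request.Request(
        build_ask_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and non-JSON bodies.
        return False
    # An ASK result in SPARQL JSON carries a top-level "boolean" member.
    return isinstance(payload, dict) and "boolean" in payload
```

For example, `endpoint_is_reachable("https://dbpedia.org/sparql", timeout=10)` should return `True` when the endpoint is up and reachable from the current machine.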

Logging

When running in foreground mode, logs are output to the console. When running as a daemon, logs are written to the specified log file (default: /var/log/mcp-sparql-server.log).

To increase verbosity, you can set the Python logging level in the source code.
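
A minimal sketch of raising the log level with the standard `logging` module is shown below. Where to call it (for example, near the top of `server.py`) and which logger names the project uses are assumptions, so this sets the level on the root logger.

```python
import logging


def enable_debug_logging() -> None:
    """Raise verbosity to DEBUG for every logger that inherits from the root."""
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # basicConfig is a no-op if handlers are already configured,
    # so set the level explicitly as well.
    logging.getLogger().setLevel(logging.DEBUG)
```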

🛠️ Development

Project Structure

mcp-server-sparql/
├── sparql_server/             # Main package
│   ├── core/                  # Core functionality
│   │   ├── __init__.py        # Package exports
│   │   ├── config.py          # Configuration management
│   │   └── server.py          # Main SPARQL server
│   ├── formatters/            # Result formatters
│   │   ├── __init__.py        # Package exports
│   │   ├── formatter.py       # Base formatter class
│   │   ├── json_formatter.py  # JSON formatter
│   │   ├── simplified_formatter.py # Simplified JSON formatter
│   │   └── tabular_formatter.py # Tabular formatter
│   ├── cache/                 # Caching implementation
│   │   ├── __init__.py        # Package exports
│   │   ├── query_cache.py     # Base cache interface
│   │   ├── lru_cache.py       # LRU cache implementation
│   │   ├── lfu_cache.py       # LFU cache implementation
│   │   └── fifo_cache.py      # FIFO cache implementation
│   └── __init__.py            # Package exports
├── server.py                  # Main entry point
├── setup.py                   # Package setup
├── install.sh                 # Installation script
├── requirements.txt           # Python dependencies
├── sparql-server.service      # Systemd service file
├── README.md                  # This file
└── LICENSE                    # License file

Running Tests

# Test stdio transport
python test_sparql_server.py

# Test HTTP transport (start server first)
# Terminal 1:
python server.py --transport http --endpoint https://dbpedia.org/sparql
# Terminal 2:
# Run HTTP-specific tests (if available)

🔒 Security Considerations

  • The server does not implement authentication or authorization; it relies on the security of the underlying SPARQL endpoint
  • For production use, consider deploying behind a secure proxy
  • Be careful with untrusted queries, as they can be resource-intensive for both this server and the endpoint
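
One way to limit the cost of untrusted queries is to cap the number of rows before a query is forwarded. The sketch below is a regex heuristic, not a SPARQL parser: the `cap_limit` function is hypothetical, it is not something the server is documented to do, and it can be fooled by `LIMIT` appearing inside string literals or comments. Treat it as a starting point only.

```python
import re

# Matches "LIMIT <number>" in any letter case.
_LIMIT_RE = re.compile(r"\bLIMIT\s+(\d+)", re.IGNORECASE)


def cap_limit(query: str, max_rows: int = 1000) -> str:
    """Ensure the query's last LIMIT clause does not exceed max_rows."""
    last_match = None
    for last_match in _LIMIT_RE.finditer(query):
        pass  # keep the final match, which is most likely the outer LIMIT
    if last_match is None:
        # No LIMIT at all: append one on its own line so that a trailing
        # comment cannot swallow it.
        return f"{query.rstrip()}\nLIMIT {max_rows}"
    if int(last_match.group(1)) > max_rows:
        # Replace only the number, leaving the rest of the query untouched.
        return query[: last_match.start(1)] + str(max_rows) + query[last_match.end(1):]
    return query
```

Combining a cap like this with the server's `--timeout` option bounds both the result size and the time spent on a single query.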

📄 License

This project is licensed under a dual-license model:

  • Open Source: GNU Affero General Public License v3.0 (AGPL-3.0) for open source use
  • Commercial: Proprietary commercial license available for commercial or proprietary use

See the LICENSE file for complete details.

This software was imagined and developed by Temkit Sid-Ali for Yet.lu with AI assistance from Claude (Anthropic), GitHub Copilot/Codex (OpenAI), and GPT-o3 (OpenAI).

🚀 Roadmap

We have exciting plans for the future! Check out our detailed roadmap to see what's coming next, including:

  • 🔒 Enhanced security and authentication
  • 🌐 Web interface for query exploration
  • 🤖 AI-powered natural language to SPARQL conversion
  • 📊 Advanced data visualization and analytics
  • 🏢 Enterprise features and scalability improvements

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

See our Contributing Guide for detailed development setup and guidelines.

🙏 Acknowledgments

This project is built on top of several excellent open source projects:

  • FastMCP - High-level Python framework for building MCP servers and clients by Jeremiah Lowin
  • SPARQLWrapper - SPARQL endpoint interface for Python by the RDFLib team
  • Pydantic - Data validation using Python type hints by Samuel Colvin and the Pydantic team
  • python-daemon - Library for making Unix daemon processes

Special thanks to:

  • The Model Context Protocol (MCP) community for developing the protocol specification
  • Anthropic for their work on AI assistants and the MCP ecosystem
  • The semantic web and RDF communities for their foundational work
  • W3C for the SPARQL specification and standards
  • AI Development Partners:
    • Claude (Anthropic) - Primary development assistant for architecture, code implementation, and documentation
    • GitHub Copilot/Codex (OpenAI) - Code completion and development acceleration
    • GPT-o3 (OpenAI) - Advanced reasoning and problem-solving assistance

📬 Contact


<div align="center"> <sub>Built with ❤️ by Temkit Sid-Ali for <a href="https://yet.lu">Yet.lu</a></sub> <br> <sub>Co-developed with Claude (Anthropic), GitHub Copilot/Codex & GPT-o3 (OpenAI)</sub> <br> <sub>© 2025 <a href="https://yet.lu">Yet.lu</a> - All rights reserved</sub> </div>
