VAST DB MCP Server

VAST DB MCP Server

Enables AI agents to securely query VAST Data databases for schema, metadata, and sample data via read-only SQL and MCP resources.

Category
访问服务器

README

VAST DB MCP Server

This project implements a Model Context Protocol (MCP) server designed to act as an interface between AI agents/LLMs and a VAST Data database.

Project Goal

To provide a secure and structured way for AI models to query information (schema, data samples) and execute read-only queries against a VAST DB instance using the MCP standard.

Core Technology

  • Python: >=3.9
  • MCP SDK: modelcontextprotocol/python-sdk (mcp-sdk)
  • VAST DB SDK: vast-data/vastdb_sdk (vastdb)
  • ASGI Server: uvicorn
  • MCP Implementation: FastMCP from the mcp-sdk
  • Configuration: python-dotenv
  • Testing: pytest, pytest-asyncio, pytest-mock, httpx
  • Rate Limiting: slowapi

Project Structure

/
├── sdk/                     # Cloned SDKs (ignored by git)
│   ├── python-sdk/
│   └── vastdb_sdk/
├── src/
│   └── vast_mcp_server/     # Main Python package
│       ├── __init__.py
│       ├── server.py          # FastMCP application setup
│       ├── config.py          # Loads connection details from .env
│       ├── utils.py           # Shared utility functions (e.g., auth header extraction)
│       ├── resources/         # MCP Resource handlers
│       │   ├── __init__.py
│       │   ├── schema.py      # Handler for vast://schemas
│       │   ├── table_data.py  # Handler for vast://tables/{table_name}
│       │   └── metadata.py    # Handler for vast://metadata/tables/{table_name}
│       ├── tools/             # MCP Tool handlers
│       │   ├── __init__.py
│       │   └── query.py       # Handler for vast_sql_query tool
│       └── vast_integration/  # VAST DB interaction logic
│           ├── __init__.py
│           └── db_ops.py      # Connection & query execution (async wrappers)
├── tests/                   # Pytest unit/integration tests
│   ├── __init__.py
│   ├── test_db_ops.py     # Tests for VAST DB interaction logic
│   ├── test_resources.py  # Tests for MCP resource handlers
│   └── test_tools.py      # Tests for MCP tool handlers
├── scripts/
│   └── run_server.py        # Script to start the server via uvicorn
├── .gitignore
├── .env.example             # Example environment file
├── README.md                # This file
└── pyproject.toml           # Project metadata and dependencies

Implemented MCP Features

Authentication: All resources and tools require the following HTTP headers to be sent with the request:

  • X-Vast-Access-Key: Your VAST DB access key.
  • X-Vast-Secret-Key: Your VAST DB secret key. Failure to provide these headers, or providing invalid credentials, will result in an UNAUTHENTICATED (401) error response.

Rate Limiting: All resource and tool handlers are rate-limited based on the client IP address. The default limit is configurable via the MCP_DEFAULT_RATE_LIMIT environment variable (e.g., "10/minute", see .env.example). Exceeding the limit will result in a 429 Too Many Requests error.

  • Resource: Database Schema
    • URI: vast://schemas
    • Description: Returns a formatted string describing all discovered tables and their columns (name and type). Requires authentication headers.
    • Error Handling: Returns an McpResponse with an error status code (UNAUTHENTICATED, SERVICE_UNAVAILABLE, INTERNAL_SERVER_ERROR) and a plain text error message body (ERROR: [ErrorType] Message).
  • Resource: List Tables
    • URI: vast://tables?format=FMT
    • Description: Returns a list of available table names. Requires authentication headers.
    • Parameters:
      • format (string, optional, default: json): Output format (json or csv/'list'). csv or list returns a newline-separated string.
    • Format: JSON array of strings, or a newline-separated list.
    • Error Handling: Returns an McpResponse with an error status code (UNAUTHENTICATED, BAD_REQUEST, SERVICE_UNAVAILABLE, INTERNAL_SERVER_ERROR) and a formatted error body (JSON or plain text based on format).
  • Resource: Table Metadata
    • URI: vast://metadata/tables/{table_name}
    • Description: Returns detailed metadata for a specific table. Requires authentication headers. Attempts to include column name, data type, nullability (is_nullable: e.g., 'YES'/'NO'), key information (key: e.g., 'PRI'), and default value (default). Fields beyond name/type might be null if not available from the database introspection.
    • Format: JSON object containing table_name (string) and columns (list of objects, each potentially with name, type, is_nullable, key, default keys).
    • Example Response (Enhanced):
      {
        "table_name": "my_table",
        "columns": [
          {
            "name": "id",
            "type": "INTEGER",
            "is_nullable": "NO",
            "key": "PRI",
            "default": null
          },
          {
            "name": "data_column",
            "type": "VARCHAR",
            "is_nullable": "YES",
            "key": "",
            "default": "'Default Text'"
          },
          {
            "name": "timestamp",
            "type": "TIMESTAMP",
            "is_nullable": "YES",
            "key": null,
            "default": null
          }
        ]
      }
      
    • Error Handling: Returns an McpResponse with an error status code (UNAUTHENTICATED, NOT_FOUND, BAD_REQUEST, SERVICE_UNAVAILABLE, INTERNAL_SERVER_ERROR) and a JSON error body ({\"error\": ...}).
  • Resource: Table Sample Data
    • URI: vast://tables/{table_name}?limit=N&format=FMT
    • Description: Returns a sample of data from the specified table_name. Requires authentication headers.
    • Parameters:
      • limit (integer, optional, default: 10): Maximum number of rows.
      • format (string, optional, default: csv): Output format (csv or json).
    • Format: CSV or JSON string (array of objects), including header row for CSV.
    • Error Handling: Returns an McpResponse with an error status code (UNAUTHENTICATED, BAD_REQUEST, SERVICE_UNAVAILABLE, INTERNAL_SERVER_ERROR) and a formatted error body (JSON or plain text based on format).
  • Tool: SQL Query Executor
    • Name: vast_sql_query
    • Arguments:
      • sql (string, required): The SQL query to execute.
      • format (string, optional, default: csv): Output format (csv or json).
      • headers (dict, required): Dictionary containing request headers, must include X-Vast-Access-Key and X-Vast-Secret-Key.
    • Description: Executes the provided SQL query against VAST DB using credentials from the headers argument.
    • Format: Returns results as a CSV or JSON string (array of objects) or an error message string (JSON or plain text based on format).
    • Safety: Allowed statement types controlled by MCP_ALLOWED_SQL_TYPES env var (defaults to SELECT).
    • Error Handling: Returns a formatted error string (JSON or plain text) on failure. Errors include missing/invalid headers (AuthenticationError), disallowed query types (InvalidInputError), connection issues (DatabaseConnectionError), and query execution problems (QueryExecutionError).

AI Agent Interaction Notes

When integrating this MCP server with an AI agent framework (e.g., LangChain, LlamaIndex, custom agents), consider the following:

  1. Agent Prompting/Configuration:

    • The agent's system prompt or configuration must include descriptions of the available resources and tools (similar to the "Implemented MCP Features" section above). This enables the LLM to choose the correct action based on user requests.
    • Since MCP lacks standard discovery, explicitly list the URIs (vast://schemas, vast://tables?..., vast://metadata/tables/{name}, vast://tables/{name}?....) and the tool name (vast_sql_query) with their capabilities.
  2. Authentication Handling (Security Critical):

    • NEVER include VAST DB credentials (access_key, secret_key) in prompts sent to the LLM.
    • The agent's orchestrator (the code running the agent logic, not the LLM) is responsible for managing credentials.
    • Load credentials securely on the client-side (using environment variables, secrets managers like Vault/AWS/Azure/GCP Secrets Manager, etc.).
    • For Resources: When the LLM generates a target URI, the orchestrator must:
      • Retrieve the stored credentials.
      • Construct the headers dictionary: {'X-Vast-Access-Key': '...', 'X-Vast-Secret-Key': '...'}.
      • Map the vast:// URI scheme to the actual HTTP URL of the running MCP server (e.g., http://localhost:8088/).
      • Make the HTTP GET request using an MCP client or standard HTTP client, passing the constructed headers.
    • For Tools: When the LLM decides to use the vast_sql_query tool and provides the sql and format arguments, the orchestrator must:
      • Retrieve the stored credentials.
      • Construct the headers dictionary as above.
      • Inject this headers dictionary into the arguments passed to the tool execution function. The LLM should not generate the headers argument itself.
  3. Request Construction:

    • The LLM needs to generate the correct URI path parameters ({table_name}) and query parameters (?format=, ?limit=) for resources.
    • The LLM generates the sql and format arguments for the vast_sql_query tool.
  4. Response Handling:

    • The client-side agent code needs to handle different response Content-Types (e.g., text/plain, application/json, text/csv).
    • It must check response status codes (especially for resources) to detect errors (e.g., 401, 404, 500, 503).
    • It needs to parse error messages from the response body (plain text or JSON) and potentially report them back to the user or use them for retries/alternative actions.

Potential Next Steps

  • Implement robust logging. (Done)
  • Add unit tests. (Done)
  • Refine output formats (e.g., offer JSON alongside CSV). (Done)
  • Enhance error handling and reporting. (Done - basic custom exceptions and formatting)
  • Add more granular resources/tools (e.g., list only tables, get table metadata). (Done - list tables, table metadata)
  • Implement more sophisticated query validation/sandboxing for the vast_sql_query tool. (Done - using sqlparse)
  • Make query restrictions (e.g., allowing non-SELECT) configurable. (Done - via MCP_ALLOWED_SQL_TYPES env var)
  • Add integration tests that require a running VAST DB instance or mock server. (Done - Added ASGI/mocked tests for all current resources/tools)
  • Consider adding authentication/authorization layer if needed. (Done - Header-based authentication)
  • Refactor shared code (e.g., extract_auth_headers, formatters) into utils.py. (Done)
  • Implement Rate Limiting. (Done - IP-based via slowapi on handlers)

How to Run

  1. Clone the repository (if you haven't already).
  2. Set up Environment: Copy .env.example to .env and fill in your VAST DB endpoint, access key, and secret key.
    cp .env.example .env
    # Edit .env with your details
    
  3. Install Dependencies: Using a virtual environment is recommended.

推荐服务器

Baidu Map

Baidu Map

百度地图核心API现已全面兼容MCP协议,是国内首家兼容MCP协议的地图服务商。

官方
精选
JavaScript
Playwright MCP Server

Playwright MCP Server

一个模型上下文协议服务器,它使大型语言模型能够通过结构化的可访问性快照与网页进行交互,而无需视觉模型或屏幕截图。

官方
精选
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

一个由人工智能驱动的工具,可以从自然语言描述生成现代化的用户界面组件,并与流行的集成开发环境(IDE)集成,从而简化用户界面开发流程。

官方
精选
本地
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

通过模型上下文协议启用与 Audiense Insights 账户的交互,从而促进营销洞察和受众数据的提取和分析,包括人口统计信息、行为和影响者互动。

官方
精选
本地
TypeScript
VeyraX

VeyraX

一个单一的 MCP 工具,连接你所有喜爱的工具:Gmail、日历以及其他 40 多个工具。

官方
精选
本地
graphlit-mcp-server

graphlit-mcp-server

模型上下文协议 (MCP) 服务器实现了 MCP 客户端与 Graphlit 服务之间的集成。 除了网络爬取之外,还可以将任何内容(从 Slack 到 Gmail 再到播客订阅源)导入到 Graphlit 项目中,然后从 MCP 客户端检索相关内容。

官方
精选
TypeScript
Kagi MCP Server

Kagi MCP Server

一个 MCP 服务器,集成了 Kagi 搜索功能和 Claude AI,使 Claude 能够在回答需要最新信息的问题时执行实时网络搜索。

官方
精选
Python
e2b-mcp-server

e2b-mcp-server

使用 MCP 通过 e2b 运行代码。

官方
精选
Neon MCP Server

Neon MCP Server

用于与 Neon 管理 API 和数据库交互的 MCP 服务器

官方
精选
Exa MCP Server

Exa MCP Server

模型上下文协议(MCP)服务器允许像 Claude 这样的 AI 助手使用 Exa AI 搜索 API 进行网络搜索。这种设置允许 AI 模型以安全和受控的方式获取实时的网络信息。

官方
精选