data-explore

An MCP server for dataset exploration and analysis, enabling LLM clients to perform summary, correlation, distribution, missing value analysis, data cleaning, and statistical tests directly on CSV files.

MCP Data Exploration Server

An MCP (Model Context Protocol) server that provides dataset exploration and analysis tools for any LLM client. The server performs actual data analysis and returns formatted results, eliminating the need for users to write or execute code.

Features

  • Dataset Analysis: Comprehensive summary, correlation analysis, distribution analysis, and missing value detection
  • Data Cleaning: Automated cleaning operations with detailed results
  • Statistical Testing: Normality tests, correlation significance tests, and t-tests
  • MCP Compatible: Works with any MCP-compatible client (Claude Desktop, custom clients, etc.)

Quick Start

The fastest way to get started:

# 1. Navigate to project directory
cd /path/to/mcp-data-explore

# 2. Install dependencies
uv sync

# 3. Add MCP server to Claude Code (project scope - shared with team)
claude mcp add data-explore -s project -- uv --directory . run python main.py

# 4. Start Claude Code
claude

# 5. Try it out in Claude Code
# "Analyze the test_data.csv file"
# "What's the correlation between age and salary?"

Installation

Prerequisites

  • Python 3.13 or higher
  • uv package manager

Setup

  1. Clone or download this repository:
git clone <repository-url>
cd mcp-data-explore
  2. Install dependencies:
uv sync
  3. Test the server (run it through uv so the synced environment is used):
uv run python main.py

Connecting to MCP Clients

Claude Code

Claude Code has built-in MCP support. To connect:

  1. Project MCP Configuration (Recommended)

    The project already includes a .mcp.json file that's shared with everyone:

    {
      "mcpServers": {
        "data-explore": {
          "command": "uv",
          "args": [
            "--directory",
            ".",
            "run",
            "python",
            "main.py"
          ],
          "env": {}
        }
      }
    }
    

    This file is automatically detected by Claude Code when you start it in the project directory.

    Environment Variable Support in .mcp.json:

    You can use environment variables in your .mcp.json for flexibility:

    {
      "mcpServers": {
        "data-explore": {
          "command": "${UV_COMMAND:-uv}",
          "args": [
            "--directory",
            "${PROJECT_DIR:-.}",
            "run",
            "python",
            "main.py"
          ],
          "env": {
            "PYTHONPATH": "${CUSTOM_PYTHON_PATH:-}"
          }
        }
      }
    }
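
The `${VAR:-default}` form above reads like shell-style default expansion: the variable's value is used when it is set, and the text after `:-` is used otherwise. The sketch below mimics that behaviour in Python purely as an illustration. `expand` is a hypothetical helper, not Claude Code's implementation, and it handles only the two forms shown.

```python
import re

# Matches ${NAME} and ${NAME:-default}
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand(text, env):
    """Expand ${VAR} and ${VAR:-default} references in text using the env mapping."""

    def replace(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is None:
            # Plain ${VAR}: the variable must be present (assumed behaviour).
            if value is None:
                raise KeyError(f"{name} is not set and has no default")
            return value
        # ${VAR:-default}: fall back when the variable is unset or empty.
        return value if value else default

    return _PATTERN.sub(replace, text)


print(expand("${UV_COMMAND:-uv}", {}))  # uv
print(expand("${PROJECT_DIR:-.}", {"PROJECT_DIR": "/srv/data-explore"}))  # /srv/data-explore
```

Passing `os.environ` as `env` would resolve the references against the real environment.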
    

    Security Note: Claude Code prompts for approval before using project-scoped servers from .mcp.json files.

  2. Using Claude Code CLI to Add MCP Server

    You can add the MCP server using Claude Code CLI commands:

    # Add MCP server to project scope (shared with team via .mcp.json)
    claude mcp add data-explore -s project -- uv --directory . run python main.py
    
    # Add MCP server to user scope (available across all your projects)
    claude mcp add data-explore -s user -- uv --directory /ABSOLUTE/PATH/TO/mcp-data-explore run python main.py
    
    # Add MCP server to local scope (private to you in this project) - DEFAULT
    claude mcp add data-explore -- uv --directory . run python main.py
    # or explicitly specify local scope
    claude mcp add data-explore -s local -- uv --directory . run python main.py
    
    # Add with environment variables if needed
    claude mcp add data-explore -s project -e PYTHONPATH=/custom/path -- uv --directory . run python main.py
    

    MCP Server Management Commands:

    # List all configured MCP servers
    claude mcp list
    
    # Get details for a specific server
    claude mcp get data-explore
    
    # Remove an MCP server
    claude mcp remove data-explore
    
    # Reset project-scoped server approval choices
    claude mcp reset-project-choices
    
    # Import servers from Claude Desktop (macOS/WSL only)
    claude mcp add-from-claude-desktop
    
    # Add server from JSON configuration
    claude mcp add-json data-explore '{"type":"stdio","command":"uv","args":["--directory",".","run","python","main.py"],"env":{}}'
    
  3. Understanding MCP Server Scopes

    Claude Code supports three MCP server scopes with clear precedence:

    • local (default): Private to you in current project only
    • project: Shared with team via .mcp.json file (version controlled)
    • user: Available to you across all projects on your machine

    Scope Precedence: local > project > user (local overrides project, project overrides user)
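
The precedence rule can be pictured as a lookup that walks the scopes from highest to lowest priority and stops at the first one that defines the server. This is a minimal sketch of that idea only. The `configs` dictionary and `resolve_server` are hypothetical and say nothing about how Claude Code stores its settings.

```python
# Highest precedence first, as stated above: local > project > user
SCOPE_ORDER = ("local", "project", "user")


def resolve_server(name, configs):
    """Return (scope, definition) from the highest-precedence scope defining name."""
    for scope in SCOPE_ORDER:
        servers = configs.get(scope, {})
        if name in servers:
            return scope, servers[name]
    return None


# Illustrative data: the same server name defined in two scopes.
configs = {
    "user": {"data-explore": {"command": "uv", "args": ["--directory", "/ABSOLUTE/PATH/TO/mcp-data-explore", "run", "python", "main.py"]}},
    "project": {"data-explore": {"command": "uv", "args": ["--directory", ".", "run", "python", "main.py"]}},
}

scope, definition = resolve_server("data-explore", configs)
print(scope)  # project
```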

    Choosing the Right Scope:

    • Local: Experimental configurations, sensitive credentials, personal development
    • Project: Team-shared tools, project-specific services, collaboration requirements
    • User: Personal utilities, development tools, cross-project services

    Other Useful Commands:

    # View all MCP commands and help
    claude mcp --help
    
    # Check connection status of all servers (use /mcp command in Claude Code)
    /mcp
    
    # Configure server startup timeout (10 seconds example)
    MCP_TIMEOUT=10000 claude
    
  4. Start Claude Code

    # Start in project directory (automatically loads .mcp.json)
    cd /path/to/mcp-data-explore
    claude
    
    # Claude Code will automatically detect and load the project MCP configuration
    # You can use the /mcp command within Claude Code to check server status
    
  5. Verify Connection

    Once connected, you should see the MCP tools available. Try asking:

    • "What MCP tools are available?"
    • "What MCP servers are connected?"
    • "Analyze the test_data.csv file"
    • "Show me the dataset summary for test_data.csv"
  6. Usage Examples with Claude Code

    You: "Analyze the test dataset in this directory"
    Claude Code: [Uses analyze_dataset tool] → Returns comprehensive analysis
    
    You: "What's the correlation between age and salary?"
    Claude Code: [Uses analyze_dataset with correlation type] → Returns correlation matrix
    
    You: "Clean my data by removing duplicates and filling nulls"
    Claude Code: [Uses clean_data tool] → Returns cleaning results
    
    You: "Test if the age column is normally distributed"
    Claude Code: [Uses statistical_summary tool] → Returns normality test results
    

Claude Desktop

  1. Install Claude Desktop

  2. Configure Claude Desktop

    Open your Claude Desktop configuration file:

    macOS:

    code ~/Library/Application\ Support/Claude/claude_desktop_config.json
    

    Windows:

    code %APPDATA%\Claude\claude_desktop_config.json
    
  3. Add Server Configuration

    Add the following to your claude_desktop_config.json:

    {
      "mcpServers": {
        "data-explore": {
          "command": "uv",
          "args": [
            "--directory", 
            "/ABSOLUTE/PATH/TO/mcp-data-explore",
            "run", 
            "python", 
            "main.py"
          ]
        }
      }
    }
    

    Important: Replace /ABSOLUTE/PATH/TO/mcp-data-explore with the actual absolute path to your project directory.

    Windows Example:

    {
      "mcpServers": {
        "data-explore": {
          "command": "uv",
          "args": [
            "--directory",
            "C:\\Users\\YourName\\mcp-data-explore",
            "run", 
            "python", 
            "main.py"
          ]
        }
      }
    }
    
  4. Restart Claude Desktop

    Completely close and restart Claude Desktop for the changes to take effect.

  5. Verify Connection

    Look for the tools icon in Claude Desktop. You should see 3 available tools:

    • analyze_dataset
    • clean_data
    • statistical_summary

Other MCP Clients

For other MCP-compatible clients, use these connection details:

  • Transport: stdio
  • Command: uv --directory /path/to/mcp-data-explore run python main.py
  • Server Name: data-explore
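
For a custom client, the stdio transport means launching the command above as a child process and exchanging JSON-RPC messages over its stdin and stdout, one message per line. The sketch below shows only the opening handshake. The protocol version string and client name are example values, `probe` is a hypothetical helper, and a real client would normally use an MCP SDK and continue with the `notifications/initialized` message.

```python
import json
import subprocess

# Placeholder path, as in the connection details above.
SERVER_COMMAND = ["uv", "--directory", "/path/to/mcp-data-explore", "run", "python", "main.py"]


def initialize_request(request_id=1):
    """Build the JSON-RPC initialize request that opens an MCP session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # example version string
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }


def frame(message):
    """Serialize one message as a single newline-terminated line of JSON."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def probe(command=SERVER_COMMAND):
    """Start the server, send initialize, and return its first response line as a dict."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        proc.stdin.write(frame(initialize_request()))
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.kill()
        proc.wait()
```

Calling `probe()` with the real project path should return the server's initialize result, including the capabilities it advertises.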

Available Tools

1. analyze_dataset

Performs comprehensive dataset analysis.

Parameters:

  • dataset_path (required): Path to CSV file
  • analysis_type (optional): "summary", "correlation", "distribution", "missing_values" (default: "summary")
  • columns (optional): List of column names to analyze
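
Behind a natural-language prompt, the client sends these parameters to the server in a `tools/call` request. The sketch below builds such a request from the parameters listed above. `analyze_dataset_call` is a hypothetical helper for illustration, and the client-side validation is an assumption, since the server may validate differently.

```python
import json

# Allowed values taken from the parameter list above.
ANALYSIS_TYPES = {"summary", "correlation", "distribution", "missing_values"}


def analyze_dataset_call(dataset_path, analysis_type="summary", columns=None, request_id=1):
    """Build a JSON-RPC tools/call request for the analyze_dataset tool."""
    if analysis_type not in ANALYSIS_TYPES:
        raise ValueError(f"analysis_type must be one of {sorted(ANALYSIS_TYPES)}")
    arguments = {"dataset_path": dataset_path, "analysis_type": analysis_type}
    if columns is not None:
        arguments["columns"] = list(columns)  # optional column filter
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "analyze_dataset", "arguments": arguments},
    }


request = analyze_dataset_call("/path/to/data.csv", "correlation", ["age", "salary"])
print(json.dumps(request, indent=2))
```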

Example Usage:

  • "Analyze the dataset at /path/to/data.csv"
  • "Show correlation analysis for the sales data"
  • "Check for missing values in my dataset"

2. clean_data

Performs data cleaning operations and shows results.

Parameters:

  • dataset_path (required): Path to CSV file
  • operations (required): List of operations - "remove_nulls", "fill_nulls", "remove_duplicates", "standardize_columns", "convert_types"
  • output_path (optional): Path to save cleaned dataset
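
To make the operation names concrete, here is a standard-library sketch of three of them applied in sequence. It is an illustration only. The exact semantics (for example, what counts as a null or how column names are standardized) are assumptions, the server's own implementation may differ, and "fill_nulls" and "convert_types" are left out.

```python
import csv
import io


def standardize_columns(rows):
    """Lower-case column names and replace spaces with underscores (assumed rule)."""
    return [{key.strip().lower().replace(" ", "_"): value for key, value in row.items()} for row in rows]


def remove_nulls(rows):
    """Drop rows that contain an empty cell."""
    return [row for row in rows if all(value not in ("", None) for value in row.values())]


def remove_duplicates(rows):
    """Drop exact duplicate rows, keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


OPERATIONS = {
    "standardize_columns": standardize_columns,
    "remove_nulls": remove_nulls,
    "remove_duplicates": remove_duplicates,
}


def clean(rows, operations):
    """Apply the named operations in the order given."""
    for name in operations:
        rows = OPERATIONS[name](rows)
    return rows


# Made-up sample data: one duplicate row and one row with a missing age.
SAMPLE = "Full Name,Age\nAda,36\nAda,36\nGrace,\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(clean(rows, ["standardize_columns", "remove_nulls", "remove_duplicates"]))
# [{'full_name': 'Ada', 'age': '36'}]
```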

Example Usage:

  • "Clean my dataset by removing null values and duplicates"
  • "Fill missing values and standardize column names"
  • "Optimize data types in my dataset"

3. statistical_summary

Performs statistical tests and analysis.

Parameters:

  • dataset_path (required): Path to CSV file
  • columns (optional): Specific columns to analyze
  • tests (optional): List of tests - "normality", "correlation_test", "ttest"
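
As an illustration of what a "correlation_test" involves, the sketch below computes the Pearson coefficient r and the statistic t = r * sqrt((n - 2) / (1 - r^2)), which is compared against a t distribution with n - 2 degrees of freedom. The README does not say how the server computes this, so treat the functions and the sample numbers as assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: no linear correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up sample values for illustration only.
age = [25, 32, 41, 47, 53]
salary = [40, 52, 61, 75, 80]

r = pearson_r(age, salary)
t = correlation_t_statistic(r, len(age))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(age) - 2}")
```

Turning t into a p-value needs the t distribution's CDF, which a statistics library such as SciPy provides.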

Example Usage:

  • "Run statistical tests on my dataset"
  • "Test if the age column follows a normal distribution"
  • "Check correlation significance between variables"

Example Usage

Once connected to your MCP client, you can ask natural language questions like:

  • "Analyze the dataset at /Users/me/sales_data.csv"
  • "What's the correlation between age and income in my data?"
  • "Clean my dataset by removing duplicates and filling missing values"
  • "Test if the revenue column is normally distributed"
  • "Show me distribution analysis for the price column"

The server will automatically:

  1. Load your CSV data
  2. Perform the requested analysis
  3. Return formatted results with insights and interpretations
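
The three steps above can be sketched with the standard library for one analysis type, missing values. This is not the server's code. The function names, the sample data, and the report layout are all assumptions made for illustration.

```python
import csv
import io


def load_csv(source):
    """Step 1: read CSV text from a file-like object into a list of row dicts."""
    return list(csv.DictReader(source))


def missing_values(rows):
    """Step 2: count empty cells per column."""
    counts = {column: 0 for column in rows[0]} if rows else {}
    for row in rows:
        for column in counts:
            value = row.get(column)
            if value is None or not value.strip():
                counts[column] += 1
    return counts


def format_report(counts, total_rows):
    """Step 3: render the counts as a short text report."""
    lines = [f"Missing values ({total_rows} rows)"]
    for column, count in counts.items():
        share = 100 * count / total_rows if total_rows else 0.0
        lines.append(f"  {column}: {count} ({share:.1f}%)")
    return "\n".join(lines)


# Made-up sample data with one missing age and one missing salary.
SAMPLE_CSV = "name,age,salary\nAda,36,\nGrace,,90000\nLinus,54,85000\n"
table = load_csv(io.StringIO(SAMPLE_CSV))
print(format_report(missing_values(table), len(table)))
```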

Troubleshooting

Claude Code MCP Issues

  1. MCP Server Not Loading

    # Verify MCP server starts manually
    cd /path/to/mcp-data-explore
    uv run python main.py
    
    # Check server configuration
    claude mcp get data-explore
    
    # List all servers
    claude mcp list
    
  2. Tools Not Available

    • Ensure .mcp.json is in the project root for project scope
    • Check JSON syntax is valid (use claude mcp get data-explore)
    • Use /mcp command in Claude Code to check connection status
    • Try: "What MCP servers are connected?" or "What MCP tools are available?"
  3. Path Issues

    • Use relative paths ("--directory", ".") for project scope
    • Use absolute paths for user scope
    • Ensure uv is in your PATH: which uv
    • On Windows, some commands may need a cmd /c wrapper
  4. Security Approval Required

    • Claude Code prompts for approval before using project-scoped servers
    • Click "Allow" when prompted
    • Use claude mcp reset-project-choices to reset approval choices
  5. Debug MCP Connection

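    # Test server directly with a JSON-RPC initialize request
    echo '{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "debug", "version": "0.0.1"}}}' | uv run python main.py
    
    # Allow more time for server startup (30 seconds)
    MCP_TIMEOUT=30000 claude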
    

Server Not Appearing in Claude Desktop

  1. Check your JSON syntax in claude_desktop_config.json
  2. Ensure the path is absolute (not relative)
  3. Restart Claude Desktop completely
  4. Check Claude's logs: ~/Library/Logs/Claude/mcp*.log
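
The first two checks can be automated with a few lines of Python. The sketch below parses the config text, reports the position of any JSON syntax error, and flags a relative `--directory` argument. `check_config` and `is_absolute` are hypothetical helpers, and the structure they expect is the one shown in the configuration examples above.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute(path):
    """True for absolute POSIX paths and for drive-rooted Windows paths."""
    return PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute()


def check_config(text):
    """Return a list of problems found in an MCP client config; empty means none found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ['top-level "mcpServers" object is missing']
    problems = []
    for name, server in servers.items():
        if not isinstance(server, dict) or not isinstance(server.get("command"), str):
            problems.append(f'{name}: "command" must be a string')
            continue
        args = server.get("args", [])
        if isinstance(args, list) and "--directory" in args:
            position = args.index("--directory") + 1
            directory = args[position] if position < len(args) else ""
            if not is_absolute(directory):
                problems.append(f"{name}: --directory should be absolute, got {directory!r}")
    return problems
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `check_config`.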

Tool Calls Failing

  1. Verify the CSV file path exists and is accessible
  2. Check that the CSV file is properly formatted
  3. Ensure all required parameters are provided
  4. Look for error messages in the returned results
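
A quick pre-flight script can rule out the first two causes before retrying a tool call. The sketch below checks that the file exists and is readable, and that every row has as many fields as the header. `check_csv` is a hypothetical helper, and it assumes a UTF-8, comma-separated file.

```python
import csv
import os


def check_csv(path):
    """Return a list of problems found with a CSV file; an empty list means none found."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    if not os.access(path, os.R_OK):
        return [f"file is not readable: {path}"]
    problems = []
    try:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if not header:
                return ["file is empty or has no header row"]
            for row in reader:
                if len(row) != len(header):
                    problems.append(
                        f"line {reader.line_num}: expected {len(header)} fields, found {len(row)}"
                    )
    except UnicodeDecodeError as exc:
        problems.append(f"file is not valid UTF-8: {exc}")
    return problems
```

For example, `check_csv("test_data.csv")` returns an empty list when the file looks usable.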

Import Errors

If you see module import errors:

uv sync  # Reinstall dependencies
uv run python -c "import pandas; print('Dependencies OK')"  # Test imports

Development

Adding New Tools

  1. Add a new @mcp.tool() decorated async function in main.py
  2. Follow the existing pattern for error handling and input validation
  3. Update CLAUDE.md with the new tool specifications
  4. Test the tool before deploying
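
As a rough picture of step 1 and step 2, a new tool could look like the function below. `count_rows` is a hypothetical tool, and returning error strings instead of raising is an assumption about the existing pattern, so check main.py for the real conventions. The decorator is shown as a comment so the sketch runs without the MCP SDK installed.

```python
import asyncio
import csv
import os


# In main.py this function would carry the server's decorator:
# @mcp.tool()
async def count_rows(dataset_path: str) -> str:
    """Return the number of data rows in a CSV file."""
    # Input validation first, so the client gets a readable message.
    if not dataset_path.lower().endswith(".csv"):
        return "Error: dataset_path must point to a .csv file"
    if not os.path.isfile(dataset_path):
        return f"Error: file not found: {dataset_path}"
    try:
        with open(dataset_path, newline="", encoding="utf-8") as handle:
            rows = sum(1 for _ in csv.DictReader(handle))
    except (OSError, UnicodeDecodeError, csv.Error) as exc:
        return f"Error: could not read {dataset_path}: {exc}"
    return f"{dataset_path}: {rows} data rows"


# Step 4: exercise the tool directly before wiring it into a client.
print(asyncio.run(count_rows("not_a_dataset.txt")))
```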

Extending Data Format Support

Currently supports CSV files. To add support for other formats:

  1. Modify the data loading logic in each tool
  2. Add format detection based on file extension
  3. Update tool documentation and examples
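
One way to approach steps 1 and 2 is a small dispatch table keyed by file extension, so each tool calls a single loader. The sketch below is an assumption about how that could look, not existing project code. The supported extensions and the row-dict representation are chosen for illustration, and the JSON loader assumes the file holds an array of row objects.

```python
import csv
import io
import json
from pathlib import Path


def _read_delimited(text, delimiter):
    """Parse delimiter-separated text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Extension -> loader taking the file's text and returning a list of row dicts.
LOADERS = {
    ".csv": lambda text: _read_delimited(text, ","),
    ".tsv": lambda text: _read_delimited(text, "\t"),
    ".json": json.loads,  # assumes a JSON array of row objects
}


def detect_format(path):
    """Return the normalized extension, or raise if the format is unsupported."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"unsupported format {suffix!r}; supported: {sorted(LOADERS)}")
    return suffix


def load_rows(path):
    """Load a dataset by dispatching on its file extension."""
    loader = LOADERS[detect_format(path)]
    return loader(Path(path).read_text(encoding="utf-8"))
```

Adding a format then means registering one more entry in `LOADERS` instead of touching every tool.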

License

This project is open source. Feel free to modify and distribute according to your needs.
