Bio-MCP FastQC Server

Enables AI assistants to perform quality control analysis on high-throughput sequencing data using FastQC and MultiQC. It supports single-file and batch processing of FASTQ/FASTA files and generates comprehensive, interactive summary reports.

Bio-MCP FastQC Server 🔬

Quality Control Analysis via Model Context Protocol

An MCP server that enables AI assistants to run FastQC and MultiQC quality control analysis on sequencing data. Part of the Bio-MCP ecosystem.

🎯 Purpose

FastQC is essential for quality assessment of high-throughput sequencing data. This MCP server allows AI assistants to:

  • Analyze single files - Get detailed QC reports for individual FASTQ/FASTA files
  • Batch process - Run QC on multiple files simultaneously
  • Generate summary reports - Create MultiQC reports combining multiple analyses
  • Handle large datasets - Queue system support for computationally intensive jobs

🚀 Quick Start

Prerequisites

Install FastQC and MultiQC:

# Via conda (recommended)
conda install -c bioconda fastqc multiqc

# Via package managers
# Ubuntu/Debian
sudo apt-get install fastqc
pip install multiqc

# macOS
brew install fastqc
pip install multiqc

Installation

# Clone and install
git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e .

# Or install directly
pip install git+https://github.com/bio-mcp/bio-mcp-fastqc.git

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "bio-fastqc": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "/path/to/bio-mcp-fastqc"
    }
  }
}
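If you would rather not edit the JSON by hand, the entry above can be merged into an existing config file with a few lines of Python. This is a minimal sketch: the helper name add_fastqc_server is hypothetical, and the location of claude_desktop_config.json depends on your platform, so it is passed in rather than guessed.

```python
import json
from pathlib import Path


def add_fastqc_server(config_path: Path, repo_dir: str) -> dict:
    """Merge a "bio-fastqc" entry into a Claude Desktop config file.

    Existing servers in the file are preserved; only the "bio-fastqc"
    key is added or overwritten.
    """
    config = {}
    if config_path.exists():
        # An empty file is treated as an empty config.
        config = json.loads(config_path.read_text(encoding="utf-8") or "{}")

    servers = config.setdefault("mcpServers", {})
    servers["bio-fastqc"] = {
        "command": "python",
        "args": ["-m", "src.server"],
        "cwd": repo_dir,  # path to your bio-mcp-fastqc checkout
    }

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Call it with the path to your own config file and the directory where you cloned the repository, then restart Claude Desktop so the new server is picked up.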

🔧 Available Tools

Core Analysis Tools

fastqc_single

Run FastQC on a single FASTQ/FASTA file.

Parameters:

  • input_file (required): Path to FASTQ or FASTA file
  • threads (optional): Number of threads (default: 1)
  • contaminants (optional): Path to custom contaminants file
  • adapters (optional): Path to custom adapters file
  • limits (optional): Path to custom limits file

Example:

User: "Run quality control on my_sample.fastq.gz"
AI: [calls fastqc_single] → Returns detailed QC report with pass/warn/fail status for each module
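To make the parameters concrete, here is a rough sketch of how they could map onto a FastQC command line. It is not taken from the server's source: the function name build_fastqc_command and the output_dir argument are assumptions for illustration. The flags themselves (--outdir, --threads, --contaminants, --adapters, --limits) are standard FastQC options.

```python
from typing import List, Optional


def build_fastqc_command(
    input_file: str,
    output_dir: str,
    threads: int = 1,
    contaminants: Optional[str] = None,
    adapters: Optional[str] = None,
    limits: Optional[str] = None,
    fastqc_path: str = "fastqc",
) -> List[str]:
    """Translate fastqc_single-style parameters into an argv list."""
    if threads < 1:
        raise ValueError("threads must be >= 1")

    cmd = [fastqc_path, "--outdir", output_dir, "--threads", str(threads)]

    # Optional custom configuration files are only passed when provided.
    optional_flags = (
        ("--contaminants", contaminants),
        ("--adapters", adapters),
        ("--limits", limits),
    )
    for flag, value in optional_flags:
        if value:
            cmd += [flag, value]

    cmd.append(input_file)
    return cmd
```

The resulting list can be handed to subprocess.run without shell=True, which avoids quoting problems with file names that contain spaces.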

fastqc_batch

Run FastQC on multiple files in a directory.

Parameters:

  • input_dir (required): Directory containing FASTQ/FASTA files
  • file_pattern (optional): File pattern to match (default: ".fastq")
  • threads (optional): Number of threads (default: 4)

Example:

User: "Analyze all fastq files in the data/ directory"
AI: [calls fastqc_batch] → Processes all files and returns summary statistics
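One plausible way to resolve input_dir and file_pattern into a list of files is a glob match, sketched below. Two assumptions are made here: that file_pattern is treated as a glob, and that a pattern such as "*.fastq*" is wanted so compressed files like sample.fastq.gz are also matched. The server's actual matching rules may differ, and find_sequence_files is a hypothetical name.

```python
from pathlib import Path
from typing import List


def find_sequence_files(input_dir: str, file_pattern: str = "*.fastq*") -> List[Path]:
    """Return the files in input_dir that match a glob pattern, sorted by name."""
    root = Path(input_dir)
    if not root.is_dir():
        raise NotADirectoryError(input_dir)
    # Only regular files are kept; subdirectories are not searched.
    return sorted(p for p in root.glob(file_pattern) if p.is_file())
```

Sorting keeps paired-end files such as sample01_R1 and sample01_R2 next to each other in the results.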

multiqc_report

Generate MultiQC report from FastQC results.

Parameters:

  • input_dir (required): Directory containing FastQC and other analysis results
  • title (optional): Custom title for the report
  • comment (optional): Comment to add to the report
  • template (optional): Report template (default, simple, sections, gathered)

Example:

User: "Create a summary report from all the QC results"
AI: [calls multiqc_report] → Generates interactive HTML report combining all analyses
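As with FastQC, the report parameters correspond closely to MultiQC command-line options (--title, --comment, --template, --outdir). The sketch below shows one possible mapping. The function name build_multiqc_command and the output_dir argument are assumptions, and the template check only covers the four templates listed above.

```python
from typing import List, Optional

# Templates named in the parameter list above.
ALLOWED_TEMPLATES = {"default", "simple", "sections", "gathered"}


def build_multiqc_command(
    input_dir: str,
    output_dir: str,
    title: Optional[str] = None,
    comment: Optional[str] = None,
    template: Optional[str] = None,
    multiqc_path: str = "multiqc",
) -> List[str]:
    """Translate multiqc_report-style parameters into an argv list."""
    if template is not None and template not in ALLOWED_TEMPLATES:
        raise ValueError(f"unknown template: {template}")

    cmd = [multiqc_path, input_dir, "--outdir", output_dir]
    if title:
        cmd += ["--title", title]
    if comment:
        cmd += ["--comment", comment]
    if template:
        cmd += ["--template", template]
    return cmd
```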

Queue System Tools (when queue enabled)

For large datasets or batch processing:

  • fastqc_single_async - Queue single file analysis
  • fastqc_batch_async - Queue batch analysis
  • multiqc_report_async - Queue report generation
  • get_job_status - Check job progress
  • get_job_result - Retrieve completed results
  • cancel_job - Cancel running jobs
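A client that uses the async tools typically submits a job and then polls get_job_status until the job finishes. The loop below is a generic sketch of that pattern, not the server's implementation. The get_status callable, the "status" key, and the state names are all hypothetical, since the response format of the queue API is not documented here.

```python
import time
from typing import Callable, Dict

# Hypothetical terminal states; adjust to whatever the queue API reports.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    job_id: str,
    get_status: Callable[[str], Dict],
    poll_interval: float = 2.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Poll get_status(job_id) until the job reaches a terminal state."""
    deadline = clock() + timeout
    while True:
        info = get_status(job_id)
        if info.get("status") in TERMINAL_STATES:
            return info
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(poll_interval)
```

The sleep and clock arguments exist only so the loop can be tested without real waiting. In normal use the defaults are fine.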

💡 Usage Examples

Basic Quality Assessment

User: "What's the quality of my sequencing data in sample1.fastq.gz?"

AI Response:
✅ FastQC Analysis Complete for sample1

Module Status Summary:
✅ Basic Statistics: PASS
✅ Per base sequence quality: PASS
⚠️ Per sequence quality scores: WARN
✅ Per base sequence content: PASS
✅ Per sequence GC content: PASS
✅ Per base N content: PASS
✅ Sequence Length Distribution: PASS
❌ Sequence Duplication Levels: FAIL
✅ Overrepresented sequences: PASS
✅ Adapter Content: PASS

Basic Statistics:
• Total Sequences: 2,450,000
• Sequences flagged as poor quality: 0
• Sequence length: 150
• %GC: 47

Batch Analysis

User: "Analyze all FASTQ files in my project/raw_data/ folder"

AI Response:
FastQC Batch Analysis Complete
========================================

Processed 12 files:

✅ sample01_R1.fastq.gz: 8P/2W/0F
✅ sample01_R2.fastq.gz: 8P/2W/0F
⚠️ sample02_R1.fastq.gz: 7P/2W/1F
⚠️ sample02_R2.fastq.gz: 7P/2W/1F
...

Overall Summary:
• Total PASS: 94
• Total WARN: 24
• Total FAIL: 2

Tip: Run multiqc_report on this directory to generate a combined report!

Complete Workflow

User: "I have a directory of paired-end FASTQ files. Can you run quality control and create a summary report?"

AI: I'll run a complete QC workflow on your paired-end data:

1. First, let me analyze all FASTQ files in batch:
   [runs fastqc_batch on directory]

2. Now I'll generate a MultiQC summary report:
   [runs multiqc_report on results]

✅ Complete QC workflow finished!

Summary:
- 24 FASTQ files processed (12 samples, paired-end)
- Average quality score: 32.5
- 2 samples have adapter contamination warnings
- 1 sample shows high duplication levels
- Interactive HTML report generated: multiqc_report.html

The MultiQC report provides detailed visualizations of:
- Quality score distributions across all samples
- GC content comparison
- Sequence length distributions
- Adapter content analysis
- Sample correlation analysis

🐳 Docker Usage

Build and Run

# Build the image
docker build -t bio-mcp-fastqc .

# Run with data mounting
docker run -v /path/to/data:/data bio-mcp-fastqc

Docker Compose (with Queue System)

services:
  fastqc-server:
    build: .
    volumes:
      - ./data:/data
    environment:
      - BIO_MCP_QUEUE_URL=http://queue-api:8000
    depends_on:
      - queue-api

⚙️ Configuration

Environment Variables

  • BIO_MCP_FASTQC_PATH - Path to FastQC executable (default: "fastqc")
  • BIO_MCP_MULTIQC_PATH - Path to MultiQC executable (default: "multiqc")
  • BIO_MCP_MAX_FILE_SIZE - Maximum input file size, given in bytes (default: the equivalent of 10 GB)
  • BIO_MCP_TIMEOUT - Command timeout in seconds (default: 1800)
  • BIO_MCP_TEMP_DIR - Temporary directory for processing
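The variables above could be read into a settings object along the following lines. This is an illustrative sketch rather than the server's code: FastQCSettings and load_settings are hypothetical names, the 10 GB default is assumed to mean 10 × 1024³ bytes, and the system temp directory is assumed as the fallback for BIO_MCP_TEMP_DIR because no default is documented.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class FastQCSettings:
    fastqc_path: str
    multiqc_path: str
    max_file_size: int  # bytes
    timeout: int        # seconds
    temp_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> FastQCSettings:
    """Build settings from environment variables, falling back to defaults."""
    return FastQCSettings(
        fastqc_path=env.get("BIO_MCP_FASTQC_PATH", "fastqc"),
        multiqc_path=env.get("BIO_MCP_MULTIQC_PATH", "multiqc"),
        # Assumption: "10GB" is interpreted as 10 GiB.
        max_file_size=int(env.get("BIO_MCP_MAX_FILE_SIZE", str(10 * 1024**3))),
        timeout=int(env.get("BIO_MCP_TIMEOUT", "1800")),
        # Assumption: fall back to the system temp directory.
        temp_dir=env.get("BIO_MCP_TEMP_DIR", tempfile.gettempdir()),
    )
```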

Queue System Integration

To enable async processing for large datasets:

from src.server_with_queue import FastQCServerWithQueue

server = FastQCServerWithQueue(queue_url="http://localhost:8000")

📊 Output Files

FastQC generates several output files:

  • HTML Report (*_fastqc.html) - Interactive quality report
  • Data File (fastqc_data.txt) - Raw metrics and statistics
  • Summary File (summary.txt) - Pass/warn/fail status for each module
  • Plots - Various quality plots and charts

MultiQC combines these into:

  • MultiQC Report (multiqc_report.html) - Combined interactive report
  • Data Directory (multiqc_data/) - Processed data and statistics
  • General Stats (multiqc_general_stats.txt) - Summary table
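Of these files, summary.txt is the easiest to process programmatically: each line holds a status, a module name, and the input file name, separated by tabs. The sketch below parses those lines and produces a compact tally in the same "8P/2W/0F" style used in the batch example earlier. The function names parse_summary and tally are illustrative and not part of the server's API.

```python
from collections import Counter
from typing import Dict, Iterable


def parse_summary(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FastQC module name to its PASS/WARN/FAIL status."""
    statuses: Dict[str, str] = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        status, module = parts[0], parts[1]
        statuses[module] = status
    return statuses


def tally(statuses: Dict[str, str]) -> str:
    """Summarize statuses as '<pass>P/<warn>W/<fail>F'."""
    counts = Counter(statuses.values())
    return f"{counts['PASS']}P/{counts['WARN']}W/{counts['FAIL']}F"
```

A typical call would open the summary.txt inside an unzipped FastQC result directory and pass the file object straight to parse_summary.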

🔍 Quality Metrics Explained

FastQC analyzes multiple quality aspects:

Key Modules

  • Per base sequence quality - Quality scores across read positions
  • Per sequence quality scores - Distribution of mean quality scores
  • Per base sequence content - A/T/G/C content across positions
  • Per sequence GC content - GC% distribution across reads compared with a modelled normal distribution
  • Sequence duplication levels - Proportion of duplicated sequences, which can indicate PCR over-amplification
  • Adapter content - Contaminating adapter sequences

Status Interpretation

  • ✅ PASS - Analysis indicates no problems
  • ⚠️ WARN - Slightly unusual, may not be problematic
  • ❌ FAIL - Likely problematic, requires attention

🧬 Integration with Bio-MCP Ecosystem

The FastQC server can be combined with other Bio-MCP tools, for example in a preprocessing pipeline:

User: "Run the complete preprocessing pipeline on my samples"

AI Workflow:
1. fastqc_batch → Initial quality assessment
2. trimmomatic → Trim low-quality bases and adapters  
3. fastqc_batch → Post-trimming QC
4. multiqc_report → Combined before/after report

🤝 Contributing

We welcome contributions! See the Bio-MCP contributing guide.

Development Setup

git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e ".[dev]"
pytest

📄 License

MIT License - see LICENSE file.

🙏 Acknowledgments

  • FastQC by Simon Andrews at Babraham Bioinformatics
  • MultiQC by Phil Ewels and the MultiQC community
  • Bio-MCP project and contributors

Part of the Bio-MCP ecosystem - Making bioinformatics accessible to AI assistants.

For more tools: Bio-MCP Organization
