STRING-MCP

A Model Context Protocol server that provides tools for interacting with the STRING database to analyze protein-protein interaction networks and functional enrichment. It enables users to map protein identifiers, retrieve interaction data, and generate biological network visualizations through natural language interfaces.

STRING-MCP

A comprehensive Python package for interacting with the STRING database API through a Model Context Protocol (MCP) bridge.

Installation

Install the package in development mode:

pip install -e .

Or install from PyPI (when available):

pip install string-mcp

Claude config

{
  "mcpServers": {
    "string-mcp": {
      "command": "/path/to/python/env/bin/string-mcp-server",
      "env": {}
    }
  }
}

Usage

MCP Server (Primary Use Case)

The package provides an MCP server for integration with MCP-compatible clients:

# Run the MCP server
string-mcp-server

The MCP server provides the following tools:

  • map_identifiers: Map protein identifiers to STRING IDs
  • get_network_interactions: Get network interactions data
  • get_functional_enrichment: Perform functional enrichment analysis
  • get_network_image: Generate network visualization images
  • get_version_info: Get STRING database version information
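As a rough illustration of how an MCP client would invoke one of these tools, the sketch below builds a JSON-RPC 2.0 request for the protocol's `tools/call` method. The helper `build_tool_call` is hypothetical, and the argument names (`identifiers`, `species`) are assumptions modeled on the Python API; the server's actual tool schema may differ.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names are assumptions, not taken from the server's tool schema.
request = build_tool_call(
    1, "map_identifiers", {"identifiers": ["TP53", "BRCA1"], "species": 9606}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library constructs and sends this message for you; the sketch only shows the shape of the payload.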

Command Line Interface

The package also provides a string-mcp command for standalone usage:

# Run demo
string-mcp demo

# Get help
string-mcp --help

# Map protein identifiers
string-mcp map TP53 BRCA1 EGFR --species 9606

# Get network interactions
string-mcp network TP53 BRCA1 --species 9606

# Generate network image
string-mcp image TP53 BRCA1 --output network.png --species 9606

Python API

from stringmcp.main import StringDBBridge

# Initialize the bridge
bridge = StringDBBridge()

# Map protein identifiers
proteins = ["TP53", "BRCA1", "EGFR"]
mapped = bridge.map_identifiers(proteins, species=9606)  # 9606 = human

# Get network interactions
interactions = bridge.get_network_interactions(proteins, species=9606)

# Perform functional enrichment
enrichment = bridge.get_functional_enrichment(proteins, species=9606)

Features

  • Protein Identifier Mapping: Convert various protein identifiers to STRING IDs
  • Network Analysis: Retrieve protein-protein interaction networks
  • Functional Enrichment: Perform gene ontology and pathway enrichment analysis
  • Network Visualization: Generate network images in various formats
  • Interaction Partners: Find all interaction partners for proteins
  • Functional Annotations: Get detailed functional annotations
  • Protein Similarity: Calculate similarity scores between proteins
  • PPI Enrichment: Test for protein-protein interaction enrichment
  • MCP Integration: Full Model Context Protocol server implementation

API Methods

Core Methods

  • map_identifiers(): Map protein identifiers to STRING IDs
  • get_network_interactions(): Get network interaction data
  • get_network_image(): Generate network visualization images
  • get_interaction_partners(): Find all interaction partners
  • get_functional_enrichment(): Perform enrichment analysis
  • get_functional_annotation(): Get functional annotations
  • get_protein_similarity(): Calculate similarity scores
  • get_ppi_enrichment(): Test for PPI enrichment
  • get_version_info(): Get STRING database version

Configuration

The package uses a StringConfig class for configuration:

from stringmcp.main import StringConfig, StringDBBridge

config = StringConfig(
    base_url="https://string-db.org/api",
    version_url="https://version-12-0.string-db.org/api",
    caller_identity="my_app",
    request_delay=1.0  # Delay between requests in seconds
)

bridge = StringDBBridge(config)
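To make the effect of request_delay concrete, here is a minimal sketch of a throttle that keeps consecutive calls at least a fixed number of seconds apart. The `Throttle` class is a hypothetical illustration, not the package's implementation, which may enforce the delay differently.

```python
import time


class Throttle:
    """Keep consecutive calls at least `delay_seconds` apart (hypothetical helper)."""

    def __init__(self, delay_seconds, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Sleep for whatever part of the delay has not yet elapsed; return time slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.delay - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept


throttle = Throttle(0.01)  # short delay so the demo finishes quickly
throttle.wait()            # first call never sleeps
throttle.wait()            # second call sleeps for the rest of the delay
```

The clock and sleep functions are injectable so the behavior can be checked without real waiting.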

Output Formats

The package supports multiple output formats:

  • JSON: Structured data (default)
  • TSV: Tab-separated values
  • XML: XML format
  • IMAGE: Network visualization images
  • SVG: Scalable vector graphics
  • PSI_MI: PSI-MI format

Species Support

The package supports all species available in STRING. Common species IDs:

  • Human: 9606
  • Mouse: 10090
  • Rat: 10116
  • Yeast: 4932
  • E. coli: 511145
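If you prefer to refer to species by name in your own scripts, a small lookup like the one below can translate names to the taxonomy IDs listed above. `SPECIES_IDS` and `resolve_species` are hypothetical helpers, not part of the package.

```python
# Taxonomy IDs copied from the list above.
SPECIES_IDS = {
    "human": 9606,
    "mouse": 10090,
    "rat": 10116,
    "yeast": 4932,
    "e. coli": 511145,
}


def resolve_species(value):
    """Accept a taxonomy ID (int or digit string) or a common name; return the ID."""
    if isinstance(value, int):
        return value
    text = str(value).strip()
    if text.isdigit():
        return int(text)
    try:
        return SPECIES_IDS[text.lower()]
    except KeyError:
        raise ValueError(f"Unknown species name: {value!r}") from None
```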

MCP Server Configuration

To use the MCP server with an MCP client, configure it as follows:

{
  "mcpServers": {
    "string-mcp": {
      "command": "string-mcp-server",
      "env": {}
    }
  }
}

The server will automatically handle:

  • JSON-RPC communication
  • Tool discovery and invocation
  • Error handling and reporting
  • Base64 encoding for image data
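Because image data arrives Base64-encoded, a client has to decode it before saving or displaying it. The following sketch shows that decoding step with a stand-in payload; how the server wraps the encoded string in its response is not specified here, so `decode_image` and `looks_like_png` are hypothetical helpers that assume you have already extracted the string.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_image(payload_b64):
    """Decode a Base64 string into raw bytes, rejecting invalid characters."""
    return base64.b64decode(payload_b64, validate=True)


def looks_like_png(data):
    """Check whether the decoded bytes start with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


# Stand-in payload: the PNG signature plus filler bytes, not a real image.
sample = base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
image_bytes = decode_image(sample)
print(looks_like_png(image_bytes), len(image_bytes))
```

With a real response, the decoded bytes could then be written to a file opened in binary mode.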

Development

Setup Development Environment

# Install in development mode with dev dependencies
pip install -e ".[dev]"

# Format code
black stringmcp/

# Type checking
mypy stringmcp/

# Lint code
flake8 stringmcp/

Note: Test files are not currently included in this repository. To add tests, create a tests/ directory and add test files following the pytest configuration in pyproject.toml.

Project Structure

STRINGmcp/
├── pyproject.toml          # Package configuration and dependencies
├── README.md              # This file
├── LICENSE                # MIT License
├── .gitignore             # Git ignore patterns
├── stringmcp/             # Main package
│   ├── __init__.py        # Package initialization
│   └── main.py            # Core STRING API bridge and MCP server
└── string_mcp.egg-info/   # Package metadata (generated during install)
    ├── PKG-INFO           # Package information
    ├── SOURCES.txt        # Source files list
    ├── dependency_links.txt
    ├── entry_points.txt   # CLI entry points
    ├── requires.txt       # Dependencies
    └── top_level.txt      # Top-level package names

License

MIT License - see LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Run the test suite
  6. Submit a pull request

Support

For issues and questions, please use the GitHub issue tracker.

Example Usage

Complete DNA Repair Protein Analysis

This example demonstrates the comprehensive functionality of the STRING-DB MCP bridge by analyzing a set of well-known human DNA repair proteins: TP53, BRCA1, BRCA2, ATM, and ATR.

2. Protein Identifier Mapping

Map gene symbols to STRING identifiers:

[
  {
    "queryIndex": 0,
    "queryItem": "TP53",
    "stringId": "9606.ENSP00000269305",
    "ncbiTaxonId": 9606,
    "taxonName": "Homo sapiens",
    "preferredName": "TP53",
    "annotation": "Cellular tumor antigen p53; Acts as a tumor suppressor in many tumor types; induces growth arrest or apoptosis depending on the physiological circumstances and cell type..."
  },
  {
    "queryIndex": 1,
    "queryItem": "BRCA1",
    "stringId": "9606.ENSP00000418960",
    "ncbiTaxonId": 9606,
    "taxonName": "Homo sapiens",
    "preferredName": "BRCA1",
    "annotation": "Breast cancer type 1 susceptibility protein; E3 ubiquitin-protein ligase that specifically mediates the formation of 'Lys-6'-linked polyubiquitin chains..."
  },
  {
    "queryIndex": 2,
    "queryItem": "BRCA2",
    "stringId": "9606.ENSP00000369497",
    "ncbiTaxonId": 9606,
    "taxonName": "Homo sapiens",
    "preferredName": "BRCA2",
    "annotation": "Breast cancer type 2 susceptibility protein; Involved in double-strand break repair and/or homologous recombination..."
  },
  {
    "queryIndex": 3,
    "queryItem": "ATM",
    "stringId": "9606.ENSP00000278616",
    "ncbiTaxonId": 9606,
    "taxonName": "Homo sapiens",
    "preferredName": "ATM",
    "annotation": "Serine-protein kinase ATM; Serine/threonine protein kinase which activates checkpoint signaling upon double strand breaks..."
  },
  {
    "queryIndex": 4,
    "queryItem": "ATR",
    "stringId": "9606.ENSP00000343741",
    "ncbiTaxonId": 9606,
    "taxonName": "Homo sapiens",
    "preferredName": "ATR",
    "annotation": "Serine/threonine-protein kinase ATR; Serine/threonine protein kinase which activates checkpoint signaling upon genotoxic stresses..."
  }
]
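A common next step is to index the mapping by query term and check that every query was resolved. The sketch below uses a trimmed copy of the records above; `index_by_query` and `find_unmapped` are hypothetical helpers, and the assumption that unresolved queries are simply absent from the response should be verified against real output.

```python
# Trimmed copy of the mapping records shown above.
MAPPING = [
    {"queryItem": "TP53", "stringId": "9606.ENSP00000269305"},
    {"queryItem": "BRCA1", "stringId": "9606.ENSP00000418960"},
    {"queryItem": "BRCA2", "stringId": "9606.ENSP00000369497"},
    {"queryItem": "ATM", "stringId": "9606.ENSP00000278616"},
    {"queryItem": "ATR", "stringId": "9606.ENSP00000343741"},
]


def index_by_query(records):
    """Return a dict from the submitted query term to its STRING identifier."""
    return {r["queryItem"]: r["stringId"] for r in records}


def find_unmapped(queries, records):
    """List queries with no record (assumes unresolved queries are omitted)."""
    mapped = {r["queryItem"] for r in records}
    return [q for q in queries if q not in mapped]


string_ids = index_by_query(MAPPING)
print(string_ids["TP53"])
```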

3. Protein-Protein Interaction Network

Examine network interactions between these proteins:

[
  {
    "stringId_A": "9606.ENSP00000269305",
    "stringId_B": "9606.ENSP00000369497",
    "preferredName_A": "TP53",
    "preferredName_B": "BRCA2",
    "score": 0.995
  },
  {
    "stringId_A": "9606.ENSP00000269305",
    "stringId_B": "9606.ENSP00000343741",
    "preferredName_A": "TP53",
    "preferredName_B": "ATR",
    "score": 0.996
  },
  {
    "stringId_A": "9606.ENSP00000269305",
    "stringId_B": "9606.ENSP00000278616",
    "preferredName_A": "TP53",
    "preferredName_B": "ATM",
    "score": 0.999
  },
  {
    "stringId_A": "9606.ENSP00000269305",
    "stringId_B": "9606.ENSP00000418960",
    "preferredName_A": "TP53",
    "preferredName_B": "BRCA1",
    "score": 0.999
  },
  {
    "stringId_A": "9606.ENSP00000278616",
    "stringId_B": "9606.ENSP00000369497",
    "preferredName_A": "ATM",
    "preferredName_B": "BRCA2",
    "score": 0.995
  },
  {
    "stringId_A": "9606.ENSP00000278616",
    "stringId_B": "9606.ENSP00000418960",
    "preferredName_A": "ATM",
    "preferredName_B": "BRCA1",
    "score": 0.999
  },
  {
    "stringId_A": "9606.ENSP00000278616",
    "stringId_B": "9606.ENSP00000343741",
    "preferredName_A": "ATM",
    "preferredName_B": "ATR",
    "score": 0.999
  },
  {
    "stringId_A": "9606.ENSP00000343741",
    "stringId_B": "9606.ENSP00000369497",
    "preferredName_A": "ATR",
    "preferredName_B": "BRCA2",
    "score": 0.831
  },
  {
    "stringId_A": "9606.ENSP00000343741",
    "stringId_B": "9606.ENSP00000418960",
    "preferredName_A": "ATR",
    "preferredName_B": "BRCA1",
    "score": 0.996
  },
  {
    "stringId_A": "9606.ENSP00000369497",
    "stringId_B": "9606.ENSP00000418960",
    "preferredName_A": "BRCA2",
    "preferredName_B": "BRCA1",
    "score": 0.999
  }
]

Key Findings: All interactions show very high confidence scores (>0.8), with most exceeding 0.99, indicating these proteins form a tightly interconnected functional module.
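The key findings can be checked directly from the interaction list. This sketch copies the protein pairs and scores from the results above and computes the lowest score, the number of scores above 0.99, and whether every possible pair is present; `summarize` is a hypothetical helper.

```python
from itertools import combinations

# (protein A, protein B, combined score) copied from the results above.
EDGES = [
    ("TP53", "BRCA2", 0.995),
    ("TP53", "ATR", 0.996),
    ("TP53", "ATM", 0.999),
    ("TP53", "BRCA1", 0.999),
    ("ATM", "BRCA2", 0.995),
    ("ATM", "BRCA1", 0.999),
    ("ATM", "ATR", 0.999),
    ("ATR", "BRCA2", 0.831),
    ("ATR", "BRCA1", 0.996),
    ("BRCA2", "BRCA1", 0.999),
]


def summarize(edges, threshold):
    """Report the minimum score, the count above a threshold, and full connectivity."""
    scores = [score for _, _, score in edges]
    proteins = sorted({p for a, b, _ in edges for p in (a, b)})
    observed = {frozenset((a, b)) for a, b, _ in edges}
    possible = {frozenset(pair) for pair in combinations(proteins, 2)}
    return {
        "min_score": min(scores),
        "above_threshold": sum(score > threshold for score in scores),
        "fully_connected": observed == possible,
    }


summary = summarize(EDGES, threshold=0.99)
print(summary)
```

Nine of the ten scores exceed 0.99; the ATR and BRCA2 pair (0.831) is the only one below it.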

4. Network Statistics

Check if this network is significantly enriched for interactions:

{
  "number_of_nodes": 5,
  "number_of_edges": 10,
  "average_node_degree": 4.0,
  "local_clustering_coefficient": 1.0,
  "expected_number_of_edges": 5,
  "p_value": 0.0122
}

Statistical Significance: The network shows perfect clustering (coefficient = 1.0) and is significantly enriched for interactions (p = 0.0122), with twice as many edges as expected by chance.
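The node degree and clustering figures follow from standard graph definitions, which the sketch below applies to the five-protein network in which every pair is connected. These are textbook formulas; whether STRING computes its statistics in exactly this way is an assumption, and the p-value is not reproduced here.

```python
from itertools import combinations

NODES = ["TP53", "BRCA1", "BRCA2", "ATM", "ATR"]
# All ten pairs are connected in the results above.
EDGES = {frozenset(pair) for pair in combinations(NODES, 2)}


def average_degree(n_nodes, n_edges):
    """Each edge contributes to the degree of two nodes."""
    return 2 * n_edges / n_nodes


def local_clustering(node, nodes, edges):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = [n for n in nodes if n != node and frozenset((node, n)) in edges]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if frozenset((a, b)) in edges)
    return links / (k * (k - 1) / 2)


avg_degree = average_degree(len(NODES), len(EDGES))
avg_clustering = sum(local_clustering(n, NODES, EDGES) for n in NODES) / len(NODES)
edge_ratio = len(EDGES) / 5  # observed edges over the expected number of edges (5)
print(avg_degree, avg_clustering, edge_ratio)
```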

5. Functional Enrichment Analysis

Analyze which biological pathways are enriched in this protein set:

Top DNA Repair Pathways (Selected Results):

[
  {
    "category": "Process",
    "term": "GO:0071479",
    "number_of_genes": 5,
    "preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
    "p_value": 9.72e-13,
    "fdr": 1.52e-08,
    "description": "Cellular response to ionizing radiation"
  },
  {
    "category": "Process",
    "term": "GO:0042770",
    "number_of_genes": 5,
    "preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
    "p_value": 1.69e-11,
    "fdr": 1.32e-07,
    "description": "Signal transduction in response to DNA damage"
  },
  {
    "category": "Process",
    "term": "GO:0006281",
    "number_of_genes": 5,
    "preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
    "p_value": 1.05e-08,
    "fdr": 1.10e-05,
    "description": "DNA repair"
  },
  {
    "category": "KEGG",
    "term": "hsa03440",
    "number_of_genes": 3,
    "preferredNames": ["ATM", "BRCA2", "BRCA1"],
    "p_value": 8.34e-08,
    "fdr": 2.80e-05,
    "description": "Homologous recombination"
  },
  {
    "category": "KEGG",
    "term": "hsa04115",
    "number_of_genes": 3,
    "preferredNames": ["TP53", "ATM", "ATR"],
    "p_value": 5.27e-07,
    "fdr": 5.44e-05,
    "description": "p53 signaling pathway"
  }
]

Disease Associations:

[
  {
    "category": "DISEASES",
    "term": "DOID:1612",
    "number_of_genes": 4,
    "preferredNames": ["TP53", "ATM", "BRCA2", "BRCA1"],
    "p_value": 5.72e-10,
    "fdr": 2.02e-06,
    "description": "Breast cancer"
  },
  {
    "category": "DISEASES",
    "term": "DOID:3012",
    "number_of_genes": 3,
    "preferredNames": ["TP53", "BRCA2", "BRCA1"],
    "p_value": 6.59e-10,
    "fdr": 2.02e-06,
    "description": "Li-Fraumeni syndrome"
  }
]
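Enrichment results are usually filtered by false discovery rate and ranked before interpretation. The sketch below does this for a trimmed copy of the records above; `significant_terms` and the cutoff values are illustrative assumptions, not package defaults.

```python
# Term, description, and FDR copied from the enrichment results above.
RECORDS = [
    {"term": "GO:0071479", "description": "Cellular response to ionizing radiation", "fdr": 1.52e-08},
    {"term": "GO:0042770", "description": "Signal transduction in response to DNA damage", "fdr": 1.32e-07},
    {"term": "GO:0006281", "description": "DNA repair", "fdr": 1.10e-05},
    {"term": "hsa03440", "description": "Homologous recombination", "fdr": 2.80e-05},
    {"term": "hsa04115", "description": "p53 signaling pathway", "fdr": 5.44e-05},
    {"term": "DOID:1612", "description": "Breast cancer", "fdr": 2.02e-06},
    {"term": "DOID:3012", "description": "Li-Fraumeni syndrome", "fdr": 2.02e-06},
]


def significant_terms(records, fdr_cutoff):
    """Return records with FDR below the cutoff, most significant first."""
    kept = [r for r in records if r["fdr"] < fdr_cutoff]
    return sorted(kept, key=lambda r: r["fdr"])


for record in significant_terms(RECORDS, fdr_cutoff=1e-5):
    print(f'{record["term"]}\t{record["fdr"]:.2e}\t{record["description"]}')
```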

6. Network Visualization

The package can generate protein interaction network visualizations showing evidence-based functional associations.

Example network visualization: DNA Repair Protein Network. It shows the protein-protein interaction network for TP53, BRCA1, BRCA2, ATM, and ATR, restricted to interactions with a score of at least 400 on STRING's 0 to 1000 scale. This is STRING's medium-confidence cutoff and corresponds to 0.4 on the 0 to 1 scale used in the interaction results above.

7. Functional Enrichment Visualization

The package can also create enrichment scatter plots showing the most significantly enriched biological processes.

Example enrichment visualization: Functional Enrichment Plot.

This visualization displays the top 10 most significantly enriched biological processes and pathways for the DNA repair protein set, showing p-values and gene counts for each enriched term.

Summary

This comprehensive analysis demonstrates that the STRING-DB MCP bridge successfully:

  • Identified all 5 DNA repair proteins with detailed annotations
  • Discovered 10 high-confidence protein interactions (all >0.8 score)
  • Revealed significant process, pathway, and disease enrichments (p-values from 9.72e-13 to 5.27e-07 in the selected results)
  • Confirmed statistical significance of the network (p = 0.0122)
  • Generated both network and enrichment visualizations

The results validate these proteins as a core DNA damage response module, with exceptionally strong enrichment for:

  • Cellular response to ionizing radiation (FDR = 1.52e-8)
  • DNA damage signaling (FDR = 1.32e-7)
  • Homologous recombination (FDR = 2.8e-5)
  • p53 signaling pathway (FDR = 5.44e-5)
  • Breast cancer associations (FDR = 2.02e-6)

This showcases the complete functionality of the STRING-DB MCP bridge for protein interaction network analysis and functional annotation.
