Unofficial Human Protein Atlas MCP Server

A comprehensive Model Context Protocol (MCP) server for accessing Human Protein Atlas data, providing information about protein expression, subcellular localization, pathology, and more.

Overview

The Human Protein Atlas MCP Server gives MCP clients access to protein data from the Human Protein Atlas (https://www.proteinatlas.org). It provides tools and resources for:

  • Protein Search and Information: Search for proteins by name, gene symbol, or description
  • Tissue Expression: Access tissue-specific expression profiles
  • Subcellular Localization: Retrieve protein localization data
  • Pathology Data: Access cancer-related protein information
  • Blood and Brain Expression: Specialized expression data for blood cells and brain regions
  • Antibody Information: Validation and staining data for antibodies
  • Batch Processing: Efficient lookup of multiple proteins
  • Advanced Search: Complex queries with multiple filters

Features

Core Capabilities

  • 🔍 Comprehensive Search: Find proteins using various identifiers and keywords
  • 🧬 Multi-Modal Data: Access expression, localization, and pathology information
  • 🩸 Specialized Atlases: Blood Atlas and Brain Atlas data integration
  • 📊 Batch Processing: Efficient handling of multiple protein queries
  • 🔬 Research-Grade Data: High-quality, peer-reviewed protein information
  • ⚡ Fast Response: Optimized for quick data retrieval

Data Types Available

  1. Basic Protein Information

    • Gene symbols and Ensembl IDs
    • Protein descriptions and classifications
    • UniProt cross-references
  2. Expression Data

    • Tissue-specific RNA expression
    • Blood cell expression profiles
    • Brain region expression data
    • Single-cell expression information
  3. Subcellular Localization

    • Protein localization patterns
    • Reliability scores
    • Immunofluorescence data
  4. Pathology Information

    • Cancer prognostic markers
    • Disease associations
    • Therapeutic targets
  5. Antibody Data

    • Antibody validation information
    • Staining patterns
    • Reliability assessments

Installation

Prerequisites

  • Node.js 18 or higher
  • npm or yarn package manager

Setup

  1. Clone or download the server code

  2. Install dependencies:

    cd proteinatlas-server
    npm install
    
  3. Build the server:

    npm run build
    
  4. The server is now ready to use!

Usage

Command Line

Run the server directly:

npm start
# or
node build/index.js

MCP Client Integration

Add to your MCP client configuration:

{
  "mcpServers": {
    "proteinatlas": {
      "command": "node",
      "args": ["/path/to/proteinatlas-server/build/index.js"]
    }
  }
}

Available Tools

Basic Search and Retrieval

search_proteins

Search Human Protein Atlas for proteins by name, gene symbol, or description.

Parameters:

  • query (required): Search query (gene name, protein name, or keyword)
  • format: Output format (json, tsv) - default: json
  • columns: Specific columns to include in results
  • maxResults: Maximum number of results (1-10000) - default: 100
  • compress: Whether to compress the response - default: false

Example:

{
  "query": "BRCA1",
  "format": "json",
  "maxResults": 10
}
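
At the protocol level, an MCP client sends tool arguments like these inside a JSON-RPC 2.0 tools/call request. The sketch below builds such a request for search_proteins. The helper name buildToolCall and the request id are illustrative assumptions, and a client library would normally construct this message for you.

```typescript
// Shape of a JSON-RPC 2.0 request that invokes an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  toolName: string,
  toolArgs: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

const request = buildToolCall(1, "search_proteins", {
  query: "BRCA1",
  format: "json",
  maxResults: 10,
});

// Over the stdio transport, each message is written as a single line of JSON.
console.log(JSON.stringify(request));
```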

get_protein_info

Get detailed information for a specific protein by gene symbol.

Parameters:

  • gene (required): Gene symbol (e.g., BRCA1, TP53)
  • format: Output format (json, tsv, xml, trig) - default: json

get_protein_by_ensembl

Get protein information using Ensembl gene ID.

Parameters:

  • ensemblId (required): Ensembl gene ID (e.g., ENSG00000139618)
  • format: Output format (json, tsv, xml, trig) - default: json
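
Because the two lookup tools take different identifiers, a caller may want to route on the identifier's shape. The sketch below assumes human Ensembl gene IDs of the form ENSG plus 11 digits, without a version suffix. The helper names isEnsemblGeneId and lookupCall are hypothetical and not part of the server.

```typescript
// Human Ensembl gene IDs are "ENSG" followed by 11 digits, e.g. ENSG00000139618.
const ENSEMBL_GENE_ID = /^ENSG\d{11}$/;

// Hypothetical helper: true when the string looks like a human Ensembl gene ID.
function isEnsemblGeneId(value: string): boolean {
  return ENSEMBL_GENE_ID.test(value.trim());
}

// Hypothetical helper: picks the tool that matches the kind of identifier supplied.
function lookupCall(identifier: string): { tool: string; args: Record<string, string> } {
  const id = identifier.trim();
  return isEnsemblGeneId(id)
    ? { tool: "get_protein_by_ensembl", args: { ensemblId: id } }
    : { tool: "get_protein_info", args: { gene: id } };
}

console.log(lookupCall("ENSG00000139618")); // routed to get_protein_by_ensembl
console.log(lookupCall("TP53")); // routed to get_protein_info
```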

Expression Analysis

get_tissue_expression

Get tissue-specific expression data for a protein.

Parameters:

  • gene (required): Gene symbol
  • format: Output format (json, tsv) - default: json

search_by_tissue

Find proteins expressed in a specific tissue, optionally filtered by expression level.

Parameters:

  • tissue (required): Tissue name (e.g., liver, brain, heart)
  • expressionLevel: Expression level filter (high, medium, low, not detected)
  • format: Output format (json, tsv) - default: json
  • maxResults: Maximum number of results (1-10000) - default: 100

get_blood_expression

Get blood cell expression data for a protein.

get_brain_expression

Get brain region expression data for a protein.

Subcellular Localization

get_subcellular_location

Get subcellular localization data for a protein.

search_by_subcellular_location

Find proteins localized to specific subcellular compartments.

Parameters:

  • location (required): Subcellular location (e.g., nucleus, mitochondria, cytosol)
  • reliability: Reliability filter (approved, enhanced, supported, uncertain)
  • format: Output format (json, tsv) - default: json
  • maxResults: Maximum number of results (1-10000) - default: 100

Pathology and Cancer

get_pathology_data

Get cancer and pathology data for a protein.

search_cancer_markers

Find proteins associated with specific cancers or with prognostic value.

Parameters:

  • cancer: Cancer type (e.g., breast cancer, lung cancer)
  • prognostic: Prognostic filter (favorable, unfavorable)
  • format: Output format (json, tsv) - default: json
  • maxResults: Maximum number of results (1-10000) - default: 100

Advanced Features

advanced_search

Perform advanced search with multiple filters and criteria.

Parameters:

  • query: Base search query
  • tissueSpecific: Tissue-specific expression filter
  • subcellularLocation: Subcellular localization filter
  • cancerPrognostic: Cancer prognostic filter
  • proteinClass: Protein class filter
  • chromosome: Chromosome filter
  • antibodyReliability: Antibody reliability filter
  • format: Output format (json, tsv) - default: json
  • columns: Specific columns to include in results
  • maxResults: Maximum number of results (1-10000) - default: 100
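
With this many optional filters, it can help to normalize arguments before calling the tool. The following is a minimal sketch, not the server's own validation. The field types (plain strings for filters, a string array for columns) are assumptions, since the parameter list above does not specify them, and cleanAdvancedSearchArgs is a hypothetical helper.

```typescript
type OutputFormat = "json" | "tsv";

// Filters documented for advanced_search; every field is optional.
interface AdvancedSearchArgs {
  query?: string;
  tissueSpecific?: string;
  subcellularLocation?: string;
  cancerPrognostic?: string;
  proteinClass?: string;
  chromosome?: string;
  antibodyReliability?: string;
  format?: OutputFormat;
  columns?: string[]; // assumed to be a list of column names
  maxResults?: number;
}

// Hypothetical helper: drops unset filters and clamps maxResults to the documented 1-10000 range.
function cleanAdvancedSearchArgs(input: AdvancedSearchArgs): AdvancedSearchArgs {
  const cleaned: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value !== undefined && value !== "") cleaned[key] = value;
  }
  const max = cleaned["maxResults"];
  if (typeof max === "number" && Number.isFinite(max)) {
    cleaned["maxResults"] = Math.min(10000, Math.max(1, Math.trunc(max)));
  }
  return cleaned as AdvancedSearchArgs;
}

const searchArgs = cleanAdvancedSearchArgs({
  query: "kinase",
  subcellularLocation: "nucleus",
  chromosome: undefined,
  maxResults: 50000,
});
console.log(searchArgs); // chromosome is omitted and maxResults is clamped to 10000
```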

batch_protein_lookup

Look up multiple proteins simultaneously.

Parameters:

  • genes (required): Array of gene symbols (max 100)
  • format: Output format (json, tsv) - default: json
  • columns: Specific columns to include in results

compare_expression_profiles

Compare expression profiles between multiple proteins.

Parameters:

  • genes (required): Array of gene symbols to compare (2-10)
  • expressionType: Type of expression data (tissue, brain, blood, single_cell) - default: tissue
  • format: Output format (json, tsv) - default: json
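
Since the tool accepts between 2 and 10 genes, a caller can check the list before sending it. This sketch trims and de-duplicates the symbols, then enforces the documented range. How the server itself treats duplicates is not documented here, so the de-duplication step is an assumption, and prepareComparisonGenes is a hypothetical helper.

```typescript
// Hypothetical helper: trims, de-duplicates, and bounds-checks a gene list
// before it is passed to compare_expression_profiles (2-10 genes).
function prepareComparisonGenes(genes: string[]): string[] {
  const trimmed = genes.map((g) => g.trim()).filter((g) => g.length > 0);
  const unique = Array.from(new Set(trimmed));
  if (unique.length < 2 || unique.length > 10) {
    throw new RangeError(`Expected 2-10 distinct gene symbols, got ${unique.length}`);
  }
  return unique;
}

const comparisonArgs = {
  genes: prepareComparisonGenes(["BRCA1", " BRCA2", "TP53", "BRCA1"]),
  expressionType: "tissue" as const,
  format: "json" as const,
};
console.log(comparisonArgs.genes); // ["BRCA1", "BRCA2", "TP53"]
```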

Available Resources

The server provides several resource templates for direct data access:

Resource Templates

  • hpa://protein/{gene}: Complete protein atlas data for a gene symbol
  • hpa://ensembl/{ensemblId}: Complete protein atlas data for an Ensembl gene ID
  • hpa://tissue/{gene}: Tissue-specific expression data for a gene
  • hpa://subcellular/{gene}: Subcellular localization information for a gene
  • hpa://pathology/{gene}: Cancer and pathology data for a gene
  • hpa://blood/{gene}: Blood cell expression data for a gene
  • hpa://brain/{gene}: Brain region expression data for a gene
  • hpa://antibody/{gene}: Antibody validation and staining information for a gene
  • hpa://search/{query}: Search results for proteins matching the query
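
The templates above can be filled in and parsed with a few lines of code. This is an illustrative sketch: buildResourceUri and parseResourceUri are hypothetical helpers, and percent-encoding the value (useful for multi-word search queries) is an assumption about what the server accepts.

```typescript
// Resource kinds listed in the templates above.
const RESOURCE_KINDS = [
  "protein", "ensembl", "tissue", "subcellular",
  "pathology", "blood", "brain", "antibody", "search",
] as const;
type ResourceKind = (typeof RESOURCE_KINDS)[number];

// Hypothetical helper: fills a template such as hpa://tissue/{gene}.
function buildResourceUri(kind: ResourceKind, value: string): string {
  return `hpa://${kind}/${encodeURIComponent(value.trim())}`;
}

// Hypothetical helper: splits an hpa:// URI into its kind and value, or returns null.
function parseResourceUri(uri: string): { kind: ResourceKind; value: string } | null {
  const match = /^hpa:\/\/([a-z]+)\/(.+)$/.exec(uri);
  if (!match) return null;
  const kind = RESOURCE_KINDS.find((k) => k === match[1]);
  return kind ? { kind, value: decodeURIComponent(match[2]) } : null;
}

console.log(buildResourceUri("tissue", "BRCA1")); // hpa://tissue/BRCA1
console.log(parseResourceUri("hpa://search/insulin")); // kind "search", value "insulin"
```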

Example Resource Access

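// These calls assume a connected client from the official MCP TypeScript SDK,
// whose readResource method takes an object with a uri field.

// Access tissue expression data for BRCA1
const resource = await client.readResource({ uri: "hpa://tissue/BRCA1" });

// Search for insulin-related proteins
const searchResults = await client.readResource({ uri: "hpa://search/insulin" });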

Data Sources

This server accesses data from:

  • Human Protein Atlas: Main protein atlas database
  • Tissue Atlas: Normal tissue expression data
  • Blood Atlas: Blood cell expression profiles
  • Brain Atlas: Brain region expression data
  • Pathology Atlas: Cancer-related protein data
  • Cell Atlas: Single-cell expression information

Rate Limiting and Best Practices

  • The server implements appropriate rate limiting to respect the Human Protein Atlas API
  • For batch operations, consider breaking large requests into smaller chunks
  • Use specific column selections to reduce response size when possible
  • Cache frequently accessed data when appropriate
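
As one way to follow the chunking advice, the sketch below splits a long gene list into batches that respect the 100-gene limit of batch_protein_lookup. The chunk helper and the placeholder gene symbols are illustrative only.

```typescript
// Hypothetical helper: splits a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// batch_protein_lookup accepts at most 100 gene symbols per call.
const MAX_BATCH = 100;
const geneList = Array.from({ length: 250 }, (_, i) => `GENE${i + 1}`); // placeholder symbols
const batches = chunk(geneList, MAX_BATCH);
console.log(batches.map((b) => b.length)); // [100, 100, 50]
```

Each batch can then be sent as its own batch_protein_lookup call, one after another, so that no single request exceeds the limit.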

Error Handling

The server provides comprehensive error handling:

  • Invalid Parameters: Clear error messages for incorrect input
  • Network Issues: Retry logic for transient failures
  • Data Format Errors: Graceful handling of unexpected response formats
  • Rate Limiting: Appropriate backoff strategies
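
This README does not describe how the server implements its retries, so the following is only a client-side sketch of the same idea: exponential backoff with a capped delay. The names backoffDelay and withRetry, and the default timings, are assumptions chosen for illustration.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: retries an async operation whenever it throws,
// waiting longer between attempts, and rethrows the last error if all attempts fail.
async function withRetry<T>(operation: () => Promise<T>, maxAttempts = 4): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}

console.log([0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n))); // [500, 1000, 2000, 4000, 8000, 8000]
```

A tool call could be wrapped as withRetry(() => callTool("get_protein_info", { gene: "TP53" })), where callTool stands for whatever function your client uses to invoke tools. Retrying on every error is a simplification; a production client would likely retry only on transient failures.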

Examples

Basic Protein Lookup

// Search for BRCA1 protein
const result = await callTool("search_proteins", {
  query: "BRCA1",
  format: "json",
});

Tissue Expression Analysis

// Get tissue expression for multiple genes
const comparison = await callTool("compare_expression_profiles", {
  genes: ["BRCA1", "BRCA2", "TP53"],
  expressionType: "tissue",
});

Cancer Research

// Find breast cancer prognostic markers
const markers = await callTool("search_cancer_markers", {
  cancer: "breast cancer",
  prognostic: "unfavorable",
  maxResults: 50,
});

Batch Processing

// Look up multiple proteins at once
const batchResult = await callTool("batch_protein_lookup", {
  genes: ["BRCA1", "BRCA2", "TP53", "EGFR", "MYC"],
  format: "json",
});

Development

Building from Source

# Install dependencies
npm install

# Build the project
npm run build

# Run in development mode
npm run dev

Testing

# Run the server
npm start

# Test with MCP client or direct stdio communication

Contributing

Contributions are welcome! Please ensure:

  1. Code follows TypeScript best practices
  2. Error handling is comprehensive
  3. Documentation is updated for new features
  4. Tests are included for new functionality

License

MIT License - see LICENSE file for details.

Support

For issues and questions:

  1. Check the Human Protein Atlas documentation: https://www.proteinatlas.org/about/help
  2. Review the MCP specification: https://modelcontextprotocol.io/
  3. Submit issues via the project repository

Acknowledgments

  • Human Protein Atlas team for providing the comprehensive protein database
  • Model Context Protocol community for the standardized communication framework
  • TypeScript and Node.js communities for the development tools

This server provides programmatic access to Human Protein Atlas data for research and educational purposes. Please cite appropriate sources when using this data in publications.

Citation

If you use this project in your research or publications, please cite it as follows:

author = {Moudather Chelbi},
title = {Human Protein Atlas MCP Server},
year = {2025},
howpublished = {https://github.com/Augmented-Nature/ProteinAtlas-MCP-Server/},
note = {Accessed: 2025-06-29}
