Ensembl MCP Server

A Model Context Protocol (MCP) server that provides LLMs with access to the Ensembl genomics database. This server enables AI assistants to query genomic data, gene information, sequences, variants, and more.

Features

🧬 Gene Information: Get details about genes by ID or symbol
🔍 Gene Search: Search genes across species
🧬 Sequence Retrieval: Get DNA sequences for genomic regions
🔬 Variant Data: Query genetic variants and their annotations
📊 Transcript Info: Access transcript details and isoforms
🌍 Multi-Species: Support for all Ensembl species
🔗 Cross-References: Get external database links
Rate Limited: Built-in rate limiting to respect API limits

Installation

Prerequisites

  • Bun runtime
  • Internet connection for Ensembl API access

Setup

# Clone or create the project
git clone <your-repo> ensembl-mcp
cd ensembl-mcp

# Install dependencies
bun install

# Build the server
bun run build

# Start the server
bun run start

Available Tools

1. get_gene_info

Get detailed information about a gene by Ensembl ID or symbol.

Parameters:

  • gene_identifier (string, required): Gene ID (e.g., "ENSG00000157764") or symbol (e.g., "BRAF")
  • species (string, optional): Species name (default: "homo_sapiens")
  • include_transcripts (boolean, optional): Include transcript information

Example:

{
  "gene_identifier": "BRAF",
  "species": "homo_sapiens",
  "include_transcripts": true
}
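
As a rough illustration of how these arguments might map onto the Ensembl REST API, the sketch below builds a lookup URL. The `/lookup/id` and `/lookup/symbol` endpoints and the `expand` parameter belong to the public REST API. The helper names, the ID-detection heuristic, and the idea that `include_transcripts` corresponds to `expand=1` are assumptions made for illustration, not taken from this server's source.

```typescript
const ENSEMBL_BASE_URL = "https://rest.ensembl.org";

interface GeneInfoArgs {
  gene_identifier: string;
  species?: string;
  include_transcripts?: boolean;
}

// Heuristic (assumption): vertebrate Ensembl gene IDs look like
// ENSG00000157764 or ENSMUSG00000017167. Anything else is treated as a symbol.
function looksLikeEnsemblGeneId(value: string): boolean {
  return /^ENS[A-Z]*G\d{11}$/.test(value);
}

// Hypothetical helper: turns get_gene_info arguments into a REST URL.
function buildGeneLookupUrl(args: GeneInfoArgs): string {
  const species = args.species ?? "homo_sapiens";
  const id = encodeURIComponent(args.gene_identifier);
  const path = looksLikeEnsemblGeneId(args.gene_identifier)
    ? `/lookup/id/${id}`
    : `/lookup/symbol/${encodeURIComponent(species)}/${id}`;
  const params = ["content-type=application/json"];
  if (args.include_transcripts) params.push("expand=1");
  return `${ENSEMBL_BASE_URL}${path}?${params.join("&")}`;
}

console.log(buildGeneLookupUrl({ gene_identifier: "BRAF", include_transcripts: true }));
```

A symbol lookup needs the species in the path, while a stable ID lookup does not, which is why the two branches differ.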

2. search_genes

Search for genes by symbol.

Parameters:

  • gene_name (string, required): Gene symbol (e.g., "TP53", "BRCA1")
  • species (string, optional): Species name

3. get_sequence

Get DNA sequence for a genomic region.

Parameters:

  • region (string, required): Region like "17:7565096-7590856"
  • species (string, optional): Species name
  • format (string, optional): "json" or "fasta"
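
Validating the region string before it reaches the API can catch malformed input early. The following is a minimal sketch of such a check, assuming 1-based inclusive coordinates (the Ensembl convention). `parseRegion` and `regionLength` are hypothetical names, and the server may validate regions differently or not at all.

```typescript
interface GenomicRegion {
  chromosome: string;
  start: number;
  end: number;
}

// Parses "17:7565096-7590856" into its parts. ".." is also accepted
// as the separator between start and end.
function parseRegion(region: string): GenomicRegion {
  const match = /^([^:\s]+):(\d+)(?:-|\.\.)(\d+)$/.exec(region.trim());
  if (!match) {
    throw new Error(`Invalid region "${region}", expected chromosome:start-end`);
  }
  const [, chromosome = "", startText = "", endText = ""] = match;
  const start = Number(startText);
  const end = Number(endText);
  if (start < 1 || end < start) {
    throw new Error(`Invalid coordinates in region "${region}"`);
  }
  return { chromosome, start, end };
}

// Number of bases covered, counting both endpoints.
function regionLength(region: GenomicRegion): number {
  return region.end - region.start + 1;
}

const parsed = parseRegion("17:7565096-7590856");
console.log(parsed.chromosome, regionLength(parsed));
```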

4. get_variants_in_region

Get genetic variants in a genomic region.

Parameters:

  • region (string, required): Genomic region
  • species (string, optional): Species name
  • consequence_type (string, optional): Filter by consequence
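
The README does not say whether `consequence_type` is forwarded to the API or applied after the response arrives. The sketch below shows the second option as one possibility. The `RegionVariant` shape is a simplified assumption about the variant records returned, and the sample IDs are placeholders rather than real variants.

```typescript
// Simplified, assumed shape of one variant record in a region response.
interface RegionVariant {
  id: string;
  start: number;
  end: number;
  consequence_type: string;
}

// Keeps only variants whose consequence matches, ignoring case.
// With no filter supplied, the input is returned unchanged.
function filterByConsequence(
  variants: RegionVariant[],
  consequenceType?: string,
): RegionVariant[] {
  if (!consequenceType) return variants;
  const wanted = consequenceType.toLowerCase();
  return variants.filter((v) => v.consequence_type.toLowerCase() === wanted);
}

// Placeholder records for illustration only.
const sampleVariants: RegionVariant[] = [
  { id: "example_variant_1", start: 7565100, end: 7565100, consequence_type: "missense_variant" },
  { id: "example_variant_2", start: 7565200, end: 7565200, consequence_type: "intron_variant" },
];

console.log(filterByConsequence(sampleVariants, "missense_variant").map((v) => v.id));
```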

5. get_variant_info

Get information about a specific variant.

Parameters:

  • variant_id (string, required): Variant ID like "rs699"

6. get_transcript_info

Get transcript details.

Parameters:

  • transcript_id (string, required): Transcript ID like "ENST00000288602"

7. list_species

Get all available species in Ensembl.

8. get_species_info

Get detailed information about a species.

Parameters:

  • species (string, required): Species name or common name

9. get_assembly_info

Get genome assembly information.

Parameters:

  • species (string, optional): Species name

10. get_gene_xrefs

Get external database references for a gene.

Parameters:

  • gene_id (string, required): Ensembl gene ID

Usage Examples

LLM Query Examples

Once connected to an MCP-enabled LLM client, you can ask questions like:

  • "What is the BRAF gene and where is it located?"
  • "Show me variants in the TP53 gene region"
  • "Get the DNA sequence for chromosome 17 from position 7565096 to 7590856"
  • "What transcripts are available for the EGFR gene?"
  • "List all available species in Ensembl"
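
Behind a question like the first one, the client would typically send a `tools/call` request over JSON-RPC. The sketch below builds such a message under the assumption that the model picks `get_gene_info`. `makeToolCall` is a hypothetical helper, and which tool and arguments the model actually chooses depends on the client.

```typescript
// Shape of an MCP tools/call request as a JSON-RPC 2.0 message.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and arguments in a request.
function makeToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = makeToolCall(1, "get_gene_info", {
  gene_identifier: "BRAF",
  species: "homo_sapiens",
});
console.log(JSON.stringify(request, null, 2));
```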

Direct API Usage

For testing or direct integration:

import { EnsemblApiClient } from "./src/utils/ensembl-api.js";

const client = new EnsemblApiClient();

// Get gene info
const gene = await client.getGeneById("ENSG00000157764", "homo_sapiens");
console.log(gene);

// Search genes
const results = await client.searchGenes({
  gene_name: "BRAF",
  species: "homo_sapiens",
});
console.log(results);

Architecture

Transport Choice: stdio

We use stdio transport because:

  • ✅ Universal compatibility with MCP clients
  • ✅ Simple process-based communication
  • ✅ No network ports or sockets needed
  • ✅ Built into the MCP SDK
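
With stdio transport, messages travel as newline-delimited JSON over the process's stdin and stdout. The MCP SDK handles this framing itself, so the sketch below is only meant to illustrate the idea. `frameMessage` and `splitMessages` are hypothetical names and not part of the SDK.

```typescript
// Serialises one message as a single line terminated by a newline.
function frameMessage(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// Splits buffered input into complete messages and returns any trailing
// partial line so the caller can prepend it to the next chunk.
function splitMessages(buffer: string): { messages: unknown[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? "";
  const messages = lines
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}

const incoming =
  frameMessage({ jsonrpc: "2.0", id: 1, method: "tools/list" }) +
  frameMessage({ jsonrpc: "2.0", id: 2, method: "ping" }) +
  '{"jsonrpc":"2.0","id":3'; // incomplete third message

const { messages, rest } = splitMessages(incoming);
console.log(messages.length, JSON.stringify(rest));
```

Because `JSON.stringify` escapes newlines inside strings, a framed message never spans more than one line.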

Rate Limiting

  • Stays under Ensembl's rate limit by capping traffic at 10 requests/second
  • Enforces a built-in 100ms minimum interval between requests
  • No API keys required (Ensembl is open access)
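
A minimum-interval limiter of this kind can be sketched in a few lines. The class below is an illustration, not the server's actual implementation. `MinIntervalLimiter` and its methods are hypothetical names, and the injectable clock exists only to make the behavior easy to test.

```typescript
class MinIntervalLimiter {
  private readonly minIntervalMs: number;
  private readonly now: () => number;
  private nextAllowed = 0; // earliest time the next request may start

  constructor(minIntervalMs = 100, now: () => number = Date.now) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Reserves the next slot and returns how long the caller must wait (ms).
  reserve(): number {
    const current = this.now();
    const start = Math.max(current, this.nextAllowed);
    this.nextAllowed = start + this.minIntervalMs;
    return start - current;
  }

  // Resolves once the reserved slot has arrived.
  async wait(): Promise<void> {
    const delay = this.reserve();
    if (delay > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage: call `await limiter.wait()` before each outgoing request.
const limiter = new MinIntervalLimiter(100);
console.log(limiter.reserve(), limiter.reserve());
```

Reserving slots up front means concurrent callers queue 100ms apart instead of all firing once the same interval has elapsed.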

Memory & State

  • Stateless design: No persistent memory needed
  • Each request is independent
  • Client-side caching can be implemented by the LLM client
  • Rate limiter maintains minimal state (last request time)

Data Sources

Ensembl REST API

  • Base URL: https://rest.ensembl.org
  • Format: JSON responses
  • Rate Limit: ~15 requests/second (we use 10/second for safety)
  • Species: 270+ genomes across all domains of life
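
Even with client-side throttling, the REST API can answer with HTTP 429 and a `Retry-After` header when a limit is exceeded. The sketch below shows one way to turn that header into a wait time. `retryDelayMs`, the `HeaderReader` interface, and the 1000ms fallback are assumptions for illustration, and the README does not say how the server handles 429 responses.

```typescript
// Minimal view of a headers object. The `headers` property of a fetch
// Response satisfies this interface.
interface HeaderReader {
  get(name: string): string | null;
}

// Returns how long to wait before retrying (ms), or null when the
// response was not rate limited.
function retryDelayMs(
  status: number,
  headers: HeaderReader,
  fallbackMs = 1000,
): number | null {
  if (status !== 429) return null;
  const raw = headers.get("Retry-After");
  const seconds = raw === null ? Number.NaN : Number.parseFloat(raw);
  return Number.isFinite(seconds) && seconds >= 0
    ? Math.ceil(seconds * 1000)
    : fallbackMs;
}

// Stub headers for illustration. Real code would pass response.headers.
const stubHeaders: HeaderReader = {
  get: (name) => (name.toLowerCase() === "retry-after" ? "1.5" : null),
};
console.log(retryDelayMs(429, stubHeaders));
```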

No Biomart Integration (Yet)

For this initial version, we're focusing on the REST API. Biomart integration could be added later for:

  • Complex queries across multiple datasets
  • Bulk data retrieval
  • Advanced filtering and analysis

Development

Scripts

bun run dev     # Development mode with auto-reload
bun run build   # Build TypeScript to dist/
bun run start   # Start the server
bun test        # Run tests (to be implemented)

Project Structure

src/
├── index.ts           # Main MCP server
├── handlers/
│   └── tools.ts       # Tool definitions and handlers
├── utils/
│   └── ensembl-api.ts # Ensembl API client
└── types/
    └── ensembl.ts     # TypeScript interfaces

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Submit a pull request

License

MIT License - see LICENSE file for details.

Ensembl Citation

If you use this tool in research, please cite Ensembl:

Ensembl 2024. Nucleic Acids Research (2024) doi:10.1093/nar/gkad1045
