Biomart MCP

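A Model Context Protocol server that interfaces with Biomart databases, allowing models to discover biological datasets, explore attributes and filters, retrieve biological data, and translate between different biological identifiers.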

Tools

list_marts

Lists all available Biomart marts (databases) from Ensembl. Biomart organizes biological data in a hierarchy: MART -> DATASET -> ATTRIBUTES/FILTERS. This function returns all available marts as a CSV string.

Returns:
    str: CSV-formatted table of all marts with their display names and descriptions.

Example:
    list_marts()
    >>> "name,display_name,description
    ENSEMBL_MART_ENSEMBL,Ensembl Genes,Gene annotation from Ensembl
    ENSEMBL_MART_MOUSE,Mouse strains,Strain-specific data for mouse
    ..."
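Because the listing tools return plain CSV text, a client can turn the result into row dictionaries with the standard library alone. The following is a minimal sketch, assuming one record per line as in the example above; `parse_csv_result` is a hypothetical helper name, and the sample string is copied from that example rather than fetched from Ensembl.

```python
import csv
import io


def parse_csv_result(csv_text: str) -> list[dict[str, str]]:
    """Turn a CSV string (header row first) into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Sample text copied from the list_marts example; not live Ensembl output.
SAMPLE_MARTS = (
    "name,display_name,description\n"
    "ENSEMBL_MART_ENSEMBL,Ensembl Genes,Gene annotation from Ensembl\n"
    "ENSEMBL_MART_MOUSE,Mouse strains,Strain-specific data for mouse\n"
)

marts = parse_csv_result(SAMPLE_MARTS)
mart_names = [row["name"] for row in marts]
print(mart_names)
```

The same helper would apply to the output of the other listing tools, since they share the header-plus-rows layout.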

list_datasets

Lists all available biomart datasets for a given mart. Each mart contains multiple datasets. This function returns all datasets available in the specified mart as a CSV string.

Args:
    mart (str): The mart identifier to list datasets from. Valid values include: ENSEMBL_MART_ENSEMBL, ENSEMBL_MART_MOUSE, ENSEMBL_MART_ONTOLOGY, ENSEMBL_MART_GENOMIC, ENSEMBL_MART_SNP, ENSEMBL_MART_FUNCGEN

Returns:
    str: CSV-formatted table of all datasets with their display names and descriptions.

Example:
    list_datasets("ENSEMBL_MART_ENSEMBL")
    >>> "name,display_name,description
    hsapiens_gene_ensembl,Human genes,Human genes (GRCh38.p13)
    mmusculus_gene_ensembl,Mouse genes,Mouse genes (GRCm39)
    ..."

list_common_attributes

Lists commonly used attributes available for a given dataset. This function returns only the most frequently used attributes (defined in COMMON_ATTRIBUTES) to avoid overwhelming the model with too many options. For a complete list, use list_all_attributes.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")

Returns:
    str: CSV-formatted table of common attributes with their display names and descriptions.

Example:
    list_common_attributes("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl")
    >>> "name,display_name,description
    ensembl_gene_id,Gene stable ID,Ensembl stable ID for the gene
    external_gene_name,Gene name,The gene name
    ..."

list_all_attributes

Lists all available attributes for a given dataset with some filtering. This function returns a filtered list of all attributes available for the specified dataset. Some less commonly used attributes (homologs, microarray probes) are filtered out to reduce the response size.

CAUTION: This function can return a large number of attributes and may be unstable for certain datasets. Consider using list_common_attributes first.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")

Returns:
    str: CSV-formatted table of all filtered attributes.

Example:
    list_all_attributes("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl")

list_filters

Lists all available filters for a given dataset. Filters are used to narrow down the results of a Biomart query. This function returns all filters that can be applied to the specified dataset.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")

Returns:
    str: CSV-formatted table of all filters with their display names and descriptions.

Example:
    list_filters("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl")
    >>> "name,description
    chromosome_name,Chromosome/scaffold name
    start,Gene start (bp)
    end,Gene end (bp)
    ..."

get_data

Queries Biomart for data using specified attributes and filters. This function performs the main data retrieval from Biomart, allowing you to query biological data by specifying which attributes to return and which filters to apply. Includes automatic retry logic for resilience.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
    attributes (list[str]): List of attributes to retrieve (e.g., ["ensembl_gene_id", "external_gene_name"])
    filters (dict[str, str]): Dictionary of filters to apply (e.g., {"chromosome_name": "1"})

Returns:
    str: CSV-formatted results of the query.

Example:
    get_data(
        "ENSEMBL_MART_ENSEMBL",
        "hsapiens_gene_ensembl",
        ["ensembl_gene_id", "external_gene_name", "chromosome_name"],
        {"chromosome_name": "X", "biotype": "protein_coding"}
    )
    >>> "ensembl_gene_id,external_gene_name,chromosome_name
    ENSG00000000003,TSPAN6,X
    ENSG00000000005,TNMD,X
    ..."
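The description mentions automatic retry logic but does not say how it is implemented. The following is one possible shape for such a wrapper, offered only as a sketch: `call_with_retry`, its parameters, the linear backoff, and the simulated `flaky_query` are all assumptions for illustration, not the server's actual code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> T:
    """Call func, retrying on any exception up to max_attempts times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Linear backoff: wait longer after each failed attempt.
            time.sleep(base_delay * attempt)
    raise AssertionError("unreachable")


# Hypothetical flaky query that fails twice before succeeding.
calls = {"count": 0}


def flaky_query() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient Biomart failure")
    return "ensembl_gene_id,external_gene_name\nENSG00000000003,TSPAN6\n"


result = call_with_retry(flaky_query, max_attempts=3)
print(calls["count"], result.splitlines()[1])
```

A real deployment would likely use a non-zero `base_delay` and catch only network-related exceptions, so that programming errors are not retried.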

get_translation

Translates a single identifier from one attribute type to another. This function allows conversion between different identifier types, such as converting a gene symbol to an Ensembl ID. Results are cached to improve performance.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
    from_attr (str): The source attribute name (e.g., "hgnc_symbol")
    to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
    target (str): The identifier value to translate (e.g., "TP53")

Returns:
    str: The translated identifier, or an error message if not found.

Example:
    get_translation("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", "TP53")
    >>> "ENSG00000141510"
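To illustrate how caching can help here, the sketch below wraps a lookup in `functools.lru_cache` so that a repeated request skips the slow path. The in-memory `_FAKE_BIOMART` table, the `cached_translation` function, and the call counter are hypothetical stand-ins; the real tool queries Biomart instead.

```python
from functools import lru_cache

# Hypothetical stand-in for a remote lookup; the real server queries Biomart.
_FAKE_BIOMART = {("hgnc_symbol", "ensembl_gene_id", "TP53"): "ENSG00000141510"}
lookup_calls = {"count": 0}


@lru_cache(maxsize=128)
def cached_translation(from_attr: str, to_attr: str, target: str) -> str:
    """Translate one identifier, counting how often the slow path runs."""
    lookup_calls["count"] += 1
    key = (from_attr, to_attr, target)
    return _FAKE_BIOMART.get(key, f"Error: {target} not found")


first = cached_translation("hgnc_symbol", "ensembl_gene_id", "TP53")
second = cached_translation("hgnc_symbol", "ensembl_gene_id", "TP53")
print(first, lookup_calls["count"], cached_translation.cache_info().hits)
```

Note that `lru_cache` requires hashable arguments, which plain strings satisfy, and that it caches error results as well as successful ones.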

batch_translate

Translates multiple identifiers in a single batch operation. This function is more efficient than multiple calls to get_translation when you need to translate many identifiers at once.

Args:
    mart (str): The mart identifier (e.g., "ENSEMBL_MART_ENSEMBL")
    dataset (str): The dataset identifier (e.g., "hsapiens_gene_ensembl")
    from_attr (str): The source attribute name (e.g., "hgnc_symbol")
    to_attr (str): The target attribute name (e.g., "ensembl_gene_id")
    targets (list[str]): List of identifier values to translate (e.g., ["TP53", "BRCA1", "BRCA2"])

Returns:
    dict: A dictionary containing:
        - translations: Dictionary mapping input IDs to translated IDs
        - not_found: List of IDs that could not be translated
        - found_count: Number of successfully translated IDs
        - not_found_count: Number of IDs that could not be translated

Example:
    batch_translate("ENSEMBL_MART_ENSEMBL", "hsapiens_gene_ensembl", "hgnc_symbol", "ensembl_gene_id", ["TP53", "BRCA1", "BRCA2"])
    >>> {"translations": {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}, "not_found": ["BRCA2"], "found_count": 2, "not_found_count": 1}
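The result structure described above can be assembled from a single source-to-target mapping. The sketch below shows one way to do that; `summarize_translations` is a hypothetical helper, and the mapping is hard-coded from the example values rather than retrieved from Biomart.

```python
def summarize_translations(mapping: dict[str, str], targets: list[str]) -> dict:
    """Build a batch_translate-style result from a source-to-target mapping."""
    # Drop duplicate inputs while keeping their original order.
    unique_targets = list(dict.fromkeys(targets))
    translations = {t: mapping[t] for t in unique_targets if t in mapping}
    not_found = [t for t in unique_targets if t not in mapping]
    return {
        "translations": translations,
        "not_found": not_found,
        "found_count": len(translations),
        "not_found_count": len(not_found),
    }


# Hypothetical mapping, as if one query had returned both attributes.
mapping = {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}
batch_result = summarize_translations(mapping, ["TP53", "BRCA1", "BRCA2"])
print(batch_result)
```

Fetching the mapping once and partitioning the inputs locally is what would make a batch call cheaper than issuing one request per identifier.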

README

Biomart MCP

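An MCP server for interacting with Biomart

The Model Context Protocol (MCP) is an open protocol, developed by Anthropic, that standardizes how applications provide context to LLMs. Here we use the MCP python-sdk to create an MCP server that interacts with Biomart through the pybiomart package.

Demonstration of biomart-mcp in action

There is a short demo video showing the MCP server running in Claude Desktop.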

Installation

Clone the repository

git clone https://github.com/jzinno/biomart-mcp.git
cd biomart-mcp

Claude Desktop

uv run --with mcp[cli] mcp install --with pybiomart biomart-mcp.py

Cursor

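Through Cursor's agent mode, other models can also make use of MCP servers, such as models from OpenAI or DeepSeek. Click the Cursor settings cog and navigate to Features -> MCP Servers -> Add new MCP Server. Set the name to biomart (or anything you prefer) and set Type to command.

Set the command to: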

uv run --with mcp[cli] --with pybiomart mcp run /your/path/to/biomart-mcp.py

Development

# Create a virtual environment
uv venv

# MacOS/Linux
source .venv/bin/activate

# Windows
.venv\Scripts\activate

uv sync  # or: uv add mcp[cli] pybiomart

# Run the server in development mode
mcp dev biomart-mcp.py

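Features

Biomart-MCP provides several tools for interacting with Biomart databases:

Mart and dataset discovery: List the available marts and datasets to explore the Biomart database structure
Attribute and filter exploration: View the common or all available attributes and filters for a specific dataset
Data retrieval: Query Biomart with specific attributes and filters to get biological data
ID translation: Convert between different biological identifiers (e.g., gene symbols to Ensembl IDs)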

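Contributing

Pull requests are welcome! A few small notes on development:

We only use @mcp.tool() here. This is deliberate, to maximize compatibility with clients that support MCP, as shown in the docs.
We use @lru_cache to cache the results of functions that are computationally expensive or that make external API calls.
We need to be careful not to overflow the model's context window. For example, you will see df.to_csv(index=False).replace("\r", "") in many places. This CSV-style return is more token-efficient than something like df.to_string(), where most of the tokens are whitespace. Also keep in mind that a large request, such as fetching all genes on a chromosome, will overflow the context window as well.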
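The token-efficiency point can be illustrated without pandas. The sketch below serializes the same rows as CSV and as a space-padded table, then compares their lengths. It uses character count as a rough proxy for token count, which is an assumption, and the padded layout only approximates what df.to_string() produces. `as_csv` and `as_padded` are hypothetical helper names.

```python
import csv
import io

# Sample rows copied from the get_data example.
rows = [
    ["ensembl_gene_id", "external_gene_name", "chromosome_name"],
    ["ENSG00000000003", "TSPAN6", "X"],
    ["ENSG00000000005", "TNMD", "X"],
]


def as_csv(table: list[list[str]]) -> str:
    """Serialize rows as CSV with plain newline line endings (no carriage returns)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


def as_padded(table: list[list[str]]) -> str:
    """Serialize rows as a right-aligned, space-padded table."""
    widths = [max(len(row[i]) for row in table) for i in range(len(table[0]))]
    lines = ["  ".join(cell.rjust(w) for cell, w in zip(row, widths)) for row in table]
    return "\n".join(lines) + "\n"


csv_text = as_csv(rows)
padded_text = as_padded(rows)
print(len(csv_text), len(padded_text))
```

The gap widens as columns mix long headers with short values, because every padded cell is stretched to the width of the longest entry in its column.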

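Potential Future Features

More functionality could certainly be added, some of it possibly beyond the scope of the name biomart-mcp. Here are some ideas:

Add web scraping of resource sites with bs4. For example, once we have the Ensembl gene ID for NOTCH1, it may be useful in some cases to scrape the curated "Comments and Description Text from UniProtKB" section from its page on UCSC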
...
