Academic MCP Server
Enables AI assistants to search across multiple academic databases (PubMed, arXiv, bioRxiv, medRxiv, Semantic Scholar) through a unified interface. Supports advanced filtering, metadata retrieval, PDF downloads, and comprehensive research workflows with citation analysis.
🔍 A unified Model Context Protocol (MCP) server that provides AI assistants access to multiple academic databases through a single, consistent interface.
🌟 Features
Supported Databases
- PubMed 🏥 - Biomedical and life sciences literature (NCBI)
- bioRxiv 🧬 - Biology preprints
- medRxiv 💊 - Medical preprints
- arXiv 🔬 - Physics, mathematics, computer science, and more
- Semantic Scholar 🤖 - AI-powered academic search across disciplines
- Sci-Hub 📚 - Comprehensive academic paper access and download
Core Capabilities
- ✅ Unified Search: Search across all databases with a single query
- ✅ Advanced Filtering: Filter by title, author, date, journal, and more
- ✅ Metadata Access: Retrieve detailed paper information
- ✅ PDF Download: Download open access papers when available
- ✅ Deep Analysis: Generate comprehensive paper analysis prompts
- ✅ Local PDF Analysis: Support for both local and online PDF file analysis
- ✅ Citation Network Analysis: Analyze paper citation relationships and impact
- ✅ Complete Research Workflow: One-click retrieve→analyze→read→summarize
- ✅ Standardized Output: Consistent data format across all sources
🚀 Quick Start
Prerequisites
- Python 3.10+
- MCP library
- Internet connection
Installation
✅ Already Installed! Your Academic MCP Server is fully configured and ready to use.
If you need to set it up on another machine:
- Clone or download this repository:
cd Academic-MCP-Server
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment:
  - Windows:
  venv\Scripts\activate
  - Mac/Linux:
  source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
Note: All PubMed functionality is integrated locally. No external dependencies required!
Configuration for Cursor
This project provides TWO MCP servers with complementary features:
- academic - Basic search, metadata retrieval, and PDF downloads across 6 databases (PubMed, bioRxiv, medRxiv, arXiv, Semantic Scholar, Sci-Hub)
- academic-research - Advanced features including citation analysis, paper impact evaluation, local PDF analysis, and complete research workflows
Add this configuration to your MCP settings file (~/.cursor/mcp.json or C:\Users\YOUR_USERNAME\.cursor\mcp.json):
Windows:
{
  "mcpServers": {
    "academic": {
      "command": "C:\\Users\\YOUR_USERNAME\\path\\to\\Academic-MCP-Server\\venv\\Scripts\\python.exe",
      "args": [
        "C:\\Users\\YOUR_USERNAME\\path\\to\\Academic-MCP-Server\\academic_server.py"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    },
    "academic-research": {
      "command": "C:\\Users\\YOUR_USERNAME\\path\\to\\Academic-MCP-Server\\venv\\Scripts\\python.exe",
      "args": [
        "C:\\Users\\YOUR_USERNAME\\path\\to\\Academic-MCP-Server\\academic_research_advanced.py"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    }
  }
}
Mac/Linux:
{
  "mcpServers": {
    "academic": {
      "command": "/path/to/Academic-MCP-Server/venv/bin/python",
      "args": [
        "/path/to/Academic-MCP-Server/academic_server.py"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    },
    "academic-research": {
      "command": "/path/to/Academic-MCP-Server/venv/bin/python",
      "args": [
        "/path/to/Academic-MCP-Server/academic_research_advanced.py"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    }
  }
}
Note: Replace YOUR_USERNAME and path/to with your actual paths.
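If you would rather generate the file than edit paths by hand, the following is a minimal sketch that builds the same two-server configuration from a checkout directory. The helper name build_mcp_config and its windows flag are illustrative assumptions, not part of this project; the script names and venv layout are the ones shown above.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_mcp_config(repo_dir: str, windows: bool = False) -> dict:
    """Build the two-server configuration shown above for a given checkout."""
    # Pure paths let us format Windows paths on any OS (hypothetical helper).
    repo = PureWindowsPath(repo_dir) if windows else PurePosixPath(repo_dir)
    # The venv interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if windows:
        python = repo / "venv" / "Scripts" / "python.exe"
    else:
        python = repo / "venv" / "bin" / "python"

    def entry(script: str) -> dict:
        return {
            "command": str(python),
            "args": [str(repo / script)],
            "env": {},
            "disabled": False,
            "autoApprove": [],
        }

    return {
        "mcpServers": {
            "academic": entry("academic_server.py"),
            "academic-research": entry("academic_research_advanced.py"),
        }
    }


if __name__ == "__main__":
    # json.dumps takes care of escaping backslashes in Windows paths.
    print(json.dumps(build_mcp_config("/path/to/Academic-MCP-Server"), indent=2))
```

Paste the printed JSON into your mcp.json, merging it with any servers already configured there.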
📖 Usage
Search Papers
Search across all databases:
search_papers(
    keywords="UCAR-T",
    source="all",
    num_results=15
)
Search specific database:
search_papers(
    keywords="machine learning",
    source="arxiv",
    num_results=10
)
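To give a feel for what source="all" implies, here is a rough sketch of fanning one query out to several sources in parallel. It is not the server's actual implementation: search_all, fake_arxiv, and broken_source are hypothetical names, and the per-source functions are stand-ins for real database clients.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical signature for a per-database search function.
SearchFn = Callable[[str, int], List[dict]]


def search_all(keywords: str, num_results: int,
               sources: Dict[str, SearchFn]) -> Dict[str, List[dict]]:
    """Query every source in parallel; a failing source yields an empty list."""

    def run(name: str) -> List[dict]:
        try:
            return sources[name](keywords, num_results)[:num_results]
        except Exception:
            # One unavailable database should not sink the whole search.
            return []

    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(run, name) for name in sources}
        return {name: future.result() for name, future in futures.items()}


def fake_arxiv(keywords: str, n: int) -> List[dict]:
    return [{"id": f"arxiv-{i}", "title": f"{keywords} {i}", "source": "arxiv"}
            for i in range(n)]


def broken_source(keywords: str, n: int) -> List[dict]:
    raise TimeoutError("simulated outage")


results = search_all("machine learning", 2,
                     {"arxiv": fake_arxiv, "pubmed": broken_source})
```

Note that num_results applies per source, so a query against all six databases can return up to six times that many papers.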
Advanced Search
search_papers_advanced(
    title="neural networks",
    author="Hinton",
    start_date="2020-01-01",
    end_date="2024-12-31",
    source="semantic_scholar",
    num_results=10
)
PubMed-specific advanced search:
search_papers_advanced(
    title="CAR-T",
    author="Wang",
    journal="Nature",
    start_date="2024/01/01",  # PubMed uses YYYY/MM/DD
    end_date="2025/12/31",
    source="pubmed",
    num_results=10
)
Get Paper Metadata
# PubMed
get_paper_metadata(identifier="40883768", source="pubmed")
# bioRxiv
get_paper_metadata(identifier="10.1101/2024.01.001", source="biorxiv")
# arXiv
get_paper_metadata(identifier="2301.00001", source="arxiv")
# Semantic Scholar (Paper ID or DOI)
get_paper_metadata(identifier="DOI:10.1038/s41586-020-1234-5", source="semantic_scholar")
Download PDF
download_paper_pdf(identifier="2301.00001", source="arxiv")
List Available Sources
list_available_sources()
# Returns: ["pubmed", "biorxiv", "medrxiv", "arxiv", "semantic_scholar", "scihub"]
Deep Paper Analysis
deep_paper_analysis(identifier="40883768", source="pubmed")
🛠 MCP Tools Reference
Server: academic (Basic Search & Retrieval)
1. search_papers
Search for papers using keywords.
Parameters:
- keywords (str): Search query
- source (str): "all", "pubmed", "biorxiv", "medrxiv", "arxiv", "semantic_scholar", or "scihub"
- num_results (int): Number of results per source (default: 10)
2. search_papers_advanced
Advanced search with multiple filters.
Parameters:
- title (str, optional): Search in titles
- author (str, optional): Author name
- journal (str, optional): Journal name
- start_date (str, optional): Start date
- end_date (str, optional): End date
- term (str, optional): General search term
- source (str): Database source
- num_results (int): Number of results
3. get_paper_metadata
Get detailed metadata for a specific paper.
Parameters:
- identifier (str): Paper ID (PMID, DOI, arXiv ID, etc.)
- source (str): Database source
4. download_paper_pdf
Download PDF for a paper.
Parameters:
- identifier (str): Paper ID
- source (str): Database source
5. list_available_sources
List all available databases.
6. deep_paper_analysis
Generate comprehensive analysis prompt.
Parameters:
- identifier (str): Paper ID
- source (str): Database source
Server: academic-research (Advanced Analysis & Research)
1. analyze_citation_network
Analyze paper's citation network.
Parameters:
- paper_id (str): Paper identifier (DOI, PMID, etc.)
- source (str): Data source (default: "semantic_scholar")
- max_depth (int): Network depth, 1-3 layers (default: 2)
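The max_depth parameter can be pictured as the cutoff of a breadth-first walk over the citation graph. The sketch below runs on a hand-made dictionary rather than live Semantic Scholar data; citation_layers and toy_graph are hypothetical names, and the real tool may traverse the network differently.

```python
from collections import deque
from typing import Dict, List


def citation_layers(graph: Dict[str, List[str]], root: str,
                    max_depth: int = 2) -> Dict[str, int]:
    """Return each reachable paper with its distance from root, up to max_depth."""
    if not 1 <= max_depth <= 3:
        raise ValueError("max_depth must be between 1 and 3")
    depth = {root: 0}
    queue = deque([root])
    while queue:
        paper = queue.popleft()
        if depth[paper] == max_depth:
            continue  # do not expand papers already at the depth limit
        for cited in graph.get(paper, []):
            if cited not in depth:  # first visit is the shortest path in BFS
                depth[cited] = depth[paper] + 1
                queue.append(cited)
    return depth


# Toy graph: each key cites the papers in its list.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
layers = citation_layers(toy_graph, "A", max_depth=2)
```

Because every extra layer can multiply the number of papers fetched, a depth of 3 on a highly cited paper can mean a large number of API calls.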
2. evaluate_paper_impact
Evaluate academic impact of a paper.
Parameters:
- paper_id (str): Paper identifier
- source (str): Data source (default: "semantic_scholar")
3. recommend_related_papers
Recommend related papers using multiple strategies.
Parameters:
- paper_id (str): Source paper identifier
- source (str): Data source (default: "semantic_scholar")
- num_recommendations (int): Number of recommendations (default: 10)
- strategy (str): "comprehensive", "citations", "similar", or "influential"
4. research_workflow_complete
⭐ Recommended Core Feature - Complete research workflow: retrieve → analyze → read → summarize
Parameters:
- topic (str): Research topic (e.g., "CRISPR gene editing")
- num_papers (int): Number of papers to retrieve (default: 5)
- include_analysis (bool): Include deep analysis (default: true)
- include_summary (bool): Include auto-summary (default: true)
5. analyze_local_paper
Comprehensively analyze local or online PDF papers.
Parameters:
- pdf_path (str): PDF file path (local or URL)
- include_figures (bool): Analyze figures (default: true)
- include_summary (bool): Generate summary (default: true)
6. list_all_figures
List all figures from a PDF paper.
Parameters:
- pdf_path (str): PDF file path (local or URL)
7. explain_specific_figure
Explain a specific figure from a PDF.
Parameters:
- pdf_path (str): PDF file path (local or URL)
- figure_number (int): Figure number (e.g., 1, 2, 3)
- provide_context (bool): Include context paragraphs (default: true)
8. extract_text_from_pdf
Extract text content from PDF (supports both local and online URLs).
Parameters:
- pdf_path (str): PDF path (local or URL)
- extract_sections (bool): Whether to extract by sections
- page_range (tuple, optional): Page range, e.g., (1, 10) for pages 1-10
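Since page_range is 1-based and inclusive while most PDF libraries index pages from zero, a conversion step is needed somewhere. A minimal sketch of that conversion, assuming out-of-range end pages are clamped to the document length (page_indices is a hypothetical helper, not a function of this server):

```python
from typing import Optional, Tuple


def page_indices(page_range: Optional[Tuple[int, int]], total_pages: int) -> range:
    """Turn a 1-based inclusive (start, end) into zero-based page indices."""
    if page_range is None:
        return range(total_pages)  # no range given: every page
    start, end = page_range
    if start < 1 or end < start:
        raise ValueError("page_range must satisfy 1 <= start <= end")
    # Clamp the end so (8, 12) on a 10-page PDF yields pages 8-10.
    return range(start - 1, min(end, total_pages))


first_ten = list(page_indices((1, 10), total_pages=25))
clamped = list(page_indices((8, 12), total_pages=10))
```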
9. batch_analyze_local_papers
Batch analyze all PDF papers in a folder (local folders only).
Parameters:
- folder_path (str): Folder path
- max_papers (int): Maximum number of papers to analyze (default: 10)
- file_pattern (str): File matching pattern (default: "*.pdf")
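How folder_path, file_pattern, and max_papers interact can be sketched with pathlib alone. This is an illustration under the assumption that files are taken in sorted name order; select_pdfs is a hypothetical helper, and the real tool may order or filter files differently.

```python
import tempfile
from pathlib import Path
from typing import List


def select_pdfs(folder_path: str, max_papers: int = 10,
                file_pattern: str = "*.pdf") -> List[Path]:
    """Pick at most max_papers files matching file_pattern, in name order."""
    folder = Path(folder_path)
    if not folder.is_dir():
        raise NotADirectoryError(folder_path)
    # Sorting makes the selection deterministic when the cap truncates it.
    matches = sorted(p for p in folder.glob(file_pattern) if p.is_file())
    return matches[:max_papers]


# Demonstration on a throwaway folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.pdf", "a.pdf", "notes.txt", "c.pdf"):
        (Path(tmp) / name).write_bytes(b"")
    picked = [p.name for p in select_pdfs(tmp, max_papers=2)]
```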
10. compare_papers
Compare multiple papers.
Parameters:
- paper_ids (list): List of paper IDs to compare (2-5 papers)
- comparison_aspects (list, optional): Comparison dimensions: "methodology", "findings", "impact", "timeline"
11. extract_key_information
Extract key information from papers.
Parameters:
- paper_id (str): Paper identifier
- source (str): Data source (default: "semantic_scholar")
- info_types (list, optional): List of information types to extract
  - "methodology": Research methods
  - "findings": Main findings
  - "limitations": Study limitations
  - "datasets": Used datasets
  - "metrics": Evaluation metrics
  - "contributions": Main contributions
12. generate_paper_summary
Automatically generate paper summaries.
Parameters:
- paper_id (str): Paper identifier
- source (str): Data source (default: "semantic_scholar")
- summary_type (str): Summary type
  - "brief": Brief summary (100-200 words)
  - "comprehensive": Comprehensive summary (500-800 words)
  - "technical": Technical details summary
  - "layman": Easy-to-understand version
13. extract_pdf_fulltext
Extract full text content from PDF.
Parameters:
- pdf_url (str): PDF file URL
- extract_sections (bool): Whether to identify and extract sections (default: true)
📊 Standardized Output Format
All search results return papers in this standardized format:
{
"id": "Unique identifier (PMID, DOI, arXiv ID, etc.)",
"title": "Paper title",
"authors": "Author names (comma-separated)",
"abstract": "Paper abstract",
"publication_date": "Publication date",
"journal": "Journal or venue name",
"url": "Link to paper",
"pdf_url": "PDF link (if available)",
"source": "Database source (pubmed/biorxiv/arxiv/etc.)"
}
Semantic Scholar results include additional fields:
- citation_count: Number of citations
- reference_count: Number of references
- fields_of_study: Research areas
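For client code that consumes these records, it can help to pin the format down as a type. The dataclass below mirrors the field names listed above; the Paper class and the to_standard helper are illustrative assumptions about how a raw record might be normalized, not code from this project.

```python
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class Paper:
    id: str
    title: str
    authors: str  # comma-separated, as in the standardized format
    abstract: str = ""
    publication_date: str = ""
    journal: str = ""
    url: str = ""
    pdf_url: str = ""
    source: str = ""
    # Extra fields that only Semantic Scholar results carry.
    citation_count: Optional[int] = None
    reference_count: Optional[int] = None
    fields_of_study: Optional[List[str]] = None


def to_standard(raw: dict, source: str) -> dict:
    """Normalize a raw record into the standardized dict (hypothetical helper)."""
    authors = raw.get("authors", "")
    if isinstance(authors, (list, tuple)):
        authors = ", ".join(str(a) for a in authors)
    known = {f.name for f in fields(Paper)}
    values = {k: v for k, v in raw.items() if k in known}  # drop unknown keys
    values.update(id=str(raw.get("id", "")),
                  title=str(raw.get("title", "")).strip(),
                  authors=authors, source=source)
    # Omit the optional extras when a source does not provide them.
    return {k: v for k, v in asdict(Paper(**values)).items() if v is not None}


record = to_standard(
    {"id": "2301.00001", "title": " Example ", "authors": ["A. Author", "B. Author"]},
    source="arxiv",
)
```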
🔧 Architecture
Dual Server Design
This project provides two complementary MCP servers:
- academic_server.py - Core search and retrieval functionality
- academic_research_advanced.py - Advanced analysis and research workflows
Project Structure
Academic-MCP-Server/
├── academic_server.py # Main MCP server (basic search)
├── academic_research_advanced.py # Advanced research server
├── adapters/ # Database adapters
│ ├── base_adapter.py # Abstract base class
│ ├── pubmed_adapter.py # PubMed wrapper
│ ├── biorxiv_adapter.py # bioRxiv/medRxiv
│ ├── arxiv_adapter.py # arXiv
│ ├── semantic_scholar_adapter.py
│ └── scihub_adapter.py # Sci-Hub
├── utils/ # Helper functions
│ ├── helpers.py # General utilities
│ └── pubmed_utils.py # PubMed-specific utilities
├── requirements.txt # Dependencies
└── README.md / README_CN.md # Documentation
Adapter Pattern
Each database is wrapped in an adapter that implements a common interface:
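As a rough sketch of what such a common interface might look like, the example below uses Python's abc module. Only search_by_keywords appears elsewhere in this README; the class name BaseAdapterSketch, the get_metadata method, and InMemoryAdapter are assumptions made for illustration, and the real BaseAdapter in adapters/base_adapter.py may define different methods.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseAdapterSketch(ABC):
    """Illustrative interface; the real BaseAdapter may differ."""

    @abstractmethod
    def search_by_keywords(self, keywords: str, num_results: int) -> List[dict]:
        """Return up to num_results papers matching keywords."""

    @abstractmethod
    def get_metadata(self, identifier: str) -> dict:  # hypothetical method name
        """Return the record for one paper."""


class InMemoryAdapter(BaseAdapterSketch):
    """Toy adapter backed by a list, to show the contract in action."""

    def __init__(self, papers: List[dict]):
        self._papers = papers

    def search_by_keywords(self, keywords: str, num_results: int) -> List[dict]:
        needle = keywords.lower()
        hits = [p for p in self._papers if needle in p["title"].lower()]
        return hits[:num_results]

    def get_metadata(self, identifier: str) -> dict:
        for paper in self._papers:
            if paper["id"] == identifier:
                return paper
        raise KeyError(identifier)


adapter = InMemoryAdapter([
    {"id": "1", "title": "CAR-T cell therapy", "source": "demo"},
    {"id": "2", "title": "Graph neural networks", "source": "demo"},
])
hits = adapter.search_by_keywords("car-t", num_results=5)
```

Declaring the methods abstract means an adapter that forgets one of them fails at instantiation rather than at the first query.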
Adding New Databases
To add a new database:
- Create a new adapter in adapters/
- Inherit from BaseAdapter
- Implement all required methods
- Register in academic_server.py
Example:
# adapters/new_database_adapter.py
from .base_adapter import BaseAdapter

class NewDatabaseAdapter(BaseAdapter):
    def search_by_keywords(self, keywords, num_results):
        # Implementation
        pass

    # ... implement other methods

# In academic_server.py
from adapters.new_database_adapter import NewDatabaseAdapter

adapters = {
    # ... existing adapters
    "new_database": NewDatabaseAdapter()
}
🎯 Use Cases
For Researchers
- Search across multiple preprint servers simultaneously
- Find papers by specific authors or topics
- Download open access papers automatically
- Generate literature review materials
- Analyze local PDF collections
- Perform comprehensive citation network analysis
- Generate automated paper summaries
For AI Assistants
- Access comprehensive academic knowledge
- Provide up-to-date research information
- Help with citation and reference management
- Analyze research trends and findings
- Process and explain figures from academic papers
- Conduct complete research workflows automatically
⚠️ Limitations & Notes
API Rate Limits
- PubMed: No API key required, but rate-limited
- bioRxiv/medRxiv: No authentication required
- arXiv: Rate-limited (1 request per 3 seconds recommended)
- Semantic Scholar: Free tier has rate limits; request an API key for higher limits at https://www.semanticscholar.org/product/api
- Sci-Hub: No authentication required; use responsibly
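If you call these services from your own scripts, a small client-side throttle helps stay within such limits. The sketch below enforces a minimum interval between calls, using the 3-second arXiv recommendation as the example value. MinIntervalLimiter is a hypothetical class, not part of this server, and the clock and sleep functions are injectable so the demonstration runs without actually waiting.

```python
import time
from typing import Callable, List, Optional


class MinIntervalLimiter:
    """Ensure at least min_interval seconds pass between consecutive calls."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay


# Demonstration with a fake clock, so nothing really sleeps.
fake_now = [0.0]
slept: List[float] = []


def fake_sleep(seconds: float) -> None:
    slept.append(seconds)
    fake_now[0] += seconds


limiter = MinIntervalLimiter(3.0, clock=lambda: fake_now[0], sleep=fake_sleep)
first = limiter.wait()    # first call never waits
fake_now[0] += 1.0        # one second of other work
second = limiter.wait()   # must wait out the remaining two seconds
```

In real use, construct the limiter with only the interval, for example MinIntervalLimiter(3.0), and call wait() before each request.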
PDF Availability
- PubMed: Only PMC open access articles
- bioRxiv/medRxiv: All articles are open access
- arXiv: All articles are open access
- Semantic Scholar: Depends on publisher policies
- Sci-Hub: Wide coverage of academic papers (use for research purposes only)
Local PDF Support
- Full text extraction: Extract complete text from local or online PDFs
- Figure analysis: List and explain figures from PDF papers
- Section parsing: Automatically identify and extract paper sections
- Batch processing: Analyze multiple PDFs in a folder simultaneously
Date Formats
- PubMed: YYYY/MM/DD
- Others: YYYY-MM-DD
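When the same date range is reused across sources, a tiny formatter avoids mixing the two conventions. A minimal sketch, where format_date and the lookup table are hypothetical names:

```python
from datetime import date

# PubMed is the only source listed above with a slash-separated format.
DATE_FORMATS = {"pubmed": "%Y/%m/%d"}
DEFAULT_FORMAT = "%Y-%m-%d"


def format_date(value: date, source: str) -> str:
    """Format a date the way the given source expects it."""
    return value.strftime(DATE_FORMATS.get(source, DEFAULT_FORMAT))


start = date(2024, 1, 1)
pubmed_start = format_date(start, "pubmed")
arxiv_start = format_date(start, "arxiv")
```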
🤝 Contributing
Contributions are welcome! Feel free to:
- Add new database adapters
- Improve existing functionality
- Fix bugs
- Enhance documentation
📄 License
This project builds upon the PubMed-MCP-Server and follows similar open-source principles.
🙏 Acknowledgments
- PubMed-MCP-Server for the original PubMed integration
- NCBI E-utilities
- bioRxiv/medRxiv API
- arXiv API
- Semantic Scholar API
- Sci-Hub MCP Server (JackKuo666/Sci-Hub-MCP-Server)
- FastMCP framework
⚠️ Disclaimer
The Sci-Hub integration is provided for research and educational purposes only. Users are responsible for complying with copyright laws and institutional policies in their jurisdiction. The authors do not endorse or encourage copyright infringement. Please support publishers and authors by obtaining papers through legitimate channels when possible.
📊 Project Statistics
- Supported Databases: 6 (PubMed, bioRxiv, medRxiv, arXiv, Semantic Scholar, Sci-Hub)
- MCP Servers: 2 (academic, academic-research)
- Basic MCP Tools: 6
- Advanced Research Tools: 15+
- Lines of Code: ~3,000
- Supported Formats: PDF, metadata, citations, full-text analysis
- PDF Support: Both local files and online URLs
🚀 Enhanced Features
Advanced Research Capabilities
- Citation Network Analysis: Understand paper relationships and impact
- Automated Summarization: Generate summaries in multiple styles
- Key Information Extraction: Extract methodology, findings, limitations
- Complete Research Workflows: One-click research from topic to summary
PDF Processing
- Local and Online Support: Process PDFs from local storage or URLs
- Figure Explanation: AI-powered figure analysis and explanation
- Section Recognition: Automatic identification of paper sections
- Batch Analysis: Process multiple papers simultaneously
Smart Search Features
- Concurrent Database Search: Search all databases simultaneously
- Intelligent Result Merging: Deduplicate and rank results
- Advanced Filtering: Multi-parameter search with date ranges
- Source-Specific Optimization: Tailored search for each database
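One plausible way to deduplicate and rank merged results is to key each paper on a normalized title and sort by citation count where available. This is only a sketch of the idea, not the server's actual merging logic; merge_results and the title-based key are assumptions.

```python
import re
from typing import Dict, Iterable, List


def _title_key(paper: dict) -> str:
    """Lowercase the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", (paper.get("title") or "").lower()).strip()


def merge_results(batches: Iterable[List[dict]]) -> List[dict]:
    """Keep the first record seen per title, then rank by citation_count."""
    merged: Dict[str, dict] = {}
    for batch in batches:
        for paper in batch:
            merged.setdefault(_title_key(paper), paper)
    # sorted() is stable, so papers without counts keep their first-seen order.
    return sorted(merged.values(),
                  key=lambda p: p.get("citation_count") or 0, reverse=True)


arxiv_hits = [
    {"title": "Deep Learning.", "source": "arxiv"},
    {"title": "Graph Nets", "source": "arxiv"},
]
s2_hits = [
    {"title": "deep  learning", "source": "semantic_scholar", "citation_count": 50},
    {"title": "Transformers", "source": "semantic_scholar", "citation_count": 10},
]
ranked = merge_results([arxiv_hits, s2_hits])
```

Title matching is a heuristic: it can merge distinct papers that share a title, so matching on DOI where both records carry one would be more reliable.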
📞 Support
For issues or questions:
- Check the documentation above
- Review error messages in logs
- Ensure all dependencies are installed
- Verify your MCP configuration
Happy researching! 📚🔬