knowledge-mcp-server

A high-performance MCP server that aggregates Wikipedia, arXiv, Context7, and DevDocs into a unified interface for AI agents.

Knowledge MCP Server

A high-performance Model Context Protocol (MCP) server that aggregates multiple knowledge sources into a unified interface for AI agents and applications.

🎯 Features

  • Multi-Source Knowledge Aggregation — Unified access to 4 powerful knowledge sources
  • Production-Ready — Built-in rate limiting, caching, and structured logging
  • High Performance — In-memory caching reduces API calls by up to 80%
  • API Compliant — Respects rate limits for all external APIs
  • Configurable — Environment-based configuration for all features
  • Type-Safe — Full TypeScript implementation with strict typing

📚 Knowledge Sources

| Source | Description | Rate Limit | Cache TTL |
| --- | --- | --- | --- |
| Context7 | Up-to-date library & framework documentation | 60 req/hour (free tier) | Configurable |
| Wikipedia | General knowledge & encyclopedic content | 70 req/s | 15 minutes |
| arXiv | Academic papers & research publications | 1 req/3s | 10-15 minutes |
| DevDocs | Developer documentation for popular libraries | 5 req/s | 15 minutes to 1 hour |

⚠️ DevDocs Disclaimer: The DevDocs integration uses an unofficial API and is not affiliated with or endorsed by DevDocs.io. It may break if the site's structure changes. We'll endeavor to update it promptly, but use in production at your own discretion.

📦 Installation

Prerequisites

  • Node.js >= 18.0.0
  • npm or yarn
  • npx (usually comes with Node.js)

Quick Start

# Clone the repository
git clone https://github.com/Maouv/knowledge-mcp-server.git
cd knowledge-mcp-server

# Install dependencies
npm install

# Build the project
npm run build

# Start the server
npm start

The server will start at http://localhost:3000/mcp by default.

⚙️ Configuration

Create a .env file in the project root (or copy from .env.example):

cp .env.example .env

Core Configuration

# Server Configuration
PORT=3000                          # Server port (default: 3000)
TRANSPORT=http                     # Transport mode: 'http' or 'stdio'

# User-Agent (Required by some APIs)
USER_AGENT=knowledge-mcp-server/2.0.0 (your-contact@example.com)

# Logging
LOG_LEVEL=info                     # Levels: error, warn, info, debug

# Rate Limiting
RATE_LIMIT_ENABLED=true            # Enable/disable rate limiting
RATE_LIMIT_WIKIPEDIA=70            # Requests per second
RATE_LIMIT_ARXIV=0.33              # 1 request per 3 seconds
RATE_LIMIT_CONTEXT7=60             # Requests per hour
RATE_LIMIT_DEVDOCS=5               # Requests per second

# Caching
CACHE_ENABLED=true                 # Enable/disable caching
CACHE_TTL=600                      # Default cache TTL in seconds

Environment Variables Reference

| Variable | Type | Default | Description |
| --- | --- | --- | --- |
| PORT | number | 3000 | HTTP server port |
| TRANSPORT | string | http | Transport mode (http or stdio) |
| USER_AGENT | string | (required) | User agent for API requests |
| LOG_LEVEL | string | info | Logging verbosity level |
| RATE_LIMIT_ENABLED | boolean | true | Enable rate limiting |
| RATE_LIMIT_WIKIPEDIA | number | 70 | Wikipedia requests per second |
| RATE_LIMIT_ARXIV | number | 0.33 | arXiv requests per second |
| RATE_LIMIT_CONTEXT7 | number | 60 | Context7 requests per hour |
| RATE_LIMIT_DEVDOCS | number | 5 | DevDocs requests per second |
| CACHE_ENABLED | boolean | true | Enable response caching |
| CACHE_TTL | number | 600 | Default cache TTL (seconds) |
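
As a rough illustration of how these variables could be read at startup, the sketch below parses an environment object into a typed config. The `ServerConfig` shape and the `loadConfig`, `readNumber`, and `readBoolean` helpers are hypothetical names, not taken from this project's source; only the variable names and defaults come from the reference above.

```typescript
// Hypothetical config loader; only the variable names and defaults come from the README.
type Env = Record<string, string | undefined>;

const LOG_LEVELS = ["error", "warn", "info", "debug"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

interface ServerConfig {
  port: number;
  transport: "http" | "stdio";
  userAgent: string;
  logLevel: LogLevel;
  rateLimitEnabled: boolean;
  rateLimits: { wikipedia: number; arxiv: number; context7: number; devdocs: number };
  cacheEnabled: boolean;
  cacheTtlSeconds: number;
}

// Parse a numeric variable, falling back to the documented default when unset.
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`${key} must be a positive number, got "${raw}"`);
  }
  return value;
}

// Assumption: only the literal string "true" (any case) enables a flag.
function readBoolean(env: Env, key: string, fallback: boolean): boolean {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  return raw.trim().toLowerCase() === "true";
}

function loadConfig(env: Env): ServerConfig {
  const userAgent = env["USER_AGENT"];
  if (!userAgent) throw new Error("USER_AGENT is required");

  const transport = env["TRANSPORT"] ?? "http";
  if (transport !== "http" && transport !== "stdio") {
    throw new Error(`TRANSPORT must be "http" or "stdio", got "${transport}"`);
  }

  const logLevel = (env["LOG_LEVEL"] ?? "info") as LogLevel;
  if (LOG_LEVELS.indexOf(logLevel) === -1) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}`);
  }

  return {
    port: readNumber(env, "PORT", 3000),
    transport,
    userAgent,
    logLevel,
    rateLimitEnabled: readBoolean(env, "RATE_LIMIT_ENABLED", true),
    rateLimits: {
      wikipedia: readNumber(env, "RATE_LIMIT_WIKIPEDIA", 70), // per second
      arxiv: readNumber(env, "RATE_LIMIT_ARXIV", 0.33), // per second
      context7: readNumber(env, "RATE_LIMIT_CONTEXT7", 60), // per hour
      devdocs: readNumber(env, "RATE_LIMIT_DEVDOCS", 5), // per second
    },
    cacheEnabled: readBoolean(env, "CACHE_ENABLED", true),
    cacheTtlSeconds: readNumber(env, "CACHE_TTL", 600),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Failing fast on a missing USER_AGENT is a design choice of the sketch, based on the variable being marked as required.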

🚀 Usage

HTTP Mode (Recommended)

Start the server:

npm start
# or with custom port
PORT=8080 npm start

Point your MCP client at the server:

{
  "mcpServers": {
    "knowledge": {
      "url": "http://localhost:3000/mcp"
    }
  }
}

Stdio Mode

For local development or subprocess usage:

TRANSPORT=stdio npm start

Client configuration:

{
  "mcpServers": {
    "knowledge": {
      "command": "node",
      "args": ["/path/to/knowledge-mcp-server/dist/index.js"],
      "env": { "TRANSPORT": "stdio" }
    }
  }
}

🔧 Available Tools

Wikipedia Tools

wikipedia_summary

Get a summary of a Wikipedia article by title.

Arguments:

  • title (string): Article title (e.g., "JavaScript", "Machine learning")

Example:

{
  "title": "React (software)"
}
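
For readers wondering how these arguments travel over the wire: MCP is built on JSON-RPC 2.0, and a tool invocation is a `tools/call` request whose `params` carry the tool name and its arguments. The sketch below only builds that message; the `buildToolCall` helper and the incrementing `id` scheme are illustrative assumptions, and a real client would normally complete the MCP initialization handshake first and let an SDK handle the transport.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: request ids simply count up from 1.
let nextRequestId = 1;

function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The example arguments above, wrapped in a request for wikipedia_summary.
const summaryRequest = buildToolCall("wikipedia_summary", { title: "React (software)" });
console.log(JSON.stringify(summaryRequest, null, 2));
```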

wikipedia_search

Search Wikipedia articles.

Arguments:

  • query (string): Search query
  • limit (number, optional): Number of results (1-20, default: 5)
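
Several tools in this README take an optional bounded number such as `limit`. A minimal sketch of how such an argument might be resolved is shown below; `resolveBoundedInt` is a hypothetical helper, and rejecting out-of-range values (rather than clamping them) is an assumption, since the README does not say which the server does.

```typescript
// Hypothetical helper: apply a default, then enforce an inclusive integer range.
function resolveBoundedInt(
  name: string,
  value: number | undefined,
  min: number,
  max: number,
  fallback: number,
): number {
  if (value === undefined) return fallback;
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(`${name} must be an integer between ${min} and ${max}, got ${value}`);
  }
  return value;
}

// wikipedia_search documents limit as 1-20 with a default of 5.
const defaultLimit = resolveBoundedInt("limit", undefined, 1, 20, 5);
const explicitLimit = resolveBoundedInt("limit", 12, 1, 20, 5);
```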

wikipedia_related

Get related articles linked from a Wikipedia article.

Arguments:

  • title (string): Article title

arXiv Tools

arxiv_search

Search academic papers on arXiv.

Arguments:

  • query (string): Search query
  • limit (number, optional): Number of results (1-20, default: 5)
  • category (string, optional): arXiv category filter (e.g., "cs.AI", "cs.LG")

Example:

{
  "query": "large language models",
  "category": "cs.CL",
  "limit": 10
}
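
To give a feel for what a search like this might translate to, the sketch below builds a query URL for the public arXiv API, which accepts `search_query`, `start`, and `max_results` parameters and supports field prefixes such as `all:` and `cat:`. How this server actually constructs its queries is not documented here, so `buildArxivQueryUrl` and the choice to search the query as a quoted phrase are assumptions.

```typescript
// Hypothetical helper: map arxiv_search arguments onto an arXiv API query URL.
function buildArxivQueryUrl(query: string, limit: number = 5, category?: string): string {
  // Search all fields for the query as a phrase; strip quotes so the phrase stays well-formed.
  const clauses = [`all:"${query.replace(/"/g, "")}"`];
  if (category) clauses.push(`cat:${category}`);

  const params: Array<[string, string]> = [
    ["search_query", clauses.join(" AND ")],
    ["start", "0"],
    ["max_results", String(limit)],
  ];
  const queryString = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://export.arxiv.org/api/query?${queryString}`;
}

const arxivUrl = buildArxivQueryUrl("large language models", 10, "cs.CL");
console.log(arxivUrl);
```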

arxiv_get_paper

Get full details of a specific paper.

Arguments:

  • paperId (string): arXiv paper ID or URL
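
Because `paperId` accepts either a bare ID or a URL, some normalization presumably happens before the lookup. The sketch below shows one way that could work; `normalizeArxivId` and its regular expressions are illustrative assumptions covering common arXiv ID and URL shapes, not the server's actual parsing rules.

```typescript
// New-style IDs look like 2301.00001 (optionally with a version suffix such as v2).
const NEW_STYLE_ID = /^\d{4}\.\d{4,5}(v\d+)?$/;
// Old-style IDs look like hep-th/9901001 or math.GT/0309136.
const OLD_STYLE_ID = /^[a-z-]+(\.[A-Z]{2})?\/\d{7}(v\d+)?$/;

// Hypothetical helper: accept a bare ID, an "arXiv:" prefixed ID, or an abs/pdf URL.
function normalizeArxivId(input: string): string {
  let id = input.trim();

  const urlMatch = id.match(/^https?:\/\/(?:www\.|export\.)?arxiv\.org\/(?:abs|pdf)\/(.+)$/i);
  if (urlMatch) {
    id = urlMatch[1].replace(/\.pdf$/i, ""); // PDF links may end in ".pdf"
  }
  id = id.replace(/^arxiv:/i, "");

  if (!NEW_STYLE_ID.test(id) && !OLD_STYLE_ID.test(id)) {
    throw new Error(`Unrecognized arXiv identifier: "${input}"`);
  }
  return id;
}
```

With this sketch, `https://arxiv.org/abs/2301.00001v2` resolves to `2301.00001v2`, and anything that matches neither ID pattern is rejected before a request is made.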

Context7 Tools

context7_resolve_library

Resolve a library name to its Context7-compatible ID.

Arguments:

  • libraryName (string): Library name (e.g., "react", "nextjs")

context7_get_docs

Fetch up-to-date documentation for a library.

Arguments:

  • libraryId (string): Context7 library ID (from context7_resolve_library)
  • topic (string, optional): Specific topic to focus on
  • tokens (number, optional): Max tokens to return (1000-10000, default: 5000)

DevDocs Tools

devdocs_list

List all available documentation sets.

devdocs_search

Search within a documentation set.

Arguments:

  • slug (string): Documentation slug (e.g., "react", "node")
  • query (string): Search term
  • limit (number, optional): Max results (1-30, default: 10)

devdocs_get_page

Fetch full page content.

Arguments:

  • slug (string): Documentation slug
  • path (string): Page path from search results

📊 Example Workflows

Finding React Documentation

# 1. List available docs
devdocs_list()

# 2. Search for hooks
devdocs_search(slug: "react", query: "useState")

# 3. Get the full page
devdocs_get_page(slug: "react", path: "hooks/use-state")

Researching Machine Learning Papers

# 1. Search arXiv
arxiv_search(query: "transformer architectures", category: "cs.LG", limit: 5)

# 2. Get specific paper details
arxiv_get_paper(paperId: "2301.00001")

Learning About a Concept

# 1. Get Wikipedia overview
wikipedia_summary(title: "Artificial neural network")

# 2. Find related concepts
wikipedia_related(title: "Artificial neural network")

# 3. Search for research papers
arxiv_search(query: "neural networks")

🏗️ Architecture

┌─────────────────────────────────────────────┐
│         Knowledge MCP Server                │
├─────────────────────────────────────────────┤
│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ Wikipedia│  │  arXiv   │  │ Context7 │   │
│  │  Service │  │ Service  │  │ Service  │   │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
│       │             │             │         │
│  ┌────▼─────────────▼─────────────▼──────┐  │
│  │       Rate Limiter (Bottleneck)       │  │
│  └───────────────────┬───────────────────┘  │
│                      │                      │
│  ┌───────────────────▼───────────────────┐  │
│  │       Cache Layer (node-cache)        │  │
│  └───────────────────┬───────────────────┘  │
│                      │                      │
│  ┌───────────────────▼───────────────────┐  │
│  │           Logging (Winston)           │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘

🛡️ Rate Limiting

This server implements respectful rate limiting for all external APIs to ensure compliance and reliability:

Wikipedia

  • Official Limit: ~200 requests/second
  • Server Default: 70 requests/second (safe buffer)
  • Reasoning: Conservative approach to avoid throttling

arXiv

  • Official Limit: 1 request per 3 seconds
  • Server Default: Strict 1 request per 3 seconds
  • Reasoning: arXiv has very strict limits; violations may result in IP bans

Context7

  • Free Tier Limit: 60 requests/hour
  • Server Default: 60 requests/hour
  • Paid Plans: Configurable via RATE_LIMIT_CONTEXT7 env var
  • Reasoning: Matches Upstash free tier quota

DevDocs

  • Official Limit: Not officially documented (unofficial API)
  • Server Default: 5 requests/second
  • Reasoning: Conservative to avoid service disruption

Behavior

When rate limits are reached:

  • Requests are automatically queued
  • No errors are thrown
  • Logs warn when approaching limits
  • Requests execute when capacity is available
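
The queue-instead-of-reject behavior can be pictured with a small scheduling model. This is not the server's limiter (the architecture diagram names Bottleneck for that); `minSpacingMs` and `scheduleStarts` are hypothetical helpers that only compute when each request would be allowed to start, assuming requests are spaced evenly.

```typescript
// Convert a requests-per-second setting into a minimum gap between request starts.
function minSpacingMs(requestsPerSecond: number): number {
  return Math.ceil(1000 / requestsPerSecond);
}

// Given sorted arrival times (ms), return the time at which each request would start.
// A request that arrives while the limiter is busy waits in line; none is dropped.
function scheduleStarts(arrivalsMs: number[], spacingMs: number): number[] {
  let nextFree = Number.NEGATIVE_INFINITY;
  return arrivalsMs.map((arrival) => {
    const start = Math.max(arrival, nextFree);
    nextFree = start + spacingMs;
    return start;
  });
}

// RATE_LIMIT_ARXIV=0.33 works out to one request roughly every 3 seconds.
const arxivSpacing = minSpacingMs(0.33);
// Three requests arriving at once, then one arriving after the queue has drained.
const arxivStarts = scheduleStarts([0, 0, 0, 20000], arxivSpacing);
console.log(arxivSpacing, arxivStarts);
```

The burst of three is spread out one spacing apart, while the late arrival starts immediately because capacity is already available.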

To disable rate limiting (not recommended for production):

RATE_LIMIT_ENABLED=false

🗄️ Caching Strategy

The server uses in-memory caching to reduce API calls and improve response times:

| Source | Cache TTL | Reasoning |
| --- | --- | --- |
| Wikipedia Summaries | 15 minutes | Content changes infrequently |
| arXiv Search | 10 minutes | Papers don't change after publication |
| arXiv Papers | 15 minutes | Static content |
| DevDocs List | 1 hour | Documentation index rarely changes |
| DevDocs Search | 15 minutes | Reasonable balance |
| DevDocs Pages | 30 minutes | Pages are rarely updated |
| Context7 | Configurable | Depends on use case |

Cache statistics are available programmatically:

import { getCacheStats } from './cache.js';

const stats = getCacheStats();
// { keys: 45, hits: 1234, misses: 56 }
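
To show where numbers like these could come from, here is a minimal TTL cache that counts hits and misses. It is a stand-in written for illustration, not the node-cache layer this project uses; the `TtlCache` class and its injectable clock are assumptions made so the expiry logic is easy to follow and test.

```typescript
interface CacheStats {
  keys: number;
  hits: number;
  misses: number;
}

// Hypothetical in-memory cache with per-entry expiry and hit/miss counters.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private hits = 0;
  private misses = 0;
  private defaultTtlSeconds: number;
  private now: () => number;

  constructor(defaultTtlSeconds: number, now: () => number = () => Date.now()) {
    this.defaultTtlSeconds = defaultTtlSeconds;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= this.now()) {
      if (entry !== undefined) this.entries.delete(key); // drop the expired entry
      this.misses++;
      return undefined;
    }
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V, ttlSeconds?: number): void {
    const ttl = ttlSeconds ?? this.defaultTtlSeconds;
    this.entries.set(key, { value, expiresAt: this.now() + ttl * 1000 });
  }

  stats(): CacheStats {
    return { keys: this.entries.size, hits: this.hits, misses: this.misses };
  }
}
```

A caller would check `get` first and only hit the upstream API on a miss, then store the response with the TTL that matches its source, for example `set(key, summary, 900)` for a 15-minute Wikipedia summary.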

To disable caching:

CACHE_ENABLED=false

📝 Logging

Structured logging via Winston with multiple log levels:

# Development (verbose)
LOG_LEVEL=debug npm start

# Production (standard)
LOG_LEVEL=info npm start

# Minimal
LOG_LEVEL=error npm start

All logs are written to stderr, because stdout is reserved for the MCP protocol.

Example log output:

2024-01-15 10:23:45 [info]: knowledge-mcp-server running on http://localhost:3000/mcp
2024-01-15 10:23:50 [info]: Rate limiting enabled {"wikipedia":"70 req/s","arxiv":"0.33 req/s"}
2024-01-15 10:24:01 [info]: Searching arXiv {"query":"transformers","limit":5}
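
The line format above, and the way LOG_LEVEL filters output, can be approximated without Winston. The sketch below is only an illustration: `shouldLog` and `formatLogLine` are hypothetical helpers, the level ordering covers just the four levels listed in this README, and rendering timestamps in UTC is an assumption.

```typescript
// Lower numbers are more severe; a message is emitted when it is at or above the threshold.
const LEVEL_ORDER = { error: 0, warn: 1, info: 2, debug: 3 } as const;
type Level = keyof typeof LEVEL_ORDER;

function shouldLog(level: Level, threshold: Level): boolean {
  return LEVEL_ORDER[level] <= LEVEL_ORDER[threshold];
}

function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Produces "YYYY-MM-DD HH:mm:ss [level]: message {json-metadata}".
function formatLogLine(
  date: Date,
  level: Level,
  message: string,
  meta?: Record<string, unknown>,
): string {
  const day = `${date.getUTCFullYear()}-${pad2(date.getUTCMonth() + 1)}-${pad2(date.getUTCDate())}`;
  const time = `${pad2(date.getUTCHours())}:${pad2(date.getUTCMinutes())}:${pad2(date.getUTCSeconds())}`;
  const suffix = meta && Object.keys(meta).length > 0 ? ` ${JSON.stringify(meta)}` : "";
  return `${day} ${time} [${level}]: ${message}${suffix}`;
}
```

A caller would guard each message with `shouldLog(level, configuredLevel)` and pass the formatted line to `console.error`, which writes to stderr in Node.js and so keeps stdout free.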

🔍 Health Check

Check server health:

curl http://localhost:3000/health

Response:

{
  "status": "ok",
  "server": "knowledge-mcp-server",
  "version": "2.0.0",
  "tools": ["wikipedia", "context7", "arxiv", "devdocs"]
}
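
A monitoring script could validate that response before trusting it. The sketch below is a type guard over the fields shown above; `HealthResponse` and `isHealthyResponse` are hypothetical names, and treating any `status` other than "ok" as unhealthy is an assumption.

```typescript
// Field names mirror the example response; the type itself is illustrative.
interface HealthResponse {
  status: string;
  server: string;
  version: string;
  tools: string[];
}

// Returns true only when the body has the expected shape and reports status "ok".
function isHealthyResponse(body: unknown): body is HealthResponse {
  if (typeof body !== "object" || body === null) return false;
  const record = body as Record<string, unknown>;
  const tools = record["tools"];
  return (
    record["status"] === "ok" &&
    typeof record["server"] === "string" &&
    typeof record["version"] === "string" &&
    Array.isArray(tools) &&
    tools.every((tool) => typeof tool === "string")
  );
}

const sampleBody: unknown = JSON.parse(
  '{"status":"ok","server":"knowledge-mcp-server","version":"2.0.0","tools":["wikipedia","context7","arxiv","devdocs"]}',
);
console.log(isHealthyResponse(sampleBody));
```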

🧪 Development

Build

npm run build

Development Mode

npm run dev

Project Structure

knowledge-mcp-server/
├── src/
│   ├── index.ts           # Entry point
│   ├── constants.ts       # Configuration constants
│   ├── types.ts           # TypeScript type definitions
│   ├── logger.ts          # Winston logger setup
│   ├── cache.ts           # In-memory caching layer
│   ├── rateLimiter.ts     # Rate limiting logic
│   ├── services/          # External API integrations
│   │   ├── wikipedia.ts
│   │   ├── arxiv.ts
│   │   ├── context7.ts
│   │   └── devdocs.ts
│   └── tools/             # MCP tool definitions
│       ├── wikipedia.ts
│       ├── arxiv.ts
│       ├── context7.ts
│       └── devdocs.ts
├── dist/                  # Compiled JavaScript
├── package.json
├── tsconfig.json
├── .env.example
└── README.md

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Code Style

  • TypeScript with strict mode
  • ES Modules (ESM)
  • Async/await for asynchronous operations
  • Meaningful variable and function names
  • Comprehensive error handling

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with ❤️ for the AI community
