LearnMCP Server

Extracts and summarizes learning content from YouTube videos, PDFs, and web articles to provide context for project-based learning. It features automated background processing and integrates with Forest's HTA builder for informed task generation.

A standalone MCP server that enhances Forest with learning content extraction and summarization capabilities.

Overview

LearnMCP extracts and summarizes learning content from various sources (YouTube videos, PDFs, web articles) and makes those summaries available to Forest's HTA builder for more informed task generation.

Features

  • Content Extraction: YouTube videos (with transcripts), PDF documents, web articles
  • Background Processing: Async content processing with queue management
  • Smart Summarization: Content chunking and summarization with relevance scoring
  • Forest Integration: Optional integration with Forest's HTA tree builder
  • Standalone Operation: Can be enabled/disabled independently of Forest

Architecture

User → LearnMCP Tools → LearnService → BackgroundProcessor ⇄ Extractors ⇄ Summarizer → DataPersistence
                                                                                              ↓
                                                                                    <DATA_DIR>/learn-content/
                                                                                              ↓
                                                                              Forest HTA Builder (optional)

Installation

  1. Install Dependencies:

    cd learn-mcp-server
    npm install
    
  2. Configure MCP: Add to your mcp-config.json:

    {
      "mcpServers": {
        "learn-mcp": {
          "command": "node",
          "args": ["server.js"],
          "cwd": "learn-mcp-server",
          "env": {
            "FOREST_DATA_DIR": "<same as Forest>"
          }
        }
      }
    }
    
  3. Start Server: The server starts automatically when Claude Desktop loads the MCP config.

Available Tools

add_learning_sources

Add learning sources (URLs) to a project for content extraction.

Parameters:

  • project_id (string): Project ID to add sources to
  • urls (array): Array of URLs (YouTube, PDF, articles)

Example:

{
  "project_id": "my_project",
  "urls": [
    "https://youtube.com/watch?v=example",
    "https://example.com/document.pdf",
    "https://blog.example.com/article"
  ]
}
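
The three URLs in the example correspond to the three supported content types. A minimal sketch of how a URL might be routed to an extractor follows; `classifySource` and its hostname and extension rules are hypothetical and are not taken from LearnMCP's source.

```javascript
// Hypothetical helper: guess which extractor a URL should go to.
// The hostname and file-extension rules below are illustrative assumptions.
function classifySource(rawUrl) {
  const url = new URL(rawUrl); // throws a TypeError on malformed input
  const host = url.hostname.replace(/^www\./, '');
  if (host === 'youtube.com' || host === 'm.youtube.com' || host === 'youtu.be') {
    return 'youtube';
  }
  if (url.pathname.toLowerCase().endsWith('.pdf')) {
    return 'pdf';
  }
  return 'article'; // everything else is treated as a web article
}
```

A PDF served from a URL without a `.pdf` extension would be misclassified by this rule; checking the response's Content-Type header would be more robust.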

process_learning_sources

Start background processing of pending learning sources.

Parameters:

  • project_id (string): Project ID to process sources for

list_learning_sources

List learning sources for a project, optionally filtered by status.

Parameters:

  • project_id (string): Project ID
  • status (string, optional): Filter by status (pending, processing, completed, failed)
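
The optional status filter can be pictured as below. This is only a sketch: it assumes each source is an object with a `status` field, and `filterByStatus` is a hypothetical name.

```javascript
// The four statuses named in the parameter description.
const STATUSES = ['pending', 'processing', 'completed', 'failed'];

// Hypothetical filter: `sources` is assumed to be an array of objects
// that each carry a `status` field.
function filterByStatus(sources, status) {
  if (status === undefined) return sources; // no filter requested
  if (!STATUSES.includes(status)) {
    throw new RangeError(`Unknown status: ${status}`);
  }
  return sources.filter((source) => source.status === status);
}
```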

get_learning_summary

Get the learning content summary for a project or for a specific source.

Parameters:

  • project_id (string): Project ID
  • source_id (string, optional): Specific source ID; if omitted, an aggregated summary is returned
  • token_limit (number, optional): Maximum tokens for aggregated summary (default: 2000)
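
One way an aggregated summary could respect token_limit is sketched below. The four-characters-per-token estimate, the `relevance` field, and both function names are assumptions made for illustration; LearnMCP's actual aggregation may work differently.

```javascript
// Rough token estimate: about 4 characters per token. This is a common
// heuristic for English text, not an exact tokenizer count.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical aggregation: take summaries in order of descending
// relevance and keep each one that still fits within the budget.
function aggregateSummaries(summaries, tokenLimit = 2000) {
  const ranked = [...summaries].sort((a, b) => b.relevance - a.relevance);
  const parts = [];
  let used = 0;
  for (const { text } of ranked) {
    const cost = estimateTokens(text);
    if (used + cost > tokenLimit) continue; // skip what does not fit
    parts.push(text);
    used += cost;
  }
  return { text: parts.join('\n\n'), tokens: used };
}
```

Skipping an oversized summary instead of stopping at it lets smaller, lower-ranked summaries still use the remaining budget.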

delete_learning_sources

Delete learning sources and their summaries.

Parameters:

  • project_id (string): Project ID
  • source_ids (array): Array of source IDs to delete

get_processing_status

Get current processing status for learning sources.

Parameters:

  • project_id (string): Project ID

Supported Content Types

YouTube Videos

  • Extracts video metadata (title, author, duration, etc.)
  • Downloads transcripts when available
  • Falls back to the video description if no transcript is available
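
The transcript fallback can be sketched as follows. The shape of the `video` object and the name `pickVideoText` are hypothetical.

```javascript
// Hypothetical: choose the text to summarize for a video.
// `video` is assumed to have optional `transcript` and `description` strings.
function pickVideoText(video) {
  if (video.transcript && video.transcript.trim() !== '') {
    return { source: 'transcript', text: video.transcript };
  }
  // No usable transcript: fall back to the description (possibly empty).
  return { source: 'description', text: video.description || '' };
}
```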

PDF Documents

  • Extracts text content from remote PDF URLs
  • Preserves document metadata
  • Handles various PDF formats

Web Articles

  • Uses Mozilla Readability for clean content extraction
  • Extracts metadata (title, author, publish date, etc.)
  • Estimates reading time
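
A reading-time estimate is usually a word count divided by a reading speed. The sketch below assumes 200 words per minute, a figure that does not come from LearnMCP; the function name is also hypothetical.

```javascript
// Hypothetical reading-time estimate. The 200 words-per-minute default
// is an assumed average, not a value documented by LearnMCP.
function estimateReadingMinutes(text, wordsPerMinute = 200) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  // Round up, and report at least one minute even for very short text.
  return Math.max(1, Math.ceil(words / wordsPerMinute));
}
```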

Data Storage

LearnMCP stores data in <FOREST_DATA_DIR>/learn-content/:

learn-content/
├── <project_id>/
│   ├── sources.json          # Source registry
│   └── summaries/
│       ├── <source_id>.json  # Individual summaries
│       └── ...
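
The layout above maps onto file paths as in this sketch. The helper names are hypothetical; only the directory and file names come from the tree.

```javascript
const path = require('node:path');

// Hypothetical helpers that mirror the directory tree above.
function sourcesFile(dataDir, projectId) {
  // <dataDir>/learn-content/<project_id>/sources.json
  return path.join(dataDir, 'learn-content', projectId, 'sources.json');
}

function summaryFile(dataDir, projectId, sourceId) {
  // <dataDir>/learn-content/<project_id>/summaries/<source_id>.json
  return path.join(dataDir, 'learn-content', projectId, 'summaries', `${sourceId}.json`);
}
```

Here `dataDir` would be the value of FOREST_DATA_DIR.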

Forest Integration

When both LearnMCP and Forest are active, Forest's HTA builder can include learning content summaries in its task generation prompts. This happens automatically, in the following sequence:

  1. LearnMCP processes the learning sources for a project
  2. Forest builds an HTA tree for the same project
  3. The learning content summaries are injected into the HTA generation prompt
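
The injection step might look roughly like the sketch below. Forest's real prompt format is not documented here, so the section labels and the name `buildHtaPrompt` are purely illustrative.

```javascript
// Hypothetical prompt assembly. The wording of each section is an
// assumption; Forest's actual HTA prompt is not shown in this README.
function buildHtaPrompt(goal, learningSummary) {
  const sections = [`Goal: ${goal}`];
  // The integration is optional: add context only when a summary exists.
  if (learningSummary && learningSummary.trim() !== '') {
    sections.push(`Learning content context:\n${learningSummary.trim()}`);
  }
  sections.push('Build the HTA tree for this goal.');
  return sections.join('\n\n');
}
```

When no summary is available the prompt is built without the context section, which matches the optional nature of the integration.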

Workflow Examples

Basic Learning Content Workflow

  1. Add Sources:

    add_learning_sources(project_id="learn_python", urls=["https://youtube.com/watch?v=python_tutorial"])
    
  2. Process Content:

    process_learning_sources(project_id="learn_python")
    
  3. Check Status:

    get_processing_status(project_id="learn_python")
    
  4. Get Summary:

    get_learning_summary(project_id="learn_python")
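
The calls above are shorthand. An MCP client sends each one as a JSON-RPC 2.0 `tools/call` request, and the sketch below builds the four requests for this workflow. The helper name `toolCall` and the incrementing id scheme are illustrative choices, not part of LearnMCP.

```javascript
// Build an MCP `tools/call` request (JSON-RPC 2.0).
// `toolCall` and the id counter are illustrative, not part of LearnMCP.
let nextId = 1;
function toolCall(name, args) {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// The four workflow steps, in order.
const workflow = [
  toolCall('add_learning_sources', {
    project_id: 'learn_python',
    urls: ['https://youtube.com/watch?v=python_tutorial'],
  }),
  toolCall('process_learning_sources', { project_id: 'learn_python' }),
  toolCall('get_processing_status', { project_id: 'learn_python' }),
  toolCall('get_learning_summary', { project_id: 'learn_python' }),
];
```

Because processing runs in the background, a client would normally repeat the status request until no sources remain pending or processing before asking for the summary.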
    

Integrated with Forest

  1. Add and process learning sources in LearnMCP
  2. Build the HTA tree in Forest; it automatically includes the learning content context
  3. The generated tasks are informed by the processed learning materials

Configuration

Environment Variables

  • FOREST_DATA_DIR: Shared data directory with Forest (required)
  • LOG_LEVEL: Logging level (debug, info, warn, error)
  • NODE_ENV: Environment (development, production)
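
A sketch of reading and validating these variables at startup follows. The defaults for LOG_LEVEL and NODE_ENV are assumptions, since the README does not state them, and `loadConfig` is a hypothetical name.

```javascript
// Hypothetical config loader for the variables listed above.
// The 'info' and 'development' defaults are assumptions.
const LOG_LEVELS = ['debug', 'info', 'warn', 'error'];

function loadConfig(env = process.env) {
  if (!env.FOREST_DATA_DIR) {
    throw new Error('FOREST_DATA_DIR is required');
  }
  const logLevel = env.LOG_LEVEL || 'info';
  if (!LOG_LEVELS.includes(logLevel)) {
    throw new Error(`Invalid LOG_LEVEL: ${logLevel}`);
  }
  return {
    dataDir: env.FOREST_DATA_DIR,
    logLevel,
    nodeEnv: env.NODE_ENV || 'development',
  };
}
```

Failing fast on a missing FOREST_DATA_DIR avoids writing summaries to a location Forest will never read.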

Background Processor Settings

  • Max Queue Size: 50 tasks
  • Max Concurrent: 2 simultaneous extractions
  • Processing Interval: 3 seconds
  • Retry Attempts: 3 per source
  • Timeout: 5 minutes per extraction

Error Handling

  • Graceful Degradation: Failed extractions don't block other sources
  • Retry Logic: Automatic retries with exponential backoff
  • Comprehensive Logging: Detailed logs for debugging
  • Status Tracking: Clear status indicators for each source
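
Retrying with exponential backoff can be sketched as below. The sketch treats the documented 3 retry attempts as 3 total tries, and the 1-second base delay and doubling factor are assumptions; `withRetry` is a hypothetical name.

```javascript
// Retry an async operation with exponential backoff.
// attempts = 3 is read here as 3 total tries (an assumption); the
// 1-second base delay and the doubling factor are also assumptions.
async function withRetry(operation, { attempts = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt; // 1s, then 2s with the defaults
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Wrapping each extraction separately keeps one failing source from blocking the others, in line with the graceful degradation described above.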

Development

Running Tests

npm test

Linting

npm run lint
npm run lint:fix

Debugging

Set LOG_LEVEL=debug for detailed logging.

Troubleshooting

Common Issues

  1. YouTube extraction fails: Check whether the video has transcripts enabled
  2. PDF extraction fails: Ensure the PDF is publicly accessible
  3. Article extraction fails: Some sites block automated access

Logs

Check logs in <FOREST_DATA_DIR>/logs/:

  • learn-mcp.log: General operations
  • learn-mcp-errors.log: Error details

License

MIT License - Same as Forest MCP Server
