
Link Scan MCP Server 🚀

A comprehensive Model Context Protocol (MCP) server for scanning and summarizing links. It automatically detects and analyzes video links (YouTube, Instagram Reels, etc.) and text links (blogs, articles, etc.) and returns concise summaries of three sentences or fewer. All features work without requiring API keys!

Python 3.11+ | MCP Compatible | License: MIT

✨ Features

🎥 Video Link Analysis

  • YouTube Support
    • Comprehensive metadata extraction (title, description)
    • Subtitle extraction for the first 7 seconds (via yt-dlp)
    • Audio transcription using OpenAI Whisper
    • Integrated summarization combining all text sources
  • Instagram Reels Support
    • Audio download and transcription (first 7 seconds)
    • Automatic content summarization
  • Smart Link Detection
    • Automatic video/text link type detection
    • Error handling for unsupported URLs
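
A minimal sketch of what such URL-based detection can look like (the patterns and function name are illustrative, not the project's actual code):

import re

VIDEO_PATTERNS = [
    r"(youtube\.com/watch|youtu\.be/)",   # YouTube watch pages and short links
    r"instagram\.com/(reel|reels)/",      # Instagram Reels
]

def detect_link_type(url: str) -> str:
    """Return 'video' for known video hosts, 'text' for everything else."""
    if any(re.search(pattern, url) for pattern in VIDEO_PATTERNS):
        return "video"
    return "text"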

📝 Text Link Analysis

  • Web Content Extraction
    • BeautifulSoup-based HTML parsing
    • Main content area detection
    • Automatic navigation/ad removal
  • Intelligent Summarization
    • Llama3-powered text summarization
    • 3-sentence limit enforcement
    • Natural Korean output

🤖 AI-Powered Summarization

  • Llama3 Integration
    • Local LLM via Ollama (no API keys required; see the sketch after this list)
    • Separate prompts for video and text content
    • Fallback to original text on errors
  • Whisper Transcription
    • High-quality speech-to-text conversion
    • Optimized for speed and accuracy
    • Supports multiple languages
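
As a rough sketch of the Llama3 integration above, assuming Ollama's standard /api/generate endpoint (the function name and prompt wiring are illustrative):

import aiohttp

OLLAMA_API_URL = "http://localhost:11434"

async def summarize(text: str, system_prompt: str, model: str = "llama3") -> str:
    """Request a summary from a local Llama3 via Ollama; fall back to the original text on errors."""
    payload = {"model": model, "prompt": f"{system_prompt}\n\n{text}", "stream": False}
    try:
        async with aiohttp.ClientSession() as session:
            async with session.post(f"{OLLAMA_API_URL}/api/generate", json=payload) as resp:
                data = await resp.json()
                return data.get("response", text)
    except aiohttp.ClientError:
        return text  # fallback to the original text, as described above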

🐳 Docker Support

  • One-Command Setup
    • Docker Compose configuration
    • Automatic Ollama service setup
    • Llama3 model auto-download
    • Development mode with hot reload

🔧 Developer-Friendly

  • Type-safe with Pydantic models
  • Async/await support for better performance
  • Comprehensive error handling
  • Extensible architecture
  • Hot reload in development mode

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/your-username/mcp-link-scan.git
cd mcp-link-scan

# Install dependencies
pip install -r requirements.txt

System Dependencies

ffmpeg (required for audio processing):

  • macOS: brew install ffmpeg
  • Ubuntu/Debian: sudo apt-get install ffmpeg
  • Windows: Download from https://ffmpeg.org/download.html

Ollama (required for summarization):

  • macOS: brew install ollama or download from https://ollama.com/download
  • Linux: curl -fsSL https://ollama.com/install.sh | sh
  • Windows: Download from https://ollama.com/download
  • After installation: ollama pull llama3:latest

Configuration

Create a .env file:

# Server settings
PORT=8000                    # Server port (default: 8000)
HOST=0.0.0.0                 # Server host (default: 0.0.0.0)
DEBUG=False                  # Debug mode (default: False)

# API path prefix (optional)
# Use when hosting multiple MCP servers on one host
# Default: /link-scan
API_PREFIX=/link-scan

# Ollama settings (optional)
# Set automatically when using Docker Compose
OLLAMA_API_URL=http://localhost:11434    # Ollama API URL (default: http://localhost:11434)
OLLAMA_MODEL=llama3:latest               # Ollama model to use (default: llama3)

Environment Variables

All variables are optional; the defaults below apply when a variable is unset.

Variable         Default                  Description
PORT             8000                     Port number the server listens on
HOST             0.0.0.0                  Host address the server binds to
DEBUG            False                    Enable debug mode (True/False)
API_PREFIX       /link-scan               Path prefix for API endpoints
OLLAMA_API_URL   http://localhost:11434   Ollama API server URL
OLLAMA_MODEL     llama3                   Name of the Ollama model to use
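
A minimal sketch of how these variables might be read at startup (the Settings class is illustrative; the project advertises Pydantic models, but plain os.getenv shows the idea):

import os

class Settings:
    """Illustrative loader for the variables listed above."""
    PORT = int(os.getenv("PORT", "8000"))
    HOST = os.getenv("HOST", "0.0.0.0")
    DEBUG = os.getenv("DEBUG", "False").lower() == "true"
    API_PREFIX = os.getenv("API_PREFIX", "/link-scan")
    OLLAMA_API_URL = os.getenv("OLLAMA_API_URL", "http://localhost:11434")
    OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3")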

Running as MCP Server

Local Mode (stdio):

python -m src.server

Remote Mode (HTTP):

python run_server.py

Or with uvicorn directly:

uvicorn src.server_http:app --host 0.0.0.0 --port 8000
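
For orientation, a stdio MCP entry point built with the official Python SDK's FastMCP helper typically looks like this minimal sketch (tool body elided; not the project's exact code):

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("link-scan")

@mcp.tool()
async def scan_text_link(url: str) -> str:
    """Scan and summarize a text link (see Available Tools below)."""
    raise NotImplementedError  # fetch, extract, and summarize here

if __name__ == "__main__":
    mcp.run()  # stdio transport by default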

Docker Setup (Recommended)

Using Docker Compose:

# Start all services (link-scan + Ollama)
docker-compose up -d

# Check logs
docker-compose logs -f

# Stop services
docker-compose down

Docker Compose automatically:

  • Sets up Ollama service with 8GB memory
  • Downloads Llama3 model
  • Configures link-scan service
  • Enables development mode with hot reload

Development Mode: The docker-compose.yml is configured for development with:

  • Source code volume mounting
  • Hot reload enabled (DEBUG=True)
  • Automatic code changes detection

Testing with MCP Inspector

You can test the server using the MCP Inspector tool:

# Test with Python
npx @modelcontextprotocol/inspector python run_server.py

# Or test stdio mode
npx @modelcontextprotocol/inspector python -m src.server

The MCP Inspector provides a web interface to:

  • View available tools and their schemas
  • Test tool execution with sample inputs
  • Debug server responses and error handling
  • Validate MCP protocol compliance

🛠️ Available Tools

1. scan_video_link

Scan and summarize video links (YouTube, Instagram Reels, etc.).

Parameters:

  • url (string, required): Video URL to scan

Example:

{
  "name": "scan_video_link",
  "arguments": {
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  }
}

Process:

  1. Detects link type (YouTube, Instagram, etc.)
  2. For YouTube: Extracts title, description, subtitles (first 7s)
  3. Downloads audio (first 7 seconds)
  4. Transcribes audio with Whisper
  5. Combines all text sources
  6. Summarizes with Llama3 (3 sentences max)
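
A condensed sketch of steps 2-6 under plausible assumptions (file paths, option values, and the function name are illustrative; the real pipeline lives in src/tools/media_handler.py):

import subprocess
import whisper
import yt_dlp

def process_youtube(url: str) -> str:
    # Steps 2-3: metadata plus best-audio download via yt-dlp
    opts = {"format": "bestaudio/best", "outtmpl": "/tmp/full_audio.%(ext)s"}
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=True)
        title = info.get("title", "")
        description = info.get("description", "")
        audio_path = ydl.prepare_filename(info)

    # Keep only the first 7 seconds (ffmpeg -t limits output duration)
    subprocess.run(["ffmpeg", "-y", "-i", audio_path, "-t", "7", "/tmp/clip.wav"],
                   check=True, capture_output=True)

    # Step 4: transcribe the clip with Whisper
    transcript = whisper.load_model("base").transcribe("/tmp/clip.wav")["text"]

    # Steps 5-6: combine all text sources; summarization (Llama3) happens afterwards
    return "\n".join(filter(None, [title, description, transcript]))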

2. scan_text_link

Scan and summarize text links (blogs, articles, etc.).

Parameters:

  • url (string, required): Text URL to scan

Example:

{
  "name": "scan_text_link",
  "arguments": {
    "url": "https://example.com/blog/article"
  }
}

Process:

  1. Fetches HTML content
  2. Extracts main text content
  3. Removes navigation, ads, and noise
  4. Summarizes with Llama3 (3 sentences max)
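
A minimal sketch of this extraction path using aiohttp and BeautifulSoup (the tag choices are illustrative heuristics, not the project's exact rules):

import aiohttp
from bs4 import BeautifulSoup

async def extract_main_text(url: str) -> str:
    """Fetch a page and return its main text with navigation and noise removed."""
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            html = await resp.text()
    soup = BeautifulSoup(html, "html.parser")
    # Drop non-content elements before extracting text
    for tag in soup(["script", "style", "nav", "header", "footer", "aside"]):
        tag.decompose()
    main = soup.find("article") or soup.find("main") or soup.body
    return " ".join(main.get_text(" ", strip=True).split())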

📊 Example Outputs

Video Link Summary

Input: YouTube video URL

Output (generated in Korean; translated here):

This video introduces the basic concepts of the Python programming language.
It explains core syntax such as variables, functions, and classes with hands-on examples.
It is organized step by step so that beginners can easily follow along.

Text Link Summary

Input: Blog article URL

Output (generated in Korean; translated here):

This article analyzes the pros and cons of Docker container technology.
Compared with virtualization, it highlights resource efficiency and ease of deployment as strengths.
It advises caution, however, with regard to security and complexity.

🏗️ Architecture

mcp-link-scan/
├── src/
│   ├── server.py              # Local server (stdio)
│   ├── server_http.py         # Remote server (HTTP)
│   ├── tools/                  # MCP tools
│   │   ├── link_scanner.py     # Main tool definitions
│   │   ├── media_handler.py    # Video processing (Whisper)
│   │   └── text_handler.py     # Text extraction
│   ├── utils/                  # Utilities
│   │   ├── link_detector.py    # Link type detection
│   │   ├── youtube_extractor.py # YouTube metadata/subtitles
│   │   └── llm_summarizer.py   # Llama3 integration
│   └── prompts/                # LLM prompts
│       └── __init__.py         # Video/text prompt templates
├── docker/
│   └── init-ollama.sh          # Ollama initialization script
├── docker-compose.yml          # Docker services
├── Dockerfile                  # Container build config
├── requirements.txt            # Python dependencies
└── run_server.py               # Server entry point

🔧 Development

Setting up Development Environment

# Clone and install
git clone https://github.com/your-username/mcp-link-scan.git
cd mcp-link-scan
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your settings

# Start Ollama (if not using Docker)
ollama serve
ollama pull llama3:latest

Development Mode with Docker

# Start in development mode (hot reload enabled)
docker-compose up -d

# View logs
docker-compose logs -f link-scan

# Code changes are automatically reloaded

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src

# Run specific test file
pytest tests/test_link_scanner.py

Customizing Prompts

Edit src/prompts/__init__.py to customize LLM prompts:

# Video summarization prompt
VIDEO_SUMMARIZE_SYSTEM = """
Your custom system prompt here...
"""

# Text summarization prompt
TEXT_SUMMARIZE_SYSTEM = """
Your custom system prompt here...
"""

Configuring Whisper Model

Edit src/tools/media_handler.py:

# Change model size (tiny, base, small, medium, large)
_whisper_model = whisper.load_model("base")  # Default: "base"
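
Because the model is loaded once and reused (see Notes), a lazy-loading pattern like this sketch avoids paying the load cost at import time (get_whisper_model is an illustrative name):

import whisper

_whisper_model = None

def get_whisper_model(size: str = "base"):
    """Load the Whisper model on first use, then reuse the cached instance."""
    global _whisper_model
    if _whisper_model is None:
        _whisper_model = whisper.load_model(size)
    return _whisper_model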

📋 Requirements

  • Python 3.11+
  • ffmpeg - Audio processing
  • Ollama - LLM runtime (for summarization)
  • yt-dlp - Video/audio download
  • openai-whisper - Speech-to-text
  • torch - PyTorch (for Whisper)
  • aiohttp - Async HTTP client
  • beautifulsoup4 - HTML parsing
  • fastapi - HTTP server framework
  • uvicorn - ASGI server
  • mcp - Model Context Protocol SDK

🌐 Deployment

PlayMCP Registration

  1. Deploy Server: Deploy to cloud hosting (Render, Railway, Fly.io, AWS, GCP, etc.)
  2. Get Server URL: Example: https://your-server.railway.app
  3. Register in PlayMCP: Use URL https://your-server.railway.app/messages

Important: Server URL must be publicly accessible and support HTTPS for production use.

Using with MCP Clients

Amazon Q CLI:

{
  "mcpServers": {
    "link-scan": {
      "command": "python",
      "args": ["run_server.py"],
      "cwd": "/path/to/mcp-link-scan"
    }
  }
}

Other MCP Clients:

{
  "mcpServers": {
    "link-scan": {
      "url": "https://your-server.com/messages"
    }
  }
}

🤝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (pytest)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

Development Workflow

# Install in development mode
pip install -e .

# Run tests
pytest

# Format code (if using formatters)
black src/ tests/
isort src/ tests/

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • yt-dlp team for the excellent YouTube extraction library
  • OpenAI Whisper team for the speech-to-text model
  • Ollama team for the local LLM runtime
  • MCP team for the Model Context Protocol specification
  • Pydantic team for the data validation library

🗺️ Roadmap

  • [ ] Batch processing for multiple links
  • [ ] Caching layer for improved performance
  • [ ] Export functionality (JSON, CSV, etc.)
  • [ ] Advanced analytics (sentiment analysis, topic extraction)
  • [ ] Support for more video platforms (TikTok, Vimeo, etc.)
  • [ ] WebSocket support for real-time updates
  • [ ] Integration examples with popular MCP clients
  • [ ] Custom prompt templates via API
  • [ ] Multi-language support for summaries
  • [ ] Video thumbnail extraction

📝 Notes

  • Audio downloads are temporarily stored and automatically cleaned up
  • Whisper model is loaded once and reused for better performance
  • Processing time depends on video length and Whisper model size
  • YouTube videos are processed for the first 7 seconds only to reduce processing time
  • All text sources (title, description, subtitles, transcription) are combined for YouTube videos
  • Summaries are limited to 3 sentences maximum
  • For production, consider using a GPU for faster Whisper transcription
  • The Ollama timeout is set to 5 minutes for tool calls
