YouTube Transcript MCP Server

Enables LLMs to extract YouTube video transcripts with timestamps, metadata, and file export.

A Model Context Protocol (MCP) server that lets Large Language Models (LLMs) extract transcripts from YouTube videos. It is built on the youtubei.js library and supports optional timestamps, transcript metadata, and export to text files.

✨ Features

  • 🎥 Extract transcripts from any YouTube video with captions
  • ⏱️ Timestamp support - Get transcripts with or without timestamps
  • 📊 Rich metadata - Word count, duration, segment count, and more
  • 💾 Export to files - Save transcripts as text files
  • 🔧 Flexible input - Accepts full URLs, short URLs, or just video IDs
  • High reliability - Uses YouTube's internal API via youtubei.js
  • 🚀 No API key required - Works out of the box
  • 🛡️ Error handling - Clear, actionable error messages

📦 Installation

As an MCP Server for Claude Desktop

# Clone the repository
git clone https://github.com/tanush-yadav/youtube-transcript-mcp.git
cd youtube-transcript-mcp

# Install dependencies
npm install

As an npm Package

npm install @tanush-yadav/youtube-transcript-mcp

Or using yarn:

yarn add @tanush-yadav/youtube-transcript-mcp

🚀 Quick Start

Configuration for Claude Desktop

Add the server to your Claude Desktop configuration:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "youtube-transcript": {
      "command": "node",
      "args": ["/absolute/path/to/youtube-transcript-mcp/index.js"]
    }
  }
}

Or, to run the published npm package through npx:

{
  "mcpServers": {
    "youtube-transcript": {
      "command": "npx",
      "args": ["@tanush-yadav/youtube-transcript-mcp"]
    }
  }
}
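
If claude_desktop_config.json already lists other servers, the new entry has to be merged into the existing "mcpServers" object rather than replacing it. The following is a minimal sketch of that merge; `addServerEntry` is a hypothetical helper, not part of this package or of Claude Desktop.

```javascript
// Hypothetical helper: returns a new config object with one server entry added,
// leaving every other key and every other entry under "mcpServers" untouched.
function addServerEntry(config, name, entry) {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  }
}

// Example input: a config that already contains another (made-up) server.
const existing = {
  mcpServers: { 'some-other-server': { command: 'node', args: ['/path/to/other.js'] } },
}

const updated = addServerEntry(existing, 'youtube-transcript', {
  command: 'npx',
  args: ['@tanush-yadav/youtube-transcript-mcp'],
})
```

In practice the config would be read with `JSON.parse`, passed through a helper like this, and written back with `JSON.stringify(updated, null, 2)`.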

🛠️ Available Tools

1. get_transcript

Extract transcript from a YouTube video with optional timestamps.

Parameters:

  • url (string, required): YouTube video URL or video ID
  • include_timestamps (boolean, optional): Include timestamps in output (default: false)

Example Request:

{
  "name": "get_transcript",
  "arguments": {
    "url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "include_timestamps": true
  }
}

Example Output with timestamps:

[00:00] We're no strangers to love
[00:04] You know the rules and so do I
[00:08] A full commitment's what I'm thinking of
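
The `[MM:SS]` prefix shown above can be produced from a segment's start time. The sketch below is illustrative only: the segment shape `{ startSeconds, text }`, the function names, and the choice to add an hours field for long videos are assumptions, not the server's actual implementation.

```javascript
// Assumed segment shape: { startSeconds: number, text: string }.
function formatTimestamp(totalSeconds) {
  const whole = Math.floor(totalSeconds)
  const hours = Math.floor(whole / 3600)
  const minutes = Math.floor((whole % 3600) / 60)
  const seconds = whole % 60
  const pad = (n) => String(n).padStart(2, '0')
  // Assumption: the hours field appears only once a video passes one hour.
  return hours > 0
    ? `[${pad(hours)}:${pad(minutes)}:${pad(seconds)}]`
    : `[${pad(minutes)}:${pad(seconds)}]`
}

// With timestamps: one line per segment. Without: a single space-joined string.
function renderTranscript(segments, includeTimestamps = false) {
  return segments
    .map((s) => (includeTimestamps ? `${formatTimestamp(s.startSeconds)} ${s.text}` : s.text))
    .join(includeTimestamps ? '\n' : ' ')
}
```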

2. get_transcript_with_metadata

Extract transcript along with comprehensive metadata.

Parameters:

  • url (string, required): YouTube video URL or video ID

Example Response:

{
  "metadata": {
    "video_id": "dQw4w9WgXcQ",
    "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "word_count": 251,
    "segment_count": 42,
    "duration": "3:32",
    "duration_seconds": 212,
    "language": "en",
    "is_auto_generated": false
  },
  "transcript": "Never gonna give you up...",
  "full_transcript_length": 1234
}
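
Several of the metadata fields above can be derived from the transcript segments alone. The following sketch shows one way to do that. It assumes a hypothetical segment shape `{ startSeconds, durationSeconds, text }` and takes the end of the last segment as the duration; the server may compute these values differently. The `language` and `is_auto_generated` fields are omitted because they come from YouTube's caption data rather than from the text.

```javascript
// Assumed segment shape: { startSeconds: number, durationSeconds: number, text: string }.
function buildMetadata(videoId, segments) {
  const transcript = segments.map((s) => s.text).join(' ')
  const words = transcript.split(/\s+/).filter(Boolean)
  const last = segments[segments.length - 1]
  // Assumption: the transcript ends when the last segment ends.
  const durationSeconds = last ? Math.round(last.startSeconds + last.durationSeconds) : 0
  const minutes = Math.floor(durationSeconds / 60)
  const seconds = String(durationSeconds % 60).padStart(2, '0')
  return {
    video_id: videoId,
    video_url: `https://youtube.com/watch?v=${videoId}`,
    word_count: words.length,
    segment_count: segments.length,
    duration: `${minutes}:${seconds}`,
    duration_seconds: durationSeconds,
  }
}
```

Note that the example response is internally consistent with this arithmetic: 212 seconds is 3 minutes and 32 seconds.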

3. save_transcript

Save transcript to text file(s) on the local filesystem.

Parameters:

  • url (string, required): YouTube video URL or video ID
  • filename (string, required): Base filename (without extension)
  • with_timestamps (boolean, optional): Save version with timestamps (default: true)

Example:

{
  "name": "save_transcript",
  "arguments": {
    "url": "https://youtu.be/dQw4w9WgXcQ",
    "filename": "rickroll_transcript",
    "with_timestamps": true
  }
}

Creates files:

  • rickroll_transcript_clean.txt - Plain text transcript
  • rickroll_transcript_with_timestamps.txt - Transcript with timestamps (if enabled)
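
The mapping from the `filename` and `with_timestamps` arguments to the files listed above can be sketched as follows. `transcriptFileNames` is a hypothetical helper that only mirrors the naming pattern shown here; it is not the server's code and does not touch the filesystem.

```javascript
// Hypothetical helper: derives the output file names from the base name.
function transcriptFileNames(baseName, withTimestamps = true) {
  // The clean (plain text) file is always produced.
  const names = [`${baseName}_clean.txt`]
  // The timestamped file is produced only when requested.
  if (withTimestamps) {
    names.push(`${baseName}_with_timestamps.txt`)
  }
  return names
}
```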

💻 Programmatic Usage

As an MCP Client

import { Client } from '@modelcontextprotocol/sdk/client/index.js'
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js'

// Initialize transport
const transport = new StdioClientTransport({
  command: 'node',
  args: ['/path/to/youtube-transcript-mcp/index.js'],
})

// Create client
const client = new Client({
  name: 'youtube-transcript-client',
  version: '1.0.0',
})

// Connect and use
await client.connect(transport)

// Get transcript with timestamps
const result = await client.callTool({
  name: 'get_transcript',
  arguments: {
    url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
    include_timestamps: true,
  },
})

console.log(result.content[0].text)
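
Reading `result.content[0].text` directly assumes the call succeeded and returned at least one text item. MCP tool results carry a `content` array and an optional `isError` flag, so a more defensive read might look like the sketch below. `toolResultText` is a hypothetical helper, not part of the MCP SDK.

```javascript
// Hypothetical helper for the object returned by client.callTool().
function toolResultText(result) {
  // Collect every text item; non-text content types are ignored.
  const text = (result.content ?? [])
    .filter((item) => item.type === 'text')
    .map((item) => item.text)
    .join('\n')
  // Tool-level failures are reported in the result, not thrown by the client.
  if (result.isError) {
    throw new Error(text || 'Tool call failed without a message')
  }
  return text
}
```

With this in place, `console.log(toolResultText(result))` prints the transcript on success and raises the server's error message otherwise.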

Direct Module Usage

// Not yet available: direct module import is planned; the API below is illustrative
import { YouTubeTranscriptExtractor } from '@tanush-yadav/youtube-transcript-mcp'

const extractor = new YouTubeTranscriptExtractor()
const transcript = await extractor.getTranscript('dQw4w9WgXcQ')
console.log(transcript)

🌐 Supported URL Formats

The server accepts various YouTube URL formats:

  • ✅ Standard: https://www.youtube.com/watch?v=VIDEO_ID
  • ✅ Short: https://youtu.be/VIDEO_ID
  • ✅ Embed: https://www.youtube.com/embed/VIDEO_ID
  • ✅ Mobile: https://m.youtube.com/watch?v=VIDEO_ID
  • ✅ Shorts: https://www.youtube.com/shorts/VIDEO_ID
  • ✅ With timestamps: https://youtube.com/watch?v=VIDEO_ID&t=123
  • ✅ With playlist: https://youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID
  • ✅ Just video ID: dQw4w9WgXcQ
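
To illustrate how the formats listed above reduce to a single video ID, here is a minimal parsing sketch. `extractVideoId` is a hypothetical function, not the server's actual parser, and it assumes video IDs are 11 characters drawn from letters, digits, `_`, and `-`.

```javascript
// Hypothetical helper: returns the video ID, or null if none can be found.
function extractVideoId(input) {
  // Assumption: IDs are exactly 11 URL-safe characters.
  const ID_PATTERN = /^[A-Za-z0-9_-]{11}$/
  const trimmed = String(input).trim()

  // Case 1: the input is already a bare video ID.
  if (ID_PATTERN.test(trimmed)) return trimmed

  let parsed
  try {
    parsed = new URL(trimmed)
  } catch {
    return null // neither a bare ID nor a parseable URL
  }

  // Treat www.youtube.com and m.youtube.com like youtube.com.
  const host = parsed.hostname.replace(/^(www\.|m\.)/, '')
  let candidate = null

  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    candidate = parsed.pathname.split('/')[1]
  } else if (host === 'youtube.com') {
    const [, first, second] = parsed.pathname.split('/')
    if (first === 'watch') {
      // Extra parameters such as t= or list= are simply ignored.
      candidate = parsed.searchParams.get('v')
    } else if (first === 'embed' || first === 'shorts') {
      candidate = second
    }
  }

  return candidate && ID_PATTERN.test(candidate) ? candidate : null
}
```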

📝 Usage Examples with Claude

Once configured, you can ask Claude:

"Get the transcript from https://www.youtube.com/watch?v=dQw4w9WgXcQ"

"Extract the YouTube transcript with timestamps from video ID abc123"

"Save the transcript from this video to a file: [URL]"

"Get detailed metadata and transcript from: [URL]"

"Summarize this YouTube video: [URL]" (Claude will fetch and summarize)

🔧 Development

Running Tests

npm test

Building from Source

git clone https://github.com/tanush-yadav/youtube-transcript-mcp.git
cd youtube-transcript-mcp
npm install
npm run build

Development Mode

npm run dev

Testing the MCP Server

Create a test file test-client.js:

import { Client } from '@modelcontextprotocol/sdk/client/index.js'
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js'

const transport = new StdioClientTransport({
  command: 'node',
  args: ['./index.js'],
})

const client = new Client({
  name: 'test-client',
  version: '1.0.0',
})

await client.connect(transport)

// List available tools
const tools = await client.listTools()
console.log('Available tools:', tools)

// Test transcript extraction
const result = await client.callTool({
  name: 'get_transcript',
  arguments: {
    url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
  },
})

console.log('Transcript:', result.content[0].text)
await transport.close()

🐛 Troubleshooting

Common Issues

  1. "No transcript available"

    • ✓ Ensure the video has captions/subtitles available
    • ✓ Check if the video is public and not age-restricted
    • ✓ Some live streams may not have transcripts available
  2. Connection errors

    • ✓ Verify your internet connection
    • ✓ Check if YouTube is accessible in your region
    • ✓ Ensure Node.js version is 18.0 or higher
  3. MCP server not found in Claude

    • ✓ Verify the path in your Claude configuration is absolute
    • ✓ Ensure Node.js is properly installed and in PATH
    • ✓ Restart Claude Desktop after configuration changes
  4. Permission errors when saving files

    • ✓ Ensure write permissions in the target directory
    • ✓ Check disk space availability
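
The Node.js version requirement mentioned above can also be checked from a script. The sketch below is one way to do it; `meetsNodeRequirement` is a hypothetical helper and compares only the major version.

```javascript
// Hypothetical helper: true if the version string's major number is high enough.
function meetsNodeRequirement(version, minimumMajor = 18) {
  // Accepts both "18.19.0" (process.versions.node) and "v18.19.0" (process.version).
  const major = Number.parseInt(String(version).replace(/^v/, '').split('.')[0], 10)
  return Number.isInteger(major) && major >= minimumMajor
}
```

For example, `meetsNodeRequirement(process.versions.node)` reports whether the running interpreter satisfies the requirement.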

Debug Mode

Enable debug logging by setting the environment variable:

DEBUG=youtube-transcript-mcp node index.js

📊 Performance

  • Average transcript extraction time: 1-3 seconds
  • Memory usage: ~50MB
  • Supports videos up to 12+ hours in length
  • Handles 1000+ segments efficiently

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow existing code style
  • Add tests for new features
  • Update documentation as needed
  • Ensure all tests pass before submitting PR

📄 License

MIT License - see LICENSE file for details

📈 Roadmap

  • [ ] Support for multiple language transcripts
  • [ ] Batch processing for multiple videos
  • [ ] Transcript translation capabilities
  • [ ] Export to SRT/VTT subtitle formats
  • [ ] Caching for improved performance
  • [ ] Support for playlist extraction
  • [ ] Real-time transcript streaming
  • [ ] Custom formatting options

💬 Support

For issues, questions, or suggestions, see the project repository at https://github.com/tanush-yadav/youtube-transcript-mcp.

📝 Changelog

[1.0.0] - 2024-01-03

  • 🎉 Initial release
  • ✨ Transcript extraction with youtubei.js
  • ⏱️ Timestamp support
  • 📊 Metadata extraction
  • 💾 File saving capability
  • 🔧 MCP protocol implementation

Made with ❤️ by the Open Source Community

Star ⭐ this repo if you find it useful!
