MCP Gemini Video Understanding

An MCP (Model Context Protocol) server that uses Google's Gemini API to analyze videos and convert them to text descriptions that Claude Code can understand and act upon.

What is this?

This MCP server acts as a bridge between video content and Claude Code. When you have a video (screen recording, Loom video, YouTube tutorial, etc.), the server sends it to Gemini and returns a text description that Claude Code can then use to write code, fix bugs, or implement features.

Use Cases

Bug Reproduction Videos: Record a video showing a bug → Get detailed steps to reproduce and debugging insights
Design Mockups: Show a design in a video → Get implementation guidance with UI component breakdowns
YouTube Tutorials: Share a tutorial URL → Extract key learnings and implementation steps
Responsive Issues: Record layout problems → Get specific CSS fixes and responsive solutions

Installation

npm install -g @ugarchance/mcp-gemini-video-understanding

Or use directly with npx:

npx @ugarchance/mcp-gemini-video-understanding

Setup

1. Get a Gemini API Key

Go to Google AI Studio
Click "Get API Key"
Create or select a project
Copy your API key

2. Set Environment Variable

export GEMINI_API_KEY="your-api-key-here"

3. Configure Claude Code

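Add the server to your claude_desktop_config.json. This is the Claude Desktop configuration file; for Claude Code, the same mcpServers entry can typically go in a project-level .mcp.json instead.

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json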

{
  "mcpServers": {
    "gemini-video": {
      "command": "npx",
      "args": [
        "-y",
        "@ugarchance/mcp-gemini-video-understanding"
      ],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Or if installed globally:

{
  "mcpServers": {
    "gemini-video": {
      "command": "mcp-gemini-video",
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Usage

All tools support these common parameters:

model (string, optional): Gemini model to use. Options:
- gemini-2.5-pro - Most capable, best for complex analysis
- gemini-2.5-flash - Default, balanced speed and quality
- gemini-2.5-flash-lite - Fastest, lighter analysis
- gemini-2.0-flash - Previous generation fast model
- gemini-2.0-flash-exp - Experimental features
output_file (string, optional): Path where the analysis is saved. If the file already exists, its contents are returned as the cached result and the video is not re-analyzed.
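
As a rough illustration of how these shared parameters might be typed and defaulted, here is a minimal TypeScript sketch. The names CommonParams, SUPPORTED_MODELS, and resolveModel are hypothetical and not taken from the server's source; only the model identifiers and the gemini-2.5-flash default come from the list above. Rejecting unknown model names is also an assumption.

```typescript
// Hypothetical sketch: these names are illustrative, not the server's actual code.
const SUPPORTED_MODELS = [
  "gemini-2.5-pro",
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
  "gemini-2.0-flash",
  "gemini-2.0-flash-exp",
] as const;

type GeminiModel = (typeof SUPPORTED_MODELS)[number];

// Parameters shared by all four tools, as documented above.
interface CommonParams {
  model?: string;
  output_file?: string;
}

const DEFAULT_MODEL: GeminiModel = "gemini-2.5-flash";

// Returns the requested model, or the default when none is given.
// Throwing on an unknown model name is an assumption about desired behavior.
function resolveModel(params: CommonParams): GeminiModel {
  if (params.model === undefined) return DEFAULT_MODEL;
  const match = SUPPORTED_MODELS.find((m) => m === params.model);
  if (match === undefined) {
    throw new Error(`Unsupported model: ${params.model}`);
  }
  return match;
}
```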

Tool 1: analyze_bug_video

Analyze a video showing a bug or error.

Parameters:

video_path (string): Path to video file or YouTube URL
is_youtube (boolean, optional): Set to true if video_path is a YouTube URL
additional_context (string, optional): Extra context about the bug
model (string, optional): Gemini model to use
output_file (string, optional): Path to save analysis

Example with Claude Code:

I have a bug video at /Users/me/Desktop/bug-demo.mp4
Save the analysis to bug-analysis.md and fix the issue.

With model selection:

Analyze /Users/me/Desktop/complex-bug.mp4 using gemini-2.5-pro
Save to analysis.txt and help me fix it.
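
For readers curious what such a prompt turns into, MCP clients invoke tools with a JSON-RPC "tools/call" request. The sketch below builds one for analyze_bug_video using the parameter names listed above. The helper buildBugVideoCall and its options object are hypothetical; Claude Code constructs the real request for you.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request a client might send for this tool.
function buildBugVideoCall(
  id: number,
  videoPath: string,
  options: { model?: string; outputFile?: string; additionalContext?: string } = {},
): ToolCallRequest {
  // Optional arguments are only included when the caller supplied them.
  const args: Record<string, unknown> = {
    video_path: videoPath,
    ...(options.additionalContext !== undefined
      ? { additional_context: options.additionalContext }
      : {}),
    ...(options.model !== undefined ? { model: options.model } : {}),
    ...(options.outputFile !== undefined ? { output_file: options.outputFile } : {}),
  };
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "analyze_bug_video", arguments: args },
  };
}

const exampleRequest = buildBugVideoCall(1, "/Users/me/Desktop/complex-bug.mp4", {
  model: "gemini-2.5-pro",
  outputFile: "analysis.txt",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```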

Tool 2: analyze_design_video

Analyze a video showing a design mockup or feature demonstration.

Parameters:

video_path (string): Path to video file or YouTube URL
is_youtube (boolean, optional): Set to true if video_path is a YouTube URL
tech_stack (string, optional): Technologies to use (e.g., "React with Tailwind")
model (string, optional): Gemini model to use
output_file (string, optional): Path to save analysis

Example with Claude Code:

I recorded a design mockup at /Users/me/Desktop/new-feature.mp4
Save analysis to design-spec.md then implement using React and Tailwind CSS.
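
MCP tools advertise their parameters through a JSON Schema in an inputSchema field. The sketch below shows how the parameters listed for analyze_design_video could be declared. The field layout (name, description, inputSchema) follows the MCP tool definition format, but the exact schema this server ships is an assumption, as is the missingRequired helper.

```typescript
// Hypothetical declaration; the server's real schema may differ.
const analyzeDesignVideoTool = {
  name: "analyze_design_video",
  description: "Analyze a video showing a design mockup or feature demonstration.",
  inputSchema: {
    type: "object",
    properties: {
      video_path: { type: "string", description: "Path to video file or YouTube URL" },
      is_youtube: { type: "boolean", description: "True if video_path is a YouTube URL" },
      tech_stack: { type: "string", description: "Technologies to use" },
      model: { type: "string", description: "Gemini model to use" },
      output_file: { type: "string", description: "Path to save analysis" },
    },
    // Only video_path is not marked optional in the parameter list.
    required: ["video_path"],
  },
} as const;

// Hypothetical helper: lists required parameters absent from a call's arguments.
function missingRequired(args: Record<string, unknown>): string[] {
  return analyzeDesignVideoTool.inputSchema.required.filter((key) => !(key in args));
}
```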

Tool 3: analyze_tutorial_video

Analyze a YouTube tutorial to extract key learnings.

Parameters:

video_url (string): YouTube URL
focus_area (string, optional): Specific topic to focus on
model (string, optional): Gemini model to use
output_file (string, optional): Path to save analysis

Example with Claude Code:

Watch this tutorial: https://www.youtube.com/watch?v=xxxxx
Save the learnings to tutorial-notes.md then implement the auth system.

Using faster model for quick summaries:

Analyze https://www.youtube.com/watch?v=xxxxx with gemini-2.5-flash-lite
Just give me the key points.
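
Since this tool only accepts YouTube URLs, a caller may want to check the input first. The following is a minimal sketch of such a check. The helper isYouTubeUrl and the list of accepted hosts are assumptions, not part of the server.

```typescript
// Hypothetical helper: the host list is an assumption about which URL forms to accept.
const YOUTUBE_HOSTS = new Set([
  "youtube.com",
  "www.youtube.com",
  "m.youtube.com",
  "youtu.be",
]);

function isYouTubeUrl(input: string): boolean {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return false; // not a URL at all, e.g. a local file path
  }
  const isHttp = url.protocol === "https:" || url.protocol === "http:";
  return isHttp && YOUTUBE_HOSTS.has(url.hostname);
}
```

The same check could be used to decide whether is_youtube should be set for the tools that take video_path.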

Tool 4: analyze_responsive_issues

Analyze a video showing responsive design problems.

Parameters:

video_path (string): Path to video file or YouTube URL
is_youtube (boolean, optional): Set to true if video_path is a YouTube URL
target_devices (string, optional): Target devices (e.g., "mobile, tablet")
model (string, optional): Gemini model to use
output_file (string, optional): Path to save analysis

Example with Claude Code:

I recorded responsive issues at /Users/me/Desktop/mobile-issues.mp4
Save analysis to responsive-fixes.md and fix the layout for mobile.
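
The target_devices parameter is a single comma-separated string such as "mobile, tablet". If you need the individual device names on the client side, a small normalizer like the one below works. This is a hypothetical helper; how the server itself interprets the string is not documented here.

```typescript
// Hypothetical helper: splits a comma-separated device string into clean names.
function parseTargetDevices(targetDevices?: string): string[] {
  if (targetDevices === undefined) return [];
  return targetDevices
    .split(",")
    .map((device) => device.trim().toLowerCase())
    .filter((device) => device.length > 0); // drop empty entries from stray commas
}

console.log(parseTargetDevices("mobile, tablet")); // [ 'mobile', 'tablet' ]
```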

How It Works

1. You record a video or find a YouTube URL.
2. You ask Claude Code to analyze it via MCP, optionally specifying a model and an output file.
3. The MCP server checks whether a cached analysis exists (only if output_file is specified).
4. If there is no cache, the server sends the video to the Gemini API with the chosen model.
5. Gemini analyzes the video and returns a detailed text description.
6. The MCP server saves the result to the file (if output_file is specified).
7. Claude Code receives the text and can write or fix code based on it.

Caching Strategy

When you specify an output_file:

First run: The video is analyzed and the result is saved to the file
Subsequent runs: The cached file is read directly, with no Gemini API call and therefore no API cost
To re-analyze: Delete the output file first

This is useful for:

Iterating on implementations without re-analyzing videos
Sharing analysis results with team members
Reducing API costs and latency
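
The caching rules described in this section reduce to a three-way decision. The sketch below captures that decision as a pure function. The names CachePlan and planAnalysis are hypothetical, and the file-existence check is passed in as a function so the example runs without touching the disk; a real server would presumably check the file system.

```typescript
type CachePlan = "use-cache" | "analyze-and-save" | "analyze-only";

// Hypothetical sketch of the caching decision.
function planAnalysis(
  outputFile: string | undefined,
  fileExists: (path: string) => boolean,
): CachePlan {
  // Caching only applies when output_file is specified.
  if (outputFile === undefined) return "analyze-only";
  // An existing output file is treated as the cached result.
  return fileExists(outputFile) ? "use-cache" : "analyze-and-save";
}

// Demo with an in-memory stand-in for the file system.
const existingFiles = new Set(["bug-analysis.md"]);
const existsInDemo = (path: string) => existingFiles.has(path);

console.log(planAnalysis("bug-analysis.md", existsInDemo)); // use-cache
console.log(planAnalysis("new-analysis.md", existsInDemo)); // analyze-and-save
console.log(planAnalysis(undefined, existsInDemo)); // analyze-only
```

Deleting the output file moves the plan from "use-cache" back to "analyze-and-save", which matches the re-analysis instruction above.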

Supported Video Formats

MP4
MOV
AVI
WebM
MKV
FLV
WMV
3GP
MPEG
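
To reject unsupported files before a call, you could map file extensions to MIME types on the client side. The sketch below is hypothetical: the helper mimeTypeFor is not part of the server, and the MIME strings are assumptions that should be checked against the Gemini API documentation. The MKV entry in particular is a guess.

```typescript
// Hypothetical extension-to-MIME table for the formats listed above.
// All MIME strings are assumptions; verify them against the Gemini API docs.
const VIDEO_MIME_TYPES = new Map<string, string>([
  ["mp4", "video/mp4"],
  ["mov", "video/mov"],
  ["avi", "video/avi"],
  ["webm", "video/webm"],
  ["mkv", "video/x-matroska"], // assumption
  ["flv", "video/x-flv"],
  ["wmv", "video/wmv"],
  ["3gp", "video/3gpp"],
  ["mpeg", "video/mpeg"],
  ["mpg", "video/mpeg"],
]);

// Returns the MIME type for a path, or undefined if the extension is not listed.
function mimeTypeFor(filePath: string): string | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1 || dot === filePath.length - 1) return undefined;
  const extension = filePath.slice(dot + 1).toLowerCase();
  return VIDEO_MIME_TYPES.get(extension);
}
```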

Available Models

| Model | Speed | Quality | Best for | Relative cost |
|---|---|---|---|---|
| `gemini-2.5-pro` | Slow | Highest | Complex bugs, detailed designs | $$$ |
| `gemini-2.5-flash` | Fast | High | General use (default) | $$ |
| `gemini-2.5-flash-lite` | Fastest | Good | Quick summaries, simple videos | $ |
| `gemini-2.0-flash` | Fast | Good | Previous generation, reliable | $$ |
| `gemini-2.0-flash-exp` | Fast | Varies | Experimental features | $$ |

Limitations

YouTube: Only public videos (not private or unlisted)
File Size: Files >20MB automatically use Gemini's File API (may take longer to process)
Video Length: Longer videos take more time to process
Rate Limits: Subject to Gemini API rate limits
Caching: Only works when output_file is specified
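
The file size rule above amounts to a simple threshold check. This sketch is only an illustration: chooseUploadStrategy is a hypothetical name, and it assumes "20MB" means 20 * 1024 * 1024 bytes, which the README does not specify.

```typescript
// Assumption: "20MB" is interpreted as 20 MiB.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

type UploadStrategy = "inline" | "file-api";

// Hypothetical helper: files over the limit go through Gemini's File API.
function chooseUploadStrategy(sizeBytes: number): UploadStrategy {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError(`Invalid file size: ${sizeBytes}`);
  }
  return sizeBytes > INLINE_LIMIT_BYTES ? "file-api" : "inline";
}

console.log(chooseUploadStrategy(5 * 1024 * 1024)); // inline
console.log(chooseUploadStrategy(50 * 1024 * 1024)); // file-api
```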

Development

Local Development

# Clone the repo
git clone https://github.com/ugarchance/mcp-gemini-video-understanding
cd mcp-gemini-video-understanding

# Install dependencies
npm install

# Build
npm run build

To test locally with Claude Code, add this to claude_desktop_config.json:

{
  "mcpServers": {
    "gemini-video": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-gemini-video-understanding/build/index.js"],
      "env": {
        "GEMINI_API_KEY": "your-key"
      }
    }
  }
}

Publishing to npm

# Update package.json with your npm username
npm login
npm publish

Troubleshooting

"GEMINI_API_KEY environment variable is required"

Make sure you've set the GEMINI_API_KEY in your claude_desktop_config.json under the env section.
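
For context, a startup check along these lines would produce that message. This is a sketch, not the server's actual code: requireApiKey is a hypothetical name, and treating a blank value as missing is an assumption.

```typescript
// Hypothetical startup check. The environment is passed in so the function
// can be exercised without touching the real process environment.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    // Covers both an unset variable and an empty or whitespace-only value.
    throw new Error("GEMINI_API_KEY environment variable is required");
  }
  return key;
}
```

In a Node.js server this would be called once at startup with process.env, so a missing key fails fast instead of surfacing later as an API error.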

"Error analyzing video"

Check that the video file path is absolute (not relative)
Verify the video format is supported
For YouTube videos, ensure the URL is valid and the video is public
Check Gemini API quotas and rate limits

Tools not showing in Claude Code

Restart Claude Code completely (Cmd+Q on Mac, not just closing the window)
Check claude_desktop_config.json syntax is valid JSON
Look at Claude Code logs: ~/Library/Logs/Claude/mcp*.log (macOS)
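
To check the configuration syntax quickly, you can parse the file contents and look for the server entry. The sketch below takes the file contents as a string; checkServerConfig is a hypothetical helper, and "gemini-video" is the entry name used in the configuration examples earlier in this README.

```typescript
// Hypothetical helper: returns null when the config looks usable,
// otherwise a short description of the first problem found.
function checkServerConfig(jsonText: string, serverName: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    return `Invalid JSON: ${err instanceof Error ? err.message : String(err)}`;
  }
  if (typeof parsed !== "object" || parsed === null) {
    return "Top level must be a JSON object";
  }
  const servers = (parsed as Record<string, unknown>)["mcpServers"];
  if (typeof servers !== "object" || servers === null) {
    return 'Missing "mcpServers" object';
  }
  if (!Object.prototype.hasOwnProperty.call(servers, serverName)) {
    return `No entry named "${serverName}" under mcpServers`;
  }
  return null;
}

const sampleConfig = '{"mcpServers": {"gemini-video": {"command": "npx"}}}';
console.log(checkServerConfig(sampleConfig, "gemini-video")); // null
```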

License

MIT

Contributing

Contributions welcome! Please open an issue or PR.

Credits

Built with:

Gemini API for video understanding
Model Context Protocol for Claude integration
