Gemini Image Generation MCP Server

A Model Context Protocol (MCP) server that provides image generation capabilities using Google's Gemini 2.0 Flash Preview model. This server allows AI assistants to generate high-quality images from text prompts through the MCP protocol.

Features

Image Generation: Generate images from text prompts using Gemini 2.0 Flash Preview
Multiple Output Formats: Support for PNG, JPEG, and other image formats
File Management: Automatic saving of generated images with organized file naming
Base64 Support: Handle image data in base64 format for easy integration
Status Monitoring: Check API connection status and model information
Prompt Templates: Pre-built prompts for optimized image generation

Prerequisites

Python 3.9 or higher
Google AI API key (Gemini API access)
uv package manager (recommended) or pip

Installation

Using uv (Recommended)

Clone or download this repository:

git clone <repository-url>
cd image-generation-gemini-mcp

Install dependencies:

uv sync

Using pip

Install dependencies:

pip install -r requirements.txt

Setup

1. Get Google AI API Key

Visit Google AI Studio
Create a new API key
Copy the API key for use in the next step

2. Set Environment Variable

Set your Gemini API key as an environment variable:

Windows (PowerShell):

$env:GEMINI_API_KEY="your-api-key-here"

Windows (Command Prompt):

set GEMINI_API_KEY=your-api-key-here

macOS/Linux:

export GEMINI_API_KEY="your-api-key-here"

To make the setting permanent, add the variable to your system environment variables (Windows) or to your shell profile (macOS/Linux).
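Before starting the server, it can help to confirm that the variable is actually visible to a Python process. The following is a minimal sketch, not code from this repository: the helper name `read_gemini_api_key` is hypothetical, and only the variable name `GEMINI_API_KEY` and the error text come from this README.

```python
import os


def read_gemini_api_key(environ=None):
    """Return the API key from the environment, or raise a clear error.

    `environ` defaults to os.environ; passing a plain dict makes the check
    easy to test. (Hypothetical helper, not part of server.py.)
    """
    env = os.environ if environ is None else environ
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable is required")
    return key


if __name__ == "__main__":
    try:
        read_gemini_api_key()
        print("GEMINI_API_KEY is set")
    except RuntimeError as exc:
        print(f"Configuration problem: {exc}")
```

Run it in the same terminal session where you set the variable; a new terminal will only see the key if you made the setting permanent.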

Usage

Running the Server

Development Mode

To test the server locally:

uv run server.py

MCP Integration Mode

To run as an MCP server:

uv run server.py stdio

Integration with MCP Clients

Claude Desktop Integration

Add the server to your Claude Desktop configuration file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "gemini-image-generator": {
      "command": "uv",
      "args": ["run", "server.py", "stdio"],
      "cwd": "C:\\path\\to\\image-generation-gemini-mcp",
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Restart Claude Desktop.
The image generation tools will then be available in your conversations.
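If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that with the standard library. The function names `add_server_entry` and `update_config_file` are hypothetical; the entry itself mirrors the JSON shown above. Back up the file before letting a script rewrite it.

```python
import json
from pathlib import Path


def add_server_entry(config, project_dir, api_key):
    """Return a copy of `config` with the gemini-image-generator entry added.

    Existing entries under "mcpServers" are preserved.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["gemini-image-generator"] = {
        "command": "uv",
        "args": ["run", "server.py", "stdio"],
        "cwd": str(project_dir),
        "env": {"GEMINI_API_KEY": api_key},
    }
    updated["mcpServers"] = servers
    return updated


def update_config_file(config_path, project_dir, api_key):
    """Read the JSON config (if it exists), add the entry, and write it back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(
        json.dumps(add_server_entry(config, project_dir, api_key), indent=2),
        encoding="utf-8",
    )
```

On macOS, pass a POSIX-style project path instead of the Windows path used in the JSON example.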

Other MCP Clients

For other MCP-compatible clients, configure them to run:

uv run server.py stdio

Set the working directory to this project folder and make sure the GEMINI_API_KEY environment variable is configured.
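For a client written in Python, those three requirements (command, working directory, environment variable) map directly onto `subprocess.Popen` arguments. This is only a sketch of the process startup, with hypothetical helper names; a real client would still need to speak the MCP protocol over the resulting pipes, typically through an MCP SDK.

```python
import os
import subprocess


def server_launch_spec(project_dir, api_key, base_env=None):
    """Build the command, working directory, and environment for the server."""
    env = dict(os.environ if base_env is None else base_env)
    env["GEMINI_API_KEY"] = api_key
    return {
        "args": ["uv", "run", "server.py", "stdio"],
        "cwd": str(project_dir),
        "env": env,
    }


def launch_server(project_dir, api_key):
    """Start the server as a child process with pipes for the stdio transport."""
    spec = server_launch_spec(project_dir, api_key)
    return subprocess.Popen(
        spec["args"],
        cwd=spec["cwd"],
        env=spec["env"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
```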

Available Tools

`generate_image`

Generate an image from a text prompt.

Parameters:

prompt (string): Text description of the image to generate
output_dir (string, optional): Directory to save the image (default: "./generated_images")

Returns:

success (boolean): Whether generation was successful
message (string): Status message or error description
image_data (string): Base64 encoded image data (if successful)
mime_type (string): MIME type of the generated image
file_extension (string): File extension for the image
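A client that receives this result will usually want the raw image bytes. The sketch below assumes the result arrives as a Python dict with the fields listed above; how your MCP client actually exposes the tool result may differ, and `decode_generated_image` is a hypothetical helper name. This README does not say whether `file_extension` includes the leading dot, so the value is passed through unchanged.

```python
import base64
import binascii


def decode_generated_image(result):
    """Return (image_bytes, file_extension) from a generate_image result dict.

    Raises ValueError if the call failed or the payload is not valid base64.
    """
    if not result.get("success"):
        raise ValueError(result.get("message", "image generation failed"))
    try:
        image_bytes = base64.b64decode(result["image_data"], validate=True)
    except (KeyError, TypeError, binascii.Error) as exc:
        raise ValueError("result has no valid base64 image_data") from exc
    return image_bytes, result.get("file_extension", "")
```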

`save_image_from_base64`

Save a base64 encoded image to a file.

Parameters:

image_data (string): Base64 encoded image data
filename (string): Name for the output file (include extension)
output_dir (string, optional): Directory to save the image

Returns:

success (string): "true" or "false"
message (string): Status message
file_path (string): Path to saved file (if successful)
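To make the contract above concrete, here is a rough sketch of what a tool with this signature could do internally. It is not the code in server.py. The default output directory is an assumption borrowed from `generate_image`, the message texts are invented, and the string-valued `success` field follows the Returns list above.

```python
import base64
import binascii
from pathlib import Path


def save_image_from_base64(image_data, filename, output_dir="./generated_images"):
    """Decode base64 image data and write it to output_dir/filename."""
    try:
        raw = base64.b64decode(image_data, validate=True)
        target_dir = Path(output_dir)
        target_dir.mkdir(parents=True, exist_ok=True)
        # Path(filename).name drops any directory part, so the file stays in output_dir.
        file_path = target_dir / Path(filename).name
        file_path.write_bytes(raw)
    except (binascii.Error, ValueError, OSError) as exc:
        return {"success": "false", "message": f"Failed to save image: {exc}"}
    return {"success": "true", "message": "Image saved", "file_path": str(file_path)}
```

Returning an error dict instead of raising keeps failures visible to the calling assistant as an ordinary tool result.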

Available Resources

`gemini://api-status`

Check the status of the Gemini API connection.

`gemini://model-info`

Get information about the Gemini image generation model capabilities.

Available Prompts

`image_generation_prompt`

Generate a detailed prompt optimized for image generation.

Parameters:

subject (string): Main subject or object to generate
style (string, optional): Art style (default: "photorealistic")
mood (string, optional): Mood or atmosphere (default: "neutral")
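The actual template text lives in server.py and is not reproduced in this README. As an illustration only, a prompt builder with these parameters and defaults might look like the following; the wording of the generated prompt is invented for the example.

```python
def image_generation_prompt(subject, style="photorealistic", mood="neutral"):
    """Combine subject, style, and mood into one descriptive prompt string.

    Illustrative wording only; the server's real template may differ.
    """
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    return (
        f"Create a {style} image of {subject}. "
        f"The overall mood should be {mood}. "
        "Include clear detail on composition, lighting, and color."
    )
```

For example, `image_generation_prompt("a friendly robot", style="cartoon")` yields a prompt that names the cartoon style and falls back to the neutral mood.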

Example Usage

Once integrated with an MCP client like Claude Desktop, you can:

Generate an image:

Please generate an image of a sunset over mountains

Use specific styles:

Create a cartoon-style image of a friendly robot

Check API status:

Can you check the status of the Gemini API?

Get model information:

What are the capabilities of the image generation model?

File Structure

image-generation-gemini-mcp/
├── server.py             # Main MCP server implementation
├── requirements.txt      # Python dependencies
├── pyproject.toml        # Project configuration
├── README.md             # This file
├── docs/                 # Documentation files
│   ├── development-guidelines.md
│   ├── mcp-info.md
│   └── mcp-python-sdk-readme.md
└── generated_images/     # Default output directory (created automatically)

Troubleshooting

Common Issues

"GEMINI_API_KEY environment variable is required"
- Ensure you've set the GEMINI_API_KEY environment variable
- Verify the API key is valid and has access to the Gemini API

"Error connecting to Gemini API"
- Check your internet connection
- Verify your API key is correct and active
- Ensure you have access to the Gemini 2.0 Flash Preview model

"No image was generated"
- Try rephrasing your prompt
- Ensure your prompt is descriptive and clear
- Check whether the prompt runs into content policy restrictions

Permission errors when saving files
- Ensure the output directory is writable
- Check file system permissions
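To check the last item quickly, you can try creating a file in the output directory from Python. This is a small diagnostic sketch with a hypothetical function name; the default path is the default output directory documented for `generate_image`.

```python
import tempfile
from pathlib import Path


def output_dir_is_writable(output_dir="./generated_images"):
    """Return True if a file can be created in output_dir (creating it if needed)."""
    try:
        target = Path(output_dir)
        target.mkdir(parents=True, exist_ok=True)
        # Creating an auto-deleted temporary file is a direct test of write access.
        with tempfile.TemporaryFile(dir=target):
            pass
    except OSError:
        return False
    return True
```

Run it from the project folder so the relative default path resolves to the same directory the server uses.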