FLUX MCP Server & CLI
A Model Context Protocol (MCP) server and command-line tool for generating images using FLUX.1-dev with automatic model unloading to save VRAM and power.
Features
- 🎨 High-Quality Image Generation - Uses FLUX.1-dev for state-of-the-art image synthesis
- ⚡ Lazy Loading - Model loads only when needed
- 🔄 Auto-Unload - Automatically unloads model after configurable inactivity period (MCP mode)
- 💾 Memory Efficient - Uses bfloat16 for optimal VRAM usage (~12GB)
- 🎲 Reproducible - Seed-based generation for consistent results
- 📊 Status Monitoring - Check model status and VRAM usage
- 🔧 Runtime Configuration - Adjust timeout without restarting
- 🖥️ Dual Interface - Use via MCP-compatible applications or command-line (CLI)
Quick Start
Get started with FLUX MCP in minutes:
# 1. Install dependencies (using UV - recommended)
uv sync
# 2. Configure environment
cp .env.example .env
# Edit .env to set FLUX_OUTPUT_DIR and other preferences
# 3. Add to your MCP client config (example for Claude Desktop)
# Add to ~/.config/Claude/claude_desktop_config.json (Linux)
# See "MCP Server Registration" section below for full details
# 4. Generate your first image (CLI mode)
flux generate "a beautiful sunset over mountains"
# Or use via MCP client (e.g., Claude Desktop)
# Just ask: "Generate an image of a beautiful sunset over mountains"
For detailed setup and configuration, see the sections below.
Requirements
- Python 3.10+
- NVIDIA GPU with 16GB+ VRAM (tested on RTX 4070 Ti Super)
- CUDA toolkit installed
- PyTorch with CUDA support
Installation
- Clone the repository (or navigate to the project directory):
cd /path/to/flux-mcp
- Install with UV (recommended):
uv sync
Or install with pip:
pip install -e .
- Configure environment variables:
cp .env.example .env
# Edit .env with your preferred settings
Configuration Options
Edit .env to customize:
# Auto-unload timeout in seconds (default: 300 = 5 minutes)
FLUX_UNLOAD_TIMEOUT=300
# Output directory for generated images
FLUX_OUTPUT_DIR=/path/to/flux_output
# Optional: Custom HuggingFace cache directory
# FLUX_MODEL_CACHE=/path/to/cache
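These values are read once when the process starts. For orientation, the sketch below shows how such settings are typically loaded with python-dotenv; the variable names match .env, but the code itself is illustrative, not the project's actual config.py (the ~/flux_output default is an assumption):
# Illustrative sketch of environment-based configuration (not the actual config.py)
import os
from pathlib import Path
from dotenv import load_dotenv

load_dotenv()  # read .env from the working directory, if present

UNLOAD_TIMEOUT = int(os.getenv("FLUX_UNLOAD_TIMEOUT", "300"))  # seconds; 0 disables auto-unload
OUTPUT_DIR = Path(os.getenv("FLUX_OUTPUT_DIR", str(Path.home() / "flux_output")))  # assumed default
MODEL_CACHE = os.getenv("FLUX_MODEL_CACHE")  # None falls back to the default HuggingFace cache

OUTPUT_DIR.mkdir(parents=True, exist_ok=True)  # make sure the output directory exists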
MCP Server Registration
Add the server to your MCP client configuration. Below is an example for Claude Desktop:
Claude Desktop configuration file locations:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"flux": {
"command": "uv",
"args": [
"--directory",
"/absolute/path/to/flux-mcp",
"run",
"flux-mcp"
]
}
}
}
Or if installed globally with pip:
{
"mcpServers": {
"flux": {
"command": "python",
"args": [
"-m",
"flux_mcp.server"
]
}
}
}
After adding the configuration, restart your MCP client (e.g., Claude Desktop).
CLI Usage
In addition to the MCP server mode, you can use FLUX directly from the command line for completely offline and private image generation.
Quick Start
# Basic usage
flux generate "a beautiful sunset over mountains"
# With custom parameters
flux generate "portrait of a cat" --steps 35 --guidance 4.0 --seed 42
# Interactive mode for batch generation
flux generate --interactive
# Check system status
flux status
# View configuration
flux config
# Open output directory
flux open-output
Generate Command
The main command for image generation:
flux generate [OPTIONS] PROMPT
Options:
- --steps, -s INTEGER - Number of inference steps (default: 28)
- --guidance, -g FLOAT - Guidance scale (default: 3.5)
- --width, -w INTEGER - Image width in pixels, must be a multiple of 8 (default: 1024)
- --height, -h INTEGER - Image height in pixels, must be a multiple of 8 (default: 1024)
- --seed INTEGER - Random seed for reproducibility
- --output, -o PATH - Custom output path (default: auto-generated)
- --output-dir PATH - Override output directory
- --interactive, -i - Interactive mode
- --verbose, -v - Verbose output with debug info
Examples:
# Simple generation
flux generate "a cozy cabin in snowy mountains"
# High quality with more steps
flux generate "professional portrait" --steps 40 --guidance 7.5
# Custom resolution
flux generate "wide landscape" --width 1536 --height 1024
# Reproducible generation
flux generate "cute robot" --seed 42
# Save to specific location
flux generate "sunset" --output ~/Pictures/my-sunset.png
# Interactive mode (best for multiple images)
flux generate --interactive
Interactive Mode
Interactive mode allows you to generate multiple images without reloading the model:
flux generate --interactive
Interactive workflow:
- Enter your prompt
- Configure parameters (steps, guidance, dimensions, seed)
- Image generates and saves
- Choose to generate another or exit
- Model stays loaded between generations for faster subsequent images
Other Commands
Status Command:
flux status
Shows:
- Model information
- Output directory
- CUDA availability
- GPU name and VRAM usage
- Model cache location
Config Command:
flux config
Displays current configuration from environment variables.
Open Output:
flux open-output
Opens the output directory in your file manager (Linux: xdg-open, macOS: open, Windows: explorer).
Output Files
Generated images are saved with metadata:
- Image: flux_YYYYMMDD_HHMMSS_SEED.png
- Metadata: flux_YYYYMMDD_HHMMSS_SEED.json
Metadata JSON contains:
{
"prompt": "your prompt here",
"seed": 42,
"steps": 28,
"guidance_scale": 3.5,
"width": 1024,
"height": 1024,
"model": "black-forest-labs/FLUX.1-dev",
"generation_time_seconds": 15.3,
"timestamp": "2025-01-26T12:34:56"
}
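Since the JSON sits next to each image, a previous result can be replayed exactly. A small illustrative script (not part of the project) that rebuilds the equivalent CLI command from a metadata file:
# Illustrative: reconstruct a `flux generate` command from saved metadata
import json
import shlex
import sys

with open(sys.argv[1]) as f:  # e.g. flux_20250126_123456_42.json
    meta = json.load(f)

cmd = [
    "flux", "generate", meta["prompt"],
    "--steps", str(meta["steps"]),
    "--guidance", str(meta["guidance_scale"]),
    "--width", str(meta["width"]),
    "--height", str(meta["height"]),
    "--seed", str(meta["seed"]),  # same seed + settings => same image
]
print(shlex.join(cmd))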
CLI vs MCP Server
CLI Mode:
- ✓ Completely offline and private (no MCP client needed)
- ✓ Direct control from terminal
- ✓ Batch generation with interactive mode
- ✓ No auto-unload (process terminates after generation)
- ✓ Saves metadata JSON files
- ✓ Rich terminal UI with progress bars
MCP Server Mode:
- ✓ Integrated with MCP-compatible applications (like Claude Desktop)
- ✓ Natural language interface
- ✓ Auto-unload after timeout (saves power)
- ✓ Persistent background process
- ✓ Access from conversational AI interfaces
Both modes share the same configuration, model cache, and output directory.
MCP Server Tools
The following tools are available when using this server with any MCP-compatible client. Examples below show usage with Claude Desktop:
1. generate_image
Generate an image from a text prompt.
Parameters:
- prompt (required): Text description of the image
- steps (optional): Number of inference steps (default: 28, range: 20-50)
- guidance_scale (optional): Guidance scale (default: 3.5, range: 1.0-10.0)
- width (optional): Image width in pixels (default: 1024, range: 256-2048)
- height (optional): Image height in pixels (default: 1024, range: 256-2048)
- seed (optional): Random seed for reproducibility (random if not provided)
Example Usage (natural language with MCP client):
Generate an image of a futuristic cyberpunk city at sunset with neon lights
Generate an image with seed 42 of a serene mountain landscape with steps=30
2. unload_model
Immediately unload the FLUX model from GPU memory.
Example Usage:
Unload the FLUX model to free up VRAM
3. get_status
Check the current status of the FLUX generator.
Returns:
- Model load status
- Time remaining until auto-unload
- Current VRAM usage
- Last access time
Example Usage:
Check the FLUX model status
4. set_timeout
Change the auto-unload timeout at runtime.
Parameters:
- timeout_seconds (required): New timeout in seconds (0 to disable)
Example Usage:
Set FLUX auto-unload timeout to 600 seconds
Disable FLUX auto-unload
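For orientation, the sketch below shows roughly how tools like these are registered with the official MCP Python SDK (FastMCP). The stub generator and the handler bodies are hypothetical; the project's server.py may be structured differently:
# Sketch of MCP tool registration with FastMCP; the generator here is a stand-in
from mcp.server.fastmcp import FastMCP

class _StubGenerator:  # placeholder for the shared FluxGenerator
    def generate(self, prompt, steps, guidance_scale, width, height, seed):
        return f"/tmp/demo.png (would generate: {prompt!r})"
    def unload(self):
        pass

generator = _StubGenerator()
mcp = FastMCP("flux")

@mcp.tool()
def generate_image(prompt: str, steps: int = 28, guidance_scale: float = 3.5,
                   width: int = 1024, height: int = 1024, seed: int | None = None) -> str:
    """Generate an image from a text prompt and return the saved file path."""
    return generator.generate(prompt, steps, guidance_scale, width, height, seed)

@mcp.tool()
def unload_model() -> str:
    """Immediately free the FLUX model from GPU memory."""
    generator.unload()
    return "Model unloaded"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default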
Usage Examples
These examples demonstrate using the MCP server with conversational AI clients (like Claude Desktop):
Basic Image Generation
Generate an image: "A majestic dragon flying over a medieval castle"
The server will:
- Load the FLUX model (if not already loaded)
- Generate the image
- Save it to the output directory as YYYYMMDD_HHMMSS_{seed}.png
- Return the file path, seed, and generation settings
- Schedule auto-unload after 5 minutes (default)
Reproducible Generation
To generate the same image again, use the seed from a previous generation:
Generate an image with seed 12345: "A cute robot playing with a kitten"
Custom Parameters
Generate a portrait with steps=40, guidance_scale=7.5, width=768, height=1024:
"Professional headshot of a business executive"
Memory Management
Check current status:
What's the FLUX model status?
Manually unload to free VRAM:
Unload the FLUX model
Adjust auto-unload timeout:
Set FLUX timeout to 10 minutes
How It Works
Auto-Unload Mechanism
- Lazy Loading: The model is NOT loaded when the server starts
- On-Demand Loading: Model loads automatically on first generation request
- Timer Reset: Each generation resets the auto-unload timer
- Automatic Cleanup: After the configured timeout with no activity:
- Model is removed from memory
- GPU cache is cleared (torch.cuda.empty_cache())
- Python garbage collection runs
- Seamless Reload: Model automatically reloads on next request
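The pattern behind this lifecycle is a resettable timer around a lazily created pipeline. A simplified sketch of the idea (the real FluxGenerator adds locking, logging, and error handling; load_flux_pipeline is a hypothetical loader, sketched under Memory Management below):
# Simplified sketch of lazy loading with a resettable auto-unload timer
import threading

class LazyModel:
    def __init__(self, timeout_seconds=300):
        self.timeout = timeout_seconds
        self.pipe = None
        self._timer = None

    def _ensure_loaded(self):
        if self.pipe is None:
            self.pipe = load_flux_pipeline()  # hypothetical; see Memory Management below

    def _reset_timer(self):
        if self._timer is not None:
            self._timer.cancel()              # each request pushes the deadline back
        if self.timeout > 0:                  # a timeout of 0 disables auto-unload
            self._timer = threading.Timer(self.timeout, self.unload)
            self._timer.daemon = True
            self._timer.start()

    def generate(self, prompt, **kwargs):
        self._ensure_loaded()                 # on-demand loading on first request
        image = self.pipe(prompt, **kwargs).images[0]
        self._reset_timer()
        return image

    def unload(self):
        self.pipe = None                      # full cleanup shown in the next section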
Memory Management
The server uses several strategies to minimize VRAM usage:
- bfloat16 precision instead of float32 (saves ~50% VRAM)
- Explicit cache clearing when unloading
- Threading for non-blocking auto-unload
- Lock-based synchronization for thread-safe operation
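Of these, the cleanup sequence deserves a closer look: dropping the reference alone does not promptly return VRAM, because cached CUDA allocations linger. Continuing the LazyModel sketch above (illustrative, not the project's generator.py; the bfloat16 loading side appears in the Troubleshooting sketch below):
# Fleshing out LazyModel.unload from the sketch above
import gc
import torch

class LazyModel:  # continued
    def unload(self):
        self.pipe = None          # drop the reference so the weights become collectible
        gc.collect()              # run Python garbage collection now
        torch.cuda.empty_cache()  # return cached CUDA allocations to the driver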
Output Files
Generated images are saved as:
{FLUX_OUTPUT_DIR}/{timestamp}_{seed}.png
Example: 20250126_143052_42.png
Troubleshooting
CUDA Out of Memory
Problem: Error during generation: "CUDA out of memory"
Note: The generator automatically uses sequential CPU offloading to reduce VRAM usage from ~28GB to ~12GB. This should work on 16GB GPUs like RTX 4070 Ti Super.
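With diffusers, that offloading amounts to a single call on the pipeline at load time. A sketch assuming the standard FluxPipeline API (the generator enables this for you; shown here only for reference):
# Sketch: loading FLUX.1-dev in bfloat16 with sequential CPU offload (diffusers)
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,  # ~50% less VRAM than float32
)
# Streams weights to the GPU layer by layer instead of keeping the whole
# model resident: slower per image, but peak VRAM drops from ~28GB to ~12GB.
pipe.enable_sequential_cpu_offload()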
If you still get OOM errors:
- Close other GPU applications:
# Check what's using VRAM
nvidia-smi
- Reduce image dimensions:
flux generate "prompt" --width 768 --height 768
# Or even smaller
flux generate "prompt" --width 512 --height 512
- Reduce inference steps:
flux generate "prompt" --steps 20  # Default is 28
- Restart the process if VRAM isn't fully freed:
# CLI: Just run again (process exits after generation)
# MCP: Restart your MCP client
Model Download Issues
Problem: Model download fails or times out
Solutions:
- Check internet connection
- Set a custom cache directory with more space:
FLUX_MODEL_CACHE=/path/to/large/disk/cache
- Download manually with the HuggingFace CLI:
huggingface-cli download black-forest-labs/FLUX.1-dev
Server Not Responding
Problem: MCP client doesn't see the tools
Solutions:
- Check your MCP client's logs for errors
- Verify the configuration path is absolute
- Ensure UV is in PATH or use full path to UV binary
- Restart your MCP client after config changes
- Test the server manually:
cd /path/to/flux-mcp
uv run flux-mcp
Slow Generation
Problem: Image generation takes too long
Solutions:
- Reduce the steps parameter (try 20-25 instead of 28)
- Ensure the GPU is being used (check with nvidia-smi)
- Close background applications to free GPU resources
- Check that CUDA is properly installed
Permission Errors
Problem: Cannot write to output directory
Solutions:
- Check directory permissions
- Set a different output directory in
.env:FLUX_OUTPUT_DIR=/home/$USER/flux_output - Create the directory manually:
mkdir -p ~/flux_output chmod 755 ~/flux_output
Advanced Configuration
Custom Model Cache
To share the model cache across multiple projects or save space:
# In .env
FLUX_MODEL_CACHE=/mnt/data/huggingface/cache
Disable Auto-Unload
To keep the model loaded permanently (uses more power, but skips the reload delay on each request):
# In .env
FLUX_UNLOAD_TIMEOUT=0
Or at runtime:
Set FLUX timeout to 0
Logging
The server logs to stderr. To capture logs:
{
"mcpServers": {
"flux": {
"command": "sh",
"args": [
"-c",
"cd /path/to/flux-mcp && uv run flux-mcp 2>> /tmp/flux-mcp.log"
]
}
}
}
Performance Tips
Optimal Settings for RTX 4070 Ti Super (16GB)
- Resolution: Up to 1024x1024 comfortably
- Steps: 25-30 for good quality
- Batch size: 1 (model doesn't support batching well)
- Timeout: 300s for occasional use, 600s for active sessions
Generation Time Expectations
- 1024x1024, 28 steps: ~20-40 seconds (depending on prompt complexity)
- 512x512, 20 steps: ~5-10 seconds
- First generation: +10-15 seconds for model loading
Technical Details
Architecture
flux-mcp/
├── src/flux_mcp/
│ ├── __init__.py # Package metadata
│ ├── config.py # Environment configuration (shared)
│ ├── generator.py # FluxGenerator class (shared)
│ ├── server.py # MCP server (tool handlers)
│ └── cli.py # CLI tool
├── pyproject.toml # Project dependencies
├── .env # Local configuration
└── README.md # This file
Key Components
- FluxGenerator: Manages model lifecycle, threading, and GPU memory (shared between CLI and MCP)
- Config: Loads environment variables and provides defaults (shared)
- MCP Server: Exposes tools via Model Context Protocol for MCP-compatible clients
- CLI Tool: Direct command-line interface for offline usage
Thread Safety
The generator uses a threading lock (threading.Lock) to ensure:
- Only one generation at a time
- Safe model loading/unloading
- No race conditions with auto-unload timer
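A sketch of how that lock composes with the timer from the "How It Works" section (illustrative; the real class also guards loading):
# Sketch: one lock serializes generation and the timer-driven unload
import threading

class LazyModel:  # continued from earlier sketches
    def __init__(self):
        self._lock = threading.Lock()
        self.pipe = None

    def generate(self, prompt):
        with self._lock:              # only one generation at a time
            self._ensure_loaded()
            return self.pipe(prompt).images[0]

    def unload(self):
        with self._lock:              # the timer thread calls this; the lock means
            self.pipe = None          # it can never fire mid-generation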
License
MIT License - see LICENSE file for details
Contributing
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
Support
For issues and questions:
- Check the Troubleshooting section above
- Review server logs for errors
- Open an issue on GitHub
Changelog
v0.1.0 (2025-01-26)
- Initial release
- FLUX.1-dev integration
- Auto-unload functionality (MCP mode)
- Four MCP tools (generate, unload, status, set_timeout)
- CLI tool with interactive mode (flux command)
- Shared architecture between CLI and MCP server
- Comprehensive documentation with CLI and MCP usage examples