MCP Server for Apache Airflow

A Model Context Protocol (MCP) server that provides comprehensive integration with Apache Airflow's REST API. This server allows AI assistants to interact with Airflow workflows, monitor DAG runs, and manage tasks programmatically.

Features

  • DAG Management: List, view details, pause, and unpause DAGs
  • DAG Run Operations: Trigger new runs, list existing runs, and get detailed run information
  • Task Instance Monitoring: View task instances and their execution details
  • Universal Compatibility: Works with popular Airflow hosting platforms, including Astronomer, Google Cloud Composer, and Amazon MWAA
  • Comprehensive Logging: Access and monitor logs for debugging and troubleshooting:
    • Real-time log retrieval for individual tasks
    • Aggregate logs for entire DAG runs
    • Smart log tailing with recent activity summaries
    • Automatic log formatting and decoding

Available Tools

DAG Management

  1. airflow_list_dags - List all DAGs with pagination and sorting
  2. airflow_get_dag - Get detailed information about a specific DAG
  3. airflow_trigger_dag - Trigger a new DAG run with optional configuration
  4. airflow_pause_dag - Pause a DAG
  5. airflow_unpause_dag - Unpause a DAG

DAG Run Monitoring

  1. airflow_list_dag_runs - List DAG runs for a specific DAG
  2. airflow_get_dag_run - Get details of a specific DAG run
  3. airflow_list_task_instances - List task instances for a DAG run
  4. airflow_get_task_instance - Get detailed task instance information

Logging & Debugging

  1. airflow_get_task_logs - Get complete logs for a specific task instance
  2. airflow_get_dag_run_logs - Get logs for all tasks in a DAG run
  3. airflow_tail_dag_run - Tail/monitor a DAG run with recent activity and logs
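
For example, triggering a DAG maps onto MCP's standard tools/call request. A minimal sketch of the JSON-RPC body (the argument names dag_id and conf here mirror Airflow's REST API, but the authoritative schema comes from the server's tools/list response):

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "airflow_trigger_dag",
    "arguments": {
      "dag_id": "daily_etl",
      "conf": { "reprocess_date": "2024-01-15" }
    }
  }
}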

Installation & Deployment

Local Development

Via NPX (Recommended for Claude Desktop)

npx mcp-server-airflow

HTTP Server (Recommended for Cloud Deployment)

npx mcp-server-airflow-http

From Source

git clone https://github.com/tomnagengast/mcp-server-airflow.git
cd mcp-server-airflow
npm install
npm run build

# For stdio mode (Claude Desktop)
npm start

# For HTTP mode (cloud deployment)
npm run start:http

Cloud Deployment (Recommended)

This server supports streamable HTTP transport, which is the current best practice for MCP servers. Deploy to your preferred cloud platform:

Quick Deploy

npm run deploy

This interactive script will guide you through deploying to:

  • Google Cloud Platform (Cloud Run)
  • Amazon Web Services (ECS Fargate)
  • DigitalOcean App Platform
  • Netlify (Serverless Functions)

Manual Deployment Options

<details> <summary>🌐 Google Cloud Platform (Cloud Run)</summary>

# Build and push to Container Registry
gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/mcp-server-airflow

# Create secrets
echo "https://your-airflow-instance.com" | gcloud secrets create airflow-base-url --data-file=-
echo "your_token_here" | gcloud secrets create airflow-token --data-file=-

# Deploy to Cloud Run
gcloud run deploy mcp-server-airflow \
  --image gcr.io/YOUR_PROJECT_ID/mcp-server-airflow \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --port 3000 \
  --memory 512Mi \
  --set-secrets AIRFLOW_BASE_URL=airflow-base-url:latest,AIRFLOW_TOKEN=airflow-token:latest

</details>

<details> <summary>☁️ Amazon Web Services (ECS Fargate)</summary>

# Create ECR repository
aws ecr create-repository --repository-name mcp-server-airflow

# Build and push image
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
docker build -t mcp-server-airflow .
docker tag mcp-server-airflow:latest YOUR_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/mcp-server-airflow:latest
docker push YOUR_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/mcp-server-airflow:latest

# Create secrets in Secrets Manager
aws secretsmanager create-secret --name airflow-config --secret-string '{"base_url":"https://your-airflow-instance.com","token":"your_token_here"}'

# Register task definition and create service (use provided template)
aws ecs register-task-definition --cli-input-json file://deploy/aws-ecs-fargate.json

</details>

<details> <summary>🌊 DigitalOcean App Platform</summary>

  1. Fork this repository to your GitHub account
  2. Create a new app in DigitalOcean App Platform
  3. Connect your forked repository
  4. Use the provided app spec: deploy/digitalocean-app.yaml
  5. Set environment variables in the dashboard:
    • AIRFLOW_BASE_URL
    • AIRFLOW_TOKEN (or AIRFLOW_USERNAME and AIRFLOW_PASSWORD)

</details>

<details> <summary>⚡ Netlify (Serverless Functions)</summary>

Netlify offers serverless deployment with built-in CI/CD and a global CDN.

Quick Deploy

# Interactive deployment script (includes environment setup)
node scripts/deploy-netlify.js

# Or manage environment variables separately
npm run env:netlify

Manual Deployment

# Install Netlify CLI
npm install -g netlify-cli

# Authenticate with Netlify
netlify login

# Build for Netlify
npm run build:netlify

# Initialize site (first time only)
netlify init

# Deploy to production
netlify deploy --prod

Environment Variables

Option 1: Using Netlify CLI (Recommended)

# Interactive environment setup
npm run env:netlify

# Or manually set variables
netlify env:set AIRFLOW_BASE_URL "https://your-airflow-instance.com"
netlify env:set AIRFLOW_TOKEN "your_api_token"

# For basic auth instead of token
netlify env:set AIRFLOW_USERNAME "your_username"
netlify env:set AIRFLOW_PASSWORD "your_password"

# List current variables
netlify env:list

Option 2: Netlify Dashboard

Set these in your Netlify site dashboard (Site settings → Environment variables):

  • AIRFLOW_BASE_URL: Your Airflow instance URL
  • AIRFLOW_TOKEN: Your Airflow API token (recommended)

Or for basic auth:

  • AIRFLOW_USERNAME: Your Airflow username
  • AIRFLOW_PASSWORD: Your Airflow password

Local Development

# Install dependencies
npm install

# Start local Netlify development server
npm run dev:netlify

Your MCP server will be available at http://localhost:8888/.netlify/functions/mcp
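
To verify the local function is responding, you can send any standard MCP request to that endpoint, e.g. a quick tools/list check:

curl -X POST http://localhost:8888/.netlify/functions/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'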

</details>

Docker Deployment

# Build image
npm run docker:build

# Run with environment file
npm run docker:run

# Or with docker-compose
docker-compose up
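
If you prefer to write the compose file yourself, a minimal sketch is below; it assumes the container listens on port 3000 (matching the Cloud Run example above) and reads the standard environment variables, so check the repository's own docker-compose.yml for the authoritative version:

services:
  mcp-server-airflow:
    build: .
    ports:
      - "3000:3000"        # HTTP MCP endpoint
    environment:
      AIRFLOW_BASE_URL: ${AIRFLOW_BASE_URL}
      AIRFLOW_TOKEN: ${AIRFLOW_TOKEN}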

Configuration

The server requires authentication configuration through environment variables:

Option 1: API Token (Recommended)

export AIRFLOW_BASE_URL="https://your-airflow-instance.com"
export AIRFLOW_TOKEN="your_api_token_here"

Option 2: Basic Authentication

export AIRFLOW_BASE_URL="https://your-airflow-instance.com"
export AIRFLOW_USERNAME="your_username"
export AIRFLOW_PASSWORD="your_password"

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| AIRFLOW_BASE_URL | Yes | Base URL of your Airflow instance |
| AIRFLOW_TOKEN | No* | API token for authentication |
| AIRFLOW_USERNAME | No* | Username for basic auth |
| AIRFLOW_PASSWORD | No* | Password for basic auth |

*Either AIRFLOW_TOKEN or both AIRFLOW_USERNAME and AIRFLOW_PASSWORD must be provided.
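
For the Docker workflows above (npm run docker:run and docker-compose), these variables are typically supplied through an environment file rather than exported inline. A sketch of such a .env file, with placeholder values:

# .env: keep this file out of version control
AIRFLOW_BASE_URL=https://your-airflow-instance.com
AIRFLOW_TOKEN=your_api_token_here
# Or, for basic auth instead of a token:
# AIRFLOW_USERNAME=your_username
# AIRFLOW_PASSWORD=your_password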

Platform-Specific Setup

Astronomer

export AIRFLOW_BASE_URL="https://your-deployment.astronomer.io"
export AIRFLOW_TOKEN="your_astronomer_api_token"

Google Cloud Composer

export AIRFLOW_BASE_URL="https://your-composer-environment-web-server-url"
export AIRFLOW_TOKEN="your_gcp_access_token"
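
A short-lived access token for Composer can usually be minted with the gcloud CLI; a sketch (these tokens typically expire after about an hour, so long-running sessions will need a refresh):

export AIRFLOW_TOKEN="$(gcloud auth print-access-token)"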

Amazon MWAA

export AIRFLOW_BASE_URL="https://your-environment-name.airflow.region.amazonaws.com"
# Use AWS credentials with appropriate IAM permissions
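
One way to obtain a short-lived token for the MWAA web server is the AWS CLI's create-web-login-token call; treat this as a starting point, since how you exchange it for API access depends on your IAM configuration:

aws mwaa create-web-login-token --name your-environment-name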

Testing

Local Testing

Test both stdio and HTTP modes:

# Set required environment variables
export AIRFLOW_BASE_URL="https://your-airflow-instance.com"
export AIRFLOW_TOKEN="your_api_token_here"

# Run comprehensive local tests
npm run test:local

HTTP API Testing

Once deployed, test your HTTP endpoint:

# Health check
curl https://your-deployed-url/health

# MCP initialization
curl -X POST https://your-deployed-url/ \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test", "version": "1.0.0"}}}'

# List available tools
curl -X POST https://your-deployed-url/ \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": 2, "method": "tools/list"}'

Claude Desktop Integration

Stdio Mode (Local Development)

Add this to your Claude Desktop MCP settings:

{
  "mcpServers": {
    "airflow": {
      "command": "npx",
      "args": ["mcp-server-airflow"],
      "env": {
        "AIRFLOW_BASE_URL": "https://your-airflow-instance.com",
        "AIRFLOW_TOKEN": "your_api_token_here"
      }
    }
  }
}

HTTP Mode (Cloud Deployment)

For streamable HTTP transport, configure Claude to use your deployed endpoint:

{
  "mcpServers": {
    "airflow": {
      "transport": {
        "type": "http",
        "url": "https://your-deployed-url"
      }
    }
  }
}

Platform-specific endpoints:

  • Netlify: https://your-site.netlify.app/mcp
  • Google Cloud Run: https://your-service-url.run.app/
  • AWS/DigitalOcean: https://your-deployed-url/

Usage Examples

Once connected, you can use natural language to interact with Airflow:

DAG Management

  • "List all my DAGs"
  • "Show me the details of the data_pipeline DAG"
  • "Trigger the daily_etl DAG with custom configuration"
  • "Pause the problematic_dag DAG"

Monitoring & Status

  • "What's the status of the latest run for my_workflow?"
  • "Show me all failed task instances from the last run"
  • "List all DAG runs for my_data_pipeline from today"

Logging & Debugging

  • "Show me the logs for the extract_data task in run daily_etl_2024_01_15"
  • "Get all logs for the failed DAG run daily_etl_2024_01_15"
  • "Tail the current DAG run and show me what's happening"
  • "Show me the recent activity for the running data_pipeline"

Advanced Examples

  • "Get logs for task 'transform_data' in DAG 'etl_pipeline' run 'manual_2024_01_15', try number 2"
  • "Monitor the DAG run 'scheduled_2024_01_15' and show the last 100 log lines for each task"
  • "Show me logs for the first 5 tasks in the failed DAG run"

Authentication Requirements

This server uses Airflow's stable REST API (v1), which requires authentication. The API supports:

  • Bearer Token Authentication: Most secure, recommended for production
  • Basic Authentication: Username/password, useful for development
  • Session Authentication: Handled automatically when using web-based tokens
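
Both token and basic auth translate to standard Authorization headers. A quick way to verify credentials against Airflow's stable API before wiring them into the server (a sketch using the /api/v1/dags endpoint):

# Bearer token
curl -H "Authorization: Bearer $AIRFLOW_TOKEN" "$AIRFLOW_BASE_URL/api/v1/dags?limit=1"

# Basic authentication
curl -u "$AIRFLOW_USERNAME:$AIRFLOW_PASSWORD" "$AIRFLOW_BASE_URL/api/v1/dags?limit=1"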

Security Considerations

  • Store credentials securely and never commit them to version control
  • Use environment variables or secure secret management systems
  • For production deployments, prefer API tokens over username/password
  • Ensure your Airflow instance has proper network security (TLS, VPC, etc.)
  • Apply appropriate rate limiting and monitoring
  • Use HTTPS endpoints for production deployments
  • Implement proper authentication and authorization at the load balancer/gateway level

Performance & Scaling

HTTP Mode Benefits

  • Stateless: Each request is independent, allowing horizontal scaling
  • Caching: Responses can be cached at the CDN/proxy level
  • Load Balancing: Multiple instances can handle requests
  • Monitoring: Standard HTTP monitoring tools work out of the box
  • Debugging: Easy to test and debug with standard HTTP tools

Recommended Production Setup

  • Auto-scaling: Configure your cloud platform to scale based on CPU/memory usage
  • Health Checks: Use the /health endpoint for load balancer health checks
  • Monitoring: Set up logging and metrics collection
  • Caching: Consider caching frequently accessed DAG information
  • Rate Limiting: Implement rate limiting to protect your Airflow instance

API Compatibility

This server is compatible with the Apache Airflow 2.x REST API. It has been tested with:

  • Apache Airflow 2.7+
  • Astronomer Software and Cloud
  • Google Cloud Composer 2
  • Amazon MWAA (all supported Airflow versions)

Development

# Clone the repository
git clone https://github.com/tomnagengast/mcp-server-airflow.git
cd mcp-server-airflow

# Install dependencies
npm install

# Build the project
npm run build

# Run in development mode
npm run dev

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.
