Prometheus MCP Server

Enables AI assistants to execute PromQL queries and discover metrics within AWS Managed Prometheus (AMP) using SigV4 authentication. It provides tools for instant and range queries, label management, and metric discovery in secure, VPC-isolated environments.

An MCP (Model Context Protocol) server for querying AWS Managed Prometheus (AMP) with SigV4 authentication. This server enables AI assistants to execute PromQL queries and discover metrics in a secure, VPC-isolated environment.

Features

  • SigV4 Authentication: Automatically signs requests using AWS credentials (supports EKS Pod Identity/IRSA)
  • 5 MCP Tools:
    • query_instant - Execute instant PromQL queries
    • query_range - Execute range queries for time series data
    • list_labels - Get all label names
    • get_label_values - Get values for a specific label
    • list_metrics - Get all metric names with optional metadata
  • VPC Isolation: Designed to run inside a VPC with no public exposure
  • Production Ready: Includes Terraform configuration, Kubernetes manifests, and a pytest test suite
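
To make the SigV4 feature less opaque, here is a minimal sketch of the signing-key derivation that Signature Version 4 defines, using only the Python standard library. It is not taken from this project's `amp_client.py`, which may well delegate signing to an AWS SDK. The function names and the placeholder secret are illustrative assumptions; `aps` is the service name that AMP requests are signed for.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "aps") -> bytes:
    """Derive the SigV4 signing key.

    The chain is: "AWS4" + secret -> date (YYYYMMDD) -> region -> service
    -> the literal string "aws4_request".
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


if __name__ == "__main__":
    # "EXAMPLE-SECRET" is a placeholder, not a real credential.
    key = derive_signing_key("EXAMPLE-SECRET", "20240115", "us-east-1")
    print(key.hex())
```

The derived key is scoped to one date, region, and service, which is why a signature made for one region cannot be replayed against another.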

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Customer VPC                            │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      EKS Cluster                          │  │
│  │  ┌─────────────────┐    ┌─────────────────────────────┐   │  │
│  │  │  Prometheus MCP │◄───│  ClusterIP Service :8080    │   │  │
│  │  │  (Pod Identity) │    │  (Internal Only)            │   │  │
│  │  └────────┬────────┘    └─────────────────────────────┘   │  │
│  │           │                                                │  │
│  └───────────┼────────────────────────────────────────────────┘  │
│              │ SigV4 Signed HTTP                                │
│              ▼                                                   │
│  ┌───────────────────────┐                                      │
│  │ AWS Managed Prometheus│                                      │
│  └───────────────────────┘                                      │
└─────────────────────────────────────────────────────────────────┘

Quick Start

Prerequisites

  • Python 3.11+
  • AWS CLI configured with credentials
  • Docker (for building container images)
  • Terraform 1.5+ (for infrastructure deployment)
  • kubectl (for Kubernetes deployment)
  • An EC2 key pair in AWS (used for SSH access to the bastion host)

Local Development

# From your clone of the repository, install in editable mode with dev extras
cd prometheus-mcp
pip install -e ".[dev]"

# Set environment variables
export PROMETHEUS_WORKSPACE_ID="ws-your-workspace-id"
export AWS_REGION="us-east-1"

# Run the server
python -m prometheus_mcp.server

Run Tests

# Install dev dependencies
pip install -e ".[dev]"

# Run unit tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=prometheus_mcp --cov-report=html

Production Deployment

Step 1: Deploy Infrastructure with Terraform

cd deploy/terraform

# Initialize Terraform
terraform init

# Review the plan
terraform plan -var="ssh_key_name=your-key-name"

# Apply (creates VPC, EKS, AMP, ECR, Bastion)
terraform apply -var="ssh_key_name=your-key-name"

Resources Created:

  • VPC with public/private subnets
  • EKS cluster with managed node group
  • AWS Managed Prometheus workspace
  • ECR repository
  • Bastion host for SSH access
  • IAM roles with Pod Identity

Step 2: Build and Push Docker Image

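# Run from deploy/terraform so that `terraform output` can read the state.
# eval is used so the login command still works if it contains a pipe.
eval "$(terraform output -raw docker_login_command)"
ECR_URL=$(terraform output -raw ecr_repository_url)

# Build the image (the Dockerfile is in the repository root)
docker build -t prometheus-mcp ../..

# Tag and push
docker tag prometheus-mcp:latest "$ECR_URL:latest"
docker push "$ECR_URL:latest"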

Step 3: Deploy to Kubernetes

# Update kubeconfig (run from deploy/terraform)
eval "$(terraform output -raw kubeconfig_command)"

# Get values for K8s manifests
export ECR_URL=$(terraform output -raw ecr_repository_url)
export WORKSPACE_ID=$(terraform output -raw amp_workspace_id)
export ROLE_ARN=$(terraform output -raw prometheus_mcp_role_arn)

# Return to the repository root so the manifest paths below resolve
cd ../..

# Update manifests with actual values
# (BSD/macOS sed syntax; with GNU sed on Linux, use `sed -i` without the '')
sed -i '' "s|\${ECR_URL}|$ECR_URL|g" deploy/k8s/deployment.yaml
sed -i '' "s|\${WORKSPACE_ID}|$WORKSPACE_ID|g" deploy/k8s/deployment.yaml
sed -i '' "s|\${PROMETHEUS_MCP_ROLE_ARN}|$ROLE_ARN|g" deploy/k8s/service-account.yaml

# Apply manifests
kubectl apply -f deploy/k8s/
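
The `sed -i ''` form used above is BSD/macOS syntax. As a portable alternative, the same placeholder substitution can be sketched in Python with `string.Template`, which understands the `${NAME}` syntax the manifests use. The helper name `render_manifest` is hypothetical, and the sample text is a made-up fragment, not the project's actual `deployment.yaml`.

```python
from string import Template


def render_manifest(text: str, values: dict) -> str:
    """Replace ${NAME} placeholders with the given values.

    safe_substitute leaves unknown placeholders untouched instead of raising,
    so a manifest can be rendered in several passes.
    """
    return Template(text).safe_substitute(values)


if __name__ == "__main__":
    # Invented sample fragment and values, for illustration only.
    sample = "image: ${ECR_URL}:latest\nworkspace: ${WORKSPACE_ID}\n"
    print(render_manifest(sample, {"ECR_URL": "example-registry/prometheus-mcp",
                                   "WORKSPACE_ID": "ws-example"}))
```

Unlike the in-place `sed` edit, this returns the rendered text, so the template files in `deploy/k8s/` can stay unmodified in version control.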

Step 4: Verify Deployment

# Check pods are running
kubectl get pods -n prometheus-mcp

# Check logs
kubectl logs -n prometheus-mcp -l app=prometheus-mcp

Testing via SSH Tunnel

Since the MCP server is only accessible within the VPC, use SSH tunneling to test from your laptop.

Method 1: Manual Setup (3 Terminals)

Terminal 1 - SSH to Bastion and Port-Forward:

# SSH to bastion
ssh -i ~/.ssh/your-key.pem ec2-user@<BASTION_IP>

# On the bastion, set up kubectl
~/setup-kubectl.sh

# Start port-forward
~/port-forward-mcp.sh

Terminal 2 - SSH Tunnel:

ssh -i ~/.ssh/your-key.pem -L 8080:localhost:8080 ec2-user@<BASTION_IP>

Terminal 3 - MCP Inspector:

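# The MCP Inspector is published on npm as @modelcontextprotocol/inspector
npx @modelcontextprotocol/inspector
# Then, in the Inspector UI, connect to http://localhost:8080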

Method 2: Using the Test Script

cd deploy/scripts
./test-via-tunnel.sh

This script will display all the commands you need to run.

Verify VPC Isolation

# Get the pod IP
kubectl get pod -n prometheus-mcp -o jsonpath='{.items[0].status.podIP}'

# From your laptop, without the tunnel, this should FAIL (timeout), which demonstrates VPC isolation
curl http://<POD_IP>:8080/health --connect-timeout 5
# Expected: Connection timed out

# This should SUCCEED (via SSH tunnel)
curl http://localhost:8080/health
# Expected: {"status": "healthy"}

MCP Tools Reference

query_instant

Execute an instant PromQL query at a single point in time.

{
  "name": "query_instant",
  "arguments": {
    "query": "up",
    "time": "2024-01-15T10:00:00Z"  // optional
  }
}
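
For orientation, an instant query corresponds to the standard Prometheus HTTP API endpoint `/api/v1/query`, which takes `query` and an optional `time` parameter. Below is a minimal sketch of how the tool arguments could be turned into a query string, assuming the server forwards them unchanged. The helper `instant_query_params` is hypothetical and not taken from the project's `tools.py`.

```python
from typing import Optional
from urllib.parse import urlencode


def instant_query_params(query: str, time: Optional[str] = None) -> str:
    """Build the URL-encoded parameters for /api/v1/query.

    `time` is omitted when not given, in which case Prometheus evaluates
    the query at the current server time.
    """
    params = {"query": query}
    if time is not None:
        params["time"] = time
    return urlencode(params)


if __name__ == "__main__":
    print(instant_query_params("up", "2024-01-15T10:00:00Z"))
    # query=up&time=2024-01-15T10%3A00%3A00Z
```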

query_range

Execute a range query to get time series data.

{
  "name": "query_range",
  "arguments": {
    "query": "rate(http_requests_total[5m])",
    "start": "2024-01-15T00:00:00Z",
    "end": "2024-01-15T12:00:00Z",
    "step": "1m"
  }
}
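
When choosing `step`, it helps to know how many samples per series a range query will return, since upstream Prometheus rejects range queries that would exceed 11,000 points per series. The sketch below estimates that count for the arguments shown above. The helper names are hypothetical, and the duration parser handles only single-unit values such as `30s`, `1m`, or `2h`.

```python
from datetime import datetime

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def step_seconds(step: str) -> int:
    """Convert a single-unit duration such as '30s', '1m' or '2h' to seconds."""
    return int(step[:-1]) * _UNIT_SECONDS[step[-1]]


def points_per_series(start: str, end: str, step: str) -> int:
    """Samples per series: one per step, with both endpoints included."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    t0 = datetime.fromisoformat(start.replace("Z", "+00:00"))
    t1 = datetime.fromisoformat(end.replace("Z", "+00:00"))
    return int((t1 - t0).total_seconds()) // step_seconds(step) + 1


if __name__ == "__main__":
    # 12 hours at a 1-minute step: 43200 / 60 + 1
    print(points_per_series("2024-01-15T00:00:00Z", "2024-01-15T12:00:00Z", "1m"))  # 721
```

If the estimate is close to the limit, a larger `step` or a shorter time window keeps the query within bounds.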

list_labels

Get all label names, optionally restricted to series that match the given selectors.

{
  "name": "list_labels",
  "arguments": {
    "match": ["up", "http_requests_total"]  // optional
  }
}

get_label_values

Get all values for a specific label, optionally restricted to series that match the given selectors.

{
  "name": "get_label_values",
  "arguments": {
    "label_name": "job",
    "match": ["{namespace='production'}"]  // optional
  }
}
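
In the Prometheus HTTP API, label names come from `/api/v1/labels` and label values from `/api/v1/label/<name>/values`; both accept repeated `match[]` parameters. The following sketch shows how the two label tools could build those request paths, assuming they map directly onto the upstream endpoints. The function names are hypothetical.

```python
from typing import List, Optional
from urllib.parse import quote, urlencode


def _with_match(path: str, match: Optional[List[str]]) -> str:
    """Append one match[] parameter per series selector, if any are given."""
    if match:
        path += "?" + urlencode([("match[]", selector) for selector in match])
    return path


def labels_path(match: Optional[List[str]] = None) -> str:
    """Request path for listing label names."""
    return _with_match("/api/v1/labels", match)


def label_values_path(label_name: str, match: Optional[List[str]] = None) -> str:
    """Request path for listing the values of one label."""
    return _with_match(f"/api/v1/label/{quote(label_name, safe='')}/values", match)


if __name__ == "__main__":
    print(labels_path(["up", "http_requests_total"]))
    print(label_values_path("job", ["{namespace='production'}"]))
```

Encoding the selectors matters here because characters such as `{`, `=`, and quotes are not valid in a raw query string.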

list_metrics

Get all metric names.

{
  "name": "list_metrics",
  "arguments": {
    "with_metadata": true  // optional, slower but includes type/help/unit
  }
}
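
Upstream Prometheus exposes metric names through `/api/v1/label/__name__/values` and per-metric metadata through `/api/v1/metadata`, where each metric maps to a list of entries with `type`, `help`, and `unit`. Assuming `list_metrics` combines the two when `with_metadata` is set (an assumption; see `tools.py` for the actual behavior), the merge could look like the sketch below. The function name and the sample payloads are invented for illustration.

```python
from typing import Dict, List


def merge_metadata(names: List[str], metadata: Dict[str, List[dict]]) -> List[dict]:
    """Attach the first metadata entry to each metric name.

    Metrics without metadata get a type of "unknown" and empty help/unit.
    """
    merged = []
    for name in sorted(names):
        entries = metadata.get(name) or [{}]
        first = entries[0]
        merged.append({
            "name": name,
            "type": first.get("type", "unknown"),
            "help": first.get("help", ""),
            "unit": first.get("unit", ""),
        })
    return merged


if __name__ == "__main__":
    sample_names = ["up", "http_requests_total"]
    sample_metadata = {
        "http_requests_total": [{"type": "counter", "help": "Total requests", "unit": ""}],
    }
    for row in merge_metadata(sample_names, sample_metadata):
        print(row)
```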

Configuration

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| PROMETHEUS_WORKSPACE_ID | AMP workspace ID (required) | - |
| AWS_REGION | AWS region | us-east-1 |
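
As a minimal sketch of how these two variables could be consumed, the function below reads them and builds the AMP workspace base URL, following the standard AMP endpoint pattern `https://aps-workspaces.<region>.amazonaws.com/workspaces/<workspace-id>`. The function name `amp_base_url` is hypothetical; the project's own configuration handling may differ.

```python
import os
from typing import Mapping, Optional


def amp_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the AMP workspace base URL from environment-style settings.

    PROMETHEUS_WORKSPACE_ID is required; AWS_REGION defaults to us-east-1,
    matching the defaults listed above.
    """
    env = os.environ if env is None else env
    workspace_id = env.get("PROMETHEUS_WORKSPACE_ID")
    if not workspace_id:
        raise ValueError("PROMETHEUS_WORKSPACE_ID is required")
    region = env.get("AWS_REGION", "us-east-1")
    return f"https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace_id}"


if __name__ == "__main__":
    # "ws-example" is a placeholder workspace ID.
    print(amp_base_url({"PROMETHEUS_WORKSPACE_ID": "ws-example"}))
```

Query paths such as `/api/v1/query` are appended to this base URL.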

Terraform Variables

| Variable | Description | Default |
| --- | --- | --- |
| aws_region | AWS region | us-east-1 |
| ssh_key_name | SSH key pair name (required) | - |
| allowed_ssh_cidr | CIDR for SSH access | 0.0.0.0/0 |
| eks_cluster_version | Kubernetes version | 1.29 |
| eks_node_instance_type | EKS node instance type | t3.medium |
| bastion_instance_type | Bastion instance type | t3.micro |

Project Structure

prometheus-mcp/
├── prometheus_mcp/
│   ├── __init__.py
│   ├── server.py              # MCP server entry point
│   ├── amp_client.py          # SigV4-authenticated AMP client
│   └── promql/
│       ├── __init__.py
│       ├── models.py          # Pydantic response models
│       └── tools.py           # MCP tool implementations
├── deploy/
│   ├── terraform/             # Infrastructure as Code
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── vpc.tf
│   │   ├── eks.tf
│   │   ├── amp.tf
│   │   ├── ecr.tf
│   │   ├── iam.tf
│   │   └── bastion.tf
│   ├── k8s/                   # Kubernetes manifests
│   │   ├── namespace.yaml
│   │   ├── service-account.yaml
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   └── scripts/
│       ├── push-sample-metrics.sh
│       ├── test-via-tunnel.sh
│       └── cleanup.sh
├── tests/
│   ├── conftest.py
│   ├── test_amp_client.py
│   └── test_promql_tools.py
├── Dockerfile
├── pyproject.toml
└── README.md

Cleanup

To destroy all resources:

cd deploy/scripts
./cleanup.sh

Or manually:

# Delete K8s resources
kubectl delete -f deploy/k8s/

# Destroy Terraform infrastructure
cd deploy/terraform
terraform destroy

Security Considerations

  1. VPC Isolation: The MCP server is exposed only through a ClusterIP service, so it is reachable only from inside the EKS cluster
  2. Pod Identity / IRSA: The pod obtains AWS credentials from an IAM role tied to its Kubernetes service account instead of static credentials
  3. Least Privilege: The IAM role grants permission to query AMP only, not to write to it
  4. No Public Endpoints: The MCP server has no public endpoint; external access goes through an SSH tunnel via the bastion host
  5. Container Security: Runs as a non-root user with a read-only filesystem

Troubleshooting

Pod not starting

kubectl describe pod -n prometheus-mcp -l app=prometheus-mcp
kubectl logs -n prometheus-mcp -l app=prometheus-mcp

IAM permissions issues

# Verify the service account has the correct annotation
kubectl get sa -n prometheus-mcp prometheus-mcp -o yaml

# Check if Pod Identity is working (requires the AWS CLI to be present in the container image)
kubectl exec -n prometheus-mcp -it <pod-name> -- aws sts get-caller-identity

Cannot connect via SSH tunnel

# Verify bastion is running
aws ec2 describe-instances --filters "Name=tag:Name,Values=prometheus-mcp-bastion" --query "Reservations[].Instances[].State.Name"

# Check security group allows SSH
aws ec2 describe-security-groups --group-ids <bastion-sg-id>

License

MIT
