Prometheus MCP Server

Enables AI assistants to query Prometheus metrics, monitor alerts, and analyze system health through read-only access to your Prometheus server with built-in query safety and optional AI-powered metric analysis.

prometheus-mcp

A Model Context Protocol (MCP) server for Prometheus integration. Give your AI assistant eyes on your metrics and alerts.

Status: Planning
Author: Claude (claude@arktechnwa.com) + Meldrey
License: MIT
Organization: ArktechNWA


Why?

Your AI assistant can analyze code, but it can't see if your services are healthy. It can suggest optimizations, but can't see the actual latency metrics. It's blind to the alerts firing at 3am.

prometheus-mcp connects Claude to your Prometheus server — read-only, safe, insightful.


Philosophy

  1. Read-only by design — Prometheus queries don't mutate state
  2. Query safety — Timeout expensive queries, limit cardinality
  3. Never hang — PromQL can be expensive, always timeout
  4. Structured output — Metrics + human summaries
  5. Fallback AI — Haiku for anomaly detection and query help

Features

Perception (Read)

  • Instant queries (current values)
  • Range queries (over time)
  • Alert status and history
  • Target health
  • Recording rules and alerts
  • Label discovery
  • Metric metadata

Analysis (AI-Assisted)

  • "Is this metric normal?"
  • "What caused this spike?"
  • "Suggest a query for X"
  • Anomaly detection

Permission Model

Prometheus is inherently read-only for queries. Permissions focus on:

Level    Description                  Default
query    Run PromQL queries           ON
alerts   View alert status            ON
admin    View config, reload rules    OFF

Query Safety

{
  "query_limits": {
    "max_duration": "30s",
    "max_resolution": "10000",
    "max_series": 1000,
    "blocked_metrics": [
      "__.*",
      "secret_.*"
    ]
  }
}

Safety features:

  • Query timeout enforcement
  • Cardinality limits
  • Metric blacklist patterns
  • Rate limiting
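
The blocklist patterns above could be enforced with a simple check before a query ever reaches Prometheus. A minimal sketch, assuming anchored regex patterns and an illustrative helper name (`isQueryAllowed` is not the server's actual API):

```typescript
// Patterns from the blocked_metrics config, anchored so they match
// whole metric names only.
const blockedMetrics: RegExp[] = ["__.*", "secret_.*"].map(
  (p) => new RegExp(`^${p}$`)
);

// Extract candidate identifiers from a PromQL string and reject the
// query if any of them matches a blocked pattern.
function isQueryAllowed(query: string): boolean {
  const names = query.match(/[a-zA-Z_:][a-zA-Z0-9_:]*/g) ?? [];
  return !names.some((name) => blockedMetrics.some((re) => re.test(name)));
}
```

This errs on the side of caution: any identifier in the query (including label names) is tested, so a blocked pattern can never slip through inside a function call or selector.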

Authentication

{
  "prometheus": {
    "url": "http://localhost:9090",
    "auth": {
      "type": "none" | "basic" | "bearer",
      "username_env": "PROM_USER",
      "password_env": "PROM_PASS",
      "token_env": "PROM_TOKEN"
    }
  }
}
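
The env-var indirection above keeps credentials out of the config file. A hypothetical sketch of how that auth block might translate into HTTP headers (type and function names are illustrative, not the server's actual API):

```typescript
type AuthConfig =
  | { type: "none" }
  | { type: "basic"; username_env: string; password_env: string }
  | { type: "bearer"; token_env: string };

function buildAuthHeaders(auth: AuthConfig): Record<string, string> {
  switch (auth.type) {
    case "none":
      return {};
    case "basic": {
      const user = process.env[auth.username_env] ?? "";
      const pass = process.env[auth.password_env] ?? "";
      // HTTP Basic auth is base64("user:pass")
      const encoded = Buffer.from(`${user}:${pass}`).toString("base64");
      return { Authorization: `Basic ${encoded}` };
    }
    case "bearer":
      return { Authorization: `Bearer ${process.env[auth.token_env] ?? ""}` };
  }
}
```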

Tools

Queries

prom_query

Execute instant query (current values).

prom_query({
  query: string,            // PromQL expression
  time?: string             // evaluation time (default: now)
})

Returns:

{
  "query": "up{job=\"api\"}",
  "result_type": "vector",
  "results": [
    {
      "metric": {"job": "api", "instance": "api-1:8080"},
      "value": 1,
      "timestamp": "2025-12-29T10:30:00Z"
    }
  ],
  "summary": "3 of 3 api instances are up"
}

prom_query_range

Execute range query (over time).

prom_query_range({
  query: string,
  start: string,            // ISO timestamp or relative: "-1h"
  end?: string,             // default: now
  step?: string             // resolution: "15s", "1m", "5m"
})
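
The relative shorthand accepted for `start` ("-1h", "-30m") can be resolved against the current time before the request is built. A sketch of one way to do it (the parser name and exact unit set are assumptions, not the server's actual implementation):

```typescript
// Resolve "-1h"-style relative times to an absolute Date; anything
// that doesn't match the shorthand is treated as an ISO timestamp.
function resolveTime(value: string, now: Date = new Date()): Date {
  const m = value.match(/^-(\d+)([smhd])$/);
  if (!m) return new Date(value);
  const unitMs: Record<string, number> = {
    s: 1_000,
    m: 60_000,
    h: 3_600_000,
    d: 86_400_000,
  };
  return new Date(now.getTime() - Number(m[1]) * unitMs[m[2]]);
}
```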

Returns:

{
  "query": "rate(http_requests_total[5m])",
  "result_type": "matrix",
  "results": [
    {
      "metric": {"handler": "/api/users"},
      "values": [[1735470600, "123.45"], ...],
      "stats": {
        "min": 100.2,
        "max": 456.7,
        "avg": 234.5,
        "current": 345.6
      }
    }
  ],
  "summary": "Request rate ranged from 100-457 req/s over the last hour, currently 346 req/s"
}
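
The per-series `stats` block above can be derived directly from Prometheus matrix values, which arrive as `[timestamp, "value"]` pairs with the value as a string. A minimal sketch (the helper name is illustrative):

```typescript
// Summarize one series of a matrix result into min/max/avg/current.
function seriesStats(values: [number, string][]) {
  const nums = values.map(([, v]) => Number(v));
  return {
    min: Math.min(...nums),
    max: Math.max(...nums),
    avg: nums.reduce((a, b) => a + b, 0) / nums.length,
    current: nums[nums.length - 1], // last sample in the range
  };
}
```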

prom_series

Find series matching label selectors.

prom_series({
  match: string[],          // label matchers
  start?: string,
  end?: string,
  limit?: number
})

prom_labels

Get label names or values.

prom_labels({
  label?: string,           // get values for this label (omit for label names)
  match?: string[],         // filter by series
  limit?: number
})

Alerts

prom_alerts

Get current alert status.

prom_alerts({
  state?: "firing" | "pending" | "inactive",
  filter?: string           // alert name pattern
})

Returns:

{
  "alerts": [
    {
      "name": "HighErrorRate",
      "state": "firing",
      "severity": "critical",
      "summary": "Error rate > 5% for api service",
      "started_at": "2025-12-29T10:15:00Z",
      "duration": "15m",
      "labels": {"job": "api", "severity": "critical"},
      "annotations": {"summary": "..."}
    }
  ],
  "summary": "1 critical, 0 warning alerts firing"
}

prom_rules

Get alerting and recording rules.

prom_rules({
  type?: "alert" | "record",
  filter?: string
})

Targets

prom_targets

Get scrape target health.

prom_targets({
  state?: "active" | "dropped",
  job?: string
})

Returns:

{
  "targets": [
    {
      "job": "api",
      "instance": "api-1:8080",
      "health": "up",
      "last_scrape": "2025-12-29T10:29:45Z",
      "scrape_duration": "0.023s",
      "error": null
    }
  ],
  "summary": "12 of 12 targets healthy"
}

Discovery

prom_metadata

Get metric metadata (help, type, unit).

prom_metadata({
  metric?: string,          // specific metric (omit for all)
  limit?: number
})

Analysis

prom_analyze

AI-powered metric analysis.

prom_analyze({
  query: string,
  question?: string,        // "Is this normal?", "What caused the spike?"
  use_ai?: boolean
})

Returns:

{
  "query": "rate(http_errors_total[5m])",
  "data_summary": {
    "current": 12.3,
    "1h_ago": 2.1,
    "change": "+486%"
  },
  "synthesis": {
    "analysis": "Error rate spiked 5x in the last hour. The spike correlates with deployment at 10:15. Errors are concentrated on /api/checkout endpoint.",
    "suggested_queries": [
      "rate(http_errors_total{handler=\"/api/checkout\"}[5m])",
      "histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))"
    ],
    "confidence": "high"
  }
}
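
The `change` figure in `data_summary` is a plain percent change of current over baseline. A small sketch of that arithmetic (the function name is illustrative):

```typescript
// Percent change of current vs. baseline, signed and rounded:
// (12.3 - 2.1) / 2.1 ≈ 4.86 → "+486%"
function percentChange(baseline: number, current: number): string {
  const pct = Math.round(((current - baseline) / baseline) * 100);
  return `${pct >= 0 ? "+" : ""}${pct}%`;
}
```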

prom_suggest_query

Get PromQL query suggestions.

prom_suggest_query({
  intent: string            // "show me api latency p99"
})

NEVERHANG Architecture

PromQL queries can be expensive. High-cardinality queries can OOM Prometheus.

Query Timeouts

  • Default: 30s
  • Configurable per-query
  • Server-side timeout parameter
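
The client-side half of the rule can be sketched as a generic timeout wrapper: any promise is raced against a timer so a slow query rejects instead of hanging. This is an illustrative sketch, not the server's actual code; a real query path would also pass Prometheus's server-side `timeout` query parameter.

```typescript
// Reject `promise` if it hasn't settled within timeoutMs.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`query exceeded ${timeoutMs}ms`)),
      timeoutMs
    );
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() =>
    clearTimeout(timer)
  ) as Promise<T>;
}
```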

Cardinality Protection

  • Limit series returned
  • Block known expensive patterns
  • Warn on high-cardinality queries

Circuit Breaker

  • 3 timeouts in 60s → 5 minute cooldown
  • Tracks Prometheus health
  • Graceful degradation

{
  "neverhang": {
    "query_timeout": 30000,
    "max_series": 1000,
    "circuit_breaker": {
      "failures": 3,
      "window": 60000,
      "cooldown": 300000
    }
  }
}
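
The breaker rule above (3 timeouts inside a 60s window open the circuit for a 5-minute cooldown) can be sketched with a sliding window of failure timestamps. Class and method names are hypothetical, not the server's actual API:

```typescript
class CircuitBreaker {
  private failures: number[] = []; // timestamps (ms) of recent timeouts
  private openedAt: number | null = null;

  constructor(
    private maxFailures = 3,
    private windowMs = 60_000,
    private cooldownMs = 300_000
  ) {}

  recordTimeout(now = Date.now()): void {
    // Drop failures that fell out of the window, then add this one.
    this.failures = this.failures.filter((t) => now - t < this.windowMs);
    this.failures.push(now);
    if (this.failures.length >= this.maxFailures) this.openedAt = now;
  }

  isOpen(now = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.cooldownMs) {
      this.openedAt = null; // cooldown elapsed, allow queries again
      this.failures = [];
      return false;
    }
    return true;
  }
}
```

While the breaker is open, the server can degrade gracefully, e.g. answering from cached results or returning an explicit "Prometheus unavailable" message instead of issuing new queries.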

Fallback AI

Optional Haiku for metric analysis.

{
  "fallback": {
    "enabled": true,
    "model": "claude-haiku-4-5",
    "api_key_env": "PROM_MCP_FALLBACK_KEY",
    "max_tokens": 500
  }
}

When used:

  • prom_analyze with questions
  • prom_suggest_query for natural language
  • Anomaly detection

Configuration

~/.config/prometheus-mcp/config.json:

{
  "prometheus": {
    "url": "http://localhost:9090",
    "auth": {
      "type": "none"
    }
  },
  "permissions": {
    "query": true,
    "alerts": true,
    "admin": false
  },
  "query_limits": {
    "max_duration": "30s",
    "max_series": 1000
  },
  "fallback": {
    "enabled": false
  }
}

Claude Code Integration

{
  "mcpServers": {
    "prometheus": {
      "command": "prometheus-mcp",
      "args": ["--config", "/path/to/config.json"]
    }
  }
}

Installation

npm install -g @arktechnwa/prometheus-mcp

Requirements

  • Node.js 18+
  • Prometheus server (2.x+)
  • Optional: Anthropic API key for fallback AI

Credits

Created by Claude (claude@arktechnwa.com) in collaboration with Meldrey. Part of the ArktechNWA MCP Toolshed.
