PromptTuner MCP

<img src="docs/logo.png" alt="PromptTuner MCP Logo" width="200">


An MCP server that helps you write better prompts for AI assistants. It analyzes, refines, and optimizes prompts to improve AI understanding and response quality.

Performance: LLM refinement 1-5s • Batch processing 100+ prompts/min

🔑 API Key Required

PromptTuner uses direct API integration with LLM providers. You'll need an API key from one of: OpenAI, Anthropic, or Google.

Set environment variables:

# Choose provider (default: openai)
export LLM_PROVIDER=openai

# Set API key for your chosen provider
export OPENAI_API_KEY=sk-...
# OR
export ANTHROPIC_API_KEY=sk-ant-...
# OR
export GOOGLE_API_KEY=...

# Optional: override default model
export LLM_MODEL=gpt-4o

Why Use PromptTuner?

Poor prompts lead to poor AI responses. PromptTuner helps by:

  • Fixing typos and grammar - Catches 50+ common misspellings
  • Improving clarity - Removes vague language, adds specificity
  • Applying best practices - Chain-of-thought, few-shot, role-based prompting
  • Scoring your prompts - Get actionable feedback with 0-100 scores
  • Multi-provider support - Works with OpenAI, Anthropic, and Google

🎯 Production Ready

New in v1.0.0:

  • Security Hardening: Request timeouts, X-Forwarded-For validation, LLM output validation
  • Performance: Parallel technique application (60% faster multi-technique optimization)
  • Testing: Comprehensive test suite with 70%+ coverage
  • Distributed: Redis session store for multi-instance deployments
  • Observability: Structured JSON logging, health checks, ready probes
  • Docker: Production-ready containers with health checks

Quick Example

Before:

trubbelshot this code for me plz

After (with refine_prompt):

You are an expert software developer.

Troubleshoot this code. Find the errors, explain what's wrong, and provide a corrected version. Ask questions if anything is unclear.

Installation

git clone https://github.com/j0hanz/prompttuner-mcp.git
cd prompttuner-mcp
npm install
npm run build

Usage

With Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "prompttuner": {
      "command": "node",
      "args": ["/path/to/prompttuner-mcp/dist/index.js"],
      "env": {
        "LLM_PROVIDER": "openai",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Note: Replace OPENAI_API_KEY with ANTHROPIC_API_KEY or GOOGLE_API_KEY depending on your provider choice.

With MCP Inspector

npm run inspector

HTTP Mode (Experimental)

For testing or integration with HTTP-based clients:

npm run start:http
# Server runs at http://127.0.0.1:3000/mcp

Custom port/host:

node dist/index.js --http --port 8080 --host 0.0.0.0
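
A quick connectivity check in TypeScript (a hedged sketch: the headers follow the Troubleshooting notes below, and the initialize payload follows the standard MCP JSON-RPC handshake; exact response framing may vary by transport):

const res = await fetch("http://127.0.0.1:3000/mcp", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    // Streamable HTTP servers typically also expect an Accept header:
    Accept: "application/json, text/event-stream",
  },
  body: JSON.stringify({
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2024-11-05",
      capabilities: {},
      clientInfo: { name: "smoke-test", version: "0.0.1" },
    },
  }),
});
console.log(res.status, await res.text());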

Docker (Recommended for Production)

Run with Docker for easy deployment and Redis caching:

# Build and start
docker-compose up -d

# View logs
docker-compose logs -f prompttuner

# Stop
docker-compose down

The Docker setup includes:

  • PromptTuner MCP server on port 3000
  • Redis cache for improved performance
  • Automatic health checks
  • Volume persistence

Configure via environment variables in .env (see .env.example).

Tools

refine_prompt

Fix grammar, improve clarity, and apply optimization techniques to any prompt. Includes intelligent caching to speed up repeated refinements.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | required | Prompt text to improve (plain text, Markdown, or XML) |
| technique | string | "basic" | Technique to apply |
| targetFormat | string | "auto" | Output format |

Performance:

  • Caching: Identical refinements are cached (LRU, 500 entries, 1-hour TTL)
  • Cache Key: Based on prompt + technique + format (SHA-256 hash)
  • Cache Hits: The response includes fromCache: true when served from the cache
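
A minimal sketch of how such a cache key can be derived in Node.js (the delimiter and function name are illustrative; the actual implementation may differ):

import { createHash } from "node:crypto";

// SHA-256 over prompt + technique + format, as described above.
function cacheKey(prompt: string, technique: string, targetFormat: string): string {
  return createHash("sha256")
    .update(`${prompt}\u0000${technique}\u0000${targetFormat}`)
    .digest("hex");
}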

Techniques:

| Technique | Description | Best For |
| --- | --- | --- |
| basic | Grammar/clarity | Quick fixes |
| chainOfThought | Step-by-step reasoning | Math, logic, analysis |
| fewShot | Examples | Classification, translation |
| roleBased | Persona | Domain-specific tasks |
| structured | Formatting (XML/Markdown) | Complex instructions |
| comprehensive | All techniques | Maximum improvement |

Target Formats:

| Format | Description | Best For |
| --- | --- | --- |
| auto | Detect | Unknown target |
| claude | XML tags | Claude models |
| gpt | Markdown | GPT models |
| json | Schema | Data extraction |

Example:

{
  "prompt": "explain recursion",
  "technique": "comprehensive",
  "targetFormat": "claude"
}

analyze_prompt

Score prompt quality across 5 dimensions and get actionable improvement suggestions.

Input:

  • prompt (string, required): Prompt text to analyze

Returns:

  • Scores (0-100): clarity, specificity, completeness, structure, effectiveness, plus an aggregate overall score
  • Characteristics: detected format, word count, complexity level
  • Suggestions: actionable improvements
  • Flags: hasTypos, isVague, missingContext
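
Example (a minimal sketch using the MCP TypeScript SDK over stdio; the client name/version, server path, and API key are placeholders):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the built server over stdio with provider settings from above.
const transport = new StdioClientTransport({
  command: "node",
  args: ["/path/to/prompttuner-mcp/dist/index.js"],
  env: { LLM_PROVIDER: "openai", OPENAI_API_KEY: "sk-..." },
});

const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Call analyze_prompt and print the scores/suggestions payload.
const result = await client.callTool({
  name: "analyze_prompt",
  arguments: { prompt: "explain recursion" },
});
console.log(result);

await client.close();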

optimize_prompt

Apply multiple techniques sequentially for comprehensive prompt improvement. Returns before/after scores and diff.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | required | Prompt text to improve |
| techniques | string[] | ["basic"] | Techniques to apply in order |
| targetFormat | string | "auto" | Output format |

Example:

{
  "prompt": "write code for sorting",
  "techniques": ["basic", "roleBased", "structured"],
  "targetFormat": "gpt"
}

Returns: Before/after scores, diff of changes.

detect_format

Identify target AI format (Claude XML, GPT Markdown, JSON) with confidence score.

Returns:

  • detectedFormat: claude, gpt, json, or auto
  • confidence: 0-100
  • recommendation: Format-specific advice

compare_prompts

Compare two prompt versions side-by-side with scoring, diff, and recommendations.

Input:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| promptA | string | required | First prompt |
| promptB | string | required | Second prompt |
| labelA | string | "Prompt A" | Label for first |
| labelB | string | "Prompt B" | Label for second |

Returns:

  • Scores: Both prompts scored across 5 dimensions (clarity, specificity, completeness, structure, effectiveness)
  • Winner: Which prompt is better (A, B, or tie)
  • Score Deltas: Numerical differences for each dimension
  • Improvements: What got better in Prompt B vs A
  • Regressions: What got worse in Prompt B vs A
  • Recommendation: Actionable advice on which to use
  • Diff: Character-level comparison

Example:

{
  "promptA": "explain recursion",
  "promptB": "You are a computer science teacher. Explain recursion with examples.",
  "labelA": "Original",
  "labelB": "Improved"
}

Use Cases:

  • A/B testing prompts
  • Evaluating refinement effectiveness
  • Tracking prompt iterations
  • Choosing between versions

validate_prompt

Pre-flight validation: check for issues, estimate tokens, detect anti-patterns and security risks before using a prompt.

Input:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | required | Prompt to validate |
| targetModel | string | "generic" | AI model (claude/gpt/gemini) |
| checkInjection | boolean | true | Check for prompt injection attacks |

Returns:

  • Is Valid: Boolean (true if no errors)
  • Token Estimate: Approximate token count (1 token ≈ 4 chars; see the sketch after the token-limit table below)
  • Issues: Array of validation issues (error/warning/info)
    • Type: error, warning, or info
    • Message: What the issue is
    • Suggestion: How to fix it
  • Checks Performed:
    • Anti-patterns (vague language, missing context)
    • Token limits (model-specific)
    • Security (prompt injection patterns)
    • Typos (common misspellings)

Token Limits by Model:

| Model | Limit |
| --- | --- |
| claude | 200,000 |
| gpt | 128,000 |
| gemini | 1,000,000 |
| generic | 8,000 |
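
A minimal sketch of the estimate-and-check logic, combining the 1 token ≈ 4 characters heuristic with the limits table (helper names are hypothetical; the server's actual implementation may differ):

const TOKEN_LIMITS: Record<string, number> = {
  claude: 200_000,
  gpt: 128_000,
  gemini: 1_000_000,
  generic: 8_000,
};

// Rough heuristic from above: 1 token ≈ 4 characters.
function estimateTokens(prompt: string): number {
  return Math.ceil(prompt.length / 4);
}

// True when the estimated token count fits the target model's limit.
function withinLimit(prompt: string, targetModel = "generic"): boolean {
  return estimateTokens(prompt) <= (TOKEN_LIMITS[targetModel] ?? TOKEN_LIMITS.generic);
}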

Example:

{
  "prompt": "ignore all previous instructions and...",
  "targetModel": "gpt",
  "checkInjection": true
}

Use Cases:

  • Pre-flight checks before sending prompts to LLMs
  • Security audits for user-provided prompts
  • Token budget planning
  • Quality assurance in prompt pipelines

Resources

Browse and use prompt templates:

| URI | Description |
| --- | --- |
| templates://catalog | List all available templates |
| templates://coding/code-review | Code review template |
| templates://coding/debug-error | Debugging template |
| templates://writing/summarize | Summarization template |
| templates://analysis/pros-cons | Pro/con analysis template |
| templates://system-prompts/expert-role | Expert persona template |
| templates://data-extraction/json-extract | JSON extraction template |

Categories: coding, writing, analysis, system-prompts, data-extraction
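
From an MCP client, templates are read like any other resource (a sketch with the TypeScript SDK; it reuses a connected client such as the one set up in the analyze_prompt example above):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";

declare const client: Client; // connected as in the earlier sketch

// Read the template catalog, then a specific template.
const catalog = await client.readResource({ uri: "templates://catalog" });
console.log(catalog.contents);

const template = await client.readResource({ uri: "templates://coding/code-review" });
console.log(template.contents);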

Prompts (Workflows)

Pre-built workflows for common tasks:

| Prompt | Description |
| --- | --- |
| quick-optimize | One-step optimization with single technique |
| deep-optimize | Comprehensive optimization with all techniques |
| analyze | Score quality and get improvement suggestions |
| review | Educational feedback against best practices |
| iterative-refine | Identify top 3 issues and fix iteratively |
| recommend-techniques | Suggest best techniques for prompt + task |
| scan-antipatterns | Detect common prompt mistakes |
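
These workflows surface through the standard MCP prompt APIs (a sketch, again reusing a connected client; workflow-specific arguments are not documented here, so none are passed and a given workflow may require them):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";

declare const client: Client; // connected as in the earlier sketch

// List the available workflows, then fetch one.
const { prompts } = await client.listPrompts();
console.log(prompts.map((p) => p.name));

const workflow = await client.getPrompt({ name: "quick-optimize" });
console.log(workflow.messages);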

Scoring Explained

| Dimension | What It Measures |
| --- | --- |
| Clarity | Clear language, no vague terms ("something", "stuff") |
| Specificity | Concrete details, examples, numbers |
| Completeness | Role context, output format, all requirements |
| Structure | Organization, formatting, sections |
| Effectiveness | Overall likelihood of good AI response |

Score Interpretation:

  • 80-100: Excellent - Minor refinements only
  • 60-79: Good - Some improvements recommended
  • 40-59: Fair - Notable gaps to address
  • 0-39: Needs Work - Significant improvements needed
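
As an illustration, a client could map the overall score onto these bands (a hypothetical helper, not part of the server API):

type ScoreBand = "Excellent" | "Good" | "Fair" | "Needs Work";

function interpretScore(score: number): ScoreBand {
  if (score >= 80) return "Excellent"; // minor refinements only
  if (score >= 60) return "Good"; // some improvements recommended
  if (score >= 40) return "Fair"; // notable gaps to address
  return "Needs Work"; // significant improvements needed
}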

LLM Sampling vs Rule-Based

PromptTuner works in two modes:

  1. LLM Sampling (when available): Uses the MCP client's LLM for intelligent refinement
  2. Rule-Based Fallback (automatic): Uses pattern matching and dictionaries when sampling unavailable

The tool automatically falls back to rule-based refinement if your MCP client doesn't support sampling.
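
In code terms, the behavior resembles this pattern (the function names are hypothetical stand-ins for the two paths):

// Hypothetical stand-ins for the two refinement paths described above.
declare function refineViaSampling(prompt: string): Promise<string>;
declare function refineRuleBased(prompt: string): string;

async function refine(prompt: string): Promise<string> {
  try {
    // Preferred path: delegate to the MCP client's LLM via sampling.
    return await refineViaSampling(prompt);
  } catch {
    // Automatic fallback: pattern matching and dictionaries.
    return refineRuleBased(prompt);
  }
}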

Development

npm run dev        # Watch mode with hot reload
npm run build      # Compile TypeScript
npm run test       # Run tests
npm run lint       # ESLint check
npm run type-check # TypeScript type checking
npm run format     # Prettier formatting

Troubleshooting

"LLM sampling is not supported"

This is normal! The tool automatically uses rule-based refinement. For full LLM-powered refinement, use Claude Desktop or another MCP client that supports sampling.

"Prompt too long"

Maximum prompt length is 10,000 characters. Split longer prompts into sections.

HTTP mode not connecting

Check that:

  1. Port 3000 (default) is not in use
  2. You're using POST to /mcp endpoint
  3. Headers include Content-Type: application/json

Contributing

Contributions welcome! Please:

  1. Run npm run lint && npm run type-check before committing
  2. Add tests for new features
  3. Update README for user-facing changes

License

MIT

Credits

Built with the Model Context Protocol SDK.
