AI Vision MCP Server

Provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants, allowing them to capture and analyze screenshots, perform file operations, and generate UI/UX reports.

MCP AI Vision Debug UI Automation

An autonomous debugging MCP server that lets AI models analyze, debug, and interact with web interfaces through Playwright. Any AI model, including those without built-in vision capabilities, can use it to visually inspect web pages, find UI bugs, test user workflows, and validate application performance without human intervention.

Autonomous UI Debugging Agent

This MCP server functions as an AI-powered autonomous debugging agent that can:

  • Perform comprehensive visual analysis of web applications
  • Detect UI issues by inspecting visual elements and their properties
  • Automatically test common user workflows without manual test script creation
  • Validate API endpoints and verify backend responses
  • Track visual changes between application versions
  • Monitor console logs for errors and warnings
  • Analyze performance metrics to identify bottlenecks
  • Generate detailed reports with screenshots and recommendations

The server is designed to reuse browser sessions, avoid unnecessary file creation, and focus on the most important aspects of your application.

Installation Options

Using an MCP Gateway (Recommended)

The easiest way to install this MCP server is through any MCP-compatible gateway:

# Example with Claude gateway
claude-gateway install mcp-ai-vision-debug-ui-automation

Quick Installation Script

Use our one-line installation script:

curl -s https://raw.githubusercontent.com/samihalawa/mcp-ai-vision-debug-ui-automation/main/scripts/install-global.sh | bash

NPM Installation

For global installation via npm:

# Install globally
npm install -g mcp-ai-vision-debug-ui-automation

# Start the server
mcp-ai-vision-debug-ui-automation

Docker Hub Installation

For containerized deployment:

# Pull the image from Docker Hub
docker pull samihalawa/mcp-ai-vision-debug-ui-automation:latest

# Run the container
docker run -p 8080:8080 samihalawa/mcp-ai-vision-debug-ui-automation:latest

Smithery Integration

This package is Smithery-compatible through its included configuration file:

# Install with Smithery
smithery install mcp-ai-vision-debug-ui-automation

# Or run with your API key
npm run smithery:key YOUR_SMITHERY_API_KEY

For full installation and usage instructions, see the Smithery Integration Guide.

Cross-Platform Support

Platform-specific packages are available for all major platforms:

# For macOS: install the package that matches your CPU
npm install -g mcp-ai-vision-debug-ui-automation-darwin-x64    # Intel
npm install -g mcp-ai-vision-debug-ui-automation-darwin-arm64  # Apple Silicon

# For Linux: install the package that matches your CPU
npm install -g mcp-ai-vision-debug-ui-automation-linux-x64
npm install -g mcp-ai-vision-debug-ui-automation-linux-arm64

# For Windows
npm install -g mcp-ai-vision-debug-ui-automation-win32-x64

Complete Tool Reference

Primary Visual Analysis Tools

1. enhanced_page_analyzer 🔍

Provides comprehensive analysis of web pages with interactive elements mapping, performance metrics, and visual inspection.

const analysis = await mcp.callTool("enhanced_page_analyzer", {
  url: "https://example.com/dashboard",
  includeConsole: true,
  mapElements: true,
  fullPage: true
});

2. ui_workflow_validator 🔄

Automatically tests full user journeys by executing and validating a sequence of UI interactions.

const result = await mcp.callTool("ui_workflow_validator", {
  startUrl: "https://example.com/login",
  taskDescription: "User login flow",
  steps: [
    { description: "Enter username", action: "fill", selector: "#username", value: "test" },
    { description: "Enter password", action: "fill", selector: "#password", value: "pass" },
    { description: "Click login", action: "click", selector: "button[type='submit']" },
    { description: "Verify dashboard loads", action: "verifyElementVisible", selector: ".dashboard" }
  ],
  captureScreenshots: "all"
});
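
A malformed step would only surface once the browser is already running, so it can help to check the step list before calling the tool. The following is a minimal sketch, assuming the step shape shown above (`description`, `action`, `selector`, `value`). `validateSteps` is a hypothetical helper, not part of the server, and the rules it enforces (the three actions shown need a `selector`, and `fill` also needs a `value`) are assumptions drawn from the example.

```javascript
// Hypothetical pre-flight check for ui_workflow_validator steps.
// Assumption: the actions shown in the example (fill, click,
// verifyElementVisible) all need a selector, and fill also needs a value.
const ACTIONS_NEEDING_SELECTOR = new Set(["fill", "click", "verifyElementVisible"]);

function validateSteps(steps) {
  const problems = [];
  steps.forEach((step, index) => {
    const label = `step ${index + 1}`;
    if (!step.action) {
      problems.push(`${label}: missing action`);
      return; // nothing else can be checked without an action
    }
    if (ACTIONS_NEEDING_SELECTOR.has(step.action) && !step.selector) {
      problems.push(`${label}: ${step.action} requires a selector`);
    }
    if (step.action === "fill" && step.value === undefined) {
      problems.push(`${label}: fill requires a value`);
    }
  });
  return problems;
}
```

An empty result only means the list passed these structural checks; it says nothing about whether the selectors exist on the page.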

3. visual_comparison 👁️

Compares two web pages or UI states to identify visual differences.

const diff = await mcp.callTool("visual_comparison", {
  url1: "https://example.com/before",
  url2: "https://example.com/after",
  threshold: 0.05
});
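
The README does not say how `threshold` is interpreted. One common reading is the maximum fraction of pixels allowed to differ, and the sketch below illustrates that reading on two equally sized grayscale buffers. `diffRatio` and `exceedsThreshold` are hypothetical names, and the server's actual comparison may work differently.

```javascript
// Illustration only: treats threshold as the maximum allowed fraction
// of differing pixels. This interpretation is an assumption.
function diffRatio(pixelsA, pixelsB) {
  if (pixelsA.length !== pixelsB.length) {
    throw new Error("images must have the same number of pixels");
  }
  if (pixelsA.length === 0) return 0;
  let changed = 0;
  for (let i = 0; i < pixelsA.length; i++) {
    if (pixelsA[i] !== pixelsB[i]) changed++;
  }
  return changed / pixelsA.length;
}

function exceedsThreshold(pixelsA, pixelsB, threshold) {
  return diffRatio(pixelsA, pixelsB) > threshold;
}
```

Under this reading, a threshold of 0.05 would flag two states as different once more than 5% of their pixels change.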

4. screenshot_url 📸

Captures high-quality screenshots of any URL with options for full page or specific elements.

const screenshot = await mcp.callTool("screenshot_url", {
  url: "https://example.com/profile",
  fullPage: true,
  device: "iPhone 13"
});

5. batch_screenshot_urls 📷

Takes screenshots of multiple URLs in a single operation for efficient comparison.

const screenshots = await mcp.callTool("batch_screenshot_urls", {
  urls: ["https://example.com/page1", "https://example.com/page2"],
  fullPage: true
});

User Flow Testing Tools

6. navigation_flow_validator 🧭

Tests multi-step navigation sequences with validation.

const navResult = await mcp.callTool("navigation_flow_validator", {
  startUrl: "https://example.com",
  steps: [
    { action: "click", selector: "a.products" },
    { action: "wait", waitTime: 1000 },
    { action: "click", selector: ".product-item" }
  ],
  captureScreenshots: true
});

7. api_endpoint_tester 🔌

Tests multiple API endpoints and verifies responses for backend validation.

const apiTest = await mcp.callTool("api_endpoint_tester", {
  url: "https://api.example.com/v1",
  endpoints: [
    { path: "/users", method: "GET" },
    { path: "/products", method: "GET" }
  ],
  authToken: "Bearer token123"
});
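
To make explicit which requests the example above implies, here is a sketch of how the base `url` and each `path` might be joined. `buildRequests` is a hypothetical helper, and the default of `GET` for a missing `method` is an assumption; the server's own joining rules are not documented here.

```javascript
// Hypothetical helper: joins a base URL with endpoint paths,
// tolerating extra or missing slashes at the boundary.
function buildRequests(baseUrl, endpoints) {
  const base = baseUrl.replace(/\/+$/, "");
  return endpoints.map(({ path, method = "GET" }) => ({
    method,
    url: `${base}/${path.replace(/^\/+/, "")}`
  }));
}
```

With the arguments from the example, this yields requests to `https://api.example.com/v1/users` and `https://api.example.com/v1/products`.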

DOM and Performance Analysis

8. dom_inspector 🔬

Inspects DOM elements and their properties in detail.

const elementInfo = await mcp.callTool("dom_inspector", {
  url: "https://example.com",
  selector: "nav.main-menu",
  includeChildren: true,
  includeStyles: true
});

9. console_monitor 📟

Monitors and captures console logs for error detection.

const logs = await mcp.callTool("console_monitor", {
  url: "https://example.com/app",
  filterTypes: ["error", "warning"],
  duration: 5000
});
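
Once logs are captured, a client will usually want them grouped by severity. The sketch below assumes each captured entry has `type` and `text` fields, which the README does not confirm; `groupByType` is a hypothetical helper.

```javascript
// Hypothetical post-processing of console_monitor output.
// Assumption: entries look like { type: "error", text: "..." }.
function groupByType(entries, filterTypes) {
  const groups = {};
  for (const type of filterTypes) groups[type] = [];
  for (const entry of entries) {
    // hasOwnProperty avoids matching inherited keys such as "toString"
    if (Object.prototype.hasOwnProperty.call(groups, entry.type)) {
      groups[entry.type].push(entry.text);
    }
  }
  return groups;
}
```

Entries whose type is not in `filterTypes` are dropped, mirroring the filtering the tool call requests.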

10. performance_analysis

Measures and analyzes page load performance metrics.

const perfMetrics = await mcp.callTool("performance_analysis", {
  url: "https://example.com/dashboard",
  iterations: 3
});
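
Running several iterations is only useful if the results are aggregated. The sketch below averages numeric metrics across runs, assuming each iteration produces a flat object of numbers and that every run reports the same metric names. The metric name `loadTime` and the helper `averageMetrics` are hypothetical; the tool's real output shape is not documented here.

```javascript
// Hypothetical aggregation over per-iteration metrics.
// Assumption: each run is a flat object of numeric values,
// and all runs report the same metric names.
function averageMetrics(runs) {
  if (runs.length === 0) return {};
  const totals = {};
  for (const run of runs) {
    for (const [name, value] of Object.entries(run)) {
      totals[name] = (totals[name] || 0) + value;
    }
  }
  const averages = {};
  for (const [name, total] of Object.entries(totals)) {
    averages[name] = total / runs.length;
  }
  return averages;
}
```

Averaging smooths out one-off spikes such as a cold cache on the first load, which is a typical reason to set `iterations` above 1.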

Low-Level Playwright Controls

11. screenshot_local_files 📁

Takes screenshots of local HTML files.

const localScreenshot = await mcp.callTool("screenshot_local_files", {
  filePath: "/path/to/local/file.html"
});

12. Direct Playwright Actions

Complete set of low-level Playwright controls for precise automation:

  • playwright_navigate: Navigate to specific URLs
  • playwright_click: Click on elements
  • playwright_iframe_click: Click elements inside iframes
  • playwright_fill: Fill form fields
  • playwright_select: Select dropdown options
  • playwright_hover: Hover over elements
  • playwright_evaluate: Run JavaScript in the page context
  • playwright_console_logs: Get console logs
  • playwright_get_visible_text: Extract visible text
  • playwright_get_visible_html: Get visible HTML
  • playwright_go_back: Navigate back
  • playwright_go_forward: Navigate forward
  • playwright_press_key: Press keyboard keys
  • playwright_drag: Drag and drop elements
  • playwright_screenshot: Take custom screenshots

Autonomous Debugging Workflows

The MCP server can autonomously perform complete debugging workflows by combining tools. For example:

Visual Regression Testing

// 1. Analyze the current version
const currentAnalysis = await mcp.callTool("enhanced_page_analyzer", { /* tool arguments */ });

// 2. Compare with previous version
const comparisonResult = await mcp.callTool("visual_comparison", { /* tool arguments */ });

// 3. Generate visual difference report
const report = await mcp.callTool("ui_workflow_validator", { /* tool arguments */ });

End-to-End User Flow Validation

// 1. Start with login flow
const loginResult = await mcp.callTool("ui_workflow_validator", { /* tool arguments */ });

// 2. Validate core features
const featureResults = await mcp.callTool("navigation_flow_validator", { /* tool arguments */ });

// 3. Test API endpoints
const apiResults = await mcp.callTool("api_endpoint_tester", { /* tool arguments */ });

Performance Optimization

// 1. Analyze initial performance
const initialPerformance = await mcp.callTool("performance_analysis", { /* tool arguments */ });

// 2. Identify slow-loading elements
const elementPerformance = await mcp.callTool("dom_inspector", { /* tool arguments */ });

// 3. Monitor console for errors
const consoleErrors = await mcp.callTool("console_monitor", { /* tool arguments */ });

Visual Analysis Examples

Element Mapping

The MCP server automatically maps all interactive elements on a page, making it easy for an AI model to understand the UI structure.

Visual Comparison

The visual comparison tool highlights differences between UI states, perfect for catching unexpected visual changes.

Integration Options

Integration with Smithery

# smithery.yaml configuration
startCommand:
  type: stdio
  configSchema:
    type: object
    properties:
      port:
        type: number
        description: Port number for the MCP server
      debug:
        type: boolean
        description: Enable debug mode

Integration with GLAMA

// glama.json configuration
{
  "name": "mcp-ai-vision-debug-ui-automation",
  "version": "1.0.2",
  "settings": {
    "port": 8080,
    "headless": true,
    "maxConcurrentSessions": 5
  }
}

Integration with Non-Vision Models

The MCP server converts visual information into structured data that can be used by any AI model, even those without vision capabilities:

// The model receives structured data about visual elements
{
  "interactiveElements": [
    {
      "tagName": "button",
      "text": "Submit",
      "bounds": {"x": 120, "y": 240, "width": 100, "height": 40},
      "visible": true
    },
    // More elements...
  ]
}
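
To show how a text-only model might consume this structure, the sketch below turns each visible element into a one-line description with a click target at the center of its bounds. `describeElement` and `summarizePage` are hypothetical helpers that rely only on the fields shown above.

```javascript
// Hypothetical helpers: render the structured element data as plain text.
function describeElement(el) {
  const { x, y, width, height } = el.bounds;
  // The center of the bounding box is a reasonable click target.
  const centerX = x + width / 2;
  const centerY = y + height / 2;
  return `${el.tagName} "${el.text}" at (${centerX}, ${centerY})`;
}

function summarizePage(page) {
  return page.interactiveElements
    .filter((el) => el.visible)
    .map(describeElement);
}
```

For the Submit button above, this produces `button "Submit" at (170, 260)`, a line that a model can read and act on without seeing the screenshot.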

CI/CD Integration

This MCP server includes GitHub Actions workflows for continuous integration and deployment:

  • Build and Test: Validates code quality
  • NPM Publishing: Automates package publishing
  • Docker Publishing: Creates and pushes Docker images
  • Smithery Publishing: Deploys to Smithery platform

License

This project is licensed under the ISC License.
