playwright-fixer-mcp

Automated Playwright E2E test repair powered by a self-improving, governed MCP server that runs failing tests, collects failure artifacts, reasons about root causes, validates and applies fixes, and re-runs to verify.

Built on the three-layer AI automation architecture — Knowledge, Capability, and Governance — this tool turns failing Playwright tests into a closed verification loop: it runs the test, collects failure artifacts, reasons about the root cause, applies a rule-validated fix, and re-runs to verify — without hallucinating about runtime state.

We don't increase model intelligence. We reduce model freedom. The result is a system that works not by magic, but by design.


Why This Exists

When a Playwright test fails, most teams do one of two things:

  1. Send the error to an LLM and hope it guesses right.
  2. Alert an engineer to investigate manually.

Both share the same flaw: the LLM is reasoning blind. "Timeout 5000ms exceeded" contains almost no actionable information. No selector. No DOM context. No iframe structure. No screenshot.

This tool solves that by treating E2E debugging as a state reconstruction problem, not a prompting problem.

Instead of better prompts, the system provides:

  • Deterministic context: the model never guesses about runtime state; tools provide it
  • Constrained reasoning: exactly one failure context is resolved and one rule bundle is injected
  • Closed verification: every proposed fix is validated by validate_and_apply_fix (the locked door) and re-run before it is accepted
  • Self-improving governance: learned fix patterns become rule proposals, which humans review and the governance workflow then promotes

Architecture

The system separates three concerns that most AI automation tools mix together. Mixing them is the most common failure mode.

┌──────────────────────────────────────────────────────────┐
│  Layer 3 — Governance  (.mdc rule files)                 │
│                                                          │
│  Behavioral contracts, workflow procedures, approval     │
│  gates. The AI executes these procedures — it doesn't    │
│  decide whether to follow them.                          │
│                                                          │
│  playwright-mcp.mdc · rule-evolution-review.mdc          │
└──────────────────────────────┬───────────────────────────┘
                               │ constrains
┌──────────────────────────────▼───────────────────────────┐
│  Layer 2 — Capability  (index.js — this MCP server)      │
│                                                          │
│  Neutral execution: run tests, read specs, analyze       │
│  failures, validate + apply fixes, propose rule updates. │
│                                                          │
│  LOCATOR_VIOLATION_RULES is enforced here as a hard      │
│  constraint (locked-door pattern) — not as a suggestion. │
└──────────────────────────────┬───────────────────────────┘
                               │ consults
┌──────────────────────────────▼───────────────────────────┐
│  Layer 1 — Knowledge  (playwright-test-standards.mdc)    │
│                                                          │
│  Locator priority, DSL conventions, evolved heuristics.  │
│  Guides reasoning — not enforced. Accumulates learned    │
│  rules over time via the rule evolution system.          │
└──────────────────────────────────────────────────────────┘

Key distinction: apply_approved_rules is intentionally not an MCP tool. It lives in rule-evolution-review.mdc as a governed procedure. Keeping it out of the automated executor prevents the system from modifying its own rule layer without human approval.

Closed Repair Loop

@AUT-xxx  (or "npx playwright test --grep @AUT-xxx")
    │
    ▼
resolve_spec_by_tag ──→ read_spec_file
    │                        │
    │                   (understand full test intent before running)
    │
    ▼
run_test_and_analyze_failure (attemptNumber: 1)
    │
    ├── passed: true  ──→ propose_rule_evolution ──→ PENDING queue
    │
    ├── shouldStop: true  ──→ escalate to human (attempt limit reached)
    │
    └── passed: false
            │
            ▼
       analyze_and_fix_selector
       (error message + screenshot path + spec context + rule bundle)
            │
            ▼
       validate_and_apply_fix  ←── LOCKED DOOR: enforces LOCATOR_VIOLATION_RULES
            │
            ├── error violations  ──→ regenerate fix, retry validate
            │
            └── ok
                    │
                    ▼
               run_test_and_analyze_failure (attemptNumber + 1)
               (loop until pass or shouldStop)
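
The control flow in the diagram can be sketched roughly as follows. This is an illustration under assumptions, not the code in index.js: the `tools` object, its camelCase method names, the argument shapes, and the `maxRegenerations` guard are all hypothetical stand-ins for the MCP tools named above.

```javascript
// Sketch of the closed repair loop. `tools` is a hypothetical object whose
// methods stand in for the MCP tools shown in the diagram.
async function repairLoop(tag, tools, maxRegenerations = 3) {
  const specPath = await tools.resolveSpecByTag(tag);
  const spec = await tools.readSpecFile(specPath); // understand intent before running

  for (let attemptNumber = 1; ; attemptNumber += 1) {
    const result = await tools.runTestAndAnalyzeFailure({ tag, attemptNumber });

    if (result.passed) {
      // Success: queue a proposal for human review.
      await tools.proposeRuleEvolution({ tag, attemptNumber });
      return { status: "passed", attempts: attemptNumber };
    }
    if (result.shouldStop) {
      // Attempt limit reached: hand over to a human.
      return { status: "escalated", attempts: attemptNumber };
    }

    // Regenerate until the locked door accepts the fix.
    // The cap on regenerations is an assumption added to keep the sketch finite.
    let verdict = { ok: false, violations: [] };
    for (let i = 0; i < maxRegenerations && !verdict.ok; i += 1) {
      const fix = await tools.analyzeAndFixSelector({
        spec,
        result,
        violations: verdict.violations,
      });
      verdict = await tools.validateAndApplyFix({ specPath, fix });
    }
    if (!verdict.ok) {
      return { status: "escalated", attempts: attemptNumber };
    }
  }
}
```

The point of the shape is that the model never decides when to stop: the run tool reports passed and shouldStop, and the loop only reacts to them.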

Rule Evolution Lifecycle

[Automated Loop]           [Human Review]           [Governance Workflow]
propose_rule_evolution  →  PENDING in queue  →  mark APPROVED / REJECTED
                                                          │
                                             "apply approved rules"
                                              triggers rule-evolution-review.mdc
                                                          │
                                             ┌────────────▼────────────┐
                                             │ writes to .mdc file     │
                                             │ removes from Pending    │
                                             │ appends to History Log  │
                                             └─────────────────────────┘

The "trainable parameters" are .mdc rule files — not model weights. The system improves without retraining anything.


Prerequisites

  • Node.js 18+
  • Cursor with MCP support
  • Playwright installed in your project (npm install -D @playwright/test)

Installation

Option A — npm (recommended)

# Install as a dev dependency in your Playwright project
npm install -D playwright-fixer-mcp

Option B — Clone from GitHub

git clone https://github.com/your-username/playwright-fixer-mcp.git
cd playwright-fixer-mcp
npm install

Setup

After installation, run the setup command to copy the Cursor rule templates into your project:

# From your Playwright project root
npx playwright-fixer-mcp setup

# With an explicit project root
npx playwright-fixer-mcp setup --project-root=/path/to/your/project

# Force overwrite existing rule files
npx playwright-fixer-mcp setup --force

This copies four files into .cursor/rules/ in your project:

| File | Layer | Purpose |
| --- | --- | --- |
| playwright-mcp.mdc | Governance | Workflow trigger, closed-loop procedure, hard constraints |
| playwright-test-standards.mdc | Knowledge | Locator priority, DSL conventions, evolved heuristics |
| rule-evolution-review.mdc | Governance | Rule promotion workflow (human-triggered) |
| rule-evolution-queue.md | Queue | Pending / history log for rule proposals |

Note: rule-evolution-queue.md is never overwritten if it already contains [PENDING] entries, even with --force. Your pending proposals are safe.
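
The overwrite behavior described in the note can be pictured as a small decision function. This is a sketch only: the function name and its argument shape are hypothetical, and it assumes that existing files are left alone unless --force is passed.

```javascript
// Hypothetical helper: decide whether setup may write a template file.
// existingContent is null when the file does not exist yet.
function shouldWriteTemplate({ fileName, existingContent, force }) {
  if (existingContent === null) return true; // nothing to overwrite

  const isQueue = fileName === "rule-evolution-queue.md";
  if (isQueue && existingContent.includes("[PENDING]")) {
    return false; // pending proposals are protected, even with --force
  }
  return force; // assumption: other existing files are replaced only with --force
}
```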


Cursor MCP Configuration

Register the server in your Cursor MCP config: create .cursor/mcp.json in your project root (or add the entry under Cursor → Settings → MCP):

If installed as dev dependency

{
  "mcpServers": {
    "playwright-fixer": {
      "command": "node",
      "args": ["./node_modules/playwright-fixer-mcp/index.js"]
    }
  }
}

If cloned locally

{
  "mcpServers": {
    "playwright-fixer": {
      "command": "node",
      "args": ["/absolute/path/to/playwright-fixer-mcp/index.js"]
    }
  }
}

Restart Cursor after adding the configuration. Verify the server appears under MCP tools.


Usage

Running a Test by Tag

Type a test tag in the Cursor chat — the closed-loop repair activates automatically:

@AUT-589-1

or

npx playwright test --grep @AUT-589-1

The system will:

  1. Find the spec file containing @AUT-589-1 (via resolve_spec_by_tag)
  2. Read the full spec + page object to understand test intent (via read_spec_file)
  3. Run the test (via run_test_and_analyze_failure)
  4. If it fails: collect screenshot + error → analyze → generate fix → validate against locator rules → apply → re-run
  5. If it passes: propose the learned fix pattern as a rule update (propose_rule_evolution)

The loop makes at most 2 attempts before stopping and escalating to human review.

Reviewing and Promoting Rule Proposals

After a successful auto-fix, a PENDING entry is written to .cursor/rules/rule-evolution-queue.md.

To review and promote:

  1. Open .cursor/rules/rule-evolution-queue.md
  2. Read the proposed rule under ## Pending (awaiting review)
  3. Change <!-- APPROVED | REJECTED | APPLIED --> to either <!-- APPROVED --> or <!-- REJECTED -->
  4. Tell Cursor: "apply approved rules"

The rule-evolution-review.mdc governance workflow activates:

  • Appends approved rules to the target .mdc file
  • Removes the entry from the Pending section
  • Writes a permanent record to the History Log

Hard constraint: The AI cannot self-approve. Only entries the human has explicitly marked are processed.
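
The promotion step can be modeled in memory as a pure function over queue entries. The entry shape (`status`, `target`, `rule`) is hypothetical, the real queue is a Markdown file whose format is not shown here, and routing rejected entries to the History Log is an assumption. The actual procedure lives in rule-evolution-review.mdc, not in code.

```javascript
// Hypothetical in-memory model of the queue; the real file format may differ.
function applyApprovedRules(entries) {
  const appended = []; // rule text to append to the target .mdc file
  const history = [];  // records for the History Log
  const pending = [];  // entries that stay in the Pending section

  for (const entry of entries) {
    if (entry.status === "APPROVED") {
      appended.push({ target: entry.target, rule: entry.rule });
      history.push({ ...entry, status: "APPLIED" });
    } else if (entry.status === "REJECTED") {
      history.push(entry); // assumption: rejected entries are logged, never written
    } else {
      pending.push(entry); // unmarked entries are left untouched
    }
  }
  return { appended, history, pending };
}
```

Only entries that a human has already marked change state; anything still unmarked passes through unchanged, which is the "cannot self-approve" constraint expressed as data flow.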


Project Structure (after setup)

your-project/
├── .cursor/
│   ├── mcp.json                             ← Cursor MCP server config
│   └── rules/
│       ├── playwright-mcp.mdc               ← Governance: workflow + hard constraints
│       ├── playwright-test-standards.mdc    ← Knowledge: locators, DSL, evolved rules
│       ├── rule-evolution-review.mdc        ← Governance: rule promotion procedure
│       └── rule-evolution-queue.md          ← Queue: pending/history rule proposals
├── tests/
│   └── **/*.spec.js                         ← Your Playwright specs (tagged @AUT-xxx)
├── node_modules/
│   └── playwright-fixer-mcp/
│       └── index.js                         ← MCP server (Capability layer)
└── package.json

Test Tag Format

Every test must use the @AUT-xxx tag format for the system to locate and run it:

import { test } from '@playwright/test';
import helperFunctions from '../helpers.js';
import PageObjectName from '../../pageObjects/PageObjectName.js';

// The @AUT-xxx tag in the title is what the system uses to locate and run this test
test("description @BaseCase @PageName @testCaseName @AUT-xxx", async () => {
  const { browser, context, page } = await helperFunctions.setup_Backgound_Step();
  const pageObj = new PageObjectName(page);
  await helperFunctions.given_A_Page(page, PageObjectName);
  // Interact only through helperFunctions so failures are normalized
  await helperFunctions.click(page, pageObj.someButton);
  await helperFunctions.check_Element_Contains_Text(page, pageObj.result, 'expected text');
  await browser.close();
});

Always use helperFunctions — direct page.click() / page.fill() calls bypass the failure normalization layer that the CONTEXT_RESOLVERS depend on.
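
A tag lookup over spec sources might look like the sketch below. It is not the resolve_spec_by_tag implementation: the function names are hypothetical, and the regular expression assumes tags are "@AUT-" followed by digit groups (as in @AUT-589-1).

```javascript
// Assumed tag shape: "@AUT-" followed by digits, optionally more "-digits" groups.
const AUT_TAG = /@AUT-\d+(?:-\d+)*/g;

// Return the distinct @AUT tags that appear in one spec file's source text.
function extractAutTags(specSource) {
  return [...new Set(specSource.match(AUT_TAG) ?? [])];
}

// specs: { [path]: sourceText }. Whole-tag comparison keeps @AUT-589
// from matching a spec that only contains @AUT-589-1.
function findSpecsByTag(specs, tag) {
  return Object.keys(specs).filter((path) => extractAutTags(specs[path]).includes(tag));
}
```

Comparing whole tags rather than substrings is what keeps the lookup deterministic when tags share a prefix.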


Failure Normalization

The system uses a DSL layer (helperFunctions) to convert raw Playwright errors into semantic signals:

// Raw Playwright error (no semantic information for the system):
// Timeout 5000ms exceeded

// Normalized error from helperFunctions (classifiable):
throw new Error(`Element "${elementName}" not found with selector: ${selector}`);

This is how CONTEXT_RESOLVERS deterministically classifies failures into hover, fill, iframe, or default — without LLM inference.
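
A helper in this style could wrap the raw Playwright call and re-throw with the element name and selector attached. The sketch below is an assumption about the shape of such a helper: `normalizedClick` and the `{ name, selector }` element object are hypothetical, and only `page.click(selector)` is a real Playwright call.

```javascript
// Hypothetical helper in the style of helperFunctions.click.
// `element` is assumed to carry a human-readable name and a selector.
async function normalizedClick(page, element) {
  try {
    await page.click(element.selector);
  } catch (cause) {
    // Re-throw with semantic detail so the failure can be classified by pattern;
    // the original Playwright error is kept as `cause` for debugging.
    throw new Error(
      `Element "${element.name}" not found with selector: ${element.selector}`,
      { cause }
    );
  }
}
```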


Locator Rules (Enforced)

validate_and_apply_fix is the only valid write path to spec files. It enforces these constraints before writing:

| Rule ID | Severity | Description |
| --- | --- | --- |
| NO_XPATH | error | XPath locators are blocked. Use getByRole, getByLabel, or getByText. |
| IFRAME_USE_FRAME_LOCATOR | error | In iframe context, page.locator() is blocked; use frame.locator() instead. |
| CSS_CLASS_SELECTOR | warn | CSS class selectors are warned. Prefer semantic locators. |

Error-level violations block the write and return the violation list. The model must regenerate a compliant fix and call validate_and_apply_fix again.
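
The locked-door check can be sketched as a function that collects violations and only reports ok when none are error-level. The rule IDs and severities come from the table above; the regular expressions, the `validateFix` name, and the `contextKey` argument are assumptions for illustration and are not the patterns used in index.js.

```javascript
// Illustrative rule set; the patterns are assumptions, not the ones in index.js.
const RULES = [
  {
    id: "NO_XPATH",
    severity: "error",
    test: (code) => /xpath=|locator\(\s*['"`]\/\//.test(code),
  },
  {
    id: "IFRAME_USE_FRAME_LOCATOR",
    severity: "error",
    test: (code, contextKey) => contextKey === "iframe" && /\bpage\.locator\(/.test(code),
  },
  {
    id: "CSS_CLASS_SELECTOR",
    severity: "warn",
    test: (code) => /locator\(\s*['"`]\.[\w-]+/.test(code),
  },
];

function validateFix(code, contextKey) {
  const violations = RULES
    .filter((rule) => rule.test(code, contextKey))
    .map(({ id, severity }) => ({ id, severity }));
  // Warnings are surfaced but do not block; any error-level hit does.
  const ok = !violations.some((v) => v.severity === "error");
  return { ok, violations }; // a caller would write to the spec only when ok is true
}
```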

Locator Priority (highest → lowest)

1. getByRole('button', { name: '...' })   ← preferred
2. getByLabel('...')
3. getByPlaceholder('...')
4. getByText('...')
5. [data-testid] / [data-qa]
6. CSS class selectors                     ← warn
7. XPath                                   ← blocked

Tool Reference

Automated Loop Tools (Layer 2 — Capability)

| Tool | Description |
| --- | --- |
| run_test_and_analyze_failure | Run npx playwright test --grep @<tag>. Returns passed, resolvedContext, error artifacts. Enforces retry stop-loss (max 2 attempts before shouldStop: true). |
| analyze_and_fix_selector | Build a fix suggestion payload from error message + screenshot path + resolved failure context + rule bundle. Returns structured guidance for the model. |
| validate_and_apply_fix | Validate proposed code changes against LOCATOR_VIOLATION_RULES, then write to spec. The only valid write path. Returns violations if blocked. |
| resolve_spec_by_tag | Scan tests/ to find the spec file containing a given @AUT-xxx tag. Deterministic; never guesses. |
| read_spec_file | Read the full spec + auto-resolve its imported page object. Provides complete test context before analysis. |
| read_page_object_selectors | Extract getter → elementName → selector mapping from a page object file. |
| get_failure_artifacts | Scan test-results/ for the latest screenshot and trace.zip. Optionally filter by tag. |
| get_iframe_context | Extract iframe selector and frame-related operations from a spec + page object. |
| propose_rule_evolution | Write a learned fix pattern as a PENDING proposal to the rule evolution queue. Does not modify any .mdc file; human approval is required. |

Reference / Info Tools

| Tool | Description |
| --- | --- |
| get_playwright_fix_workflow | Return the full fix workflow reference (read spec → run → fail → fix → verify → evolve). |
| get_internal_locator_rules | Return the locator priority rules. |
| get_tag_run_rule | Return the tag trigger rule (when @AUT-xxx triggers the closed loop). |

Not an MCP Tool (by design)

apply_approved_rules — This is a governance operation triggered by human intent ("apply approved rules"), not by the automated loop. It lives in rule-evolution-review.mdc as a Cursor rule procedure. Keeping it out of index.js enforces the architectural boundary: the executor cannot promote its own rule proposals.


Customization

Adding a New Failure Context

Edit index.js to add a new entry to both CONTEXT_RESOLVERS and FAILURE_CONTEXT_MAP:

// In CONTEXT_RESOLVERS — detection pattern
const CONTEXT_RESOLVERS = [
  // ...existing entries...
  {
    contextKey: "select",
    test: (msg) => /selectOption|dropdown|ant-select/i.test(msg || ""),
  },
];

// In FAILURE_CONTEXT_MAP — instructions and rule blocks for this context
const FAILURE_CONTEXT_MAP = {
  // ...existing entries...
  select: {
    instruction: "3. **This error relates to a dropdown/select**: use `getByRole('option')` or `page.selectOption()`.\n",
    extraRuleBlocks: [],
  },
};

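No other code changes are required. Resolvers are evaluated in order and the first match wins, so place a new entry ahead of any broader pattern that could also match the same message.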
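
A small sketch shows why ordering matters under first-match-wins. Only the "select" pattern is taken from the example above; the "iframe" pattern, the `resolvers` array, and the `resolveContext` function are hypothetical.

```javascript
// Illustrative resolvers; only "select" comes from the example above,
// the "iframe" pattern is an assumption.
const resolvers = [
  { contextKey: "iframe", test: (msg) => /iframe|frameLocator/i.test(msg || "") },
  { contextKey: "select", test: (msg) => /selectOption|dropdown|ant-select/i.test(msg || "") },
];

function resolveContext(message) {
  // Array.prototype.find stops at the first resolver whose test returns true.
  const hit = resolvers.find((resolver) => resolver.test(message));
  return hit ? hit.contextKey : "default";
}
```

With this ordering, a message that mentions both an iframe and a dropdown resolves to "iframe", because that entry comes first in the array.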

Adding a Custom Locator Violation Rule

const LOCATOR_VIOLATION_RULES = [
  // ...existing rules...
  {
    id: "NO_DATA_TESTID_WHEN_ROLE_EXISTS",
    severity: "warn",
    // \b lets the pattern also match selectors with a value, e.g. [data-testid="save"]
    test: (code) => /\[data-testid\b/i.test(code),
    message: "Prefer getByRole over data-testid when a semantic role exists (LOCATOR_RULES).",
  },
];

severity: "error" blocks the write. severity: "warn" allows it but surfaces the issue.

Adjusting the Retry Limit

const MAX_AUTO_RETRIES = 3; // stop at attempt 3; allows attempts 1 and 2

Increase for environments with higher test flakiness.
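
Read literally, the comment on the constant implies a stop-loss check like the one below. This is a guess at the semantics based only on that comment; the `shouldStop` function is hypothetical and the comparison used in index.js may differ.

```javascript
const MAX_AUTO_RETRIES = 3; // value taken from the snippet above

// Assumed semantics, read off the comment: attempt numbers below the limit may run.
function shouldStop(attemptNumber) {
  return attemptNumber >= MAX_AUTO_RETRIES;
}
```

Under this reading, raising the constant by one allows one more automated attempt before escalation.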


Stage Evolution

This tool implements Stages 2–4 of the AI automation evolution path:

Stage 1 → Static rules in skill.md
           (AI reads knowledge before acting)

Stage 2 → MCP selects rule bundles by context     ← CONTEXT_RESOLVERS
           (System decides the framework)

Stage 3 → Closed verification loop                ← repair loop in playwright-mcp.mdc
           (Outputs become testable hypotheses)

Stage 4 → Rules evolve based on execution results ← rule-evolution-queue.md
           (System improves without retraining)

Stage 5 → Self-improving governed AI environment  ← you build this on top
           (The environment becomes the intelligence layer)

The "trainable parameters" are rule bundles in .mdc files — not model weights.


Architecture Articles

This tool was built based on the following series:

  1. Stop Prompting Your Way Out of Playwright Failures — The state reconstruction problem and closed-loop architecture
  2. The Environment Is the Prompt: Why MCP Rules Supersede Static Skill Files — Knowledge vs. constraints; why governance belongs in the environment
  3. Rules That Learn: How We Built a Self-Improving Test Governance System — The execution context separation; why apply_approved_rules is not an MCP tool
  4. The Three Layers of AI Automation Systems — Knowledge / Capability / Governance — and the cost of mixing them

License

MIT
