playwright-fixer-mcp
Automated Playwright E2E test repair powered by a self-improving, governed MCP server that runs failing tests, collects failure artifacts, reasons about root causes, validates and applies fixes, and re-runs to verify.
Built on the three-layer AI automation architecture — Knowledge, Capability, and Governance — this tool turns failing Playwright tests into a closed verification loop: it runs the test, collects failure artifacts, reasons about the root cause, applies a rule-validated fix, and re-runs to verify — without hallucinating about runtime state.
We don't increase model intelligence. We reduce model freedom. The result is a system that works not by magic, but by design.
Why This Exists
When a Playwright test fails, most teams do one of two things:
- Send the error to an LLM and hope it guesses right.
- Alert an engineer to investigate manually.
Both share the same flaw: the LLM is reasoning blind. Timeout 5000ms exceeded contains almost no actionable information. No selector. No DOM context. No iframe structure. No screenshot.
This tool solves that by treating E2E debugging as a state reconstruction problem, not a prompting problem.
Instead of better prompts, the system provides:
- Deterministic context: the model never guesses about runtime state; tools provide it
- Constrained reasoning: exactly one failure context is resolved, one rule bundle injected
- Closed verification: every proposed fix is validated by validate_and_apply_fix (the locked door) and re-run before it is accepted
- Self-improving governance: learned fix patterns become rule proposals, are reviewed by humans, and are promoted once approved
Architecture
The system separates three concerns that most AI automation tools mix together. Mixing them is the most common failure mode.
┌──────────────────────────────────────────────────────────┐
│ Layer 3 — Governance (.mdc rule files) │
│ │
│ Behavioral contracts, workflow procedures, approval │
│ gates. The AI executes these procedures — it doesn't │
│ decide whether to follow them. │
│ │
│ playwright-mcp.mdc · rule-evolution-review.mdc │
└──────────────────────────────┬───────────────────────────┘
│ constrains
┌──────────────────────────────▼───────────────────────────┐
│ Layer 2 — Capability (index.js — this MCP server) │
│ │
│ Neutral execution: run tests, read specs, analyze │
│ failures, validate + apply fixes, propose rule updates. │
│ │
│ LOCATOR_VIOLATION_RULES is enforced here as a hard │
│ constraint (locked-door pattern) — not as a suggestion. │
└──────────────────────────────┬───────────────────────────┘
│ consults
┌──────────────────────────────▼───────────────────────────┐
│ Layer 1 — Knowledge (playwright-test-standards.mdc) │
│ │
│ Locator priority, DSL conventions, evolved heuristics. │
│ Guides reasoning — not enforced. Accumulates learned │
│ rules over time via the rule evolution system. │
└──────────────────────────────────────────────────────────┘
Key distinction — apply_approved_rules is intentionally not an MCP tool. It lives in rule-evolution-review.mdc as a governed procedure. Keeping it out of the automated executor prevents the system from modifying its own rule layer without human approval.
Closed Repair Loop
@AUT-xxx  (or "npx playwright test --grep @AUT-xxx")
    │
    ▼
resolve_spec_by_tag ──→ read_spec_file
    │                       │
    │        (understand full test intent before running)
    │
    ▼
run_test_and_analyze_failure  (attemptNumber: 1)
    │
    ├── passed: true ──→ propose_rule_evolution ──→ PENDING queue
    │
    ├── shouldStop: true ──→ escalate to human (attempt limit reached)
    │
    └── passed: false
            │
            ▼
        analyze_and_fix_selector
        (error message + screenshot path + spec context + rule bundle)
            │
            ▼
        validate_and_apply_fix  ←── LOCKED DOOR: enforces LOCATOR_RULES
            │
            ├── error violations ──→ regenerate fix, retry validate
            │
            └── ok
                  │
                  ▼
              run_test_and_analyze_failure  (attemptNumber + 1)
              (loop until pass or shouldStop)
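The loop above can be read as a small driver. The following is a minimal sketch, not the server's actual code: `repairLoop`, the `tools` object, the argument shapes, the `ok` / `violations` result fields, and the `maxRegenerations` bound are all assumptions made for illustration. Only the tool names and the `passed` / `shouldStop` fields come from the diagram.

```javascript
// Hypothetical driver for the closed repair loop. `tools` stands in for the
// MCP tools; each method is assumed to be async and to take plain arguments.
async function repairLoop(tag, tools, maxRegenerations = 3) {
  const specPath = await tools.resolve_spec_by_tag(tag);
  const spec = await tools.read_spec_file(specPath);
  let lastFix = null;

  for (let attemptNumber = 1; ; attemptNumber += 1) {
    const run = await tools.run_test_and_analyze_failure(tag, attemptNumber);

    if (run.passed) {
      // Assumption: a pass with no applied fix has no pattern worth proposing.
      if (lastFix) await tools.propose_rule_evolution(tag, lastFix);
      return { status: "passed", attempts: attemptNumber };
    }
    if (run.shouldStop) {
      return { status: "escalated", attempts: attemptNumber };
    }

    // Regenerate until the locked door accepts a fix (bounded, by assumption).
    let violations = [];
    let applied = false;
    for (let i = 0; i < maxRegenerations && !applied; i += 1) {
      const fix = await tools.analyze_and_fix_selector(run, spec, violations);
      const result = await tools.validate_and_apply_fix(specPath, fix);
      if (result.ok) {
        lastFix = fix;
        applied = true;
      } else {
        violations = result.violations;
      }
    }
    if (!applied) return { status: "escalated", attempts: attemptNumber };
  }
}
```

In practice the sequencing is described by playwright-mcp.mdc rather than by a driver function; the sketch only makes the three exit conditions explicit.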
Rule Evolution Lifecycle
[Automated Loop]          [Human Review]          [Governance Workflow]
propose_rule_evolution →  PENDING in queue     →  mark APPROVED / REJECTED
                                                              │
                                                   "apply approved rules"
                                             triggers rule-evolution-review.mdc
                                                              │
                                                 ┌────────────▼────────────┐
                                                 │ writes to .mdc file     │
                                                 │ removes from Pending    │
                                                 │ appends to History Log  │
                                                 └─────────────────────────┘
The "trainable parameters" are .mdc rule files — not model weights. The system improves without retraining anything.
Prerequisites
- Node.js 18+
- Cursor with MCP support
- Playwright installed in your project (npm install -D @playwright/test)
Installation
Option A — npm (recommended)
# Install as a dev dependency in your Playwright project
npm install -D playwright-fixer-mcp
Option B — Clone from GitHub
git clone https://github.com/your-username/playwright-fixer-mcp.git
cd playwright-fixer-mcp
npm install
Setup
After installation, run the setup command to copy the Cursor rule templates into your project:
# From your Playwright project root
npx playwright-fixer-mcp setup
# With an explicit project root
npx playwright-fixer-mcp setup --project-root=/path/to/your/project
# Force overwrite existing rule files
npx playwright-fixer-mcp setup --force
This copies four files into .cursor/rules/ in your project:
| File | Layer | Purpose |
|---|---|---|
| playwright-mcp.mdc | Governance | Workflow trigger, closed-loop procedure, hard constraints |
| playwright-test-standards.mdc | Knowledge | Locator priority, DSL conventions, evolved heuristics |
| rule-evolution-review.mdc | Governance | Rule promotion workflow (human-triggered) |
| rule-evolution-queue.md | Queue | Pending / history log for rule proposals |
Note: rule-evolution-queue.md is never overwritten if it already contains [PENDING] entries, even with --force. Your pending proposals are safe.
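The overwrite behaviour described in the note amounts to a small guard. This is a sketch under assumptions: `mayOverwrite` is a hypothetical name, and leaving existing files alone when `--force` is absent is inferred from the flag description rather than confirmed by the source.

```javascript
// Hypothetical guard: may setup write `fileName` given what is already on disk?
// `existingContent` is null when the file does not exist yet.
function mayOverwrite(fileName, existingContent, force) {
  if (existingContent === null) return true; // nothing to protect

  // The queue is protected whenever it still holds pending proposals,
  // regardless of --force.
  const isQueue = fileName === "rule-evolution-queue.md";
  if (isQueue && existingContent.includes("[PENDING]")) return false;

  // Assumption: other existing files are only replaced with --force.
  return force;
}
```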
Cursor MCP Configuration
Add to your Cursor MCP config. Create .cursor/mcp.json in your project root (or add to Cursor → Settings → MCP):
If installed as dev dependency
{
  "mcpServers": {
    "playwright-fixer": {
      "command": "node",
      "args": ["./node_modules/playwright-fixer-mcp/index.js"]
    }
  }
}
If cloned locally
{
  "mcpServers": {
    "playwright-fixer": {
      "command": "node",
      "args": ["/absolute/path/to/playwright-fixer-mcp/index.js"]
    }
  }
}
Restart Cursor after adding the configuration. Verify the server appears under MCP tools.
Usage
Running a Test by Tag
Type a test tag in the Cursor chat — the closed-loop repair activates automatically:
@AUT-589-1
or
npx playwright test --grep @AUT-589-1
The system will:
- Find the spec file containing @AUT-589-1 (via resolve_spec_by_tag)
- Read the full spec + page object to understand test intent (via read_spec_file)
- Run the test (via run_test_and_analyze_failure)
- If it fails: collect screenshot + error → analyze → generate fix → validate against locator rules → apply → re-run
- If it passes: propose the learned fix pattern as a rule update (propose_rule_evolution)
The loop allows at most 2 attempts before it stops and escalates to human review.
Reviewing and Promoting Rule Proposals
After a successful auto-fix, a PENDING entry is written to .cursor/rules/rule-evolution-queue.md.
To review and promote:
- Open .cursor/rules/rule-evolution-queue.md
- Read the proposed rule under ## Pending (awaiting review)
- Change <!-- APPROVED | REJECTED | APPLIED --> to either <!-- APPROVED --> or <!-- REJECTED -->
- Tell Cursor: "apply approved rules"
The rule-evolution-review.mdc governance workflow activates:
- Appends approved rules to the target .mdc file
- Removes the entry from the Pending section
- Writes a permanent record to the History Log
Hard constraint: The AI cannot self-approve. Only entries the human has explicitly marked are processed.
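To illustrate why an untouched placeholder can never be promoted, here is a minimal sketch of selecting approved entries from the queue text. It is not the governance procedure itself, which lives in rule-evolution-review.mdc. The `selectApproved` name and the assumption that each proposal starts with a `### ` heading are hypothetical; only the comment markers come from the steps above.

```javascript
// Hypothetical parser: return the titles of entries a human marked APPROVED.
// Assumes each proposal starts with a "### <title>" heading line.
function selectApproved(queueMarkdown) {
  const entries = queueMarkdown
    .split(/^(?=### )/m) // cut before every "### " heading
    .filter((entry) => entry.startsWith("### "));

  return entries
    // The untouched placeholder "<!-- APPROVED | REJECTED | APPLIED -->"
    // does not match, because "-->" must follow APPROVED directly.
    .filter((entry) => /<!--\s*APPROVED\s*-->/.test(entry))
    .map((entry) => entry.split("\n")[0].replace(/^###\s*/, "").trim());
}
```

Requiring the exact single-word marker means the default state of an entry is "not processed", which matches the constraint that only explicitly marked entries move forward.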
Project Structure (after setup)
your-project/
├── .cursor/
│ ├── mcp.json ← Cursor MCP server config
│ └── rules/
│ ├── playwright-mcp.mdc ← Governance: workflow + hard constraints
│ ├── playwright-test-standards.mdc ← Knowledge: locators, DSL, evolved rules
│ ├── rule-evolution-review.mdc ← Governance: rule promotion procedure
│ └── rule-evolution-queue.md ← Queue: pending/history rule proposals
├── tests/
│ └── **/*.spec.js ← Your Playwright specs (tagged @AUT-xxx)
├── node_modules/
│ └── playwright-fixer-mcp/
│ └── index.js ← MCP server (Capability layer)
└── package.json
Test Tag Format
Every test must use the @AUT-xxx tag format for the system to locate and run it:
import { test } from '@playwright/test';
import helperFunctions from '../helpers.js';
import PageObjectName from '../../pageObjects/PageObjectName.js';

// The @AUT-xxx tag in the title is what the system resolves and runs.
test("description @BaseCase @PageName @testCaseName @AUT-xxx", async () => {
  const { browser, context, page } = await helperFunctions.setup_Backgound_Step();
  const pageObj = new PageObjectName(page);

  // Every action goes through helperFunctions so failures are normalized.
  await helperFunctions.given_A_Page(page, PageObjectName);
  await helperFunctions.click(page, pageObj.someButton);
  await helperFunctions.check_Element_Contains_Text(page, pageObj.result, 'expected text');

  await browser.close();
});
Always use helperFunctions — direct page.click() / page.fill() calls bypass the failure normalization layer that the CONTEXT_RESOLVERS depend on.
Failure Normalization
The system uses a DSL layer (helperFunctions) to convert raw Playwright errors into semantic signals:
// Raw Playwright error (no semantic information for the system):
// Timeout 5000ms exceeded
// Normalized error from helperFunctions (classifiable):
throw new Error(`Element "${elementName}" not found with selector: ${selector}`);
This is how CONTEXT_RESOLVERS deterministically classifies failures into hover, fill, iframe, or default — without LLM inference.
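A minimal sketch of both halves, normalization and classification, may make this concrete. Everything here is illustrative: the `click` wrapper, the `{ name, selector }` element shape, the `EXAMPLE_RESOLVERS` patterns, and `resolveContext` are assumptions, not the contents of helpers.js or index.js. Only the error message format and the `{ contextKey, test }` resolver shape are taken from this README.

```javascript
// Hypothetical DSL helper: wraps a Playwright action and rethrows a semantic
// error. `element` is assumed to look like { name, selector }.
async function click(page, element) {
  try {
    await page.locator(element.selector).click();
  } catch (cause) {
    throw new Error(
      `Element "${element.name}" not found with selector: ${element.selector}`,
      { cause }, // keep the raw Playwright error for debugging
    );
  }
}

// Illustrative resolver table; the real patterns are not shown in this README.
const EXAMPLE_RESOLVERS = [
  { contextKey: "iframe", test: (msg) => /iframe|frame/i.test(msg || "") },
  { contextKey: "hover", test: (msg) => /hover/i.test(msg || "") },
  { contextKey: "fill", test: (msg) => /fill/i.test(msg || "") },
];

// First match wins; anything unmatched falls back to "default".
function resolveContext(message) {
  const hit = EXAMPLE_RESOLVERS.find((resolver) => resolver.test(message));
  return hit ? hit.contextKey : "default";
}
```

Because classification is a plain regular-expression lookup over a message the DSL controls, the same failure always resolves to the same context.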
Locator Rules (Enforced)
validate_and_apply_fix is the only valid write path to spec files. It enforces these constraints before writing:
| Rule ID | Severity | Description |
|---|---|---|
| NO_XPATH | error | XPath locators are blocked. Use getByRole, getByLabel, or getByText. |
| IFRAME_USE_FRAME_LOCATOR | error | In iframe context, page.locator() is blocked; frame.locator() must be used. |
| CSS_CLASS_SELECTOR | warn | CSS class selectors trigger a warning. Prefer semantic locators. |
Error-level violations block the write and return the violation list. The model must regenerate a compliant fix and call validate_and_apply_fix again.
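The locked-door behaviour can be sketched as a validate-then-write function. This is an illustration under assumptions, not the code in index.js: the regular expressions, the `EXAMPLE_RULES` name, the second `contextKey` argument to `test`, the injected `write` callback, and the `{ ok, violations }` result shape are all hypothetical. The rule IDs and severities come from the table above.

```javascript
// Illustrative rule table using the { id, severity, test } shape.
const EXAMPLE_RULES = [
  {
    id: "NO_XPATH",
    severity: "error",
    test: (code) => /xpath=|locator\(\s*['"`]\/\//i.test(code),
  },
  {
    id: "IFRAME_USE_FRAME_LOCATOR",
    severity: "error",
    // Assumption: the resolved failure context is passed as a second argument.
    test: (code, contextKey) => contextKey === "iframe" && /\bpage\.locator\(/.test(code),
  },
  {
    id: "CSS_CLASS_SELECTOR",
    severity: "warn",
    test: (code) => /locator\(\s*['"`]\.[\w-]+/.test(code),
  },
];

// Validate first; call `write` only when no error-level rule fired.
function validateFix(code, contextKey, write) {
  const violations = EXAMPLE_RULES
    .filter((rule) => rule.test(code, contextKey))
    .map(({ id, severity }) => ({ id, severity }));

  const blocked = violations.some((v) => v.severity === "error");
  if (!blocked) write(code);
  return { ok: !blocked, violations };
}
```

The write happens inside the validator rather than after it, so there is no code path that reaches the spec file without passing the rules.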
Locator Priority (highest → lowest)
1. getByRole('button', { name: '...' }) ← preferred
2. getByLabel('...')
3. getByPlaceholder('...')
4. getByText('...')
5. [data-testid] / [data-qa]
6. CSS class selectors ← warn
7. XPath ← blocked
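As a rough illustration of how the priority list could be applied when several candidate locators exist, the sketch below maps a locator expression to its tier and picks the best one. `LOCATOR_TIERS`, `locatorTier`, `pickBest`, and the patterns are hypothetical and are not part of the server.

```javascript
// Illustrative tiers, checked in order. XPath is tested before data-testid
// and CSS classes so an XPath that mentions an attribute still ranks last.
const LOCATOR_TIERS = [
  { tier: 1, test: (s) => /getByRole\(/.test(s) },
  { tier: 2, test: (s) => /getByLabel\(/.test(s) },
  { tier: 3, test: (s) => /getByPlaceholder\(/.test(s) },
  { tier: 4, test: (s) => /getByText\(/.test(s) },
  { tier: 7, test: (s) => /xpath=|\(\s*['"`]\/\//.test(s) },
  { tier: 5, test: (s) => /\[data-(testid|qa)/.test(s) },
  { tier: 6, test: (s) => /\(\s*['"`]\.[\w-]+/.test(s) },
];

// Returns 1 (preferred) to 7 (blocked), or null when nothing matches.
function locatorTier(expression) {
  const hit = LOCATOR_TIERS.find((entry) => entry.test(expression));
  return hit ? hit.tier : null;
}

// Pick the candidate with the lowest tier number; unknown forms sort last.
function pickBest(candidates) {
  const rank = (expr) => locatorTier(expr) ?? Number.MAX_SAFE_INTEGER;
  return [...candidates].sort((a, b) => rank(a) - rank(b))[0];
}
```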
Tool Reference
Automated Loop Tools (Layer 2 — Capability)
| Tool | Description |
|---|---|
| run_test_and_analyze_failure | Run npx playwright test --grep @<tag>. Returns passed, resolvedContext, error artifacts. Enforces retry stop-loss (max 2 attempts before shouldStop: true). |
| analyze_and_fix_selector | Build a fix suggestion payload from error message + screenshot path + resolved failure context + rule bundle. Returns structured guidance for the model. |
| validate_and_apply_fix | Validate proposed code changes against LOCATOR_VIOLATION_RULES, then write to spec. The only valid write path. Returns violations if blocked. |
| resolve_spec_by_tag | Scan tests/ to find the spec file containing a given @AUT-xxx tag. Deterministic: never guesses. |
| read_spec_file | Read the full spec + auto-resolve its imported page object. Provides complete test context before analysis. |
| read_page_object_selectors | Extract getter → elementName → selector mapping from a page object file. |
| get_failure_artifacts | Scan test-results/ for the latest screenshot and trace.zip. Optionally filter by tag. |
| get_iframe_context | Extract iframe selector and frame-related operations from a spec + page object. |
| propose_rule_evolution | Write a learned fix pattern as a PENDING proposal to the rule evolution queue. Does not modify any .mdc file; human approval is required. |
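One detail worth illustrating for tag resolution is exact matching: a plain substring search for @AUT-589 would also hit @AUT-589-1. The sketch below shows one way to avoid that. It is an assumption-laden illustration, not the tool's implementation: `findSpecByTag`, the in-memory `{ path, source }` input, and the choice to throw on zero or multiple matches are all hypothetical.

```javascript
// Hypothetical resolver over already-loaded specs: [{ path, source }, ...].
function findSpecByTag(specs, tag) {
  // Escape regex metacharacters in the tag, then require that the tag is not
  // followed by another word character or hyphen (so @AUT-589 != @AUT-589-1).
  const escaped = tag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const exact = new RegExp(`${escaped}(?![\\w-])`);

  const hits = specs.filter((spec) => exact.test(spec.source)).map((spec) => spec.path);

  // Assumption: refusing to choose is how "never guess" would be honoured.
  if (hits.length !== 1) {
    throw new Error(`Expected exactly one spec for ${tag}, found ${hits.length}`);
  }
  return hits[0];
}
```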
Reference / Info Tools
| Tool | Description |
|---|---|
| get_playwright_fix_workflow | Return the full fix workflow reference (read spec → run → fail → fix → verify → evolve). |
| get_internal_locator_rules | Return the locator priority rules. |
| get_tag_run_rule | Return the tag trigger rule (when @AUT-xxx triggers the closed loop). |
Not an MCP Tool (by design)
apply_approved_rules — This is a governance operation triggered by human intent ("apply approved rules"), not by the automated loop. It lives in rule-evolution-review.mdc as a Cursor rule procedure. Keeping it out of index.js enforces the architectural boundary: the executor cannot promote its own rule proposals.
Customization
Adding a New Failure Context
Edit index.js to add a new entry to both CONTEXT_RESOLVERS and FAILURE_CONTEXT_MAP:
// In CONTEXT_RESOLVERS — detection pattern
const CONTEXT_RESOLVERS = [
  // ...existing entries...
  {
    contextKey: "select",
    test: (msg) => /selectOption|dropdown|ant-select/i.test(msg || ""),
  },
];

// In FAILURE_CONTEXT_MAP — instructions and rule blocks for this context
const FAILURE_CONTEXT_MAP = {
  // ...existing entries...
  select: {
    instruction: "3. **This error relates to a dropdown/select**: use `getByRole('option')` or `page.selectOption()`.\n",
    extraRuleBlocks: [],
  },
};
No other code changes required. The resolver runs first-match-wins.
Adding a Custom Locator Violation Rule
const LOCATOR_VIOLATION_RULES = [
  // ...existing rules...
  {
    id: "NO_DATA_TESTID_WHEN_ROLE_EXISTS",
    severity: "warn",
    // Matches both [data-testid] and [data-testid="..."] selectors.
    test: (code) => /\[data-testid/i.test(code),
    message: "Prefer getByRole over data-testid when a semantic role exists (LOCATOR_RULES).",
  },
];
severity: "error" blocks the write. severity: "warn" allows it but surfaces the issue.
Adjusting the Retry Limit
const MAX_AUTO_RETRIES = 3; // stop at attempt 3; allows attempts 1 and 2
Increase for environments with higher test flakiness.
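The off-by-one in that comment is easy to misread, so here is the stop condition it implies, written out. This is a sketch inferred from the comment, not the server's code, and `shouldStop` as a standalone function is a hypothetical name.

```javascript
const MAX_AUTO_RETRIES = 3;

// Inferred from the comment: attempts below the limit run, the limit itself stops.
function shouldStop(attemptNumber, maxAutoRetries = MAX_AUTO_RETRIES) {
  return attemptNumber >= maxAutoRetries;
}
```

With the default of 3, attempts 1 and 2 run and attempt 3 stops, which matches the two-attempt limit stated elsewhere in this README. Raising the constant to 4 would allow three attempts.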
Stage Evolution
This tool implements Stages 2–4 of the AI automation evolution path:
Stage 1 → Static rules in skill.md
(AI reads knowledge before acting)
Stage 2 → MCP selects rule bundles by context ← CONTEXT_RESOLVERS
(System decides the framework)
Stage 3 → Closed verification loop ← repair loop in playwright-mcp.mdc
(Outputs become testable hypotheses)
Stage 4 → Rules evolve based on execution results ← rule-evolution-queue.md
(System improves without retraining)
Stage 5 → Self-improving governed AI environment ← you build this on top
(The environment becomes the intelligence layer)
The "trainable parameters" are rule bundles in .mdc files — not model weights.
Architecture Articles
This tool was built based on the following series:
- Stop Prompting Your Way Out of Playwright Failures - The state reconstruction problem and closed-loop architecture
- The Environment Is the Prompt: Why MCP Rules Supersede Static Skill Files - Knowledge vs. constraints; why governance belongs in the environment
- Rules That Learn: How We Built a Self-Improving Test Governance System - The execution context separation; why apply_approved_rules is not an MCP tool
- The Three Layers of AI Automation Systems - Knowledge / Capability / Governance, and the cost of mixing them
License
MIT