MCP Playwright Server
Enables AI assistants to control web browsers through Playwright automation, providing 50+ tools for navigation, interaction, testing, accessibility audits, and visual testing across Chromium, Firefox, and WebKit.
A Model Context Protocol (MCP) server for browser automation using Playwright. It lets AI assistants control browsers over MCP and provides tools for web testing, accessibility audits, and browser automation.
Features
- 🌐 Multi-Browser Support - Chromium, Firefox, and WebKit
- 🔧 50+ Browser Automation Tools - Navigation, interaction, assertions, screenshots
- ♿ Accessibility Testing - Built-in axe-core integration with WCAG compliance checks
- 🔒 Session Management - Isolated browser contexts with rate limiting
- 📸 Visual Testing - Screenshots, video recording, and tracing
- 🧪 AI Test Agents - Automated test planning, generation, and healing
Technology Stack
| Category | Technologies |
|---|---|
| Runtime | Node.js 18+ |
| Language | TypeScript 5.8 |
| Browser Automation | Playwright 1.52+ |
| Protocol | Model Context Protocol (MCP) SDK |
| Validation | Zod |
| Accessibility | @axe-core/playwright |
| Logging | Winston |
| Testing | @playwright/test |
| Linting | ESLint 9, Prettier |
Architecture
AI Client → MCP Protocol → Tool Handler → BrowserManager → Action Module → Playwright API
↑
SessionManager (lifecycle, rate limiting)
Layer Responsibilities
| Layer | Location | Purpose |
|---|---|---|
| Entry | src/index.ts | Bootstrap, graceful shutdown |
| MCP Server | src/server/mcp-server.ts | Tool/resource registration, session cleanup |
| Handlers | src/server/handlers/ | Tool definitions grouped by category |
| Browser Manager | src/playwright/browser-manager.ts | Orchestrates action modules |
| Actions | src/playwright/actions/ | Domain-specific Playwright operations |
| Session Manager | src/playwright/session-manager.ts | Session/page lifecycle, rate limiting |
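To make the Session Manager row more concrete, here is a minimal sketch of session bookkeeping: a registry that caps concurrent sessions and expires idle ones. The class name `SessionRegistry`, its methods, the counter-based IDs, and the injected clock are assumptions for illustration; the project's actual session-manager.ts may work differently.

```typescript
// Hypothetical sketch, not the project's SessionManager.
interface SessionRecord {
  id: string;
  lastActivity: number; // epoch milliseconds
}

class SessionRegistry {
  private sessions = new Map<string, SessionRecord>();
  private nextId = 1;
  private readonly maxSessions: number;
  private readonly timeoutMs: number;
  private readonly now: () => number;

  constructor(maxSessions: number, timeoutMs: number, now: () => number = () => Date.now()) {
    this.maxSessions = maxSessions;
    this.timeoutMs = timeoutMs;
    this.now = now;
  }

  /** Creates a session, or throws if the concurrency cap is reached. */
  create(): string {
    this.cleanupExpired();
    if (this.sessions.size >= this.maxSessions) {
      throw new Error(`Maximum sessions (${this.maxSessions}) reached`);
    }
    // A counter keeps the sketch deterministic; a real server would use UUIDs.
    const id = `session-${this.nextId++}`;
    this.sessions.set(id, { id, lastActivity: this.now() });
    return id;
  }

  /** Marks a session as active so that it is not expired. */
  touch(id: string): void {
    const session = this.sessions.get(id);
    if (!session) throw new Error(`Session not found: ${id}`);
    session.lastActivity = this.now();
  }

  /** Removes sessions idle for longer than timeoutMs and returns their IDs. */
  cleanupExpired(): string[] {
    const cutoff = this.now() - this.timeoutMs;
    const expired: string[] = [];
    this.sessions.forEach((session, id) => {
      if (session.lastActivity < cutoff) expired.push(id);
    });
    expired.forEach((id) => this.sessions.delete(id));
    return expired;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

Injecting the clock keeps expiry testable without real waiting; in production the default `Date.now()` would be used.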
Getting Started
Prerequisites
- Node.js 18 or higher
- npm or yarn
Installation
# Clone the repository
git clone https://github.com/j0hanz/playwright-mcp.git
cd playwright-mcp
# Install dependencies
npm install
# Install Playwright browsers
npm run install:browsers
Configuration
Create a .env file in the project root (optional):
LOG_LEVEL=info # debug, info, warn, error
DEFAULT_BROWSER=chromium # chromium, firefox, webkit
HEADLESS=true # Run headless mode
MAX_SESSIONS=5 # Concurrent sessions (1-20)
SESSION_TIMEOUT=1800000 # Session expiry in ms (30 min)
TIMEOUT_ACTION=20000 # Element action timeout in ms
TIMEOUT_NAVIGATION=30000 # Page navigation timeout in ms
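As a rough illustration of how these variables could be read, the sketch below parses them from a plain object, falls back to the values shown above, and clamps MAX_SESSIONS to the documented 1-20 range. The function `loadConfig`, the `ServerConfig` shape, and the choice of fallback values are assumptions; the project's server-config.ts may differ.

```typescript
// Hypothetical sketch: parse the .env variables listed above.
type BrowserName = 'chromium' | 'firefox' | 'webkit';

interface ServerConfig {
  logLevel: string;
  defaultBrowser: BrowserName;
  headless: boolean;
  maxSessions: number;
  sessionTimeout: number;
  timeoutAction: number;
  timeoutNavigation: number;
}

type Env = Record<string, string | undefined>;

/** Parses a base-10 integer, returning the fallback when the value is missing or invalid. */
function parseIntOr(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value === undefined ? '' : value, 10);
  return isNaN(parsed) ? fallback : parsed;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

function loadConfig(env: Env): ServerConfig {
  const browser = env.DEFAULT_BROWSER;
  return {
    logLevel: env.LOG_LEVEL === undefined ? 'info' : env.LOG_LEVEL,
    // Unknown browser names fall back to chromium (an assumption).
    defaultBrowser: browser === 'firefox' || browser === 'webkit' ? browser : 'chromium',
    // Headless unless explicitly disabled.
    headless: env.HEADLESS !== 'false',
    maxSessions: clamp(parseIntOr(env.MAX_SESSIONS, 5), 1, 20),
    sessionTimeout: parseIntOr(env.SESSION_TIMEOUT, 1800000),
    timeoutAction: parseIntOr(env.TIMEOUT_ACTION, 20000),
    timeoutNavigation: parseIntOr(env.TIMEOUT_NAVIGATION, 30000),
  };
}
```

In Node.js this would typically be called as `loadConfig(process.env)` after the .env file has been loaded.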
Running the Server
# Development mode (with hot reload)
npm run dev
# Production build and run
npm run build
npm start
Project Structure
├── src/
│   ├── index.ts                     # Application entry point
│   ├── config/
│   │   ├── server-config.ts         # Environment configuration
│   │   └── types.ts                 # TypeScript type definitions
│   ├── server/
│   │   ├── mcp-server.ts            # MCP server implementation
│   │   └── handlers/                # Tool handler categories
│   │       ├── browser-tools.ts     # Browser lifecycle tools
│   │       ├── navigation-tools.ts  # Navigation tools
│   │       ├── interaction-tools.ts # Click, fill, hover tools
│   │       ├── assertion-tools.ts   # Web-first assertions
│   │       ├── page-tools.ts        # Screenshots, content, a11y
│   │       ├── test-tools.ts        # Test file management
│   │       ├── advanced-tools.ts    # Network, tracing, dialogs
│   │       └── schemas.ts           # Zod validation schemas
│   ├── playwright/
│   │   ├── browser-manager.ts       # Central browser orchestration
│   │   ├── session-manager.ts       # Session lifecycle
│   │   ├── browser-launcher.ts      # Browser launch logic
│   │   └── actions/                 # Domain-specific actions
│   │       ├── assertion-actions.ts
│   │       ├── interaction-actions.ts
│   │       ├── navigation-actions.ts
│   │       └── ...
│   └── utils/
│       ├── error-handler.ts         # Centralized error handling
│       └── logger.ts                # Winston logger
├── tests/                           # Playwright test files
├── specs/                           # Human-readable test plans
├── .github/
│   ├── agents/                      # AI agent definitions
│   ├── prompts/                     # Agent prompts
│   └── copilot-instructions.md      # Development guidelines
└── playwright.config.ts             # Playwright test configuration
Available Tools
Browser Lifecycle
| Tool | Description |
|---|---|
| browser_launch | Launch browser (Chromium, Firefox, WebKit) with optional auth state |
| browser_close | Close browser session |
| browser_tabs | List, create, close, or select browser tabs |
| sessions_list | List all active browser sessions |
| save_storage_state | Save cookies/localStorage for auth reuse |
| session_reset_state | Clear session data for test isolation |
Navigation
| Tool | Description |
|---|---|
| browser_navigate | Navigate to URL |
| browser_history | Go back/forward in history |
| browser_reload | Reload current page |
| handle_dialog | Accept/dismiss browser dialogs |
Interaction
| Tool | Description |
|---|---|
| element_click | Click by role, text, testid, or selector |
| element_fill | Fill inputs by label, placeholder, or selector |
| element_hover | Hover over elements |
| select_option | Select dropdown options |
| keyboard_press | Press keys (Enter, Tab, shortcuts) |
| keyboard_type | Type text character by character |
| checkbox_set | Check/uncheck checkboxes |
| file_upload | Upload files |
| drag_and_drop | Drag and drop elements |
Assertions
| Tool | Description |
|---|---|
| assert_element | Assert state (visible, hidden, enabled, disabled) |
| assert_text | Assert element text content |
| assert_value | Assert input value |
| assert_url | Assert page URL |
| assert_title | Assert page title |
| assert_attribute | Assert element attribute |
| assert_css | Assert CSS property |
| assert_checked | Assert checkbox state |
| assert_count | Assert element count |
Page Operations
| Tool | Description |
|---|---|
| page_screenshot | Capture screenshots (full page, element, region) |
| page_pdf | Generate PDF from page (Chromium only) |
| page_content | Get HTML and text content |
| page_evaluate | Execute JavaScript (read-only) |
| wait_for_selector | Wait for elements |
| page_wait_for_load_state | Wait for page load |
| accessibility_scan | Run axe-core accessibility audit |
| browser_snapshot | Get accessibility tree snapshot |
Cookie Management
| Tool | Description |
|---|---|
| cookies_get | Retrieve cookies from browser context |
| cookies_set | Add cookies (auth tokens, sessions) |
| cookies_clear | Clear all or specific cookies |
Advanced
| Tool | Description |
|---|---|
| network_mock | Mock network responses |
| network_unroute | Remove network mocks |
| tracing_start / tracing_stop | Record execution traces |
| console_capture | Capture console messages |
| har_record_start | Record HTTP archive |
| clock_install | Control time in tests |
| video_path | Get video recording path |
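An MCP client invokes any of the tools listed above by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for browser_navigate. The envelope (`method`, `params.name`, `params.arguments`) follows the MCP specification; the helper `buildToolCall` and the argument names `sessionId` and `url` are assumptions, since the tool's input schema is not shown here.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

let nextRequestId = 1;

/** Hypothetical helper: wraps a tool name and its arguments in a tools/call request. */
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// Argument names below are illustrative; check the tool's schema for the real ones.
const navigateRequest = buildToolCall('browser_navigate', {
  sessionId: '123e4567-e89b-42d3-a456-426614174000',
  url: 'https://example.com',
});
```

In practice an MCP client library builds this envelope for you; the sketch only shows what travels over the wire.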
Best Practices for Stable Tests
These practices make tests more resilient, easier to maintain, and less prone to flakiness. See the full Best Practices Guide for detailed examples.
Core Principles
- Use Semantic, User-Facing Locators
  - Role-based locators are most reliable: getByRole('button', { name: 'Submit' })
  - Avoid CSS selectors and XPath, which break when styling changes
  - Priority: Role → Label → Placeholder → Text → TestId → CSS (last resort)
- Use Locator Chaining and Filtering
  - Chain locators to narrow searches: page.getByRole('listitem').filter({ hasText: 'Product 2' })
  - Filter by text or other locators for dynamic content
  - This reduces strict mode violations and increases clarity
- Always Use Web-First Assertions
  - Use expect() assertions, which auto-wait: await expect(page.getByText('Success')).toBeVisible()
  - Don't use direct checks like isVisible() without expect
  - Assertions wait up to 5 seconds (configurable) before failing
- Avoid Common Pitfalls
  - ❌ waitForTimeout(): use specific waits instead
  - ❌ waitForLoadState('networkidle'): use 'domcontentloaded' or wait for elements
  - ❌ CSS class selectors: use role/label/text locators
  - ❌ Screenshots as selectors: use browser_snapshot for finding elements
  - ❌ test.only() or test.skip(): remove before committing
Example: Good Test Structure
import { test, expect } from '@playwright/test';

test('Add todo and verify', async ({ page }) => {
  // Navigate (a relative URL requires baseURL in playwright.config.ts)
  await page.goto('/');

  // Get accessibility snapshot to understand page structure
  const snapshot = await page.accessibility.snapshot();

  // Interact using semantic locators (role > label > text)
  await page.getByPlaceholder('What needs to be done?').fill('Buy groceries');
  await page.getByRole('button', { name: 'Add' }).click();

  // Verify using web-first assertions (auto-wait)
  await expect(page.getByText('Buy groceries')).toBeVisible();
  await expect(page.getByRole('listitem')).toHaveCount(1);
});
Locator Priority
When interacting with elements, prefer user-facing locators (most reliable first):
- Role ⭐ - element_click(locatorType: 'role', role: 'button', name: 'Submit')
- Label ⭐ - element_fill(locatorType: 'label', value: 'Email', text: '...')
- Text - element_click(locatorType: 'text', value: 'Learn more')
- Placeholder - element_fill(locatorType: 'placeholder', value: 'Search...')
- TestId - element_click(locatorType: 'testid', value: 'submit-btn')
- Selector - CSS selector (last resort only)
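The locatorType values above correspond naturally to Playwright's getBy* locator methods. The sketch below shows one way a handler could translate a tool input into a locator. The `LocatorSource` interface is a structural subset of Playwright's Page (the method names are real Playwright APIs), while `LocatorSpec` and `resolveLocator` are hypothetical names, and the mapping itself is an assumption about how the server behaves.

```typescript
// Structural subset of Playwright's Page, generic over the locator type
// so the sketch compiles without importing Playwright.
interface LocatorSource<L> {
  getByRole(role: string, options?: { name?: string }): L;
  getByLabel(text: string): L;
  getByText(text: string): L;
  getByPlaceholder(text: string): L;
  getByTestId(testId: string): L;
  locator(selector: string): L;
}

// Hypothetical input shape mirroring the tool arguments listed above.
type LocatorSpec =
  | { locatorType: 'role'; role: string; name?: string }
  | { locatorType: 'label' | 'text' | 'placeholder' | 'testid' | 'selector'; value: string };

/** Maps a locator spec to the matching Playwright locator method. */
function resolveLocator<L>(page: LocatorSource<L>, spec: LocatorSpec): L {
  switch (spec.locatorType) {
    case 'role':
      return page.getByRole(spec.role, spec.name ? { name: spec.name } : undefined);
    case 'label':
      return page.getByLabel(spec.value);
    case 'text':
      return page.getByText(spec.value);
    case 'placeholder':
      return page.getByPlaceholder(spec.value);
    case 'testid':
      return page.getByTestId(spec.value);
    case 'selector':
      // Last resort: raw CSS selector.
      return page.locator(spec.value);
  }
}
```

Keeping the mapping in one function means the priority order only needs to be documented and enforced in a single place.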
Development Workflow
# Watch mode with hot reload
npm run dev
# Build TypeScript to dist/
npm run build
# Run ESLint
npm run lint
npm run lint:fix
# Type check without emit
npm run type-check
# Format with Prettier
npm run format
# Run tests
npm test
npm run test:ui # Interactive UI
npm run test:headed # Visible browser
npm run test:debug # Debug mode
Before committing: Run npm run lint && npm run type-check && npm run build
Coding Standards
Tool Registration Pattern
server.registerTool(
  'tool_name',
  {
    title: 'Human Title',
    description: 'What this tool does',
    inputSchema: {
      /* Zod schemas */
    },
    outputSchema: {
      /* Result shape */
    },
  },
  createToolHandler(async (input) => {
    const result = await browserManager.someMethod(input);
    return {
      content: [{ type: 'text', text: 'Human readable' }],
      structuredContent: result, // Machine readable
    };
  }, 'Error prefix message')
);
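The pattern above passes an error prefix to a handler wrapper. As a guess at what such a wrapper might do, the sketch below catches any thrown error and converts it into a tool result flagged with `isError`, which is the field MCP uses to report tool-level failures. The name `wrapToolHandler` and the `ToolResult` type are hypothetical; the project's createToolHandler may behave differently.

```typescript
// Minimal result shape, modelled on MCP tool results.
interface ToolResult {
  content: { type: 'text'; text: string }[];
  structuredContent?: Record<string, unknown>;
  isError?: boolean;
}

type ToolHandler<I> = (input: I) => Promise<ToolResult>;

/**
 * Hypothetical wrapper: runs the handler and turns thrown errors into
 * an error result whose text starts with the given prefix.
 */
function wrapToolHandler<I>(handler: ToolHandler<I>, errorPrefix: string): ToolHandler<I> {
  return async (input: I): Promise<ToolResult> => {
    try {
      return await handler(input);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      return {
        content: [{ type: 'text', text: `${errorPrefix}: ${message}` }],
        isError: true,
      };
    }
  };
}
```

Returning an error result instead of rethrowing lets the AI client read the failure message and decide how to recover.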
Action Module Pattern
export class MyActions extends BaseAction {
  async myOperation(sessionId: string, pageId: string, options: Options) {
    return this.executePageOperation(
      sessionId,
      pageId,
      'My operation',
      async (page) => {
        // Playwright operations
        return { success: true, data: '...' };
      }
    );
  }
}
Error Handling
import {
  ErrorCode,
  ErrorHandler,
  validateUUID,
} from '../utils/error-handler.js';

validateUUID(sessionId, 'sessionId'); // Throws on invalid
throw ErrorHandler.sessionNotFound(id); // Factory methods
throw ErrorHandler.handlePlaywrightError(e); // Maps Playwright errors
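For readers who want to see what helpers like these could look like, here is a self-contained sketch of a coded error class, a UUID validator, and a factory function. The error codes, the `McpToolError` class, and the UUID pattern are assumptions; the project's error-handler.ts is not shown here and may differ.

```typescript
// Hypothetical error codes; the project's ErrorCode values are not shown in this README.
const ErrorCodes = {
  SESSION_NOT_FOUND: 'SESSION_NOT_FOUND',
  VALIDATION_FAILED: 'VALIDATION_FAILED',
} as const;

type ErrorCodeValue = (typeof ErrorCodes)[keyof typeof ErrorCodes];

/** Error carrying a machine-readable code alongside the message. */
class McpToolError extends Error {
  readonly code: ErrorCodeValue;

  constructor(code: ErrorCodeValue, message: string) {
    super(message);
    this.name = 'McpToolError';
    this.code = code;
  }
}

// Matches the canonical 8-4-4-4-12 hexadecimal UUID layout (versions 1 to 8).
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[1-8][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

/** Throws a validation error when the value is not a UUID. */
function validateUuidSketch(value: string, field: string): void {
  if (!UUID_PATTERN.test(value)) {
    throw new McpToolError(ErrorCodes.VALIDATION_FAILED, `Invalid ${field}: expected a UUID`);
  }
}

/** Factory function in the spirit of ErrorHandler.sessionNotFound(id). */
function sessionNotFoundSketch(id: string): McpToolError {
  return new McpToolError(ErrorCodes.SESSION_NOT_FOUND, `Session not found: ${id}`);
}
```

Factory functions keep error messages and codes consistent across handlers, so clients can branch on `code` rather than parse message text.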
Testing
Tests use the @playwright/test framework. Configuration is in playwright.config.ts.
npm test # Run all tests
npm run test:ui # Interactive test UI
npm run test:headed # With visible browser
npm run test:debug # Debug mode with inspector
npm run test:trace # Record traces
npm run test:report # Show HTML report
Test Configuration
- Timeout: 30 seconds per test
- Retries: 2 on CI, 0 locally
- Browsers: Chromium, Firefox, WebKit, Mobile Chrome, Mobile Safari
- Viewport: 1366x900
- Test ID Attribute: data-testid
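The settings above could be expressed in playwright.config.ts roughly as follows. This is a sketch rather than the project's actual file: the `testDir` value, the project names, and the choice of `Pixel 5` and `iPhone 12` as mobile devices are assumptions. The desktop projects set the viewport after spreading the device descriptor, because a descriptor's own viewport would otherwise take precedence.

```typescript
import { defineConfig, devices } from '@playwright/test';

// Desktop viewport from the list above; mobile projects keep their device viewport.
const desktopViewport = { width: 1366, height: 900 };

export default defineConfig({
  testDir: './tests', // assumed location of the test files
  timeout: 30_000, // 30 seconds per test
  retries: process.env.CI ? 2 : 0, // 2 on CI, 0 locally
  use: {
    testIdAttribute: 'data-testid',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'], viewport: desktopViewport } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'], viewport: desktopViewport } },
    { name: 'webkit', use: { ...devices['Desktop Safari'], viewport: desktopViewport } },
    { name: 'Mobile Chrome', use: { ...devices['Pixel 5'] } },
    { name: 'Mobile Safari', use: { ...devices['iPhone 12'] } },
  ],
});
```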
AI Test Agents
Three AI agents for automated test workflows:
| Agent | Input | Output |
|---|---|---|
| Planner | App URL + seed test | specs/*.md test plans |
| Generator | Test plan | tests/*.spec.ts files |
| Healer | Failing test | Fixed test file |
Usage
- Planner: Explore app and create test plans in specs/
- Generator: Transform plans into Playwright tests
- Healer: Debug and fix failing tests
Agent definitions are in .github/agents/ with prompts in .github/prompts/.
Security
- URL validation: Only http:// and https:// protocols allowed
- UUID validation: All session/page IDs validated
- Rate limiting: Configurable MAX_SESSIONS_PER_MINUTE
- Session isolation: Each browser context is isolated
- Script restrictions: Only safe, read-only JavaScript evaluation
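Two of the measures above, the URL scheme allow-list and per-minute rate limiting, can be sketched in a few lines. The function `isAllowedUrl`, the class `SlidingWindowLimiter`, and the sliding-window approach are assumptions for illustration; the server's actual checks may be implemented differently.

```typescript
/** Accepts only URLs that start with http:// or https:// (case-insensitive). */
function isAllowedUrl(url: string): boolean {
  return /^https?:\/\//i.test(url.trim());
}

/**
 * Hypothetical sliding-window limiter: allows at most `limit` events
 * within any window of `windowMs` milliseconds.
 */
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Records the event and returns true if it fits within the limit; otherwise returns false. */
  tryAcquire(now: number): boolean {
    const cutoff = now - this.windowMs;
    // Drop events that have fallen out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

A limiter for MAX_SESSIONS_PER_MINUTE would be constructed as `new SlidingWindowLimiter(maxPerMinute, 60000)` and consulted with `tryAcquire(Date.now())` before each session is created. The scheme check is deliberately simple; a stricter version could also parse the URL and inspect the host.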
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Follow the coding standards in .github/copilot-instructions.md
- Run linting and type checking (npm run lint && npm run type-check)
- Ensure tests pass (npm test)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Adding a New Tool
- Add method to action class in src/playwright/actions/
- Register in handler file in src/server/handlers/
- Add schemas to schemas.ts if new input shapes needed
- Add tests for the new functionality
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Playwright - Browser automation framework
- Model Context Protocol - AI assistant protocol
- axe-core - Accessibility testing engine