MCP Browser Automation Server
MCP server for browser automation with screenshot and console logging capabilities
weir1
A browser automation server that lets you control browsers and take screenshots through a REST API, and monitor console logs over a WebSocket connection.
Features
- Create browser sessions
- Navigate to URLs
- Take screenshots (full page or specific elements)
- Click elements
- Fill form inputs
- Monitor console logs in real-time through WebSocket
- Close sessions
Installation
- Clone this repository:
git clone https://github.com/weir1/mcp-browser-automation.git
cd mcp-browser-automation
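- Create a virtual environment and activate it (the activation command below is for Windows; on macOS or Linux, run source venv/bin/activate instead):
python -m venv venv
.\venv\Scripts\Activate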
- Install dependencies:
pip install -r requirements.txt
- Install Playwright browsers:
playwright install
Usage
- Start the server:
python server.py
The server will start on http://localhost:8000
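When the server is launched from a script, requests sent before it has finished starting will fail. The following is a minimal sketch of a readiness check, assuming only that the server accepts TCP connections on port 8000 once it is up. The helper name `wait_for_port` and its parameters are hypothetical and not part of this project.

```python
import socket
import time


def wait_for_port(host="localhost", port=8000, timeout=10.0, interval=0.2):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and try again.
            time.sleep(interval)
    return False
```

A caller could run `wait_for_port()` after starting `python server.py` and only create a session once it returns True. The check does not confirm that the listener is this server, only that the port is open.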
API Endpoints
Create a new session
POST /session/create
Response: { "session_id": "..." }
Navigate to a URL
POST /session/{session_id}/navigate?url=https://example.com
Take a screenshot
POST /session/{session_id}/screenshot?name=screenshot1&selector=.my-element
If selector is omitted, the server takes a full-page screenshot.
Click an element
POST /session/{session_id}/click?selector=.my-button
Fill an input
POST /session/{session_id}/fill?selector=input[name="username"]&value=myuser
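Selectors such as input[name="username"] contain characters that are not safe in a raw query string, so it is usually safer to percent-encode query values before sending them. The sketch below builds endpoint URLs with the standard library. The helper name `endpoint_url` is hypothetical, and the code assumes the server decodes query parameters in the usual way.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # address from the Usage section


def endpoint_url(session_id, action, **params):
    """Build a session endpoint URL, percent-encoding every query value."""
    # Drop parameters left as None, e.g. an omitted screenshot selector.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/session/{quote(session_id, safe='')}/{action}"
    return f"{url}?{query}" if query else url


# "abc" stands in for a real session ID returned by /session/create.
print(endpoint_url("abc", "fill", selector='input[name="username"]', value="myuser"))
print(endpoint_url("abc", "screenshot", name="screenshot1", selector=None))
```

In the first URL the square brackets, equals sign, and double quotes of the selector are escaped as %5B, %5D, %3D, and %22. The second URL carries no selector parameter, which corresponds to a full-page screenshot.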
Monitor console logs
WebSocket /session/{session_id}/console
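A console stream has no natural end, so a script often wants to read a bounded number of messages and stop when the page goes quiet. The sketch below assumes each message is a JSON string, as the Python example in this README suggests, and makes no assumption about the fields inside it. The function name `collect_console` is hypothetical, and `ws` can be any object with an async `recv()` method, such as a connection opened with `websockets.connect`.

```python
import asyncio
import json


async def collect_console(ws, limit=10, timeout=5.0):
    """Read up to `limit` JSON messages from `ws`, stopping early on a quiet period."""
    messages = []
    while len(messages) < limit:
        try:
            raw = await asyncio.wait_for(ws.recv(), timeout=timeout)
        except asyncio.TimeoutError:
            break  # nothing arrived within `timeout` seconds
        messages.append(json.loads(raw))
    return messages
```

Inside an `async with websockets.connect(...) as ws:` block, `await collect_console(ws)` would return the decoded messages as a list instead of looping forever.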
Close a session
POST /session/{session_id}/close
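Because every session created with /session/create should eventually be closed, the create and close calls pair naturally in a context manager. The sketch below is one possible wrapper, not part of this project: `browser_session` and `http_post` are hypothetical names, and the code assumes the create endpoint returns the JSON shape shown above and that the close endpoint accepts an empty POST body.

```python
import json
import urllib.request
from contextlib import contextmanager

BASE_URL = "http://localhost:8000"  # address from the Usage section


def http_post(url):
    """Send an empty-body POST and return the raw response bytes."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request) as response:
        return response.read()


@contextmanager
def browser_session(post=http_post, base_url=BASE_URL):
    """Yield a new session ID and close the session on exit, even after an error."""
    session_id = json.loads(post(f"{base_url}/session/create"))["session_id"]
    try:
        yield session_id
    finally:
        post(f"{base_url}/session/{session_id}/close")
```

With this in place, `with browser_session() as session_id:` can wrap the navigate, click, and screenshot calls, and the close request is sent even if one of them raises. The `post` parameter exists so that a different HTTP client or a test double can be swapped in.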
Example Usage with Python
import asyncio
import json

import requests
import websockets

# Create a session
response = requests.post("http://localhost:8000/session/create")
session_id = response.json()["session_id"]

# Navigate to a URL
requests.post(f"http://localhost:8000/session/{session_id}/navigate?url=https://example.com")

# Take a screenshot and save the response body to a file
response = requests.post(f"http://localhost:8000/session/{session_id}/screenshot?name=example")
with open("screenshot.png", "wb") as f:
    f.write(response.content)

# Monitor console logs until interrupted (Ctrl+C)
async def monitor_console():
    async with websockets.connect(f"ws://localhost:8000/session/{session_id}/console") as ws:
        while True:
            message = await ws.recv()
            print(json.loads(message))

asyncio.run(monitor_console())
License
MIT