Browser Automation MCP Server
Enables intelligent web scraping through a browser automation tool that can search Google, navigate to webpages, and extract content from various websites including GitHub, Stack Overflow, and documentation sites.
🤖 Browser Automation Agent
A browser automation tool built on MCP (Model Context Protocol) that combines web scraping with LLM-based processing. The agent can search Google, navigate to webpages, and scrape content from GitHub, Stack Overflow, documentation sites, and generic websites, using a scraping strategy suited to each site type.
🚀 Features
- 🔍 Google Search Integration: Finds and retrieves top search results for any query
- 🕸️ Intelligent Web Scraping: Tailored scraping strategies for different website types:
  - 📂 GitHub repositories
  - 💬 Stack Overflow questions and answers
  - 📚 Documentation pages
  - 🌐 Generic websites
- 🧠 AI-Powered Processing: Uses Mistral AI for understanding and processing scraped content
- 🥷 Stealth Mode: Implements browser fingerprint protection to avoid detection
- 💾 Content Saving: Automatically saves both screenshots and text content from scraped pages
🏗️ Architecture
This project uses a client-server architecture powered by MCP:
- 🖥️ Server: Handles browser automation and web scraping tasks
- 👤 Client: Provides the AI interface using Mistral AI and LangGraph
- 📡 Communication: Uses stdio for client-server communication
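To make the stdio link concrete: the MCP stdio transport exchanges JSON-RPC 2.0 messages, one per line. The sketch below only illustrates that framing with an in-memory buffer standing in for the pipe. The helper names and the `query` argument are assumptions for illustration, not code taken from this project.

```python
import io
import json


def encode_message(method, params, msg_id):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    payload = {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return json.dumps(payload) + "\n"


def read_messages(stream):
    """Yield one decoded message per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory buffer stands in for the server's stdin (assumption for the demo).
pipe = io.StringIO()
pipe.write(
    encode_message(
        "tools/call",
        # "query" is a hypothetical argument name for get_top_google_url.
        {"name": "get_top_google_url", "arguments": {"query": "playwright docs"}},
        1,
    )
)
pipe.seek(0)
messages = list(read_messages(pipe))
```

In a real setup an MCP client library would handle this framing; the sketch is only meant to show what travels over stdin and stdout.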
⚙️ Requirements
- 🐍 Python 3.8+
- 🎭 Playwright
- 🧩 MCP (Model Context Protocol)
- 🔑 Mistral AI API key
📥 Installation
- Clone the repository:
git clone https://github.com/yourusername/browser-automation-agent.git
cd browser-automation-agent
- Install dependencies:
pip install -r requirements.txt
- Install Playwright browsers:
playwright install
- Create a .env file in the project root and add your Mistral AI API key:
MISTRAL_API_KEY=your_api_key_here
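As a rough illustration of how the key in .env might be read at startup, here is a minimal standard-library sketch. The project may well use a package such as python-dotenv instead; `load_env_file` and `get_mistral_key` are hypothetical helpers, not functions from this repository.

```python
import os


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; skip blanks, comments, and malformed lines."""
    values = {}
    try:
        with open(path, encoding="utf-8") as fh:
            for raw in fh:
                line = raw.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                # Drop surrounding whitespace and optional quotes around the value.
                values[key.strip()] = value.strip().strip('"').strip("'")
    except FileNotFoundError:
        pass  # A missing file simply yields no values.
    return values


def get_mistral_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("MISTRAL_API_KEY") or load_env_file(path).get("MISTRAL_API_KEY")
    if not key or key == "your_api_key_here":
        raise RuntimeError("MISTRAL_API_KEY is not set; add it to the .env file")
    return key
```

Failing early with a clear message is a deliberate choice here: a placeholder or missing key is easier to diagnose at startup than as an API error later.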
📋 Usage
Running the Server
python main.py
Running the Client
python client.py
Sample Interaction
Once both the server and client are running:
- Enter your query when prompted
- The agent will:
  - 🔍 Search Google for relevant results
  - 🧭 Navigate to the top result
  - 📊 Scrape content based on the website type
  - 📸 Save screenshots and content to files
  - 📤 Return processed information
🛠️ Tool Functions
get_top_google_url
🔍 Searches Google and returns the top result URL for a given query.
browse_and_scrape
🌐 Navigates to a URL and scrapes content based on the website type.
scrape_github
📂 Specializes in extracting README content and code blocks from GitHub repositories.
scrape_stackoverflow
💬 Extracts questions, answers, comments, and code blocks from Stack Overflow pages.
scrape_documentation
📚 Optimized for extracting documentation content and code examples.
scrape_generic
🌐 Extracts paragraph text and code blocks from generic websites.
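The tool list implies that browse_and_scrape picks one of the four scrapers based on the URL. One plausible way to do that is sketched below. The matching rules (host names and the /docs/ path hint) are assumptions, and `classify_url` is a hypothetical helper; the actual selection logic in main.py may differ.

```python
from urllib.parse import urlparse


def classify_url(url):
    """Return the name of the scraper that would handle this URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()

    # Match the exact domain or a subdomain, not any host that merely contains the name.
    if host == "github.com" or host.endswith(".github.com"):
        return "scrape_github"
    if host == "stackoverflow.com" or host.endswith(".stackoverflow.com"):
        return "scrape_stackoverflow"
    # Heuristic documentation hints (assumed, not taken from the project).
    if host.startswith("docs.") or host.endswith("readthedocs.io") or "/docs/" in path:
        return "scrape_documentation"
    return "scrape_generic"
```

Comparing against the parsed host name rather than searching the whole URL string avoids sending a lookalike domain such as notgithub.com to the GitHub scraper.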
📁 File Structure
browser-automation-agent/
├── main.py # MCP server implementation
├── client.py # Mistral AI client implementation
├── requirements.txt # Project dependencies
├── .env # Environment variables (API keys)
└── README.md # Project documentation
📤 Output Files
The agent generates two types of output files with timestamps:
- 📸 final_page_YYYYMMDD_HHMMSS.png: Screenshot of the final page state
- 📄 scraped_content_YYYYMMDD_HHMMSS.txt: Extracted text content from the page
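The naming pattern above can be reproduced with a single timestamp shared by both files, so a screenshot and its text always pair up. This is a small sketch; `output_paths` is a hypothetical helper name, not a function from the project.

```python
from datetime import datetime


def output_paths(now=None):
    """Build the screenshot and text file names from one shared timestamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"final_page_{stamp}.png", f"scraped_content_{stamp}.txt"


# Fixed time used only to make the example reproducible.
shot, text = output_paths(datetime(2025, 1, 31, 9, 5, 7))
# shot == "final_page_20250131_090507.png"
# text == "scraped_content_20250131_090507.txt"
```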
⚙️ Customization
You can modify the following parameters in the code:
- 🖥️ Browser window size: Adjust width and height in browse_and_scrape
- 👻 Headless mode: Set headless=True for invisible browser operation
- 🔢 Number of Google results: Change num_results in get_top_google_url
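For orientation, the sketch below shows where the window size and headless settings typically end up in Playwright's Python sync API. `BrowserSettings` and `open_page` are hypothetical names, and the default values are assumptions. The project's browse_and_scrape may wire these parameters differently (for example through the async API).

```python
from dataclasses import dataclass


@dataclass
class BrowserSettings:
    # Default values are illustrative assumptions, not the project's defaults.
    width: int = 1280
    height: int = 800
    headless: bool = True

    def launch_kwargs(self):
        """Keyword arguments for the browser launch call."""
        return {"headless": self.headless}

    def page_kwargs(self):
        """Keyword arguments for creating a page with a fixed viewport."""
        return {"viewport": {"width": self.width, "height": self.height}}


def open_page(url, settings):
    """Open a URL with the given settings and return the page title."""
    # Imported lazily so the settings class works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(**settings.launch_kwargs())
        page = browser.new_page(**settings.page_kwargs())
        page.goto(url)
        title = page.title()
        browser.close()
        return title
```

Calling `open_page("https://example.com", BrowserSettings(headless=False))` would show the browser window, which can help when debugging a scraper.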
❓ Troubleshooting
- 🔌 Connection Issues: Ensure both server and client are running in separate terminals
- 🎭 Playwright Errors: Make sure browsers are installed with playwright install
- 🔑 API Key Errors: Verify your Mistral API key is correctly set in the .env file
- 🛣️ Path Errors: Update the path to main.py in client.py if needed
📜 License
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Built with 🧩 MCP, 🎭 Playwright, and 🧠 Mistral AI