weixin-articles-mcp

Read WeChat (微信) Official Account articles with native multimodal output — body, images, and video keyframes returned as MCP content blocks. Handles all three embed types: Tencent Video, WeChat-native, and Channels (视频号 metadata via public API).

MCP server for reading WeChat (微信) Official Account articles, with native multimodal output — images and video keyframes returned as content blocks, not URLs.

For personal/research use. This tool reads only publicly accessible article URLs and does not bypass any authentication or anti-bot measures. See Disclaimer before using.

License: MIT · Python 3.10+

Why this exists

Other tools that read WeChat articles for LLMs return a list of image URLs — your LLM has to click through to actually see them, costing extra round-trips and context.

This server returns the images themselves. And the video keyframes. Your LLM sees what you see, in one shot.

| Tool | Article text | Images | Videos |
| --- | --- | --- | --- |
| WebFetch (built-in) | ✅ (often blocked by anti-bot) | ❌ URLs only | |
| Existing WeChat MCPs / Skills | | ❌ URLs only | |
| weixin-articles-mcp | | Native image blocks | Keyframes as image blocks |

Features

  • 📰 Reliable WeChat scraping — pure Python httpx GET, no Rust binary or headless browser required
  • 🖼️ Native image content — PNG/JPG returned as MCP Image blocks, GIFs filtered, capped at 10 per article
  • 🎬 Video handling for all three embed types:
    • WeChat Official Account native videos (<iframe data-mpvid="wxv_*">): mp4 extracted from inline JS, 8 evenly-spaced keyframes via ffmpeg
    • Tencent Video (v.qq.com iframes): yt-dlp + ffmpeg keyframes
    • WeChat Channels (视频号, <mp-common-videosnap>): full metadata via the public batch_get_video_snap API (duration, dimensions, hi-res cover, full description, like count, publisher verification) + cover image. mp4 stream is locked behind WeChat's finder protocol — see Why no Channels mp4? below
  • 🕒 Publish time recovery — extracts var ct Unix timestamp that other parsers miss
  • 🪶 Minimal install: pip install + optional ffmpeg for video; no Chromium, no Rust
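The publish time recovery mentioned above can be illustrated with a small sketch. It assumes the article HTML contains an inline script assignment of the form `var ct = "<unix seconds>";`. The exact quoting and the helper name `extract_publish_time` are assumptions for illustration, not the project's actual parser code.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed shape of the inline script: var ct = "1700000000";
# The real page may quote or space this differently.
_CT_RE = re.compile(r"""var\s+ct\s*=\s*["'](\d{9,11})["']""")


def extract_publish_time(html: str) -> Optional[datetime]:
    """Return the publish time as an aware UTC datetime, or None if absent."""
    match = _CT_RE.search(html)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


sample = '<script>var ct = "1700000000";</script>'
print(extract_publish_time(sample))  # 2023-11-14 22:13:20+00:00
```

Returning `None` instead of raising keeps a missing timestamp from failing the whole article read.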

Install

# Core (article + images)
pip install weixin-articles-mcp

# With video keyframe support
pip install "weixin-articles-mcp[video]"
brew install ffmpeg   # or apt install ffmpeg on Linux

Configure

Claude Desktop / Claude Code

{
  "mcpServers": {
    "weixin-articles": {
      "command": "weixin-articles-mcp"
    }
  }
}

Cursor / Cline

Use the same JSON in your client's MCP server config.

Usage

Once configured, just paste a WeChat article URL into your conversation:

Read https://mp.weixin.qq.com/s/cexkyzQBRDG3uIF6g5cEbQ

Your LLM will receive:

  • Article metadata (title, account, publish time, cover URL)
  • Full article body in Markdown
  • All inline PNG/JPG images as native image content blocks
  • For each video, 8 keyframe images
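The README does not say how the 8 keyframe timestamps are chosen. One plausible way to get evenly spaced frames is to sample the midpoint of each of 8 equal segments, which avoids black frames at the very start and end. The sketch below assumes that approach. Both function names are hypothetical, and the ffmpeg arguments are standard single-frame extraction flags rather than the project's confirmed invocation.

```python
from typing import List


def keyframe_timestamps(duration_s: float, count: int = 8) -> List[float]:
    """Midpoints of `count` equal segments of a video lasting `duration_s` seconds."""
    if duration_s <= 0 or count <= 0:
        return []
    step = duration_s / count
    return [round(step * (i + 0.5), 3) for i in range(count)]


def ffmpeg_frame_command(video_path: str, timestamp_s: float, out_path: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that grabs one frame at a timestamp."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{timestamp_s:.3f}",  # seek before decoding the input
        "-i", video_path,
        "-frames:v", "1",             # write exactly one video frame
        "-q:v", "2",                  # high JPEG quality
        out_path,
    ]


for index, ts in enumerate(keyframe_timestamps(80.0)):
    print(" ".join(ffmpeg_frame_command("video.mp4", ts, f"frame_{index}.jpg")))
```

The command list can be passed to `subprocess.run` once ffmpeg is installed.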

Tool reference

read_article(url: str) -> list[content_block]

Returns a list of MCP content blocks:

  • [0] — text block: metadata + article body markdown
  • [1..N] — image blocks: article images (max 10, GIFs filtered)
  • For each video (max 3):
    • WeChat-native or Tencent: one text marker + 8 keyframe image blocks
    • WeChat Channels: one text marker (with duration, dimensions, like count, publisher, description) + 1 hi-res cover image block
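The image rules above (PNG/JPG only, GIFs filtered, at most 10) could be applied to a list of candidate URLs roughly as follows. This is a sketch under assumptions: it guesses the format from a `wx_fmt` query parameter, falling back to the path extension, which may not match how the project detects formats. The names `image_format` and `select_images` are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import parse_qs, urlparse

MAX_IMAGES = 10  # cap stated in the tool reference
ALLOWED_FORMATS = {"png", "jpg", "jpeg"}


def image_format(url: str) -> str:
    """Guess the image format from a wx_fmt query parameter or the path extension."""
    parsed = urlparse(url)
    fmt = parse_qs(parsed.query).get("wx_fmt", [""])[0].lower()
    if fmt:
        return fmt
    _, dot, ext = parsed.path.rpartition(".")
    return ext.lower() if dot else ""


def select_images(urls: Iterable[str], limit: int = MAX_IMAGES) -> List[str]:
    """Keep PNG/JPG URLs in document order, dropping GIFs and anything past the cap."""
    selected: List[str] = []
    for url in urls:
        if len(selected) >= limit:
            break
        if image_format(url) in ALLOWED_FORMATS:
            selected.append(url)
    return selected


urls = [
    "https://example.com/img/1?wx_fmt=png",
    "https://example.com/img/2?wx_fmt=gif",
    "https://example.com/img/3.jpg",
]
print(select_images(urls))
```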

On failure, returns a single text block starting with Error:.
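A client that post-processes the result might separate the blocks as sketched below. The blocks are modeled as plain dicts in the MCP JSON shape (`{"type": "text", "text": ...}` and `{"type": "image", "data": ..., "mimeType": ...}`). The helper names are hypothetical, and an SDK may expose typed objects instead of dicts.

```python
from typing import Any, Dict, List, Tuple

Block = Dict[str, Any]


def is_error(blocks: List[Block]) -> bool:
    """True when the result is the single 'Error:' text block described above."""
    return (
        len(blocks) == 1
        and blocks[0].get("type") == "text"
        and str(blocks[0].get("text", "")).startswith("Error:")
    )


def split_blocks(blocks: List[Block]) -> Tuple[str, List[Block], List[str]]:
    """Return (article text, image blocks, video marker texts)."""
    if not blocks or blocks[0].get("type") != "text":
        raise ValueError("expected a leading text block")
    body = blocks[0]["text"]
    images = [b for b in blocks[1:] if b.get("type") == "image"]
    markers = [b["text"] for b in blocks[1:] if b.get("type") == "text"]
    return body, images, markers


result = [
    {"type": "text", "text": "# Title\n\nBody"},
    {"type": "image", "data": "aGVsbG8=", "mimeType": "image/png"},
    {"type": "text", "text": "Video 1"},
    {"type": "image", "data": "aGVsbG8=", "mimeType": "image/jpeg"},
]
if not is_error(result):
    body, images, markers = split_blocks(result)
    print(len(body), len(images), markers)
```

Note that article images and video keyframes are both image blocks, so the text markers are the only way to tell where a video's frames begin.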

Roadmap

  • [x] WeChat article fetching with anti-bot handling
  • [x] Native image content blocks
  • [x] WeChat Official Account native video keyframe extraction
  • [x] Tencent Video keyframe extraction
  • [x] WeChat Channels (视频号) metadata enrichment via public API
  • [ ] ASR subtitles via faster-whisper (for native + Tencent videos)
  • [ ] Full-text search across read articles
  • [ ] Account subscription / new-article notifications

Why no Channels mp4?

Short answer: WeChat Channels (视频号) videos in articles intentionally don't expose a downloadable mp4 stream to public web access. The mp4 lives inside WeChat's finder protocol, which requires (a) a logged-in WeChat client session, (b) finder-specific encryption (the first 128KB of the mp4 is XOR-encrypted with a fixed key), and (c) intercepting the stream from the WeChat PC client at network level.

Every open-source WeChat Channels downloader in the wild — ltaoo/wx_channels_download, qiye45/wechatVideoDownload, putyy/res-downloader, KingsleyYau/WeChatChannelsDownloader and others — solves this with a MITM HTTPS proxy + WeChat PC client + root CA installation. That model is fundamentally incompatible with how an MCP server runs (no client, no user interaction, no admin install).

What we do instead: call WeChat's public batch_get_video_snap API (no cookie or session required) to give your LLM the next-best thing — high-resolution cover image, full description, duration, dimensions, like count, and publisher verification. For most use cases (reading and summarizing articles), this is enough to convey the video's substance.

If you specifically need the mp4 file, install one of the dedicated tools above alongside this MCP — they complement each other.

Architecture

src/weixin_articles_mcp/
├── server.py     # FastMCP entrypoint, tool registration
├── fetcher.py    # httpx GET with browser UA
├── parser.py     # WeChat DOM extraction (BeautifulSoup + lxml)
├── markdown.py   # HTML → Markdown (markdownify subclass)
└── media.py      # Image download + video download/keyframe extraction

Contributing

PRs welcome. Particularly looking for help on:

  • WeChat Channels (视频号) URL handling
  • Resilience to template variants from less common publishers
  • More test fixtures (different article styles)

Open an issue: https://github.com/jj-cheng25/weixin-articles-mcp/issues

Disclaimer

This tool is provided for personal, educational, and research use only.

What this tool does:

  • Reads publicly accessible WeChat article URLs (mp.weixin.qq.com/s/...) using a standard browser User-Agent — the same content any user with a web browser can view
  • Calls only public WeChat API endpoints that accept empty authentication fields (i.e. designed by WeChat to be reachable without login)
  • Enforces a default 1-second minimum interval between requests to prevent the tool from being repurposed as a high-volume crawler

What this tool does not do:

  • Use cookies, login sessions, or any form of user credential
  • Bypass any technical protection, anti-bot measure, or encrypted stream (e.g. WeChat Channels mp4 is intentionally not supported — see Why no Channels mp4?)
  • Decrypt, reverse-engineer, or circumvent WeChat's protocol-level protections
  • Store, cache, or redistribute fetched content beyond the immediate response

User responsibilities:

  • Respect WeChat's Terms of Service when using this tool. Personal/research use of publicly accessible articles is generally aligned with how the content is intended to be consumed; high-volume scraping or commercial redistribution likely is not.
  • Respect copyright of fetched content. Article content remains the property of its original authors and publishers; this tool only fetches and forwards it to your LLM for inline processing.
  • Do not flood mp.weixin.qq.com — keep usage at human reading rates. The default rate limit is set conservatively, but you can tighten it further by setting WEIXIN_FETCH_INTERVAL_S=2.0 (or higher) in your environment.
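The minimum-interval behavior can be sketched as below. Only the `WEIXIN_FETCH_INTERVAL_S` variable name and the 1-second default come from this README. The `MinIntervalLimiter` class, the `fetch_interval_s` helper, and the fallback to the default on an unparsable value are assumptions for illustration, not the project's actual implementation.

```python
import os
import time
from typing import Callable, Mapping, Optional

DEFAULT_INTERVAL_S = 1.0  # default stated in the disclaimer


def fetch_interval_s(env: Optional[Mapping[str, str]] = None) -> float:
    """Read WEIXIN_FETCH_INTERVAL_S, falling back to the default on bad input."""
    env = os.environ if env is None else env
    try:
        return max(0.0, float(env.get("WEIXIN_FETCH_INTERVAL_S", DEFAULT_INTERVAL_S)))
    except ValueError:
        return DEFAULT_INTERVAL_S


class MinIntervalLimiter:
    """Blocks so that consecutive calls to wait() are at least interval_s apart."""

    def __init__(
        self,
        interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._interval = interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed and return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


limiter = MinIntervalLimiter(fetch_interval_s())
limiter.wait()  # call once before each outgoing request
```

Injecting `clock` and `sleep` keeps the limiter testable without real delays.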

The authors and contributors of this project disclaim all liability arising from misuse. By using this software you accept full responsibility for ensuring your usage complies with applicable laws and the terms of service of the services it connects to.

License

MIT — see LICENSE.
