MCP Wayback Machine Server
Enables interaction with the Internet Archive's Wayback Machine to save web pages, retrieve archived versions, search historical snapshots, and check archive statistics without requiring API keys.
An MCP (Model Context Protocol) server and CLI tool for interacting with the Internet Archive's Wayback Machine without requiring API keys.
Built with: MCP TypeScript Template
Overview
This tool can be used in two ways:
- As an MCP server - Integrate with Claude Desktop for AI-powered interactions
- As a CLI tool - Use directly from the command line with npx or global installation
Features:
- Save web pages to the Wayback Machine
- Retrieve archived versions of web pages
- Check archive status and statistics
- Search the Wayback Machine CDX API for available snapshots
Features
- 🔐 No API keys required - Uses public Wayback Machine endpoints
- 💾 Save pages - Archive any publicly accessible URL
- 🔄 Retrieve archives - Get archived versions with optional timestamps
- 📊 Archive statistics - Get capture counts and yearly statistics
- 🔍 Search archives - Query available snapshots with date filtering
- ⏱️ Rate limiting - Built-in rate limiting to respect service limits
- 💻 Dual mode - Use as MCP server or standalone CLI tool
- 🎨 Rich CLI output - Colorized output with progress indicators
- 🔒 TypeScript - Full type safety with Zod validation
Tools
1. save_url
Archive a URL to the Wayback Machine.
- Input:
  - url (required) - The URL to save
- Output: Success status, archived URL, and timestamp
- Handles rate limiting automatically
2. get_archived_url
Retrieve an archived version of a URL.
- Input:
  - url (required) - The URL to retrieve
  - timestamp (optional) - Specific timestamp (YYYYMMDDhhmmss) or "latest"
- Output: Archived URL, timestamp, and availability status
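As a rough illustration of how the url and timestamp inputs could map onto an archived URL, the sketch below uses the public `https://web.archive.org/web/{timestamp}/{url}` pattern. The helper name `buildArchivedUrl` is hypothetical and is not taken from this server's source; the handling of "latest" is an assumption about how a timestamp-less path resolves.

```typescript
// Hypothetical helper for illustration; not the server's actual implementation.
const WAYBACK_BASE = "https://web.archive.org/web";

// A full Wayback timestamp is exactly 14 digits: YYYYMMDDhhmmss.
const TIMESTAMP_PATTERN = /^\d{14}$/;

function buildArchivedUrl(url: string, timestamp: string = "latest"): string {
  if (timestamp === "latest") {
    // Assumption: a path without a timestamp resolves to the most recent capture.
    return `${WAYBACK_BASE}/${url}`;
  }
  if (!TIMESTAMP_PATTERN.test(timestamp)) {
    throw new Error(`Invalid timestamp "${timestamp}": expected YYYYMMDDhhmmss`);
  }
  return `${WAYBACK_BASE}/${timestamp}/${url}`;
}
```

Rejecting malformed timestamps before any request is made keeps the error message specific instead of surfacing a confusing redirect or 404 from the archive.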
3. search_archives
Search for all archived versions of a URL.
- Input:
  - url (required) - The URL to search for
  - from (optional) - Start date (YYYY-MM-DD)
  - to (optional) - End date (YYYY-MM-DD)
  - limit (optional) - Maximum results (default: 10)
- Output: List of snapshots with dates, URLs, status codes, and mime types
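The sketch below shows one way these inputs might be translated into a CDX Server query and how the JSON response (a header row followed by data rows) could be turned into snapshot objects. The CDX parameters `url`, `from`, `to`, `limit`, and `output=json` and the default field names are part of the public CDX API; the function and interface names are hypothetical and the server's real code may differ.

```typescript
// Hypothetical names for illustration; only the CDX parameters and field names are real.
interface SearchOptions {
  from?: string; // YYYY-MM-DD
  to?: string; // YYYY-MM-DD
  limit?: number;
}

interface Snapshot {
  timestamp: string;
  originalUrl: string;
  mimeType: string;
  statusCode: string;
  archivedUrl: string;
}

// Convert YYYY-MM-DD into the digit-only form the CDX API accepts (YYYYMMDD).
function toCdxDate(date: string): string {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error(`Invalid date "${date}": expected YYYY-MM-DD`);
  }
  return date.replace(/-/g, "");
}

function buildCdxQuery(url: string, options: SearchOptions = {}): string {
  const params: string[] = [
    `url=${encodeURIComponent(url)}`,
    "output=json",
    `limit=${options.limit ?? 10}`,
  ];
  if (options.from) params.push(`from=${toCdxDate(options.from)}`);
  if (options.to) params.push(`to=${toCdxDate(options.to)}`);
  return `http://web.archive.org/cdx/search/cdx?${params.join("&")}`;
}

// With output=json the first row names the columns and the rest are captures.
function parseCdxRows(rows: string[][]): Snapshot[] {
  if (rows.length === 0) return [];
  const [header, ...records] = rows;
  const column = (name: string): number => header.indexOf(name);
  const ts = column("timestamp");
  const original = column("original");
  const mime = column("mimetype");
  const status = column("statuscode");
  return records.map((row) => ({
    timestamp: row[ts],
    originalUrl: row[original],
    mimeType: row[mime],
    statusCode: row[status],
    archivedUrl: `https://web.archive.org/web/${row[ts]}/${row[original]}`,
  }));
}
```

Looking columns up by header name rather than fixed position keeps the parser working if the query later restricts or reorders fields.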
4. check_archive_status
Check archival statistics for a URL.
- Input:
  - url (required) - The URL to check
- Output: Archive status, first/last capture dates, total captures, yearly statistics
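To make the shape of these statistics concrete, here is a minimal sketch that derives first/last capture, total count, and per-year counts from a list of 14-digit capture timestamps. The `summarizeCaptures` function and `ArchiveStats` interface are hypothetical; how the server actually gathers and aggregates captures is not shown in this README.

```typescript
// Hypothetical aggregation for illustration only.
interface ArchiveStats {
  totalCaptures: number;
  firstCapture: string | null;
  lastCapture: string | null;
  capturesByYear: Record<string, number>;
}

function summarizeCaptures(timestamps: string[]): ArchiveStats {
  // Fixed-width digit strings sort chronologically with a plain string sort.
  const sorted = [...timestamps].sort();
  const capturesByYear: Record<string, number> = {};
  for (const ts of sorted) {
    const year = ts.slice(0, 4);
    capturesByYear[year] = (capturesByYear[year] ?? 0) + 1;
  }
  return {
    totalCaptures: sorted.length,
    firstCapture: sorted[0] ?? null,
    lastCapture: sorted[sorted.length - 1] ?? null,
    capturesByYear,
  };
}
```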
Technical Details
- Transport: Stdio (for Claude Desktop integration)
- HTTP Client: Built-in fetch with timeout support
- Rate Limiting: 15 requests per minute (conservative limit)
- Error Handling: Graceful handling with detailed error messages
- Validation: URL and timestamp validation
- TypeScript: Full type safety with Zod schema validation
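The 15 requests per minute limit could be enforced in several ways. The sketch below uses a sliding window over recent request times; the class name `SlidingWindowLimiter` and its methods are hypothetical, and the server's own rate-limit.ts may use a different technique. The clock is injectable so the behaviour can be checked without waiting in real time.

```typescript
// Hypothetical sliding-window limiter; not the server's actual rate-limit.ts.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private maxRequests: number;
  private windowMs: number;
  private now: () => number;

  constructor(maxRequests: number = 15, windowMs: number = 60000, now: () => number = () => Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Milliseconds to wait before the next request is allowed (0 means go now).
  msUntilAllowed(): number {
    const current = this.now();
    const cutoff = current - this.windowMs;
    // Forget requests that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length < this.maxRequests) return 0;
    // The oldest remaining request determines when a slot frees up.
    return this.timestamps[0] + this.windowMs - current;
  }

  // Record a request if allowed; returns false when the caller should wait.
  tryAcquire(): boolean {
    if (this.msUntilAllowed() > 0) return false;
    this.timestamps.push(this.now());
    return true;
  }
}
```

A caller would typically check `msUntilAllowed()`, sleep for that long if it is non-zero, and then call `tryAcquire()` before issuing the HTTP request.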
API Endpoints (No Keys Required)
- Save Page Now: https://web.archive.org/save/{url} - Archive pages on demand
- Availability API: http://archive.org/wayback/available?url={url} - Check archive status
- CDX Server API: http://web.archive.org/cdx/search/cdx?url={url} - Advanced search and filtering
- TimeMap API: http://web.archive.org/web/timemap/link/{url} - Get all timestamps for a URL
- Metadata API: https://archive.org/metadata/{identifier} - Get Internet Archive item metadata
- Search API: https://archive.org/advancedsearch.php?q={query}&output=json - Search collections
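As an example of consuming one of these endpoints, the sketch below builds an Availability API request URL and interprets its JSON response, which nests the nearest capture under `archived_snapshots.closest` and returns an empty `archived_snapshots` object when nothing is archived. The function and interface names are hypothetical, and no network call is made here; the caller is assumed to fetch the URL and pass in the parsed body.

```typescript
// Hypothetical helpers; only the endpoint and response field names come from the public API.
interface AvailabilityResponse {
  archived_snapshots?: {
    closest?: { available: boolean; url: string; timestamp: string; status: string };
  };
}

interface AvailabilityResult {
  available: boolean;
  archivedUrl: string | null;
  timestamp: string | null;
}

function buildAvailabilityUrl(url: string, timestamp?: string): string {
  const base = `http://archive.org/wayback/available?url=${encodeURIComponent(url)}`;
  // The optional timestamp asks for the capture closest to that moment.
  return timestamp ? `${base}&timestamp=${timestamp}` : base;
}

function parseAvailability(body: AvailabilityResponse): AvailabilityResult {
  const closest = body.archived_snapshots?.closest;
  if (!closest || !closest.available) {
    return { available: false, archivedUrl: null, timestamp: null };
  }
  return { available: true, archivedUrl: closest.url, timestamp: closest.timestamp };
}
```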
Project Structure
mcp-wayback-machine/
├── src/
│ ├── index.ts # MCP server entry point
│ ├── tools/ # Tool implementations
│ │ ├── save.ts # save_url tool
│ │ ├── retrieve.ts # get_archived_url tool
│ │ ├── search.ts # search_archives tool
│ │ └── status.ts # check_archive_status tool
│ ├── utils/ # Utilities
│ │ ├── http.ts # HTTP client with timeout
│ │ ├── validation.ts # URL/timestamp validation
│ │ └── rate-limit.ts # Rate limiting implementation
│ └── *.test.ts # Test files (alongside source)
├── dist/ # Built JavaScript files
├── package.json
├── tsconfig.json
└── README.md
Installation
As a CLI Tool (Quick Start)
Use directly with npx (no installation needed):
npx mcp-wayback-machine save https://example.com
Or install globally:
npm install -g mcp-wayback-machine
wayback save https://example.com
As an MCP Server
Install for use with Claude Desktop:
npm install -g mcp-wayback-machine
From Source
git clone https://github.com/Mearman/mcp-wayback-machine.git
cd mcp-wayback-machine
yarn install
yarn build
Usage
CLI Usage
The tool provides a wayback command (or use npx mcp-wayback-machine):
Save a URL
wayback save https://example.com
# or
npx mcp-wayback-machine save https://example.com
Get an archived version
wayback get https://example.com
wayback get https://example.com --timestamp 20231225120000
wayback get https://example.com --timestamp latest
Search archives
wayback search https://example.com
wayback search https://example.com --limit 20
wayback search https://example.com --from 2023-01-01 --to 2023-12-31
Check archive status
wayback status https://example.com
Get help
wayback --help
wayback save --help
Claude Desktop Configuration
Add to your Claude Desktop settings:
Using npm installation
{
"mcpServers": {
"wayback-machine": {
"command": "npx",
"args": ["mcp-wayback-machine"]
}
}
}
Using local installation
{
"mcpServers": {
"wayback-machine": {
"command": "node",
"args": ["/absolute/path/to/mcp-wayback-machine/dist/index.js"]
}
}
}
For development (without building)
{
"mcpServers": {
"wayback-machine": {
"command": "npx",
"args": ["tsx", "/absolute/path/to/mcp-wayback-machine/src/index.ts"]
}
}
}
Development
Available Commands
yarn dev # Run in development mode with hot reload
yarn test # Run tests with coverage
yarn test:watch # Run tests in watch mode
yarn build # Build for production
yarn start # Run production build
yarn lint # Check code style
yarn lint:fix # Auto-fix code style issues
yarn format # Format code with Biome
Testing
The project uses Vitest for testing with the following features:
- Unit tests for all tools and utilities
- Integration tests for CLI commands
- Coverage reporting with c8
- Tests located alongside source files (.test.ts)
Run tests:
# Run all tests with coverage
yarn test
# Run tests in watch mode during development
yarn test:watch
# Run CI tests with JSON reporter
yarn test:ci
Examples
Using with Claude Desktop
Once configured, you can ask Claude to:
- "Save https://example.com to the Wayback Machine"
- "Find archived versions of https://example.com from 2023"
- "Check if https://example.com has been archived"
- "Get the latest archived version of https://example.com"
CLI Script Examples
# Archive multiple URLs
for url in "https://example.com" "https://example.org"; do
wayback save "$url"
sleep 5 # Be respectful with rate limiting
done
# Check if a URL was archived today
wayback search "https://example.com" --from $(date +%Y-%m-%d) --to $(date +%Y-%m-%d)
# Export archive data
wayback search "https://example.com" --limit 100 > archives.txt
Troubleshooting
Common Issues
- "URL not found in archive": The URL may not have been archived yet. Try saving it first.
- Rate limit errors: Add delays between requests or reduce request frequency.
- Connection timeouts: Check your internet connection and try again.
- Invalid timestamp format: Use YYYYMMDDhhmmss format (e.g., 20231225120000).
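For the timestamp issue in particular, a format check alone still lets impossible dates through (for example, month 13). The sketch below parses a YYYYMMDDhhmmss string into a UTC Date and rejects values that do not survive a round trip. The helper names are hypothetical and this is not necessarily how the server's validation.ts works.

```typescript
// Hypothetical helpers for illustration; Wayback timestamps are treated as UTC.
function formatWaybackTimestamp(date: Date): string {
  // toISOString yields "YYYY-MM-DDTHH:mm:ss.sssZ"; keep the first 14 digits.
  return date.toISOString().replace(/\D/g, "").slice(0, 14);
}

function parseWaybackTimestamp(timestamp: string): Date {
  const match = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(timestamp);
  if (!match) {
    throw new Error(`Invalid timestamp "${timestamp}": expected YYYYMMDDhhmmss`);
  }
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number);
  const date = new Date(Date.UTC(year, month - 1, day, hour, minute, second));
  // Date.UTC silently rolls over out-of-range fields, so compare against the input.
  if (formatWaybackTimestamp(date) !== timestamp) {
    throw new Error(`Invalid timestamp "${timestamp}": not a real calendar date/time`);
  }
  return date;
}
```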
Debug Mode
# Enable debug output
DEBUG=* wayback save https://example.com
# Check MCP server logs
DEBUG=* node dist/index.js
Resources
Official Documentation
- Wayback Machine APIs Overview
- Internet Archive API Documentation
- CDX Server Documentation
- Save Page Now 2 (SPN2) API
- Memento Protocol Guide
Rate Limits & Best Practices
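- The public APIs have no published hard rate limits, but requests can still be throttled; the server's built-in limit of 15 requests per minute is deliberately conservative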
- Be respectful - add delays between requests
- Use specific date ranges to reduce CDX result sets
- Cache responses when possible
- Include descriptive User-Agent header
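The caching advice above can be followed with something as small as an in-memory map with expiry. The sketch below is one possible shape, assuming responses are keyed by request URL; the `TtlCache` class is hypothetical and is not a feature this server is documented to include.

```typescript
// Hypothetical in-memory cache with time-based expiry; illustration only.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Archived snapshots are immutable once captured, so lookups for a specific timestamp can usually tolerate a long TTL, while "latest" lookups and capture counts go stale faster and suit a shorter one.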
Community
- MCP Discord - Get help and share your experience
- Internet Archive Forum - Wayback Machine discussions
Authenticated APIs (Not Implemented)
For completeness, here are Internet Archive APIs that require authentication but are not included in this MCP server:
S3-Compatible API (IAS3)
- Authentication: S3-style access keys from https://archive.org/account/s3.php
- Features: Upload files, modify metadata, create items, manage collections
- Documentation:
Authenticated Search API
- Authentication: S3 credentials
- Features: Advanced search capabilities, higher rate limits
- Access: Requires Internet Archive account
- Documentation:
Save Page Now 2 (SPN2) - Enhanced Features
- Authentication: Partnership agreement typically required
- Features: Bulk captures, priority processing, higher rate limits
- Documentation:
Partner/Bulk Access APIs
- Authentication: Special partnership agreement
- Features: Bulk downloads, custom data exports, direct database access
- Access: Contact Internet Archive directly
- Documentation:
Getting API Keys
- Create account at archive.org
- Visit S3 API page (requires login)
- Generate Access Key and Secret Key pair
- Configure using the ia configure command or manual configuration
Note: This MCP server focuses on public, keyless APIs to maintain simplicity and avoid credential management.
License
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made
- NonCommercial — You may not use the material for commercial purposes
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license
For commercial use or licensing inquiries, please contact the copyright holder.
