Jina AI Remote MCP Server
Provides web content extraction, search capabilities (web, arXiv, SSRN, images), semantic deduplication, and reranking through Jina AI's Reader, Embeddings, and Reranker APIs.
A remote Model Context Protocol (MCP) server that provides access to Jina Reader, Embeddings and Reranker APIs with a suite of URL-to-markdown, web search, image search, and embeddings/reranker tools:
| Tool | Description | Is Jina API Key Required? |
|---|---|---|
| primer | Get current contextual information for localized, time-aware responses | No |
| read_url | Extract clean, structured content from web pages as markdown via Reader API | Optional* |
| capture_screenshot_url | Capture high-quality screenshots of web pages via Reader API | Optional* |
| guess_datetime_url | Analyze web pages for last update/publish datetime with confidence scores | No |
| search_web | Search the entire web for current information and news via Reader API | Yes |
| search_arxiv | Search academic papers and preprints on arXiv repository via Reader API | Yes |
| search_ssrn | Search academic papers on SSRN (Social Science Research Network) via Reader API | Yes |
| search_images | Search for images across the web (similar to Google Images) via Reader API | Yes |
| expand_query | Expand and rewrite search queries based on the query expansion model via Reader API | Yes |
| parallel_read_url | Read multiple web pages in parallel for efficient content extraction via Reader API | Optional* |
| parallel_search_web | Run multiple web searches in parallel for comprehensive topic coverage and diverse perspectives via Reader API | Yes |
| parallel_search_arxiv | Run multiple arXiv searches in parallel for comprehensive research coverage and diverse academic angles via Reader API | Yes |
| parallel_search_ssrn | Run multiple SSRN searches in parallel for comprehensive social science research coverage via Reader API | Yes |
| sort_by_relevance | Rerank documents by relevance to a query via Reranker API | Yes |
| deduplicate_strings | Get top-k semantically unique strings via Embeddings API and submodular optimization | Yes |
| deduplicate_images | Get top-k semantically unique images via Embeddings API and submodular optimization | Yes |
Tools marked Optional* work without an API key but are rate-limited. For higher rate limits and better performance, use a Jina API key. You can get a free Jina API key from https://jina.ai
Usage
[!WARNING] Some clients do not support environment variables, so you may need to replace
${JINA_API_KEY} below with a real, hardcoded API key of the form jina_xxx.
For clients that support remote MCP servers (the Authorization header is optional):
{
"mcpServers": {
"jina-mcp-server": {
"url": "https://mcp.jina.ai/sse",
"headers": {
"Authorization": "Bearer ${JINA_API_KEY}"
}
}
}
}
For clients that do not yet support remote MCP servers, you need mcp-remote, a local proxy, to connect to the remote MCP server.
{
"mcpServers": {
"jina-mcp-server": {
"command": "npx",
"args": [
"mcp-remote",
"https://mcp.jina.ai/sse",
"--header",
"Authorization: Bearer ${JINA_API_KEY}"
]
}
}
}
For Claude Code:
claude mcp add --transport sse jina https://mcp.jina.ai/sse \
--header "Authorization: Bearer ${JINA_API_KEY}"
For OpenAI Codex: find ~/.codex/config.toml and add the following:
[mcp_servers.jina-mcp-server]
command = "npx"
args = [
"-y",
"mcp-remote",
"https://mcp.jina.ai/sse",
"--header",
"Authorization: Bearer ${JINA_API_KEY}"]
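If your client does not expand ${JINA_API_KEY}, as the warning at the top of this section notes, one option is to render the config with the key filled in before handing it to the client. The following is a minimal sketch using only the Python standard library; the helper name render_config and the embedded template are illustrative assumptions, not part of this project.

```python
import json
import os
from string import Template

# Template mirroring the remote-server config shown earlier in this section.
CONFIG_TEMPLATE = """{
  "mcpServers": {
    "jina-mcp-server": {
      "url": "https://mcp.jina.ai/sse",
      "headers": {
        "Authorization": "Bearer ${JINA_API_KEY}"
      }
    }
  }
}"""


def render_config(api_key: str) -> dict:
    """Hypothetical helper: fill in the key, then parse the result as JSON."""
    rendered = Template(CONFIG_TEMPLATE).substitute(JINA_API_KEY=api_key)
    return json.loads(rendered)


if __name__ == "__main__":
    # Falls back to the jina_xxx placeholder when the variable is unset.
    key = os.environ.get("JINA_API_KEY", "jina_xxx")
    print(json.dumps(render_config(key), indent=2))
```

Parsing the rendered text with json.loads also confirms that the result is valid JSON before it is written to a client config file.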
Troubleshooting
I got stuck in a tool calling loop - what happened?
This is a common issue with LMStudio when the default context window is 4096 and you're using a thinking model like gpt-oss-120b or qwen3-4b-thinking. As the thinking and tool calling continue, the conversation eventually hits the context window limit and the model starts losing track of the beginning of the task. Once the original instructions have rolled out of the context window, the model keeps calling tools without knowing why, and that is how it gets trapped in a loop.
The solution is to load the model with enough context length to contain the full tool calling chain and thought process.
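To make the failure mode concrete, here is a small simulation of a rolling context window. It is a sketch under stated assumptions: tokens are approximated by whitespace-separated words, and the oldest message is evicted first. How LMStudio actually trims context may differ.

```python
from collections import deque


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def rolling_window(messages: list[str], limit: int) -> list[str]:
    """Keep the most recent messages whose combined token count fits in `limit`."""
    window: deque[str] = deque()
    used = 0
    for message in messages:
        window.append(message)
        used += count_tokens(message)
        # Evict from the front (oldest first) until the window fits again.
        while used > limit and len(window) > 1:
            used -= count_tokens(window.popleft())
    return list(window)


if __name__ == "__main__":
    history = ["TASK: summarize three papers on reranking"]
    history += [f"tool call {i} and its long result text" for i in range(1, 6)]
    kept = rolling_window(history, limit=20)
    # The original task message is no longer in the window.
    print(history[0] in kept)
```

With a limit of 20 in this toy setup, the task description is the first message to be evicted, so the model would only see recent tool calls.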

I can't see all tools.
Some MCP clients have local caching and do not actively update tool definitions. If you're not seeing all the available tools or if tools seem outdated, you may need to remove and re-add the jina-mcp-server to your MCP client configuration. This will force the client to refresh its cached tool definitions. In LMStudio, you can click the refresh button to load new tools.

Claude Desktop says "Server disconnected" on Windows
Cursor and Claude Desktop (Windows) have a bug where spaces inside args aren't escaped when they invoke npx, which ends up mangling these values. You can work around it using:
{
// rest of config...
"args": [
"mcp-remote",
"https://mcp.jina.ai/sse",
"--header",
"Authorization:${AUTH_HEADER}" // note no spaces around ':'
],
"env": {
"AUTH_HEADER": "Bearer <JINA_API_KEY>" // spaces OK in env vars
}
},
Cursor shows a red dot on this MCP status
This is likely a UI bug in Cursor; the MCP works correctly regardless. You can toggle it off and on to "restart" the MCP if you find the red dot annoying (since you are using this as a remote MCP, this is not a real "server restart" but mostly a restart of the local proxy).

My LLM never uses some tools
If all tools are enabled in your MCP client but the LLM still never uses some of them, or favors some over others, this is fairly common when an LLM has been trained with a specific set of tools. For example, we rarely see parallel_* tools being used organically by LLMs unless they are explicitly instructed to do so. Some research suggests that LLMs must be trained to use parallel_* tools. Models like Qwen3-Next natively prefer to call the singleton version with multiple queries in an array to achieve parallelism (which our MCP now supports as well). Either way, in Cursor, you can add the following rule to your .mdc file:
---
alwaysApply: true
---
When you are uncertain about knowledge, or the user doubts your answer, always use Jina MCP tools to search and read best practices and latest information. Use search_arxiv and read_url together when questions relate to theoretical deep learning or algorithm details. Use search_ssrn for social sciences, economics, law, and finance research. search_web, search_arxiv, and search_ssrn cannot be used alone - always combine with read_url or parallel_read_url to read from multiple sources. Remember: every search must be complemented with read_url to read the source URL content. For maximum efficiency, use parallel_* versions of search and read when necessary.
Why is my content truncated?
Claude Code, Claude Desktop, and Cursor enforce a fixed 25k token limit on MCP tool responses. To prevent these clients from rejecting large responses entirely, this server applies a token guardrail specifically for read_url and parallel_read_url tools when connecting from these clients. For a single large content item, the text is truncated proportionally to fit within the token budget. For responses containing multiple items, the server keeps items in order until adding the next item would exceed the limit, then stops there. This ensures the response always fits within client constraints while preserving as much content as possible. Other clients like OpenAI Codex have configurable limits (tool_output_token_limit in config) so no server-side truncation is applied.
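The two truncation rules described above can be sketched as follows. This is an illustration, not the server's implementation: the four-characters-per-token estimate and the function names are assumptions, and the source does not say what happens when the first of several items already exceeds the limit (this sketch returns an empty list in that case).

```python
TOKEN_LIMIT = 25_000  # the 25k limit named above


def estimate_tokens(text: str) -> int:
    """Assumed heuristic: roughly four characters per token, rounded up."""
    return (len(text) + 3) // 4


def apply_guardrail(items: list[str], limit: int = TOKEN_LIMIT) -> list[str]:
    """Fit a list of content items into `limit` estimated tokens."""
    if not items:
        return []
    if len(items) == 1:
        text = items[0]
        tokens = estimate_tokens(text)
        if tokens <= limit:
            return [text]
        # Single item: cut the text in proportion to how far it overshoots.
        keep_chars = len(text) * limit // tokens
        return [text[:keep_chars]]

    kept: list[str] = []
    used = 0
    for item in items:
        cost = estimate_tokens(item)
        if used + cost > limit:
            break  # stop at the first item that would not fit
        kept.append(item)
        used += cost
    return kept
```

For example, with a limit of 25 estimated tokens and three 40-character items (10 tokens each), the first two items are kept and the third is dropped.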
Using parallel tools vs singleton tools with arrays
Claude Code recently started preferring parallel_* tools (like parallel_search_web, parallel_read_url) for concurrent operations. However, models like Qwen3-Next prefer calling singleton tools with multiple queries in an array. Both approaches work: the singleton versions (search_web, search_arxiv, search_ssrn, read_url) accept either a single string or an array of strings for the query/url parameter. When given an array, these tools automatically execute all queries in parallel internally, producing the same concurrent behavior as explicitly calling parallel_* tools. Use whichever style your model prefers.
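The string-or-array behavior described above can be sketched as a small dispatcher. This is a simplified illustration, not the server's code: fetch_one is a hypothetical stand-in for a single search or read request, and the real tools return richer results than plain strings.

```python
import asyncio
from typing import Union


async def fetch_one(query: str) -> str:
    """Hypothetical stand-in for a single search or read request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"result for {query}"


async def search(query: Union[str, list[str]]) -> list[str]:
    """Accept one query or a list of queries; run a list concurrently."""
    queries = [query] if isinstance(query, str) else list(query)
    # asyncio.gather returns results in the same order as its inputs.
    return list(await asyncio.gather(*(fetch_one(q) for q in queries)))


if __name__ == "__main__":
    print(asyncio.run(search("jina reranker")))
    print(asyncio.run(search(["jina reranker", "jina embeddings"])))
```

Normalizing a single string into a one-element list keeps one code path for both call styles, which is one plausible way to get the same concurrent behavior from either form.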
Developer Guide
Local Development
# Clone the repository
git clone https://github.com/jina-ai/MCP.git
cd MCP
# Install dependencies
npm install
# Start development server
npm run start
Deploy to Cloudflare Workers
This will deploy your MCP server to a URL like: jina-mcp-server.<your-account>.workers.dev/sse