Hugging Face MCP Server
Enables access to 200,000+ machine learning models through the Hugging Face Inference API. Supports text generation, image creation, classification, translation, speech processing, embeddings, and more AI tasks.
MCP server for accessing the Hugging Face Inference API. Run 200,000+ machine learning models including LLMs, image generation, text classification, embeddings, and more.
Features
- Text Generation: LLMs like Llama-3, Mistral, Gemma
- Image Generation: FLUX, Stable Diffusion XL, SD 2.1
- Text Classification: Sentiment analysis, topic classification
- Token Classification: Named entity recognition, POS tagging
- Question Answering: Extract answers from context
- Summarization: Condense long text
- Translation: 200+ language pairs
- Image-to-Text: Image captioning
- Image Classification: Classify images into categories
- Object Detection: Detect objects with bounding boxes
- Text-to-Speech: Convert text to audio
- Speech Recognition: Transcribe audio (Whisper)
- Embeddings: Get text/sentence embeddings
- And more: Fill-mask, sentence similarity
Setup
Prerequisites
- Hugging Face account
- API token (free or Pro)
Environment Variables
- HUGGINGFACE_API_TOKEN (required): Your Hugging Face API token
How to get an API token:
- Go to huggingface.co/settings/tokens
- Click "New token"
- Give it a name and select permissions (read is sufficient for inference)
- Copy the token (starts with hf_)
- Store it as HUGGINGFACE_API_TOKEN
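A client or wrapper script may want to fail early when the token is missing or malformed. The following is a minimal sketch of such a check; load_hf_token is a hypothetical helper for illustration and is not part of this server. It relies only on the two facts stated above: the variable name and the hf_ prefix.

```python
import os


def load_hf_token(env=os.environ):
    """Return the token stored in HUGGINGFACE_API_TOKEN or raise a clear error.

    Hypothetical helper: `env` defaults to the process environment and can be
    replaced with a plain dict for testing.
    """
    token = env.get("HUGGINGFACE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HUGGINGFACE_API_TOKEN is not set")
    if not token.startswith("hf_"):
        raise RuntimeError("HUGGINGFACE_API_TOKEN should start with 'hf_'")
    return token
```

Calling load_hf_token() at startup surfaces a configuration problem immediately instead of as a 401 on the first request.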
Available Tools
Text Generation Tools
text_generation
Generate text using large language models.
Parameters:
- prompt (string, required): Input text prompt
- model_id (string, optional): Model ID (default: 'mistralai/Mistral-7B-Instruct-v0.3')
- max_new_tokens (int, optional): Maximum tokens to generate
- temperature (float, optional): Sampling temperature 0-2 (higher = more random)
- top_p (float, optional): Nucleus sampling 0-1
- top_k (int, optional): Top-k sampling
- repetition_penalty (float, optional): Penalty for repetition
- return_full_text (bool, optional): Return prompt + generation (default: False)
Popular models:
- meta-llama/Llama-3.2-3B-Instruct: Meta's Llama 3.2
- mistralai/Mistral-7B-Instruct-v0.3: Mistral 7B
- google/gemma-2-2b-it: Google Gemma 2
- HuggingFaceH4/zephyr-7b-beta: Zephyr 7B
- tiiuae/falcon-7b-instruct: Falcon 7B
Example:
result = await text_generation(
    prompt="Write a Python function to calculate fibonacci numbers:",
    model_id="mistralai/Mistral-7B-Instruct-v0.3",
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9
)
Classification Tools
text_classification
Classify text into categories (sentiment, topics, etc.).
Parameters:
- text (string, required): Text to classify
- model_id (string, optional): Model ID (default: 'distilbert-base-uncased-finetuned-sst-2-english')
Popular models:
- distilbert-base-uncased-finetuned-sst-2-english: Sentiment (positive/negative)
- facebook/bart-large-mnli: Zero-shot classification
- cardiffnlp/twitter-roberta-base-sentiment-latest: Twitter sentiment
- finiteautomata/bertweet-base-sentiment-analysis: Tweet sentiment
Example:
result = await text_classification(
    text="I love this product! It exceeded my expectations.",
    model_id="distilbert-base-uncased-finetuned-sst-2-english"
)
# Returns: [{'label': 'POSITIVE', 'score': 0.9998}]
token_classification
Token-level classification for NER, POS tagging, etc.
Parameters:
- text (string, required): Input text
- model_id (string, optional): Model ID (default: 'dslim/bert-base-NER')
Popular models:
- dslim/bert-base-NER: Named Entity Recognition
- Jean-Baptiste/roberta-large-ner-english: Large NER model
- dbmdz/bert-large-cased-finetuned-conll03-english: CoNLL-2003 NER
Example:
result = await token_classification(
    text="Apple Inc. is located in Cupertino, California.",
    model_id="dslim/bert-base-NER"
)
# Returns entities: ORG (Apple Inc.), LOC (Cupertino), LOC (California)
Question Answering & Text Processing
question_answering
Answer questions based on provided context.
Parameters:
- question (string, required): Question to answer
- context (string, required): Context containing the answer
- model_id (string, optional): Model ID (default: 'deepset/roberta-base-squad2')
Popular models:
- deepset/roberta-base-squad2: RoBERTa on SQuAD 2.0
- distilbert-base-cased-distilled-squad: DistilBERT on SQuAD
Example:
result = await question_answering(
    question="Where is the Eiffel Tower located?",
    context="The Eiffel Tower is a landmark in Paris, France. It was built in 1889.",
    model_id="deepset/roberta-base-squad2"
)
# Returns: {'answer': 'Paris, France', 'score': 0.98, 'start': 34, 'end': 47}
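The start and end fields in the result can be used to locate the answer inside the context. The sketch below assumes they are 0-based character offsets with an exclusive end, as in the Transformers question-answering pipeline; highlight_answer is a hypothetical helper, and the result dict is built by hand rather than returned by the server.

```python
def highlight_answer(context, result, marker="**"):
    """Wrap the answer span in `marker` using the start/end character offsets."""
    start, end = result["start"], result["end"]
    return context[:start] + marker + context[start:end] + marker + context[end:]


context = "The Eiffel Tower is a landmark in Paris, France. It was built in 1889."
answer = "Paris, France"

# Hand-built result shaped like the example above; offsets are computed from
# the context string and the score is illustrative only.
start = context.index(answer)
result = {"answer": answer, "score": 0.98, "start": start, "end": start + len(answer)}

highlighted = highlight_answer(context, result)
```

If the sliced text context[start:end] does not match the answer field, the offsets likely follow a different convention and should be checked against the model card.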
summarization
Summarize long text into a shorter version.
Parameters:
- text (string, required): Text to summarize
- model_id (string, optional): Model ID (default: 'facebook/bart-large-cnn')
- max_length (int, optional): Maximum summary length
- min_length (int, optional): Minimum summary length
Popular models:
- facebook/bart-large-cnn: BART CNN summarization
- google/pegasus-xsum: PEGASUS XSum
- sshleifer/distilbart-cnn-12-6: Distilled BART
Example:
result = await summarization(
    text="Long article text here...",
    model_id="facebook/bart-large-cnn",
    max_length=130,
    min_length=30
)
translation
Translate text between languages.
Parameters:
- text (string, required): Text to translate
- model_id (string, required): Model ID for language pair
Popular models:
- Helsinki-NLP/opus-mt-en-es: English to Spanish
- Helsinki-NLP/opus-mt-es-en: Spanish to English
- Helsinki-NLP/opus-mt-en-fr: English to French
- Helsinki-NLP/opus-mt-en-de: English to German
- facebook/mbart-large-50-many-to-many-mmt: Multilingual (50 languages)
Example:
result = await translation(
    text="Hello, how are you?",
    model_id="Helsinki-NLP/opus-mt-en-es"
)
# Returns: "Hola, ¿cómo estás?"
Image Generation Tools
text_to_image
Generate images from text prompts.
Parameters:
- prompt (string, required): Text description of desired image
- model_id (string, optional): Model ID (default: 'black-forest-labs/FLUX.1-dev')
- negative_prompt (string, optional): What to avoid in image
- num_inference_steps (int, optional): Number of denoising steps
- guidance_scale (float, optional): How closely to follow prompt
Popular models:
- black-forest-labs/FLUX.1-dev: FLUX.1 (high quality)
- stabilityai/stable-diffusion-xl-base-1.0: SDXL
- stabilityai/stable-diffusion-2-1: SD 2.1
- runwayml/stable-diffusion-v1-5: SD 1.5
Example:
result = await text_to_image(
    prompt="A serene mountain landscape at sunset, photorealistic, 8k",
    model_id="black-forest-labs/FLUX.1-dev",
    negative_prompt="blurry, low quality, distorted",
    guidance_scale=7.5
)
# Returns: {'image': 'base64_encoded_image', 'format': 'base64'}
Computer Vision Tools
image_to_text
Generate text descriptions from images (captioning).
Parameters:
- image_base64 (string, required): Base64 encoded image
- model_id (string, optional): Model ID (default: 'Salesforce/blip-image-captioning-large')
Popular models:
- Salesforce/blip-image-captioning-large: BLIP large
- nlpconnect/vit-gpt2-image-captioning: ViT-GPT2
Example:
result = await image_to_text(
    image_base64="base64_encoded_image_data",
    model_id="Salesforce/blip-image-captioning-large"
)
# Returns: [{'generated_text': 'a dog playing in the park'}]
image_classification
Classify images into categories.
Parameters:
- image_base64 (string, required): Base64 encoded image
- model_id (string, optional): Model ID (default: 'google/vit-base-patch16-224')
Popular models:
- google/vit-base-patch16-224: Vision Transformer
- microsoft/resnet-50: ResNet-50
Example:
result = await image_classification(
    image_base64="base64_encoded_image_data",
    model_id="google/vit-base-patch16-224"
)
# Returns: [{'label': 'golden retriever', 'score': 0.95}, ...]
object_detection
Detect objects in images with bounding boxes.
Parameters:
- image_base64 (string, required): Base64 encoded image
- model_id (string, optional): Model ID (default: 'facebook/detr-resnet-50')
Popular models:
- facebook/detr-resnet-50: DETR with ResNet-50
- hustvl/yolos-tiny: YOLOS tiny
Example:
result = await object_detection(
    image_base64="base64_encoded_image_data",
    model_id="facebook/detr-resnet-50"
)
# Returns: [{'label': 'dog', 'score': 0.98, 'box': {...}}, ...]
Audio Tools
text_to_speech
Convert text to speech audio.
Parameters:
- text (string, required): Text to synthesize
- model_id (string, optional): Model ID (default: 'facebook/mms-tts-eng')
Popular models:
- facebook/mms-tts-eng: MMS TTS English
- espnet/kan-bayashi_ljspeech_vits: VITS LJSpeech
Example:
result = await text_to_speech(
    text="Hello, this is a test of text to speech.",
    model_id="facebook/mms-tts-eng"
)
# Returns: {'audio': 'base64_encoded_audio', 'format': 'base64'}
automatic_speech_recognition
Transcribe audio to text (speech recognition).
Parameters:
- audio_base64 (string, required): Base64 encoded audio
- model_id (string, optional): Model ID (default: 'openai/whisper-large-v3')
Popular models:
- openai/whisper-large-v3: Whisper large v3 (best quality)
- openai/whisper-medium: Whisper medium (faster)
- facebook/wav2vec2-base-960h: Wav2Vec 2.0
Example:
result = await automatic_speech_recognition(
    audio_base64="base64_encoded_audio_data",
    model_id="openai/whisper-large-v3"
)
# Returns: {'text': 'transcribed audio text here'}
Embedding & Similarity Tools
sentence_similarity
Compute similarity between sentences.
Parameters:
- source_sentence (string, required): Reference sentence
- sentences (list, required): List of sentences to compare
- model_id (string, optional): Model ID (default: 'sentence-transformers/all-MiniLM-L6-v2')
Popular models:
- sentence-transformers/all-MiniLM-L6-v2: Fast, good quality
- sentence-transformers/all-mpnet-base-v2: Best quality
- BAAI/bge-small-en-v1.5: BGE small
Example:
result = await sentence_similarity(
    source_sentence="The cat sits on the mat",
    sentences=[
        "A cat is sitting on a mat",
        "The dog runs in the park",
        "Cats are great pets"
    ],
    model_id="sentence-transformers/all-MiniLM-L6-v2"
)
# Returns: [0.95, 0.23, 0.65]
feature_extraction
Get embeddings (feature vectors) for text.
Parameters:
- text (string, required): Input text
- model_id (string, optional): Model ID (default: 'sentence-transformers/all-MiniLM-L6-v2')
Popular models:
- sentence-transformers/all-MiniLM-L6-v2: 384 dimensions
- sentence-transformers/all-mpnet-base-v2: 768 dimensions
- BAAI/bge-small-en-v1.5: 384 dimensions
Example:
result = await feature_extraction(
    text="This is a sample sentence.",
    model_id="sentence-transformers/all-MiniLM-L6-v2"
)
# Returns: [[0.012, -0.034, 0.056, ...]] (384-dimensional vector)
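Embeddings returned by feature_extraction are typically compared with cosine similarity, for example to rank documents against a query. The following is a small pure-Python sketch of that comparison; cosine_similarity and rank_by_similarity are hypothetical helpers, and the vectors in the usage note are toy values rather than real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs for doc_vecs, most similar to query_vec first."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With toy 2-dimensional vectors, rank_by_similarity([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]) puts index 1 first, then 2, then 0. For real use, pass the inner vector of each feature_extraction result.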
fill_mask
Fill in masked words in text.
Parameters:
- text (string, required): Text containing the model's mask token ([MASK] for BERT-style models; RoBERTa models such as roberta-base use <mask>)
- model_id (string, optional): Model ID (default: 'bert-base-uncased')
Popular models:
- bert-base-uncased: BERT base
- roberta-base: RoBERTa base
- distilbert-base-uncased: DistilBERT
Example:
result = await fill_mask(
    text="Paris is the [MASK] of France.",
    model_id="bert-base-uncased"
)
# Returns: [{'token_str': 'capital', 'score': 0.95}, ...]
Model Loading & Cold Starts
Important: Models may take 20-60 seconds to load on first request (cold start). Subsequent requests are faster.
Tips:
- Use popular models for faster loading
- Implement retry logic for timeouts
- Consider caching model responses
- Use smaller models for faster inference
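One possible way to act on the caching tip is an in-memory cache keyed by tool name and arguments. The sketch below is illustrative only: ResponseCache is a hypothetical helper, not part of this server, and it assumes each tool is an async callable taking JSON-serializable keyword arguments.

```python
import asyncio
import hashlib
import json


class ResponseCache:
    """In-memory cache keyed by tool name plus keyword arguments (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(tool_name, kwargs):
        # sort_keys makes the key independent of keyword-argument order.
        payload = json.dumps([tool_name, kwargs], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    async def call(self, tool, **kwargs):
        """Return the cached result for this call, invoking the tool only on a miss."""
        key = self._key(tool.__name__, kwargs)
        if key not in self._store:
            self._store[key] = await tool(**kwargs)
        return self._store[key]
```

Usage would look like `await cache.call(text_classification, text="...")`. Caching suits deterministic calls such as classification or embeddings; sampled text generation returns different output on each call, so caching it changes behavior.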
Rate Limits
Free Tier
- Rate limited to prevent abuse
- Suitable for testing and small projects
- May experience queuing during high load
Pro Subscription ($9/month)
- No rate limits
- Priority access to models
- Faster inference
- No queuing
Visit huggingface.co/pricing for details.
Base64 Encoding
For images and audio, you need to provide base64 encoded data:
Python example:
import base64

# Encode image
with open("image.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode('utf-8')

# Encode audio
with open("audio.wav", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode('utf-8')

# Decode image response
image_bytes = base64.b64decode(response['image'])
with open("generated.jpg", "wb") as f:
    f.write(image_bytes)
Parameter Tuning
Text Generation
- temperature (0-2): Higher = more creative/random, Lower = more focused/deterministic
- top_p (0-1): Nucleus sampling, typically 0.9-0.95
- top_k: Number of highest probability tokens to keep
- repetition_penalty: Penalize repeated tokens (>1.0 reduces repetition)
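To make the effect of temperature and top_p concrete, here is a conceptual sketch of how the two settings reshape a next-token distribution. It illustrates the general technique only and is not the implementation used by this server or by any particular model; apply_temperature and top_p_filter are hypothetical names, and temperature must be greater than 0.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits / temperature; lower temperature sharpens the distribution."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative mass reaches top_p, then renormalize.

    Returns a dict mapping token index to renormalized probability.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}
```

For logits [2.0, 1.0, 0.0], the top token's probability grows as temperature drops from 2.0 to 0.5, and top_p_filter([0.5, 0.3, 0.2], 0.7) keeps only the first two tokens.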
Image Generation
- guidance_scale (1-20): Higher = follows prompt more strictly (typical: 7-7.5)
- num_inference_steps: More steps = higher quality but slower (typical: 20-50)
- negative_prompt: Describe what you don't want in the image
Error Handling
Common errors:
- 503 Service Unavailable: Model is loading (cold start), retry after 20-60 seconds
- 401 Unauthorized: Invalid or missing API token
- 429 Too Many Requests: Rate limit exceeded (upgrade to Pro)
- 400 Bad Request: Invalid parameters or model ID
- 504 Gateway Timeout: Model took too long to respond
Retry logic example:
import asyncio
import httpx

async def generate_with_retry(prompt, max_retries=3):
    for attempt in range(max_retries):
        try:
            return await text_generation(prompt=prompt)
        except httpx.HTTPStatusError as e:
            # 503 means the model is still loading; wait, then retry
            if e.response.status_code == 503 and attempt < max_retries - 1:
                await asyncio.sleep(20)  # non-blocking wait for model to load
                continue
            raise
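The fixed wait above can be generalized to exponential backoff with jitter. The sketch below is one way to do that under stated assumptions: call_with_backoff, backoff_delays, and status_of are hypothetical helpers, the choice to retry 429 and 504 as well as 503 is an assumption rather than documented server behavior, and the status code is read from exc.response.status_code, the attribute that httpx.HTTPStatusError exposes.

```python
import asyncio
import random

# Assumption: these statuses are treated as transient and worth retrying.
RETRYABLE_STATUS = {429, 503, 504}


def backoff_delays(max_retries, base=2.0, cap=60.0):
    """Delays of base, 2*base, 4*base, ... seconds, each capped at `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(max_retries)]


def status_of(exc):
    """Best-effort lookup of exc.response.status_code; None if absent."""
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None)


async def call_with_backoff(tool, *, max_retries=4, base=2.0, cap=60.0,
                            sleep=asyncio.sleep, **kwargs):
    """Await tool(**kwargs), retrying transient HTTP errors with jittered backoff."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return await tool(**kwargs)
        except Exception as exc:
            if status_of(exc) not in RETRYABLE_STATUS or attempt == max_retries:
                raise
            # Jitter (50-100% of the delay) spreads out retries from concurrent clients.
            await sleep(delays[attempt] * random.uniform(0.5, 1.0))
```

Usage would look like `await call_with_backoff(text_generation, prompt="Hello")`. For cold starts of 20-60 seconds, a larger base (for example 10 seconds) may be more appropriate than the default.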
Finding Models
Browse models:
- Visit huggingface.co/models
- Filter by task (Text Generation, Image Generation, etc.)
- Sort by downloads, likes, or trending
- Check model card for usage examples
Popular categories:
- Text Generation: 50,000+ models
- Text Classification: 30,000+ models
- Image Generation: 10,000+ models
- Translation: 5,000+ models
- Embeddings: 3,000+ models
Best Practices
- Use popular models: Faster loading and better maintained
- Implement timeouts: Set appropriate timeouts (60-120 seconds)
- Cache responses: Store results to reduce API calls
- Handle cold starts: Implement retry logic for 503 errors
- Monitor usage: Track API calls and costs
- Test locally: Use Hugging Face Transformers library for testing
- Read model cards: Understand model capabilities and limitations
- Optimize parameters: Tune settings for your use case
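As an illustration of the timeout advice, a tool call can be wrapped in asyncio.wait_for. This is a minimal sketch: call_with_timeout is a hypothetical helper, the 90-second default is simply a value inside the 60-120 second range suggested above, and tools are assumed to be async callables.

```python
import asyncio


async def call_with_timeout(tool, timeout_s=90.0, **kwargs):
    """Await tool(**kwargs), raising TimeoutError if it takes longer than timeout_s seconds."""
    try:
        return await asyncio.wait_for(tool(**kwargs), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Re-raise with a message that names the tool and the limit.
        raise TimeoutError(
            f"{tool.__name__} did not finish within {timeout_s} seconds"
        ) from None
```

Usage would look like `await call_with_timeout(text_to_image, timeout_s=120, prompt="...")`. On timeout, asyncio.wait_for cancels the pending call before the error is raised.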
Use Cases
- Chatbots: LLM-powered conversational AI
- Content Generation: Blog posts, articles, creative writing
- Image Creation: Art, illustrations, product images
- Sentiment Analysis: Customer feedback analysis
- Translation: Multi-language support
- Transcription: Meeting notes, podcast transcripts
- Semantic Search: Embedding-based search
- Data Extraction: NER for document processing
- Content Moderation: Text and image classification