HuggingFace Platform and Hub
The GitHub of machine learning – a French company running the world’s largest open-source model hub (1M+ models), with Inference Endpoints deployable to EU regions, making it the default infrastructure for enterprises that want model choice, EU data residency, and no vendor lock-in.
Company Overview
HuggingFace was founded in 2016 in Paris, France. Originally a chatbot startup, it pivoted to become the central hub for open-source ML. The company has raised over $400M (Series D at $4.5B valuation, August 2023).
Key facts:
- Headquartered in Paris, France (with US offices in New York)
- French company under EU jurisdiction
- 1M+ models hosted on the Hub
- 500k+ datasets, 300k+ Spaces (ML demo apps)
- Near-universal adoption among ML teams worldwide
Platform Components
1. Model Hub
The core product. A Git-based hosting platform for ML models.
Scale: 1M+ models from every major lab and community contributor – foundation models, fine-tuned variants, specialized models for every modality.
Key features: Model cards (standardized documentation), Git-based versioning with LFS, gated models requiring license acceptance, Safetensors (safe model serialization format), GGUF support for quantized models.
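Part of what makes Safetensors "safe" is its simple layout: unlike pickle, loading a file never executes code. A minimal sketch of that layout, assuming the documented format (an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes) — this is illustrative, not the official safetensors library, and real files also support an `__metadata__` entry and many dtypes:

```python
import json
import struct

def write_safetensors(tensors: dict, dtype: str = "F32") -> bytes:
    """Serialize named raw buffers in the safetensors layout (toy version)."""
    header, offset = {}, 0
    for name, raw in tensors.items():
        n = len(raw) // 4  # 4 bytes per F32 element
        header[name] = {"dtype": dtype, "shape": [n],
                        "data_offsets": [offset, offset + len(raw)]}
        offset += len(raw)
    hdr = json.dumps(header).encode("utf-8")
    # 8-byte little-endian header length, JSON header, then tensor data
    return struct.pack("<Q", len(hdr)) + hdr + b"".join(tensors.values())

def read_header(blob: bytes) -> dict:
    """Parse only the JSON header -- no code execution, unlike pickle."""
    (n,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + n])

blob = write_safetensors({"weight": struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)})
print(read_header(blob)["weight"]["shape"])  # [4]
```

Because the reader only parses JSON and slices byte ranges, a malicious model file cannot run arbitrary code at load time.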
2. Transformers Library
One of the most widely used ML libraries in the world.
```python
from transformers import pipeline

# A few lines to run any Hub model; this checkpoint is a sentiment classifier
classifier = pipeline(
    "text-classification",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)
result = classifier("This product is excellent!")
```
Ecosystem libraries: datasets, tokenizers, accelerate, peft (LoRA/QLoRA fine-tuning), trl (RLHF training), text-generation-inference (TGI, production serving).
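The idea behind peft's LoRA can be sketched in a few lines of pure Python (a toy illustration of the math, not peft's API): instead of updating a full d×d weight matrix W, you train a small low-rank update B·A with rank r much smaller than d, so far fewer parameters are trainable.

```python
# Toy LoRA sketch: effective weight is W + B @ A, with W frozen.
def matmul(a, b):
    """Plain-Python matrix multiply over nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

d, r = 4, 1                                  # hidden size 4, adapter rank 1
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
B = [[0.5], [0.0], [0.0], [0.0]]             # d x r, trainable
A = [[0.0, 1.0, 0.0, 0.0]]                   # r x d, trainable

W_adapted = add(W, matmul(B, A))             # effective weight W + BA
trainable = d * r + r * d                    # 8 params instead of d*d = 16
```

At d = 4096 and r = 8, the same arithmetic gives ~65k trainable parameters per matrix instead of ~16.8M, which is why LoRA fine-tuning fits on modest GPUs.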
3. Inference Endpoints (Managed Deployment)
Deploy any model from the Hub as a managed API endpoint with region selection.
EU Region Options:
| Provider | Region | Location | GPU Options |
|---|---|---|---|
| AWS | eu-west-1 | Ireland | A10G, A100, H100 |
| AWS | eu-west-2 | London | A10G |
| GCP | europe-west1 | Belgium | T4, A100 |
| GCP | europe-west4 | Netherlands | A100 |
Pricing: Pay per GPU-hour, zero cost when scaled to zero.
| GPU | Hourly Cost | Use Case |
|---|---|---|
| NVIDIA T4 | $0.50-0.80/hr | Small models (<7B), embeddings |
| NVIDIA A10G | $1.00-1.50/hr | Medium models (7-13B) |
| NVIDIA A100 (80GB) | $5.00-6.00/hr | Very large models (70B+) |
| NVIDIA H100 | $8.00-12.00/hr | Frontier models, high throughput |
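A back-of-envelope sizing check ties the two tables above together. Assuming the common rule of thumb of ~2 bytes per parameter for fp16 weights (0.5 bytes for 4-bit quantization) plus roughly 20% overhead for activations and KV cache, and using midpoints of the hourly rates listed above — all illustrative figures, not HuggingFace's official calculator:

```python
# GPU name -> (VRAM in GB, midpoint hourly rate from the pricing table)
GPUS = {
    "T4": (16, 0.65),
    "A10G": (24, 1.25),
    "A100-80GB": (80, 5.50),
    "H100": (80, 10.00),
}

def vram_needed_gb(params_billions, bytes_per_param=2.0):
    """Approximate serving VRAM: weights plus ~20% runtime overhead."""
    return params_billions * bytes_per_param * 1.2

def cheapest_gpu(params_billions, bytes_per_param=2.0):
    """Cheapest GPU from the table that fits the model, or None."""
    need = vram_needed_gb(params_billions, bytes_per_param)
    fits = [(rate, name) for name, (vram, rate) in GPUS.items() if vram >= need]
    return min(fits)[1] if fits else None

def monthly_cost(gpu, hours_per_day):
    """With scale-to-zero you only pay for active hours."""
    return GPUS[gpu][1] * hours_per_day * 30

print(cheapest_gpu(7))                     # A10G  (7B fp16 ~ 16.8 GB)
print(cheapest_gpu(70, bytes_per_param=0.5))  # A100-80GB (70B 4-bit ~ 42 GB)
print(monthly_cost("A10G", 8))             # 300.0
```

Note that a 70B model in fp16 (~168 GB) fits none of these single GPUs, which is why 70B-class endpoints typically rely on quantization or multi-GPU tensor parallelism.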
4. Text Generation Inference (TGI)
Open-source (Apache 2.0) production inference server for LLMs. Key features: continuous batching, Flash Attention 2, tensor parallelism, quantization, OpenAI-compatible API endpoint.
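Because TGI exposes an OpenAI-compatible endpoint, any OpenAI-style client can talk to it. A minimal stdlib sketch of building such a request, assuming a locally running TGI server mapped to port 8080 (the URL, port, and `"model": "tgi"` placeholder are assumptions for illustration):

```python
import json
from urllib import request

TGI_URL = "http://localhost:8080/v1/chat/completions"  # assumed local mapping

payload = {
    "model": "tgi",  # placeholder; TGI serves whichever model it was started with
    "messages": [
        {"role": "user", "content": "Summarize GDPR in one sentence."}
    ],
    "max_tokens": 128,
    "stream": False,
}

def build_request(url, body):
    """Construct the HTTP POST without sending it."""
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(TGI_URL, payload)
# With a live server, uncomment to send and read the completion:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Swapping `TGI_URL` between a self-hosted server and a managed Inference Endpoint is the only change needed, which is the portability argument in practice.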
Enterprise Hub
| Feature | Free Org | Enterprise Hub |
|---|---|---|
| Private model repos | Yes (limited) | Unlimited |
| SSO (SAML) | No | Yes |
| Audit logs | No | Yes |
| SLA | None | 99.9% uptime |
| Compliance reports | No | SOC2 Type II |
Pricing: Free, Pro ($9/user/month), Enterprise ($20/user/month), Custom.
EU Data Residency Story
HuggingFace as a French Company
- GDPR-compliant by default
- Positions its EU operations as outside the scope of the US CLOUD Act
Inference Endpoints: True EU Residency
- Compute runs in EU
- Model weights stored in EU
- Inference data never leaves EU
- No training on your data
- VPC peering available for private endpoints
Self-Hosted Option
- Download any open model from the Hub
- Run TGI on your own infrastructure
- Zero HuggingFace dependency at runtime – complete air-gap possible
When to Use HuggingFace
Use When
- You want model choice and no vendor lock-in
- Self-hosting open models
- Fine-tuning is part of your strategy
- EU data residency via self-hosting
- Rapid prototyping (Spaces for demos, Inference API for quick tests)
Avoid When
- You want a turnkey managed AI API (like OpenAI or Anthropic)
- You need frontier reasoning capability (open-weight models lag 6-12 months)
- Your team lacks ML engineering skills
- You want agent frameworks (use LangChain, CrewAI, or Anthropic Agent SDK)
HuggingFace vs. Alternatives
| Dimension | HuggingFace | Vertex AI Model Garden | Azure AI Model Catalog |
|---|---|---|---|
| Model count | 1M+ | ~200 | ~100 |
| Vendor lock-in | None (everything portable) | GCP-centric | Azure-centric |
| Self-host support | TGI (open-source) | Limited | Limited |
| Enterprise features | Enterprise Hub ($20/user) | GCP Enterprise | Azure Enterprise |