
HuggingFace Platform and Hub

The GitHub of machine learning – a French company running the world’s largest open-source model hub (1M+ models), with Inference Endpoints deployable to EU regions, making it the default infrastructure for enterprises that want model choice, EU data residency, and no vendor lock-in.


Company Overview

HuggingFace was founded in 2016 in Paris, France. Originally a chatbot startup, it pivoted to become the central hub for open-source ML. The company has raised over $400M (Series D at $4.5B valuation, August 2023).

Key facts:

  • Headquartered in Paris, France (with US offices in New York)
  • French company under EU jurisdiction
  • 1M+ models hosted on the Hub
  • 500k+ datasets, 300k+ Spaces (ML demo apps)
  • De facto standard tooling for ML teams across industry and research

Platform Components

1. Model Hub

The core product. A Git-based hosting platform for ML models.

Scale: 1M+ models from every major lab and community contributor – foundation models, fine-tuned variants, specialized models for every modality.

Key features: Model cards (standardized documentation), Git-based versioning with LFS, gated models requiring license acceptance, Safetensors (safe model serialization format), GGUF support for quantized models.
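Because every repo on the Hub is a plain Git repository, individual files are addressable at predictable resolve URLs. A minimal sketch of that scheme (the helper function is ours for illustration; the `/{repo_id}/resolve/{revision}/{filename}` URL pattern is the Hub's):

```python
def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the Hub's raw-file URL: /{repo_id}/resolve/{revision}/{filename}."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Any git revision works: a branch, a tag, or a commit hash.
print(hub_file_url("bert-base-uncased", "config.json"))
# https://huggingface.co/bert-base-uncased/resolve/main/config.json
```

In practice you would use the huggingface_hub library's download helpers instead, which add local caching and authentication for gated models.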

2. Transformers Library

One of the most widely used open-source ML libraries.

from transformers import pipeline

# Three lines to run a model from the Hub
# (a sentiment model matched to the text-classification task)
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
result = classifier("This product is excellent!")

Ecosystem libraries: datasets, tokenizers, accelerate, peft (LoRA/QLoRA fine-tuning), trl (RLHF training), text-generation-inference (TGI, production serving).

3. Inference Endpoints (Managed Deployment)

Deploy any model from the Hub as a managed API endpoint with region selection.

EU Region Options:

Provider  Region        Location     GPU Options
AWS       eu-west-1     Ireland      A10G, A100, H100
AWS       eu-west-2     London       A10G
GCP       europe-west1  Belgium      T4, A100
GCP       europe-west4  Netherlands  A100

Pricing: Pay per GPU-hour, zero cost when scaled to zero.

GPU                 Hourly Cost     Use Case
NVIDIA T4           $0.50-0.80/hr   Small models (<7B), embeddings
NVIDIA A10G         $1.00-1.50/hr   Medium models (7-13B)
NVIDIA A100 (80GB)  $5.00-6.00/hr   Very large models (70B+)
NVIDIA H100         $8.00-12.00/hr  Frontier models, high throughput
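The scale-to-zero billing above comes down to simple arithmetic: an endpoint only accrues GPU-hours while it is scaled up. An illustrative estimate (not an official calculator; the $1.25/hr figure is the mid-range A10G rate from the table):

```python
def monthly_cost(rate_per_hour: float, active_hours_per_day: float, days: int = 30) -> float:
    """Estimated bill: GPU-hours actually consumed x hourly rate (zero when scaled to zero)."""
    return rate_per_hour * active_hours_per_day * days

# An A10G endpoint scaled up only 8 h/day (business hours):
print(round(monthly_cost(1.25, 8), 2))   # 300.0
# The same endpoint left running around the clock:
print(round(monthly_cost(1.25, 24), 2))  # 900.0
```

The gap between the two figures is why scale-to-zero matters for bursty or business-hours workloads.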

4. Text Generation Inference (TGI)

Open-source (Apache 2.0) production inference server for LLMs. Key features: continuous batching, Flash Attention 2, tensor parallelism, quantization, OpenAI-compatible API endpoint.
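Because TGI exposes an OpenAI-compatible route, any OpenAI-style client can target it by switching the base URL. A sketch of the request shape (the localhost address is a placeholder for your own deployment):

```python
import json

TGI_BASE = "http://localhost:8080"  # placeholder: your self-hosted TGI address

# Standard OpenAI chat-completions body; TGI serves it at /v1/chat/completions.
payload = {
    "model": "tgi",  # TGI serves a single model, so this field is effectively a label
    "messages": [{"role": "user", "content": "Summarize GDPR in one sentence."}],
    "max_tokens": 128,
    "stream": False,
}
endpoint = f"{TGI_BASE}/v1/chat/completions"
body = json.dumps(payload)
```

POST `body` to `endpoint` with a `Content-Type: application/json` header and you get back a response in the familiar OpenAI schema, so existing client code ports over with minimal changes.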


Enterprise Hub

Feature              Free Org       Enterprise Hub
Private model repos  Yes (limited)  Unlimited
SSO (SAML)           No             Yes
Audit logs           No             Yes
SLA                  None           99.9% uptime
Compliance reports   No             SOC2 Type II

Pricing: Free, Pro ($9/user/month), Enterprise ($20/user/month), Custom.


EU Data Residency Story

HuggingFace as a French Company

  • GDPR-compliant by default
  • Reduced US CLOUD Act exposure for EU operations compared with US-headquartered cloud providers

Inference Endpoints: True EU Residency

  • Compute runs in EU
  • Model weights stored in EU
  • Inference data never leaves EU
  • No training on your data
  • VPC peering available for private endpoints

Self-Hosted Option

  • Download any open model from the Hub
  • Run TGI on your own infrastructure
  • Zero HuggingFace dependency at runtime – complete air-gap possible
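The self-hosted path above comes down to a single container launch. A sketch of a typical TGI invocation (the model id and image tag are examples; check the TGI documentation for current tags and hardware flags):

```shell
# Serve an open model locally with TGI; weights are cached in ./data.
# Requires an NVIDIA GPU and the NVIDIA container toolkit.
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id mistralai/Mistral-7B-Instruct-v0.3
```

Once the weights are cached, the container needs no further Hub access, which is what makes the air-gapped deployment described above possible: pre-download the weights, then block outbound traffic.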

When to Use HuggingFace

Use When

  • You want model choice and no vendor lock-in
  • Self-hosting open models
  • Fine-tuning is part of your strategy
  • EU data residency via self-hosting
  • Rapid prototyping (Spaces for demos, Inference API for quick tests)

Avoid When

  • You want a turnkey managed AI API (like OpenAI or Anthropic)
  • You need frontier reasoning capability (open-weight models lag 6-12 months)
  • Your team lacks ML engineering skills
  • You want agent frameworks (use LangChain, CrewAI, or Anthropic Agent SDK)

HuggingFace vs. Alternatives

Dimension            HuggingFace                 Vertex AI Model Garden  Azure AI Model Catalog
Model count          1M+                         ~200                    ~100
Vendor lock-in       None (everything portable)  GCP-centric             Azure-centric
Self-host support    TGI (open-source)           Limited                 Limited
Enterprise features  Enterprise Hub ($20/user)   GCP Enterprise          Azure Enterprise

This post is licensed under CC BY 4.0 by the author.