Portkey AI Gateway

Portkey is the most feature-complete purpose-built LLM gateway – it combines routing, guardrails, observability, and prompt management in one platform, which is both its strength and its vendor lock-in risk.


What It Is

Portkey is a purpose-built AI gateway and control plane for production LLM applications. It provides a unified API to 200+ LLM providers, built-in guardrails, virtual keys for credential management, smart caching, automatic fallbacks, and deep observability – all through a single integration point.

Unlike Kong (which adds AI to an API gateway) or LiteLLM (which focuses on the proxy layer), Portkey is a full production platform: gateway + observability + governance + prompt management.


Key Features

Unified API

Single API endpoint that routes to 200+ LLM providers across text, image, audio, and embedding modalities. OpenAI-compatible format – switch providers by changing a config, not code.
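
The "change a config, not code" idea can be sketched as follows. This is an illustrative mock, not Portkey's documented API: the gateway URL and header names below are assumptions, as are the model identifiers.

```python
# Sketch of an OpenAI-compatible gateway request: the body format is the
# same for every provider, and switching providers only changes config
# values. Header names and the gateway URL are illustrative assumptions.

GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"

def build_request(prompt: str, provider_config: dict) -> dict:
    """Build one OpenAI-format request; only config values vary per provider."""
    return {
        "url": GATEWAY_URL,
        "headers": {
            "x-gateway-api-key": provider_config["gateway_key"],
            "x-gateway-provider": provider_config["provider"],
        },
        "json": {
            "model": provider_config["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

openai_cfg = {"gateway_key": "gk-123", "provider": "openai", "model": "gpt-4o"}
anthropic_cfg = {"gateway_key": "gk-123", "provider": "anthropic", "model": "claude-sonnet"}

req_a = build_request("Hello", openai_cfg)
req_b = build_request("Hello", anthropic_cfg)
# Request shape is identical; only routing metadata and model name differ.
```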

Virtual Keys

Abstraction layer over raw API keys. Create virtual keys with:

  • Budget limits (per key, per user, per team)
  • Rate limits
  • Model access restrictions
  • Expiration dates

Teams never touch raw provider credentials. Rotate provider keys without updating any application code.
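
A minimal sketch of what virtual-key enforcement involves, combining the four controls above. The class and field names are made up for illustration, not Portkey's schema:

```python
# Illustrative virtual-key check: budget, rate limit, model allow-list,
# and expiry enforced before any provider credential is used.
import time

class VirtualKey:
    def __init__(self, budget_usd, rpm_limit, allowed_models, expires_at):
        self.budget_usd = budget_usd
        self.rpm_limit = rpm_limit
        self.allowed_models = set(allowed_models)
        self.expires_at = expires_at
        self.spent_usd = 0.0
        self.request_times = []

    def authorize(self, model: str, est_cost_usd: float, now: float) -> bool:
        """Return True and record the request if every limit passes."""
        if now >= self.expires_at:
            return False
        if model not in self.allowed_models:
            return False
        if self.spent_usd + est_cost_usd > self.budget_usd:
            return False
        # Sliding 60-second window for the per-minute rate limit.
        self.request_times = [t for t in self.request_times if now - t < 60]
        if len(self.request_times) >= self.rpm_limit:
            return False
        self.request_times.append(now)
        self.spent_usd += est_cost_usd
        return True

key = VirtualKey(budget_usd=1.0, rpm_limit=2,
                 allowed_models={"gpt-4o"}, expires_at=time.time() + 3600)
```

Because the application only ever sees the virtual key, rotating the underlying provider credential is invisible to callers.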

Guardrails

40+ pre-built guardrails that run on input and output:

  • PII detection and masking
  • Toxicity filtering
  • Prompt injection detection
  • Custom regex and semantic rules
  • Third-party guardrail provider integration

Guardrails execute at the gateway level, before and after LLM calls.
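
An input guardrail of the regex-masking kind can be sketched in a few lines. The patterns below are deliberately simplified examples, not Portkey's built-in detectors:

```python
# Illustrative input guardrail: regex-based PII masking run before the
# prompt reaches a provider. Patterns are simplified for the example.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace each PII match with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

masked = mask_pii("Contact jane@example.com, SSN 123-45-6789.")
# masked == "Contact [EMAIL], SSN [SSN]."
```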

Smart Caching

Simple caching (exact match) and semantic caching (similarity-based). Reduces cost and latency for repetitive queries.
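
The two cache modes can be contrasted in a small sketch. Real semantic caches compare embedding vectors; token-overlap (Jaccard) similarity stands in here to keep the example dependency-free, and the threshold value is arbitrary:

```python
# Sketch of simple (exact-match) vs. semantic (similarity-based) caching.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, a stand-in for embedding distance."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

class SmartCache:
    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        self.entries: dict[str, str] = {}

    def get(self, prompt: str):
        if prompt in self.entries:                   # simple: exact match
            return self.entries[prompt]
        for cached, answer in self.entries.items():  # semantic: similarity
            if jaccard(prompt, cached) >= self.threshold:
                return answer
        return None

    def put(self, prompt: str, answer: str):
        self.entries[prompt] = answer

cache = SmartCache()
cache.put("what is the capital of france", "Paris")
```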

Automatic Fallbacks & Retries

Configure fallback chains: if Provider A fails or is slow, automatically route to Provider B. Configurable retry logic with exponential backoff.
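
The fallback-plus-retry behavior described above amounts to a nested loop: retry each provider with exponential backoff, then move down the chain. A minimal sketch, with made-up provider names and failure modes:

```python
# Sketch of a fallback chain with exponential-backoff retries.
import time

def call_with_fallback(providers, prompt, max_retries=3, base_delay=0.01):
    """providers: list of (name, callable). Returns (provider_name, response)."""
    last_error = None
    for name, call in providers:
        for attempt in range(max_retries):
            try:
                return name, call(prompt)
            except Exception as e:
                last_error = e
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"all providers failed: {last_error}")

def flaky_primary(prompt):
    raise TimeoutError("provider A timed out")

def stable_backup(prompt):
    return f"answer to: {prompt}"

name, resp = call_with_fallback(
    [("provider-a", flaky_primary), ("provider-b", stable_backup)], "hi"
)
# name == "provider-b": provider A exhausted its retries, so B served it.
```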

Observability

Built-in logging, tracing, and analytics dashboard. Track:

  • Cost per request, per user, per model
  • Latency percentiles
  • Token usage patterns
  • Error rates by provider
  • Guardrail trigger rates
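
The metrics above all reduce to aggregations over per-request log records. A sketch with invented log data, showing cost per model, a crude latency median, and error rate:

```python
# Illustrative per-request aggregation of the kind an observability
# layer performs. The log records are made up for the example.
from collections import defaultdict

logs = [
    {"model": "gpt-4o", "cost": 0.02, "latency_ms": 420, "error": False},
    {"model": "gpt-4o", "cost": 0.03, "latency_ms": 610, "error": False},
    {"model": "claude", "cost": 0.01, "latency_ms": 380, "error": True},
]

# Cost per model.
cost_per_model = defaultdict(float)
for rec in logs:
    cost_per_model[rec["model"]] += rec["cost"]

# Crude p50 latency (middle element of the sorted odd-length list).
latencies = sorted(r["latency_ms"] for r in logs)
p50 = latencies[len(latencies) // 2]

# Error rate across all providers.
error_rate = sum(r["error"] for r in logs) / len(logs)
```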

Prompt Management

Version-controlled prompt templates with A/B testing. Change prompts without redeploying code.


Architecture

Client App (OpenAI-compatible SDK)
    |
    v
+-----------------------------------+
| Portkey Gateway                   |
|  (runs in your infra or cloud)    |
|                                   |
|  Virtual Key Resolution           |
|       |                           |
|  Input Guardrails                 |
|       |                           |
|  Cache Check (simple/semantic)    |
|       |                           |
|  Router (provider selection,      |
|    fallback, load balancing)      |
|       |                           |
|  Provider API Call                |
|       |                           |
|  Output Guardrails                |
|       |                           |
|  Logging & Analytics              |
+-----------------------------------+
    |
    v
LLM Providers (OpenAI, Anthropic, Google, Bedrock, Azure, etc.)

In hybrid/self-hosted mode, the gateway data plane runs in your infrastructure as containerized workloads. Configuration objects (virtual keys, configs, guardrails) are cached locally – no real-time round-trip to the control plane for request processing.


Self-Hosting

Portkey offers multiple deployment models:

Mode          | Description                                              | Data Residency
Cloud         | Fully managed SaaS                                       | Portkey's infrastructure
Hybrid        | Data plane in your infra, control plane in Portkey cloud | LLM traffic stays in your network
Private Cloud | Full deployment in your VPC                              | Complete data isolation
On-Premise    | Air-gapped deployment                                    | Full control

The gateway is containerized and deployable via Kubernetes, ECS, or Docker. Enterprise self-hosted deployments include SOC 2 Type 2, GDPR, and HIPAA compliance, plus custom BAAs.


Pricing / Cost Model

Tier       | Cost           | Key Limits
Developer  | Free           | 10K logs/month
Production | $49/mo         | 1M logs/month, team features
Scale      | Based on usage | Higher limits, priority support
Enterprise | Custom         | Self-hosted, compliance, SLAs, dedicated support

Portkey bills based on recorded logs, not raw API requests. This is a distinctive model – you pay for observability, not for proxying. The gateway proxying itself has no per-request fee.

For enterprise self-hosted, expect custom pricing in the $30K-100K+/year range depending on scale and deployment model.


When to Use

Strong fit:

  • Teams that want gateway + observability + guardrails in one platform
  • Organizations that need granular access control (virtual keys per team/project)
  • Production workloads with compliance requirements (SOC2, HIPAA, GDPR)
  • Multi-provider strategies with automatic fallback needs

Weak fit:

  • Cost-sensitive teams that only need basic routing – LiteLLM is free
  • Teams that want full open-source with no SaaS dependency
  • Organizations that already have observability (Langfuse, Datadog) and just need a proxy

This post is licensed under CC BY 4.0 by the author.