Portkey AI Gateway

Portkey is the most feature-complete purpose-built LLM gateway – it combines routing, guardrails, observability, and prompt management in one platform, which is both its strength and its vendor lock-in risk.


What It Is

Portkey is a purpose-built AI gateway and control plane for production LLM applications. It provides a unified API to 200+ LLM providers, built-in guardrails, virtual keys for credential management, smart caching, automatic fallbacks, and deep observability – all through a single integration point.

Unlike Kong (which adds AI to an API gateway) or LiteLLM (which focuses on the proxy layer), Portkey is a full production platform: gateway + observability + governance + prompt management.


Key Features

Unified API

Single API endpoint that routes to 200+ LLM providers across text, image, audio, and embedding modalities. OpenAI-compatible format – switch providers by changing a config, not code.
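
The "change a config, not code" idea can be sketched as follows. This is an illustrative mock, not Portkey's documented API: the gateway URL and header names below are assumptions, as are the model identifiers.

```python
# Sketch of an OpenAI-compatible gateway request: the body format is the
# same for every provider, and switching providers only changes config
# values. Header names and the gateway URL are illustrative assumptions.

GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"

def build_request(prompt: str, provider_config: dict) -> dict:
    """Build one OpenAI-format request; only config values vary per provider."""
    return {
        "url": GATEWAY_URL,
        "headers": {
            "x-gateway-api-key": provider_config["gateway_key"],
            "x-gateway-provider": provider_config["provider"],
        },
        "json": {
            "model": provider_config["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

openai_cfg = {"gateway_key": "gk-123", "provider": "openai", "model": "gpt-4o"}
anthropic_cfg = {"gateway_key": "gk-123", "provider": "anthropic", "model": "claude-sonnet"}

req_a = build_request("Hello", openai_cfg)
req_b = build_request("Hello", anthropic_cfg)
# Request shape is identical; only routing metadata and model name differ.
```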

Virtual Keys

Abstraction layer over raw API keys. Create virtual keys with:

  • Budget limits (per key, per user, per team)
  • Rate limits
  • Model access restrictions
  • Expiration dates

Teams never touch raw provider credentials. Rotate provider keys without updating any application code.
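
A minimal sketch of what virtual-key enforcement involves, combining the four controls above. The class and field names are made up for illustration, not Portkey's schema:

```python
# Illustrative virtual-key check: budget, rate limit, model allow-list,
# and expiry enforced before any provider credential is used.
import time

class VirtualKey:
    def __init__(self, budget_usd, rpm_limit, allowed_models, expires_at):
        self.budget_usd = budget_usd
        self.rpm_limit = rpm_limit
        self.allowed_models = set(allowed_models)
        self.expires_at = expires_at
        self.spent_usd = 0.0
        self.request_times = []

    def authorize(self, model: str, est_cost_usd: float, now: float) -> bool:
        """Return True and record the request if every limit passes."""
        if now >= self.expires_at:
            return False
        if model not in self.allowed_models:
            return False
        if self.spent_usd + est_cost_usd > self.budget_usd:
            return False
        # Sliding 60-second window for the per-minute rate limit.
        self.request_times = [t for t in self.request_times if now - t < 60]
        if len(self.request_times) >= self.rpm_limit:
            return False
        self.request_times.append(now)
        self.spent_usd += est_cost_usd
        return True

key = VirtualKey(budget_usd=1.0, rpm_limit=2,
                 allowed_models={"gpt-4o"}, expires_at=time.time() + 3600)
```

Because the application only ever sees the virtual key, rotating the underlying provider credential is invisible to callers.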

Guardrails

40+ pre-built guardrails that run on input and output:

  • PII detection and masking
  • Toxicity filtering
  • Prompt injection detection
  • Custom regex and semantic rules
  • Third-party guardrail provider integration

Guardrails execute at the gateway level, before and after LLM calls.
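
An input guardrail of the regex-masking kind can be sketched in a few lines. The patterns below are deliberately simplified examples, not Portkey's built-in detectors:

```python
# Illustrative input guardrail: regex-based PII masking run before the
# prompt reaches a provider. Patterns are simplified for the example.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace each PII match with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

masked = mask_pii("Contact jane@example.com, SSN 123-45-6789.")
# masked == "Contact [EMAIL], SSN [SSN]."
```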

Smart Caching

Simple caching (exact match) and semantic caching (similarity-based). Reduces cost and latency for repetitive queries.
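
The two cache modes can be contrasted in a small sketch. Real semantic caches compare embedding vectors; token-overlap (Jaccard) similarity stands in here to keep the example dependency-free, and the threshold value is arbitrary:

```python
# Sketch of simple (exact-match) vs. semantic (similarity-based) caching.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, a stand-in for embedding distance."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

class SmartCache:
    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        self.entries: dict[str, str] = {}

    def get(self, prompt: str):
        if prompt in self.entries:                   # simple: exact match
            return self.entries[prompt]
        for cached, answer in self.entries.items():  # semantic: similarity
            if jaccard(prompt, cached) >= self.threshold:
                return answer
        return None

    def put(self, prompt: str, answer: str):
        self.entries[prompt] = answer

cache = SmartCache()
cache.put("what is the capital of france", "Paris")
```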

Automatic Fallbacks & Retries

Configure fallback chains: if Provider A fails or is slow, automatically route to Provider B. Configurable retry logic with exponential backoff.
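
The fallback-plus-retry behavior described above amounts to a nested loop: retry each provider with exponential backoff, then move down the chain. A minimal sketch, with made-up provider names and failure modes:

```python
# Sketch of a fallback chain with exponential-backoff retries.
import time

def call_with_fallback(providers, prompt, max_retries=3, base_delay=0.01):
    """providers: list of (name, callable). Returns (provider_name, response)."""
    last_error = None
    for name, call in providers:
        for attempt in range(max_retries):
            try:
                return name, call(prompt)
            except Exception as e:
                last_error = e
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"all providers failed: {last_error}")

def flaky_primary(prompt):
    raise TimeoutError("provider A timed out")

def stable_backup(prompt):
    return f"answer to: {prompt}"

name, resp = call_with_fallback(
    [("provider-a", flaky_primary), ("provider-b", stable_backup)], "hi"
)
# name == "provider-b": provider A exhausted its retries, so B served it.
```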

Observability

Built-in logging, tracing, and analytics dashboard. Track:

  • Cost per request, per user, per model
  • Latency percentiles
  • Token usage patterns
  • Error rates by provider
  • Guardrail trigger rates
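
The metrics above all reduce to aggregations over per-request log records. A sketch with invented log data, showing cost per model, a crude latency median, and error rate:

```python
# Illustrative per-request aggregation of the kind an observability
# layer performs. The log records are made up for the example.
from collections import defaultdict

logs = [
    {"model": "gpt-4o", "cost": 0.02, "latency_ms": 420, "error": False},
    {"model": "gpt-4o", "cost": 0.03, "latency_ms": 610, "error": False},
    {"model": "claude", "cost": 0.01, "latency_ms": 380, "error": True},
]

# Cost per model.
cost_per_model = defaultdict(float)
for rec in logs:
    cost_per_model[rec["model"]] += rec["cost"]

# Crude p50 latency (middle element of the sorted odd-length list).
latencies = sorted(r["latency_ms"] for r in logs)
p50 = latencies[len(latencies) // 2]

# Error rate across all providers.
error_rate = sum(r["error"] for r in logs) / len(logs)
```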

Prompt Management

Version-controlled prompt templates with A/B testing. Change prompts without redeploying code.


Architecture

Client App (OpenAI-compatible SDK)
    |
    v
+-----------------------------------+
| Portkey Gateway                   |
|  (runs in your infra or cloud)    |
|                                   |
|  Virtual Key Resolution           |
|       |                           |
|  Input Guardrails                 |
|       |                           |
|  Cache Check (simple/semantic)    |
|       |                           |
|  Router (provider selection,      |
|    fallback, load balancing)      |
|       |                           |
|  Provider API Call                |
|       |                           |
|  Output Guardrails                |
|       |                           |
|  Logging & Analytics              |
+-----------------------------------+
    |
    v
LLM Providers (OpenAI, Anthropic, Google, Bedrock, Azure, etc.)

In hybrid/self-hosted mode, the gateway data plane runs in your infrastructure as containerized workloads. Configuration objects (virtual keys, configs, guardrails) are cached locally – no real-time round-trip to the control plane for request processing.


Self-Hosting

Portkey offers multiple deployment models:

Mode          | Description                                              | Data Residency
Cloud         | Fully managed SaaS                                       | Portkey's infrastructure
Hybrid        | Data plane in your infra, control plane in Portkey cloud | LLM traffic stays in your network
Private Cloud | Full deployment in your VPC                              | Complete data isolation
On-Premise    | Air-gapped deployment                                    | Full control

The gateway is containerized and deployable via Kubernetes, ECS, or Docker. Enterprise self-hosted deployments include SOC 2 Type 2, GDPR, and HIPAA compliance, plus custom BAAs.


Pricing / Cost Model

Tier       | Cost           | Key Limits
Developer  | Free           | 10K logs/month
Production | $49/mo         | 1M logs/month, team features
Scale      | Based on usage | Higher limits, priority support
Enterprise | Custom         | Self-hosted, compliance, SLAs, dedicated support

Portkey bills based on recorded logs, not raw API requests. This is a distinctive model – you pay for observability, not for proxying. The gateway proxying itself has no per-request fee.

For enterprise self-hosted, expect custom pricing in the $30K-100K+/year range depending on scale and deployment model.


When to Use

Strong fit:

  • Teams that want gateway + observability + guardrails in one platform
  • Organizations that need granular access control (virtual keys per team/project)
  • Production workloads with compliance requirements (SOC2, HIPAA, GDPR)
  • Multi-provider strategies with automatic fallback needs

Weak fit:

  • Cost-sensitive teams that only need basic routing – LiteLLM is free
  • Teams that want full open-source with no SaaS dependency
  • Organizations that already have observability (Langfuse, Datadog) and just need a proxy

This post is licensed under CC BY 4.0 by the author.