Apache APISIX AI Gateway
APISIX is the plugin-driven alternative -- if you already run APISIX for API management, its AI plugins add LLM proxying, smart routing, and token rate limiting without adopting a new gateway.
What It Is
Apache APISIX is a cloud-native, open-source API gateway (Apache 2.0) that has added AI-specific capabilities through its plugin ecosystem. The AI features include LLM proxying, multi-model load balancing, token-based rate limiting, context-aware routing, and health checks for AI backends.
APISIX positions itself similarly to Kong: a general-purpose API gateway with AI extensions, not a purpose-built AI/agent gateway. The difference is that APISIX is fully open-source with no enterprise-gated features.
Key Features
LLM Proxy (ai-proxy plugin)
Unified proxy to multiple LLM providers: OpenAI, Anthropic (Claude), DeepSeek, Mistral, Gemini, and more. Translates between provider API formats.
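A minimal route using ai-proxy might look like the sketch below, in APISIX's standalone `apisix.yaml` format. Field names (`provider`, `auth`, `options.model`) follow recent ai-proxy plugin docs but have changed between APISIX releases, so verify against your installed version's schema; the API key placeholder is illustrative.

```yaml
# apisix.yaml (standalone mode) -- a sketch, not a verified config.
routes:
  - uri: /v1/chat/completions
    plugins:
      ai-proxy:
        provider: openai              # or deepseek, openai-compatible, ...
        auth:
          header:
            Authorization: "Bearer sk-..."   # placeholder API key
        options:
          model: gpt-4o
#END
```

Clients then call the gateway with the OpenAI-style chat format, and the plugin handles provider authentication and request translation.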
Multi-LLM Load Balancing
Dynamically adjust routing weights across LLM providers based on:
- Latency
- Cost
- Availability / stability
- Health check status
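Weighted distribution across providers is handled by the ai-proxy-multi plugin. A hedged sketch, assuming the `instances` / `weight` schema from recent plugin docs (verify field names against your APISIX version):

```yaml
# Sketch: 80/20 weighted split across two providers via ai-proxy-multi.
routes:
  - uri: /v1/chat/completions
    plugins:
      ai-proxy-multi:
        instances:
          - name: openai-primary
            provider: openai
            weight: 80
            auth:
              header:
                Authorization: "Bearer sk-..."      # placeholder
            options:
              model: gpt-4o
          - name: deepseek-secondary
            provider: deepseek
            weight: 20
            auth:
              header:
                Authorization: "Bearer sk-..."      # placeholder
            options:
              model: deepseek-chat
#END
```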
Context-Aware Routing
Route requests to different models based on prompt content, task type, or custom headers. Route simple queries to cheaper models, complex ones to capable models.
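One way to express this with stock APISIX features is route-level `vars` matching on a custom header, so each route carries a different ai-proxy model. The header name `X-Task-Type` here is an illustrative convention, not an APISIX built-in:

```yaml
# Sketch: header-based model selection using ordinary route matching.
routes:
  - uri: /v1/chat/completions
    priority: 10
    vars: [["http_x_task_type", "==", "complex"]]   # X-Task-Type: complex
    plugins:
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: "Bearer sk-..."          # placeholder
        options:
          model: gpt-4o            # capable model for complex tasks
  - uri: /v1/chat/completions
    plugins:
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: "Bearer sk-..."          # placeholder
        options:
          model: gpt-4o-mini       # cheaper default model
#END
```

The higher-priority route wins when the header matches; everything else falls through to the cheap default.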
Token-Based Rate Limiting
Rate limit by token consumption, not just request count. Critical for LLM workloads.
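The ai-rate-limiting plugin composes with ai-proxy and counts consumed tokens rather than requests. A sketch, assuming the `limit` / `time_window` / `limit_strategy` fields from recent plugin docs (check your version's schema):

```yaml
# Sketch: cap token consumption per time window, not request count.
routes:
  - uri: /v1/chat/completions
    plugins:
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: "Bearer sk-..."   # placeholder
        options:
          model: gpt-4o
      ai-rate-limiting:
        limit: 30000                # tokens allowed per window
        time_window: 60             # window in seconds
        limit_strategy: total_tokens   # prompt + completion tokens
        rejected_code: 429
#END
```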
Health Checks for AI Backends
As of v3.14.0: monitor AI backend health and dynamically route to healthy endpoints. Detect and avoid degraded or rate-limited providers.
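APISIX's long-standing active health checks on upstreams illustrate the mechanism; the AI-backend-specific checks added in v3.14.0 build on this, and their exact configuration may differ, so treat this as a generic sketch:

```yaml
# Sketch: active health checking on a self-hosted model upstream.
upstreams:
  - id: 1
    type: roundrobin
    nodes:
      "llm-a.internal:8000": 1
      "llm-b.internal:8000": 1
    checks:
      active:
        http_path: /health          # assumed health endpoint
        healthy:
          interval: 2
          successes: 1
        unhealthy:
          interval: 1
          http_failures: 2          # eject after 2 failed probes
#END
```

Unhealthy nodes are taken out of rotation until probes succeed again, which is how degraded or rate-limited providers get avoided.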
AI-Specific Observability
New logging variables:
- Request type classification (traditional HTTP, AI chat, AI stream)
- llm_time_to_first_token for latency tracking
- Token usage per request
Plugin Ecosystem
200+ plugins for auth, security, traffic management, observability. AI features compose with existing plugins – add JWT auth + token rate limiting + AI proxy in one configuration.
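Composition is just a matter of listing plugins on the same route. A hedged sketch combining jwt-auth (a standard APISIX plugin) with the AI plugins above; AI plugin field names may vary by version:

```yaml
# Sketch: auth + token rate limiting + LLM proxy on one route.
routes:
  - uri: /v1/chat/completions
    plugins:
      jwt-auth: {}                   # standard APISIX authentication
      ai-rate-limiting:
        limit: 30000
        time_window: 60
        limit_strategy: total_tokens
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: "Bearer sk-..."   # placeholder
        options:
          model: gpt-4o
#END
```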
Architecture
Client / Agent
|
v
+----------------------------------+
| Apache APISIX |
| (Nginx + Lua, cloud-native) |
| |
| [Auth Plugin] |
| | |
| [ai-proxy Plugin] |
| Provider normalization |
| | |
| [ai-rate-limit Plugin] |
| Token-based throttling |
| | |
| [Context-Aware Router] |
| Route by task type / content |
| | |
| [Health Check] |
| Monitor backend availability |
| | |
| [Logging / Metrics] |
+----------------------------------+
|
v
LLM Providers / Self-hosted Models
Self-Hosting & Pricing
Fully open-source (Apache 2.0). All AI features are available in the open-source version – no enterprise-gated plugins (unlike Kong).
| Component | License | Cost |
|---|---|---|
| APISIX core | Apache 2.0 | Free |
| All AI plugins | Apache 2.0 | Free |
| APISIX Dashboard | Apache 2.0 | Free |
API7.ai offers commercial support and a managed platform, but the self-hosted version is feature-complete.
Limitations
- No A2A protocol support – does not understand agent-to-agent communication natively
- No MCP gateway – no MCP server proxying or tool governance
- No semantic caching – exact match only
- No virtual keys / budget management – more infrastructure-level than application-level
- No guardrails – no PII detection, content filtering at the gateway level
- Smaller AI community than Kong or LiteLLM – fewer AI-specific examples and integrations
- Primary strength is as an LLM proxy, not an agent gateway – included here because it’s mentioned alongside Kong and Envoy as API gateways adding AI features
When to Use
Strong fit:
- Already running APISIX for API management and want to extend for LLM traffic
- Want fully open-source AI gateway with zero enterprise licensing (Kong alternative)
- Kubernetes-native deployments with existing APISIX infrastructure
- Need plugin-driven extensibility for custom AI traffic logic
Weak fit:
- Need agent-to-agent (A2A) routing – use Kong or agentgateway
- Need MCP gateway – use Kong, Cloudflare, or Portkey
- Need guardrails, virtual keys, budget management – use Portkey or Kong
- Greenfield AI platform – LiteLLM or Portkey is more purpose-built