
Apache APISIX AI Gateway

APISIX is the plugin-driven alternative – if you already run APISIX for API management, its AI plugins add LLM proxying, smart routing, and token rate limiting without adopting a new gateway.


What It Is

Apache APISIX is a cloud-native, open-source API gateway (Apache 2.0) that has added AI-specific capabilities through its plugin ecosystem. The AI features include LLM proxying, multi-model load balancing, token-based rate limiting, context-aware routing, and health checks for AI backends.

APISIX positions itself similarly to Kong: a general-purpose API gateway with AI extensions, not a purpose-built AI/agent gateway. The difference is that APISIX is fully open-source with no enterprise-gated features.


Key Features

LLM Proxy (ai-proxy plugin)

Unified proxy to multiple LLM providers: OpenAI, Anthropic (Claude), DeepSeek, Mistral, Gemini, and more. Translates between provider API formats.
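A minimal route using the ai-proxy plugin might look like the following, in APISIX's declarative apisix.yaml format. This is a sketch, not a verified config: the exact schema varies between APISIX releases, and the model name and API key are placeholders, so check the ai-proxy plugin reference for your version.

```yaml
# apisix.yaml (standalone mode) -- illustrative sketch
routes:
  - uri: /v1/chat/completions
    plugins:
      ai-proxy:
        provider: openai        # requests are translated to this provider's format
        auth:
          header:
            Authorization: "Bearer <your-openai-key>"   # placeholder
        options:
          model: gpt-4o-mini    # placeholder model name
#END
```

Clients then call APISIX with an OpenAI-style chat request, and the gateway handles provider-specific translation and credential injection.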

Multi-LLM Load Balancing

Dynamically adjust routing weights across LLM providers based on:

  • Latency
  • Cost
  • Availability / stability
  • Health check status
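With the companion ai-proxy-multi plugin, weighted traffic splitting across providers sketches out roughly like this (hedged: instance fields follow the plugin docs as best recalled; names, keys, and weights are illustrative):

```yaml
plugins:
  ai-proxy-multi:
    instances:
      - name: openai-primary        # illustrative instance name
        provider: openai
        weight: 80                  # ~80% of traffic
        auth:
          header:
            Authorization: "Bearer <openai-key>"
        options:
          model: gpt-4o
      - name: deepseek-secondary    # illustrative
        provider: deepseek
        weight: 20                  # ~20% of traffic
        auth:
          header:
            Authorization: "Bearer <deepseek-key>"
        options:
          model: deepseek-chat
```

Weights can then be adjusted – manually or via the Admin API – as latency, cost, or availability changes.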

Context-Aware Routing

Route requests to different models based on prompt content, task type, or custom headers. Route simple queries to cheaper models and complex ones to more capable models.
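One simple way to express header-based routing with stock APISIX primitives is two routes on the same path, matched via `vars` (the `X-Task-Type` header here is an invented convention, not an APISIX built-in; auth blocks are omitted for brevity):

```yaml
routes:
  - uri: /v1/chat/completions
    vars: [["http_x_task_type", "==", "simple"]]   # header-based match
    plugins:
      ai-proxy:
        provider: openai
        options:
          model: gpt-4o-mini    # cheaper model for simple tasks
  - uri: /v1/chat/completions   # fallback route, no match condition
    plugins:
      ai-proxy:
        provider: openai
        options:
          model: gpt-4o         # capable model for everything else
```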

Token-Based Rate Limiting

Rate limit by token consumption, not just request count – critical for LLM workloads, where a single request can consume anywhere from a handful to thousands of tokens.
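As a sketch, the ai-rate-limiting plugin (paired with ai-proxy, which reports token usage) might be configured like this; field names follow the plugin docs as best recalled, and the limits are illustrative:

```yaml
plugins:
  ai-rate-limiting:
    limit: 30000                    # tokens allowed per window
    time_window: 60                 # window length in seconds
    limit_strategy: total_tokens    # or prompt_tokens / completion_tokens
    rejected_code: 429              # returned once the budget is exhausted
```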

Health Checks for AI Backends

As of v3.14.0: monitor AI backend health and dynamically route to healthy endpoints. Detect and avoid degraded or rate-limited providers.

AI-Specific Observability

New logging variables:

  • Request type classification (traditional HTTP, AI chat, AI stream)
  • llm_time_to_first_token for latency tracking
  • Token usage per request
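These variables can be surfaced through the standard logger plugins via a custom log format, along these lines (a sketch; variable names other than `llm_time_to_first_token` are assumptions to verify against the release notes for your version):

```yaml
plugin_metadata:
  - id: http-logger
    log_format:
      request_type: "$request_type"               # traditional HTTP / AI chat / AI stream
      ttft_ms: "$llm_time_to_first_token"
      prompt_tokens: "$llm_prompt_tokens"         # assumed variable name
      completion_tokens: "$llm_completion_tokens" # assumed variable name
```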

Plugin Ecosystem

200+ plugins for auth, security, traffic management, observability. AI features compose with existing plugins – add JWT auth + token rate limiting + AI proxy in one configuration.
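Composition is just plugin stacking on a single route. A hedged sketch of JWT auth + token rate limiting + AI proxy together (jwt-auth additionally requires a consumer object carrying the key, omitted here; limits and model are illustrative):

```yaml
routes:
  - uri: /llm
    plugins:
      jwt-auth: {}                    # existing auth plugin, unchanged
      ai-rate-limiting:
        limit: 10000                  # illustrative token budget
        time_window: 60
        limit_strategy: total_tokens
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: "Bearer <key>"   # placeholder
        options:
          model: gpt-4o-mini
```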


Architecture

Client / Agent
    |
    v
+----------------------------------+
| Apache APISIX                    |
|  (Nginx + Lua, cloud-native)     |
|                                  |
|  [Auth Plugin]                   |
|       |                          |
|  [ai-proxy Plugin]               |
|    Provider normalization        |
|       |                          |
|  [ai-rate-limiting Plugin]       |
|    Token-based throttling        |
|       |                          |
|  [Context-Aware Router]          |
|    Route by task type / content  |
|       |                          |
|  [Health Check]                  |
|    Monitor backend availability  |
|       |                          |
|  [Logging / Metrics]             |
+----------------------------------+
    |
    v
LLM Providers / Self-hosted Models

Self-Hosting & Pricing

Fully open-source (Apache 2.0). All AI features are available in the open-source version – no enterprise-gated plugins (unlike Kong).

Component          License      Cost
-----------------  -----------  ----
APISIX core        Apache 2.0   Free
All AI plugins     Apache 2.0   Free
APISIX Dashboard   Apache 2.0   Free

API7.ai offers commercial support and a managed platform, but the self-hosted version is feature-complete.


Limitations

  • No A2A protocol support – does not understand agent-to-agent communication natively
  • No MCP gateway – no MCP server proxying or tool governance
  • No semantic caching – exact match only
  • No virtual keys / budget management – more infrastructure-level than application-level
  • No guardrails – no PII detection, content filtering at the gateway level
  • Smaller AI community than Kong or LiteLLM – fewer AI-specific examples and integrations
  • Primary strength is as an LLM proxy, not an agent gateway – included here because it’s mentioned alongside Kong and Envoy as API gateways adding AI features

When to Use

Strong fit:

  • Already running APISIX for API management and want to extend for LLM traffic
  • Want fully open-source AI gateway with zero enterprise licensing (Kong alternative)
  • Kubernetes-native deployments with existing APISIX infrastructure
  • Need plugin-driven extensibility for custom AI traffic logic

Weak fit:

  • Need agent-to-agent (A2A) routing – use Kong or agentgateway
  • Need MCP gateway – use Kong, Cloudflare, or Portkey
  • Need guardrails, virtual keys, budget management – use Portkey or Kong
  • Greenfield AI platform – LiteLLM or Portkey is more purpose-built


This post is licensed under CC BY 4.0 by the author.