
Apache APISIX AI Gateway

APISIX is the plugin-driven alternative – if you already run APISIX for API management, its AI plugins add LLM proxying, smart routing, and token rate limiting without adopting a new gateway.


What It Is

Apache APISIX is a cloud-native, open-source API gateway (Apache 2.0) that has added AI-specific capabilities through its plugin ecosystem. The AI features include LLM proxying, multi-model load balancing, token-based rate limiting, context-aware routing, and health checks for AI backends.

APISIX positions itself similarly to Kong: a general-purpose API gateway with AI extensions, not a purpose-built AI/agent gateway. The difference is that APISIX is fully open-source with no enterprise-gated features.


Key Features

LLM Proxy (ai-proxy plugin)

Unified proxy to multiple LLM providers: OpenAI, Anthropic (Claude), DeepSeek, Mistral, Gemini, and more. Translates between provider API formats.
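A minimal route using the ai-proxy plugin might look like the following, in APISIX's declarative apisix.yaml format. This is a sketch, not a verified config: the exact schema varies between APISIX releases, and the model name and API key are placeholders, so check the ai-proxy plugin reference for your version.

```yaml
# apisix.yaml (standalone mode) -- illustrative sketch
routes:
  - uri: /v1/chat/completions
    plugins:
      ai-proxy:
        provider: openai        # requests are translated to this provider's format
        auth:
          header:
            Authorization: "Bearer <your-openai-key>"   # placeholder
        options:
          model: gpt-4o-mini    # placeholder model name
#END
```

Clients then call APISIX with an OpenAI-style chat request, and the gateway handles provider-specific translation and credential injection.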

Multi-LLM Load Balancing

Dynamically adjust routing weights across LLM providers based on:

  • Latency
  • Cost
  • Availability / stability
  • Health check status
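With the companion ai-proxy-multi plugin, weighted traffic splitting across providers sketches out roughly like this (hedged: instance fields follow the plugin docs as best recalled; names, keys, and weights are illustrative):

```yaml
plugins:
  ai-proxy-multi:
    instances:
      - name: openai-primary        # illustrative instance name
        provider: openai
        weight: 80                  # ~80% of traffic
        auth:
          header:
            Authorization: "Bearer <openai-key>"
        options:
          model: gpt-4o
      - name: deepseek-secondary    # illustrative
        provider: deepseek
        weight: 20                  # ~20% of traffic
        auth:
          header:
            Authorization: "Bearer <deepseek-key>"
        options:
          model: deepseek-chat
```

Weights can then be adjusted – manually or via the Admin API – as latency, cost, or availability changes.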

Context-Aware Routing

Route requests to different models based on prompt content, task type, or custom headers. Route simple queries to cheaper models and complex ones to more capable models.
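One simple way to express header-based routing with stock APISIX primitives is two routes on the same path, matched via `vars` (the `X-Task-Type` header here is an invented convention, not an APISIX built-in; auth blocks are omitted for brevity):

```yaml
routes:
  - uri: /v1/chat/completions
    vars: [["http_x_task_type", "==", "simple"]]   # header-based match
    plugins:
      ai-proxy:
        provider: openai
        options:
          model: gpt-4o-mini    # cheaper model for simple tasks
  - uri: /v1/chat/completions   # fallback route, no match condition
    plugins:
      ai-proxy:
        provider: openai
        options:
          model: gpt-4o         # capable model for everything else
```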

Token-Based Rate Limiting

Rate limit by token consumption, not just request count – critical for LLM workloads, where a single request can consume anywhere from a handful to thousands of tokens.
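As a sketch, the ai-rate-limiting plugin (paired with ai-proxy, which reports token usage) might be configured like this; field names follow the plugin docs as best recalled, and the limits are illustrative:

```yaml
plugins:
  ai-rate-limiting:
    limit: 30000                    # tokens allowed per window
    time_window: 60                 # window length in seconds
    limit_strategy: total_tokens    # or prompt_tokens / completion_tokens
    rejected_code: 429              # returned once the budget is exhausted
```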

Health Checks for AI Backends

As of v3.14.0: monitor AI backend health and dynamically route to healthy endpoints. Detect and avoid degraded or rate-limited providers.

AI-Specific Observability

New logging variables:

  • Request type classification (traditional HTTP, AI chat, AI stream)
  • llm_time_to_first_token for latency tracking
  • Token usage per request
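These variables can be surfaced through the standard logger plugins via a custom log format, along these lines (a sketch; variable names other than `llm_time_to_first_token` are assumptions to verify against the release notes for your version):

```yaml
plugin_metadata:
  - id: http-logger
    log_format:
      request_type: "$request_type"               # traditional HTTP / AI chat / AI stream
      ttft_ms: "$llm_time_to_first_token"
      prompt_tokens: "$llm_prompt_tokens"         # assumed variable name
      completion_tokens: "$llm_completion_tokens" # assumed variable name
```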

Plugin Ecosystem

200+ plugins for auth, security, traffic management, observability. AI features compose with existing plugins – add JWT auth + token rate limiting + AI proxy in one configuration.
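Composition is just plugin stacking on a single route. A hedged sketch of JWT auth + token rate limiting + AI proxy together (jwt-auth additionally requires a consumer object carrying the key, omitted here; limits and model are illustrative):

```yaml
routes:
  - uri: /llm
    plugins:
      jwt-auth: {}                    # existing auth plugin, unchanged
      ai-rate-limiting:
        limit: 10000                  # illustrative token budget
        time_window: 60
        limit_strategy: total_tokens
      ai-proxy:
        provider: openai
        auth:
          header:
            Authorization: "Bearer <key>"   # placeholder
        options:
          model: gpt-4o-mini
```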


Architecture

Client / Agent
    |
    v
+----------------------------------+
| Apache APISIX                    |
|  (Nginx + Lua, cloud-native)     |
|                                  |
|  [Auth Plugin]                   |
|       |                          |
|  [ai-proxy Plugin]               |
|    Provider normalization        |
|       |                          |
|  [ai-rate-limiting Plugin]       |
|    Token-based throttling        |
|       |                          |
|  [Context-Aware Router]          |
|    Route by task type / content  |
|       |                          |
|  [Health Check]                  |
|    Monitor backend availability  |
|       |                          |
|  [Logging / Metrics]             |
+----------------------------------+
    |
    v
LLM Providers / Self-hosted Models

Self-Hosting & Pricing

Fully open-source (Apache 2.0). All AI features are available in the open-source version – no enterprise-gated plugins (unlike Kong).

Component          License      Cost
-----------------  -----------  ----
APISIX core        Apache 2.0   Free
All AI plugins     Apache 2.0   Free
APISIX Dashboard   Apache 2.0   Free

API7.ai offers commercial support and a managed platform, but the self-hosted version is feature-complete.


Limitations

  • No A2A protocol support – does not understand agent-to-agent communication natively
  • No MCP gateway – no MCP server proxying or tool governance
  • No semantic caching – exact match only
  • No virtual keys / budget management – more infrastructure-level than application-level
  • No guardrails – no PII detection, content filtering at the gateway level
  • Smaller AI community than Kong or LiteLLM – fewer AI-specific examples and integrations
  • Primary strength is as an LLM proxy, not an agent gateway – included here because it’s mentioned alongside Kong and Envoy as API gateways adding AI features

When to Use

Strong fit:

  • Already running APISIX for API management and want to extend for LLM traffic
  • Want fully open-source AI gateway with zero enterprise licensing (Kong alternative)
  • Kubernetes-native deployments with existing APISIX infrastructure
  • Need plugin-driven extensibility for custom AI traffic logic

Weak fit:

  • Need agent-to-agent (A2A) routing – use Kong or agentgateway
  • Need MCP gateway – use Kong, Cloudflare, or Portkey
  • Need guardrails, virtual keys, budget management – use Portkey or Kong
  • Greenfield AI platform – LiteLLM or Portkey is more purpose-built


This post is licensed under CC BY 4.0 by the author.