KAgent (Kubernetes Agent)

A Kubernetes-native platform for deploying, managing, and operating AI agents as first-class Kubernetes resources, built on Microsoft AutoGen and contributed to the CNCF ecosystem – bridging the gap between agent frameworks and production infrastructure.


What Kagent Is

Kagent (stylized as “kagent”) emerged in early 2025 as one of the first serious attempts to make AI agents Kubernetes-native. Rather than treating agents as arbitrary containers that happen to run on K8s, Kagent introduces Custom Resource Definitions (CRDs) that let you define agents, their tools, their models, and their configurations declaratively in YAML – the same way you define Deployments, Services, and Ingresses.

The project builds on Microsoft’s AutoGen framework for multi-agent orchestration but wraps it in Kubernetes primitives so that platform teams can manage agent lifecycles with the same GitOps tooling (ArgoCD, Flux) they already use for the rest of their infrastructure.
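As a rough illustration of that GitOps workflow, an Argo CD Application could sync a directory of agent manifests into the cluster. This is a sketch under stated assumptions, not something taken from the Kagent documentation: the repository URL, path, and Application name are hypothetical placeholders, and it assumes Argo CD runs in the argocd namespace with the Kagent CRDs already installed.

```yaml
# Hypothetical Argo CD Application that syncs agent manifests from Git.
# repoURL, path, and the Application name are placeholders, not Kagent defaults.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ai-agents
  namespace: argocd              # assumes Argo CD is installed in "argocd"
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/agents.git   # placeholder repo
    targetRevision: main
    path: agents                 # directory holding Agent, Tool, ModelConfig YAML
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: ai-agents
  syncPolicy:
    automated:
      prune: true                # remove agents that were deleted from Git
      selfHeal: true             # revert manual drift back to the Git state
    syncOptions:
    - CreateNamespace=true
```

With automated sync enabled, merging a change to an Agent manifest becomes the deployment step, and reverting the commit rolls the agent back.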


Architecture

+---------------------------------------------------------------------+
|  Kubernetes Cluster                                                 |
|                                                                     |
|  +-------------------+      +------------------------------------+  |
|  | Kagent Controller |      | Agent Pod                          |  |
|  | (watches CRDs)    |----->| +--------------------------------+ |  |
|  +-------------------+      | | AutoGen Runtime                | |  |
|                             | | +--------+ +--------+ +------+ | |  |
|  +-------------------+      | | | Agent1 | | Agent2 | | Tool | | |  |
|  | CRDs:             |      | | +--------+ +--------+ +------+ | |  |
|  | - Agent           |      | +--------------------------------+ |  |
|  | - AgentTeam       |      +------------------------------------+  |
|  | - Tool            |                                              |
|  | - ModelConfig     |      +------------------------------------+  |
|  +-------------------+      | LLM Backend (vLLM, Ollama,         |  |
|                             | or cloud API)                      |  |
|                             +------------------------------------+  |
+---------------------------------------------------------------------+

Core CRDs

Kagent introduces several Custom Resource Definitions:

Agent – defines a single AI agent with its system prompt, model configuration, and tool bindings:

apiVersion: kagent.dev/v1alpha1
kind: Agent
metadata:
  name: k8s-troubleshooter
  namespace: ai-agents
spec:
  description: "Diagnoses Kubernetes issues using cluster context"
  systemMessage: |
    You are a Kubernetes troubleshooting expert. When given a problem,
    investigate using the available tools and provide a diagnosis with
    recommended remediation steps.
  modelConfigRef:
    name: gpt4-config
  tools:
  - name: kubectl-get
  - name: kubectl-describe
  - name: kubectl-logs

ModelConfig – decouples model selection from agent definition:

apiVersion: kagent.dev/v1alpha1
kind: ModelConfig
metadata:
  name: gpt4-config
spec:
  model: gpt-4o
  apiKeySecretRef:
    name: openai-api-key
    key: api-key
  # Or point to a self-hosted vLLM endpoint:
  # baseUrl: http://vllm-service:8000/v1
  # model: meta-llama/Llama-3-70B-Instruct
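The apiKeySecretRef above points at an ordinary Kubernetes Secret. A minimal sketch of that Secret follows, assuming it has to live in the same namespace as the ModelConfig (an assumption worth checking against the Kagent documentation). The key value is a placeholder; in a GitOps repository a plain Secret like this would normally be replaced by an encrypted or externally sourced one.

```yaml
# Secret matching apiKeySecretRef (name: openai-api-key, key: api-key).
# The namespace is an assumption; the value is a placeholder, not a real key.
apiVersion: v1
kind: Secret
metadata:
  name: openai-api-key
  namespace: ai-agents
type: Opaque
stringData:
  api-key: "<your-api-key>"      # stringData accepts plain text; stored base64-encoded
```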

Tool – defines tools agents can use (Kubernetes operations, HTTP calls, custom functions):

apiVersion: kagent.dev/v1alpha1
kind: Tool
metadata:
  name: kubectl-get
spec:
  type: kubernetes
  description: "Get Kubernetes resources"
  parameters:
    resource:
      type: string
      description: "Resource type (pods, services, deployments)"
    namespace:
      type: string
      description: "Target namespace"

AgentTeam – defines multi-agent collaboration patterns:

apiVersion: kagent.dev/v1alpha1
kind: AgentTeam
metadata:
  name: incident-response
spec:
  teamType: RoundRobin  # or Selector, Swarm
  agents:
  - name: diagnostics-agent
  - name: remediation-agent
  - name: communication-agent
  terminationCondition:
    type: TextMention
    text: "RESOLVED"

Key Capabilities

Declarative Agent Lifecycle

Agents are managed like any other Kubernetes resource: create, update, and delete them with kubectl, version them in Git, and deploy them with ArgoCD. This is the fundamental value proposition: it brings agents into the existing K8s operational model.

kubectl apply -f agents/k8s-troubleshooter.yaml
kubectl get agents -n ai-agents
kubectl describe agent k8s-troubleshooter -n ai-agents
kubectl delete agent k8s-troubleshooter -n ai-agents

Built-in Tool Ecosystem

Kagent ships with tool integrations relevant to platform engineering:

  • Kubernetes tools – kubectl operations, Helm chart management
  • Prometheus tools – metric queries, alert inspection
  • GitHub tools – PR creation, issue management
  • HTTP tools – generic REST API calls
  • Custom tools – define your own via the Tool CRD
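Kubernetes tools are only as safe as the permissions behind them. The sketch below shows read-only RBAC that would be enough for tools such as kubectl-get, kubectl-describe, and kubectl-logs. How Kagent assigns a ServiceAccount to an agent pod is not covered here, so the ServiceAccount name and the ClusterRole name are hypothetical.

```yaml
# Read-only permissions for get/describe/logs style tools.
# "agent-readonly" and the ServiceAccount name are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: agent-readonly
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log", "services", "events"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: agent-readonly
subjects:
- kind: ServiceAccount
  name: k8s-troubleshooter       # hypothetical ServiceAccount used by the agent pod
  namespace: ai-agents
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: agent-readonly
```

Keeping diagnostic agents on get, list, and watch means a bad model output cannot mutate the cluster. Remediation agents would need a separate, narrower write role.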

Multi-Agent Patterns

Through AutoGen underneath, Kagent supports several agent collaboration patterns:

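| Pattern    | Description                                | Use Case                |
|------------|--------------------------------------------|-------------------------|
| RoundRobin | Agents take turns in fixed order           | Step-by-step workflows  |
| Selector   | An LLM-based selector picks the next agent | Dynamic task routing    |
| Swarm      | Agents hand off based on tool calls        | Complex problem-solving |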

Web UI

Kagent includes a web interface for interacting with deployed agents, viewing conversation history, and monitoring agent activity. It is useful for demos and debugging, though production usage would typically go through the API.


Relationship to the CNCF Ecosystem

Kagent sits at the intersection of several CNCF trends:

  • Knative – serverless scaling for agent pods (scale to zero when idle, scale up on demand)
  • Gateway API – routing agent traffic through K8s-native ingress
  • Prometheus/OpenTelemetry – observability for agent execution
  • cert-manager – TLS for agent-to-agent and agent-to-LLM communication
  • ArgoCD/Flux – GitOps deployment of agent definitions

The broader CNCF AI landscape as of 2025-2026 includes several complementary projects:

| Project       | Focus                    | Relationship to Kagent              |
|---------------|--------------------------|-------------------------------------|
| KServe        | Model serving on K8s     | Kagent agents call KServe endpoints |
| llm-d         | LLM-optimized serving    | Backend inference engine            |
| Dapr          | Distributed app runtime  | Alternative infra layer             |
| OpenTelemetry | Observability            | Traces agent execution              |
| KEDA          | Event-driven autoscaling | Scale agents on demand              |
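To make the KEDA row concrete, here is a sketch of a ScaledObject that scales an agent workload on a Prometheus query. It assumes the agent runs as a Deployment named k8s-troubleshooter and exposes a metric called agent_pending_requests. Both names, and the Prometheus address, are hypothetical and not something Kagent is known to provide.

```yaml
# Hypothetical KEDA ScaledObject: scale an agent Deployment on queued requests.
# The Deployment name, metric name, and Prometheus address are assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: k8s-troubleshooter-scaler
  namespace: ai-agents
spec:
  scaleTargetRef:
    name: k8s-troubleshooter     # assumed Deployment backing the agent
  minReplicaCount: 0             # scale to zero when nothing is queued
  maxReplicaCount: 5
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090   # placeholder address
      query: sum(agent_pending_requests{agent="k8s-troubleshooter"})
      threshold: "5"             # target pending requests per replica
```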

Deployment

Prerequisites

  • Kubernetes 1.28+
  • Helm 3.x
  • An LLM endpoint (cloud API key or self-hosted vLLM/Ollama)

Installation

# Add the Kagent Helm repo
helm repo add kagent https://kagent-dev.github.io/kagent/
helm repo update

# Install Kagent controller
helm install kagent kagent/kagent \
  --namespace kagent-system \
  --create-namespace

# Verify
kubectl get pods -n kagent-system

Creating Your First Agent

# Create a namespace for agents
kubectl create namespace ai-agents

# Create model config with API key
kubectl create secret generic openai-api-key \
  -n ai-agents \
  --from-literal=api-key="$OPENAI_API_KEY"

# Apply agent definitions
kubectl apply -f model-config.yaml -n ai-agents
kubectl apply -f tools.yaml -n ai-agents
kubectl apply -f agent.yaml -n ai-agents
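The three apply commands can also be bundled so that the namespace is set in one place. A minimal sketch, assuming the three files sit in the same directory as this kustomization.yaml:

```yaml
# kustomization.yaml, placed next to model-config.yaml, tools.yaml, agent.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: ai-agents             # applied to every resource listed below
resources:
- model-config.yaml
- tools.yaml
- agent.yaml
```

Running kubectl apply -k . from that directory then applies all three manifests together, and the same directory can serve as the source path for a GitOps tool.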

When to Use Kagent

Use Kagent when:

  • You are a platform team that wants to offer “agents as a service” to internal users
  • You want GitOps-driven agent deployment with the same tooling as your other K8s workloads
  • You need multi-agent collaboration patterns (teams, handoffs, routing) out of the box
  • You want declarative tool management and model configuration as K8s resources
  • You are building Kubernetes-focused operational agents (SRE, incident response, infrastructure)

Skip Kagent when:

  • You are not on Kubernetes – Kagent is K8s-native by design
  • You need a lightweight agent setup and do not want K8s CRD overhead
  • You are deeply invested in a different agent framework (LangGraph, CrewAI) and want to keep it
  • You need production-hardened agent orchestration (Kagent is still early-stage, alpha CRDs)

Maturity and Risks

As of early 2026, Kagent is a young project. The CRD API is v1alpha1, meaning breaking changes are expected. It has community momentum and builds on Microsoft’s AutoGen framework, but it is not yet a graduated CNCF project. Evaluate it for internal/platform use cases where you control the upgrade cycle, not for customer-facing production systems that need API stability guarantees.


References

  • Kagent GitHub: https://github.com/kagent-dev/kagent
  • Kagent documentation: https://kagent.dev
  • Microsoft AutoGen: https://github.com/microsoft/autogen
  • CNCF AI landscape: https://landscape.cncf.io
  • KubeCon 2025 talks on AI-native Kubernetes
This post is licensed under CC BY 4.0 by the author.