Microsoft AutoGen

An open-source framework for building multi-agent systems through structured conversations between agents – featuring built-in code execution, group chat orchestration, human-in-the-loop participation, and a modular architecture that separates agent behavior from the models driving them.


What Is AutoGen?

AutoGen is Microsoft Research’s multi-agent framework. Its core insight: complex tasks are best solved through conversations between specialized agents, where each agent can be backed by an LLM, a human, a tool, or a code executor.

AutoGen v0.4 introduced a complete rewrite with a modular, event-driven architecture. (The earlier 0.2 line is continued separately by the community as the AG2 fork.) The framework now consists of:

  • autogen-core: Event-driven agent runtime, message passing
  • autogen-agentchat: High-level multi-agent conversation patterns
  • autogen-ext: Extensions (model clients, code executors, tools)
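
These packages install separately; for the AgentChat layer with OpenAI model support, the commonly documented install is:

```shell
pip install -U "autogen-agentchat" "autogen-ext[openai]"
```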

Key Philosophy:

  • Agents converse, not just execute. The conversation is the orchestration mechanism
  • Code execution is first-class. Agents can write and run code as part of their reasoning
  • Human participation is natural. Humans join conversations as agents, not through special hooks
  • Modular runtime. Agents can run in-process or distributed across machines

Status:

  • v0.4+ stable, actively developed by Microsoft Research
  • Open-source (MIT license)
  • Python-first, experimental .NET support
  • Strong research community, used extensively in Microsoft’s own AI products

Core Concepts

Agent Types (AgentChat Layer)

from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Model client (works with OpenAI, Azure, Anthropic via compatible API)
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key="..."
)

# AssistantAgent: LLM-powered agent with optional tools
assistant = AssistantAgent(
    name="research_assistant",
    model_client=model_client,
    system_message="""You are a senior research analyst. When asked to analyze 
    data, write Python code to perform the analysis. Be thorough and precise.""",
    tools=[search_web, read_document]  # assumes these tool functions are defined elsewhere
)

# UserProxyAgent: represents a human in the conversation
user = UserProxyAgent(
    name="user",
    input_func=input  # prompts for human input
)

Two-Agent Conversation

The simplest pattern: two agents conversing until a task is complete.

from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination

# Termination condition: stop when "DONE" appears
termination = TextMentionTermination("DONE")

# Two agents take turns
team = RoundRobinGroupChat(
    participants=[user, assistant],
    termination_condition=termination,
    max_turns=10
)

# Run the conversation
result = await team.run(task="Analyze Q4 sales trends for the electronics division")
print(result.messages)  # full conversation history
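
Termination conditions compose: the real classes can be combined with `|` (stop when any fires) and `&` (stop when all fire). A library-free sketch of the idea, using illustrative stand-in functions rather than AutoGen's API:

```python
# Illustrative stand-ins for composable termination conditions (not AutoGen's API)
def text_mention(word):
    return lambda messages: any(word in m for m in messages)

def max_messages(n):
    return lambda messages: len(messages) >= n

def either(a, b):
    # Mirrors combining two real conditions with the | operator
    return lambda messages: a(messages) or b(messages)

stop = either(text_mention("DONE"), max_messages(10))
print(stop(["analyzing...", "DONE"]))  # True: "DONE" appeared
```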

Group Chat

Multiple agents in a managed conversation, with a strategy for who speaks next.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import SelectorGroupChat

analyst = AssistantAgent(
    name="data_analyst",
    model_client=model_client,
    system_message="You analyze data using Python code. Present findings with numbers."
)

strategist = AssistantAgent(
    name="strategist",
    model_client=model_client,
    system_message="You interpret analytical findings and propose business strategies."
)

critic = AssistantAgent(
    name="critic",
    model_client=model_client,
    system_message="You challenge assumptions and identify risks in proposed strategies."
)

# SelectorGroupChat: an LLM decides who speaks next based on conversation context
team = SelectorGroupChat(
    participants=[analyst, strategist, critic],
    model_client=model_client,  # selector model
    termination_condition=termination,
    max_turns=15
)

result = await team.run(task="Develop a strategy for AI-powered customer service")

Speaker Selection Strategies

| Strategy | Class | Description |
|---|---|---|
| Round Robin | RoundRobinGroupChat | Agents take turns in fixed order |
| Selector | SelectorGroupChat | LLM picks next speaker based on context |
| Swarm | Swarm | Agents hand off to specific next agents via HandoffMessage |
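
To make the round-robin strategy concrete, fixed-order turn-taking boils down to cycling through the participant list (an illustrative sketch, not the library's internals):

```python
from itertools import cycle

# Fixed-order turn-taking, the essence of round-robin speaker selection
participants = ["analyst", "strategist", "critic"]
speakers = cycle(participants)

# The first five turns wrap around the list in order
turn_order = [next(speakers) for _ in range(5)]
print(turn_order)  # ['analyst', 'strategist', 'critic', 'analyst', 'strategist']
```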

Code Execution

AutoGen’s distinguishing feature: agents can write code and execute it within sandboxed environments.

from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_agentchat.agents import CodeExecutorAgent

# Docker-based sandbox for safe code execution
code_executor = DockerCommandLineCodeExecutor(
    image="python:3.12-slim",
    timeout=60,
    work_dir="./code_output"
)
await code_executor.start()  # start the container before use; call stop() when done

# Agent that executes code blocks from conversations
executor_agent = CodeExecutorAgent(
    name="code_executor",
    code_executor=code_executor
)

# Pair with an assistant that writes code
coder = AssistantAgent(
    name="coder",
    model_client=model_client,
    system_message="""Write Python code to solve tasks. 
    Always include print statements for results.
    Wrap code in ```python blocks."""
)

team = RoundRobinGroupChat(
    participants=[coder, executor_agent],
    max_turns=10,
    termination_condition=termination
)

result = await team.run(
    task="Download AAPL stock data for 2025 and plot a 30-day moving average"
)

Code Executor Options

| Executor | Use Case | Safety |
|---|---|---|
| DockerCommandLineCodeExecutor | Production – full isolation | High (containerized) |
| LocalCommandLineCodeExecutor | Development/testing | Low (runs on host) |
| AzureContainerCodeExecutor | Cloud-native execution | High (Azure Container Instances) |
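
Conceptually, the executor agent scans each incoming message for fenced code blocks and runs what it finds. A simplified sketch of that extraction step (illustrative only, not the library's parser):

```python
import re

# Pull fenced ```python blocks out of a message (simplified; not AutoGen's parser)
FENCE = "```"
pattern = re.compile(re.escape(FENCE) + r"python\n(.*?)" + re.escape(FENCE), re.DOTALL)

message = "Here is the code:\n" + FENCE + "python\nprint(1 + 2)\n" + FENCE
blocks = pattern.findall(message)
print(blocks)  # ['print(1 + 2)\n']
```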

Human-in-the-Loop

Humans participate as agents in the conversation, not through external callbacks.

from autogen_agentchat.agents import UserProxyAgent

# Human approver
approver = UserProxyAgent(
    name="manager",
    input_func=input,  # blocks for terminal input
    description="A human manager who approves or rejects proposals"
)

team = RoundRobinGroupChat(
    participants=[analyst, strategist, approver],
    max_turns=20,
    termination_condition=termination
)

# The conversation naturally includes the human
result = await team.run(
    task="Propose a budget for the AI platform initiative"
)
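
`input_func` need not read from the terminal: any callable that takes a prompt and returns the human's reply will do, so approvals can arrive from a web form or a message queue instead of stdin. A sketch with a hypothetical queue-backed channel (the queue and function names here are illustrative, not part of AutoGen):

```python
import queue

# Hypothetical approval channel: replies arrive from a UI instead of stdin
approvals = queue.Queue()
approvals.put("Approved. Proceed with option B.")

def queue_input(prompt: str = "") -> str:
    # Drop-in replacement for `input` as a human-proxy input function
    return approvals.get()

reply = queue_input("Approve the budget proposal? ")
print(reply)  # Approved. Proceed with option B.
```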

Swarm Pattern (Agent Handoffs)

Agents explicitly hand off to the next agent, similar to OpenAI’s Swarm pattern.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import Swarm
from autogen_agentchat.messages import HandoffMessage

triage = AssistantAgent(
    name="triage",
    model_client=model_client,
    handoffs=["billing", "technical_support"],
    system_message="""You are a customer service triage agent.
    Determine the customer's issue and hand off to the appropriate specialist:
    billing for payment issues, technical_support for technical issues."""
)

billing = AssistantAgent(
    name="billing",
    model_client=model_client,
    handoffs=["triage"],
    system_message="You handle billing and payment inquiries."
)

technical = AssistantAgent(
    name="technical_support",
    model_client=model_client,
    handoffs=["triage"],
    system_message="You handle technical support issues."
)

team = Swarm(
    participants=[triage, billing, technical],
    termination_condition=termination
)

result = await team.run(task="My last payment was charged twice")
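
The control flow behind handoffs can be sketched without the library: each agent either answers or names the next agent, and a loop follows the chain (illustrative stand-ins, not AutoGen's implementation):

```python
# Each agent returns ("answer", text) or ("handoff", next_agent_name)
def triage(task):
    return ("handoff", "billing" if "payment" in task else "technical_support")

agents = {
    "triage": triage,
    "billing": lambda task: ("answer", "Refund issued for the duplicate charge."),
    "technical_support": lambda task: ("answer", "Troubleshooting steps sent."),
}

current, task = "triage", "My last payment was charged twice"
while True:
    kind, value = agents[current](task)
    if kind == "answer":
        break
    current = value  # follow the handoff to the named specialist
print(current, "->", value)  # billing -> Refund issued for the duplicate charge.
```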

Architecture: Core Runtime

For advanced use cases, the core runtime provides lower-level primitives.

from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, SingleThreadedAgentRuntime, message_handler

# Message types are plain dataclasses, routed to handlers by type
@dataclass
class ResearchRequest:
    topic: str

@dataclass
class ResearchResult:
    data: str

class ResearchAgent(RoutedAgent):
    def __init__(self):
        super().__init__("Research agent")

    @message_handler
    async def handle_request(self, message: ResearchRequest, ctx: MessageContext) -> None:
        # Process the message (do_research is assumed to be defined on the agent)
        result = await self.do_research(message.topic)
        # Publish result for other agents to consume
        await self.publish_message(ResearchResult(data=result), topic_id=ctx.topic_id)

# Runtime manages agent lifecycle, message routing, and concurrency
runtime = SingleThreadedAgentRuntime()
await ResearchAgent.register(runtime, "researcher", lambda: ResearchAgent())
runtime.start()
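
The `@message_handler` decorator routes each message to the handler whose `message` annotation matches the message's type. That dispatch idea in miniature (illustrative, not autogen-core's implementation):

```python
import typing
from dataclasses import dataclass

# Minimal type-based dispatch, the idea behind routed message handlers
@dataclass
class ResearchRequest:
    topic: str

handlers = {}

def message_handler(fn):
    # Route by the handler's `message` annotation, as the real decorator does
    msg_type = typing.get_type_hints(fn)["message"]
    handlers[msg_type] = fn
    return fn

@message_handler
def handle_request(message: ResearchRequest):
    return f"researching: {message.topic}"

msg = ResearchRequest(topic="multi-agent systems")
result = handlers[type(msg)](msg)
print(result)  # researching: multi-agent systems
```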

AutoGen Studio

A visual interface for building and testing multi-agent workflows without code.

  • Drag-and-drop agent and team configuration
  • Live testing with conversation visualization
  • Export configurations to Python code
  • Gallery of pre-built agent templates
pip install autogenstudio
autogenstudio ui --port 8080

Key Properties

| Property | AutoGen |
|---|---|
| Primary model | Multi-agent conversations |
| Agent types | AssistantAgent, UserProxyAgent, CodeExecutorAgent, custom |
| Orchestration | RoundRobin, Selector (LLM-based), Swarm (handoffs) |
| Code execution | First-class (Docker, local, Azure) |
| Human-in-the-loop | Native (humans are agents) |
| State/persistence | Message history; checkpointing via runtime |
| Streaming | Supported via async generators |
| Model support | OpenAI, Azure OpenAI, Anthropic (via adapter), local models |
| Language | Python (primary), .NET (experimental) |
| License | MIT |

AutoGen vs Alternatives

| Dimension | AutoGen | LangGraph | CrewAI |
|---|---|---|---|
| Mental model | Conversations | Graphs | Teams/roles |
| Code execution | Built-in, sandboxed | External tools | External tools |
| Human participation | Humans are agents | Interrupt/resume | Callbacks |
| Orchestration | Chat patterns | Graph edges | Process types |
| Setup complexity | Moderate | High | Low |
| Flexibility | High | Very high | Moderate |
| Enterprise backing | Microsoft | LangChain Inc | CrewAI Inc |
| Best for | Code-heavy, research, analysis | Complex stateful workflows | Quick role-based teams |

When to Use

Choose AutoGen when:

  • Your agents need to write and execute code as part of their reasoning
  • The “conversation between specialists” model fits your problem
  • You want humans to naturally participate in multi-agent discussions
  • You are building data analysis, research, or software engineering workflows
  • You need the flexibility of the core runtime for custom agent topologies

Avoid AutoGen when:

  • You need minimal setup / fastest path to a working multi-agent system (use CrewAI)
  • You need durable persistence and checkpointing out of the box (use LangGraph)
  • You want tight integration with a specific cloud provider’s agent ecosystem
  • Your use case is simple single-agent tool use (overkill)

References

  • AutoGen docs: https://microsoft.github.io/autogen/
  • GitHub: https://github.com/microsoft/autogen
  • AutoGen Studio: https://microsoft.github.io/autogen/docs/autogen-studio/
  • Research paper: “AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation” (2023)
This post is licensed under CC BY 4.0 by the author.