Microsoft AutoGen
An open-source framework for building multi-agent systems through structured conversations between agents -- featuring built-in code execution, group chat orchestration, human-in-the-loop participation, and a modular architecture that separates agent behavior from the models driving them.
What Is AutoGen?
AutoGen is Microsoft Research’s multi-agent framework. Its core insight: complex tasks are best solved through conversations between specialized agents, where each agent can be backed by an LLM, a human, a tool, or a code executor.
AutoGen v0.4 introduced a complete rewrite with a modular, event-driven architecture. (The earlier v0.2 API lives on separately in the community fork known as AG2.) The framework now consists of three packages:
- autogen-core: Event-driven agent runtime, message passing
- autogen-agentchat: High-level multi-agent conversation patterns
- autogen-ext: Extensions (model clients, code executors, tools)
Key Philosophy:
- Agents converse, not just execute. The conversation is the orchestration mechanism
- Code execution is first-class. Agents can write and run code as part of their reasoning
- Human participation is natural. Humans join conversations as agents, not through special hooks
- Modular runtime. Agents can run in-process or distributed across machines
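The first two points can be illustrated with a toy sketch in plain Python (not the AutoGen API): each "agent" is a function that reads the shared message history and replies, and a simple loop drives the conversation until a termination marker appears. The agent names and messages here are hypothetical.

```python
# Toy illustration of "conversation as orchestration" (plain Python, not AutoGen).
# Each "agent" is a function that reads the shared message history and replies.

def planner(history):
    # Hypothetical planner: delegates a step, then signals completion.
    if any("step done" in m for m in history):
        return "Looks complete. DONE"
    return "Please compute 2 + 2 and report back."

def worker(history):
    # Hypothetical worker: executes the requested computation.
    return f"Result is {2 + 2}. step done"

def run_conversation(agents, task, max_turns=10):
    history = [task]
    for turn in range(max_turns):
        speaker = agents[turn % len(agents)]  # round-robin turn taking
        reply = speaker(history)
        history.append(reply)
        if "DONE" in reply:                   # termination condition
            break
    return history

history = run_conversation([planner, worker], "Compute 2 + 2.")
print(history[-1])  # "Looks complete. DONE"
```

The conversation itself is the control flow: no external graph or scheduler decides what happens next, only the turn-taking rule and what the agents say.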
Status:
- v0.4+ stable, actively developed by Microsoft Research
- Open-source (MIT license)
- Python-first, experimental .NET support
- Strong research community, used extensively in Microsoft’s own AI products
Core Concepts
Agent Types (AgentChat Layer)
```python
from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Model client (works with OpenAI, Azure, Anthropic via compatible API)
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key="...",
)

# AssistantAgent: LLM-powered agent with optional tools
assistant = AssistantAgent(
    name="research_assistant",
    model_client=model_client,
    system_message="""You are a senior research analyst. When asked to analyze
data, write Python code to perform the analysis. Be thorough and precise.""",
    tools=[search_web, read_document],  # plain Python functions (defined elsewhere) registered as tools
)

# UserProxyAgent: represents a human in the conversation
user = UserProxyAgent(
    name="user",
    input_func=input,  # prompts for human input
)
```
Two-Agent Conversation
The simplest pattern: two agents conversing until a task is complete.
```python
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination

# Termination condition: stop when "DONE" appears in a message
termination = TextMentionTermination("DONE")

# Two agents take turns
team = RoundRobinGroupChat(
    participants=[user, assistant],
    termination_condition=termination,
    max_turns=10,
)

# Run the conversation
result = await team.run(task="Analyze Q4 sales trends for the electronics division")
print(result.messages)  # full conversation history
```
Group Chat
Multiple agents in a managed conversation, with a strategy for who speaks next.
```python
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import SelectorGroupChat

analyst = AssistantAgent(
    name="data_analyst",
    model_client=model_client,
    system_message="You analyze data using Python code. Present findings with numbers.",
)
strategist = AssistantAgent(
    name="strategist",
    model_client=model_client,
    system_message="You interpret analytical findings and propose business strategies.",
)
critic = AssistantAgent(
    name="critic",
    model_client=model_client,
    system_message="You challenge assumptions and identify risks in proposed strategies.",
)

# SelectorGroupChat: an LLM decides who speaks next based on conversation context
team = SelectorGroupChat(
    participants=[analyst, strategist, critic],
    model_client=model_client,  # model used for speaker selection
    termination_condition=termination,  # reuses the condition defined earlier
    max_turns=15,
)
result = await team.run(task="Develop a strategy for AI-powered customer service")
```
Speaker Selection Strategies
| Strategy | Class | Description |
|---|---|---|
| Round Robin | RoundRobinGroupChat | Agents take turns in a fixed order |
| Selector | SelectorGroupChat | An LLM picks the next speaker based on context |
| Swarm | Swarm | Agents hand off to specific next agents via HandoffMessage |
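The selector strategy can be sketched without any LLM by substituting a heuristic that inspects the last message and returns the next speaker's name. The routing rules below are hypothetical, plain-Python stand-ins for what the selector model does:

```python
# Toy sketch of selector-style speaker choice: a heuristic stands in for the
# LLM that SelectorGroupChat would normally consult.

def select_speaker(history):
    # Hypothetical routing rules based on the last message's content.
    last = history[-1].lower()
    if "number" in last or "data" in last:
        return "data_analyst"
    if "risk" in last:
        return "critic"
    return "strategist"

print(select_speaker(["We need the raw data first."]))  # data_analyst
print(select_speaker(["What risks does this carry?"]))  # critic
print(select_speaker(["Summarize the plan."]))          # strategist
```

In the real SelectorGroupChat, the selector model receives the conversation transcript plus each agent's name and description, and returns one participant name.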
Code Execution
AutoGen’s distinguishing feature: agents can write code and execute it within sandboxed environments.
```python
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_agentchat.agents import CodeExecutorAgent

# Docker-based sandbox for safe code execution
code_executor = DockerCommandLineCodeExecutor(
    image="python:3.12-slim",
    timeout=60,
    work_dir="./code_output",
)
await code_executor.start()  # starts the container before first use

# Agent that executes code blocks found in conversation messages
executor_agent = CodeExecutorAgent(
    name="code_executor",
    code_executor=code_executor,
)

# Pair it with an assistant that writes code
coder = AssistantAgent(
    name="coder",
    model_client=model_client,
    system_message="""Write Python code to solve tasks.
Always include print statements for results.
Wrap code in ```python blocks.""",
)

team = RoundRobinGroupChat(
    participants=[coder, executor_agent],
    max_turns=10,
    termination_condition=termination,
)
result = await team.run(
    task="Download AAPL stock data for 2025 and plot a 30-day moving average"
)
```
Code Executor Options
| Executor | Use Case | Safety |
|---|---|---|
| DockerCommandLineCodeExecutor | Production, full isolation | High (containerized) |
| LocalCommandLineCodeExecutor | Development/testing | Low (runs on host) |
| ACADynamicSessionsCodeExecutor | Cloud-native execution | High (Azure Container Apps dynamic sessions) |
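The core job these executors share can be sketched in a few lines of plain Python: find the fenced code block in an agent's message, run it, and return the captured output. This sketch uses in-process `exec` purely for illustration; AutoGen's real executors run the code in a subprocess, container, or cloud sandbox instead.

```python
import io
import re
import contextlib

# Toy sketch of what a command-line code executor does: extract the fenced
# python block from an agent message and run it, capturing stdout.
# (Real executors isolate execution, e.g. in a Docker container.)
CODE_BLOCK = re.compile(r"```python\n(.*?)```", re.DOTALL)

def execute_code_blocks(message: str) -> str:
    outputs = []
    for code in CODE_BLOCK.findall(message):
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, {})  # illustration only; never exec untrusted code in-process
        outputs.append(buf.getvalue())
    return "".join(outputs)

reply = "Here is the analysis:\n```python\nprint(sum(range(5)))\n```\n"
print(execute_code_blocks(reply))  # "10"
```

The captured output is what gets posted back into the conversation as the executor agent's message, closing the write-run-observe loop.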
Human-in-the-Loop
Humans participate as agents in the conversation, not through external callbacks.
```python
from autogen_agentchat.agents import UserProxyAgent

# Human approver
approver = UserProxyAgent(
    name="manager",
    input_func=input,  # blocks for terminal input
    description="A human manager who approves or rejects proposals",
)

team = RoundRobinGroupChat(
    participants=[analyst, strategist, approver],
    max_turns=20,
    termination_condition=termination,
)

# The conversation naturally includes the human
result = await team.run(
    task="Propose a budget for the AI platform initiative"
)
```
Swarm Pattern (Agent Handoffs)
Agents explicitly hand off to the next agent, similar to OpenAI’s Swarm pattern.
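The handoff mechanic can be sketched in plain Python (not the AutoGen Swarm API): each agent returns its reply plus the name of the next agent, and `None` ends the conversation. The agent behaviors and routing keywords here are hypothetical.

```python
# Toy sketch of the handoff pattern (plain Python, not AutoGen's Swarm).
# Each agent returns (reply, next_agent_name); None ends the conversation.

def triage(message):
    if "payment" in message or "charged" in message:
        return "Routing to billing.", "billing"
    return "Routing to technical support.", "technical_support"

def billing(message):
    return "Billing: refund issued for the duplicate charge.", None

def technical_support(message):
    return "Tech support: please restart the device.", None

AGENTS = {"triage": triage, "billing": billing, "technical_support": technical_support}

def run_swarm(task, start="triage", max_hops=5):
    current, transcript = start, []
    for _ in range(max_hops):
        reply, next_agent = AGENTS[current](task)
        transcript.append((current, reply))
        if next_agent is None:
            break
        current = next_agent
    return transcript

transcript = run_swarm("My last payment was charged twice")
print(transcript[-1][0])  # "billing"
```

In AutoGen proper, the `handoffs` list on an AssistantAgent generates handoff tools the LLM can call, and the resulting HandoffMessage transfers control to the named participant.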
```python
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import Swarm

triage = AssistantAgent(
    name="triage",
    model_client=model_client,
    handoffs=["billing", "technical_support"],  # generates handoff tools for these targets
    system_message="""You are a customer service triage agent.
Determine the customer's issue and hand off to the appropriate specialist:
billing for payment issues, technical_support for technical issues.""",
)
billing = AssistantAgent(
    name="billing",
    model_client=model_client,
    handoffs=["triage"],
    system_message="You handle billing and payment inquiries.",
)
technical = AssistantAgent(
    name="technical_support",
    model_client=model_client,
    handoffs=["triage"],
    system_message="You handle technical support issues.",
)

team = Swarm(
    participants=[triage, billing, technical],
    termination_condition=termination,
)
result = await team.run(task="My last payment was charged twice")
```
Architecture: Core Runtime
For advanced use cases, the core runtime provides lower-level primitives.
```python
from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, SingleThreadedAgentRuntime, message_handler

@dataclass
class ResearchRequest:
    topic: str

@dataclass
class ResearchResult:
    data: str

class ResearchAgent(RoutedAgent):
    def __init__(self):
        super().__init__("Research agent")

    @message_handler
    async def handle_request(self, message: ResearchRequest, ctx: MessageContext) -> None:
        # Process the message
        result = await self.do_research(message.topic)
        # Publish the result for other agents to consume
        await self.publish_message(ResearchResult(data=result), topic_id=ctx.topic_id)

# The runtime manages agent lifecycle, message routing, and concurrency
runtime = SingleThreadedAgentRuntime()
await ResearchAgent.register(runtime, "researcher", lambda: ResearchAgent())
runtime.start()
```
AutoGen Studio
A visual interface for building and testing multi-agent workflows without code.
- Drag-and-drop agent and team configuration
- Live testing with conversation visualization
- Export configurations to Python code
- Gallery of pre-built agent templates
```shell
pip install autogenstudio
autogenstudio ui --port 8080
```
Key Properties
| Property | AutoGen |
|---|---|
| Primary model | Multi-agent conversations |
| Agent types | AssistantAgent, UserProxyAgent, CodeExecutorAgent, custom |
| Orchestration | RoundRobin, Selector (LLM-based), Swarm (handoffs) |
| Code execution | First-class (Docker, local, Azure) |
| Human-in-the-loop | Native (humans are agents) |
| State/persistence | Message history; checkpointing via runtime |
| Streaming | Supported via async generators |
| Model support | OpenAI, Azure OpenAI, Anthropic (via adapter), local models |
| Language | Python (primary), .NET (experimental) |
| License | MIT |
AutoGen vs Alternatives
| Dimension | AutoGen | LangGraph | CrewAI |
|---|---|---|---|
| Mental model | Conversations | Graphs | Teams/roles |
| Code execution | Built-in, sandboxed | External tools | External tools |
| Human participation | Humans are agents | Interrupt/resume | Callbacks |
| Orchestration | Chat patterns | Graph edges | Process types |
| Setup complexity | Moderate | High | Low |
| Flexibility | High | Very high | Moderate |
| Enterprise backing | Microsoft | LangChain Inc | CrewAI Inc |
| Best for | Code-heavy, research, analysis | Complex stateful workflows | Quick role-based teams |
When to Use
Choose AutoGen when:
- Your agents need to write and execute code as part of their reasoning
- The “conversation between specialists” model fits your problem
- You want humans to naturally participate in multi-agent discussions
- You are building data analysis, research, or software engineering workflows
- You need the flexibility of the core runtime for custom agent topologies
Avoid AutoGen when:
- You need minimal setup / fastest path to a working multi-agent system (use CrewAI)
- You need durable persistence and checkpointing out of the box (use LangGraph)
- You want tight integration with a specific cloud provider’s agent ecosystem
- Your use case is simple single-agent tool use (overkill)
References
- AutoGen docs: https://microsoft.github.io/autogen/
- GitHub: https://github.com/microsoft/autogen
- AutoGen Studio: https://microsoft.github.io/autogen/docs/autogen-studio/
- Research paper: “AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation” (2023)