By The DDH Team · Digital Dashboard Hub

Multi-Agent Coordination Patterns (2026): Supervisor, Swarm, Hierarchical, Network


Multi-agent systems in 2026 are not one pattern — they are four distinct architectural choices that trade off coordination overhead, parallelism, flexibility, and cost in different ways. The supervisor pattern has a single controller that routes tasks to specialized workers; the swarm pattern has agents that hand off to each other peer-to-peer; the hierarchical pattern nests supervisors within supervisors for large teams; and the agent network pattern has fully decentralized agents communicating via shared memory or message bus. Choosing the wrong pattern for your task is the most common cause of multi-agent systems that cost 3-5x more than they should. See our multi-agent cost per task calculator for cost models of specific frameworks.

The pattern you choose is not just about architecture — it determines how you debug failures, how you add new agents, and how cost scales with task complexity. A supervisor system is easy to debug (the supervisor's routing decisions are explicit and logged) but can become a bottleneck when the supervisor model is overwhelmed with complex coordination. A swarm is maximally parallel but hard to debug when agents hand off in unexpected sequences. Hierarchical systems scale to large teams but incur coordination overhead at every level. Network patterns are the most flexible but the most expensive. Source: LangGraph multi-agent docs, AutoGen multi-agent patterns.

Below: each pattern explained with code, cost analysis, and real production use cases. Steps cover choosing your pattern, implementing the core routing logic, adding error handling and fallbacks, scaling with sub-agents, and running end-to-end tests. For eval pipelines to test these patterns, see our agent eval with Langfuse tutorial. For single-agent baselines, see our LangGraph tutorial and CrewAI tutorial.


Multi-agent coordination pattern comparison, June 2026

| Feature | Supervisor | Swarm | Hierarchical | Network |
|---|---|---|---|---|
| Central coordinator | Yes (1 supervisor) | No (peer-to-peer) | Yes (nested supervisors) | No (shared memory) |
| Control flow | Supervisor decides routing | Agents self-route | Level-based delegation | Event-driven |
| Parallelism | Via Send API | Natural | Per-level | Natural |
| Debuggability | High (explicit routing) | Medium (trace handoffs) | Medium (level-by-level) | Low (emergent) |
| Cost overhead | Low (1 supervisor LLM) | Very low | Medium (N supervisors) | Variable |
| Best for | Clear task types | Dynamic handoffs | Large agent teams | Complex networks |
| Failure mode | Supervisor bottleneck | Routing loops | Coordination cascade | Emergent deadlock |
| Primary framework | LangGraph | LangGraph/AutoGen | LangGraph/AutoGen | AutoGen |
| Worker count sweet spot | 2-6 workers | 3-8 agents | 10-50 agents (nested) | Unlimited |
| Typical task cost multiple vs single agent | 1.8-3x | 1.5-2.5x | 2.5-5x | 3-10x |
| Production maturity (2026) | High | Medium | Medium | Low/Research |
| Example use case | Research + write + edit | Customer support routing | Enterprise research team | Autonomous research org |

Sources, fetched 2026-06-21: LangGraph multi-agent concepts (https://langchain-ai.github.io/langgraph/concepts/multi_agent/), AutoGen multi-agent documentation (https://microsoft.github.io/autogen/stable/). Cost multiples modeled on Claude Sonnet 4.6 baseline single-agent loop ($0.137/query). Supervisor pattern 1.8-3x accounts for supervisor overhead and worker execution. Swarm 1.5-2.5x accounts for handoff tokens but lower coordination overhead. Hierarchical 2.5-5x accounts for N-level coordination overhead. Network 3-10x accounts for emergent communication overhead and variable agent counts. Production maturity ratings based on documented production deployments in 2026 technical case studies.

Pattern 1: supervisor — the most common production pattern

**The supervisor pattern** has a central controller LLM that receives the task, decides which worker agent to invoke, receives the worker's result, and decides whether to invoke another worker, loop back to a prior worker, or terminate. All coordination flows through the supervisor — workers never communicate directly. This is the most common production multi-agent pattern in 2026 because it's easy to reason about, easy to debug (the supervisor's routing decisions are explicit in the logs), and easy to extend (add a new worker by registering it with the supervisor). Source: LangGraph supervisor pattern.

**LangGraph supervisor implementation.** `from langgraph.graph import StateGraph, END; from langchain_core.messages import HumanMessage, AIMessage, SystemMessage; supervisor_prompt = SystemMessage(content='You are a supervisor coordinating a research team. Available workers: researcher (web search), writer (drafts), editor (reviews). Route to the appropriate worker based on what needs to be done next. Say FINISH when the task is complete.'); def supervisor(state): messages = [supervisor_prompt] + state['messages']; response = llm.invoke(messages); return {'next_worker': parse_next_worker(response), 'messages': state['messages'] + [response]}`. The supervisor reads the full conversation state and decides which worker to invoke next. Here `llm` and `parse_next_worker` are placeholders you supply: a chat model instance and a function that maps the supervisor's reply to a worker name or FINISH.
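The snippet above relies on a `parse_next_worker` helper that is not shown. A minimal sketch of what it might look like, assuming the supervisor replies in free text that names a worker or says FINISH. The function body, the `WORKERS` tuple, and the policy of ending the run on an unparseable reply are illustrative assumptions, not part of LangGraph; this version takes the reply text, so you would pass `response.content`.

```python
import re

WORKERS = ("researcher", "writer", "editor")  # worker names from the supervisor prompt
FINISH = "FINISH"

def parse_next_worker(reply_text: str) -> str:
    """Return the first worker named in the supervisor's reply, or FINISH."""
    # An explicit FINISH wins, so "FINISH, the editor approved" terminates the run.
    if re.search(r"\bFINISH\b", reply_text):
        return FINISH
    lowered = reply_text.lower()
    hits = []
    for worker in WORKERS:
        match = re.search(rf"\b{worker}\b", lowered)
        if match:
            hits.append((match.start(), worker))
    # Assumed policy: an unparseable reply ends the run instead of guessing a worker.
    return min(hits)[1] if hits else FINISH
```

In practice, structured output or a tool call with an enumerated `next` field is less brittle than text parsing; the sketch only makes the routing contract concrete.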

**Parallel worker invocation via LangGraph's Send API.** For supervisors that can dispatch multiple workers simultaneously: `from langgraph.types import Send; def supervisor_parallel(state): tasks = analyze_and_split(state['task']); return [Send('worker', {'task': t, 'context': state['context']}) for t in tasks]`. This function is typically registered as a conditional edge rather than as a node. Each `Send` creates an independent invocation of the `worker` node with its own input, and the invocations run concurrently within the same step. Results are merged back into the shared graph state once all workers complete, so the state key the workers write to needs a reducer (for example `Annotated[list, operator.add]`). This map-reduce style fan-out is the key LangGraph feature that separates it from strictly sequential frameworks. Source: LangGraph Send API docs.

**Supervisor cost analysis.** The supervisor's cost is: (1) supervisor LLM call per routing decision (typically 300-800 input + 50-200 output tokens), (2) all worker LLM calls (the dominant cost component), (3) supervisor LLM call for synthesis (if the supervisor assembles the final output). For a 3-worker supervisor on Sonnet 4.6 with 4 routing decisions: supervisor overhead = 4 × (600 input × $3/1M + 150 output × $15/1M) = 4 × ($0.0018 + $0.00225) = $0.016. Workers = 3 × $0.063 (3-turn loop) = $0.189. Synthesis = $0.025. **Total: $0.230/task** — vs $0.149 for CrewAI on same task. The overhead is the price of explicit routing control and debuggability.
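The arithmetic above can be reproduced with a small cost model, which makes it easier to plug in your own token counts. This is a sketch: the `SupervisorCost` class and its field names are hypothetical, and the defaults are simply the figures quoted in the paragraph above.

```python
from dataclasses import dataclass

INPUT_PRICE = 3 / 1_000_000    # $ per input token (Sonnet 4.6 figure from the text)
OUTPUT_PRICE = 15 / 1_000_000  # $ per output token

@dataclass
class SupervisorCost:
    routing_decisions: int = 4
    routing_input_tokens: int = 600
    routing_output_tokens: int = 150
    workers: int = 3
    cost_per_worker: float = 0.063   # 3-turn worker loop, from the text
    synthesis_cost: float = 0.025

    def routing_overhead(self) -> float:
        """Cost of all supervisor routing calls for one task."""
        per_call = (self.routing_input_tokens * INPUT_PRICE
                    + self.routing_output_tokens * OUTPUT_PRICE)
        return self.routing_decisions * per_call

    def total(self) -> float:
        """Routing overhead plus worker execution plus final synthesis."""
        return (self.routing_overhead()
                + self.workers * self.cost_per_worker
                + self.synthesis_cost)

cost = SupervisorCost()
print(f"routing overhead: ${cost.routing_overhead():.4f}")
print(f"total per task:   ${cost.total():.3f}")
```

With the defaults, routing overhead is about 7% of the total, which is why worker calls are the place to optimize first.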

**When to use the supervisor pattern.** Supervisor is optimal when: (1) workers have distinct capabilities that the supervisor needs to choose between (not just a fixed sequence), (2) the task requires adaptive routing (the right next step depends on prior results), (3) some tasks complete with just 1-2 workers while others need 5-6 (dynamic crew size), (4) quality control requires a supervisor to check worker output before passing it downstream. It's NOT optimal for fixed pipelines (use CrewAI sequential), for peer-to-peer workflows where any agent should be able to hand off to any other (use swarm), or for massive teams >10 workers (overhead grows with routing decisions).

**Supervisor as quality gate.** A common production extension: add a supervisor routing decision after each worker that explicitly checks output quality. `def should_continue(state): supervisor_check = llm.invoke(state['messages'] + [quality_check_prompt]); if 'RETRY' in supervisor_check.content: return {'next': 'researcher'}; elif 'DONE' in supervisor_check.content: return {'next': END}; else: return {'next': 'writer'}`. This adds 1 supervisor LLM call per worker completion (cost: ~$0.007/check on Sonnet 4.6) but catches low-quality worker outputs before they flow downstream. At 10K tasks/month, each check per task adds about $70/month, so the 3-worker example above (three checks per task) comes to roughly $210/month in quality gate overhead. That is justified if it stops even a few percent of low-quality outputs from reaching users.


Pattern 2: swarm — peer-to-peer agent handoff

**The swarm pattern** has agents that transfer control directly to each other based on task context — no central coordinator. Each agent decides when it's done and which agent should handle the next step. This mirrors how human teams work: a researcher hands off to a writer when they have enough material; the writer hands off to a reviewer when the draft is complete. The pattern emerged from OpenAI's experimental Swarm library (2024) and was productionized in LangGraph and AutoGen in 2025. Source: LangGraph swarm implementation.

**LangGraph swarm implementation.** Each agent has a `handoff_to_*` tool that transfers control to another agent: `@tool; def handoff_to_writer(context: str) -> str: '''Transfer to the writer agent when research is complete. context: summary of research findings.'''; return context`. The tool doesn't execute anything — it's a structured handoff signal. The current agent outputs a call to `handoff_to_writer`, the graph routing function detects the handoff tool call and routes to the writer node. `def route_from_agent(state): last = state['messages'][-1]; if hasattr(last, 'tool_calls') and last.tool_calls: tool_name = last.tool_calls[0]['name']; if tool_name.startswith('handoff_to_'): return tool_name.replace('handoff_to_', ''); return 'tools'`.
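Laid out as a regular function, the routing logic above might look like the following sketch. `FakeAIMessage` is a hypothetical stand-in so the example runs without any framework installed; it assumes messages expose `tool_calls` as a list of dicts with a `name` key. The `"end"` label for the no-tool-call case is also an assumption, since the original snippet does not say what happens then.

```python
from dataclasses import dataclass, field

HANDOFF_PREFIX = "handoff_to_"

@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for a chat model message with optional tool calls."""
    content: str = ""
    tool_calls: list = field(default_factory=list)

def route_from_agent(state: dict) -> str:
    """Pick the next node from the last message's tool calls."""
    last = state["messages"][-1]
    tool_calls = getattr(last, "tool_calls", None) or []
    for call in tool_calls:
        name = call["name"]
        if name.startswith(HANDOFF_PREFIX):
            # "handoff_to_writer" routes to the node named "writer".
            return name[len(HANDOFF_PREFIX):]
    # Ordinary tool calls go to the tool-executing node; no calls ends the turn.
    return "tools" if tool_calls else "end"
```

Stripping the prefix by length rather than with `replace` avoids mangling an agent name that happens to contain the prefix text.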

**AutoGen's conversable agent handoff pattern.** Microsoft's AutoGen implements swarm coordination via the `ConversableAgent` API with dynamic speaker selection. Each agent in the conversation can be the next speaker based on context: `from autogen import ConversableAgent, GroupChat, GroupChatManager; researcher = ConversableAgent('researcher', ...); writer = ConversableAgent('writer', ...); editor = ConversableAgent('editor', ...); group_chat = GroupChat(agents=[researcher, writer, editor], messages=[], max_round=10); manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config); result = researcher.initiate_chat(manager, message=task)`. The `GroupChatManager` decides who speaks next based on conversation context, so in effect it is a lightweight supervisor with automatic routing.

**Swarm cost advantages.** The swarm pattern has the lowest coordination overhead of all multi-agent patterns because there's no central coordinator processing every message. Handoff messages are compact (the handoff tool call is 50-100 output tokens; the context summary passed with the handoff is 200-400 tokens). For a 3-agent swarm on Sonnet 4.6 with 2 handoffs: handoff overhead = 2 × (100 output + 300 input received) = 2 × ($0.0015 + $0.0009) = $0.0048. Workers = 3 × $0.063 = $0.189. **Total: $0.194/task** — 16% cheaper than the supervisor pattern for equivalent task quality. The tradeoff is debuggability.

**Swarm failure modes and mitigation.** The swarm's main risks: (1) routing loops — agent A hands off to B, B hands off back to A, infinite loop. Mitigation: add a `visited_agents` list to state; agents that have been visited twice refuse to accept another handoff. (2) No-handoff termination — an agent completes its work but forgets to hand off, leaving the task incomplete. Mitigation: add a `FINISH` handoff tool that agents call when the overall task is done. (3) Context loss between handoffs — the receiving agent doesn't have enough context to continue. Mitigation: require a mandatory context summary in every handoff tool call (the `context` parameter in the handoff tool). Source: AutoGen swarm patterns.
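The routing-loop mitigation can be sketched as a guard that every handoff passes through. This is an illustrative, framework-free version: `accept_handoff`, `MAX_VISITS`, and the use of `"FINISH"` as the terminal marker are assumptions chosen to match the description above.

```python
MAX_VISITS = 2  # per-agent cap from the mitigation described above

def accept_handoff(state: dict, target: str) -> dict:
    """Return updated state if `target` may take control, else end the run."""
    visits = state["visited_agents"].count(target)
    if visits >= MAX_VISITS:
        # Target already ran twice: stop instead of looping A -> B -> A -> B ...
        return {**state, "current_agent": "FINISH"}
    return {
        **state,
        "current_agent": target,
        "visited_agents": state["visited_agents"] + [target],
    }
```

A researcher and writer that keep handing back to each other are cut off on the third attempt to re-enter the same agent, so the worst case is bounded at two activations per agent.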

**When to use the swarm pattern.** Swarm is optimal for: dynamic workflows where the right sequence of agents depends on the content (a customer support swarm where the agent handling the query decides whether to pass to billing, technical, or escalation); workflows with natural peer expertise (each agent knows which other agent to call for specific sub-tasks); and high-volume workflows where supervisor overhead is a significant cost (at 100K tasks/month, the $0.036/task cost difference between supervisor and swarm is $3,600/month). It's NOT optimal for fixed pipelines (use sequential), for complex multi-level task decomposition (use hierarchical), or for tasks where you need explicit audit trails of routing decisions (supervisor gives you this; swarm makes it harder).


Pattern 3: hierarchical — nested supervisors for large teams

**The hierarchical pattern** extends the supervisor pattern to multiple levels — a top-level supervisor delegates to team supervisors, which delegate to individual worker agents. This mirrors large human organizations: a project manager (top supervisor) delegates to team leads (sub-supervisors), who delegate to specialists (workers). The pattern is appropriate when you have 10+ specialized agents that can't all be effectively managed by a single supervisor's routing decision. Source: LangGraph hierarchical multi-agent.

**LangGraph hierarchical implementation.** Each team is a compiled LangGraph graph that acts as a node in the top-level graph: `research_team = build_research_team_graph(); writing_team = build_writing_team_graph(); top_graph = StateGraph(TopState); top_graph.add_node('research_team', research_team.invoke); top_graph.add_node('writing_team', writing_team.invoke); top_graph.add_node('top_supervisor', top_supervisor_node)`. The top supervisor receives the task, decides which team to activate, and the team's internal supervisor handles worker routing. The top-level graph only sees team-level inputs and outputs, not individual worker calls. This abstraction makes large systems manageable.

**AutoGen's nested agent pattern.** AutoGen supports nested conversations — an agent can initiate a sub-conversation with a group of agents and return the result: `top_agent.register_nested_chats([{'recipient': research_team, 'message': 'Research the topic', 'summary_method': 'last_msg'}])`. The nested conversation runs independently, returns a summary, and the top agent continues with the summary. This is AutoGen's hierarchical pattern — the parent agent delegates to a sub-team and gets back a concise summary without seeing all the sub-team's internal communication. Source: AutoGen nested conversation docs.

**Hierarchical coordination overhead.** The cost of hierarchy is a full LLM call at every level of coordination. For a 3-level hierarchy (top supervisor → 3 team supervisors → 3 workers each = 9 workers total): top supervisor calls = 3-5 routing decisions × $0.007 = $0.021-$0.035. Team supervisor calls = 3 teams × 3-4 routing decisions × $0.007 = $0.063-$0.084. Worker execution = 9 workers × $0.063 (Sonnet 4.6) = $0.567. Synthesis across levels = $0.040. **Total: $0.691-$0.726/task on Sonnet 4.6.** This is 5x a single-agent loop — justified only when 9 specialized workers are genuinely needed. Most tasks that seem to need 9 workers can be accomplished by 3 workers with better prompts.

**When hierarchical is justified.** Hierarchical multi-agent systems make sense for: (1) enterprise research tasks with genuinely distinct specializations (legal research team + financial analysis team + competitive intelligence team each with their own tools and models), (2) software development agents with separate coding, testing, and documentation teams, (3) content production pipelines at scale where teams work independently on parallel tracks. The rule of thumb: if you can't explain why each team needs its own supervisor (rather than being managed by the top supervisor directly), you don't need the hierarchy. Add levels only when the top supervisor's routing decisions become too complex to be reliable with a single context window.

**State management in hierarchical systems.** Each level of hierarchy has its own state. The challenge: state updates at the worker level need to propagate up to higher levels without overwhelming higher-level supervisors with low-level detail. Pattern: workers write to team-scoped state; team supervisors summarize team results into team-level state fields; the top supervisor reads only team-level summaries. `team_state = {'team_summary': summarize(worker_results), 'full_results': worker_results}; top_state = {'team_summaries': [team_state['team_summary'] for team_state in all_team_states]}`. The top supervisor reasons from summaries; detailed results are available for inspection but not surfaced by default. Source: LangGraph state management docs.


Pattern 4: agent network — decentralized coordination

**The agent network pattern** has agents communicating through a shared memory store or message bus rather than through a central coordinator or direct handoffs. Each agent subscribes to events, processes what it can, and publishes results back to the shared store. Other agents pick up those results and continue work. This is the most autonomous and least structured pattern — appropriate for research systems where the sequence of steps isn't known upfront and agents need to cooperate without predetermined structure. Source: AutoGen agent network concepts.

**AutoGen's actor-model implementation.** AutoGen v0.4 introduced an actor-model multi-agent system where agents are autonomous actors that process messages from their inbox: `from autogen_agentchat.agents import AssistantAgent; from autogen_agentchat.teams import RoundRobinGroupChat; agent_1 = AssistantAgent('researcher', model_client=claude_client, tools=[search_tool]); agent_2 = AssistantAgent('analyst', model_client=claude_client, tools=[analysis_tool]); team = RoundRobinGroupChat([agent_1, agent_2], max_turns=10)`. In round-robin mode, each agent takes a turn in sequence. In `SelectorGroupChat` mode, a model selects who speaks next. For fully decentralized networks, each agent publishes to a shared topic and all subscribers see the message. Source: AutoGen agentchat docs.

**LangGraph as shared-state network.** LangGraph's shared state object can serve as the network's communication bus: `class NetworkState(TypedDict): published_facts: Annotated[list, operator.add]; pending_tasks: Annotated[list, operator.add]; completed_tasks: list`. Any agent can read all published_facts and append new ones. An orchestrator node monitors `pending_tasks` and routes to the appropriate agent. This is a hybrid between supervisor and network — shared state provides the decentralized communication; a lightweight orchestrator handles routing. For most production network patterns, this hybrid is more controllable than a fully decentralized approach.
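To make the role of the `Annotated[list, operator.add]` annotations concrete, here is a framework-free sketch of how a reducer-annotated state could be merged: annotated keys are combined with their reducer, plain keys are overwritten. The `apply_update` helper is hypothetical and only imitates the idea; LangGraph's own merge logic is internal to the library.

```python
import operator
from typing import Annotated, TypedDict, get_args, get_origin, get_type_hints

class NetworkState(TypedDict):
    published_facts: Annotated[list, operator.add]
    pending_tasks: Annotated[list, operator.add]
    completed_tasks: list

def apply_update(state: dict, update: dict) -> dict:
    """Merge an agent's partial update into the shared state."""
    hints = get_type_hints(NetworkState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        hint = hints[key]
        if get_origin(hint) is Annotated:
            reducer = get_args(hint)[1]       # e.g. operator.add
            merged[key] = reducer(state[key], value)
        else:
            merged[key] = value               # no reducer: last write wins
    return merged
```

The practical consequence is that two agents appending to `published_facts` never clobber each other, while `completed_tasks` (no reducer) must be written by a single owner such as the orchestrator.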

**Agent network cost structure.** Networks are the most expensive pattern because of emergent communication overhead. Every time an agent publishes to shared state, that state grows, and each subsequent agent reads all of it. In a 5-agent network where each agent adds 500 tokens to shared state and there are 3 passes of the network, state reaches 5 × 500 × 3 = 7,500 tokens by the end of the final pass. As an upper bound, if each of the 5 × 3 = 15 agent calls read the full 7,500 tokens: 15 × 7,500 tokens × $3/1M (Sonnet) = $0.34 from state reading alone. Because state accumulates call by call, the actual read volume is roughly half that (52,500 tokens, about $0.16), but it grows quadratically with the number of calls. Without careful state pruning, agent networks compound to 10x single-agent cost with no quality guarantee. Source: cost modeling from multi-agent cost calculator.
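The $0.34 figure treats every call as reading the final 7,500-token state. A small simulation, sketched below, tracks the state as it actually accumulates and shows the effect of pruning. The `state_read_cost` function and its `window` parameter (a cap on how many tokens of state each call sees) are hypothetical names introduced for illustration.

```python
INPUT_PRICE = 3 / 1_000_000  # $ per input token (Sonnet figure from the text)

def state_read_cost(agents=5, passes=3, tokens_per_write=500, window=None):
    """Total input cost of re-reading shared state across all agent calls.

    window: if set, each call reads at most this many tokens (pruned state).
    """
    state_tokens = 0
    tokens_read = 0
    for _ in range(agents * passes):
        visible = state_tokens if window is None else min(state_tokens, window)
        tokens_read += visible
        state_tokens += tokens_per_write  # this call publishes its findings
    return tokens_read * INPUT_PRICE

print(f"unpruned:            ${state_read_cost():.4f}")
print(f"2,000-token window:  ${state_read_cost(window=2000):.4f}")
```

Unpruned reads grow quadratically with the number of calls, while a fixed window makes them linear, which is the argument for summarizing or pruning shared state early.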

**When agent networks are appropriate.** Networks are appropriate for: autonomous research systems where the exploration path is unknown (agents discover what to search based on what other agents found), simulation environments (multiple agents representing different stakeholders need to react to each other), and experimental systems where you're willing to trade cost for emergent behavior. In production in 2026, pure agent networks without some form of coordination (at minimum a lightweight orchestrator that monitors the network state) are rare — the cost and debuggability challenges make them impractical for most business applications. Use networks for research and experimentation; use supervisor or hierarchical for production.

**Grounding networks with a lightweight orchestrator.** The practical version of the agent network pattern for production is a hybrid: `class OrchestratedNetwork: def __init__(self, agents, orchestrator_llm): ...; def step(self, state): # orchestrator reads state and picks next agent; next_agent = orchestrator_llm.invoke(ORCHESTRATOR_PROMPT.format(state=summarize_state(state))); result = agents[next_agent].run(state); return update_state(state, result)`. The orchestrator adds ~$0.005/routing decision in cost but dramatically improves debuggability and prevents emergent routing loops. This orchestrated-network hybrid is the practical production version of the network pattern — fully decentralized networks remain in the research category.


Choosing the right pattern: the decision framework

**The decision tree.** Start with the task structure: (1) Is the sequence of steps fixed and predetermined? → Use sequential single-crew (CrewAI sequential or LangGraph fixed pipeline). (2) Is the sequence adaptive but managed by a single controller? → Use supervisor. (3) Do agents have natural peer expertise that determines who handles what? → Use swarm. (4) Are there 10+ specialized agents that need team-level organization? → Use hierarchical. (5) Is the exploration path unknown and emergent? → Use agent network (with caution, high cost).
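The decision tree translates directly into a function that checks the five questions in order. This is a sketch: `TaskProfile` and its field names are hypothetical, and the final fallback to the supervisor pattern when no question matches is an assumption, not something the decision tree above specifies.

```python
from dataclasses import dataclass

@dataclass
class TaskProfile:
    fixed_sequence: bool = False     # (1) steps are known in advance
    single_controller: bool = False  # (2) adaptive routing by one coordinator
    peer_expertise: bool = False     # (3) agents know whom to hand off to
    agent_count: int = 1             # (4) number of specialized agents
    emergent_path: bool = False      # (5) exploration path unknown upfront

def choose_pattern(profile: TaskProfile) -> str:
    """Walk the decision tree top to bottom; the first match wins."""
    if profile.fixed_sequence:
        return "sequential"
    if profile.single_controller:
        return "supervisor"
    if profile.peer_expertise:
        return "swarm"
    if profile.agent_count >= 10:
        return "hierarchical"
    if profile.emergent_path:
        return "network"
    # Assumed default: the most common production pattern.
    return "supervisor"
```

For example, `choose_pattern(TaskProfile(peer_expertise=True, agent_count=4))` returns `"swarm"`. Because the first match wins, a fixed pipeline stays sequential even when it involves many agents.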

**Cost as a selection criterion.** If budget is a binding constraint, the pattern order from cheapest to most expensive for equivalent task quality is: sequential/swarm (lowest overhead) < supervisor (medium overhead) < hierarchical (N-level overhead) < network (emergent overhead). At 100K tasks/month on Sonnet 4.6: sequential $13,700, swarm $14,000-$19,400, supervisor $17,000-$23,000, hierarchical $34,500+, network variable ($30K-$70K+). For most SaaS applications, sequential or swarm patterns deliver adequate quality at the lowest cost. Source: multi-agent cost per task calculator.

**Debuggability as a selection criterion.** In production, the ability to debug failures quickly is as important as quality. Supervisor pattern is the most debuggable — every routing decision is a logged LLM call with explicit input and output. Hierarchical is debuggable level by level. Swarm is harder to debug because handoffs can happen in unexpected sequences. Networks are the hardest to debug because behavior is emergent. For regulated industries (finance, healthcare, legal) where audit trails are required, supervisor and hierarchical are the only viable patterns — every decision must be explainable.

**Start with the simplest pattern that works.** The most common mistake in multi-agent architecture is over-engineering at the start. A sequential CrewAI crew handles 80% of production use cases that teams initially think require a hierarchical network. Build the simplest pattern first, measure quality and cost, and only add complexity when you hit a measurable quality ceiling. 'We might need a network pattern eventually' is not a reason to build one now. The upgrade path is real: sequential → supervisor → hierarchical → network, each adding capability and complexity. Don't skip steps.

**Mixing patterns in a single system.** Production systems often mix patterns at different levels: a supervisor pattern at the top level (which of 3 teams should handle this request?), a swarm pattern within the research team (agents hand off based on what they find), and sequential execution within each agent (fixed research → analyze → summarize pipeline). This is normal — patterns are architectural choices at specific levels of the system, not global constraints. Design each level independently with the right pattern for that level's coordination requirements.

**AutoGen vs LangGraph for pattern implementation.** LangGraph is the production choice for supervisor and hierarchical patterns — its explicit state graph and Send API give you fine-grained control over routing and state management. AutoGen is better for conversational multi-agent patterns (swarm and network) where agents naturally communicate through messages rather than routing through a state graph. For the same supervisor pattern, LangGraph gives you more control; AutoGen gives you faster setup. Both are production-ready in 2026. Source: LangGraph multi-agent docs, AutoGen stable docs.


Error handling and reliability across patterns

**Error propagation differs by pattern.** In a supervisor system, an error in one worker reaches the supervisor's state — the supervisor can observe the error, route to a different worker, or escalate. In a swarm, an error in one agent means no handoff occurs — the task stalls unless you add an explicit timeout and fallback route. In a hierarchical system, an error in a worker propagates up to the team supervisor, which can retry within the team before escalating to the top supervisor. In a network, an error in one agent may leave a required fact unpublished, causing other agents to work without complete information. **Design error handling at the architecture level, not just at the tool level.**

**Supervisor error recovery.** Add explicit error states to the supervisor's routing logic: `def supervisor_node(state): response = llm.invoke(state['messages']); if is_error(state): route_decision = 'error_handler'; elif is_complete(state): route_decision = END; else: route_decision = parse_next_worker(response); return {'next': route_decision, 'messages': state['messages'] + [response]}`. The error_handler node can attempt to diagnose the error (ask the last worker what went wrong), route to a backup worker, or escalate to a human review queue. Building this error recovery logic in the supervisor is one of the key advantages of the supervisor pattern over swarm or network.

**Fallback agent patterns.** For mission-critical agent systems, define a fallback path for every primary path: `graph_builder.add_conditional_edges('researcher', route_researcher, {'writer': 'writer', 'error': 'fallback_researcher', END: END})`. The `fallback_researcher` is a simpler, more conservative agent (lower max_iter, smaller tool set, more explicit prompt instructions) that handles cases the primary researcher can't complete. A task that falls back costs more than one the primary completes, since you pay for the failed primary attempt plus the fallback run, but less than a full failure that requires human escalation. Budget: 5-10% of tasks hitting the fallback is acceptable; above 10% means the primary agent needs improvement. Source: LangGraph error handling.
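Outside any particular framework, the fallback path and the fallback-rate budget can be sketched in a few lines. `run_with_fallback` and `fallback_rate` are hypothetical helpers; the agents are modeled as plain callables that take a task string and either return a result or raise.

```python
from typing import Callable

def run_with_fallback(task: str,
                      primary: Callable[[str], str],
                      fallback: Callable[[str], str]) -> dict:
    """Try the primary agent; on any failure, run the conservative fallback."""
    try:
        return {"result": primary(task), "path": "primary"}
    except Exception as exc:  # broad on purpose: any primary failure triggers the fallback
        return {"result": fallback(task), "path": "fallback",
                "primary_error": repr(exc)}

def fallback_rate(runs: list) -> float:
    """Share of runs that needed the fallback; compare against the 10% budget."""
    return sum(run["path"] == "fallback" for run in runs) / len(runs)
```

Recording `path` on every run is what makes the 5-10% budget measurable: the rate is a single aggregation over the run log rather than something inferred from error counts.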

**Maximum turn limits across all patterns.** Every multi-agent system needs a global turn limit: not just per-agent recursion limits, but a total turn count across all agents in the system. Implement it as a counter in shared state, `class SharedState(TypedDict): total_turns: int`, with a module-level constant `MAX_TURNS = 30`. Every node increments `total_turns`; a terminal condition triggers when `total_turns >= MAX_TURNS`. This is the last line of defense against infinite loops that slip past individual agent recursion limits in complex systems with unexpected routing paths. As a conservative bound, if each turn costs as much as a full single-agent loop ($0.137 on Sonnet 4.6), a 30-turn global limit caps per-task cost at $4.11. Set your limit to 2x your expected normal task turn count.
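One way to enforce the global limit without touching each node's logic is a wrapper that counts turns and short-circuits once the cap is reached. This is a sketch with dict-based state; `with_turn_limit`, the `"END"` marker, and the `stop_reason` field are illustrative names, not framework API.

```python
MAX_TURNS = 30

def with_turn_limit(node):
    """Wrap a node so it increments total_turns and halts at MAX_TURNS."""
    def wrapped(state: dict) -> dict:
        turns = state.get("total_turns", 0)
        if turns >= MAX_TURNS:
            # Cap reached: route to the terminal state without calling the node.
            return {**state, "next": "END", "stop_reason": "max_turns"}
        update = node(state)
        return {**state, **update, "total_turns": turns + 1}
    return wrapped
```

Because the check runs before the node body, a system stuck in a routing loop stops after exactly `MAX_TURNS` node executions, and `stop_reason` distinguishes a forced stop from a normal completion in the logs.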

**Testing multi-agent coordination patterns.** Test each pattern at three levels: (1) unit tests for individual agents (does the researcher produce useful research? does the supervisor route correctly given specific state?), (2) integration tests for the full coordination flow (does the supervisor-to-worker handoff work end-to-end?), (3) chaos tests for error recovery (what happens when a worker fails mid-task? does the supervisor detect and recover?). For chaos testing: `def test_worker_failure(graph, monkeypatch): def fail(*args, **kwargs): raise ValueError('Tool failed'); monkeypatch.setattr(researcher_tool, 'run', fail); result = graph.invoke({'task': 'Test task'}); assert result['status'] != 'error'`. The failure is injected through a named helper because a `lambda` body cannot contain a `raise` statement. Chaos tests catch the failure modes that normal integration tests miss. Source: Langfuse eval tutorial for the full test harness.
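A self-contained variant of the chaos test, using only the standard library's `unittest.mock`, might look like this. `ResearchTool` and `run_pipeline` are toy stand-ins for a real tool and for `graph.invoke`; the `"degraded"` status is an assumed convention for "failed but recovered".

```python
from unittest import mock

class ResearchTool:
    def run(self, query: str) -> str:
        return f"notes on {query}"

def run_pipeline(tool: ResearchTool, task: str) -> dict:
    """Toy stand-in for graph.invoke: research, degrading gracefully on failure."""
    try:
        return {"status": "ok", "output": tool.run(task)}
    except ValueError as exc:
        # Recovery path: report a degraded result instead of crashing the run.
        return {"status": "degraded", "output": "", "error": str(exc)}

def test_worker_failure():
    tool = ResearchTool()
    # Inject the failure: every call to tool.run now raises.
    with mock.patch.object(tool, "run", side_effect=ValueError("Tool failed")):
        result = run_pipeline(tool, "Test task")
    assert result["status"] == "degraded"
    assert "Tool failed" in result["error"]

test_worker_failure()
```

Using `side_effect` with an exception instance makes the mock raise on every call, which is the usual way to simulate a failing dependency.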


State management and communication across patterns

**State design is the most critical architectural decision across all coordination patterns.** In a supervisor system, the state carries the supervisor's routing history, all worker outputs, and the accumulated conversation. In a swarm, state carries handoff context between agents. In a hierarchical system, state is layered — each level has its own scope. In a network, state is the shared blackboard all agents read and write. The wrong state design causes either information loss (agents can't access what they need) or information overflow (agents drown in context they don't need). Design state schema before designing agents. Source: LangGraph state management.

**Supervisor state design.** The minimal supervisor state: `class SupervisorState(TypedDict): task: str; messages: Annotated[list, operator.add]; next_worker: str; worker_outputs: dict[str, str]; turn_count: int`. The `worker_outputs` dict stores each worker's result keyed by worker name — the supervisor can reference prior workers' outputs when making routing decisions without replaying full conversation history. The `turn_count` provides the global termination condition. Keep the `messages` list for full audit trail; use `worker_outputs` for the supervisor's routing context. This separates 'what happened' (messages) from 'what the supervisor needs to know' (worker_outputs).

**Swarm state design: the handoff payload.** In swarm patterns, the state must carry enough context for the receiving agent to continue without re-requesting information the prior agent already collected. Minimum handoff state: `class SwarmState(TypedDict): task: str; current_agent: str; visited_agents: list[str]; context: str; partial_result: str`. The `context` field is the accumulated knowledge — each agent appends its findings. The `visited_agents` list prevents routing loops (an agent that has been visited twice refuses the next handoff). The `partial_result` carries in-progress output so the receiving agent can extend it rather than start fresh. Keep handoff payloads compact — target 300-500 tokens of context, not full conversation histories.

**Hierarchical state scoping.** Each level of a hierarchical system should have access only to the state relevant to its decisions. A top-level supervisor doesn't need individual worker outputs — it needs team summaries. A team supervisor doesn't need other teams' internal states — it needs the task assignment from the top supervisor and its workers' outputs. Implement state scoping with separate TypedDict classes: `class TeamState(TypedDict): team_task: str; worker_results: list[str]; team_summary: str`. The top-level supervisor receives `team_summary` only, not `worker_results`. This scoping prevents context overflow at higher levels and is the key architectural difference between a well-designed hierarchical system and a poorly-designed one that passes everything everywhere.
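The scoping rule can be sketched with two TypedDicts and a single function that controls what crosses the team boundary. `TopState`, `summarize`, and `to_top_state` are hypothetical names; the truncating `summarize` is a placeholder for what would normally be an LLM call.

```python
from typing import TypedDict

class TeamState(TypedDict):
    team_task: str
    worker_results: list
    team_summary: str

class TopState(TypedDict):
    task: str
    team_summaries: dict

def summarize(worker_results: list, max_chars: int = 200) -> str:
    """Placeholder summarizer: in practice this would be an LLM call."""
    return " | ".join(worker_results)[:max_chars]

def to_top_state(task: str, teams: dict) -> TopState:
    # Only team_summary crosses the boundary; worker_results stay team-scoped.
    return {
        "task": task,
        "team_summaries": {name: team["team_summary"] for name, team in teams.items()},
    }
```

Routing everything through one boundary function gives you a single place to audit what the top supervisor can see, and a single place to log the full `worker_results` for later inspection.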

**Shared memory patterns in agent networks.** For decentralized agent networks, use a structured shared memory object that agents can read and append to: `class NetworkMemory: facts: list[Fact]; pending_questions: list[str]; completed_analyses: list[Analysis]`. Each `Fact` has: `{content: str, source_agent: str, confidence: float, timestamp: float}`. Agents search the `facts` list before making tool calls — if a fact already exists with confidence > 0.8, no new lookup is needed. This caching behavior naturally reduces redundant tool calls in networks where multiple agents might otherwise look up the same information. The shared memory pattern trades coordination overhead (all agents read the full memory at each step) for quality (no duplicate work). Source: AutoGen shared memory patterns.
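The check-before-lookup behavior can be sketched as follows. The `Fact` fields follow the description above; the `lookup` and `get_or_fetch` methods, the substring keyword match, and the fixed 0.9 confidence assigned to fresh lookups are all illustrative assumptions.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Fact:
    content: str
    source_agent: str
    confidence: float
    timestamp: float = field(default_factory=time.time)

@dataclass
class NetworkMemory:
    facts: list = field(default_factory=list)

    def lookup(self, keyword: str, min_confidence: float = 0.8):
        """Return the most confident fact mentioning keyword, or None."""
        matches = [f for f in self.facts
                   if keyword.lower() in f.content.lower()
                   and f.confidence > min_confidence]
        return max(matches, key=lambda f: f.confidence, default=None)

    def get_or_fetch(self, keyword: str, agent: str, fetch) -> Fact:
        """Reuse a confident fact if one exists; otherwise call the tool once."""
        cached = self.lookup(keyword)
        if cached is not None:
            return cached
        # Assumed confidence for a fresh tool result.
        fact = Fact(content=fetch(keyword), source_agent=agent, confidence=0.9)
        self.facts.append(fact)
        return fact
```

A second agent asking about the same keyword gets the stored `Fact` back, with `source_agent` showing who originally looked it up, and the tool is not called again.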

**State persistence and recovery.** Use LangGraph's checkpointer for state persistence across all coordination patterns; this enables mid-run recovery if an agent fails. With `AsyncPostgresSaver`, the graph checkpoints the full state after every super-step (each round of node executions). If a worker fails at step 3 of 5, re-invoking the graph with the same `thread_id` resumes from the last checkpoint (step 2 complete) rather than restarting from scratch. For long-running hierarchical tasks (10+ minute research pipelines), checkpoint recovery saves both time and cost. `from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver; async with AsyncPostgresSaver.from_conn_string(DB_URL) as checkpointer: await checkpointer.setup(); graph = supervisor_graph.compile(checkpointer=checkpointer)`. The `setup()` call creates the checkpoint tables and is needed the first time the saver is used against a database. Add checkpoint IDs to your monitoring dashboard: a task with 10+ checkpoints and no completion is likely stuck and needs investigation. Source: LangGraph persistence docs.
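To make the resume semantics concrete, here is a toy, framework-free model of checkpoint-and-resume. It is not how LangGraph implements its checkpointer: `run_with_checkpoints` and the in-memory `store` dictionary are hypothetical stand-ins for the graph and the Postgres tables.

```python
from typing import Callable

State = dict
Step = Callable[[State], State]


def run_with_checkpoints(steps: list[Step], store: dict, run_id: str, state: State) -> State:
    """Run steps in order, saving state after each one; resume from the last save."""
    # If a checkpoint exists for this run, start from it instead of the input state.
    done, state = store.get(run_id, (0, state))
    for index in range(done, len(steps)):
        state = steps[index](state)         # may raise; earlier checkpoints survive
        store[run_id] = (index + 1, state)  # checkpoint after every completed step
    return state
```

Calling the function a second time with the same `run_id` after a failure skips the steps that already completed, which is the behavior the paragraph above describes for a worker failing at step 3 of 5. Steps should return new state objects instead of mutating the old one so that saved checkpoints stay intact.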


Real production examples and benchmarks

**Production supervisor: customer support routing.** A SaaS company routes customer support tickets to specialized agents: billing questions → BillingAgent (tools: billing API), technical issues → TechAgent (tools: diagnostic API, knowledge base search), feature requests → ProductAgent (tools: product roadmap database). Supervisor model: GPT-5.4-mini ($0.001/routing decision). Worker models: Sonnet 4.6 ($0.030-$0.063/worker call; the estimate below uses $0.040). Share of tickets by worker invocations: 70% need 2, 20% need 1, 10% need 3 or more (costed as 3). Average per-ticket cost: 0.7 × (2 × $0.040) + 0.2 × $0.040 + 0.1 × (3 × $0.040) = $0.056 + $0.008 + $0.012 = $0.076 in worker calls, plus $0.007 supervisor overhead = $0.083/ticket. At 50,000 tickets/month: $4,150/month. Compared to a $0.137 single-agent loop for all tickets ($6,850/month), **the supervisor pattern saves $2,700/month by routing easy tickets to fewer worker calls.**
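The per-ticket arithmetic above can be reproduced with a few lines of Python, which also makes it easy to plug in your own ticket mix. The constant names are illustrative, and "3+ invocations" is costed as exactly 3, as in the calculation above.

```python
WORKER_CALL_COST = 0.040     # USD per worker call, the figure used in the estimate
SUPERVISOR_OVERHEAD = 0.007  # USD of supervisor routing per ticket
SINGLE_AGENT_COST = 0.137    # USD per ticket for the single-agent loop
TICKETS_PER_MONTH = 50_000

# worker invocations per ticket -> share of tickets
distribution = {2: 0.70, 1: 0.20, 3: 0.10}

worker_cost = sum(share * calls * WORKER_CALL_COST for calls, share in distribution.items())
per_ticket = worker_cost + SUPERVISOR_OVERHEAD
monthly_supervisor = per_ticket * TICKETS_PER_MONTH
monthly_single = SINGLE_AGENT_COST * TICKETS_PER_MONTH

print(f"per ticket: ${per_ticket:.3f}")
print(f"monthly: ${monthly_supervisor:,.0f} supervisor vs ${monthly_single:,.0f} single-agent")
print(f"savings: ${monthly_single - monthly_supervisor:,.0f}/month")
```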

**Production swarm: research exploration.** An academic research assistant uses a swarm of specialized agents: LiteratureAgent (searches arXiv, Google Scholar), DataAgent (queries datasets, computes statistics), SynthesisAgent (synthesizes findings into structured summaries), CitationAgent (formats and validates citations). Each agent decides when to hand off to another based on what it needs. An average research query involves 4 agent activations and 2 handoffs: handoff overhead $0.010, worker costs 4 × $0.050 = $0.200. **Total: $0.210/query**, comparable to a supervisor for the same task but faster (agents activate in parallel when their topic appears). At 5,000 research queries/month: $1,050/month on Sonnet 4.6.

**Production hierarchical: enterprise market analysis.** A fintech company runs daily market analysis using 12 specialized agents organized into 3 teams: Data Team (price fetcher, news aggregator, SEC filing reader), Analysis Team (quantitative analyst, risk assessor, trend identifier), Reporting Team (summary writer, chart generator, distribution manager). The top supervisor routes between teams; team supervisors route between workers. Average daily run: 3 team activations × 4 worker activations each = 12 worker calls, plus 3 team-supervisor calls and 1 top-supervisor call, for 16 LLM calls in total. Cost: 12 × $0.063 workers + 4 × $0.007 supervisors = $0.756 + $0.028 = **$0.784/daily run** on Sonnet 4.6. That is about $24/month (30 runs) for daily enterprise-grade market reports, which is highly cost-effective for the output quality.

**Benchmark: SWE-bench-style agent task across 4 configurations.** A software engineering task (implement a small feature in a Python codebase, write tests): single-agent Opus 4.7 = $0.683 (76% completion rate on SWE-bench-style tasks). Supervisor pattern (architect supervisor + coder worker + tester worker, Sonnet 4.6) = $0.218 (68% completion rate: a weaker model, but about 3x cheaper). Swarm pattern (coder → tester → debugger, Sonnet 4.6) = $0.195 (65% completion rate). Hierarchical (2-level, 5 workers, Sonnet 4.6) = $0.580 (71% completion rate: more workers improve quality, but at a cost). **Cost-adjusted quality (completion rate as a fraction / cost in USD × 100): single Opus = 111, supervisor Sonnet = 312, swarm Sonnet = 333, hierarchical Sonnet = 122.** Swarm and supervisor on Sonnet 4.6 deliver roughly 3x the cost-adjusted value of single Opus for software engineering tasks.
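The cost-adjusted quality scores follow directly from the completion rates and costs quoted above. A short sketch of the calculation, with `cost_adjusted_quality` as an illustrative helper name:

```python
# configuration -> (completion rate as a fraction, cost per task in USD)
results = {
    "single Opus": (0.76, 0.683),
    "supervisor Sonnet": (0.68, 0.218),
    "swarm Sonnet": (0.65, 0.195),
    "hierarchical Sonnet": (0.71, 0.580),
}


def cost_adjusted_quality(completion_rate: float, cost_usd: float) -> float:
    """Completed tasks per 100 USD spent."""
    return completion_rate / cost_usd * 100


scores = {name: round(cost_adjusted_quality(rate, cost)) for name, (rate, cost) in results.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score}")
```

Read this way, the metric is the number of successfully completed tasks per $100 of spend, so it rewards cheap configurations even when their raw completion rate is lower.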

**The emerging pattern: adaptive coordination.** A small number of production systems in 2026 use adaptive coordination — starting with a single agent and dynamically spawning additional agents when the task proves complex. Initial agent attempts the task; if stuck after N turns or confidence drops below a threshold, it spawns a helper agent with a complementary skill set. The coordination pattern emerges from task complexity rather than being predetermined. This is the most token-efficient pattern for tasks with bimodal complexity (most are easy, a few are very hard): easy tasks complete with 1 agent, hard tasks spawn 2-3. Average cost: 0.8 × $0.137 (single agent) + 0.2 × $0.350 (spawned helper) = $0.110 + $0.070 = **$0.180/query** — cheaper than static supervisor for the same task distribution. Source: LangGraph dynamic graph construction docs.
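A minimal sketch of the escalation rule, assuming each agent is a plain callable that reports its own confidence and turn count. The `Attempt` type, the `solve_adaptively` function, and both thresholds are hypothetical; real systems would tune them per task.

```python
from typing import Callable, NamedTuple


class Attempt(NamedTuple):
    answer: str
    confidence: float
    turns: int


def solve_adaptively(
    task: str,
    primary: Callable[[str], Attempt],
    helper: Callable[[str, str], Attempt],
    max_turns: int = 6,           # assumption: "stuck after N turns"
    min_confidence: float = 0.7,  # assumption: confidence threshold
) -> tuple[str, int]:
    """Try one agent first; spawn a helper only if the attempt looks stuck."""
    first = primary(task)
    stuck = first.turns >= max_turns or first.confidence < min_confidence
    if not stuck:
        return first.answer, 1  # easy task: one agent was enough
    # Hard task: the helper sees the task plus the first agent's partial answer.
    second = helper(task, first.answer)
    return second.answer, 2
```

The second element of the return value is the number of agents used, which is the quantity to log when checking whether your task mix really is bimodal.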

Implement your multi-agent coordination pattern in 5 steps

  1. Choose your pattern based on task structure, not architectural preference

    Answer four questions: (1) Is the step sequence fixed? → sequential. (2) Does a controller decide routing? → supervisor. (3) Do agents hand off peer-to-peer? → swarm. (4) Are there 10+ agents that need team organization? → hierarchical. Don't pick based on what sounds most impressive. Start with the simplest pattern that fits the task structure. You can always add complexity; removing it is harder. The supervisor pattern covers 70% of production multi-agent use cases that can't be handled by simple sequential execution.

  2. Implement the core routing logic with explicit state

    For supervisor: define a `next_worker` field in state and a supervisor node that populates it, plus a conditional edge that routes to the named worker or END. For swarm: give each agent a `handoff_to_*` tool and a routing function that detects handoff tool calls. For hierarchical: build team graphs first, then compose them as nodes in the top-level graph. Always add a `total_turns` counter to state and a termination condition at your budget maximum. The routing logic is the most important part — debug it first with mock agents before connecting real LLMs.

  3. Add error handling at the coordination level

    For supervisors: add an error state to routing and an error_handler node that can retry or escalate. For swarms: add a timeout mechanism (if no handoff within N turns, route to a fallback agent). For hierarchical: verify that team-level supervisors catch worker errors before propagating up. Add try/except around every agent invocation with error messages that feed back to the coordinating agent. A multi-agent system without coordination-level error handling fails ungracefully on the first tool error in any worker.

  4. Profile cost per pattern on your actual task distribution

    Run 30 representative tasks through each candidate pattern. Log total LLM calls, input tokens, output tokens, and wall time per task. Calculate: coordinator overhead (supervisor/hierarchical calls), worker execution cost, inter-agent message overhead, and synthesis cost. Compare to the single-agent baseline. The right pattern is the cheapest one that meets your quality bar — not the most architecturally sophisticated. Most teams are surprised by how much the coordinator overhead adds up at scale.

  5. Instrument with Langfuse and run a baseline eval

    Add the Langfuse callback handler to every agent's LLM. Run your golden dataset through the multi-agent system. Score with LLM-as-judge on your quality dimensions. Compare to the single-agent baseline score. If the multi-agent system doesn't beat single-agent quality on your dataset, reconsider the pattern choice — the coordination overhead is only worth paying for a measurable quality improvement. See the agent eval with Langfuse tutorial for the full eval setup: /tutorial/agent-eval-with-langfuse.
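Steps 2 and 3 above suggest debugging the routing logic with mock agents before connecting real LLMs. The sketch below does that without any framework: `run`, `mock_supervisor`, `MAX_TURNS`, and the `last_error` field are hypothetical names, and a plain loop stands in for the graph's conditional edges.

```python
from typing import Callable, TypedDict


class SupervisorState(TypedDict):
    task: str
    next_worker: str
    total_turns: int
    results: list[str]
    last_error: str


END = "END"
MAX_TURNS = 8  # hypothetical budget maximum

Worker = Callable[[SupervisorState], str]
Supervisor = Callable[[SupervisorState], str]


def run(state: SupervisorState, supervisor: Supervisor, workers: dict[str, Worker]) -> SupervisorState:
    """Route until the supervisor says END or the turn budget is spent."""
    while state["total_turns"] < MAX_TURNS:
        choice = supervisor(state)
        state = {**state, "next_worker": choice, "total_turns": state["total_turns"] + 1}
        if choice == END:
            break
        try:
            output = workers[choice](state)
            state = {**state, "results": state["results"] + [output], "last_error": ""}
        except Exception as exc:
            # Feed the failure back so the supervisor can retry or escalate.
            state = {**state, "last_error": f"{choice}: {exc}"}
    return state


def mock_supervisor(state: SupervisorState) -> str:
    """Stand-in for the supervisor LLM: escalate on error, stop once there is a result."""
    if state["last_error"]:
        return "fallback"
    return END if state["results"] else "billing"
```

Because the supervisor and workers are ordinary callables, the same loop can be exercised with a worker that always raises, or a supervisor that never returns `END`, to confirm that the fallback route and the turn budget behave as intended.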

Frequently Asked Questions

What is the supervisor multi-agent pattern?

In the supervisor pattern, a central controller LLM receives the task, routes to specialized worker agents based on what needs to be done, receives worker results, and decides whether to continue routing or terminate. All coordination flows through the supervisor — workers never communicate directly. It's the most common production multi-agent pattern because it's explicit, debuggable, and easy to extend. LangGraph implements it via a supervisor node with conditional edges that route to worker nodes. Source: LangGraph multi-agent concepts at langchain-ai.github.io/langgraph/concepts/multi_agent/.

What is the difference between supervisor and swarm multi-agent patterns?

In supervisor, a central controller decides which agent works next — all routing decisions are explicit and logged. In swarm, agents hand off directly to each other peer-to-peer based on task context — no central controller. Supervisor is more debuggable and controllable; swarm has lower coordination overhead and more natural parallelism. Supervisor costs ~1.8-3x a single agent; swarm costs ~1.5-2.5x. Use supervisor when you need explicit routing control and audit trails; use swarm for dynamic workflows where any agent might hand off to any other based on what they discover.

When should I use the hierarchical multi-agent pattern?

When you have 10+ specialized agents that can't be effectively managed by a single supervisor. Hierarchical nests supervisors — a top-level supervisor delegates to team supervisors (research team, writing team, review team), which delegate to individual workers. The extra coordination levels add 2.5-5x cost vs a single agent (vs supervisor's 1.8-3x) and are only justified when the team structure genuinely reflects distinct specializations with their own tools, models, and workflows. For teams of under 6 agents, a single supervisor handles coordination efficiently. Source: LangGraph hierarchical patterns at langchain-ai.github.io/langgraph/concepts/multi_agent/.

Is the agent network pattern production-ready in 2026?

Partially. Fully decentralized agent networks (no coordinator, pure shared-memory communication) remain in the research category in 2026 due to high cost (3-10x single-agent) and poor debuggability (emergent routing is hard to trace). Orchestrated network hybrids — a lightweight coordinator monitoring a shared state bus — are production-ready and used in enterprise research and simulation systems. For most SaaS applications, supervisor or swarm patterns deliver adequate quality at lower cost and higher debuggability. Use networks for research tasks with unknown exploration paths; use supervisor/swarm for production business processes.

How do I handle errors in multi-agent coordination patterns?

Design error handling at the architecture level: in supervisor, add an explicit error state to routing logic and an error_handler node that can retry or escalate; in swarm, add timeout mechanisms (if no handoff within N turns, route to a fallback); in hierarchical, ensure team supervisors catch worker errors before they propagate up. Add `handle_tool_errors=True` to all ToolNodes. Add a global `total_turns` counter with a termination condition. Without coordination-level error handling, a single tool error in any worker causes graceless failure of the entire multi-agent run.

How does LangGraph implement multi-agent coordination?

LangGraph implements multi-agent coordination via: (1) supervisor nodes with conditional edges that route based on state (for supervisor pattern), (2) `Send` API for parallel worker dispatch and true concurrent execution, (3) subgraphs as worker nodes (each worker is a compiled LangGraph graph used as a node in the supervisor graph), (4) shared TypedDict state that all agents read and write, with Annotated fields for append semantics. For swarm: each agent has handoff tools that output routing signals, detected by conditional edges. Source: LangGraph multi-agent concepts at langchain-ai.github.io/langgraph/concepts/multi_agent/.

What is the cheapest multi-agent pattern for production?

For equivalent task quality, the cost order from cheapest to most expensive is: sequential/swarm (1.5-2.5x single agent) < supervisor (1.8-3x) < hierarchical (2.5-5x) < network (3-10x). For most production tasks at 100K tasks/month on Sonnet 4.6, swarm and supervisor cost $14,000-$23,000 vs single-agent $13,700, a 2-70% premium for multi-agent quality improvements. Always benchmark against the single-agent baseline first: if the quality of a 3-worker Sonnet supervisor ($23,000/month) matches that of a single Opus 4.7 loop ($68,250/month), the supervisor pattern saves 66%. The cheapest pattern depends on your specific quality requirements and task distribution.

How does AutoGen's multi-agent pattern differ from LangGraph's?

LangGraph uses an explicit state graph: nodes transform state, and edges route between nodes based on conditions. It's highly structured, gives you fine-grained routing control, and is best for supervisor and hierarchical patterns. AutoGen uses a conversational actor model: agents send messages to each other, and a group chat manager (`GroupChatManager` in AutoGen 0.2, `SelectorGroupChat` in the AgentChat API of 0.4+) decides who speaks next. It's more natural for swarm-style peer-to-peer workflows and conversational multi-agent systems. Both support multiple coordination patterns, but LangGraph excels at structured routing; AutoGen excels at conversational multi-agent flows. For production in 2026: LangGraph for supervisor/hierarchical, AutoGen for swarm/network. Source: AutoGen stable docs at microsoft.github.io/autogen/stable/.
