The core question: does your task benefit from parallelization or specialization?
The multi-agent pattern earns its overhead in exactly two circumstances: when your task has truly independent subtasks that can run in parallel (cutting wall-clock time by up to a factor equal to the degree of parallelism), and when your task benefits from context isolation (each agent starts fresh, avoiding the quality degradation and increased token cost of a long accumulated context). Every other argument for multi-agent ('it's more modular,' 'each agent has a clear role,' 'it's more like how teams work') is aesthetic, not economic.
**Parallelism requires genuine independence.** Two subtasks are truly independent if neither consumes the other's output, directly or through an intermediate step. Searching for news about five different companies simultaneously is parallel: no search depends on the others. Searching for news, then summarizing findings, then writing a report is sequential: each step depends on the prior one. If your DAG (directed acyclic graph) of tasks is a straight line, multi-agent gives you coordination overhead with zero parallelism benefit.
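One way to make the "straight line" test concrete is to group a task graph into waves of tasks that can run at the same time and look at the widest wave. The sketch below is illustrative only: it uses the standard-library `graphlib` module, and the task names, the `parallel_waves` and `max_parallelism` helpers, and the two example graphs are hypothetical, chosen to mirror the news-search examples above.

```python
from graphlib import TopologicalSorter


def parallel_waves(deps):
    """Group tasks into waves. Every task in a wave has all of its
    prerequisites in earlier waves, so a wave can run concurrently.

    `deps` maps each task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


def max_parallelism(deps):
    """Width of the widest wave: a simple estimate of usable parallelism."""
    return max((len(wave) for wave in parallel_waves(deps)), default=0)


# Hypothetical sequential pipeline: search, then summarize, then report.
pipeline = {
    "search": set(),
    "summarize": {"search"},
    "report": {"summarize"},
}

# Hypothetical fan-out: five independent searches feeding one report.
fan_out = {f"search_{name}": set() for name in ["a", "b", "c", "d", "e"]}
fan_out["report"] = set(fan_out)  # the report depends on all five searches

print(max_parallelism(pipeline))  # 1: a straight line, nothing to parallelize
print(max_parallelism(fan_out))   # 5: the searches can run side by side
```

A width of 1 means extra agents add coordination cost without shortening wall-clock time. A width above 1 only says parallelism is available, not that it is worth the overhead.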
Context isolation pays off when your task has enough sequential steps that the accumulated input context becomes expensive or quality-degrading. **A common rule of thumb: roughly 15-25 sequential steps**, beyond which a single agent's growing context costs more per turn (in tokens and quality) than the coordination overhead of splitting the task across agents with clean contexts. Below that threshold, a single agent wins on simplicity every time.
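The token side of this trade-off can be sketched with a toy cost model. It assumes every step appends a fixed number of tokens to the context, every turn re-sends the whole context as input, and every agent in a split setup starts from a fixed-size briefing or handoff summary. The function names and all numbers (500 tokens per step, a 2,000-token handoff) are illustrative assumptions, not measurements, and the model ignores output tokens and quality effects entirely.

```python
def single_agent_input_tokens(steps, tokens_per_step):
    """Total input tokens when one agent re-reads its growing context.

    On turn i the context holds i steps' worth of tokens, so the total
    grows roughly with the square of the number of steps.
    """
    return sum(turn * tokens_per_step for turn in range(1, steps + 1))


def split_input_tokens(steps, tokens_per_step, agents, handoff_tokens):
    """Total input tokens when the steps are divided across several agents.

    Each agent starts from a briefing of `handoff_tokens` (an assumed
    fixed size) and accumulates context only for its own share of steps.
    """
    base, extra = divmod(steps, agents)
    total = 0
    for index in range(agents):
        share = base + (1 if index < extra else 0)  # spread the remainder
        total += sum(
            handoff_tokens + turn * tokens_per_step
            for turn in range(1, share + 1)
        )
    return total


# Long task: 20 steps. Splitting reduces total input tokens in this model.
print(single_agent_input_tokens(20, 500))    # 105000
print(split_input_tokens(20, 500, 2, 2000))  # 95000

# Short task: 6 steps. The handoff cost outweighs the savings.
print(single_agent_input_tokens(6, 500))     # 10500
print(split_input_tokens(6, 500, 2, 2000))   # 18000
```

Where the crossover falls depends heavily on the handoff size and per-step growth, so these figures should be read as a way to reason about your own workload rather than as support for any particular threshold.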
The specialization argument is valid only when genuinely different capabilities are required. If step 3 of your task requires adversarial fact-checking that must be independent of step 1's generative work, a separate fact-checker agent with a clean context provides real quality value. If steps 1 through 5 all require the same kind of reasoning (summarization, extraction, Q&A), separate agents just multiply your API calls without adding capability.
When the task offers no parallelism and gains nothing from context isolation or specialization, the multi-agent pattern is pure overhead: 10-30% more tokens spent on orchestration messages, summaries, and handoffs; a more complex observability setup; harder debugging; and more failure modes to manage. The simplest architecture that meets your requirements is almost always the right architecture. Start with a single agent and refactor to multi-agent when you have concrete evidence of a bottleneck that multi-agent solves.
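The criteria in this section can be collapsed into a small checklist. The sketch below is one possible encoding, not a definitive rule: `TaskProfile`, `recommend_architecture`, and the default `step_threshold` of 15 (the low end of the range mentioned above) are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    max_parallel_subtasks: int      # widest set of mutually independent subtasks
    sequential_steps: int           # length of the longest dependent chain
    needs_independent_review: bool  # e.g. a fact-checker that must not see the generator's context


def recommend_architecture(task, step_threshold=15):
    """Return ("single agent" | "multi-agent", reasons).

    Defaults to a single agent and recommends multi-agent only when at
    least one concrete justification applies. The threshold is an assumption.
    """
    reasons = []
    if task.max_parallel_subtasks > 1:
        reasons.append("parallelism")
    if task.sequential_steps >= step_threshold:
        reasons.append("context isolation")
    if task.needs_independent_review:
        reasons.append("specialization")
    return ("multi-agent" if reasons else "single agent", reasons)


# A short, linear summarization task: no justification for splitting.
print(recommend_architecture(TaskProfile(1, 5, False)))

# Five independent company searches: parallelism is the justification.
print(recommend_architecture(TaskProfile(5, 3, False)))
```

Returning the list of reasons alongside the recommendation keeps the decision auditable: if the list is empty, the only remaining arguments for multi-agent are the aesthetic ones.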