Multi-Agent Architecture: Patterns, Tradeoffs, and Reference Designs
Learn when multi-agent architecture outperforms single-agent systems, which coordination patterns fit best, and how to manage context, reliability, security, and cost.

Guide coverage
Architecture
Agent News Watch for teams building and operating AI agents.
Start with one agent whenever you can. Add multiple agents only when specialization, context isolation, or parallel work makes the system easier to understand and operate, not simply because the pattern sounds more advanced.
Multi-agent architecture is the design choice to split one workflow across multiple specialized agents or agent-like roles. The point is not novelty; it is deciding when separate planners, researchers, reviewers, routers, or executors reduce confusion enough to justify the added coordination cost. If the workflow itself is not settled yet, start with AI Agent Use Cases, then move to AI Agent Architecture for the broader system map. If you are comparing workflow-control layers, keep AI Agent Orchestration close as well.
This page sits between architecture, orchestration, and protocol decisions. The live A2A v1.0.0 brief matters because it turns cross-service delegation into a clearer protocol question. The Google ADK 2.0 alpha brief matters because workflow runtimes and task APIs make multi-agent systems more inspectable. But the hardest question still comes first: should the system be multi-agent at all?
When to use multi-agent architecture and when not to
Multi-agent design is justified when one bounded agent can no longer carry the workflow cleanly. The clearest signals are parallelizable subtasks, sharply different skill domains, separate ownership boundaries, or enough context volume that one prompt loop becomes noisy and fragile.
| Signal | One agent is enough | Multiple agents may help |
| --- | --- | --- |
| Task decomposition | One role can own the job end to end | Distinct planner, researcher, reviewer, or executor roles emerge |
| Context size | Shared context stays small and relevant | Each role needs its own working context window |
| Tool surface | One bounded tool set is easy to govern | Different roles need very different tools or permissions |
| Parallel work | Steps run mostly in sequence | Independent subtasks can run concurrently |
| Ownership and operations | One team owns the full workflow | Different teams or services own different capabilities |
That rubric keeps teams from splitting too early. If the real problem is only better validation, a tighter prompt, or one approval gate, a single-agent architecture usually wins. Reach for multiple agents only when separation removes complexity instead of hiding it.
The four patterns builders actually use
Supervisor and subagents
A supervisor pattern works when one coordinator decides which specialist to call next and what the success criteria should be. It is especially useful when the workflow needs role separation but still benefits from one top-level view of progress and task state.
```
Orchestrator-worker pattern

User goal
  -> supervisor agent
     -> researcher agent (collect evidence)
     -> analyst agent (compare options)
     -> executor agent (prepare approved action)
  -> supervisor reviews outputs
  -> human approval or final response
```
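The supervisor loop above can be sketched in a few lines. This is a minimal illustration, not a framework API: the specialist functions, the shared-state dictionary, and the review check are all hypothetical stand-ins for real model calls.

```python
from typing import Callable

# Hypothetical specialist roles. Each takes the shared task state owned
# by the supervisor and returns an updated copy plus its artifact.
def researcher(state: dict) -> dict:
    return {**state, "evidence": ["source-1", "source-2"]}

def analyst(state: dict) -> dict:
    return {**state, "comparison": f"ranked {len(state['evidence'])} sources"}

def executor(state: dict) -> dict:
    return {**state, "draft_action": "prepared (pending approval)"}

def supervise(goal: str, pipeline: list[Callable[[dict], dict]]) -> dict:
    """Supervisor owns the shared task state, calls one specialist at a
    time, and reviews the combined output before declaring success."""
    state = {"goal": goal, "approved": False}
    for step in pipeline:
        state = step(state)
    # Review gate: only approve when every expected artifact is present.
    state["approved"] = all(
        key in state for key in ("evidence", "comparison", "draft_action")
    )
    return state

result = supervise("compare vendors", [researcher, analyst, executor])
```

The useful property is that the supervisor is the single writer of shared state, so the task's progress is always inspectable in one place.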
Router and parallel dispatch
A router pattern works when the workflow can fan out to multiple specialists or tools in parallel, then merge the results. The key decision is whether the merge logic stays deterministic or whether another agent needs to reconcile conflicting outputs.
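A deterministic merge can be sketched with plain `asyncio`. The branch function, source names, and scoring are illustrative assumptions; the point is the merge rule, which sorts by score and breaks ties by label so repeated runs produce identical output.

```python
import asyncio

# Hypothetical specialist branch: returns (label, score) candidates.
async def search_branch(query: str, source: str) -> list[tuple[str, float]]:
    await asyncio.sleep(0)  # stand-in for real network or tool I/O
    return [(f"{source}:{query}", 1.0 if source == "docs" else 0.5)]

async def route_and_merge(query: str, sources: list[str]) -> list[str]:
    """Fan out to every source concurrently, then merge deterministically:
    highest score first, ties broken alphabetically by label."""
    branches = [search_branch(query, s) for s in sources]
    results = await asyncio.gather(*branches)
    flat = [item for branch in results for item in branch]
    flat.sort(key=lambda item: (-item[1], item[0]))
    return [label for label, _ in flat]

merged = asyncio.run(route_and_merge("rate limits", ["docs", "tickets", "wiki"]))
```

If conflicting outputs cannot be reconciled by a rule like this, that is the signal a reconciling agent (and its extra cost) is genuinely needed.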
Stateful handoffs
Stateful handoffs work when each phase of the workflow needs a different role, but only one role should be active at a time. The critical design choice is the handoff packet: what state moves forward, what stays local, and what evidence the next role needs to trust the transition.
```
Stateful handoff pattern

Planner agent
  -> creates task brief + constraints + required evidence
  -> hands packet to specialist agent
Specialist agent
  -> executes within scoped tools and local context
  -> returns artifact + status + unresolved questions
Reviewer agent or human
  -> approves, routes back, or closes task
```
Skills and progressive disclosure
Skills patterns keep one top-level agent in charge while smaller specialist routines or agents activate only when needed. This is often the right middle ground for teams that want specialization without turning the whole system into a network of peer agents.
| Pattern | Best fit | State model | Main upside | Main failure mode |
| --- | --- | --- | --- | --- |
| Supervisor + subagents | Role-specialized delivery workflows | Supervisor owns shared task state | Clear coordination point | Overloaded supervisor |
| Router + fan-out | Parallel search, retrieval, or analysis | Per-branch local state + merge state | Latency and coverage gains | Expensive merge or duplicate work |
| Stateful handoffs | Staged workflows with explicit transitions | Handoff packet plus agent-local state | Strong boundary control | Missing context at transition |
| Skills / progressive use | One main agent with occasional specialist help | Mostly single-agent, optional local state | Lower complexity than full network | Specialization stays too implicit |
How to design context, memory, and state boundaries
Most multi-agent failures are state failures before they are model failures. The system either shares too much context and bloats every prompt, or it shares too little and forces agents to reconstruct the task from scratch. Good multi-agent design treats state as a first-class artifact.
| State bucket | What belongs there | What to avoid |
| --- | --- | --- |
| Global task state | Task goal, policies, latest approved artifact | Every intermediate chain of thought |
| Agent-local working state | Temporary notes, scratch work, tool-specific data | Long-lived facts other agents depend on |
| Handoff packet | Constraints, evidence, open questions, next step | Vague summaries with no source grounding |
| Durable memory | Reusable facts explicitly approved for reuse | Speculative conclusions from one run |
The handoff packet is where most teams either win or lose. It should contain the minimum evidence the next agent needs, the exact constraint set that still applies, and the unresolved questions that require attention. If the packet is only a vague summary, the system turns into a game of telephone.
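One way to make that concrete is to give the packet an explicit schema instead of passing free-form text. The field names below are illustrative, not a standard; the one design rule worth copying is that every claim carries a source, so the receiving agent can challenge the evidence rather than inherit it.

```python
from dataclasses import dataclass

@dataclass
class HandoffPacket:
    """Minimum state that moves forward at a transition.
    Field names are a sketch, not a published schema."""
    task_goal: str
    constraints: list[str]
    evidence: dict[str, str]       # claim -> source, so claims stay verifiable
    open_questions: list[str]
    next_step: str
    status: str = "in_progress"

    def grounded(self) -> bool:
        # Reject handoffs that carry claims with no source attached:
        # an unsourced claim is exactly the "vague summary" failure mode.
        return all(source.strip() for source in self.evidence.values())

packet = HandoffPacket(
    task_goal="choose a queue backend",
    constraints=["must support at-least-once delivery"],
    evidence={"Kafka supports replay": "design-doc section 3"},
    open_questions=["what is the expected peak throughput?"],
    next_step="analyst compares shortlisted options",
)
```

A gate as simple as `packet.grounded()` at each transition is often enough to stop the game of telephone before it starts.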
Where orchestration, MCP, and A2A fit
Multi-agent architecture overlaps with orchestration and protocols, but it does not replace them. Architecture answers why multiple roles exist. Orchestration answers how the workflow coordinates them. Protocols answer how communication works across boundaries.
| Layer | Main question | Guide to read next |
| --- | --- | --- |
| Architecture | Should the workflow split into multiple roles? | AI Agent Architecture |
| Orchestration | How do retries, approvals, and branches run? | AI Agent Orchestration |
| MCP | How do agents reach tools and resources safely? | Model Context Protocol |
| A2A | How do separate agent services delegate tasks? | Agent-to-Agent Protocol |
Use Model Context Protocol when the issue is standardized tool and resource access. Use Agent-to-Agent Protocol when the issue is cross-service task delegation. If the whole system still lives inside one bounded runtime, internal orchestration may be enough and a network protocol may be unnecessary.
Evaluate and secure the whole multi-agent system
Single-turn quality checks are not enough once multiple agents collaborate. The system now needs end-state evaluation, trace inspection, and role-level accountability. You are no longer scoring only one output. You are scoring the path that produced it.
| Evaluation layer | What to score |
| --- | --- |
| Role quality | Did each agent produce the right artifact for its job? |
| Handoff quality | Did the next role receive enough grounded context? |
| End-state quality | Did the combined workflow solve the real task? |
| Operational behavior | How many retries, stalls, duplicate steps, or escalations happened? |
| Security posture | Were permissions, approvals, and delegated scopes enforced? |
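The operational-behavior layer is the easiest to automate, because it only needs the event trace. A minimal sketch, assuming a hypothetical trace format of `{"agent", "action", "outcome"}` events:

```python
from collections import Counter

def operational_metrics(trace: list[dict]) -> dict:
    """Score one run's operational behavior from its event trace.
    The event shape here is an assumed example, not a standard."""
    retries = sum(1 for e in trace if e["outcome"] == "retry")
    escalations = sum(1 for e in trace if e["outcome"] == "escalated")
    # Duplicate steps: the same agent repeating the same action.
    counts = Counter((e["agent"], e["action"]) for e in trace)
    duplicates = sum(n - 1 for n in counts.values() if n > 1)
    return {"retries": retries, "escalations": escalations, "duplicates": duplicates}

trace = [
    {"agent": "researcher", "action": "search", "outcome": "ok"},
    {"agent": "researcher", "action": "search", "outcome": "retry"},
    {"agent": "analyst", "action": "compare", "outcome": "ok"},
]
metrics = operational_metrics(trace)
```

Tracking these counts per run turns "the system feels flaky" into a regression metric you can alert on.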
That is why AI Agent Evaluation and AI Agent Security become more important, not less, in multi-agent systems. Cross-agent trust, permission scoping, and audit trails matter because one bad handoff can widen the blast radius far beyond one prompt.
Those evaluation and security layers come with a cost and latency bill that is worth pricing up front:

| Cost and latency tradeoff | Single-agent default | Multi-agent impact |
| --- | --- | --- |
| Prompt and token cost | One context window | Multiple contexts plus handoff artifacts |
| Latency | One main reasoning path | Parallel speedups or sequential coordination overhead |
| Observability | One trace to inspect | More traces, richer audit surface, more complexity |
| Reliability | Fewer moving parts | More failure points, but cleaner role boundaries if designed well |
Reference designs by workload
Research systems
Research is one of the clearest multi-agent fits because evidence gathering, analysis, and synthesis often benefit from separate roles. A planner defines the scope, researchers gather sources in parallel, and a reviewer or synthesizer produces the final artifact. This is where explicit context windows and end-state evaluation make the biggest difference.
Support workflows
Support systems justify multiple agents only when the workflow truly has different phases or owners, such as intake triage, policy review, and resolution planning. If one agent can read the ticket, pull context, and draft the next step cleanly, stay single-agent. The split should solve an operational problem, not create one.
Coding and engineering workflows
Coding workflows sometimes benefit from separate researcher, implementer, and reviewer roles, especially when the system needs isolated tool permissions or parallel investigation. But they only work in production if tests, approvals, and rollback rules stay explicit. Otherwise the extra agents only spread confusion across more traces.
A builder decision tree for when to split the system
```
Start with one bounded agent
  -> Is the workflow still clear with one role and one tool surface?
     -> Yes: keep it single-agent
     -> No: does specialization reduce ambiguity or permission scope?
        -> No: improve orchestration or validation instead
        -> Yes: can the handoff packet and approval rules stay explicit?
           -> No: redesign the workflow before splitting
           -> Yes: add the smallest multi-agent pattern that solves the need
```
Common architecture mistakes
Splitting too early
Teams often add agents before they have proven a single-agent version of the workflow. That usually hides weak task definition behind extra coordination rather than solving the real problem.
Letting agents duplicate work
If multiple agents re-run the same retrieval, reasoning, or validation work because state boundaries are vague, the system pays a high token tax for very little gain.
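A cheap mitigation is a run-scoped cache shared across agents, keyed on the full retrieval request. This in-memory sketch assumes a hypothetical `retrieve(query, filters)` callable; a production system would back it with a shared store instead of a dict.

```python
import hashlib
import json

class SharedRetrievalCache:
    """Run-scoped cache so the same retrieval executes once per run,
    no matter which agent asks. In-memory sketch only."""

    def __init__(self, retrieve):
        self._retrieve = retrieve  # hypothetical (query, filters) -> results
        self._results = {}
        self.calls = 0  # how many real retrievals actually ran

    def get(self, query: str, filters: dict) -> list[str]:
        # Key on the canonicalized request so equivalent queries collide.
        key = hashlib.sha256(
            json.dumps({"q": query, "f": filters}, sort_keys=True).encode()
        ).hexdigest()
        if key not in self._results:
            self.calls += 1
            self._results[key] = self._retrieve(query, filters)
        return self._results[key]

cache = SharedRetrievalCache(lambda q, f: [f"doc-for-{q}"])
a = cache.get("pricing", {"team": "support"})  # researcher agent asks
b = cache.get("pricing", {"team": "support"})  # reviewer repeats the query
```

The `calls` counter doubles as the duplicate-work metric: if it climbs while cache hits stay flat, the state boundaries are doing their job.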
Losing source fidelity across handoffs
A handoff that carries only a summary instead of evidence invites drift. Preserve citations, retrieved artifacts, and clear status so the next role can challenge bad assumptions instead of inheriting them silently.
Treating every handoff as a protocol problem
Many teams jump to interoperability standards when the real issue is internal workflow control. Use A2A only when the delegation boundary actually crosses runtimes, services, or ownership domains. Otherwise start with orchestration and keep the communication internal.
What to read next
Use AI Agent Use Cases when the workflow itself still needs to earn the extra coordination cost, AI Agent Architecture for the broad system map, AI Agent Orchestration for workflow control, Agent-to-Agent Protocol when delegation crosses service boundaries, Model Context Protocol for tool and resource access, AI Agent Evaluation for end-state scoring, and AI Agent Security for trust-boundary design. Then keep the A2A v1.0.0 brief and the Google ADK 2.0 alpha brief nearby as the runtime and interoperability layers keep moving.