Developer guide to multi‑agent patterns with ADK

Last updated: 12/24/2025
  • Multi‑agent systems in ADK replace monolithic prompts with modular, cooperating agents.
  • Workflow agents (Sequential, Loop, Parallel) orchestrate LLM and custom agents via shared session state.
  • Google Cloud provides a reference architecture, security and observability stack for deploying ADK MAS.
  • Patterns like coordinator, pipelines, fan‑out/gather and iterative refinement emerge naturally from ADK primitives.


Agentic applications are quickly outgrowing the classic “single mega‑prompt” pattern, and developers need a solid mental model to structure multiple agents without descending into chaos. Google’s Agent Development Kit (ADK) was designed precisely for this: it lets you compose reliable multi‑agent systems, wire in tools and memory, and deploy everything on Google Cloud with production‑grade observability, security and cost control.

This guide walks you through the main multi‑agent patterns supported by ADK – from simple parent/child hierarchies to Sequential, Loop and Parallel workflow agents – and shows how they fit into a broader reference architecture on Google Cloud. We will also cover shared session state, delegation mechanisms, common multi‑agent blueprints, and the practical aspects of deploying, securing and operating these systems in real environments.

Why multi‑agent systems in ADK?

When an application is driven by a single, monolithic agent prompt, it quickly becomes hard to reason about, test and evolve. Huge prompts are fragile, tricky to debug and painful to maintain as requirements grow. ADK pushes you toward building a Multi‑Agent System (MAS) where each agent has a focused responsibility, and orchestration is explicit.

Structuring your app as several cooperating agents brings modularity, specialisation and reusability. You can have a research agent, a critic, a file‑writer, a router, a data‑access agent, and reuse them across projects or workflows instead of re‑embedding the same logic into one jumbo prompt.

ADK gives you concrete building blocks to realize this: LLM‑centric agents, workflow agents (Sequential, Parallel, Loop) and custom agents that encapsulate non‑LLM logic. All of them inherit from a common BaseAgent, so they plug into the same orchestration model, logging, state handling and tool system.

As systems grow, this composition model scales better than ad‑hoc orchestration code or deeply nested chains of function calls around a single model. You keep the cognitive load manageable: each agent has a clear mandate and a well‑defined interaction surface with others.

ADK primitives for composing agents

ADK exposes a small set of primitives that you can combine to express surprisingly rich multi‑agent architectures. Understanding these core concepts makes it much easier to reason about higher‑level patterns later.

The first primitive is the agent hierarchy: every agent can declare a list of sub_agents, forming a tree with a single root_agent at the top. When you pass children into sub_agents, ADK automatically wires their parent_agent reference and enforces that a given instance has only one parent (otherwise a ValueError is raised).

This hierarchy is more than décor: it defines which agents are allowed to delegate to which, and it is the scope used by workflow agents and LLM‑driven transfer. From any agent, you can traverse upward via agent.parent_agent or find descendants with agent.find_agent(name), which is extremely handy for debugging.

On top of basic LLM agents, ADK introduces dedicated workflow agents – SequentialAgent, ParallelAgent and LoopAgent – whose job is not to “think” but to orchestrate sub‑agents. They all share the same interface but implement different execution strategies: run in order, fan‑out in parallel, or iterate in a loop with explicit termination rules.

The third essential primitive is the communication layer, centered on the shared Session and its state dictionary. Session state acts like a common whiteboard: any agent or tool can write intermediate results or flags, and other agents in the same invocation can read them, often via key‑templating inside their instructions (for example {PLOT_OUTLINE?}).

How agents talk to each other in ADK

ADK supports three complementary communication modes between agents: shared session state, LLM‑driven transfer, and explicit invocation via AgentTool. Choosing the right one for each interaction keeps your system both expressive and predictable.

Shared session state (session.state) is the simplest and most pervasive mechanism. Within a single invocation, all agents see the same Session object via the InvocationContext. A tool or callback can do context.state["data_key"] = value, and a later agent can retrieve it with context.state.get("data_key"). For LLM agents, setting output_key automatically persists their final answer under that key.

LLM‑driven delegation, sometimes called “agent transfer”, lets an LLM decide when to hand off to another agent based on instructions and agent descriptions. Internally, the model issues a special function call like transfer_to_agent(agent_name="screenwriter"). ADK’s AutoFlow intercepts this call and re‑routes execution to the chosen agent within the allowed scope (parent, children, siblings depending on configuration).

Explicit invocation with AgentTool gives you a more controlled, function‑like way to call one agent from another. You wrap a target agent instance in AgentTool, add it to the caller’s tools list, and the LLM can then select that tool like any other function. When invoked, AgentTool.run_async executes the sub‑agent, merges state and artifacts back, and returns the sub‑agent’s response as the tool result.

These three channels cover most multi‑agent needs: asynchronous data passing via state, flexible routing via transfer, and tight synchronous calls via tools. In more complex designs, you often mix them inside a single tree: a router that transfers to children, specialists that use state to communicate, and one or two agents available as tools for ad‑hoc delegation.

Building blocks: LLM agents, workflow agents and custom agents

Most MAS topologies in ADK are combinations of three agent types: LLM‑based, workflow and custom agents. Each category solves a different slice of the orchestration problem.

LLM agents wrap a large language model and optional tools, callbacks and output routing into BaseAgent. Think of them as your “thinking” components: they interpret user input, call tools, update state and either answer the user or hand off to another agent.

Workflow agents act as managers rather than workers: they do not reason themselves, but they control the order, parallelism and repetition of sub‑agent execution. SequentialAgent runs its children one after another while sharing the same InvocationContext, ParallelAgent fans out across multiple branches that share state but have distinct history branches, and LoopAgent executes a sequence repeatedly until a stop condition is met.

Custom agents extend BaseAgent with arbitrary non‑LLM logic when the built‑in orchestration strategies are not enough. For example, you might implement a custom scheduler that executes agents conditionally based on metrics, or integrate a business rule engine that determines which sub‑flow to run depending on regulatory constraints.

This mix of generic orchestration primitives and pluggable logic is what makes ADK suitable for serious enterprise workloads, not just demos. You can start with the standard workflow agents, and only when requirements become exotic do you reach for CustomAgent.

Session state and memory patterns

Session state in ADK underpins both short‑term conversational memory and structured data passing between agents. Every conversation uses a Session object holding message history and a mutable state dictionary available to all agents in that invocation.

Writing to state is usually done inside tools or callbacks, using the ToolContext or CallbackContext object. For instance, a tool like save_attractions_to_state(tool_context, attractions: List) can merge new attractions with those already stored under state, returning a simple status message to the agent while ADK persists the state delta in the session.

Reading from state is made ergonomic via key templates embedded in instructions. When an instruction contains {my_key?}, ADK will inject state if it exists; the question mark makes it optional so the agent does not fail when the key is missing. This is critical in workflows like “research → write → review” where each step reads what the previous step saved.

For conversational memory across turns, the key idea is to reuse the same Session for subsequent user messages instead of creating a new one every time. With a shared session, the agent sees prior turns and can handle follow‑up questions, corrections and multi‑step planning. If you accidentally create a brand‑new session per turn, the agent behaves as if it had amnesia: it cannot link follow‑ups to previous context.

State also plays a big role in workflow agents like LoopAgent, which rely on persistent keys such as counters, feedback lists or flags to decide whether more iterations are required. A critic agent might append comments into CRITICAL_FEEDBACK on each pass, while a planner or refiner reads that key to improve the plan in the next iteration.

SequentialAgent: linear workflows made explicit

SequentialAgent is your go‑to pattern when you have a series of steps that must occur in a fixed order. Think of pipelines like “analyze request → research → draft → save to file” or “identify destination → plan route → book transportation”.

In ADK, a SequentialAgent holds a list of sub_agents and runs them one by one, passing the same InvocationContext through the entire chain. Because the Session and state are shared, you can have the first agent store its result under output_key="destination" and the next agent read it via {destination} in its instruction without any glue code.

A classic example is a film pitch generator: a greeter root agent talks to the user, then hands work to a SequentialAgent that calls a researcher, then a screenwriter, then a file‑writer. The user only sees the final outcome, but the event graph in ADK Web reveals the internal tree: greeter → film_concept_team → researcher → screenwriter → file_writer.

Compared to manual orchestration with explicit if/elif blocks and function calls, SequentialAgent keeps control‑flow declarative and minimizes boilerplate. You declare the sequence once and treat it as a single callable agent in your runner or UI, while leveraging session state to pass data between steps.

Sequential workflows also combine nicely with other workflow agents: you can embed a loop or a parallel fan‑out as one of the steps in a longer chain. This is how more advanced flows like “iterate on story quality, then run box‑office and casting analysis, then write a consolidated report” are built.

LoopAgent: iterative refinement and writer rooms

LoopAgent is designed for tasks that benefit from several passes over the work until a quality threshold is met. Instead of a single “generate once and hope for the best” approach, you can encode a process of proposal, critique and refinement.

A typical loop configuration includes agents such as a researcher, a generator (e.g. screenwriter) and a critic that collaborate over multiple rounds. On each iteration, the researcher may update background facts, the screenwriter adjusts the outline or plan, and the critic evaluates it against explicit guidelines, deciding whether more iterations are warranted.

Loops stop under two conditions: reaching max_iterations or a sub‑agent signalling that the work is done. ADK exposes a built‑in tool like exit_loop that the critic can call when a plan, outline or design passes its internal checklist. LoopAgent also respects an escalate=True flag in Event actions, giving you another way to break out early.

Persistent session state is key here: agents read keys like PLOT_OUTLINE, research or CRITICAL_FEEDBACK and write improved versions or additional comments on each pass. This pattern effectively simulates a “writers’ room” where specialists brainstorm, critique and polish until someone declares the work ready.

By combining LoopAgent with SequentialAgent, you can place the whole iterative loop as just one step in a larger end‑to‑end workflow. For example, writers_room (LoopAgent) might run first to produce a solid plot outline, after which a file_writer agent saves the result and attaches other reports.

ParallelAgent: fan‑out and gather for independent tasks

ParallelAgent implements the classic “fan‑out / gather” pattern for tasks that are independent but share context. Instead of running N research steps in series, you run them all at once and wait for all to complete, then aggregate their outputs.

Internally, ParallelAgent gives each sub‑agent a distinct InvocationContext.branch – like ParentBranch.ChildName – while still sharing the same session.state. That means they can all read initial context like PLOT_OUTLINE, but should write outputs to distinct state keys (for example box_office_report, casting_report) to avoid conflicts.

A common example is a “pre‑production team” for a movie pitch: one agent estimates box‑office potential based on comparable films, another proposes casting options, both running in parallel. A subsequent file_writer then composes a report using key templates for each sub‑result and persists it to disk.

Parallel workflows significantly reduce latency for wide queries and in real‑time data analysis scenarios: if you need museum suggestions, concert options and restaurant ideas for a weekend, running three specialist agents in parallel is faster than querying them sequentially. After the fan‑out, a synthesis agent reads all results from state and produces a unified response for the user.

Parallel steps are almost always embedded inside a SequentialAgent that first prepares context, then runs the ParallelAgent, then continues with aggregation and reporting. This pattern is easy to recognize and reuse once you get comfortable with ADK’s workflow agents.

Orchestration patterns with ADK primitives

Once you grasp hierarchy, workflow agents and state, you can implement several classic multi‑agent patterns directly in ADK. These patterns are not hard‑coded primitives but compositions built with the same basic building blocks.

The coordinator/dispatcher pattern uses a central LLM agent as a “router” for user queries, backed by multiple specialized sub‑agents. The coordinator reads the request, then either transfers control to a sub‑agent via LLM‑driven delegation or calls specialists explicitly using AgentTool. Foodie, transportation or weekend‑guide agents are common examples.

The sequential pipeline pattern is simply a SequentialAgent whose children each implement a well‑scoped step of a process. Generator‑and‑critic flows are a classic variant: the first agent writes a draft and saves it under an output_key, the second agent analyses it and saves feedback, and maybe a third agent refines the result based on that feedback.

The parallel fan‑out/gather pattern is expressed as a ParallelAgent nested inside a sequential workflow. Parallel children write results into separate state keys; a later synth agent reads them back and presents a combined answer.

Hierarchical task decomposition emerges naturally from the parent/child tree. Higher‑level agents break goals into sub‑goals and delegate them to children (either via delegation or tools), with results rolling back up the tree. This is particularly useful in research assistants, supply‑chain optimizers or financial advisor systems where each sub‑domain has its own specialist agent.

Iterative refinement with LoopAgent formalizes the generator-critic loop into a reusable pattern. The loop executes planner, critic and refiner agents multiple times, using state keys to persist the latest plan and corresponding feedback, stopping when a quality criterion or iteration cap is reached.

Reference architecture for multi‑agent systems on Google Cloud

Beyond agent logic, you still need to run your system somewhere, and Google Cloud offers a well‑defined reference architecture for production‑grade multi‑agent deployments. At a high level, the solution combines a frontend, agent runtimes, Vertex AI models, security services and optional tool frameworks like MCP.

The typical setup starts with a frontend – often a chat interface – running on Cloud Run. Users talk to this UI, which forwards requests to a coordinator agent exposed as a service. This coordinator then chooses between different agent workflows based on user intent, including optional human‑in‑the‑loop paths where people can validate or override agent decisions.

Agents themselves can run in several environments: Cloud Run services, Google Kubernetes Engine (GKE) or Vertex AI Agent Engine. ADK spans these options, abstracting away some of the runtime details so the developer focuses on agent logic rather than infrastructure plumbing.

All agent calls rely on Vertex AI or other model runtimes for inference, often wrapped with Model Armor to sanitize prompts and responses. Model Armor helps filter prompt‑injection attempts, sensitive‑data leaks or harmful content before or after model calls, acting as a safety guardrail around generative components.

MCP (Model Context Protocol) tools and servers enter the picture when agents must talk to external systems – databases, filesystems, or SaaS APIs – in a standardized way. MCP defines a common contract between agent and tool server, so a single MCP client in your agent can access many tools built by different teams without tight coupling; this includes considerations around data storage systems and how to expose them securely.

Security and governance for agentic applications

Agentic systems introduce security challenges that go beyond traditional microservices, because LLMs can be tricked into misusing tools or leaking data if you are not careful. Google's recommended approach layers deterministic security controls with LLM‑aware, policy‑driven defences; understanding the limitations, biases and risks of the models is also key when designing those defences.

Human oversight remains paramount: high‑impact flows should include approval steps where a person can pause, review or veto an agent’s proposed action. This can be modeled as a dedicated “human‑confirmation” tool that surfaces requests to a UI and only resumes execution once a human responds.

Access control for agents is handled through IAM: each agent or service account should have only the minimal permissions required to perform its duties. If a given agent is compromised or misused, the blast radius is limited because its service account cannot access unrelated resources or tools.

Policy‑driven tool gating, implemented with components like a SecurityPlugin plus a PolicyEngine, lets you demand user confirmation before certain tools run. When a policy flags a sensitive call, the plugin intercepts it, emits a special “ask for confirmation” function call, and waits for your application to return a verdict, effectively putting a human‑in‑the‑loop for high‑risk operations.

Standard Google Cloud security features complete the picture: VPC Service Controls to reduce data exfiltration risk, CMEK for customer‑managed encryption keys, Cloud Armor for WAF and DDoS protection, IAP or Identity Platform for authenticating users and granular IAM for resource access. For agent‑to‑agent communication via A2A, TLS 1.2+ and OAuth‑based authentication are required or recommended in production.

Reliability, observability and cost optimisation

Production MAS deployments must be reliable, observable and cost‑efficient; ADK integrates well with Google Cloud’s operational tooling to make that possible. You can instrument agents, sessions and tools so that their logs and traces surface in Cloud Logging and Cloud Trace.

From a reliability standpoint, design your agent graph to tolerate failures in individual components. Where possible, avoid a single, irreplaceable central brain; let independent agents perform localized tasks so that an outage in one path does not bring down the entire application, and apply techniques such as load balancing in distributed search to spread load and reduce points of failure. Simulate failures in staging to validate coordination behaviour under stress.

For model calls, Vertex AI supports dynamic shared quotas and provisioned throughput. Shared quotas avoid hard per‑project limits in pay‑as‑you‑go scenarios, while provisioned throughput is essential for high‑QPS, latency‑sensitive workloads that must not be throttled. Monitoring request rates and token usage helps you decide when to move from on‑demand to provisioned capacity.

Cost control largely depends on smart model selection, careful prompt design and avoiding unnecessary tokens. Start with cost‑effective models where you can, keep prompts concise but informative, explicitly ask for short outputs when possible, leverage context caching for repeated large prompts, and consider batch prediction when workloads allow it.

Cloud Run resource tuning and long‑term discounts further optimize runtime costs. Begin with default CPU/memory, observe real usage and adjust. For predictable workloads, committed use discounts significantly reduce spend.

On the observability side, you should treat agents as first‑class entities in your monitoring strategy. Log their inputs, decisions (e.g. which tools they call, which agent they delegate to), and state changes. Use ADK’s event graphs in the web UI for debugging individual sessions, and Cloud Logging plus custom dashboards for fleet‑wide trends.

Done well, these practices give you a transparent view of your MAS: you can see which agents are slow, which tools are overused, where prompts are too long, and where quality‑control loops like LoopAgent iterate more than expected. That feedback loop is critical to fine‑tuning both quality and cost over time.

By combining ADK’s agent primitives, workflow patterns and state mechanisms with Google Cloud’s reference architecture, security stack and operational tooling, you can design multi‑agent systems that are not only clever on paper but also deployable, governable and economically viable in production. Starting from simple parent/child agents and progressing through Sequential, Loop and Parallel orchestrations, you gain a toolkit for turning agentic ideas into robust, maintainable applications that actually deliver business value.
