dailytutorfor.you
& AI Science Data

Complete Multi-Agent System Tutorial with Google ADK + FastAPI (2026)

Learn how to build modern multi-agent systems with Google ADK and FastAPI: from architecture and orchestration concepts to security guardrails and a production-ready, runnable Python implementation.


Complete Tutorial: Building a Multi-Agent System with Google ADK + FastAPI (2026)

1) Introduction — What and Why

In the last two years, AI application patterns have changed drastically. We used to build one "all-rounder" chatbot and hope it would be strong enough for every task. The result? Usually a mess: mixed context, inconsistent answers, and painful debugging.

A rapidly rising trend now (visible on GitHub Trending and in recent developer articles) is the multi-agent architecture: one coordinator agent plus several specialist agents.

Imagine a work team in an office:

  • There is a manager who divides the tasks
  • There are research staff who gather data
  • There are technical staff who execute
  • There is QA staff who verify

This model is more realistic than one “super employee” who has to do it all.

In this tutorial, you will learn to build a multi-agent system based on the Google Agent Development Kit (ADK) and expose it via FastAPI. We use a real scenario: a developer-support AI assistant that can:

  1. Classify question intent
  2. Delegate to specialist agents
  3. Enforce basic security guardrails
  4. Return a clean response to the API client

Why is this important for the real world?

  • SaaS products need accurate support bots
  • Internal teams need a reliable assistant
  • Multi-agent makes systems easier to scale and test

In short: if you are serious about using AI in production, this pattern is no longer a "nice to have", but a foundation.


2) Prerequisites

Before you start, make sure you have:

Technical skills

  • Python basics (functions, classes, async)
  • Basic REST API
  • Familiar with virtual environments

Environment

  • Python 3.11+
  • A model API key (e.g. for Gemini or a compatible LLM provider)
  • A Linux/macOS/WSL terminal

Libraries

We will use:

  • google-adk
  • fastapi
  • uvicorn
  • pydantic

Install:

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install google-adk fastapi uvicorn pydantic

Set the model API key environment variable (example):

export GOOGLE_API_KEY="your_api_key_here"

Tip: keep secrets in a .env file plus a secret manager; don't hardcode them in source code.
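As a minimal sketch of that tip (the helper name is illustrative, not part of ADK), a fail-fast check at startup keeps a missing key from surfacing later as a confusing runtime error:

```python
import os


def require_env(name: str) -> str:
    """Fail fast at startup if a required environment variable is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the server.")
    return value


# Example (run once at import time of your app module):
# api_key = require_env("GOOGLE_API_KEY")
```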


3) Core Concepts (by analogy)

A. Single Agent vs Multi-Agent

  • Single agent: like one person who has to be CS, engineer, legal, finance at the same time.
  • Multi-agent: like small teams with specific roles.

B. Coordinator Pattern

One main agent receives questions, then determines who is most suitable to answer.

Analogy: the receptionist at a clinic. All patients enter through one front desk and are then directed to the right doctor.

C. Workflow Agent

ADK provides structures like:

  • Sequential: steps run one after another
  • Parallel: steps run concurrently
  • Loop: repeat until a condition is met
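ADK ships these as ready-made workflow agents; conceptually, the three patterns can be sketched in plain asyncio (illustrative helper names, not the ADK API):

```python
import asyncio
from typing import Awaitable, Callable, List

Step = Callable[[str], Awaitable[str]]


async def run_sequential(steps: List[Step], data: str) -> str:
    # Sequential: each step consumes the previous step's output
    for step in steps:
        data = await step(data)
    return data


async def run_parallel(steps: List[Step], data: str) -> List[str]:
    # Parallel: every step sees the same input and runs concurrently
    return list(await asyncio.gather(*(step(data) for step in steps)))


async def run_loop(step: Step, data: str,
                   done: Callable[[str], bool], max_iters: int = 5) -> str:
    # Loop: repeat until the condition is met or the iteration budget runs out
    for _ in range(max_iters):
        data = await step(data)
        if done(data):
            break
    return data
```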

D. Shared Context and State

Agents can share state, but must be disciplined in key naming to avoid collisions.

Analogy: a team Kanban board. Everyone can see it, but each card must be clearly labeled.
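One cheap way to enforce that key-naming discipline (a sketch, not an ADK feature) is to namespace every state key by agent name:

```python
from typing import Any, Dict


class NamespacedState:
    """Shared state where each agent writes under its own prefix, avoiding key collisions."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def set(self, agent: str, key: str, value: Any) -> None:
        self._data[f"{agent}:{key}"] = value

    def get(self, agent: str, key: str, default: Any = None) -> Any:
        return self._data.get(f"{agent}:{key}", default)
```

Two agents can now both put a `result` card on the board without overwriting each other.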

E. Guardrails

If tools are too unrestricted, agents can take unsafe actions. At a minimum there must be:

  • Input validation
  • Command/action restrictions
  • Clear error handling
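Those three layers can start as small as one helper (the action names here are illustrative):

```python
ALLOWED_ACTIONS = {"search_docs", "summarize_code", "troubleshoot"}
MAX_PAYLOAD = 4000


def validate_action(action: str, payload: str) -> str:
    """Minimal guardrail: allowlist the action, bound the input, fail with a clear error."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"Action '{action}' is not permitted.")
    if not payload.strip():
        raise ValueError("Empty payload.")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError(f"Payload exceeds {MAX_PAYLOAD} characters.")
    return payload.strip()
```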

4) Architecture / Diagrams

The following is a simple architecture of the system that we will build:

[Client App]
     |
     v
[FastAPI /ask]
     |
     v
[Coordinator Agent]
   /      |      \
  v       v       v
[DocAgent] [CodeAgent] [TroubleshootAgent]
   \      |      /
    v     v     v
[Tool Layer: safe_search_docs, safe_summarize_code, safe_troubleshoot]
     |
     v
[Structured JSON Response]

Request flow:

  1. The client sends a question to the /ask endpoint
  2. The coordinator reads the intent
  3. It delegates to the relevant sub-agent
  4. The sub-agent uses (restricted) tools
  5. The results are assembled and returned as JSON

5) Step-by-Step Implementation (runnable code)

Below is an example of one complete file (main.py) that you can run directly.

""" main.py Contoh minimal production-friendly multi-agent API dengan ADK + FastAPI. Jalankan: uvicorn main:app --reload --port 8080 """ from __future__ import annotations import logging import os from typing import Any, Dict, Literal from fastapi import FastAPI, HTTPException from pydantic import BaseModel, Field, ValidationError # ADK imports (mengikuti pola dokumentasi ADK) from google.adk.agents import LlmAgent # ----------------------------------------------------- # Logging setup # ----------------------------------------------------- logging.basicConfig( level=logging.INFO, format="%(asctime)s | %(levelname)s | %(name)s | %(message)s", ) logger = logging.getLogger("multi-agent-api") # ----------------------------------------------------- # Safe tool functions (contoh) # ----------------------------------------------------- def safe_search_docs(query: str) -> str: """ Tool simulasi pencarian dokumentasi. Tambahkan guardrails: validasi panjang query dan blacklist sederhana. """ if not query or len(query.strip()) < 3: raise ValueError("Query terlalu pendek. Minimal 3 karakter.") blocked_terms = ["drop table", "rm -rf", "credential leak"] lowered = query.lower() if any(term in lowered for term in blocked_terms): raise ValueError("Query mengandung pola berbahaya dan ditolak.") # Pada sistem nyata, ini bisa panggil API docs internal / vector DB return ( f"Hasil docs untuk '{query}': fokus pada API contract, auth flow, " "dan error handling berbasis status code." ) def safe_summarize_code(snippet: str) -> str: """ Tool simulasi ringkasan code. Hindari input terlalu panjang untuk mencegah biaya token meledak. """ if not snippet: raise ValueError("Snippet code kosong.") if len(snippet) > 4000: raise ValueError("Snippet terlalu panjang (maks 4000 karakter).") return ( "Ringkasan code: struktur sudah modular, " "tapi perlu perbaikan error handling dan unit test pada edge case." ) def safe_troubleshoot(issue: str) -> str: """ Tool troubleshooting sederhana. 
""" if not issue: raise ValueError("Issue tidak boleh kosong.") hints = [ "cek environment variable", "verifikasi dependency version", "lihat stack trace paling awal", "isolasi bug dengan reproduksi minimal", ] return f"Checklist troubleshooting untuk '{issue}': " + "; ".join(hints) # ----------------------------------------------------- # ADK Agent setup # ----------------------------------------------------- def create_agents() -> LlmAgent: """ Bangun coordinator + sub-agents. """ model_name = os.getenv("AGENT_MODEL", "gemini-2.5-flash") doc_agent = LlmAgent( name="DocAgent", model=model_name, instruction=( "Kamu spesialis dokumentasi developer. " "Gunakan tool pencarian docs untuk menjawab ringkas dan akurat." ), tools=[safe_search_docs], description="Menjawab pertanyaan dokumentasi/API.", ) code_agent = LlmAgent( name="CodeAgent", model=model_name, instruction=( "Kamu spesialis code review cepat. " "Gunakan tool ringkasan code, fokus ke maintainability dan bug risk." ), tools=[safe_summarize_code], description="Menganalisis dan merangkum potongan kode.", ) troubleshoot_agent = LlmAgent( name="TroubleshootAgent", model=model_name, instruction=( "Kamu spesialis debugging. " "Gunakan checklist troubleshooting yang actionable." ), tools=[safe_troubleshoot], description="Memberikan langkah diagnosis issue teknis.", ) coordinator = LlmAgent( name="Coordinator", model=model_name, description="Koordinator utama untuk pertanyaan developer support.", instruction=( "Tentukan intent user: docs, code_review, atau troubleshooting. " "Delegasikan ke sub-agent yang paling tepat. " "Jika ambigu, ajukan 1 pertanyaan klarifikasi singkat." 
), sub_agents=[doc_agent, code_agent, troubleshoot_agent], ) return coordinator ROOT_AGENT = create_agents() # ----------------------------------------------------- # API schema # ----------------------------------------------------- class AskRequest(BaseModel): question: str = Field(..., min_length=3, max_length=3000) mode: Literal["auto", "docs", "code", "troubleshoot"] = "auto" class AskResponse(BaseModel): ok: bool answer: str meta: Dict[str, Any] # ----------------------------------------------------- # FastAPI app # ----------------------------------------------------- app = FastAPI(title="ADK Multi-Agent API", version="1.0.0") @app.get("/health") def health() -> Dict[str, str]: return {"status": "ok"} @app.post("/ask", response_model=AskResponse) async def ask(payload: AskRequest) -> AskResponse: """ Endpoint utama untuk bertanya ke sistem multi-agent. Untuk demo, kita gunakan prompt routing sederhana. Pada implementasi penuh, kamu bisa pakai runner/session ADK lengkap. """ try: question = payload.question.strip() if not question: raise HTTPException(status_code=400, detail="Pertanyaan kosong.") # Routing awal berbasis mode eksplisit (jika user memilih) if payload.mode == "docs": answer = safe_search_docs(question) chosen = "DocAgent" elif payload.mode == "code": answer = safe_summarize_code(question) chosen = "CodeAgent" elif payload.mode == "troubleshoot": answer = safe_troubleshoot(question) chosen = "TroubleshootAgent" else: # Auto mode: heuristik sederhana sebelum diserahkan ke koordinasi LLM lowered = question.lower() if any(k in lowered for k in ["error", "crash", "timeout", "bug"]): answer = safe_troubleshoot(question) chosen = "TroubleshootAgent" elif any(k in lowered for k in ["review", "refactor", "clean code", "snippet"]): answer = safe_summarize_code(question) chosen = "CodeAgent" else: answer = safe_search_docs(question) chosen = "DocAgent" return AskResponse( ok=True, answer=answer, meta={ "selected_agent": chosen, "coordinator": 
ROOT_AGENT.name, "note": "Demo implementation with safe tool routing", }, ) except ValidationError as ve: logger.exception("Validation error") raise HTTPException(status_code=422, detail=str(ve)) from ve except ValueError as ve: logger.warning("Tool rejected request: %s", ve) raise HTTPException(status_code=400, detail=str(ve)) from ve except HTTPException: raise except Exception as exc: # catch-all untuk menjaga API tetap stabil logger.exception("Unexpected error: %s", exc) raise HTTPException( status_code=500, detail="Terjadi kesalahan internal. Coba lagi beberapa saat.", ) from exc

Run the server:

uvicorn main:app --reload --port 8080

Endpoint test:

curl -X POST http://127.0.0.1:8080/ask \
  -H 'Content-Type: application/json' \
  -d '{"question":"How do I handle timeouts in a FastAPI API client?","mode":"auto"}'

Example response:

{
  "ok": true,
  "answer": "Troubleshooting checklist ...",
  "meta": {
    "selected_agent": "TroubleshootAgent",
    "coordinator": "Coordinator",
    "note": "Demo implementation with safe tool routing"
  }
}

Why is this example still valid even though it is simple?

  • The parent/sub-agent structure is clear
  • There are tool-level guardrails
  • There is API-level error handling
  • It is ready to upgrade to a fuller ADK runner/session setup

6) Best Practices (industry tips)

  1. Make the agent's role very narrow

    • One agent, one main responsibility.
    • This reduces hallucinations because the context is more focused.
  2. Use the tool layer for policy

    • Don't let the model freely decide access limits.
    • Keep policy deterministic and defined by the developer.
  3. Separate "thinking" and "acting"

    • Agents can analyze freely.
    • But external actions must go through validated tools.
  4. Audit trail mandatory

    • Save a log: questions, selected agents, tools used, results.
    • Important for debugging and compliance.
  5. Fallback strategy

    • If a sub-agent fails, the coordinator must:
      • provide a safe response,
      • ask for clarification,
      • or escalate to a human.
  6. Build the evaluation set from real cases

    • Don't evaluate with "pretty" hand-picked prompts only.
    • Use production error data, ambiguous questions, and user typos.
  7. Limit token costs

    • Keep the context concise,
    • set a max output length,
    • throttle heavy requests.
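The "audit trail" practice above can start as one structured log line per request; a sketch with illustrative field names:

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("audit")


def audit_record(question: str, agent: str, tool: str, ok: bool) -> str:
    """Emit one structured audit entry: question, selected agent, tool used, outcome."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question[:200],  # truncate to keep log size bounded
        "agent": agent,
        "tool": tool,
        "ok": ok,
    }
    line = json.dumps(entry)
    audit_logger.info(line)
    return line
```

Because each line is plain JSON, the trail can be queried later for debugging and compliance.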

7) Common Mistakes

1. All tasks are assigned to one agent

The result: prompts become long, answers become more inconsistent.

2. The tool is too powerful without limitations

Bad example: an SQL tool without table whitelisting. This is high risk.
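A safer shape (the table names here are hypothetical) pushes the restriction into the tool itself:

```python
ALLOWED_TABLES = {"orders", "tickets"}  # hypothetical allowlist


def safe_select(table: str, limit: int = 50) -> str:
    """Guardrailed query builder: whitelisted tables only, read-only, bounded rows."""
    if table not in ALLOWED_TABLES:
        raise PermissionError(f"Table '{table}' is not whitelisted.")
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500.")
    return f"SELECT * FROM {table} LIMIT {limit}"
```

The agent can only ever produce bounded read-only queries, no matter what the prompt says.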

3. There is no response schema

If the response format is inconsistent, the frontend becomes fragile. Use Pydantic/JSON schema.

4. Ignoring error paths

Many demos focus only on the "happy path". In production, error paths occur far more often.

5. No prompt/instruction versioning

Instructions are software artifacts too. Versioning them is mandatory.

6. Over-parallelization

Running too many parallel agents can create a race condition and increase costs.
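A semaphore is the simplest fix: cap how many agent calls are in flight at once (a sketch; the cap value is arbitrary):

```python
import asyncio
from typing import Awaitable, Iterable, List


async def bounded_gather(coros: Iterable[Awaitable], max_concurrency: int = 3) -> List:
    """Run coroutines with at most max_concurrency in flight, controlling cost and contention."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(coro):
        async with sem:
            return await coro

    # gather preserves input order regardless of completion order
    return list(await asyncio.gather(*(guarded(c) for c in coros)))
```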


8) Advanced Tips

  1. Hybrid routing (rules + LLM)

    • Start with deterministic rules for basic intents,
    • then use an LLM for ambiguous cases.
  2. Memory layering

    • Session memory (short term)
    • Domain memory (knowledge base)
    • Audit memory (for observability)
  3. Two-stage safety callback

    • Pre-tool callback: check whether the action is allowed
    • Post-tool callback: sanitize output before it is sent to the user
  4. Canary deployment for prompt

    • Roll out new instructions to 5-10% of traffic first.
    • Monitor regressions before full rollout.
  5. A2A (Agent-to-Agent) for enterprise systems

    • If agents live in different services/repos, use a clear protocol for communication between agents.
  6. Agent-level SLOs

    • Track p95 latency, tool error rate, and “successful resolution rate”.
    • Don't look only at model accuracy.
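The hybrid-routing tip can be sketched as a two-stage function: deterministic regex rules first, so only genuinely ambiguous questions pay for an LLM call (the agent names follow this tutorial; the regexes are illustrative):

```python
import re
from typing import Callable, Optional

# Stage-1 rules: cheap, deterministic, auditable
RULES = [
    (re.compile(r"\b(error|crash|timeout|traceback)\b", re.I), "TroubleshootAgent"),
    (re.compile(r"\b(review|refactor|snippet)\b", re.I), "CodeAgent"),
    (re.compile(r"\b(docs?|endpoint|api)\b", re.I), "DocAgent"),
]


def rule_route(question: str) -> Optional[str]:
    """Stage 1: deterministic rules. None means 'ambiguous, fall through'."""
    for pattern, agent in RULES:
        if pattern.search(question):
            return agent
    return None


def hybrid_route(question: str, llm_route: Callable[[str], str]) -> str:
    """Stage 2: only questions the rules can't place reach the (costly) LLM router."""
    return rule_route(question) or llm_route(question)
```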

9) Summary & Next Steps

We have discussed end-to-end how to build a modern multi-agent system with ADK:

  • Why multi-agent is important for real applications
  • The concepts of hierarchy, orchestration, and guardrails
  • Clean API architecture
  • Runnable implementation with FastAPI + Python
  • Best practices, pitfalls, and advanced strategies

Next steps that I recommend:

  1. Upgrade this demo to full ADK runner/session
  2. Add automatic evaluation for 50+ test prompts
  3. Integrate vector search for internal documents
  4. Add API authentication + rate limiting
  5. Deploy to Cloud Run/Docker with observability (OpenTelemetry)

If you build an AI product in 2026, this pattern provides a strong foundation: modular, safe, and maintainable.

