Building an AI Coding Agent with LangGraph + FastAPI (Complete Guide 2026)
A comprehensive tutorial for building a stateful AI coding agent that can safely execute Python tools, store conversation memory, and be exposed as a production-ready API with FastAPI.
Level: Intermediate
Estimated read: 15 minutes
Stack: Python, LangGraph, FastAPI, OpenAI API
1) Introduction — What & Why
If you follow developer trends in 2026, one pattern is clear: developers are no longer just writing code manually, but orchestrating agents. On GitHub Trending, projects around AI agents, orchestration, and observability are rising fast. On X (Twitter), discussions about “AI coding workflow,” “agentic development,” and “spec-first coding agents” are also consistently active. On DEV Community, articles about AI agents, tooling, and observability keep appearing.
The question is: why are AI coding agents becoming important?
Because today's software engineering needs go beyond "generate code." An agent must be able to:
- understand project context,
- perform sequential steps (plan → implement → verify),
- maintain state (what has been tried, what errors appeared),
- and stay safe and auditable.
This is where LangGraph becomes relevant. Compared to a simple “prompt → response” loop, LangGraph gives you a graph structure, persistent state, and human-in-the-loop patterns. That means your agent can be more stable for real coding workflows.
In this tutorial, we will build an AI Coding Agent API that can:
- receive coding tasks,
- create a plan,
- execute Python code in a safely constrained way,
- summarize results,
- store per-session history.
You can use the final output as a foundation for your team’s internal coding assistant.
2) Prerequisites — Before You Start
Make sure you already have:
- Python 3.11+
- Basic FastAPI knowledge
- An LLM model API key (example: OpenAI)
- Basic understanding of async Python
Install dependencies:
```bash
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -U fastapi uvicorn pydantic langgraph langchain-openai python-dotenv
```
Create a .env file:
```
OPENAI_API_KEY=sk-xxxx
MODEL_NAME=gpt-4.1-mini
```
If you use another provider, the architecture pattern stays the same. Just replace the model adapter in the LLM layer.
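For example, here is a sketch of swapping in Anthropic (assumes langchain-anthropic is installed and ANTHROPIC_API_KEY is set; the model name is illustrative):

```python
# Hypothetical adapter swap: only the LLM layer changes, the graph stays the same.
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-sonnet-latest", temperature=0)
```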
3) Core Concepts — Foundations You Must Understand
Before coding, make sure you understand these four core concepts.
a) Agent State
Imagine the agent as a chef in a professional kitchen. The chef doesn't just need the latest recipe, but also:
- ingredients already used,
- steps already executed,
- errors that have appeared,
- the next decision.
All of that is state.
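In code, state can be a typed dictionary. Here is a preview of what we will define in full in section 5:

```python
# Preview of agent state as a typed dict: everything the "chef"
# needs to remember between steps. The full version appears in section 5.
from typing import TypedDict, List

class AgentState(TypedDict, total=False):
    user_goal: str          # the latest "recipe"
    plan: str               # steps already decided
    code: str               # what has been produced so far
    execution_error: str    # errors that have appeared
    trace: List[str]        # steps already executed
```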
b) Nodes and Edges (Graph Thinking)
LangGraph works like a metro map:
- Node = work station (planner, coder, reviewer, executor)
- Edge = transition path between stages
Benefits: the flow is more explicit, easier to debug, and easier to add rules.
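A toy sketch to make the metro-map analogy concrete (the real graph comes in section 5):

```python
# Minimal two-node LangGraph: each node is a "work station",
# each edge a transition path between stages.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class MiniState(TypedDict, total=False):
    text: str

def station_a(state: MiniState) -> MiniState:
    return {"text": state.get("text", "") + " -> planned"}

def station_b(state: MiniState) -> MiniState:
    return {"text": state.get("text", "") + " -> coded"}

builder = StateGraph(MiniState)
builder.add_node("planner", station_a)   # node = work station
builder.add_node("coder", station_b)
builder.set_entry_point("planner")
builder.add_edge("planner", "coder")     # edge = transition path
builder.add_edge("coder", END)

print(builder.compile().invoke({"text": "start"}))
# {'text': 'start -> planned -> coded'}
```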
c) Tool Calling
The LLM must not freely access the system. We provide limited “tools,” for example:
- running Python snippets in a mini sandbox,
- reading execution results,
- returning structured output.
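To make that concrete, here is a sketch of what a limited tool's contract can look like (a simplified stand-in for the safe_exec_python sandbox built in section 5):

```python
# Sketch of a narrow tool contract: the model only ever calls this function,
# and it always returns structured output rather than touching the system.
def run_snippet(code: str) -> dict:
    if "import" in code:  # placeholder guardrail; real checks come in section 5
        return {"stdout": "", "error": "imports are not allowed"}
    # constrained execution would happen here
    return {"stdout": "<captured output>", "error": ""}
```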
d) Guardrails
A coding agent without guardrails = high risk. At minimum you need:
- execution time limits (see the sketch after this list),
- builtin whitelist,
- dangerous import blocking,
- input/output validation.
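The time limit is the hardest one to enforce in-process. Here is a minimal hard-timeout sketch (a hypothetical helper, not part of the tutorial's app.py), which runs untrusted code in a separate process and kills it if it exceeds the limit:

```python
# Hypothetical hard-timeout guardrail using a child process.
import multiprocessing

def _run(code: str, queue: "multiprocessing.Queue") -> None:
    try:
        exec(code, {"__builtins__": {"print": print, "range": range}}, {})
        queue.put("ok")
    except Exception as e:
        queue.put(f"{type(e).__name__}: {e}")

def exec_with_timeout(code: str, seconds: int = 3) -> str:
    queue: "multiprocessing.Queue" = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_run, args=(code, queue))
    proc.start()
    proc.join(seconds)
    if proc.is_alive():
        proc.terminate()   # hard stop: the snippet exceeded the time limit
        proc.join()
        return "error: execution timed out"
    return queue.get() if not queue.empty() else "error: no result"

if __name__ == "__main__":  # guard needed on spawn-based platforms
    print(exec_with_timeout("while True:\n    pass", seconds=2))
```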
4) Architecture / Diagram
Our simple architecture looks like this:
```
Client (Web / CLI)
        |
        v
FastAPI /agent/run
        |
        v
+-------------------------+
|    LangGraph Runtime    |
|-------------------------|
|  1) Planner Node        |
|  2) Coder Node          |
|  3) Python Tool Node    |
|  4) Reviewer Node       |
+-------------------------+
        |
        v
Session Memory (in-memory / DB later)
```
Flow:
- The user sends an objective.
- The planner breaks the task into steps.
- The coder generates a code snippet.
- The tool node executes the snippet with a constrained sandbox.
- The reviewer summarizes results + next action.
- All steps are stored in session state.
5) Step-by-Step Implementation (Runnable)
In this section we build a complete app you can run directly.
File structure
```
coding-agent/
  app.py
  .env
```
Full app.py code
```python
import os
import io
import contextlib
from typing import TypedDict, List, Dict, Any

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from dotenv import load_dotenv
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

# -------------------------
# Setup
# -------------------------
load_dotenv()

MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4.1-mini")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

if not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY not found in environment")

llm = ChatOpenAI(model=MODEL_NAME, temperature=0)

app = FastAPI(title="AI Coding Agent API", version="1.0.0")

# Simple session memory (demo)
SESSION_STORE: Dict[str, Dict[str, Any]] = {}

# -------------------------
# State definitions
# -------------------------
class AgentState(TypedDict, total=False):
    session_id: str
    user_goal: str
    plan: str
    code: str
    execution_output: str
    execution_error: str
    final_summary: str
    trace: List[str]

class RunRequest(BaseModel):
    session_id: str = Field(..., min_length=1)
    goal: str = Field(..., min_length=5)

class RunResponse(BaseModel):
    session_id: str
    plan: str
    code: str
    execution_output: str
    execution_error: str
    final_summary: str
    trace: List[str]

# -------------------------
# Safe python executor
# -------------------------
def safe_exec_python(code: str, timeout_hint_seconds: int = 3) -> Dict[str, str]:
    """Runs Python code with minimum guardrails.

    NOTE: for production, use a separate sandbox (container/VM).
    The timeout parameter is advisory only; it is not enforced in this demo.
    """
    blocked_keywords = [
        "import os", "import sys", "import subprocess", "import socket",
        "open(", "__import__", "eval(", "exec(",
        "from os", "from subprocess",
    ]
    lowered = code.lower()
    for bad in blocked_keywords:
        if bad in lowered:
            return {
                "stdout": "",
                "error": f"Blocked by guardrail: forbidden keyword detected -> {bad}",
            }

    safe_builtins = {
        "print": print, "len": len, "range": range, "sum": sum,
        "min": min, "max": max, "enumerate": enumerate, "sorted": sorted,
        "str": str, "int": int, "float": float,
        "list": list, "dict": dict, "set": set, "tuple": tuple,
    }

    stdout_buffer = io.StringIO()
    local_env: Dict[str, Any] = {}

    try:
        with contextlib.redirect_stdout(stdout_buffer):
            exec(code, {"__builtins__": safe_builtins}, local_env)
        return {"stdout": stdout_buffer.getvalue(), "error": ""}
    except Exception as e:
        return {"stdout": stdout_buffer.getvalue(), "error": f"{type(e).__name__}: {e}"}

# -------------------------
# Graph nodes
# -------------------------
def planner_node(state: AgentState) -> AgentState:
    prompt = f"""
You are a software planner.
Create a 3-5 step plan to complete the following goal:

{state['user_goal']}

Use concise bullet-point format.
"""
    plan = llm.invoke(prompt).content
    trace = state.get("trace", []) + ["planner_node completed"]
    return {"plan": plan, "trace": trace}

def coder_node(state: AgentState) -> AgentState:
    prompt = f"""
You are a Python engineer.
Based on the goal and plan below, write ONE runnable Python snippet.

Goal: {state['user_goal']}
Plan: {state['plan']}

Constraints:
- Do not import external modules.
- Must include error handling (try/except).
- Must print the final result.

Return ONLY Python code, without markdown fences.
"""
    code = llm.invoke(prompt).content.strip()
    trace = state.get("trace", []) + ["coder_node completed"]
    return {"code": code, "trace": trace}

def tool_exec_node(state: AgentState) -> AgentState:
    result = safe_exec_python(state.get("code", ""))
    trace = state.get("trace", []) + ["tool_exec_node completed"]
    return {
        "execution_output": result.get("stdout", ""),
        "execution_error": result.get("error", ""),
        "trace": trace,
    }

def reviewer_node(state: AgentState) -> AgentState:
    prompt = f"""
You are a technical reviewer.
Summarize the following code execution result.

Goal: {state['user_goal']}
Output: {state.get('execution_output', '')}
Error: {state.get('execution_error', '')}

Provide:
1) Status (Success/Failed)
2) Brief analysis
3) Improvement suggestions
"""
    final_summary = llm.invoke(prompt).content
    trace = state.get("trace", []) + ["reviewer_node completed"]
    return {"final_summary": final_summary, "trace": trace}

# -------------------------
# Build graph
# -------------------------
builder = StateGraph(AgentState)
builder.add_node("planner", planner_node)
builder.add_node("coder", coder_node)
builder.add_node("executor", tool_exec_node)
builder.add_node("reviewer", reviewer_node)

builder.set_entry_point("planner")
builder.add_edge("planner", "coder")
builder.add_edge("coder", "executor")
builder.add_edge("executor", "reviewer")
builder.add_edge("reviewer", END)

graph = builder.compile()

# -------------------------
# API endpoints
# -------------------------
@app.post("/agent/run", response_model=RunResponse)
def run_agent(req: RunRequest):
    try:
        initial_state: AgentState = {
            "session_id": req.session_id,
            "user_goal": req.goal,
            "trace": ["request received"],
        }
        final_state = graph.invoke(initial_state)

        # Save session
        SESSION_STORE[req.session_id] = final_state

        return RunResponse(
            session_id=req.session_id,
            plan=final_state.get("plan", ""),
            code=final_state.get("code", ""),
            execution_output=final_state.get("execution_output", ""),
            execution_error=final_state.get("execution_error", ""),
            final_summary=final_state.get("final_summary", ""),
            trace=final_state.get("trace", []),
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Internal error: {e}")

@app.get("/agent/session/{session_id}")
def get_session(session_id: str):
    data = SESSION_STORE.get(session_id)
    if not data:
        raise HTTPException(status_code=404, detail="Session not found")
    return data
```
Run the server
```bash
uvicorn app:app --reload --port 8000
```
Test endpoint
```bash
curl -X POST http://127.0.0.1:8000/agent/run \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "demo-001",
    "goal": "Create a Python program to calculate student average scores and show pass status if >= 75"
  }'
```
If successful, you will get:
- step plan,
- generated code,
- execution output,
- reviewer summary.
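If you prefer checking from Python instead of curl, here is a stdlib-only sketch that calls the endpoint and prints each field of RunResponse (no extra dependencies assumed):

```python
# Client-side smoke test; field names follow the RunResponse model in app.py.
import json
import urllib.request

payload = json.dumps({
    "session_id": "demo-001",
    "goal": "Create a Python program to calculate student average scores",
}).encode()

req = urllib.request.Request(
    "http://127.0.0.1:8000/agent/run",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
    data = json.loads(resp.read())

for key in ("plan", "code", "execution_output", "execution_error", "final_summary"):
    print(f"--- {key} ---\n{data.get(key, '')}\n")
```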
6) Best Practices — Industry Tips
1. Use spec-first, not random prompts
Before generating code, force the planner to create a plan first. This reduces hallucinations and makes output more consistent.
2. Separate each node’s role
The planner should not code at the same time. The coder should not evaluate quality at the same time. Separation of concerns makes debugging much easier.
3. Store trace and artifacts
Store prompts, node outputs, and errors per step. This is important for audits, observability, and postmortems.
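As a sketch, a richer trace entry could look like this (hypothetical; the demo app stores plain strings):

```python
# Hypothetical structured trace entry with timestamp and truncated artifact,
# better suited for audits and postmortems than plain strings.
import time
from typing import Any, Dict, List

def trace_event(trace: List[Dict[str, Any]], node: str, artifact: str = "") -> List[Dict[str, Any]]:
    entry = {"node": node, "ts": time.time(), "artifact": artifact[:500]}
    return trace + [entry]
```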
4. Apply strict tool policy
Do not give filesystem/network access without a clear business reason. Default posture: deny by default.
5. Evaluate with real test cases
Not just “does the model answer,” but “does the solution pass edge-case scenarios?”
7) Common Mistakes — Frequent Errors
Mistake #1: Assuming agent = chatbot
A chatbot may look good in demos but fail in long workflows. An agent needs state, flow control, and retry mechanisms.
Mistake #2: Not limiting tool execution
If you allow unrestricted exec without guardrails, that is not a feature — it is a security incident waiting to happen.
Mistake #3: No fallback when LLM output is poor
You need a strategy:
- retry with a stricter prompt,
- fallback model,
- or request human approval.
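As a sketch of the first strategy, here is a retry wrapper that tightens the prompt on each attempt (a hypothetical helper; `llm` is the ChatOpenAI instance from app.py, and the validity check is deliberately naive):

```python
# Hypothetical retry helper: re-invoke with a stricter suffix each attempt.
def invoke_with_retry(prompt: str, max_retries: int = 2) -> str:
    strict_suffix = "\nReturn ONLY valid Python code. No prose, no fences."
    for attempt in range(max_retries + 1):
        text = llm.invoke(prompt + strict_suffix * attempt).content.strip()
        if text and "```" not in text:  # naive check; replace with real validation
            return text
    raise ValueError("LLM output failed validation after retries")
```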
Mistake #4: Ignoring token costs
Multi-node workflows can be expensive. Monitor tokens per node, cache stable steps, and use small models for simple tasks.
Mistake #5: Not separating dev/prod mode
In dev you might be loose. In prod, everything must be strict: timeout, sandbox, auth, rate limit, audit log.
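A minimal sketch of what that split can look like (names and values are assumptions, not part of the tutorial's code):

```python
# Hypothetical dev/prod toggle: strict limits in prod, looser ones in dev.
import os

IS_PROD = os.getenv("APP_ENV", "dev") == "prod"
EXEC_TIMEOUT_SECONDS = 3 if IS_PROD else 30
REQUIRE_AUTH = IS_PROD                        # enforce API keys/JWT only in prod
RATE_LIMIT_PER_MINUTE = 30 if IS_PROD else 10_000
```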
8) Advanced Tips — Move to the Next Level
Once the foundation above is running, upgrade it toward production:
- Persistent memory in Redis/Postgres: replace the in-memory store so sessions survive restarts (a sketch follows this list).
- Human-in-the-loop interrupts: add an approval node before risky code execution.
- Multi-agent collaboration: split agents into Architect, Implementer, Reviewer.
- Full-stack observability: integrate tracing to see bottlenecks and failure patterns.
- Policy engine: add explicit rules for what file types may be modified, what commands are forbidden, etc.
- Regression harness: store a benchmark task set so agent quality can be compared across releases.
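For the first item, here is a minimal persistence sketch (assumes redis-py is installed and a Redis server is reachable on localhost; key names and TTL are illustrative):

```python
# Hypothetical replacement for the in-memory SESSION_STORE using redis-py.
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_session(session_id: str, state: dict) -> None:
    # Persist the final state as JSON with a 7-day expiry
    r.set(f"session:{session_id}", json.dumps(state), ex=7 * 24 * 3600)

def load_session(session_id: str) -> dict | None:
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```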
9) Summary & Next Steps
We have built an AI coding agent that:
- is structured with LangGraph,
- follows a planner → coder → executor → reviewer flow,
- enforces execution guardrails,
- exposes an API via FastAPI,
- and stores session state.
Next steps I recommend:
- Add API key/JWT authentication.
- Move the executor to an isolated sandbox container.
- Store trace in observability storage.
- Add automated tests for 20+ coding scenarios.
- Build a simple dashboard to view agent session history.
This way, you move from an “AI demo” to an agent engineering system that is truly usable by a team.
10) References
- LangGraph Overview (Official Docs): https://docs.langchain.com/oss/python/langgraph/overview
- LangGraph Quickstart: https://docs.langchain.com/oss/python/langgraph/quickstart
- LangGraph GitHub Repository: https://github.com/langchain-ai/langgraph
- LangChain Agents Docs: https://docs.langchain.com/oss/python/langchain/agents
- FastAPI Official Docs: https://fastapi.tiangolo.com/
- DEV Community (AI/agents article trends): https://dev.to
- GitHub Trending (daily repo trends): https://github.com/trending
- X Explore (real-time discussions): https://x.com/explore
If you want, in the follow-up tutorial we can discuss a multi-agent coding pipeline version (Architect Agent + Coder Agent + QA Agent) with a per-commit approval flow so it fits production teams. 🚀