Building an AI Coding Agent with LangGraph + FastAPI (Complete Guide 2026)
A comprehensive tutorial for building a stateful AI coding agent that can safely execute Python tools, store conversation memory, and be exposed as a production-ready API with FastAPI.
Level: Intermediate
Estimated read: 15 minutes
Stack: Python, LangGraph, FastAPI, OpenAI API
1) Introduction — What & Why
If you follow developer trends in 2026, one pattern is clear: developers are no longer just writing code manually, but orchestrating agents. On GitHub Trending, projects around AI agents, orchestration, and observability are rising fast. On X (Twitter), discussions about “AI coding workflow,” “agentic development,” and “spec-first coding agents” are also consistently active. On DEV Community, articles about AI agents, tooling, and observability keep appearing.
The question is: why are AI coding agents becoming important?
Because today's software engineering needs go beyond "generate code." An agent must be able to:
- understand project context,
- perform sequential steps (plan → implement → verify),
- maintain state (what has been tried, what errors appeared),
- and stay safe and auditable.
This is where LangGraph becomes relevant. Compared to a simple “prompt → response” loop, LangGraph gives you a graph structure, persistent state, and human-in-the-loop patterns. That means your agent can be more stable for real coding workflows.
In this tutorial, we will build an AI Coding Agent API that can:
- receive coding tasks,
- create a plan,
- execute Python code in a safely constrained way,
- summarize results,
- store per-session history.
You can use the final output as a foundation for your team’s internal coding assistant.
2) Prerequisites — Before You Start
Make sure you already have:
- Python 3.11+
- Basic FastAPI knowledge
- An LLM model API key (example: OpenAI)
- Basic understanding of async Python
Install dependencies:
```bash
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -U fastapi uvicorn pydantic langgraph langchain-openai python-dotenv
```
Create a .env file:
```
OPENAI_API_KEY=sk-xxxx
MODEL_NAME=gpt-4.1-mini
```
If you use another provider, the architecture pattern stays the same. Just replace the model adapter in the LLM layer.
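For example, here is a sketch of swapping in Anthropic (assumes langchain-anthropic is installed and ANTHROPIC_API_KEY is set; the model name is illustrative):

```python
# Hypothetical adapter swap: only the LLM layer changes, the graph stays the same.
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-sonnet-latest", temperature=0)
```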
3) Core Concepts — Foundations You Must Understand
Before coding, make sure you understand these four core concepts.
a) Agent State
Imagine the agent as a chef in a professional kitchen. The chef doesn't just need the latest recipe, but also:
- ingredients already used,
- steps already executed,
- errors that have appeared,
- the next decision.
All of that is state.
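In code, state can be a typed dictionary. Here is a preview of what we will define in full in section 5:

```python
# Preview of agent state as a typed dict: everything the "chef"
# needs to remember between steps. The full version appears in section 5.
from typing import TypedDict, List

class AgentState(TypedDict, total=False):
    user_goal: str          # the latest "recipe"
    plan: str               # steps already decided
    code: str               # what has been produced so far
    execution_error: str    # errors that have appeared
    trace: List[str]        # steps already executed
```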
b) Nodes and Edges (Graph Thinking)
LangGraph works like a metro map:
- Node = work station (planner, coder, reviewer, executor)
- Edge = transition path between stages
Benefits: the flow is more explicit, easier to debug, and easier to add rules.
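A toy sketch to make the metro-map analogy concrete (the real graph comes in section 5):

```python
# Minimal two-node LangGraph: each node is a "work station",
# each edge a transition path between stages.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class MiniState(TypedDict, total=False):
    text: str

def station_a(state: MiniState) -> MiniState:
    return {"text": state.get("text", "") + " -> planned"}

def station_b(state: MiniState) -> MiniState:
    return {"text": state.get("text", "") + " -> coded"}

builder = StateGraph(MiniState)
builder.add_node("planner", station_a)   # node = work station
builder.add_node("coder", station_b)
builder.set_entry_point("planner")
builder.add_edge("planner", "coder")     # edge = transition path
builder.add_edge("coder", END)

print(builder.compile().invoke({"text": "start"}))
# {'text': 'start -> planned -> coded'}
```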
c) Tool Calling
The LLM must not freely access the system. We provide limited “tools,” for example:
- running Python snippets in a mini sandbox,
- reading execution results,
- returning structured output.
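To make that concrete, here is a sketch of what a limited tool's contract can look like (a simplified stand-in for the safe_exec_python sandbox built in section 5):

```python
# Sketch of a narrow tool contract: the model only ever calls this function,
# and it always returns structured output rather than touching the system.
def run_snippet(code: str) -> dict:
    if "import" in code:  # placeholder guardrail; real checks come in section 5
        return {"stdout": "", "error": "imports are not allowed"}
    # constrained execution would happen here
    return {"stdout": "<captured output>", "error": ""}
```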
d) Guardrails
A coding agent without guardrails = high risk. At minimum you need:
- execution time limits (see the sketch after this list),
- builtin whitelist,
- dangerous import blocking,
- input/output validation.
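The time limit is the hardest one to enforce in-process. Here is a minimal hard-timeout sketch (a hypothetical helper, not part of the tutorial's app.py), which runs untrusted code in a separate process and kills it if it exceeds the limit:

```python
# Hypothetical hard-timeout guardrail using a child process.
import multiprocessing

def _run(code: str, queue: "multiprocessing.Queue") -> None:
    try:
        exec(code, {"__builtins__": {"print": print, "range": range}}, {})
        queue.put("ok")
    except Exception as e:
        queue.put(f"{type(e).__name__}: {e}")

def exec_with_timeout(code: str, seconds: int = 3) -> str:
    queue: "multiprocessing.Queue" = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_run, args=(code, queue))
    proc.start()
    proc.join(seconds)
    if proc.is_alive():
        proc.terminate()   # hard stop: the snippet exceeded the time limit
        proc.join()
        return "error: execution timed out"
    return queue.get() if not queue.empty() else "error: no result"

if __name__ == "__main__":  # guard needed on spawn-based platforms
    print(exec_with_timeout("while True:\n    pass", seconds=2))
```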
4) Architecture / Diagram
Our simple architecture looks like this:
```
Client (Web / CLI)
        |
        v
FastAPI /agent/run
        |
        v
+-------------------------+
|    LangGraph Runtime    |
|-------------------------|
|  1) Planner Node        |
|  2) Coder Node          |
|  3) Python Tool Node    |
|  4) Reviewer Node       |
+-------------------------+
        |
        v
Session Memory (in-memory / DB later)
```
Flow:
- The user sends an objective.
- The planner breaks the task into steps.
- The coder generates a code snippet.
- The tool node executes the snippet with a constrained sandbox.
- The reviewer summarizes results + next action.
- All steps are stored in session state.
5) Step-by-Step Implementation (Runnable)
In this section we build a complete app you can run directly.
File structure
```
coding-agent/
  app.py
  .env
```
Full app.py code
```python
import os
import io
import contextlib
from typing import TypedDict, List, Dict, Any

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from dotenv import load_dotenv
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

# -------------------------
# Setup
# -------------------------
load_dotenv()

MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4.1-mini")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

if not OPENAI_API_KEY:
    raise RuntimeError("OPENAI_API_KEY not found in environment")

llm = ChatOpenAI(model=MODEL_NAME, temperature=0)

app = FastAPI(title="AI Coding Agent API", version="1.0.0")

# Simple session memory (demo)
SESSION_STORE: Dict[str, Dict[str, Any]] = {}

# -------------------------
# State definitions
# -------------------------
class AgentState(TypedDict, total=False):
    session_id: str
    user_goal: str
    plan: str
    code: str
    execution_output: str
    execution_error: str
    final_summary: str
    trace: List[str]

class RunRequest(BaseModel):
    session_id: str = Field(..., min_length=1)
    goal: str = Field(..., min_length=5)

class RunResponse(BaseModel):
    session_id: str
    plan: str
    code: str
    execution_output: str
    execution_error: str
    final_summary: str
    trace: List[str]

# -------------------------
# Safe python executor
# -------------------------
def safe_exec_python(code: str, timeout_hint_seconds: int = 3) -> Dict[str, str]:
    """Runs Python code with minimum guardrails.

    NOTE: for production, use a separate sandbox (container/VM).
    The timeout parameter is advisory only; it is not enforced in this demo.
    """
    blocked_keywords = [
        "import os", "import sys", "import subprocess", "import socket",
        "open(", "__import__", "eval(", "exec(",
        "from os", "from subprocess",
    ]
    lowered = code.lower()
    for bad in blocked_keywords:
        if bad in lowered:
            return {
                "stdout": "",
                "error": f"Blocked by guardrail: forbidden keyword detected -> {bad}",
            }

    safe_builtins = {
        "print": print, "len": len, "range": range, "sum": sum,
        "min": min, "max": max, "enumerate": enumerate, "sorted": sorted,
        "str": str, "int": int, "float": float,
        "list": list, "dict": dict, "set": set, "tuple": tuple,
    }

    stdout_buffer = io.StringIO()
    local_env: Dict[str, Any] = {}

    try:
        with contextlib.redirect_stdout(stdout_buffer):
            exec(code, {"__builtins__": safe_builtins}, local_env)
        return {"stdout": stdout_buffer.getvalue(), "error": ""}
    except Exception as e:
        return {"stdout": stdout_buffer.getvalue(), "error": f"{type(e).__name__}: {e}"}

# -------------------------
# Graph nodes
# -------------------------
def planner_node(state: AgentState) -> AgentState:
    prompt = f"""
You are a software planner.
Create a 3-5 step plan to complete the following goal:

{state['user_goal']}

Use concise bullet-point format.
"""
    plan = llm.invoke(prompt).content
    trace = state.get("trace", []) + ["planner_node completed"]
    return {"plan": plan, "trace": trace}

def coder_node(state: AgentState) -> AgentState:
    prompt = f"""
You are a Python engineer.
Based on the goal and plan below, write ONE runnable Python snippet.

Goal: {state['user_goal']}
Plan: {state['plan']}

Constraints:
- Do not import external modules.
- Must include error handling (try/except).
- Must print the final result.

Return ONLY Python code, without markdown fences.
"""
    code = llm.invoke(prompt).content.strip()
    trace = state.get("trace", []) + ["coder_node completed"]
    return {"code": code, "trace": trace}

def tool_exec_node(state: AgentState) -> AgentState:
    result = safe_exec_python(state.get("code", ""))
    trace = state.get("trace", []) + ["tool_exec_node completed"]
    return {
        "execution_output": result.get("stdout", ""),
        "execution_error": result.get("error", ""),
        "trace": trace,
    }

def reviewer_node(state: AgentState) -> AgentState:
    prompt = f"""
You are a technical reviewer.
Summarize the following code execution result.

Goal: {state['user_goal']}
Output: {state.get('execution_output', '')}
Error: {state.get('execution_error', '')}

Provide:
1) Status (Success/Failed)
2) Brief analysis
3) Improvement suggestions
"""
    final_summary = llm.invoke(prompt).content
    trace = state.get("trace", []) + ["reviewer_node completed"]
    return {"final_summary": final_summary, "trace": trace}

# -------------------------
# Build graph
# -------------------------
builder = StateGraph(AgentState)
builder.add_node("planner", planner_node)
builder.add_node("coder", coder_node)
builder.add_node("executor", tool_exec_node)
builder.add_node("reviewer", reviewer_node)

builder.set_entry_point("planner")
builder.add_edge("planner", "coder")
builder.add_edge("coder", "executor")
builder.add_edge("executor", "reviewer")
builder.add_edge("reviewer", END)

graph = builder.compile()

# -------------------------
# API endpoints
# -------------------------
@app.post("/agent/run", response_model=RunResponse)
def run_agent(req: RunRequest):
    try:
        initial_state: AgentState = {
            "session_id": req.session_id,
            "user_goal": req.goal,
            "trace": ["request received"],
        }
        final_state = graph.invoke(initial_state)

        # Save session
        SESSION_STORE[req.session_id] = final_state

        return RunResponse(
            session_id=req.session_id,
            plan=final_state.get("plan", ""),
            code=final_state.get("code", ""),
            execution_output=final_state.get("execution_output", ""),
            execution_error=final_state.get("execution_error", ""),
            final_summary=final_state.get("final_summary", ""),
            trace=final_state.get("trace", []),
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Internal error: {e}")

@app.get("/agent/session/{session_id}")
def get_session(session_id: str):
    data = SESSION_STORE.get(session_id)
    if not data:
        raise HTTPException(status_code=404, detail="Session not found")
    return data
```
Run the server
```bash
uvicorn app:app --reload --port 8000
```
Test endpoint
```bash
curl -X POST http://127.0.0.1:8000/agent/run \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "demo-001",
    "goal": "Create a Python program to calculate student average scores and show pass status if >= 75"
  }'
```
If successful, you will get:
- step plan,
- generated code,
- execution output,
- reviewer summary.
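If you prefer checking from Python instead of curl, here is a stdlib-only sketch that calls the endpoint and prints each field of RunResponse (no extra dependencies assumed):

```python
# Client-side smoke test; field names follow the RunResponse model in app.py.
import json
import urllib.request

payload = json.dumps({
    "session_id": "demo-001",
    "goal": "Create a Python program to calculate student average scores",
}).encode()

req = urllib.request.Request(
    "http://127.0.0.1:8000/agent/run",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
    data = json.loads(resp.read())

for key in ("plan", "code", "execution_output", "execution_error", "final_summary"):
    print(f"--- {key} ---\n{data.get(key, '')}\n")
```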
6) Best Practices — Industry Tips
1. Use spec-first, not random prompts
Before generating code, force the planner to create a plan first. This reduces hallucinations and makes output more consistent.
2. Separate each node’s role
The planner should not code at the same time. The coder should not evaluate quality at the same time. Separation of concerns makes debugging much easier.
3. Store trace and artifacts
Store prompts, node outputs, and errors per step. This is important for audits, observability, and postmortems.
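As a sketch, a richer trace entry could look like this (hypothetical; the demo app stores plain strings):

```python
# Hypothetical structured trace entry with timestamp and truncated artifact,
# better suited for audits and postmortems than plain strings.
import time
from typing import Any, Dict, List

def trace_event(trace: List[Dict[str, Any]], node: str, artifact: str = "") -> List[Dict[str, Any]]:
    entry = {"node": node, "ts": time.time(), "artifact": artifact[:500]}
    return trace + [entry]
```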
4. Apply strict tool policy
Do not give filesystem/network access without a clear business reason. Default posture: deny by default.
5. Evaluate with real test cases
Not just “does the model answer,” but “does the solution pass edge-case scenarios?”
7) Common Mistakes — Frequent Errors
Mistake #1: Assuming agent = chatbot
A chatbot may look good in demos but fail in long workflows. An agent needs state, flow control, and retry mechanisms.
Mistake #2: Not limiting tool execution
If you allow unrestricted exec without guardrails, that is not a feature — it is a security incident waiting to happen.
Mistake #3: No fallback when LLM output is poor
You need a strategy:
- retry with a stricter prompt,
- fallback model,
- or request human approval.
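As a sketch of the first strategy, here is a retry wrapper that tightens the prompt on each attempt (a hypothetical helper; `llm` is the ChatOpenAI instance from app.py, and the validity check is deliberately naive):

```python
# Hypothetical retry helper: re-invoke with a stricter suffix each attempt.
def invoke_with_retry(prompt: str, max_retries: int = 2) -> str:
    strict_suffix = "\nReturn ONLY valid Python code. No prose, no fences."
    for attempt in range(max_retries + 1):
        text = llm.invoke(prompt + strict_suffix * attempt).content.strip()
        if text and "```" not in text:  # naive check; replace with real validation
            return text
    raise ValueError("LLM output failed validation after retries")
```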
Mistake #4: Ignoring token costs
Multi-node workflows can be expensive. Monitor tokens per node, cache stable steps, and use small models for simple tasks.
Mistake #5: Not separating dev/prod mode
In dev you might be loose. In prod, everything must be strict: timeout, sandbox, auth, rate limit, audit log.
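A minimal sketch of what that split can look like (names and values are assumptions, not part of the tutorial's code):

```python
# Hypothetical dev/prod toggle: strict limits in prod, looser ones in dev.
import os

IS_PROD = os.getenv("APP_ENV", "dev") == "prod"
EXEC_TIMEOUT_SECONDS = 3 if IS_PROD else 30
REQUIRE_AUTH = IS_PROD                        # enforce API keys/JWT only in prod
RATE_LIMIT_PER_MINUTE = 30 if IS_PROD else 10_000
```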
8) Advanced Tips — Move to the Next Level
Once the foundation above is running, upgrade it toward production:
- Persistent memory in Redis/Postgres: replace the in-memory store so sessions survive restarts (a sketch follows this list).
- Human-in-the-loop interrupts: add an approval node before risky code execution.
- Multi-agent collaboration: split agents into Architect, Implementer, Reviewer.
- Full-stack observability: integrate tracing to see bottlenecks and failure patterns.
- Policy engine: add explicit rules for what file types may be modified, what commands are forbidden, etc.
- Regression harness: store a benchmark task set so agent quality can be compared across releases.
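For the first item, here is a minimal persistence sketch (assumes redis-py is installed and a Redis server is reachable on localhost; key names and TTL are illustrative):

```python
# Hypothetical replacement for the in-memory SESSION_STORE using redis-py.
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_session(session_id: str, state: dict) -> None:
    # Persist the final state as JSON with a 7-day expiry
    r.set(f"session:{session_id}", json.dumps(state), ex=7 * 24 * 3600)

def load_session(session_id: str) -> dict | None:
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```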
9) Summary & Next Steps
We have built an AI coding agent that:
- is structured with LangGraph,
- follows a planner → coder → executor → reviewer flow,
- enforces execution guardrails,
- exposes an API via FastAPI,
- and stores session state.
Next steps I recommend:
- Add API key/JWT authentication.
- Move the executor to an isolated sandbox container.
- Store trace in observability storage.
- Add automated tests for 20+ coding scenarios.
- Build a simple dashboard to view agent session history.
This way, you move from an “AI demo” to an agent engineering system that is truly usable by a team.
10) References
- LangGraph Overview (Official Docs): https://docs.langchain.com/oss/python/langgraph/overview
- LangGraph Quickstart: https://docs.langchain.com/oss/python/langgraph/quickstart
- LangGraph GitHub Repository: https://github.com/langchain-ai/langgraph
- LangChain Agents Docs: https://docs.langchain.com/oss/python/langchain/agents
- FastAPI Official Docs: https://fastapi.tiangolo.com/
- DEV Community (AI/agents article trends): https://dev.to
- GitHub Trending (daily repo trends): https://github.com/trending
- X Explore (real-time discussions): https://x.com/explore
If you want, in the follow-up tutorial we can discuss a multi-agent coding pipeline version (Architect Agent + Coder Agent + QA Agent) with a per-commit approval flow so it fits production teams. 🚀