Complete Tutorial: Building an Internal AI Coding Agent with Next.js (2026)

Reading time: about 15 minutes
Target audience: Web developers who want to build an internal AI agent for engineering workflows.

1) Introduction — What and Why

Have you ever thought, “Why does a simple bug still take 2–3 hours?” or “Why does issue context in Slack/Linear/GitHub often get scattered?”

In 2026, engineering trends are moving toward internal coding agents: bots/agents that can read issues, run commands in isolated environments, make code changes, and then open PRs automatically.

Recent trend signals point the same way:

GitHub Trending shows a surge in agentic engineering projects (e.g., langchain-ai/open-swe)
X Explore is full of discussions about agent orchestration, sandboxing, and reliability
DEV Community highlights topics like #ai, #webdev, #automation, #typescript

The takeaway is clear: developers need more than a “chatbot”; they need an agent that can do real work within safe boundaries.

In this tutorial, you will build a simple yet production-minded version of an AI coding agent based on Next.js App Router + AI SDK. Our focus:

Secure architecture (sandbox-first)
Controlled tool calling
An issue → execution → auditable results flow

2) Prerequisites

Before starting, prepare:

Node.js 20+ (22 recommended)
pnpm or npm
Basic understanding of Next.js App Router
Familiarity with REST API and async/await
LLM provider API key (example: OpenAI/Vercel AI Gateway)

Optional but very helpful:

Docker (for sandbox command runner)
Redis/Postgres (for queue and persistence)

3) Core Concepts — Fundamentals with Analogies

Imagine you have a super-fast junior project manager:

They read issues
They plan the steps
They delegate small tasks
They report results

But if you give them root access to a production server, that’s dangerous. So the core concepts are:

a) Orchestrator

The main brain that decides the agent’s work sequence. Think of it as an “orchestra conductor.”

b) Tools

The agent’s hands. Examples: read files, run commands, fetch URLs, create PRs.

c) Sandbox

An isolated workspace. If an error happens, the “explosion” won’t hit the main system.

d) Memory & Context

The agent needs issue context, comments, and repo rules (AGENTS.md or internal policy).

e) Guardrails

Boundary rules: command allowlist, time limits, token limits, and allowed editable paths.

Without guardrails, the agent can be fast but wild. With guardrails, the agent may be slightly slower but much safer.
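The guardrails listed above can live in a single policy object so every limit is defined in one place. The following is a minimal sketch, not part of the tutorial's code: the `Guardrails` type, its field names, the example values, and `checkWritePath` are all hypothetical.

```typescript
import * as path from "node:path";

// Hypothetical shape for the guardrails named above; field names are assumptions.
interface Guardrails {
  allowedCommands: readonly string[]; // command allowlist
  commandTimeoutMs: number;           // time limit per command
  maxOutputTokens: number;            // token limit per model call
  editablePaths: readonly string[];   // directories the agent may write to
}

const guardrails: Guardrails = {
  allowedCommands: ["pnpm test", "pnpm lint"],
  commandTimeoutMs: 60_000,
  maxOutputTokens: 4_000,
  editablePaths: ["src", "tests"],
};

// Returns true when a relative path falls inside one of the editable directories.
function checkWritePath(relPath: string, rules: Guardrails): boolean {
  const normalized = path.posix.normalize(relPath.replace(/\\/g, "/"));
  if (normalized.startsWith("..") || path.posix.isAbsolute(normalized)) return false;
  return rules.editablePaths.some(
    (dir) => normalized === dir || normalized.startsWith(dir + "/"),
  );
}
```

With these example values, `checkWritePath("src/app.ts", guardrails)` is accepted, while `"src/../.env"` is rejected because it normalizes to a path outside the editable directories.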

4) Architecture / Diagram

Here is the architecture we will implement:

┌─────────────────────────────────────────────────────────┐
│                   Frontend (Next.js)                   │
│   Chat UI / Issue Input / Execution Timeline / Logs    │
└───────────────────────┬─────────────────────────────────┘
                        │ POST /api/agent/run
                        ▼
┌─────────────────────────────────────────────────────────┐
│                 Agent Orchestrator API                 │
│  - Build prompt/context                                │
│  - Call LLM (stream)                                   │
│  - Parse tool calls                                    │
│  - Enforce policies                                    │
└───────────────┬───────────────────────┬─────────────────┘
                │                       │
                ▼                       ▼
     ┌─────────────────────┐   ┌────────────────────────┐
     │ Tool Executor       │   │ Context Provider       │
     │ - read/write file   │   │ - issue details        │
     │ - run command       │   │ - repo rules           │
     │ - git diff summary  │   │ - conversation memory  │
     └──────────┬──────────┘   └────────────────────────┘
                │
                ▼
      ┌──────────────────────┐
      │ Isolated Sandbox     │
      │ (ephemeral runtime)  │
      └──────────────────────┘

Short flow:

User sends a task
Orchestrator asks the model for a plan
Model calls tools as needed
Tools execute in the sandbox
Results are summarized + returned to the UI
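The flow above is essentially a bounded loop. The sketch below models it with a scripted stand-in for the LLM so it runs without an API key. It is deliberately simplified (synchronous, string-only tools), and every name in it (`runAgentLoop`, `scriptedModel`, `demoTools`) is hypothetical; the real implementation later in this tutorial delegates the loop to the AI SDK.

```typescript
// Simplified, synchronous model of the flow; a real agent would be async and call an LLM.
type ToolCall = { tool: string; input: string };
type ModelTurn = { toolCall?: ToolCall; finalText?: string };
type Model = (history: string[]) => ModelTurn;
type Tools = Record<string, (input: string) => string>;

function runAgentLoop(task: string, model: Model, tools: Tools, maxSteps: number): string {
  const history: string[] = [`task: ${task}`];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history);
    if (turn.finalText !== undefined) return turn.finalText; // model is done
    if (!turn.toolCall) break; // neither a tool call nor an answer
    const run = tools[turn.toolCall.tool];
    // Unknown tools are reported back to the model instead of crashing the run.
    const output = run ? run(turn.toolCall.input) : `unknown tool: ${turn.toolCall.tool}`;
    history.push(`${turn.toolCall.tool}(${turn.toolCall.input}) -> ${output}`);
  }
  return "Stopped without a final answer";
}

// Scripted stand-in for the LLM: read one file, then summarize.
const scriptedModel: Model = (history) =>
  history.length === 1
    ? { toolCall: { tool: "read_file", input: "package.json" } }
    : { finalText: `Summary based on ${history.length - 1} tool result(s)` };

const demoTools: Tools = { read_file: (p) => `contents of ${p}` };

console.log(runAgentLoop("Explain dependencies", scriptedModel, demoTools, 6));
// prints "Summary based on 1 tool result(s)"
```

The step limit is what keeps a confused model from looping forever, which is why the orchestrator owns it rather than the model.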

5) Step-by-Step Implementation (Runnable)

We’ll build a minimal project that still works end to end.

Step 1 — Initialize project

pnpm create next-app@latest ai-coding-agent --ts --app --eslint
cd ai-coding-agent
pnpm add ai @ai-sdk/openai zod

Create .env.local:

OPENAI_API_KEY=your_api_key_here
AGENT_MAX_STEPS=6

Step 2 — Define policy and tool schema

Create file lib/agent-policy.ts:

// lib/agent-policy.ts
export const ALLOWED_COMMANDS = [
  "npm test",
  "pnpm test",
  "npm run lint",
  "pnpm lint",
  "npm run build",
  "pnpm build",
] as const;

export function isAllowedCommand(cmd: string): boolean {
  // Exact match only: a prefix check would also accept "npm test; rm -rf /".
  const normalized = cmd.trim();
  return ALLOWED_COMMANDS.some((allowed) => allowed === normalized);
}

export function sanitizePath(inputPath: string): string {
  // Reject any ".." path segment, with either slash style.
  if (inputPath.split(/[\\/]+/).includes("..")) {
    throw new Error("Path traversal detected");
  }
  return inputPath.replace(/^\/+/, "");
}

Step 3 — Build tool executor with error handling

Create lib/tools.ts:

// lib/tools.ts
import { promises as fs } from "fs";
import path from "path";
import { exec as execCb } from "child_process";
import { promisify } from "util";
import { isAllowedCommand, sanitizePath } from "./agent-policy";

const exec = promisify(execCb);
const WORKSPACE_ROOT = path.join(process.cwd(), "workspace");

export async function readFileTool(filePath: string): Promise<string> {
  const safePath = sanitizePath(filePath);
  const fullPath = path.join(WORKSPACE_ROOT, safePath);

  try {
    const content = await fs.readFile(fullPath, "utf8");
    return content;
  } catch (error) {
    throw new Error(`Failed to read file: ${String(error)}`);
  }
}

export async function writeFileTool(filePath: string, content: string): Promise<string> {
  const safePath = sanitizePath(filePath);
  const fullPath = path.join(WORKSPACE_ROOT, safePath);

  try {
    await fs.mkdir(path.dirname(fullPath), { recursive: true });
    await fs.writeFile(fullPath, content, "utf8");
    return `Wrote ${safePath}`;
  } catch (error) {
    throw new Error(`Failed to write file: ${String(error)}`);
  }
}

export async function runCommandTool(command: string): Promise<string> {
  if (!isAllowedCommand(command)) {
    throw new Error(`Command not allowed by policy: ${command}`);
  }

  try {
    const { stdout, stderr } = await exec(command, {
      cwd: WORKSPACE_ROOT,
      timeout: 60_000,
      env: { ...process.env, CI: "true" },
      maxBuffer: 2 * 1024 * 1024,
    });

    return [stdout, stderr].filter(Boolean).join("\n").trim() || "Command finished with no output";
  } catch (error) {
    throw new Error(`Command execution failed: ${String(error)}`);
  }
}

Step 4 — Agent API route with tool calling

Create app/api/agent/run/route.ts:

// app/api/agent/run/route.ts
import { openai } from "@ai-sdk/openai";
import { generateText, stepCountIs, tool } from "ai";
import { z } from "zod";
import { readFileTool, writeFileTool, runCommandTool } from "@/lib/tools";

const SYSTEM_PROMPT = `
You are an internal AI coding agent.
Priorities: security, clarity, reproducibility.
Always explain your plan before executing.
Do not use commands outside the allowlist.
`;

export async function POST(req: Request) {
  try {
    const body = await req.json();
    const task = String(body?.task ?? "").trim();

    if (!task) {
      return Response.json({ error: "Task is required" }, { status: 400 });
    }

    const result = await generateText({
      model: openai("gpt-5.1"),
      system: SYSTEM_PROMPT,
      prompt: `Task: ${task}`,
      tools: {
        read_file: tool({
          description: "Read a file from the workspace",
          inputSchema: z.object({ path: z.string().min(1) }),
          execute: async ({ path }) => {
            return await readFileTool(path);
          },
        }),
        write_file: tool({
          description: "Write a file to the workspace",
          inputSchema: z.object({ path: z.string().min(1), content: z.string() }),
          execute: async ({ path, content }) => {
            return await writeFileTool(path, content);
          },
        }),
        run_command: tool({
          description: "Run a command permitted by the policy",
          inputSchema: z.object({ command: z.string().min(1) }),
          execute: async ({ command }) => {
            return await runCommandTool(command);
          },
        }),
      },
      // AI SDK 5 and later cap the tool loop with stopWhen; the older maxSteps option no longer applies.
      stopWhen: stepCountIs(Number(process.env.AGENT_MAX_STEPS ?? 6)),
    });

    return Response.json({
      ok: true,
      text: result.text,
      steps: result.steps?.length ?? 0,
    });
  } catch (error) {
    console.error("[agent-run-error]", error);
    return Response.json(
      { ok: false, error: error instanceof Error ? error.message : "Unknown error" },
      { status: 500 }
    );
  }
}

Step 5 — Simple UI to test the agent

Create app/page.tsx:

"use client";

import { useState } from "react";

export default function HomePage() {
  const [task, setTask] = useState("Summarize the project structure and recommend initial improvements");
  const [result, setResult] = useState("");
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState("");

  async function runAgent() {
    setLoading(true);
    setError("");
    setResult("");

    try {
      const res = await fetch("/api/agent/run", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ task }),
      });

      const data = await res.json();
      if (!res.ok || !data.ok) {
        throw new Error(data.error ?? "Failed to run the agent");
      }

      setResult(data.text);
    } catch (err) {
      setError(err instanceof Error ? err.message : "An error occurred");
    } finally {
      setLoading(false);
    }
  }

  return (
    <main style={{ maxWidth: 860, margin: "32px auto", fontFamily: "sans-serif" }}>
      <h1>Internal AI Coding Agent</h1>
      <p>Enter an engineering task, then run the agent within safe limits.</p>

      <textarea
        value={task}
        onChange={(e) => setTask(e.target.value)}
        rows={6}
        style={{ width: "100%", padding: 12 }}
      />

      <button onClick={runAgent} disabled={loading} style={{ marginTop: 12, padding: "10px 16px" }}>
        {loading ? "Running..." : "Run Agent"}
      </button>

      {error && <p style={{ color: "crimson" }}>Error: {error}</p>}

      {result && (
        <section style={{ marginTop: 24 }}>
          <h2>Result</h2>
          <pre style={{ whiteSpace: "pre-wrap", background: "#f5f5f5", padding: 12 }}>{result}</pre>
        </section>
      )}
    </main>
  );
}

Run:

mkdir -p workspace
pnpm dev

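Open http://localhost:3000. The workspace directory starts out empty, so copy or clone a small project into it first; otherwise the tools have nothing to read or run. Then test simple tasks like: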

“Read package.json and explain the main dependencies”
“Run pnpm lint and summarize the errors”

6) Best Practices — Industry Tips

Sandbox-first, permission-later: never execute agent commands directly on the production host.
Tool curation > tool quantity: it’s better to have 5 solid tools than 30 ambiguous tools.
Deterministic middleware: add guaranteed post-run steps (e.g., always summarize diff, always attach logs).
Observability is mandatory: store traces (prompt, tool call, stdout/stderr, duration, token usage).
Policy versioning: store allowlist policies as versioned files for easier audits.
Human-in-the-loop for high-risk actions: for sensitive changes (auth, billing, security), manual approval is mandatory.
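One way to make the observability tip concrete is to wrap each tool so that every call is recorded, whether it succeeds or fails. This is a minimal sketch under assumptions: `ToolTrace`, `traces`, and `withTrace` are hypothetical names, and the in-memory array stands in for whatever store (Postgres, a log pipeline) you would use in practice.

```typescript
// Hypothetical trace record; field names are assumptions for illustration.
interface ToolTrace {
  tool: string;
  input: unknown;
  ok: boolean;
  output: string;
  durationMs: number;
}

const traces: ToolTrace[] = []; // in-memory sink; a real system would persist these

// Wraps an async tool so every call is recorded, whether it succeeds or throws.
function withTrace<I>(
  tool: string,
  fn: (input: I) => Promise<string>,
): (input: I) => Promise<string> {
  return async (input: I) => {
    const started = Date.now();
    try {
      const output = await fn(input);
      traces.push({ tool, input, ok: true, output, durationMs: Date.now() - started });
      return output;
    } catch (error) {
      traces.push({ tool, input, ok: false, output: String(error), durationMs: Date.now() - started });
      throw error; // the caller still sees the original failure
    }
  };
}
```

A tool such as `readFileTool` could then be registered as `withTrace("read_file", readFileTool)`, so tracing stays out of the tool's own logic.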

7) Common Mistakes — What Often Causes Failure

Mistake 1: Giving unlimited shell access

Impact: the agent can run destructive commands.
Solution: command allowlist + timeout + maxBuffer.
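A stricter variant of the allowlist avoids the shell entirely. The tutorial's `runCommandTool` passes a string to `exec`, which runs it through a shell; the sketch below instead maps short names to a fixed binary plus fixed arguments and runs them with Node's `execFile`. The names `COMMANDS`, `resolveCommand`, and `runNamedCommand` are hypothetical, and on Windows `pnpm` may need to be invoked differently.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

type AllowedCommand = { file: string; args: string[] };

// Hypothetical allowlist: the model picks a name, never a raw command string.
const COMMANDS = new Map<string, AllowedCommand>([
  ["test", { file: "pnpm", args: ["test"] }],
  ["lint", { file: "pnpm", args: ["lint"] }],
]);

function resolveCommand(name: string): AllowedCommand | null {
  return COMMANDS.get(name) ?? null;
}

async function runNamedCommand(name: string, cwd: string): Promise<string> {
  const cmd = resolveCommand(name);
  if (!cmd) throw new Error(`Command not in allowlist: ${name}`);
  // execFile does not spawn a shell, so ";", "&&" and "$(...)" are never interpreted.
  const { stdout, stderr } = await execFileAsync(cmd.file, cmd.args, {
    cwd,
    timeout: 60_000,
    maxBuffer: 2 * 1024 * 1024,
  });
  return [stdout, stderr].filter(Boolean).join("\n").trim();
}
```

Because the lookup is by name, an input like `"test; rm -rf /"` simply fails to resolve instead of reaching a shell.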

Mistake 2: Not limiting file paths

Impact: path traversal (../../).
Solution: sanitize paths, use an explicit workspace root.
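Checking for `..` in the input string catches the obvious cases, but a more robust check resolves the path first and then confirms that the result is still inside the workspace. The helper below is a sketch with a hypothetical name (`resolveInsideWorkspace`); it does not follow symlinks, so a production version would likely also compare real paths.

```typescript
import * as path from "node:path";

// Resolves a user-supplied path against the workspace root and rejects anything outside it.
function resolveInsideWorkspace(workspaceRoot: string, userPath: string): string {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, userPath);
  const relative = path.relative(root, target);
  // An escaping path yields a relative path that starts with ".." or is itself absolute.
  if (relative === ".." || relative.startsWith(".." + path.sep) || path.isAbsolute(relative)) {
    throw new Error(`Path escapes workspace: ${userPath}`);
  }
  return target;
}
```

This accepts `src/a.ts` and a file named `notes..txt`, while rejecting `../secret`, `src/../../x`, and absolute paths such as `/etc/passwd`.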

Mistake 3: Prompt too generic

Impact: “hallucinated” output or non-actionable results.
Solution: add issue context, constraints, and a definition of done.

Mistake 4: No retry strategy

Impact: runs fail due to temporary errors.
Solution: controlled retries for transient network/tool errors.
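A controlled retry usually means three things: a cap on attempts, a growing delay between them, and a rule for which errors are worth retrying. The sketch below shows one possible shape; `withRetry` and its option names are hypothetical, and how you classify an error as transient depends on your provider and tools.

```typescript
// Hypothetical helper: retries an async operation with exponential backoff.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: {
    maxAttempts: number;
    baseDelayMs: number;
    isTransient: (error: unknown) => boolean;
  },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Permanent errors (e.g. a policy violation) and the final attempt are not retried.
      if (!options.isTransient(error) || attempt === options.maxAttempts) break;
      const delayMs = options.baseDelayMs * 2 ** (attempt - 1); // 1x, 2x, 4x, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

Policy errors such as a rejected command should never be marked transient, since retrying them only wastes steps and tokens.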

Mistake 5: Ignoring token cost

Impact: costs balloon without business value.
Solution: limit max steps, compress context, cache retrieval results.
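One cheap form of context compression is trimming long tool output before it goes back to the model. The sketch below keeps the head and the tail, on the assumption that build and test logs tend to put the decisive lines at the end. `truncateMiddle` is a hypothetical helper, and it counts characters rather than tokens, so it is only a rough proxy for token cost.

```typescript
// Keeps the head and tail of long tool output and drops the middle.
function truncateMiddle(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const marker = "\n[... output truncated ...]\n";
  // If maxChars is smaller than the marker, only the marker is returned.
  const keep = Math.max(0, maxChars - marker.length);
  const head = Math.ceil(keep / 2);
  const tail = keep - head;
  // slice(-0) would return the whole string, so guard the tail explicitly.
  return text.slice(0, head) + marker + (tail > 0 ? text.slice(-tail) : "");
}
```

In `runCommandTool`, this could be applied to the combined stdout and stderr just before returning the result.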

8) Advanced Tips — If You Want to Go Deeper

a) Multi-agent pattern

Separate roles:

Planner agent (creates plan)
Executor agent (runs tools)
Reviewer agent (quality gate)

b) Event-driven queue

Use a queue (BullMQ/Kafka/SQS) so agent tasks don’t block the main web request.

c) PR-ready workflow

Add an automated pipeline:

Generate patch
Run test/lint
Generate changelog
Open draft PR

d) Long-term memory

Store learned lessons per repo (e.g., architecture patterns, style rules) for future runs.

e) Safety scoring

Assign risk scores to every action (read-only, write, command, network). Require approval if the score is high.
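As an illustration of safety scoring, the sketch below gives each action kind a base score and adds a bonus when the target looks sensitive. The scores, the pattern, the threshold, and all names here are hypothetical; they are starting values to tune, not a recommended risk model.

```typescript
type ActionKind = "read" | "write" | "command" | "network";

interface AgentAction {
  kind: ActionKind;
  target: string; // file path, command, or URL
}

// Hypothetical base scores and threshold; tune them to your own risk model.
const BASE_SCORE: Record<ActionKind, number> = { read: 1, write: 3, command: 4, network: 4 };
const SENSITIVE_PATTERN = /(auth|billing|secret|\.env)/i;
const APPROVAL_THRESHOLD = 5;

function riskScore(action: AgentAction): number {
  const sensitiveBonus = SENSITIVE_PATTERN.test(action.target) ? 3 : 0;
  return BASE_SCORE[action.kind] + sensitiveBonus;
}

// Actions at or above the threshold should pause and wait for a human decision.
function requiresApproval(action: AgentAction): boolean {
  return riskScore(action) >= APPROVAL_THRESHOLD;
}
```

With these values, reading `README.md` scores 1 and runs freely, while writing `src/auth/session.ts` scores 6 and would wait for approval.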

9) Summary & Next Steps

We have built the foundation of an internal AI coding agent that:

Is based on modern Next.js
Uses structured tool calling
Has basic security guardrails
Can be run and tested in real scenarios

If you want to continue to production level, the priority order is:

Add sandbox container runtime (Docker/Firecracker)
Add observability + trace dashboard
Integrate issue tracker (Linear/Jira) and GitHub PR automation
Implement role-based approval

Remember: a great agent is not the one that is “the smartest,” but the one that is the most reliable.

10) References

GitHub Trending: https://github.com/trending
Open SWE (repo): https://github.com/langchain-ai/open-swe
Open SWE announcement: https://blog.langchain.com/open-swe-an-open-source-framework-for-internal-coding-agents/
LangGraph Overview: https://docs.langchain.com/oss/python/langgraph/overview
Vercel AI SDK Next.js Quickstart: https://ai-sdk.dev/docs/getting-started/nextjs-app-router
GitHub Copilot Agents docs: https://docs.github.com/en/copilot/how-tos/use-copilot-agents
OpenAI Agents guide: https://developers.openai.com/api/docs/guides/agents
MCP Introduction: https://modelcontextprotocol.io/introduction

If you want, after this we can build Part 2: full “issue-to-PR automation” implementation complete with a draft PR generator and an automatic reviewer checklist.