examples/agents + swarms
Strong tier · 14 min read

Build your own Claude Code — 300-line local coding agent

A ReAct loop with four tools — read, plan, edit, test — that reads your repo, makes multi-file patches, runs the tests, and iterates on failures. One AIgateway key, any model slug, full control. No vendor lock-in.

14 min read · published 2026-04-25 · category: Agents + swarms
[Image: Terminal view of a coding agent reading, planning, editing, and testing a repository in a ReAct loop]

Claude Code and Cursor have trained everyone to expect an agent that knows your repo, plans multi-file changes, runs your tests, and iterates on failures. The good news: the architecture is not mysterious. It's a ReAct loop — reason + act — wrapped around four tools.

This example is the whole thing in 300 lines of Python. The loop, the tools, the diff formatting, the test runner. Point it at Kimi K2.6 and it runs on the free tier; point it at Claude Opus 4.7 and you get something within shouting distance of Claude Code itself. No vendor lock-in, no proprietary surface — it's your agent.

AIgateway key · Python 3.11+ · subprocess · rich (pretty terminal) · unified diff
Note
Why roll your own? Control. You pick the model, the prompt, the tool surface, and the auto-approve policy. Everything the vendor hides becomes a knob. It works in your language and fits your team.

Build it in five steps

  1. STEP 01

    Declare the tools

    Four is enough for most coding tasks — read files, edit files, run shell commands, run tests. Each is a Python function plus a JSON schema the LLM can call. More tools are a distraction until the core loop works.

    TOOLS = [
        {"type": "function", "function": {
            "name": "read_file",
            "description": "Return the full text of a file, optionally a line range.",
            "parameters": {"type": "object",
                "properties": {"path": {"type": "string"}, "start": {"type": "integer"},
                               "end": {"type": "integer"}},
                "required": ["path"]}}},
        {"type": "function", "function": {
            "name": "apply_patch",
            "description": "Apply a unified-diff patch to the repo.",
            "parameters": {"type": "object",
                "properties": {"patch": {"type": "string"}},
                "required": ["patch"]}}},
        {"type": "function", "function": {
            "name": "run_shell",
            "description": "Run a shell command in the repo root. Returns stdout + stderr.",
            "parameters": {"type": "object",
                "properties": {"cmd": {"type": "string"}},
                "required": ["cmd"]}}},
        {"type": "function", "function": {
            "name": "run_tests",
            "description": "Run the project's test command. Returns pass/fail summary + output.",
            "parameters": {"type": "object",
                "properties": {"scope": {"type": "string"}}}}},
    ]
  2. STEP 02

    Implement the tool handlers

    Map each function name to a Python function that does the thing. The run_shell handler is the one that needs care — sandbox it, cap the walltime, and whitelist commands if you're running in a shared environment.

    import subprocess, pathlib
    
    def read_file(path: str, start: int | None = None, end: int | None = None) -> str:
        lines = pathlib.Path(path).read_text().splitlines(keepends=True)
        if start is None: return "".join(lines)
        return "".join(lines[start-1:end])
    
    def apply_patch(patch: str) -> str:
        r = subprocess.run(["git", "apply", "-"], input=patch.encode(),
                           capture_output=True)
        if r.returncode: raise RuntimeError(r.stderr.decode())
        return "applied"
    
    def run_shell(cmd: str, timeout: int = 30) -> str:
        r = subprocess.run(cmd, shell=True, capture_output=True, timeout=timeout)
        return r.stdout.decode() + r.stderr.decode()
    
    def run_tests(scope: str = "") -> str:
        cmd = f"pnpm test {scope}" if scope else "pnpm test"
        return run_shell(cmd, timeout=120)
    
    HANDLERS = {"read_file": read_file, "apply_patch": apply_patch,
                "run_shell": run_shell, "run_tests": run_tests}
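The whitelist mentioned above can start as a thin wrapper around `run_shell` — a sketch under the assumption that a first-pass token filter is acceptable; the `ALLOWED` set and the `BLOCKED` message format are illustrative, and this is not a substitute for a real sandbox:

```python
import shlex
import subprocess

# Illustrative allow-list of first tokens -- tune to your repo's toolchain.
ALLOWED = {"git", "pnpm", "ls", "cat", "grep", "echo", "python"}

def run_shell_safe(cmd: str, timeout: int = 30) -> str:
    """First-pass filter: refuse commands whose first token isn't allowed.

    Chained commands and pipes are crudely rejected too, but shell tricks
    still call for a disposable sandbox or a human gate on top.
    """
    tokens = shlex.split(cmd)
    first = tokens[0] if tokens else ""
    if first not in ALLOWED:
        return f"BLOCKED: '{first}' is not on the allow-list"
    if any(ch in cmd for ch in ";|&"):  # crude chain/pipe rejection
        return "BLOCKED: command chaining is not allowed"
    r = subprocess.run(cmd, shell=True, capture_output=True, timeout=timeout)
    return r.stdout.decode() + r.stderr.decode()
```

Returning a `BLOCKED:` string (rather than raising) feeds the refusal back to the model as a tool result, so it can choose a different command instead of crashing the loop.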
  3. STEP 03

    Write the ReAct loop

    Ask the model, execute any tool calls it returns, feed the results back, repeat until the model stops calling tools. That's the whole loop — everything else is quality-of-life on top.

    from openai import OpenAI
    import json
    
    client = OpenAI(base_url="https://api.aigateway.sh/v1", api_key="sk-aig-...")
    SYSTEM = open("agent-system-prompt.md").read()  # your preferences + conventions
    
    def step(messages: list[dict], model: str = "moonshot/kimi-k2.6") -> list[dict]:
        r = client.chat.completions.create(
            model=model, messages=messages, tools=TOOLS, tool_choice="auto",
            extra_headers={"x-aig-tag": "coder.loop"},
        )
        msg = r.choices[0].message
        assistant = {"role": "assistant", "content": msg.content or ""}
        if msg.tool_calls:  # an empty tool_calls list is rejected by the API
            assistant["tool_calls"] = [tc.model_dump() for tc in msg.tool_calls]
        messages.append(assistant)
        if not msg.tool_calls:
            return messages
        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments or "{}")
            try:
                out = str(HANDLERS[tc.function.name](**args))[:8000]
            except Exception as e:
                out = f"ERROR: {e}"
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": out})
        return messages
    
    def run(goal: str, model: str = "moonshot/kimi-k2.6"):
        messages = [{"role": "system", "content": SYSTEM},
                    {"role": "user",   "content": goal}]
        for i in range(40):  # hard step cap
            messages = step(messages, model)
            if messages[-1]["role"] == "assistant" and not messages[-1].get("tool_calls"):
                break
        return messages[-1]["content"]
  4. STEP 04

    Plug in a human gate for risky tools

    Auto-approving edits and shell commands is fine in a sandbox; in a real repo, gate the two write-side tools behind a one-key confirmation. That one line is the difference between 'a clever demo' and 'something you'd let run on Friday afternoon.'

    RISKY = {"apply_patch", "run_shell"}
    
    def human_ok(tool_name: str, args: dict) -> bool:
        if tool_name not in RISKY:
            return True
        print(f"\n{tool_name}({args})")
        return input("[y/N] > ").strip().lower() == "y"
    
    # Wrap HANDLERS call sites with human_ok() — 4 extra lines.
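Concretely, those extra lines can live in a small dispatcher that `step()` calls instead of indexing `HANDLERS` directly (a sketch; the `DENIED` message wording is an assumption — returning it as the tool result lets the model pick another approach instead of crashing):

```python
RISKY = {"apply_patch", "run_shell"}

def human_ok(tool_name: str, args: dict) -> bool:
    """One-key confirmation for write-side tools; read-side passes through."""
    if tool_name not in RISKY:
        return True
    print(f"\n{tool_name}({args})")
    return input("[y/N] > ").strip().lower() == "y"

def dispatch(handlers: dict, tool_name: str, args: dict) -> str:
    """Drop-in replacement for the bare HANDLERS[name](**args) call in step()."""
    if not human_ok(tool_name, args):
        return "DENIED by user -- pick another approach or stop."
    return str(handlers[tool_name](**args))
```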
  5. STEP 05

    Point at any model

    The whole loop is model-agnostic. Start on free Kimi K2.6 for dev, upgrade to Opus 4.7 or GPT-5.4 for production, A/B all three with the eval example in this library. One string change.

    if __name__ == "__main__":
        import sys
        goal  = sys.argv[1]
        model = sys.argv[2] if len(sys.argv) > 2 else "moonshot/kimi-k2.6"
        print(run(goal, model))
    
    # $ python agent.py "fix the pagination bug in users API"
    # $ python agent.py "add a CSV exporter for /v1/usage" anthropic/claude-opus-4.7

What to add next

MCP tools. If you already built the MCP server example in this library, plug its tools into the coder with no extra handlers — the gateway routes MCP calls through the same key.

A plan step. Before the first tool call, ask the model to produce a written plan. That makes the agent's intent auditable and dramatically cuts the number of dead-end edits on hard tasks.
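A minimal version is one extra, tool-free request before the loop starts, with the plan pinned into the transcript so every later step sees it. A sketch — `seed_with_plan` and the prompt wording are assumptions, not part of the 300 lines:

```python
PLAN_PROMPT = ("You are a senior engineer. Write a short, numbered plan "
               "for the task. Do not call tools or write code yet.")

def seed_with_plan(system: str, goal: str, plan: str) -> list[dict]:
    """Front-load the transcript with the plan so every later step sees it."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": goal},
        {"role": "assistant", "content": f"Plan:\n{plan}"},
        {"role": "user", "content": "Follow the plan; start with step 1."},
    ]

# The plan itself comes from one tool-free call before the loop:
#   r = client.chat.completions.create(
#       model=model,
#       messages=[{"role": "system", "content": PLAN_PROMPT},
#                 {"role": "user", "content": goal}])
#   plan = r.choices[0].message.content
```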

A cache. Coding workloads repeat — turn on `x-aig-cache: semantic` for read-heavy phases (reading the same files, running the same tests) and you'll see 30-40% bill reduction on iterative tasks.
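Using the header names above, a small helper keeps the per-request headers in one place (`aig_headers` is a hypothetical helper name; the header values come from this article):

```python
def aig_headers(tag: str, semantic_cache: bool = False) -> dict:
    """Per-request gateway headers: a cost-attribution tag, plus the
    semantic cache for read-heavy phases of the loop."""
    headers = {"x-aig-tag": tag}
    if semantic_cache:
        headers["x-aig-cache"] = "semantic"
    return headers

# Usage inside step():
#   client.chat.completions.create(
#       ..., extra_headers=aig_headers("coder.loop", semantic_cache=True))
```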

Make it yours

The system prompt file is where your team's conventions live — naming, commit style, testing policy, what to avoid. Treat it like a living README for the agent; the prompt is half the agent.

Custom tools are a one-function addition. A company-specific `deploy_preview(env)` tool can turn this agent from 'writes code' into 'ships features' for your stack specifically.

# Add a tool that knows your stack.
TOOLS.append({"type": "function", "function": {
    "name": "deploy_preview",
    "description": "Create a preview deployment and return the URL.",
    "parameters": {"type": "object",
        "properties": {"branch": {"type": "string"}}, "required": ["branch"]}}})

def deploy_preview(branch: str) -> str:
    return run_shell(f"vercel --env=preview --git-branch={branch}")
HANDLERS["deploy_preview"] = deploy_preview

FAQ

How does this compare to Claude Code?

Architecturally the same — ReAct loop plus a tool belt. Claude Code is more polished (UI, sandbox, approval policies, file watchers), but it is locked to Anthropic models and Anthropic's tool surface. This agent gives up the polish in exchange for total control: pick any model, define any tool, ship any UX.

Which model should I run it on?

Kimi K2.6 is the best free option and genuinely competitive on most coding tasks; Opus 4.7 is best-in-class for complex multi-file refactors; GPT-5.4 is a strong middle ground. Run the eval example in this library on your own task samples before committing.

Is running shell commands safe?

Not by default. Always gate `run_shell` and `apply_patch` behind a human approval in real repos, or run the agent in a disposable sandbox (Docker, VM, or a CF Containers instance). The example shows the 4-line gate.

Can the agent run my full test suite?

Yes, but cap the walltime. `run_tests` in the example has a 120s timeout — if your suite is longer, split it into scoped runs (the `scope` arg in the tool schema). For CI-scale runs, skip in-loop testing and let the CI verify at PR time.

How do I stop it from going off the rails?

The 40-step hard cap in the loop is the first line. The `x-aig-tag: coder.loop` header plus a monthly cap on that tag is the second. Human approval for write-side tools is the third. In practice those three guardrails catch 99% of runaway cases.

Can I stream the agent's thinking to the terminal?

Yes — set `stream=True` on the chat-completions call and render deltas as they arrive. The real Claude Code does exactly this; it's the single biggest UX lift over a blocking agent.
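A sketch of the rendering side, decoupled from the API call so it works with any chunk iterator shaped like the OpenAI streaming response (`render_stream` is a hypothetical helper name):

```python
def render_stream(chunks) -> str:
    """Print text deltas as they arrive and return the assembled reply."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some providers emit usage-only chunks
            continue
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        parts.append(delta)
    print()
    return "".join(parts)

# Usage: render_stream(client.chat.completions.create(..., stream=True))
# Note: streamed *tool calls* arrive as argument fragments and need separate
# accumulation -- start with text-only streaming for the final answer.
```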

What about multi-repo or monorepo?

Add a `workspace` parameter to the tools and pass a repo root per call. In monorepos, restrict `read_file` to paths under the current working package so the agent doesn't drown in unrelated context.
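A sketch of the workspace-restricted read (`read_file_scoped` is a hypothetical name; the path-escape check is the important part):

```python
import pathlib

def read_file_scoped(workspace: str, path: str) -> str:
    """read_file restricted to a workspace root: refuses `../` escapes."""
    root = pathlib.Path(workspace).resolve()
    target = (root / path).resolve()
    if not target.is_relative_to(root):  # Python 3.9+
        return f"BLOCKED: {path} is outside the workspace"
    return target.read_text()
```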

Can I run the agent on CI?

Yes. A good pattern is 'agent-on-PR' — when a PR opens with a label, run the agent with a short goal (add tests, fix lint, port to the new API), let it push to the branch, and require human review. The `x-aig-tag` makes the per-PR cost obvious.

READY TO BUILD?
Get an AIgateway key in 30 seconds. Free Kimi K2.6 through Apr 30, 2026; everything else is pass-through.
Get your key → · API reference · Kimi K2.6 details
