First example · 12 min build

Launch a work-automation agent swarm with Hermes + Kimi K2.6

Spin up a swarm of five specialist agents plus a planner that runs locally on Hermes, routes every call through AIgateway, and automates your inbox, calendar, research, coding, and reporting — using Kimi K2.6 on the free tier until Apr 30.

12 min read · published 2026-04-24 · category: Agents + swarms
Autonomous AI agent swarm running on Hermes with Kimi K2.6

A single chatbot is a toy. A swarm — a planner that delegates to specialists, each with its own memory and tool belt — is what actually replaces a workday.

In this example we wire up a planner and five specialist Hermes agents running locally on your laptop, point them at AIgateway, and let Kimi K2.6 do the thinking. You get agent-grade reasoning (1T parameters, 256K context, native tool calling, 300-agent sub-plan fan-out) on the free tier through Apr 30, 2026 — then pass-through pricing after that.

By the end you'll have a five-agent crew that triages your inbox, keeps your calendar sane, drafts research memos, writes + tests code, and ships a daily stand-up report. All local, all logged, all billable against one key.

AIgateway key · Kimi K2.6 (free) · Hermes Agent (local) · OpenAI SDK · MCP tools
Note
Free through Apr 30, 2026 — every AIgateway account gets 100 Kimi K2.6 requests/day with no card on file. The swarm in this tutorial sits comfortably inside that envelope during development.
Architecture — Kimi K2.6 powers a swarm of specialist Hermes agents via AIgateway
Hermes runs the agent loop locally. AIgateway is the one API your five agents talk to — Kimi K2.6 for reasoning today, any other model tomorrow with a slug change. source · LushBinary

Build it in six steps

  1. STEP 01

    Get an AIgateway key

    Sign in at aigateway.sh/signin and copy your `sk-aig-…` key from the dashboard. Every key automatically includes the Kimi K2.6 free tier (100 req/day through Apr 30).

  2. STEP 02

    Install Hermes locally

    One-line install. Hermes runs the agent loop on your laptop and ships with an MCP tool client out of the box.

    curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
    hermes --version
  3. STEP 03

    Point Hermes at AIgateway

    Hermes reads the same `OPENAI_*` env vars as the OpenAI SDK. Set them to your AIgateway key and base URL, and pin the default model to Kimi K2.6.

    export OPENAI_API_KEY="sk-aig-..."
    export OPENAI_BASE_URL="https://api.aigateway.sh/v1"
    export HERMES_DEFAULT_MODEL="moonshot/kimi-k2.6"
    
    hermes model set default moonshot/kimi-k2.6
  4. STEP 04

    Define the swarm

    Five specialist agents + one planner. Each agent has a role, a toolset, and a shared memory handle. The planner reads your inbox + today's calendar and hands each specialist a focused brief.

    from hermes import Agent, Swarm
    from openai import OpenAI
    
    client = OpenAI(
        base_url="https://api.aigateway.sh/v1",
        api_key="sk-aig-...",
    )
    MODEL = "moonshot/kimi-k2.6"
    
    planner = Agent(
        name="planner",
        client=client, model=MODEL,
        system=(
            "You are the chief-of-staff. Read the user's inbox, calendar, "
            "and running notes. Produce a JSON plan that assigns one brief "
            "to each specialist: inbox, calendar, research, coder, reporter."
        ),
        tools=["memory.read", "inbox.list", "calendar.list"],
    )
    
    inbox = Agent(
        name="inbox",
        client=client, model=MODEL,
        system=(
            "You triage email. Archive noise, reply to acknowledgements, "
            "escalate anything that needs the user. Draft replies — never send."
        ),
        tools=["inbox.list", "inbox.archive", "inbox.draft"],
    )
    
    calendar = Agent(
        name="calendar",
        client=client, model=MODEL,
        system=(
            "You own the calendar. Find conflicts, propose holds, "
            "reshuffle when priorities change."
        ),
        tools=["calendar.list", "calendar.propose", "calendar.move"],
    )
    
    research = Agent(
        name="research",
        client=client, model=MODEL,
        system=(
            "You are a research analyst. Given a brief, gather sources, "
            "extract the 3 strongest facts, write a 200-word memo."
        ),
        tools=["web.search", "web.fetch", "memory.append"],
    )
    
    coder = Agent(
        name="coder",
        client=client, model=MODEL,
        system=(
            "You are a senior engineer. Write small, tested patches. "
            "Never touch production without human review."
        ),
        tools=["repo.read", "repo.diff", "shell.run", "tests.run"],
    )
    
    reporter = Agent(
        name="reporter",
        client=client, model=MODEL,
        system=(
            "You write a one-page daily stand-up from the swarm's logs. "
            "Bullet what shipped, what's blocked, what the user should decide."
        ),
        tools=["memory.read", "report.post"],
    )
    
    swarm = Swarm(
        planner=planner,
        specialists=[inbox, calendar, research, coder, reporter],
        memory="./swarm-memory.db",
    )
  5. STEP 05

    Run the morning loop

    One call kicks off the whole swarm. Hermes handles the planner → specialist fan-out, tool execution, memory writes, retries, and token accounting.

    report = swarm.run(
        goal="Handle my morning: triage inbox, clean up the calendar, "
             "draft the weekly investor update, and ship me a stand-up.",
        budget_usd=0.25,
        max_steps=40,
    )
    print(report.text)
    print("steps:", report.step_count, "cost:", report.cost_usd)
  6. STEP 06

    Attribute the spend

    Every agent tags its own calls with `x-aig-tag`, so when you open the AIgateway dashboard you see exactly which specialist burned which cents. Hard-cap any tag with a one-line POST if a specialist goes rogue.

    curl -X POST https://api.aigateway.sh/v1/budgets \
      -H "Authorization: Bearer sk-aig-..." \
      -d '{ "tag": "swarm.research", "monthly_cap_cents": 2000 }'
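The same tag-plus-cap pattern is easy to drive from Python. A minimal stdlib-only sketch — the `x-aig-tag` header name and the `/v1/budgets` payload shape come from the curl example above; the helper names are purely illustrative:

```python
import json

def tag_headers(api_key: str, specialist: str) -> dict:
    """Headers for one specialist's calls: auth plus its spend tag."""
    return {
        "Authorization": f"Bearer {api_key}",
        "x-aig-tag": f"swarm.{specialist}",
    }

def budget_payload(specialist: str, monthly_cap_cents: int) -> str:
    """JSON body for POST /v1/budgets, capping one specialist's tag."""
    return json.dumps({
        "tag": f"swarm.{specialist}",
        "monthly_cap_cents": monthly_cap_cents,
    })

# Cap the research agent at $20/month, same as the curl example.
print(budget_payload("research", 2000))
```

Sending the request (with `urllib.request` or any HTTP client) is the only part left out here.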
Kimi K2.6 — 1T parameter agent model with 256K context and native tool calling
Kimi K2.6 is the current open-weight frontier for agent workloads — it plans, calls tools natively, and handles long trajectories without drift. Free on AIgateway through Apr 30. source · Moonshot AI

What the swarm actually does in the morning

The planner opens the day with one Kimi K2.6 call. It reads your last 24h of mail, today's calendar, and the swarm's long-term memory, then produces a JSON plan — five briefs, one per specialist, ranked by impact.
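The exact plan schema is Hermes' business, but a defensive check on the planner's output is cheap. A sketch, assuming the plan is a JSON object mapping each specialist name to a brief string (that shape is an assumption, not a documented contract):

```python
import json

SPECIALISTS = ["inbox", "calendar", "research", "coder", "reporter"]

def validate_plan(raw: str) -> dict:
    """Parse the planner's JSON and confirm every specialist got a brief."""
    plan = json.loads(raw)
    missing = [name for name in SPECIALISTS if not plan.get(name)]
    if missing:
        raise ValueError(f"planner skipped: {missing}")
    return plan

plan = validate_plan(json.dumps({
    "inbox": "Triage overnight mail.",
    "calendar": "Resolve the 10am double-booking.",
    "research": "Memo on competitor pricing.",
    "coder": "Fix the flaky billing test.",
    "reporter": "Stand-up by 9am.",
}))
print(sorted(plan))
```

Failing fast here beats letting a malformed plan dispatch four specialists and silently starve the fifth.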

The five specialists run in parallel. Hermes watches the token budget; when a specialist overspends, its run is capped and control returns to the planner.
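Hermes' internal accounting isn't exposed in this tutorial, but the cap-and-return behavior can be pictured as a per-specialist token ledger (every name here is an illustrative stand-in, not Hermes API):

```python
class TokenBudget:
    """Track per-specialist token spend against a shared cap."""
    def __init__(self, cap_tokens: int):
        self.cap = cap_tokens
        self.spent: dict[str, int] = {}

    def charge(self, specialist: str, tokens: int) -> bool:
        """Record spend; return False once the specialist trips its cap."""
        used = self.spent.get(specialist, 0) + tokens
        self.spent[specialist] = used
        return used <= self.cap

budget = TokenBudget(cap_tokens=8000)
budget.charge("research", 6000)       # under cap, run continues
ok = budget.charge("research", 3000)  # over cap: control returns to planner
print(ok)
```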

The reporter fires last. It reads every specialist's memory writes and produces a one-page stand-up: shipped, blocked, decisions needed. That file is the first thing you read with your coffee.
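The reporter's job mostly reduces to grouping memory entries by status. A toy version, assuming each log entry carries `status` and `note` fields (both field names are made up for illustration):

```python
def stand_up(entries: list[dict]) -> str:
    """Render shipped / blocked / decisions-needed sections from swarm logs."""
    sections = {"shipped": [], "blocked": [], "decision": []}
    for e in entries:
        sections.setdefault(e["status"], []).append(e["note"])
    lines = []
    for title in ("shipped", "blocked", "decision"):
        lines.append(f"## {title}")
        lines += [f"- {note}" for note in sections[title]]
    return "\n".join(lines)

print(stand_up([
    {"status": "shipped", "note": "Archived 41 emails, drafted 3 replies."},
    {"status": "blocked", "note": "Investor memo waiting on Q1 numbers."},
    {"status": "decision", "note": "Approve moving Friday's 1:1?"},
]))
```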

Swarm memory — per-agent long-term memory with semantic recall
Hermes' local memory store. Each agent gets its own namespace; the planner sees across namespaces when it needs to hand off work between specialists. source · Unwind AI

Swap Kimi for Opus on a hot call

One of the reasons you route through AIgateway instead of hitting Moonshot direct: on any call, you can upgrade to Claude Opus 4.7 or GPT-5.4 by changing one string. Same key, same schema.

Typical pattern: plan + triage on Kimi K2.6 (cheap, fast), escalate to Opus for the research memo when the stakes are high.

# Escalate just the research call — everything else stays on Kimi.
research.model = "anthropic/claude-opus-4.7"
swarm.run(goal="Write the investor memo. Be rigorous.")
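If you'd rather not flip the model by hand, the escalation rule can live in a tiny helper. The two slugs are the ones used throughout this tutorial; the function, the `stakes` flag, and the dollar cap are all invented for this sketch:

```python
KIMI = "moonshot/kimi-k2.6"
OPUS = "anthropic/claude-opus-4.7"

def pick_model(stakes: str, spent_usd: float, cap_usd: float = 1.00) -> str:
    """Escalate high-stakes briefs to Opus, but only while under budget."""
    if stakes == "high" and spent_usd < cap_usd:
        return OPUS
    return KIMI

print(pick_model("high", spent_usd=0.10))     # escalates to Opus
print(pick_model("routine", spent_usd=0.10))  # stays on Kimi
```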
Hermes terminal — live stream of specialist agents running tools
Hermes streams every tool call to the terminal in real time. Nothing is a black box — you can pause, redirect, or kill any specialist mid-step. source · Unwind AI

Going further

Add your own specialists. Hermes agents are just classes — give the new one a role, a system prompt, a toolset, and register it with the swarm.

Replace the `./swarm-memory.db` handle with a Vectorize binding to share memory across devices, or plug in any MCP-compatible memory server.

When you're ready to put the swarm in front of customers, mint a sub-account per user with the `/v1/sub-accounts` API — each customer gets their own key, spend cap, and isolated analytics without you touching a billing table.

FAQ

Do I need a Moonshot API key?

No. Your AIgateway key (sk-aig-…) is the only credential. AIgateway proxies to Moonshot under the hood — you don't install their SDK, manage a second invoice, or register a second account.

Is Kimi K2.6 really free on AIgateway?

Yes — 100 requests/day on `moonshot/kimi-k2.6` through Apr 30, 2026. No card on file. After the trial, Kimi bills pass-through like every other model in the catalog. A five-agent swarm typically runs 20-40 requests per morning loop, so the free tier covers regular development use.

Can I run this fully offline?

The Hermes loop and your tool servers run locally. Model inference is a network call — Kimi K2.6 runs on Moonshot's infrastructure via AIgateway. For truly offline operation, host a model yourself and point the OpenAI client at it; if you just want lighter-weight inference, swap the model slug to a Workers AI edge model (`@cf/meta/llama-3.1-8b-instruct`), which still goes over the network.

How do I keep one agent from burning the whole budget?

Every swarm call carries an `x-aig-tag` header per specialist. POST a hard cap to `/v1/budgets` with that tag and AIgateway enforces it before dispatch — the specialist's calls start returning 402 when the cap trips, and the swarm routes around it.
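What "routes around it" can look like from the caller's side: treat HTTP 402 as the signal to skip a capped specialist rather than abort the run. The exception class and both functions below are illustrative; a real client surfaces the status code through its own error types:

```python
class BudgetExceeded(Exception):
    """Stand-in for AIgateway answering 402 on a capped tag."""

def run_specialists(briefs: dict, call) -> dict:
    """Run each brief; a capped specialist is recorded, not fatal."""
    results = {}
    for name, brief in briefs.items():
        try:
            results[name] = call(name, brief)
        except BudgetExceeded:
            results[name] = "<capped: 402, skipped>"
    return results

def fake_call(name, brief):
    if name == "research":  # pretend research already hit its cap
        raise BudgetExceeded()
    return f"done: {brief}"

print(run_specialists({"inbox": "triage", "research": "memo"}, fake_call))
```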

What if Kimi K2.6 gets dethroned next month?

Change one string. The swarm logic doesn't know which upstream model it's on. Swap every specialist to `gpt-5.4` or `claude-opus-4.7` — or A/B a few — without touching a single line of agent code.

Can I give each of my end users their own agent swarm?

Yes. `POST /v1/sub-accounts` mints a scoped `sk-aig-…` key per customer with its own spend cap, rate limit, and isolated analytics. Hand it to the customer or keep it server-side. The swarm code doesn't change.

Do the specialists share memory?

By default each specialist has its own memory namespace. The planner is the only agent that reads across namespaces — that's how it hands off context between specialists without leaking irrelevant history into every prompt.
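The namespace rule is easy to picture as a keyed store where only the planner gets the cross-namespace view. A toy in-memory stand-in, not Hermes' actual store:

```python
class SwarmMemory:
    """Per-agent namespaces; only the planner reads across them."""
    def __init__(self):
        self._ns: dict[str, list[str]] = {}

    def append(self, agent: str, note: str):
        self._ns.setdefault(agent, []).append(note)

    def read(self, agent: str) -> list[str]:
        """A specialist sees only its own namespace."""
        return list(self._ns.get(agent, []))

    def read_all(self) -> dict[str, list[str]]:
        """Planner-only: the full cross-namespace view."""
        return {k: list(v) for k, v in self._ns.items()}

mem = SwarmMemory()
mem.append("research", "Competitor raised prices 8%.")
mem.append("inbox", "CEO asked about Q2 hiring.")
print(mem.read("research"))    # specialist view: one namespace
print(sorted(mem.read_all()))  # planner view: ['inbox', 'research']
```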

Why Hermes and not OpenClaw?

Either works. Hermes has a tighter Python surface and lands closer to the agent-framework end of the spectrum; OpenClaw is heavier but ships with 5,700+ prebuilt skills. The AIgateway side is identical — point either at `https://api.aigateway.sh/v1` with your key.

READY TO BUILD?
Get an AIgateway key in 30 seconds. Free Kimi K2.6 through Apr 30, 2026; everything else is pass-through.
Get your key → · API reference · Kimi K2.6 details
