For a while now I’ve had a quiet, specific discomfort. I spend my days building with AI tools that feel close to sorcery — I describe a thing, and a capable assistant goes off, reads files, runs commands, spins up helpers, summarises what they found, and comes back with the work mostly done. And yet every other podcast tells me I should build agents, and I couldn’t honestly say what I was missing. Everything I already did looked like agents. So either I was already doing it, or I was missing something and couldn’t see the edge of it.

It turns out both were true, and the gap between them is the whole story.

Using the loop versus writing the loop

Here’s the reframe that dissolved it for me. If you use a tool like Claude Code heavily, you have been riding one of the most sophisticated agents that exist; you just never wrote the loop. The helpers, the orchestration, the read-summarise-decide rhythm, the way it knows when to stop: that is all genuine agentic behaviour. Someone wrote that loop, picked the tools, and made the hard calls on context management, error recovery, and when to quit. You experience the output of those decisions without ever having made them.

When people say “build an agent,” they mean something narrower and more humbling: you write the loop. You own the control flow, the tools, the memory, the error handling, and the stopping condition. Nobody made those calls for you.

The good news is that the loop is almost insultingly small.

What an agent actually is

Strip away the noise and an agent is one sentence: a model in a loop with tools. The model gets a goal. It decides to call a tool. It gets the result. It decides the next move. It repeats until it judges the job done. That while loop is the agent. Multi-agent systems, agent-to-agent protocols, orchestration frameworks — all of it is decoration on this core.

[Figure: The agent loop. The goal goes to the model; the model decides the next move and asks for a tool (tool_use); you run the tool it asked for, feed the result back, and repeat. When it stops asking for a tool, it’s done, and that reply is the final answer. Its tools, what it can do: calculator, web_search, refund_order, your tools.]
The whole thing. A model gets a goal, asks for a tool, you run it and feed the result back, and it loops until it stops asking — that last step is the agent deciding it’s done. Everything fancier is decoration on this.

I don’t expect you to believe me until you’ve seen it, so here is the entire thing. Around fifty lines. It runs.

import anthropic
client = anthropic.Anthropic()

# 1. The TOOLS. This is where an agent's real power lives — what it can *do*.
def calculator(expression: str) -> str:
    return str(eval(expression))           # toy only; never eval untrusted input

def web_search(query: str) -> str:
    return f"(pretend results for: {query})"   # swap in a real search API

TOOLS = {"calculator": calculator, "web_search": web_search}

# 2. Describe those tools to the model so it knows when/how to call them.
TOOL_SCHEMAS = [
    {"name": "calculator", "description": "Evaluate a math expression.",
     "input_schema": {"type": "object",
                      "properties": {"expression": {"type": "string"}},
                      "required": ["expression"]}},
    {"name": "web_search", "description": "Search the web for a query.",
     "input_schema": {"type": "object",
                      "properties": {"query": {"type": "string"}},
                      "required": ["query"]}},
]

def run(goal: str):
    messages = [{"role": "user", "content": goal}]

    # 3. THE LOOP. This is the entire idea of an agent.
    while True:
        resp = client.messages.create(
            model="claude-opus-4-8",
            max_tokens=1024,
            tools=TOOL_SCHEMAS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})

        # No tool requested -> the agent decided it's done.
        if resp.stop_reason != "tool_use":
            # Join every text block; a reply can contain none or several.
            print("".join(b.text for b in resp.content if b.type == "text"))
            return

        # 4. Run every tool the model asked for; feed results back in.
        results = []
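        for block in resp.content:
            if block.type == "tool_use":
                try:
                    output = TOOLS[block.name](**block.input)
                    failed = False
                except Exception as exc:   # report the failure so the model can recover
                    output, failed = f"{type(exc).__name__}: {exc}", True
                results.append({"type": "tool_result", "tool_use_id": block.id,
                                "content": output, "is_error": failed})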
        messages.append({"role": "user", "content": results})

run("What is 17% of 6,400, then search for why that number might matter?")

Read it once and the magic evaporates, in the good way. The while True is the agent. The line stop_reason != "tool_use" is the model deciding it’s finished, the stopping condition you’ve never had to write before. (Strictly, that branch also fires when a reply is cut off at max_tokens; a sturdier loop would check for that case separately.) The TOOLS dict is the lever: the agent can do exactly what’s in it and nothing else. Every framework you’ve heard of replaces this file with abstractions. None of them replaces the idea in it.

You’re not behind. You’ve been living inside the agent the whole time. You just have to drop one layer of abstraction and write the loop once.

— what I wish someone had told me sooner

The four things you never had to think about

Once you’ve written the loop, you can see precisely what the tools were doing for you — the four things that were invisible because they were handled.

The loop itself. Control flow, parsing tool calls, recovering when a tool throws. You just read all of it. It was never the hard part; it was just hidden.

Custom tools — the biggest lever by far. The toy above has a calculator. A real agent’s power is the tools you define for your problem: refund_order, check_inventory, escalate_to_human. The model is only as capable as its tools allow. This is invisible when the tools were chosen for you, and it’s where almost all the real work — and real value — lives.
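As a rough sketch of what a problem-specific tool could look like, here is a hypothetical refund_order in the same shape as the toy’s calculator: a plain function plus the schema the model reads. The ORDERS table, the field names, and the refund rules are all invented for illustration; a real version would call your own order system.

```python
# Hypothetical in-memory order table standing in for a real order system.
ORDERS = {
    "A100": {"total_cents": 4500, "refunded_cents": 0},
    "A101": {"total_cents": 1200, "refunded_cents": 1200},
}

def refund_order(order_id: str, amount_cents: int) -> str:
    """Refund part or all of an order; return a plain-text result for the model."""
    order = ORDERS.get(order_id)
    if order is None:
        return f"error: no order with id {order_id}"
    remaining = order["total_cents"] - order["refunded_cents"]
    if remaining == 0:
        return f"error: order {order_id} is already fully refunded"
    if amount_cents <= 0 or amount_cents > remaining:
        return f"error: amount must be between 1 and {remaining} cents"
    order["refunded_cents"] += amount_cents
    return f"refunded {amount_cents} cents on order {order_id}"

# The schema is what the model actually sees, so the description
# is where you tell it when the tool should and should not be used.
REFUND_ORDER_SCHEMA = {
    "name": "refund_order",
    "description": "Refund a customer's order. Use only after the customer "
                   "has confirmed the order id and the amount.",
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "amount_cents": {"type": "integer", "minimum": 1},
        },
        "required": ["order_id", "amount_cents"],
    },
}
```

Returning errors as text rather than raising is a design choice here: the model reads the message and can ask the customer for a corrected order id instead of the run dying.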

Running unattended. The tool you ride is interactive — you approve, you redirect, you catch it when it wanders. Most production agents run with no one watching. Reliability without a human to course-correct is a genuinely different discipline, and it’s the one the demos quietly skip.
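One way to make the loop safer to leave alone is to give it a hard step budget and to hand tool failures back to the model instead of letting them crash the run. The sketch below does both. It uses a scripted stand-in called fake_model so it runs offline, and its message shapes are simplified and hypothetical, not the API’s.

```python
from typing import Callable

def calculator(expression: str) -> str:
    # Toy only, as in the example above: never eval untrusted input.
    return str(eval(expression))

TOOLS: dict[str, Callable[..., str]] = {"calculator": calculator}

def run_unattended(goal: str, model: Callable[[list], dict], max_steps: int = 5) -> str:
    """Run the loop with a hard step budget; tool errors go back to the model."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = model(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply["type"] == "final":
            return reply["text"]
        try:
            output = TOOLS[reply["name"]](**reply["input"])
        except Exception as exc:  # unknown tool, bad arguments, or the tool failing
            output = f"error: {type(exc).__name__}: {exc}"
        messages.append({"role": "user", "content": output})
    return "stopped: step budget exhausted"

def fake_model(messages: list) -> dict:
    """Scripted stand-in for a real model call (hypothetical, for offline runs)."""
    last = messages[-1]["content"]
    if len(messages) == 1:
        # First move: a deliberately broken tool call.
        return {"type": "tool", "name": "calculator", "input": {"expression": "1/0"}}
    if isinstance(last, str) and last.startswith("error"):
        # It saw the error and tries again with a valid expression.
        return {"type": "tool", "name": "calculator",
                "input": {"expression": "17 * 6400 / 100"}}
    return {"type": "final", "text": f"The answer is {last}"}

if __name__ == "__main__":
    print(run_unattended("What is 17% of 6,400?", fake_model))
```

The budget turns a runaway loop into a result you can log and alert on, which matters more once nobody is there to press Ctrl-C.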

The supporting cast that unattended operation forces. The moment no one is watching, you need: evals (how do you know it works if you’re asleep?), observability (every step logged, so you can debug a run that already happened), guardrails (what is it forbidden to do?), and durable memory across runs. None of this is exotic. All of it is the difference between a demo and a thing you’d trust.
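To show how small the first version of each piece can be, here is a minimal sketch of three of the four: a guardrail, a step log, and an eval. Every name in it (FORBIDDEN_TOOLS, RUN_LOG, guarded_call, run_evals) is hypothetical, the log lives in memory where a real one would go to durable storage, and memory across runs is left out.

```python
import json
import time

FORBIDDEN_TOOLS = {"delete_account"}   # hypothetical guardrail: never callable
RUN_LOG: list[dict] = []               # a real system would persist this

def guarded_call(tools: dict, name: str, args: dict) -> str:
    """Check the guardrail, run the tool, and record the step either way."""
    started = time.time()
    if name in FORBIDDEN_TOOLS:
        status, output = "blocked", f"error: tool {name} is not permitted"
    elif name not in tools:
        status, output = "unknown", f"error: no tool named {name}"
    else:
        try:
            status, output = "ok", tools[name](**args)
        except Exception as exc:
            status, output = "failed", f"error: {type(exc).__name__}: {exc}"
    RUN_LOG.append({"tool": name, "args": args, "status": status,
                    "output": output, "seconds": round(time.time() - started, 4)})
    return output

def dump_log() -> str:
    """Serialise the run as JSON lines so it can be read after the fact."""
    return "\n".join(json.dumps(entry) for entry in RUN_LOG)

def run_evals(agent, cases: list[tuple[str, str]]) -> float:
    """Smallest possible eval: share of goals whose answer contains the expected text."""
    passed = sum(1 for goal, expected in cases if expected in agent(goal))
    return passed / len(cases)
```

In the loop, guarded_call would replace the direct TOOLS[...] lookup, and run_evals would run against a fixed list of goals before each change ships. A substring check is a crude grader, but it is enough to catch a regression while you sleep.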

The landscape, honestly mapped

When you go looking for help with the above, you hit a wall of options. It sorts cleanly into three layers, and the useful thing is a decision heuristic, not a feature list.

| Layer | What it is | Reach for it when |
| --- | --- | --- |
| Vendor SDKs | One vendor’s loop, tools and context management, exposed directly (the same machinery their own products run on) | You want the fastest path to a single agent on one model family |
| Frameworks | Orchestration across many agents or steps: graphs, roles, shared state, persistence | You need multi-agent coordination or graph-shaped control flow |
| Protocols | Standards for how agents talk to tools, and to each other | You’re connecting across a boundary you don’t own |

On the vendor SDKs: Anthropic’s Claude Agent SDK (renamed from the Claude Code SDK in late 2025) hands you the same loop, built-in tools, and context compaction that power Claude Code. OpenAI’s Agents SDK, whose core idea is the handoff — agents explicitly passing control with their context — picked up a model-native harness and sandboxed execution in its April 2026 update. Google’s ADK is the counterpart on its stack. For a single agent calling a couple of tools, one of these is usually the shortest road.

On frameworks, I’ll resist crowning a winner, because the practitioners I trust have stopped doing it. LangGraph models agents as nodes in a graph with explicit shared state, and has become the default where auditability and deterministic control matter. CrewAI uses a role-based metaphor and is the fastest way to a working multi-agent prototype — hours, not weeks. A common path is to start on CrewAI and migrate the parts that need fine control to LangGraph. The honest rule: match the framework to the shape of your workflow and your debugging needs, not to its star count.
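If “nodes in a graph with explicit shared state” sounds abstract, this plain-Python toy shows the shape of the idea. It is deliberately not LangGraph’s API: the node names, the state keys, and the run_graph helper are all hypothetical, and the review step is a stand-in for what would normally be another model call.

```python
from typing import Callable, Optional

State = dict
Node = Callable[[State], Optional[str]]   # a node updates state and names the next node

def draft(state: State) -> Optional[str]:
    state["draft"] = f"Draft answer to: {state['question']}"
    state["revisions"] = state.get("revisions", 0) + 1
    return "review"

def review(state: State) -> Optional[str]:
    # Stand-in check: approve on the second revision.
    state["approved"] = state["revisions"] >= 2
    return None if state["approved"] else "draft"

GRAPH: dict[str, Node] = {"draft": draft, "review": review}

def run_graph(start: str, state: State, max_hops: int = 10) -> State:
    """Walk the graph from `start`, recording the path so a run can be audited."""
    current: Optional[str] = start
    state["path"] = []
    for _ in range(max_hops):
        if current is None:   # a node returned None: the run is finished
            break
        state["path"].append(current)
        current = GRAPH[current](state)
    return state

if __name__ == "__main__":
    final = run_graph("draft", {"question": "What is an agent?"})
    print(final["path"], final["approved"])
```

The recorded path is the point: every hop is explicit and inspectable, which is roughly what people mean when they reach for a graph framework for auditability.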

The agent-to-agent thing, deflated

This is the part with the most podcast energy and the least day-one relevance, so it’s worth being precise.

Most of what gets sold as “multi-agent” is just one orchestrator model making other model calls as if they were tools, and summarising back up, which is exactly what the helpers in the tool you already use do. The fancier agent-to-agent protocols are about something else: interoperability, where an agent your company built talks to one that another company built. That’s an infrastructure and standards concern, not a feature you need to ship your first agent.
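The “other model calls as if they were tools” pattern fits in a few lines. In this sketch a helper is just a function that wraps a model call with its own instructions, so it can sit in the same kind of TOOLS dict as a calculator. ModelFn, make_subagent_tool, and orchestrate are hypothetical names, fake_model stands in for a real API call, and the fixed fan-out is a simplification: a real orchestrator would let the model pick which helper to call.

```python
from typing import Callable

ModelFn = Callable[[str], str]   # hypothetical: prompt in, text out

def make_subagent_tool(model: ModelFn, role: str) -> Callable[[str], str]:
    """Wrap a model call with its own instructions so it behaves like any other tool."""
    def tool(task: str) -> str:
        return model(f"You are a {role}. Do this and report back briefly: {task}")
    return tool

def orchestrate(goal: str, model: ModelFn, tools: dict) -> str:
    """Send the goal to each helper, then ask the model to summarise the reports."""
    reports = [f"{name}: {tool(goal)}" for name, tool in tools.items()]
    return model("Summarise these reports for the user:\n" + "\n".join(reports))

def fake_model(prompt: str) -> str:
    """Offline stand-in for a real model call; it only labels what it was asked."""
    first_line = prompt.splitlines()[0]
    return f"[reply to: {first_line[:40]}]"

if __name__ == "__main__":
    helpers = {
        "researcher": make_subagent_tool(fake_model, "researcher"),
        "critic": make_subagent_tool(fake_model, "critic"),
    }
    print(orchestrate("Check the launch plan", fake_model, helpers))
```

Nothing here crosses a process or company boundary, which is why no agent-to-agent protocol is involved.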

The cleanest mental model — and the single most useful thing to hold onto — is that there are two layers, not two rivals.

[Figure: MCP and A2A, the two layers. A2A is the horizontal layer (agent ↔ agent), connecting your agent, a model in a loop, to another agent built by someone else. MCP is the vertical layer (agent ↔ tools), connecting each agent to its own tools, data and APIs: files, search and your functions on your side; its own tools and data, behind its own boundary, on the other.]
Two layers, not two rivals. MCP standardises how an agent reaches its tools and data; A2A standardises how one agent talks to another. Start with the vertical one — most of us never need the horizontal one on day one.

MCP, the Model Context Protocol, is the vertical layer: how an agent reaches its tools and data. By early 2026 it had effectively won that layer, adopted across every major vendor, with — by some counts — tens of millions of SDK downloads a month. A2A, Agent2Agent, is the horizontal layer: how one agent talks to another, with capability cards at a known URL and a defined task lifecycle. And the real 2026 story isn’t a protocol war at all — both now sit under a shared, vendor-neutral foundation, the way HTTP and WebSocket and gRPC simply coexist on the web. Complementary layering, not winner-take-all.

The deflation, for builders: if you’re hand-rolling custom endpoints for agents to talk to each other in 2026, you’re mostly minting technical debt. But equally — most of us don’t need A2A on day one. Start with one agent and MCP tools. Add the horizontal layer only when you genuinely have separate, specialised agents that must coordinate across a boundary.
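For a feel of what the vertical layer standardises, here is a toy handler for the two JSON-RPC methods an MCP client uses to discover and call tools, tools/list and tools/call. It mirrors the message shapes as I understand them from the spec and nothing more: there is no transport, no initialisation handshake, and no capability negotiation, so treat it as a reading aid and use an official MCP SDK for anything real.

```python
def calculator(expression: str) -> str:
    # Toy only: never eval untrusted input.
    return str(eval(expression))

TOOLS = {
    "calculator": {
        "fn": calculator,
        "description": "Evaluate a math expression.",
        "inputSchema": {"type": "object",
                        "properties": {"expression": {"type": "string"}},
                        "required": ["expression"]},
    },
}

def handle(request: dict) -> dict:
    """Answer one JSON-RPC request in the shape MCP uses for tools."""
    method, params = request["method"], request.get("params", {})
    if method == "tools/list":
        result = {"tools": [{"name": name, "description": tool["description"],
                             "inputSchema": tool["inputSchema"]}
                            for name, tool in TOOLS.items()]}
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            # An unknown tool is a protocol-level error.
            return {"jsonrpc": "2.0", "id": request["id"],
                    "error": {"code": -32602, "message": "Unknown tool"}}
        try:
            text, is_error = tool["fn"](**params.get("arguments", {})), False
        except Exception as exc:
            # A tool that fails is still a valid result, flagged with isError.
            text, is_error = f"{type(exc).__name__}: {exc}", True
        result = {"content": [{"type": "text", "text": text}], "isError": is_error}
    else:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

if __name__ == "__main__":
    print(handle({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                  "params": {"name": "calculator",
                             "arguments": {"expression": "2 + 2"}}}))
```

Notice how close the tool entry is to the TOOL_SCHEMAS list in the fifty-line agent. The protocol mostly fixes the envelope, so any client can find and call tools it has never seen before.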

The assignment

So here’s the only homework that matters, and it’s the thing I did that finally closed my own gap: build the dumbest possible agent yourself. Fifty lines. The API, one or two tools you define, a while loop, no framework. Give it a goal that needs a tool, and watch it decide.

Once you’ve written the loop with your own hands, the entire landscape reorganises in your head. The frameworks stop being mysterious and become what they always were — convenience wrappers around an idea you now own. You stop wondering whether you’re behind.

You were never behind. You’d been inside the agent the whole time. You just hadn’t written the loop.