For a while now I’ve had a quiet, specific discomfort. I spend my days building with AI tools that feel close to sorcery: I describe a thing, and a capable assistant goes off, reads files, runs commands, spins up helpers, summarises what they found, and comes back with the work mostly done. And yet every other podcast told me I should build agents, and I couldn’t honestly say what I was missing. Everything I already did looked like agents. So either I was already doing it, or I was missing something and couldn’t see the edge of it.
It turns out both were true, and the gap between them is the whole story.
Using the loop versus writing the loop
Here’s the reframe that dissolved it for me. If you use a tool like Claude Code heavily, you have been riding one of the most sophisticated agents that exists — you just never wrote the loop. The helpers, the orchestration, the read-summarise-decide rhythm, the way it knows when to stop: that is all genuine agentic behaviour. Someone wrote that loop, picked the tools, and made the hard calls — context management, error recovery, when to quit. You experience the output of those decisions without ever having made them.
When people say “build an agent,” they mean something narrower and more humbling: you write the loop. You own the control flow, the tools, the memory, the error handling, and the stopping condition. Nobody made those calls for you.
The good news is that the loop is almost insultingly small.
What an agent actually is
Strip away the noise and an agent is one sentence: a model in a loop with tools. The model gets a goal. It decides to call a tool. It gets the result. It decides the next move. It repeats until it judges the job done. That while loop is the agent. Multi-agent systems, agent-to-agent protocols, orchestration frameworks — all of it is decoration on this core.
I don’t expect you to believe me until you’ve seen it, so here is the entire thing. Around fifty lines. It runs, given the anthropic Python package and an API key in the ANTHROPIC_API_KEY environment variable.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# 1. The TOOLS. This is where an agent's real power lives: what it can *do*.
def calculator(expression: str) -> str:
    return str(eval(expression))  # toy only; never eval untrusted input

def web_search(query: str) -> str:
    return f"(pretend results for: {query})"  # swap in a real search API

TOOLS = {"calculator": calculator, "web_search": web_search}

# 2. Describe those tools to the model so it knows when/how to call them.
TOOL_SCHEMAS = [
    {"name": "calculator", "description": "Evaluate a math expression.",
     "input_schema": {"type": "object",
                      "properties": {"expression": {"type": "string"}},
                      "required": ["expression"]}},
    {"name": "web_search", "description": "Search the web for a query.",
     "input_schema": {"type": "object",
                      "properties": {"query": {"type": "string"}},
                      "required": ["query"]}},
]

def run(goal: str):
    messages = [{"role": "user", "content": goal}]
    # 3. THE LOOP. This is the entire idea of an agent.
    while True:
        resp = client.messages.create(
            model="claude-opus-4-8",
            max_tokens=1024,
            tools=TOOL_SCHEMAS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})
        # No tool requested -> the agent decided it's done.
        if resp.stop_reason != "tool_use":
            # Print the first text block, or an empty line if there is none.
            print(next((b.text for b in resp.content if b.type == "text"), ""))
            return
        # 4. Run every tool the model asked for; feed results back in.
        results = []
        for block in resp.content:
            if block.type == "tool_use":
                output = TOOLS[block.name](**block.input)
                results.append({"type": "tool_result",
                                "tool_use_id": block.id, "content": output})
        messages.append({"role": "user", "content": results})

run("What is 17% of 6,400, then search for why that number might matter?")
Read it once and the magic evaporates, in the good way. The while True is the agent. The check stop_reason != "tool_use" is the stopping condition you’ve never had to write before: in the normal case it means the model has decided it’s finished (it also fires when a reply is cut off at max_tokens, which a sturdier loop would handle separately). The TOOLS dict is the lever: the agent can do exactly what’s in it and nothing else. Every framework you’ve heard of replaces this file with abstractions. None of them replace the idea in it.
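If you want to watch the loop turn without an API key, here is a dry-run sketch of the same shape. Everything in it is my own stand-in, not part of any SDK: scripted_model and FakeResponse fake the API with plain dicts, and run_loop swaps while True for a step budget and returns the final text instead of printing it.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class FakeResponse:
    """Hypothetical stand-in for an API response: just the two fields the loop reads."""
    stop_reason: str
    content: list[dict[str, Any]]

def scripted_model(messages: list[dict[str, Any]]) -> FakeResponse:
    """Fake model: asks for one calculator call, then answers with the result."""
    last = messages[-1]["content"]
    if isinstance(last, str):  # first turn: the user's goal as plain text
        return FakeResponse("tool_use", [{"type": "tool_use", "id": "t1",
                                          "name": "calculator",
                                          "input": {"expression": "17 * 6400 / 100"}}])
    result = last[0]["content"]  # later turn: a list of tool_result blocks
    return FakeResponse("end_turn", [{"type": "text",
                                      "text": f"The answer is {result}."}])

def calculator(expression: str) -> str:
    return str(eval(expression))  # toy only; never eval untrusted input

def run_loop(goal: str,
             model: Callable[[list[dict[str, Any]]], FakeResponse],
             tools: dict[str, Callable[..., str]],
             max_steps: int = 5) -> str:
    messages: list[dict[str, Any]] = [{"role": "user", "content": goal}]
    for _ in range(max_steps):  # bounded, unlike `while True`
        resp = model(messages)
        messages.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":  # the model did not ask for a tool
            return next((b["text"] for b in resp.content if b["type"] == "text"), "")
        results = [{"type": "tool_result", "tool_use_id": b["id"],
                    "content": tools[b["name"]](**b["input"])}
                   for b in resp.content if b["type"] == "tool_use"]
        messages.append({"role": "user", "content": results})
    return "(stopped: step budget exhausted)"
```

Calling run_loop("What is 17% of 6,400?", scripted_model, {"calculator": calculator}) walks two turns: one tool request, one final answer. Passing max_steps=1 shows the budget cutting the run short, which is a choice the real loop above never had to make.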
You’re not behind. You’ve been living inside the agent the whole time. You just have to drop one layer of abstraction and write the loop once.
The four things you never had to think about
Once you’ve written the loop, you can see precisely what a product like Claude Code was doing for you: four things that were invisible because they were handled.
The loop itself. Control flow, parsing tool calls, recovering when a tool throws. You just read almost all of it: the toy above does the first two and skips error recovery, so a single tool exception would end the run. It was never the hard part; it was just hidden.
Custom tools — the biggest lever by far. The toy above has a calculator. A real agent’s power is the tools you define for your problem: refund_order, check_inventory, escalate_to_human. The model is only as capable as its tools allow. This is invisible when the tools were chosen for you, and it’s where almost all the real work — and real value — lives.
Running unattended. The tool you ride is interactive — you approve, you redirect, you catch it when it wanders. Most production agents run with no one watching. Reliability without a human to course-correct is a genuinely different discipline, and it’s the one the demos quietly skip.
The supporting cast that unattended operation forces. The moment no one is watching, you need: evals (how do you know it works if you’re asleep?), observability (every step logged, so you can debug a run that already happened), guardrails (what is it forbidden to do), and durable memory across runs. None of this is exotic. All of it is the difference between a demo and a thing you’d trust.
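To make a few of those concrete, here is a minimal sketch of one custom tool wrapped in the bare minimum of unattended-operation plumbing: an allowlist as the guardrail, a list of records as the log, and a try/except so a failing tool reports back to the model instead of killing the run. The names INVENTORY, ALLOWED, RUN_LOG and guarded_call are all hypothetical; the is_error field on a tool_result block is, as far as I know, how the Anthropic API lets you flag a failed call.

```python
import time
from typing import Any, Callable

INVENTORY = {"SKU-1001": 12, "SKU-2002": 0}  # hypothetical stand-in for a real stock system

def check_inventory(sku: str) -> str:
    """A custom tool: look up stock for one product code."""
    count = INVENTORY[sku]  # raises KeyError for an unknown SKU
    return f"{sku}: {count} in stock"

TOOLS: dict[str, Callable[..., str]] = {"check_inventory": check_inventory}
ALLOWED = {"check_inventory"}        # guardrail: the only tools this agent may run
RUN_LOG: list[dict[str, Any]] = []   # observability: one record per tool call

def guarded_call(name: str, tool_input: dict[str, Any],
                 tool_use_id: str) -> dict[str, Any]:
    """Run one tool call; always return a tool_result block and always log it."""
    record: dict[str, Any] = {"ts": time.time(), "tool": name, "input": tool_input}
    result: dict[str, Any] = {"type": "tool_result", "tool_use_id": tool_use_id}
    if name not in ALLOWED:
        result.update(content=f"Refused: {name} is not permitted.", is_error=True)
    else:
        try:
            result["content"] = TOOLS[name](**tool_input)
        except Exception as exc:  # tell the model what failed instead of crashing
            result.update(content=f"{type(exc).__name__}: {exc}", is_error=True)
    record["outcome"] = "error" if result.get("is_error") else "ok"
    RUN_LOG.append(record)
    return result
```

In the loop, guarded_call(block.name, block.input, block.id) would replace the direct TOOLS[...] lookup. Evals and durable memory are bigger topics than a sketch this size can carry, so they are left out here on purpose.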
The landscape, honestly mapped
When you go looking for help with the above, you hit a wall of options. It sorts cleanly into three layers, and the useful thing is a decision heuristic, not a feature list.
| Layer | What it is | Reach for it when |
|---|---|---|
| Vendor SDKs | One vendor’s loop, tools and context management, exposed directly — the same machinery their own products run on | You want the fastest path to a single agent on one model family |
| Frameworks | Orchestration across many agents or steps — graphs, roles, shared state, persistence | You need multi-agent coordination or graph-shaped control flow |
| Protocols | Standards for how agents talk to tools, and to each other | You’re connecting across a boundary you don’t own |
On the vendor SDKs: Anthropic’s Claude Agent SDK (renamed from the Claude Code SDK in late 2025) hands you the same loop, built-in tools, and context compaction that power Claude Code. OpenAI’s Agents SDK, whose core idea is the handoff — agents explicitly passing control with their context — picked up a model-native harness and sandboxed execution in its April 2026 update. Google’s ADK is the counterpart on its stack. For a single agent calling a couple of tools, one of these is usually the shortest road.
On frameworks, I’ll resist crowning a winner, because the practitioners I trust have stopped doing it. LangGraph models agents as nodes in a graph with explicit shared state, and has become the default where auditability and deterministic control matter. CrewAI uses a role-based metaphor and is the fastest way to a working multi-agent prototype — hours, not weeks. A common path is to start on CrewAI and migrate the parts that need fine control to LangGraph. The honest rule: match the framework to the shape of your workflow and your debugging needs, not to its star count.
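If “nodes in a graph with explicit shared state” sounds abstract, this framework-free sketch shows the shape. It is emphatically not LangGraph’s API; draft, review, GRAPH and run_graph are names I made up, and the nodes are plain functions where a real system would put model calls.

```python
from typing import Any, Callable, Optional

State = dict[str, Any]
Node = Callable[[State], Optional[str]]  # a node updates state and names the next node

def draft(state: State) -> Optional[str]:
    state["draft"] = f"Draft reply about {state['topic']}"
    return "review"

def review(state: State) -> Optional[str]:
    state["attempts"] = state.get("attempts", 0) + 1
    state["approved"] = state["attempts"] >= 2  # toy rule: approve on the second pass
    return None if state["approved"] else "draft"

GRAPH: dict[str, Node] = {"draft": draft, "review": review}

def run_graph(start: str, state: State, max_hops: int = 10) -> list[str]:
    """Walk the graph from `start`; return the ordered list of nodes visited."""
    trail: list[str] = []
    current: Optional[str] = start
    while current is not None and len(trail) < max_hops:
        trail.append(current)
        current = GRAPH[current](state)
    return trail
```

Running run_graph("draft", {"topic": "refunds"}) visits draft, review, draft, review and then stops. The returned trail is the point: when control flow is an explicit graph, every run leaves a path you can audit, which is the property the graph-style frameworks are selling.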
The agent-to-agent thing, deflated
This is the part with the most podcast energy and the least day-one relevance, so it’s worth being precise.
Most of what gets sold as “multi-agent” is just one orchestrator model making other model calls as if they were tools, and summarising back up, which is exactly what the helpers in the tool you already use do. The fancier agent-to-agent protocols are about something else: interoperability, where an agent your company built talks to an agent another company built. That’s an infrastructure and standards concern, not a feature you need to ship your first agent.
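The “other model calls as if they were tools” part is smaller than it sounds. In this sketch, make_subagent_tool wraps any agent loop that returns its final text so an orchestrator can call it by name like a calculator. Both make_subagent_tool and fake_run_agent are hypothetical; the fake stands in for a real loop so the example runs offline.

```python
from typing import Callable

AgentFn = Callable[[str, str], str]  # (system_prompt, task) -> final text

def make_subagent_tool(run_agent: AgentFn, system_prompt: str) -> Callable[[str], str]:
    """Wrap an agent loop so an orchestrator can call it like any other tool."""
    def tool(task: str) -> str:
        return run_agent(system_prompt, task)
    return tool

def fake_run_agent(system_prompt: str, task: str) -> str:
    """Hypothetical stand-in for a real model loop that returns its final answer."""
    return f"[{system_prompt}] summary of: {task}"

# The orchestrator's tool table: each helper is just another callable.
TOOLS = {
    "research_helper": make_subagent_tool(fake_run_agent, "You research and summarise."),
    "code_reviewer": make_subagent_tool(fake_run_agent, "You review code for bugs."),
}
```

The orchestrator’s loop does not change at all. It still looks up a name in TOOLS and passes the model’s input through, here a single task string, and what comes back is the helper’s summary rather than its whole transcript.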
The cleanest mental model — and the single most useful thing to hold onto — is that there are two layers, not two rivals.
MCP, the Model Context Protocol, is the vertical layer: how an agent reaches its tools and data. By early 2026 it had effectively won that layer, adopted across every major vendor, with — by some counts — tens of millions of SDK downloads a month. A2A, Agent2Agent, is the horizontal layer: how one agent talks to another, with capability cards at a known URL and a defined task lifecycle. And the real 2026 story isn’t a protocol war at all — both now sit under a shared, vendor-neutral foundation, the way HTTP and WebSocket and gRPC simply coexist on the web. Complementary layering, not winner-take-all.
The deflation, for builders: if you’re hand-rolling custom endpoints for agents to talk to each other in 2026, you’re mostly minting technical debt. But equally — most of us don’t need A2A on day one. Start with one agent and MCP tools. Add the horizontal layer only when you genuinely have separate, specialised agents that must coordinate across a boundary.
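For a feel of what the vertical layer looks like on the wire, here is a sketch of the two MCP tool messages as I understand the spec: JSON-RPC 2.0 requests with the methods tools/list and tools/call. This is an in-process illustration of the message shapes only, not a conformant server; it skips the initialize handshake and any transport, and check_inventory, TOOL_LIST, HANDLERS and handle are hypothetical names.

```python
from typing import Any

def check_inventory(sku: str) -> str:
    return f"{sku}: 12 in stock"  # hypothetical tool body

TOOL_LIST = [{
    "name": "check_inventory",
    "description": "Look up stock for a product code.",
    "inputSchema": {"type": "object",
                    "properties": {"sku": {"type": "string"}},
                    "required": ["sku"]},
}]
HANDLERS = {"check_inventory": check_inventory}

def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Answer one JSON-RPC request in the shape MCP uses for tools."""
    if request["method"] == "tools/list":
        result: dict[str, Any] = {"tools": TOOL_LIST}
    elif request["method"] == "tools/call":
        params = request["params"]
        text = HANDLERS[params["name"]](**params["arguments"])
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        # Standard JSON-RPC error code for an unknown method.
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}
```

A request such as {"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "check_inventory", "arguments": {"sku": "SKU-1001"}}} comes back with the tool’s text inside result.content. In practice you would reach for an MCP SDK rather than hand-roll this, but seeing the shape makes it clear the protocol is the same describe-then-call idea as TOOL_SCHEMAS, moved across a process boundary.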
The assignment
So here’s the only homework that matters, and it’s the thing I did that finally closed my own gap: build the dumbest possible agent yourself. Fifty lines. The API, one or two tools you define, a while loop, no framework. Give it a goal that needs a tool, and watch it decide.
Once you’ve written the loop with your own hands, the entire landscape reorganises in your head. The frameworks stop being mysterious and become what they always were — convenience wrappers around an idea you now own. You stop wondering whether you’re behind.
You were never behind. You’d been inside the agent the whole time. You just hadn’t written the loop.