The Agent Loop in 40 Lines

Every agent framework wraps the same small loop. Here it is, written by hand, so you know what you are actually buying when you adopt LangChain or LangGraph.


There is a meme in LLM engineering that says an agent is "an LLM in a while loop." Like most memes, it's compact, a little smug, and very nearly the truth.

Here is the whole truth, written out: an agent is a function that calls the model, executes any tool the model asks for, feeds the tool's output back to the model, and keeps going until the model produces a final answer or the loop hits a budget. That's it. Forty lines of Python. No framework. No LangGraph. No vector database. No MCP. No Redis. Just the loop.

This post is going to write that loop by hand, from scratch, in a way you can paste into a file and run right now. At the end of it you will have a working agent with two tools that can answer multi-step questions that a single tool call cannot. More importantly, you will know exactly what every "agent framework" on the market is wrapping, so the next time a marketing page talks about "cognitive architectures" you can translate it back to "a while loop around messages.create."


The shape

Start with the picture of what we're building. Two tools. A loop. A budget.

The loop has three branches: final answer, tool calls, or budget exhausted. Each iteration either makes progress (runs some tools and adds their results to history) or terminates. That's the whole state machine.
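
Before any API is involved, the state machine can be sketched with a stand-in for the model. This is only an illustration: run_loop, scripted_model, stuck_model and the {"kind": ...} step format are hypothetical names invented here, not part of any SDK.

```python
from typing import Callable


def run_loop(model_step: Callable[[list], dict], max_iterations: int = 10) -> str:
    """Drive a stand-in 'model' until it answers or the budget runs out."""
    history: list = []
    for _ in range(max_iterations):
        step = model_step(history)
        if step["kind"] == "final":        # branch 1: final answer
            return step["text"]
        history.append(step["calls"])      # branch 2: tool calls, record progress
    return "[halted: budget exhausted]"    # branch 3: budget exhausted


def scripted_model(history: list) -> dict:
    """Hypothetical stand-in: asks for tools twice, then answers."""
    if len(history) < 2:
        return {"kind": "tools", "calls": [f"call-{len(history)}"]}
    return {"kind": "final", "text": "done after 2 tool rounds"}


def stuck_model(history: list) -> dict:
    """Hypothetical stand-in that never stops asking for tools."""
    return {"kind": "tools", "calls": ["same-call"]}


print(run_loop(scripted_model))                  # done after 2 tool rounds
print(run_loop(stuck_model, max_iterations=3))   # [halted: budget exhausted]
```

The real loop later in the post has the same three exits; the only difference is that the step comes from messages.create instead of a scripted function.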


The tools

Two toy tools that are boring enough to be clear and chainable enough to be interesting. One returns the current time in a city. The other computes the number of hours between two timestamps.

from datetime import datetime
from zoneinfo import ZoneInfo

def get_current_time(city: str) -> dict:
    """Look up the current time in a city, using a tiny hardcoded map."""
    zones = {
        "London": "Europe/London",
        "New York": "America/New_York",
        "Tokyo": "Asia/Tokyo",
        "Sydney": "Australia/Sydney",
    }
    zone_name = zones.get(city)
    if not zone_name:
        return {"error": f"unknown city: {city}"}
    now = datetime.now(ZoneInfo(zone_name))
    return {"city": city, "iso": now.isoformat(), "hour": now.hour}

def hours_between(iso_a: str, iso_b: str) -> dict:
    """How many hours between two ISO timestamps."""
    a = datetime.fromisoformat(iso_a)
    b = datetime.fromisoformat(iso_b)
    delta_hours = (b - a).total_seconds() / 3600
    return {"hours": round(delta_hours, 2)}

These are chainable: "what time is it in Sydney, and how many hours until my Tokyo meeting at 2026-04-13T14:00:00+09:00" is a two-tool question. The model has to call get_current_time("Sydney") first, then feed that result into hours_between along with the meeting time. A single tool call can't answer. A loop can.
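
To make the chaining concrete, here is the same two-step sequence done by hand, with no model involved. It is a sketch under one assumption: the first tool's result is hardcoded to a sample value (11:03 AM in Sydney on the meeting day) so the output is reproducible, whereas a real run would use whatever get_current_time returns at that moment.

```python
from datetime import datetime


def hours_between(iso_a: str, iso_b: str) -> dict:
    """Same logic as the tool above: hours from iso_a to iso_b."""
    a = datetime.fromisoformat(iso_a)
    b = datetime.fromisoformat(iso_b)
    return {"hours": round((b - a).total_seconds() / 3600, 2)}


# Step 1: a sample get_current_time("Sydney") result, hardcoded for this sketch.
sydney = {"city": "Sydney", "iso": "2026-04-13T11:03:00+10:00", "hour": 11}

# Step 2: feed step 1's output into the second tool along with the meeting time.
meeting = "2026-04-13T14:00:00+09:00"
answer = hours_between(sydney["iso"], meeting)
print(answer)  # {'hours': 3.95}
```

The second call cannot be made until the first result exists, which is exactly why the model needs more than one turn.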


The tool schemas

Same format as B4.1. Two tools, each with a description and input schema.

TOOLS = [
    {
        "name": "get_current_time",
        "description": (
            "Return the current local time for a city as an ISO-8601 "
            "timestamp and the hour of the day."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name, e.g. 'London' or 'Tokyo'.",
                },
            },
            "required": ["city"],
        },
    },
    {
        "name": "hours_between",
        "description": (
            "Return the number of hours between two ISO-8601 timestamps. "
            "Useful for computing time until an event."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "iso_a": {
                    "type": "string",
                    "description": "The earlier ISO-8601 timestamp.",
                },
                "iso_b": {
                    "type": "string",
                    "description": "The later ISO-8601 timestamp.",
                },
            },
            "required": ["iso_a", "iso_b"],
        },
    },
]

# Dispatcher: map each tool name to the function that implements it
HANDLERS = {
    "get_current_time": get_current_time,
    "hours_between": hours_between,
}
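
Because the schemas and the handlers are written separately, they can drift apart. A minimal sketch of a startup check follows, assuming every handler takes exactly the parameters its schema lists; check_tools and the demo data are hypothetical and deliberately mismatched to show both kinds of problem.

```python
import inspect


def check_tools(tools: list, handlers: dict) -> list:
    """Return a list of mismatches between tool schemas and handler functions."""
    problems = []
    schema_names = set()
    for tool in tools:
        name = tool["name"]
        schema_names.add(name)
        handler = handlers.get(name)
        if handler is None:
            problems.append(f"{name}: schema has no handler")
            continue
        schema_params = set(tool["input_schema"]["properties"])
        func_params = set(inspect.signature(handler).parameters)
        if schema_params != func_params:
            problems.append(
                f"{name}: schema params {sorted(schema_params)} "
                f"!= handler params {sorted(func_params)}"
            )
    for name in sorted(set(handlers) - schema_names):
        problems.append(f"{name}: handler has no schema")
    return problems


def get_current_time(city: str) -> dict:
    """Stub with the same signature as the real tool."""
    return {"city": city}


# Deliberately broken demo data: wrong parameter name, and a handler with no schema.
demo_tools = [{
    "name": "get_current_time",
    "input_schema": {
        "type": "object",
        "properties": {"town": {"type": "string"}},
        "required": ["town"],
    },
}]
demo_handlers = {
    "get_current_time": get_current_time,
    "hours_between": lambda iso_a, iso_b: {},
}

for problem in check_tools(demo_tools, demo_handlers):
    print(problem)
```

Run against the real TOOLS and HANDLERS from this post, the function should return an empty list.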

The loop itself

Here are the forty lines. Everything else in this post is setup. This is the agent.

# pip install anthropic
import json
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

SYSTEM = (
    "You are a helpful assistant. Use the provided tools to answer "
    "questions about time zones. When you have the information you need, "
    "give the final answer directly without unnecessary tool calls."
)

def run_agent(user_message: str, max_iterations: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]

    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1000,
            system=SYSTEM,
            tools=TOOLS,
            messages=messages,
        )

        # If the model is done, return the text answer.
        if response.stop_reason == "end_turn":
            # Fall back to an empty string if the response has no text block.
            return next((b.text for b in response.content if b.type == "text"), "")

        # Otherwise the model wants to call one or more tools.
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})

            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue
                handler = HANDLERS.get(block.name)
                if handler is None:
                    result = {"error": f"unknown tool: {block.name}"}
                else:
                    try:
                        result = handler(**block.input)
                    except Exception as e:
                        result = {"error": str(e)}
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                })

            messages.append({"role": "user", "content": tool_results})
            continue

        # Unexpected stop reason (for example max_tokens): bail.
        return f"[agent halted: unexpected stop_reason={response.stop_reason}]"

    return "[agent halted: max iterations reached]"

if __name__ == "__main__":
    print(run_agent(
        "What's the current time in Sydney, and how many hours until "
        "2026-04-13T14:00:00+09:00 in Tokyo?"
    ))

Paste that (and the tool definitions and schemas above) into agent.py, set your ANTHROPIC_API_KEY, and run. The model will call get_current_time("Sydney"), read the result, then call hours_between(iso_a=sydney_now, iso_b="2026-04-13T14:00:00+09:00"), read the result, and produce a final answer like "It's currently 11:03 AM in Sydney. That's 3 hours and 57 minutes until your 2 PM meeting in Tokyo." (In mid-April Tokyo is one hour behind Sydney, so 2 PM in Tokyo is 3 PM in Sydney.)

That is an agent. Two tools, one while-loop, a budget, error handling, structured dispatch, and a final answer. No framework. No graph abstraction. No cognitive architecture. Forty lines.


What every line is doing

Let me walk through the non-obvious parts of that loop, because each line has a reason:

for iteration in range(max_iterations): the budget. Without this, a confused model could call tools in an infinite loop (especially easy if your tools are slightly broken and the model keeps retrying). Ten iterations is a reasonable default for most tasks; bigger budgets cost more money and are rarely needed.

if response.stop_reason == "end_turn": this is the "model is done" branch. end_turn means the model produced a text answer without requesting more tools. Return it.

if response.stop_reason == "tool_use": the model wants to run tools. We append the full assistant response (which contains the tool-use blocks) to the conversation, then run the tools.

messages.append({"role": "assistant", "content": response.content}): you must append the full response.content from the model before appending tool results. This preserves the tool-use blocks the model produced, which the next messages.create call uses to pair tool_use_id with the corresponding tool_result. Skip this and you get an error.

handler = HANDLERS.get(block.name): dispatch by name. A dictionary lookup, not a chain of if statements. Scales cleanly to 50 tools.

The try/except around result = handler(**block.input): catch any exception from the tool and return it as a structured error. Raising an exception in the loop would kill the agent; returning {"error": ...} lets the model see the failure and decide whether to retry or try a different approach.

tool_results.append({...}): we collect all tool results before appending them as one user turn, because the model might have emitted multiple tool_use blocks in a single response (parallel tool calls). All results for a single assistant turn must be appended together.

continue: back to the top of the loop for another round.
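
The pairing rule described above (every tool_use block answered by a tool_result in the very next user turn) can be checked mechanically. The sketch below assumes the history is stored as plain dicts; the SDK returns content block objects, so a real history would need converting or attribute access instead. unpaired_tool_uses and the toolu_a / toolu_b ids are made up for illustration.

```python
def unpaired_tool_uses(messages: list) -> list:
    """Return ids of tool_use blocks with no tool_result in the next user turn."""
    missing = []
    for i, msg in enumerate(messages):
        if msg["role"] != "assistant" or isinstance(msg["content"], str):
            continue
        use_ids = [b["id"] for b in msg["content"] if b["type"] == "tool_use"]
        if not use_ids:
            continue
        result_ids = set()
        if i + 1 < len(messages):
            nxt = messages[i + 1]
            if nxt["role"] == "user" and not isinstance(nxt["content"], str):
                result_ids = {
                    b["tool_use_id"]
                    for b in nxt["content"]
                    if b["type"] == "tool_result"
                }
        missing.extend(u for u in use_ids if u not in result_ids)
    return missing


# One assistant turn with two parallel tool calls, but only one result appended.
history = [
    {"role": "user", "content": "What time is it in Sydney and Tokyo?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_a", "name": "get_current_time",
         "input": {"city": "Sydney"}},
        {"type": "tool_use", "id": "toolu_b", "name": "get_current_time",
         "input": {"city": "Tokyo"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_a", "content": '{"hour": 11}'},
    ]},
]

print(unpaired_tool_uses(history))  # ['toolu_b']
```

A check like this is cheap to run in tests before a history is sent back to the model.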

Nine substantive lines of logic, thirty-one lines of boilerplate and comments. The agent is small.
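
The dispatch and error-handling lines are easy to try in isolation. Here is a small sketch that pulls them into a helper; call_tool is a hypothetical name, and including the exception type in the message is an addition of this sketch (the loop above returns only str(e)).

```python
from datetime import datetime


def hours_between(iso_a: str, iso_b: str) -> dict:
    """Same logic as the tool defined earlier in the post."""
    a = datetime.fromisoformat(iso_a)
    b = datetime.fromisoformat(iso_b)
    return {"hours": round((b - a).total_seconds() / 3600, 2)}


def call_tool(handlers: dict, name: str, tool_input: dict) -> dict:
    """Dispatch by name and turn every failure into a structured error."""
    handler = handlers.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**tool_input)
    except Exception as e:  # deliberately broad: the loop must survive any tool bug
        return {"error": f"{type(e).__name__}: {e}"}


handlers = {"hours_between": hours_between}

# A malformed timestamp: the handler raises ValueError, the caller sees a dict.
print(call_tool(handlers, "hours_between",
                {"iso_a": "not-a-date", "iso_b": "2026-04-13T14:00:00+09:00"}))

# A missing argument: Python raises TypeError, again returned as data.
print(call_tool(handlers, "hours_between", {"iso_a": "2026-04-13T14:00:00+09:00"}))

# A tool name that was never registered.
print(call_tool(handlers, "get_weather", {}))  # {'error': 'unknown tool: get_weather'}
```

In all three cases the model receives a result it can read and react to, and the loop keeps running.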


What this loop gives you for free

Now the interesting part. Look at what this forty-line loop can already do:

  • Multi-step tasks. The model can chain two, three, or ten tool calls across iterations to answer a complex question.
  • Error recovery. If a tool fails, the model sees the error as a structured result and can try a different tool, retry with different args, or fall back to an apology.
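  • Parallel tool calls. The inner loop over response.content handles multiple tool-use blocks in one turn, so you get a 2-4x latency improvement on branching questions for free, because several tool requests share one model round trip. The handlers themselves still run one after another.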
  • Budget safety. A hard cap prevents runaway loops.
  • Observability. Print response.content or messages at each iteration and you have a full trace of what the agent did. (In production, log this, not print.)
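
If the tools themselves are slow (network calls, for example), the blocks from one turn can also be executed concurrently. The following is a sketch rather than part of the forty lines: run_tools_concurrently is a hypothetical helper, the calls are plain dicts standing in for the SDK's tool_use blocks, and it assumes the handlers are safe to run in threads.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor


def run_tools_concurrently(calls: list, handlers: dict, max_workers: int = 4) -> list:
    """Run every tool call from one assistant turn in a thread pool."""

    def run_one(call: dict) -> dict:
        handler = handlers.get(call["name"])
        if handler is None:
            result = {"error": f"unknown tool: {call['name']}"}
        else:
            try:
                result = handler(**call["input"])
            except Exception as e:
                result = {"error": str(e)}
        return {
            "type": "tool_result",
            "tool_use_id": call["id"],
            "content": json.dumps(result),
        }

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so ids line up with the calls.
        return list(pool.map(run_one, calls))


def slow_lookup(city: str) -> dict:
    """Hypothetical slow tool: sleeps to imitate a network call."""
    time.sleep(0.1)
    return {"city": city}


calls = [
    {"id": "toolu_a", "name": "slow_lookup", "input": {"city": "Sydney"}},
    {"id": "toolu_b", "name": "slow_lookup", "input": {"city": "Tokyo"}},
    {"id": "toolu_c", "name": "missing_tool", "input": {}},
]
results = run_tools_concurrently(calls, {"slow_lookup": slow_lookup})
for r in results:
    print(r["tool_use_id"], r["content"])
```

All results still go back to the model together as a single user turn, exactly as in the sequential version.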

What this loop does not give you yet:

  • State persistence across conversation restarts. If the user closes the chat and comes back, the agent starts fresh unless you store messages in a database.
  • Planning. The model decides each step as it goes, not by first writing a plan and executing it. Often fine, occasionally worse on complex tasks. B4.3 covers the planning question.
  • Multi-agent coordination. One model, one loop. If you want multiple specialised agents, you need a router. We cover that in B4.4 and argue that most "multi-agent" setups are better as one smarter agent.
  • Human-in-the-loop approval. Destructive tools should require a human confirmation (see B2.4). This loop doesn't have that built in; it runs every tool the model asks for. In production, gate destructive tools explicitly in the HANDLERS dispatch.
  • Observability at scale. Print statements are fine for dev. For prod you want structured traces with OpenTelemetry or LangSmith-style tracing. We cover observability in B5.3.

Every single one of those is a small addition to the loop. A database for state. An if tool.destructive: await confirm() check. A wrapper that logs each iteration. None of them require throwing out the loop; all of them bolt on.
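
As one example of how small these bolt-ons are, here is a sketch of the approval gate. Everything in it is hypothetical: the DESTRUCTIVE set, the delete_file tool, and the confirm callback, which is synchronous here for simplicity and would be whatever approval mechanism your product has.

```python
from typing import Callable

# Hypothetical: names of tools that must never run without human approval.
DESTRUCTIVE = {"delete_file"}


def gated_call(
    name: str,
    tool_input: dict,
    handlers: dict,
    confirm: Callable[[str, dict], bool],
) -> dict:
    """Dispatch a tool call, asking a human first if the tool is destructive."""
    if name in DESTRUCTIVE and not confirm(name, tool_input):
        # Returned as data so the model can see the refusal and adapt.
        return {"error": f"{name} was not approved by a human"}
    handler = handlers.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**tool_input)
    except Exception as e:
        return {"error": str(e)}


def delete_file(path: str) -> dict:
    """Hypothetical destructive tool; only pretends to delete."""
    return {"deleted": path}


handlers = {"delete_file": delete_file}

denied = gated_call("delete_file", {"path": "notes.txt"}, handlers,
                    confirm=lambda name, args: False)
approved = gated_call("delete_file", {"path": "notes.txt"}, handlers,
                      confirm=lambda name, args: True)
print(denied)    # {'error': 'delete_file was not approved by a human'}
print(approved)  # {'deleted': 'notes.txt'}
```

Swapping handler(**block.input) in the loop for a call like this is the whole change.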


So why do frameworks exist?

They exist because every team that builds an agent inevitably writes this loop, then adds the bolt-ons, then realises they've reinvented half a framework, then asks "should we just adopt one?" The answer depends.

Use a framework when:

  • You're building a complex multi-step system with branching logic, parallel sub-agents, and state that has to persist across sessions. LangGraph's graph abstraction genuinely helps here.
  • You want to standardise tool definitions across many teams or services. Frameworks give you a common vocabulary.
  • You value a batteries-included observability story and don't want to wire traces yourself.
  • You have a team that's comfortable with the framework and you value shared knowledge.

Write the loop by hand when:

  • Your agent is a simple one: a few tools, one conversation, one purpose. Most production agents are this.
  • You're prototyping and need to understand exactly what's happening.
  • You're building something that doesn't fit the framework's assumptions (weird deployment target, custom storage, unusual tool semantics).
  • You want to stay on the latest model features before frameworks catch up.

The middle ground is fine too: write the core loop by hand, pull in a framework later if/when you need specific features. Don't start with the framework.


Admit what breaks

  • Models occasionally loop. A confused agent can call the same tool three times in a row with identical args, waiting for a different result. Your budget catches this eventually, but it's wasteful. Log repeated identical tool calls and shortcut them.
  • Tool results that exceed the context window. Your tool returns 50,000 lines of logs. The model's context gets eaten. Truncate or summarise tool outputs before adding them to the history.
  • The model hallucinates tool names. Constrained decoding blocks the obvious cases, but a confused agent can still produce malformed tool uses. The unknown tool error branch exists for this reason.
  • Parallel tools that interfere. Two tool calls that both modify the same resource can race. Make tools idempotent, serialise mutations, or validate preconditions in the handler.
  • The model decides a task is done when it isn't. end_turn means "I'm done," but "done" is the model's judgment. If the model gives up too early, your system prompt needs to push it harder ("Continue calling tools until you have a complete answer").
  • Max iterations too low. You set max_iterations=3, the task needs 5, and your agent quietly returns "[agent halted: max iterations reached]" as if it were an answer. Log this case loudly and alert on it in production.
  • Max iterations too high. You set max_iterations=100 and a buggy tool causes a loop. Bill surprise. Keep budgets small and increase them explicitly when you measure a need.
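
Two of the failure modes above, oversized tool output and repeated identical calls, have guards that fit in a few lines. This is a sketch with made-up names and thresholds (truncate_output, RepeatGuard, 4000 characters, two repeats); tune both numbers to your own tools.

```python
import json


def truncate_output(text: str, max_chars: int = 4000) -> str:
    """Cap a tool result so one huge output cannot eat the context window."""
    if len(text) <= max_chars:
        return text
    dropped = len(text) - max_chars
    return text[:max_chars] + f"\n[truncated {dropped} characters]"


class RepeatGuard:
    """Count identical (tool name, arguments) calls within one agent run."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.counts: dict = {}

    def should_block(self, name: str, tool_input: dict) -> bool:
        # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} the same key.
        key = (name, json.dumps(tool_input, sort_keys=True))
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] > self.max_repeats


guard = RepeatGuard(max_repeats=2)
decisions = [guard.should_block("get_current_time", {"city": "Sydney"}) for _ in range(3)]
print(decisions)                      # [False, False, True]
print(truncate_output("a" * 10, 4))   # aaaa, then a line: [truncated 6 characters]
```

When should_block returns True, one option is to skip the handler and return a structured error telling the model it already has that result.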

What just changed in your code

  • You now know what an agent is. A function with a while loop, a model call, a dispatcher, and a budget. Not a framework. Not a buzzword.
  • Write the loop by hand for your first agent. You'll be faster than fighting a framework, and you'll know where every piece of behaviour comes from.
  • Always have a budget. max_iterations is non-negotiable. Ten is a reasonable default.
  • Always catch tool exceptions and return them as structured errors. Never let a tool raise inside the loop.
  • Always append the full assistant response before tool results. Skipping this breaks the pairing of tool_use_id and tool_result.
  • Log every iteration's response.content, at minimum the tool names and arguments. Observability in agents is more important than in any other LLM code because the behaviour is autonomous.
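
For the logging point, a minimal sketch using the standard logging module is below. log_iteration and the record layout are assumptions of this example, and SimpleNamespace objects stand in for the SDK's content blocks (which expose type, name and input as attributes in the loop above).

```python
import json
import logging
from types import SimpleNamespace

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent")


def log_iteration(iteration: int, stop_reason: str, content: list) -> dict:
    """Log one loop iteration as a single JSON line and return the record."""
    record = {
        "iteration": iteration,
        "stop_reason": stop_reason,
        "tool_calls": [
            {"name": block.name, "input": block.input}
            for block in content
            if block.type == "tool_use"
        ],
    }
    logger.info(json.dumps(record))
    return record


# Stand-ins for response.content: one text block and one tool_use block.
fake_content = [
    SimpleNamespace(type="text", text="Let me check the time."),
    SimpleNamespace(type="tool_use", id="toolu_a", name="get_current_time",
                    input={"city": "Sydney"}),
]
record = log_iteration(0, "tool_use", fake_content)
```

Calling this right after messages.create in the loop gives one structured line per iteration, which is enough to reconstruct what the agent did.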

Next post, B4.3, we take on the most hyped question in agent research: planning vs reacting. ReAct, reflection, and plan-and-execute are three patterns the papers love. We'll look honestly at which ones actually help in production and which ones are expensive and optional.


Cover photo via Unsplash. This post is part of the AI for Builders series.
