Agentic Loops
An agentic loop is the core execution pattern of an AI agent: the model observes its environment, reasons about what to do, takes an action, observes the result, and repeats until the task is complete.

An agentic loop is the repeating execution pattern that turns a language model into an AI agent. Instead of receiving a question and returning a single answer, the model runs through a cycle: observe what is happening, decide what to do, act on that decision, observe the result of the action, and then repeat. The loop continues until the agent determines the task is complete.
The word “loop” is precise. Each iteration feeds the result of the previous action back into the model’s input, giving the model new information to reason about. This feedback mechanism is what separates an agent from a standard LLM call. A single call is a one-shot question-and-answer exchange. A loop is a process that evolves based on what the model learns at each step.
This pattern appears across almost every agent framework in production today: ReAct (Reason + Act), OpenAI Assistants, LangGraph, AutoGen, Claude’s tool-use API, and custom agentic systems. The implementation details vary, but the underlying loop is the same.
The loop, step by step
The loop exits when the agent produces a final answer instead of a tool call, or when an external stopping condition triggers (see Stopping Logic below).
The parts of a loop
State
State is everything the agent knows right now. In practice, state is the contents of the context window at the start of each iteration: the original task, all previous tool calls, all observations returned by those calls, and any memory retrieved from external storage. An agent with a small or poorly managed state will make poor decisions. State management is one of the hardest engineering problems in agentic systems.
Reasoning
Reasoning is the LLM’s decision step. The model reads the current state and determines what action to take next. In chain-of-thought models (such as Claude or OpenAI’s o-series), the reasoning step is a visible scratchpad before the tool call or final answer. In standard completions, the reasoning is internal. Either way, the quality of the reasoning step determines the quality of the loop.
Action
An action is what the agent does in the external world. Actions fall into four broad categories:
- Read actions: web search, file read, database query, API GET request
- Write actions: file write, database insert, API POST request, sending a message
- Compute actions: running code in a sandbox, calling a calculation function
- Delegation actions: spawning a sub-agent, calling another model
Observation
An observation is the result of an action, fed back into the agent’s state. If the agent runs a web search, the observation is the search results. If the agent runs code, the observation is the stdout output or the error message. Observations are the mechanism by which the agent learns from its actions during the same session.
Stopping condition
The stopping condition determines when the loop exits. A stopping condition can be:
- Explicit signal: the model produces a structured “DONE” token or a final answer without a tool call
- Max iterations: the orchestrator enforces a hard ceiling on the number of loop cycles
- Quality threshold: an evaluator agent signals that the output meets the required standard
- Human approval: a human-in-the-loop checkpoint confirms the action before the loop continues or exits
Loops without a stopping condition run forever and exhaust your budget. Always define at least one.
Loop variants
System architecture
What makes loops fail
Agentic loops introduce failure modes that do not exist in single-call LLM usage.
Infinite loops. If the agent has no stopping condition and the task is ambiguous, the model will continue calling tools indefinitely. This exhausts your token budget and produces no output. Always set max_iterations as a hard ceiling.
Context window exhaustion. Each observation adds tokens to the context. A loop that runs 20 iterations on a task involving long web pages will fill a 128k context window and either truncate earlier observations or throw an error. Summarise observations before appending them to state when loops are expected to run long.
Unhandled tool errors. If a tool call returns an error and the agent has no instruction for how to handle failure, it will retry the same call repeatedly or hallucinate a successful result. Wrap every tool with explicit error handling and return structured error observations the model can reason about.
Hallucinated tool arguments. The model may invent arguments for a tool call that look plausible but are invalid, such as a non-existent file path or a malformed JSON body. Validate all tool inputs before execution and return the validation error as an observation, not a system crash.
Reasoning drift. In long loops, the model can lose track of the original task as observations accumulate. Restate the goal explicitly in the system prompt and consider injecting a task reminder at fixed intervals.
Controlling loops in production
Four practices that keep agentic loops reliable at scale:
Set max_iterations explicitly. Every agent runner (LangGraph, AutoGen, custom) supports a hard cap on loop cycles. Set it before deploying. A reasonable default for general tasks is 10-15 iterations. Raise it only for known long-running workflows.
Log every step. Each observe-reason-act cycle should emit a structured log entry: the iteration number, the tool called, the arguments, and the observation returned. Without per-step logs, debugging a failed loop is almost impossible because you cannot see what the model was reasoning about at each point.
Add a human-in-the-loop checkpoint for high-stakes actions. Before the agent executes a write action (sending an email, modifying a database record, posting to an external API), pause the loop and request human confirmation. Resume the loop only after approval. This is especially important in financial, medical, and legal contexts.
Monitor token spend per loop, not per call. A single agentic loop can consume 10-50x the tokens of a single LLM call. Set spend limits at the loop level and alert when a single session exceeds your threshold. Cost control on agents is fundamentally different from cost control on chat applications.
Real-world analogy
Think of a customer support agent following a decision tree to resolve an order issue.
The agent receives a customer message (observe). The agent reads the message, checks the rules, and decides to look up the order number (reason). The agent queries the order system (act). The order system returns the order status: “shipped, delayed” (observe). The agent reads the delay and decides to check the carrier tracking page (reason). The agent fetches the tracking page (act). The tracking page shows a customs hold (observe). The agent now has enough information to draft a response explaining the delay (reason). The agent sends the reply and signals it is done (act, stopping condition met).
Each step in that sequence is one iteration of an agentic loop. The agent did not know it would need the tracking page when it started. It discovered that through the loop.
Further reading
- What is an AI Agent? : definition of agents, how they differ from assistants and automations
- What is Tool Use? : how LLMs call external functions and what happens when they do
- What is Agentic AI? : the broader category and how agentic systems differ from single-model pipelines
- ReAct: Synergizing Reasoning and Acting in Language Models : the original paper formalising the reason-then-act loop pattern
- LangGraph documentation : graph-based framework for building stateful, multi-actor agentic loops
- OpenAI Assistants API : hosted implementation of agentic loops with built-in tool execution and thread management
- Anthropic tool use documentation : how Claude implements the action step of an agentic loop using structured tool calls
- AutoGen documentation : Microsoft’s framework for multi-agent loops, including critique and hierarchical patterns