Agents and Agentic Workflows

In Short

An AI agent is a system where a language model dynamically directs its own actions in a loop, using tools and environmental feedback to accomplish a goal. Agentic workflows are the patterns developers use to structure that autonomy, ranging from simple prompt chains to fully autonomous multi-agent systems.

An AI agent repeats a perceive, plan, act, and observe cycle, then checks a stopping condition. If unmet, it iterates with the updated context; if met (task done, max iterations, or needs human input), the loop ends.

01. What It Is

An AI agent is a system built around a language model that can perceive its environment, plan actions, invoke tools, observe results, and iterate until a task is complete. Agents are not static: they decide what to do next based on what they observe, not by following a fixed script.

Anthropic draws a sharp distinction between agents and workflows:

Workflows are systems where LLMs and tools are orchestrated through predefined code paths. The sequence of steps is determined in advance. A workflow might call an LLM to classify an input, route it to one of three handlers, then call a second LLM to format the output. Each step is explicit and the branching logic lives in your code, not in the model.

Agents are systems where the LLM dynamically directs its own processes and tool usage, deciding at runtime how to accomplish a task. The model chooses which tools to call, in what order, and when to stop. This produces greater flexibility but also greater unpredictability.

In practice, most production systems sit on a spectrum. A system might follow a predefined workflow for known cases but fall back to an agent loop when the task is novel or ambiguous.

02. Why It Matters

Most real-world tasks are not fully specifiable in advance. A customer support ticket might require reading a knowledge base, querying an order database, drafting a reply, and triggering a refund, all in a sequence that depends on what each step reveals. Writing that as a rigid workflow would require anticipating every case. An agent loop handles novel combinations naturally.

Agents also enable long-horizon tasks. Software engineering, research, and data analysis all require maintaining state, revising plans, and making dozens of decisions before producing an output. Agentic systems are the mechanism by which LLMs operate at that scale.

The tradeoff is cost, latency, and risk. More autonomy means more LLM calls, more opportunity for compounding errors, and less predictability. Anthropic's guidance is to use the simplest architecture that reliably solves the problem: start with a single LLM call, add workflow patterns when a single call falls short, and add agents only when workflows are genuinely insufficient.

03. How It Works

The agent loop

Agents operate through a repeating cycle:

  1. Perceive. The agent receives a task and any available context (conversation history, tool results, environmental state).
  2. Plan. The model reasons about what action to take next. This may be explicit chain-of-thought reasoning or implicit in the model's output.
  3. Act. The agent invokes a tool, writes output, or requests clarification. Actions have real consequences: reading a file, executing code, calling an API, modifying a database.
  4. Observe. The result of the action comes back into the context. Tool call output, error messages, and code execution results are all observations.
  5. Iterate. The agent re-enters the loop with updated context. It continues until the task is complete, a stopping condition triggers (such as a maximum iteration count), or the agent determines it needs human input.

The loop is "just LLMs using tools based on environmental feedback in a loop," as Anthropic puts it. The implementation is often only a few dozen lines of code. Complexity comes from the tools, the prompts, and the failure modes, not from the loop itself.
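The cycle above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `run_agent`, the `Decision` tuple format, and the scripted stand-in model are all hypothetical names chosen for the example, and a real system would replace `scripted_model` with an actual LLM call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical decision format: ("tool", tool_name, argument) to act,
# or ("final", answer, "") to stop.
Decision = Tuple[str, str, str]


def run_agent(task: str,
              model: Callable[[List[str]], Decision],
              tools: Dict[str, Callable[[str], str]],
              max_iterations: int = 10) -> str:
    context = [f"task: {task}"]                       # perceive
    for _ in range(max_iterations):                   # stopping condition: iteration cap
        kind, name, arg = model(context)              # plan
        if kind == "final":                           # stopping condition: task done
            return name
        if name not in tools:
            observation = f"error: unknown tool {name!r}"
        else:
            try:
                observation = tools[name](arg)        # act
            except Exception as exc:                  # errors are observations too
                observation = f"error: {exc}"
        context.append(f"{name}({arg}) -> {observation}")  # observe, then iterate
    return "stopped: max iterations reached"


# Scripted stand-in for an LLM: look up the order once, then answer.
def scripted_model(context: List[str]) -> Decision:
    if len(context) == 1:
        return ("tool", "lookup_order", "A17")
    status = context[-1].split(" -> ")[1]
    return ("final", f"Order status: {status}", "")


tools = {"lookup_order": lambda order_id: "shipped"}
print(run_agent("Where is order A17?", scripted_model, tools))
```

Tool errors are fed back into the context rather than raised, so the model gets a chance to recover on the next iteration; the iteration cap guarantees the loop ends even if it never does.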

Autonomy levels

Autonomy is a spectrum, not a binary:

  • Minimal autonomy. A single LLM call with retrieval and few-shot examples. No loop. Suitable for well-defined, one-shot tasks.
  • Structured workflow. A sequence of LLM calls with predefined routing and gates. The code drives the sequence. LLMs handle the content within each step.
  • Agent with guardrails. A loop where the model directs tool use, but human checkpoints or approval gates interrupt before destructive actions.
  • Fully autonomous agent. The model runs the loop without human intervention, stopping only when it decides the task is done.

Higher autonomy requires higher trust in the model's decision-making, higher tolerance for cost, and explicit mitigation for compounding errors.
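One way to picture the "agent with guardrails" level is an approval gate wrapped around tool calls. The sketch below is an assumption about how such a gate might look; `guarded_call`, the `destructive` set, and the `approve` callback are hypothetical names, and a real approval step would prompt a person rather than call a lambda.

```python
from typing import Callable, Dict, Set


def guarded_call(tool_name: str,
                 arg: str,
                 tools: Dict[str, Callable[[str], str]],
                 destructive: Set[str],
                 approve: Callable[[str, str], bool]) -> str:
    """Run a tool, but consult an approval callback first if it is destructive."""
    if tool_name in destructive and not approve(tool_name, arg):
        # The refusal is returned as an observation so the agent can replan.
        return f"blocked: {tool_name} was not approved"
    return tools[tool_name](arg)


# Illustrative tools; only delete_file is marked destructive.
demo_tools = {
    "read_file": lambda path: f"contents of {path}",
    "delete_file": lambda path: f"deleted {path}",
}
deny_all = lambda name, arg: False  # stand-in for a human who declines

print(guarded_call("read_file", "a.txt", demo_tools, {"delete_file"}, deny_all))
print(guarded_call("delete_file", "a.txt", demo_tools, {"delete_file"}, deny_all))
```

Read-only tools pass straight through, so the checkpoint only adds latency where a mistake would be costly.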

04. Key Terms / Components

| Term | Meaning |
| --- | --- |
| Agent loop | The perceive-plan-act-observe cycle an agent runs repeatedly |
| Tool | An external function the agent can invoke (API, database, shell command) |
| Orchestrator | An LLM that decomposes tasks and directs worker agents |
| Worker agent | An LLM that handles a delegated subtask |
| Stopping condition | A rule that ends the agent loop (task done, max iterations reached, error threshold) |
| Human-in-the-loop | A checkpoint where a human approves or redirects before the agent continues |
| Compounding error | An early mistake that cascades into increasingly wrong subsequent actions |

05. Common Patterns

Anthropic's research identifies five core workflow patterns used throughout agentic system design.

1. Prompt chaining

A task is decomposed into sequential steps. The output of each LLM call becomes the input of the next. Gates between steps can verify progress before continuing. This is the simplest form of structured workflow.

Best for: tasks with clear, fixed subtasks where accuracy at each step matters more than speed. Translation pipelines, document drafting, multi-stage analysis.
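A prompt chain with gates might look like the following sketch. The step functions here are plain Python stand-ins for LLM calls, and `run_chain`, `draft_outline`, `expand`, and `has_three_sections` are hypothetical names used only for illustration.

```python
from typing import Callable, List, Optional

Step = Callable[[str], str]
Gate = Callable[[str], bool]


def run_chain(text: str, steps: List[Step], gates: List[Optional[Gate]]) -> str:
    """Feed each step's output into the next; a failing gate halts the chain."""
    for index, (step, gate) in enumerate(zip(steps, gates)):
        text = step(text)
        if gate is not None and not gate(text):
            raise ValueError(f"gate failed after step {index}")
    return text


# Stand-ins for two LLM calls: outline first, then expand each section.
def draft_outline(topic: str) -> str:
    return f"1. Intro to {topic}\n2. Details\n3. Summary"


def expand(outline: str) -> str:
    return "\n".join(line + ": ..." for line in outline.splitlines())


# Gate: only continue if the outline has exactly three sections.
def has_three_sections(outline: str) -> bool:
    return len(outline.splitlines()) == 3


print(run_chain("agents", [draft_outline, expand], [has_three_sections, None]))
```

The gate is ordinary code, which keeps the check cheap and deterministic; a gate could also be another model call when the criterion is harder to express.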

2. Routing

An initial LLM call classifies the input and directs it to one of several specialized downstream handlers. Each handler is optimized for its specific case.

Best for: support triage, query classification, content moderation. Any system where inputs fall into meaningfully distinct categories that benefit from different handling.
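Routing reduces to a classify-then-dispatch step. In this sketch a keyword matcher stands in for the classifying LLM call; `route`, `keyword_classifier`, and the handler labels are hypothetical and exist only to show the shape of the pattern.

```python
from typing import Callable, Dict


def route(message: str,
          classify: Callable[[str], str],
          handlers: Dict[str, Callable[[str], str]],
          default: str = "general") -> str:
    """Classify the input, then dispatch to the matching specialized handler."""
    label = classify(message)
    if label not in handlers:   # unexpected label: fall back to the default handler
        label = default
    return handlers[label](message)


# Stand-in for an LLM classifier.
def keyword_classifier(message: str) -> str:
    text = message.lower()
    if "refund" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "general"


handlers = {
    "billing": lambda m: f"[billing] {m}",
    "technical": lambda m: f"[technical] {m}",
    "general": lambda m: f"[general] {m}",
}

print(route("I want a refund", keyword_classifier, handlers))
```

Because a model-based classifier can return a label the code does not expect, the fallback to a default handler is worth keeping even in small systems.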

3. Parallelization

Independent subtasks run concurrently. Two variations:

Sectioning: a large task is broken into independent parts that run in parallel and are merged afterward. Useful when subtasks do not depend on each other.

Voting: the same task is sent to multiple LLM calls and their outputs are aggregated or majority-voted. Useful for reducing variance and catching errors.

Best for: large documents that need parallel processing, decisions requiring diverse perspectives, tasks where a single model call is too unreliable.
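Both variations can be sketched with Python's standard `concurrent.futures` thread pool. The worker and voter callables below are stand-ins for LLM calls, and `run_sections` and `majority_vote` are hypothetical helper names rather than part of any framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def run_sections(sections: Sequence[str],
                 worker: Callable[[str], object],
                 merge: Callable[[List[object]], object]) -> object:
    """Sectioning: process independent parts concurrently, then merge."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(worker, sections))   # map preserves input order
    return merge(results)


def majority_vote(task: str, voters: Sequence[Callable[[str], str]]) -> str:
    """Voting: send the same task to several callers and keep the most common answer."""
    if not voters:
        raise ValueError("at least one voter is required")
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(lambda voter: voter(task), voters))
    return Counter(answers).most_common(1)[0][0]


# Sectioning: count words per section, merge by summing.
print(run_sections(["a b", "c d e"], lambda s: len(s.split()), sum))

# Voting: two of three stand-in "models" agree.
print(majority_vote("2+2", [lambda t: "4", lambda t: "4", lambda t: "5"]))
```

Threads suit this case because LLM calls are I/O-bound. Exact-match voting only works when answers are short and canonical; free-form outputs would need a separate aggregation step.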

4. Orchestrator-workers

A central orchestrator LLM dynamically decomposes a complex task into subtasks and delegates them to worker LLMs (or specialized agents). The orchestrator synthesizes the workers' results. Unlike parallelization, the subtasks are not predefined: the orchestrator decides at runtime what needs to be done based on the task at hand.

Best for: complex, open-ended tasks where the steps cannot be fully specified in advance. Software engineering tasks, research synthesis, multi-step data analysis.
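The difference from parallelization shows up in code as a plan that is computed at runtime rather than written in advance. In this sketch, `orchestrate`, `plan`, `workers`, and `synthesize` are hypothetical names, and each callable stands in for an LLM call.

```python
from typing import Callable, Dict, List, Tuple

Subtask = Tuple[str, str]  # (worker name, instruction)


def orchestrate(task: str,
                plan: Callable[[str], List[Subtask]],
                workers: Dict[str, Callable[[str], str]],
                synthesize: Callable[[str, List[str]], str]) -> str:
    """Let the orchestrator decide the subtasks, delegate them, then combine results."""
    subtasks = plan(task)                       # decided at runtime, not predefined
    results = [workers[name](instruction) for name, instruction in subtasks]
    return synthesize(task, results)


# Stand-in orchestrator: adds a documentation subtask only when the task needs one.
def plan(task: str) -> List[Subtask]:
    subtasks = [("coder", f"implement: {task}")]
    if "api" in task.lower():
        subtasks.append(("doc_writer", f"document: {task}"))
    return subtasks


workers = {
    "coder": lambda instruction: f"[code for {instruction}]",
    "doc_writer": lambda instruction: f"[docs for {instruction}]",
}


def synthesize(task: str, results: List[str]) -> str:
    return " + ".join(results)


print(orchestrate("Add an API endpoint", plan, workers, synthesize))
print(orchestrate("Fix typo", plan, workers, synthesize))
```

The two calls produce different numbers of subtasks from the same code path, which is the property that distinguishes this pattern from a fixed fan-out.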

5. Evaluator-optimizer

One LLM generates a response and a second LLM evaluates it and provides feedback. The first LLM revises based on that feedback. The cycle repeats until the evaluator is satisfied or a maximum iteration count is reached.

Best for: tasks with clear quality criteria and where iterative refinement demonstrably improves outcomes. Code generation with test feedback, translation with fluency scoring, content with defined rubrics.
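The generate-evaluate-revise cycle can be sketched as a bounded loop. Here `refine`, `generate`, and `evaluate` are hypothetical names, and the two stand-in functions replace what would be separate LLM calls in a real system.

```python
from typing import Callable, Tuple


def refine(task: str,
           generate: Callable[[str, str], str],
           evaluate: Callable[[str], Tuple[bool, str]],
           max_rounds: int = 3) -> Tuple[str, bool]:
    """Generate, evaluate, and revise until accepted or the round limit is hit."""
    feedback = ""
    draft = ""
    accepted = False
    for _ in range(max_rounds):
        draft = generate(task, feedback)        # first round has empty feedback
        accepted, feedback = evaluate(draft)
        if accepted:
            break
    return draft, accepted


# Stand-in generator: fixes its draft once it has received feedback.
def generate(task: str, feedback: str) -> str:
    return "A summary." if feedback else "a summary"


# Stand-in evaluator with one explicit criterion.
def evaluate(draft: str) -> Tuple[bool, str]:
    if draft.endswith("."):
        return True, ""
    return False, "end the sentence with a period"


print(refine("summarize", generate, evaluate))
```

Returning the acceptance flag alongside the draft lets the caller tell a passing result from one that merely ran out of rounds.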

06. Multi-Agent Systems

Complex tasks often benefit from multiple specialized agents working together. In a multi-agent system, agents can run in parallel, each focused on a domain (one handles code, another handles documentation, a third handles security review), with an orchestrator coordinating them.

Multi-agent systems enable parallelism and specialization but introduce coordination overhead and new failure modes. An orchestrator that misunderstands a subtask will propagate that error to all workers that depend on it. Clear interfaces between agents, explicit handoff formats, and shared context management become critical at this scale.

By 2026, Anthropic had introduced Agent Skills as a shared format for packaging and distributing reusable agent capabilities across Claude.ai, Claude Code, and the API.

07. Where Agents Fail

  • Compounding errors. An early wrong decision leads to increasingly wrong subsequent actions. In a 20-step task, an error at step 3 can corrupt everything downstream.
  • Tool misuse. The model passes wrong parameter types, calls tools in the wrong order, or uses a tool for a purpose its description does not cover.
  • Context loss. Long agent sessions can exhaust the context window or cause the model to lose track of earlier constraints and state.
  • Complexity traps. Teams add orchestration frameworks and multi-agent architectures before the simpler single-agent version has been validated. The added complexity obscures bugs and makes debugging harder.
  • Abstraction opacity. Frameworks that hide prompts and responses behind abstractions make it hard to understand what the model is actually seeing, leading to incorrect assumptions about model behavior.
  • Unrecoverable actions. An agent with write access to production systems can make irreversible mistakes. Irreversibility and blast radius should be explicit constraints on tool permissions.
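A rough calculation shows why compounding errors dominate long tasks. The sketch below assumes each step succeeds independently with the same probability and that the agent never recovers from a mistake; both are simplifying assumptions, and the function name and the example probabilities are illustrative rather than measured.

```python
def chance_all_steps_succeed(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps and no recovery."""
    return per_step_success ** steps


# A 20-step task at two hypothetical per-step reliabilities.
print(round(chance_all_steps_succeed(0.99, 20), 2))  # 0.82
print(round(chance_all_steps_succeed(0.95, 20), 2))  # 0.36
```

Under these assumptions, a step that is right 95% of the time yields a fully correct 20-step run only about a third of the time, which is why intermediate checks and recovery paths matter more as tasks get longer.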

08. Common Pitfalls

  • Deploying agents in production without sandbox testing. Agents make real calls; test them in a sandbox or staging environment before they touch real data.
  • No maximum iteration limits. Without a cap, an agent stuck in a loop keeps running and racking up API costs until a human intervenes.
  • Granting more tool permissions than necessary. The principle of least privilege applies: give agents only the tools they need for their specific task.
  • Skipping human-in-the-loop checkpoints for destructive actions. Confirmation before deletes, sends, or payment operations is cheap; accidental execution is expensive.
  • Trusting agent self-assessment of task completion. Agents may incorrectly report success. Verify outputs against ground truth where possible.
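Two of these pitfalls have small code-level mitigations. The sketch below shows one possible shape for least-privilege tool scoping and for checking a success report against an independent test; `scoped_tools`, `verified`, and the example tool names are hypothetical.

```python
from typing import Callable, Dict, Iterable

Tool = Callable[[str], str]


def scoped_tools(all_tools: Dict[str, Tool], allowed: Iterable[str]) -> Dict[str, Tool]:
    """Return only the tools named in `allowed`; fail loudly on unknown names."""
    allowed = list(allowed)
    missing = [name for name in allowed if name not in all_tools]
    if missing:
        raise KeyError(f"unknown tools requested: {missing}")
    return {name: all_tools[name] for name in allowed}


def verified(agent_claims_success: bool, check: Callable[[], bool]) -> bool:
    """Accept a success report only if an independent check also passes."""
    return agent_claims_success and check()


all_tools = {
    "search_docs": lambda query: f"results for {query}",
    "issue_refund": lambda order_id: f"refunded {order_id}",
}

# A read-only research task gets the search tool and nothing else.
research_tools = scoped_tools(all_tools, ["search_docs"])
print(sorted(research_tools))

# The agent reports success, but the independent check disagrees.
print(verified(True, check=lambda: False))
```

The check passed to `verified` should not rely on the agent's own output, for example running a test suite or querying the system of record directly.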

Verified against primary sources

Every claim traces to a cited source below.

Key terms

AI agent
A system where a language model directs its own actions in a loop, using tools and feedback to reach a goal.
Workflow
A system where LLMs and tools are orchestrated through predefined code paths set in advance.
Agent loop
The perceive-plan-act-observe cycle an agent runs repeatedly until done.
Orchestrator
An LLM that decomposes tasks and directs worker agents.
Compounding error
An early mistake that cascades into increasingly wrong subsequent actions.

Tags

#agents #agentic-workflows #llm #orchestration #multi-agent #tool-use
