Single Prompt vs. Agent: When to Use Which

Most AI features built into early-stage web apps do not need an agent. A single, well-structured prompt sent to Claude covers the majority of real-world use cases — and ships faster, costs less, and breaks less often. But there are genuine scenarios where a chain of autonomous steps, tool calls, and feedback loops earns its complexity. This post draws the line.

The Core Distinction

A single prompt is a stateless function call. Input goes in, output comes out. The model has no memory across calls, executes no side effects, and makes no branching decisions on your behalf. An agent, by contrast, is a loop: the model decides what to do next, calls tools, receives results, and iterates until it reaches a terminal state — or errors out.

Dimension	Single Prompt	Agent
Execution model	One LLM call	Loop of LLM calls + tool invocations
State	Stateless	Stateful across steps
Latency	200 ms – 3 s	Seconds to minutes
Cost per run	Predictable, low	Variable, can escalate
Failure surface	Prompt + output parsing	Each tool call + loop termination
Debugging	Straightforward	Requires tracing per step
Best fit	Defined input → defined output	Open-ended task with branching paths

When a Single Prompt Is Enough

The single-prompt pattern handles far more than most teams expect. If you can describe your task as "given X, always produce Y in this format," a prompt is the right tool. Complexity of the language task — summarisation, classification, extraction, generation — does not require an agent. Only the need for sequential decision-making with unknown steps does.

Classify incoming support tickets into predefined categories
Extract structured fields (name, date, amount) from unstructured text
Generate a first draft of a product description from a data object
Summarise a meeting transcript to three bullet points
Translate a UI string with brand-voice instructions baked into the system prompt
Score a lead against a fixed rubric and return a JSON result
Rewrite user-submitted text to match a defined tone or reading level
Answer a question using a context document passed in the prompt (RAG retrieval already done upstream)

A Minimal Single-Prompt Implementation

The following TypeScript example shows a ticket-classification call. Input is a raw ticket string; output is a typed classification object. No tool calls, no iteration.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

type TicketCategory = "billing" | "technical" | "account" | "other";

interface ClassificationResult {
  category: TicketCategory;
  confidence: "high" | "medium" | "low";
  summary: string;
}

async function classifyTicket(
  ticketText: string
): Promise<ClassificationResult> {
  const message = await client.messages.create({
    model: "claude-opus-4-5",
    max_tokens: 256,
    system:
      "You are a support ticket classifier. " +
      "Return ONLY valid JSON matching this shape: " +
      "{ category: 'billing'|'technical'|'account'|'other', " +
      "confidence: 'high'|'medium'|'low', summary: string }. " +
      "No prose, no markdown fences.",
    messages: [
      {
        role: "user",
        content: ticketText
      }
    ]
  });

  const raw = message.content[0];
  if (raw.type !== "text") throw new Error("Unexpected content type");

  return JSON.parse(raw.text) as ClassificationResult;
}

const result = await classifyTicket(
  "I was charged twice for my subscription this month."
);
console.log(result);
// { category: 'billing', confidence: 'high', summary: 'Duplicate charge reported' }
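
The cast on the JSON.parse result above only satisfies the compiler; it does not check what the model actually returned. A minimal sketch of a runtime guard follows, assuming the same result shape. parseClassification is a hypothetical helper rather than part of the SDK, and the types are repeated only so the block stands alone.

```typescript
type TicketCategory = "billing" | "technical" | "account" | "other";
type Confidence = "high" | "medium" | "low";

interface ClassificationResult {
  category: TicketCategory;
  confidence: Confidence;
  summary: string;
}

const CATEGORIES: readonly string[] = ["billing", "technical", "account", "other"];
const CONFIDENCES: readonly string[] = ["high", "medium", "low"];

// Hypothetical helper: parses the model's text output and verifies each field
// against the expected shape before the rest of the app relies on it.
function parseClassification(rawText: string): ClassificationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    throw new Error("Model output is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Model output is not a JSON object");
  }
  const obj = parsed as Record<string, unknown>;
  const { category, confidence, summary } = obj;
  if (typeof category !== "string" || !CATEGORIES.includes(category)) {
    throw new Error(`Unknown category: ${String(category)}`);
  }
  if (typeof confidence !== "string" || !CONFIDENCES.includes(confidence)) {
    throw new Error(`Unknown confidence: ${String(confidence)}`);
  }
  if (typeof summary !== "string") {
    throw new Error("Missing summary");
  }
  return {
    category: category as TicketCategory,
    confidence: confidence as Confidence,
    summary
  };
}
```

In classifyTicket, the final line would then become return parseClassification(raw.text), so a malformed response fails loudly at the boundary and can be retried.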

When You Actually Need an Agent

An agent is warranted when the number of steps required to complete a task cannot be determined in advance, or when intermediate results change which subsequent actions are taken. The model needs to act, observe, and decide — not just transform text.

Research a company by querying multiple APIs, then synthesise findings — the number of queries depends on what each one returns
Debug a failing CI pipeline by reading logs, editing files, running tests, and iterating until green
Fill out a multi-step web form where each page's fields depend on previous answers
Coordinate sub-tasks across specialised sub-agents (e.g. one for search, one for writing, one for quality review)
Monitor a live data feed and take conditional actions based on threshold breaches
Execute a multi-step database migration with validation checks between each step

A Minimal Agent Implementation with Tool Use

The example below implements a bare-bones agentic loop. The model can call a get_stock_price tool; the loop continues until the model returns a stop reason of end_turn, meaning it requests no further tool calls. The loop, tool dispatch, and result injection are explicit application-layer responsibilities: a plain client.messages.create call does not handle them for you.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

function getStockPrice(ticker: string): string {
  const prices: Record<string, number> = {
    AAPL: 213.4,
    MSFT: 415.2,
    TSLA: 177.9
  };
  const price = prices[ticker.toUpperCase()];
  // Compare against undefined so a legitimate price of 0 is not reported as missing.
  return price !== undefined
    ? `${ticker.toUpperCase()}: $${price}`
    : `${ticker}: not found`;
}

const tools: Anthropic.Tool[] = [
  {
    name: "get_stock_price",
    description: "Returns the current stock price for a given ticker symbol.",
    input_schema: {
      type: "object" as const,
      properties: {
        ticker: { type: "string", description: "Stock ticker, e.g. AAPL" }
      },
      required: ["ticker"]
    }
  }
];

async function runStockAgent(userQuery: string): Promise<string> {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userQuery }
  ];

  while (true) {
    const response = await client.messages.create({
      model: "claude-opus-4-5",
      max_tokens: 1024,
      tools,
      messages
    });

    messages.push({ role: "assistant", content: response.content });

    if (response.stop_reason === "end_turn") {
      const textBlock = response.content.find((b) => b.type === "text");
      return textBlock && textBlock.type === "text"
        ? textBlock.text
        : "No response";
    }

    if (response.stop_reason === "tool_use") {
      const toolResults: Anthropic.ToolResultBlockParam[] = response.content
        .filter((b): b is Anthropic.ToolUseBlock => b.type === "tool_use")
        .map((toolUse) => {
          const input = toolUse.input as { ticker: string };
          const result = getStockPrice(input.ticker);
          return {
            type: "tool_result" as const,
            tool_use_id: toolUse.id,
            content: result
          };
        });

      messages.push({ role: "user", content: toolResults });
      continue;
    }

    throw new Error(`Unexpected stop_reason: ${response.stop_reason}`);
  }
}

const answer = await runStockAgent(
  "Compare the current prices of Apple and Tesla and tell me which is higher."
);
console.log(answer);

Cost and Risk Trade-offs

Agents introduce costs that compound at every loop iteration. Before committing to an agentic architecture, weigh the following factors against your actual requirements.

Trade-off	Single Prompt	Agent
Token cost per run	Fixed, proportional to context size	Multiplied by loop iterations — can be 10–50× higher
Latency	Single round-trip; suitable for synchronous UI	Multiple round-trips; often requires async job queue + polling
Error propagation	One failure point; easy to catch and retry	Tool failures mid-loop can corrupt state; partial execution is hard to roll back
Prompt injection risk	Limited to input text	Any tool output re-enters the context; external data can manipulate subsequent steps
Observability	Log request and response	Requires per-step tracing (e.g. Langfuse or custom span logging)
Rate limit exposure	One API call per user action	Burst of calls per agent run; can exhaust rate limits under load
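
To make the token-cost row concrete, here is a rough arithmetic sketch. It assumes the whole conversation is re-sent on every iteration (as in the loop above) and ignores output tokens and prompt caching. The function name and all numbers are illustrative assumptions, not measurements.

```typescript
interface LoopCostAssumptions {
  baseInputTokens: number;    // system prompt + tool definitions + user query
  tokensAddedPerStep: number; // assistant turn + tool result appended per iteration
  iterations: number;         // number of model calls in the run
}

// Sums the input tokens over all iterations: call i (0-based) re-sends the
// base context plus everything appended by the i previous iterations.
function estimateTotalInputTokens(a: LoopCostAssumptions): number {
  let total = 0;
  for (let i = 0; i < a.iterations; i++) {
    total += a.baseInputTokens + i * a.tokensAddedPerStep;
  }
  return total;
}

const singlePrompt = estimateTotalInputTokens({
  baseInputTokens: 1000,
  tokensAddedPerStep: 500,
  iterations: 1
});
const eightStepAgent = estimateTotalInputTokens({
  baseInputTokens: 1000,
  tokensAddedPerStep: 500,
  iterations: 8
});
console.log(singlePrompt);   // 1000
console.log(eightStepAgent); // 22000
```

Under these assumptions an eight-step run consumes 22 times the input tokens of a single call, because the growth is quadratic in the number of iterations rather than linear.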

Decision Taxonomy: Task Shape to Architecture

The following classification covers the most common task types encountered in SMB and solopreneur web apps. Use it as a quick reference when scoping AI features.

Task type	Steps known upfront?	External tool calls?	Recommended architecture
Text classification	Yes	No	Single prompt
Structured data extraction	Yes	No	Single prompt
Text generation (fixed schema)	Yes	No	Single prompt
Retrieval-augmented Q&A (retrieval pre-done)	Yes	No	Single prompt
Multi-document summarisation	Yes	No	Single prompt + map-reduce if context exceeds model window
Research across multiple APIs	No	Yes	Agent
Conditional form filling or web automation	No	Yes	Agent
Code generation + execution + fix loop	No	Yes	Agent
Scheduled monitoring with conditional alerts	No	Yes	Agent
Multi-specialist review pipeline	Yes (fixed stages)	Optional	Prompt pipeline (sequential calls, not a loop)
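
The table can be condensed into a small decision helper. This is a deliberately simplified sketch: TaskShape, its field names, and recommendArchitecture are hypothetical, and the function treats "steps known upfront" as the deciding column, since in the table tool use only appears as mandatory when the steps are unknown.

```typescript
type Architecture = "single prompt" | "prompt pipeline" | "agent";

interface TaskShape {
  // Can the sequence of steps be written down before the task runs?
  stepsKnownUpfront: boolean;
  // Number of LLM calls when the steps are known; 1 for a plain transform.
  fixedStageCount: number;
}

// Encodes the table: unknown steps mean an agent, known steps mean a single
// prompt, or a pipeline when more than one fixed stage is involved.
function recommendArchitecture(task: TaskShape): Architecture {
  if (!task.stepsKnownUpfront) return "agent";
  return task.fixedStageCount > 1 ? "prompt pipeline" : "single prompt";
}

// Text classification: one known step.
console.log(recommendArchitecture({ stepsKnownUpfront: true, fixedStageCount: 1 }));
// "single prompt"

// Multi-specialist review: three fixed stages.
console.log(recommendArchitecture({ stepsKnownUpfront: true, fixedStageCount: 3 }));
// "prompt pipeline"

// Research across multiple APIs: number of steps depends on results.
console.log(recommendArchitecture({ stepsKnownUpfront: false, fixedStageCount: 0 }));
// "agent"
```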

The Middle Ground: Prompt Pipelines

Between a single prompt and a full agent lies a prompt pipeline: a fixed sequence of LLM calls where the output of one call becomes the input of the next. The number of steps is hard-coded; there is no autonomous branching. This covers most complex AI features in early products — sentiment analysis followed by response drafting, extraction followed by scoring, classification followed by routing — without the operational overhead of a loop.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function singleCall(system: string, user: string): Promise<string> {
  const msg = await client.messages.create({
    model: "claude-opus-4-5",
    max_tokens: 512,
    system,
    messages: [{ role: "user", content: user }]
  });
  const block = msg.content[0];
  if (block.type !== "text") throw new Error("Expected text block");
  return block.text;
}

async function classifySentiment(review: string): Promise<string> {
  return singleCall(
    "Classify the sentiment of the review. Return ONLY one word: positive, negative, or neutral.",
    review
  );
}

async function draftResponse(
  review: string,
  sentiment: string
): Promise<string> {
  return singleCall(
    `You are a customer support agent. The review sentiment is ${sentiment}. Write a concise, empathetic reply in max 3 sentences.`,
    review
  );
}

async function reviewPipeline(
  review: string
): Promise<{ sentiment: string; reply: string }> {
  const sentiment = await classifySentiment(review);
  const reply = await draftResponse(review, sentiment);
  return { sentiment, reply };
}

const output = await reviewPipeline(
  "The onboarding was confusing and I could not find the export button anywhere."
);
console.log(output);
// { sentiment: 'negative', reply: '...' }

Frequently Asked Questions

Can I add memory to a single prompt without building a full agent?

Yes. Conversation history is just an array of message objects appended to the messages parameter. Passing the last N turns of context gives the model continuity without any agentic loop. This is the right pattern for chatbot-style features. An agent is only needed when the model must decide what external actions to take based on that history.
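
A minimal sketch of the "last N turns" idea, assuming history is stored as plain role/content pairs and that the trimmed window should open with a user message. ChatMessage and lastMessages are hypothetical names introduced for illustration.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Keeps at most maxMessages of the most recent messages, then drops any
// leading assistant messages so the window starts with a user turn.
function lastMessages(history: ChatMessage[], maxMessages: number): ChatMessage[] {
  // slice(-0) would return the whole array, so handle non-positive limits explicitly.
  const recent = maxMessages > 0 ? history.slice(-maxMessages) : [];
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

const history: ChatMessage[] = [
  { role: "user", content: "u1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "u2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "u3" }
];

console.log(lastMessages(history, 4).map((m) => m.content));
// [ 'u2', 'a2', 'u3' ]
```

The returned array can be passed as the messages parameter of a single call; there is still no loop and no tool dispatch.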

How do I cap runaway agent costs in production?

Set a hard maximum on loop iterations (e.g. max_iterations = 10) and track cumulative token usage across the run. If either limit is reached, terminate the loop and return a partial result with an error flag. Add per-run budget limits at the API key level if your provider supports it. Log every iteration to catch escalation patterns early.
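
The two limits described above could be wired up roughly as follows. This is a sketch under stated assumptions: runWithLimits, StepResult, and RunOutcome are hypothetical names, and step stands in for one iteration of the agent loop (one model call plus any tool dispatch), reporting the tokens it consumed, for example from the usage field of the API response.

```typescript
interface StepResult {
  done: boolean;      // true when the model reached a terminal state
  output?: string;    // latest text output, if any
  tokensUsed: number; // input + output tokens consumed by this iteration
}

interface RunLimits {
  maxIterations: number;
  maxTotalTokens: number;
}

interface RunOutcome {
  status: "completed" | "iteration_limit" | "token_limit";
  output: string | null; // partial result when a limit was hit
  iterations: number;
  totalTokens: number;
}

// Runs the injected step until it reports done or either limit is reached.
async function runWithLimits(
  step: (iteration: number) => Promise<StepResult>,
  limits: RunLimits
): Promise<RunOutcome> {
  let totalTokens = 0;
  let output: string | null = null;

  for (let i = 1; i <= limits.maxIterations; i++) {
    const result = await step(i);
    totalTokens += result.tokensUsed;
    output = result.output ?? output;
    // Log every iteration so escalation patterns show up early.
    console.log(`iteration=${i} tokens=${result.tokensUsed} total=${totalTokens}`);

    if (result.done) {
      return { status: "completed", output, iterations: i, totalTokens };
    }
    if (totalTokens >= limits.maxTotalTokens) {
      return { status: "token_limit", output, iterations: i, totalTokens };
    }
  }
  return {
    status: "iteration_limit",
    output,
    iterations: limits.maxIterations,
    totalTokens
  };
}
```

The status field plays the role of the error flag: callers can show the partial output while still recording that the run was cut short.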

Is RAG (retrieval-augmented generation) a form of agentic architecture?

Not necessarily. If retrieval is a fixed pre-processing step (you fetch documents, then pass them into a single prompt), it is still a single prompt with a retrieval step in front of it, not an agent. RAG becomes agentic only when the model decides whether to retrieve, what query to use, and whether the retrieved results are sufficient or require a follow-up retrieval, i.e. when retrieval is a tool the model calls conditionally.

Does using Claude's tool_use feature automatically make my app an agent?

No. A single prompt can include tool definitions and receive a tool_use stop reason. What makes something an agent is the loop: you execute the tool, return the result, and call the model again. If you call the model once, parse a tool_use block, execute the tool, and stop, that is a single-turn tool call, not an agent.

At what product stage does an agentic feature make sense to build?

When a manual workflow performed by a human involves conditional branching, external data lookups, and more than two or three distinct steps — and when that workflow runs frequently enough that automation pays back the engineering and operational overhead. For most MVPs, a single prompt or a short pipeline is the right starting point. Agentic features are a second- or third-iteration addition, not a launch requirement.
