Function calling (or tool use, or whatever your provider named it this quarter) is the primitive that turns a chat model into an agent. Most teams learn the basic pattern quickly: define a tool, the model invokes it with typed parameters, you execute it and return the result. What takes longer is recognizing when to use which orchestration pattern, and how to make each robust. This post covers five patterns, when to pick each, and the failure modes each opens up.

Five patterns

Single-shot, parallel, sequential, plan-then-execute, and ReAct loop. Pick by how much planning the task requires and whether tools depend on each other's outputs.

Pattern 1: Single-shot

The model needs to call exactly one tool to answer. Classic example: a weather assistant where every query ends with get_weather(city). The prompt describes the tool, the user asks, the model calls, you return, the model generates the final response.

When to use: when your task genuinely has a fixed shape. Works well for simple lookups, CRUD-like assistants, anything where 'look up X then format' is the workflow. Failure modes: the model calls the tool when it shouldn't (overactive), or fails to call it when it should (underactive). Both usually respond to few-shot examples in the system prompt that show when to call the tool and when to answer directly.
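A minimal sketch of the single-shot execution side, assuming a provider-neutral call shape of a tool name plus a JSON string of arguments. The `get_weather` stub, the spec layout, and the `execute_tool_call` helper are illustrative, not any specific provider's API.

```python
import json
from typing import Callable

# Hypothetical tool: a real one would call a weather service.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 18, "conditions": "cloudy"}

TOOLS: dict[str, Callable[..., dict]] = {"get_weather": get_weather}

# JSON-Schema-style description handed to the model (exact shape is an assumption).
GET_WEATHER_SPEC = {
    "name": "get_weather",
    "description": "Current weather for one city. Call only for weather questions.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def execute_tool_call(call: dict) -> str:
    """Run one model-issued tool call and return a JSON string for the model."""
    name = call.get("name")
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(call.get("arguments", "{}"))
        return json.dumps(TOOLS[name](**args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed or mismatched arguments go back to the model as an error
        # so it can correct itself instead of crashing the request.
        return json.dumps({"error": str(exc)})

# Stand-in for what the model might emit for "What's the weather in Paris?"
model_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
result = execute_tool_call(model_call)
print(result)
```

Returning errors as tool results rather than raising keeps the turn alive: the model sees what went wrong and can retry or answer without the tool.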

Pattern 2: Parallel tool calls

Multiple independent tools called in one turn. The user asks "what's the weather in Paris and flights from NYC today?" and the model calls get_weather(Paris) and search_flights(NYC, today) together. Provider APIs that support parallel calls return both in a single response; your executor runs them concurrently and returns both results before the model writes its answer.

When to use: aggregator queries, dashboards, any UI where you're fetching multiple pieces of context at once. Latency savings are significant when individual tool calls take 500ms+, because total wait time approaches the slowest call rather than the sum of all calls. Failure mode: the model sometimes issues calls in parallel that should be sequential (it requests Y at the same time as X even though Y's arguments depend on X's result). Prompt engineering and tool description wording matter more here than in any other pattern.
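One way to run a batch of independent calls concurrently is `asyncio.gather`. This is a sketch under assumptions: the two async tools are stubs with simulated latency, and the call shape (`id`, `name`, `args`) is a simplified stand-in for whatever your provider returns.

```python
import asyncio

# Hypothetical async tools with simulated I/O latency.
async def get_weather(city: str) -> dict:
    await asyncio.sleep(0.05)
    return {"city": city, "temp_c": 18}

async def search_flights(origin: str, date: str) -> dict:
    await asyncio.sleep(0.05)
    return {"origin": origin, "date": date, "flights": []}

TOOLS = {"get_weather": get_weather, "search_flights": search_flights}

async def run_parallel(calls: list[dict], timeout_s: float = 5.0) -> list[dict]:
    """Run independent tool calls concurrently; results keep the call order."""

    async def run_one(call: dict) -> dict:
        try:
            coro = TOOLS[call["name"]](**call["args"])
            output = await asyncio.wait_for(coro, timeout=timeout_s)
            return {"id": call["id"], "ok": True, "output": output}
        except Exception as exc:
            # One failing call should not sink the whole batch.
            return {"id": call["id"], "ok": False, "error": repr(exc)}

    return await asyncio.gather(*(run_one(c) for c in calls))

calls = [
    {"id": "call_1", "name": "get_weather", "args": {"city": "Paris"}},
    {"id": "call_2", "name": "search_flights", "args": {"origin": "NYC", "date": "today"}},
]
results = asyncio.run(run_parallel(calls))
print(results)
```

Catching errors per call means a timeout in one tool still lets the model answer with the results that did come back.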

Pattern 3: Sequential

Each tool's output feeds the next. Search → fetch specific URL → summarize → translate. The model plans the sequence implicitly or the prompt structures it explicitly.

When to use: any workflow where early steps inform later ones. Most real agent workflows are sequential or mixed-sequential-parallel. Failure modes: the model skips steps (summarizes without fetching), or repeats steps (re-searches unnecessarily). Mitigations: clear tool descriptions about what each returns, and a state object that shows the model what's already been done. See LangGraph patterns for the state-management discipline.
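The state object mentioned above could look something like this. It is a sketch only: the `WorkflowState` class, the `PREREQUISITES` map, and the tool names are hypothetical, and rejecting every repeated call is a simplification, since a real workflow may legitimately search twice.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class WorkflowState:
    goal: str
    completed: list = field(default_factory=list)

    def record(self, tool: str, summary: str) -> None:
        self.completed.append({"tool": tool, "summary": summary})

    def has_run(self, tool: str) -> bool:
        return any(step["tool"] == tool for step in self.completed)

    def render(self) -> str:
        """Text block to include in the model's context on every turn."""
        lines = [f"Goal: {self.goal}", "Steps already completed:"]
        if not self.completed:
            lines.append("  (none)")
        for i, step in enumerate(self.completed, 1):
            lines.append(f"  {i}. {step['tool']}: {step['summary']}")
        return "\n".join(lines)

# Each tool lists the tools that must have run before it (hypothetical pipeline).
PREREQUISITES = {
    "search": [],
    "fetch_url": ["search"],
    "summarize": ["fetch_url"],
    "translate": ["summarize"],
}

def check_call(state: WorkflowState, tool: str) -> Optional[str]:
    """Return a rejection message for skipped or repeated steps, else None."""
    if tool not in PREREQUISITES:
        return f"unknown tool: {tool}"
    if state.has_run(tool):
        return f"{tool} already ran; reuse its result"
    missing = [t for t in PREREQUISITES[tool] if not state.has_run(t)]
    if missing:
        return f"{tool} requires {', '.join(missing)} first"
    return None

state = WorkflowState(goal="Summarize the top result for 'tool use' in French")
print(check_call(state, "summarize"))
state.record("search", "3 results")
print(state.render())
```

The rejection message is meant to be returned to the model as the tool result, so it learns which step it skipped rather than silently failing.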

Pattern 4: Plan-then-execute

The model first produces an explicit plan, a list of tool calls with expected inputs and outputs, and then executes it. It may revise the plan mid-execution.

When to use: complex workflows where the cost of a bad mid-execution decision is high, and showing the plan to the user or a supervisor adds value. Example: a research agent that plans its search strategy, shows you the outline, then runs it. Failure modes: the plan becomes disconnected from execution reality — the model plans step 3 assuming step 2 returned data that step 2 actually didn't return. Mitigation: after each step, force the model to re-read the plan and either continue, revise, or abort.
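The per-step re-check can be sketched as an executor that pauses after every step and asks a reviewer whether to continue, revise, or abort. Everything here is an assumption for illustration: `PlanStep`, `execute_plan`, and the `review` callback, which stands in for a model call that re-reads the plan against the actual output.

```python
from dataclasses import dataclass

@dataclass
class PlanStep:
    tool: str
    args: dict
    expects: str  # what the plan assumes this step will return

def execute_plan(plan, tools, review, max_steps=20):
    """Run plan steps in order; after each, `review` decides what happens next.

    `review(step, output, remaining)` returns (decision, remaining_steps),
    where decision is "continue", "revise", or "abort".
    """
    log = []
    remaining = list(plan)
    while remaining:
        if len(log) >= max_steps:
            # Guard against a reviewer that keeps extending the plan.
            return {"status": "aborted", "log": log}
        step = remaining.pop(0)
        output = tools[step.tool](**step.args)
        log.append({"tool": step.tool, "output": output})
        decision, new_remaining = review(step, output, remaining)
        if decision == "abort":
            return {"status": "aborted", "log": log}
        if decision == "revise":
            remaining = list(new_remaining)
    return {"status": "done", "log": log}

# Hypothetical tools: this search returns nothing, which the plan did not expect.
def search(query: str) -> list:
    return []

def summarize(source: str) -> str:
    return f"summary of {source}"

def review(step, output, remaining):
    # Stand-in for the model: abort when a step came back empty.
    if not output:
        return "abort", remaining
    return "continue", remaining

plan = [
    PlanStep("search", {"query": "tool use"}, expects="list of URLs"),
    PlanStep("summarize", {"source": "first URL"}, expects="short summary"),
]
outcome = execute_plan(plan, {"search": search, "summarize": summarize}, review)
print(outcome["status"], len(outcome["log"]))
```

In this run the empty search result stops the plan before `summarize` executes, which is the disconnect the mitigation is meant to catch.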

Pattern 5: ReAct loop

Think → act → observe → repeat. The model reasons about the next step, takes a single action, sees the result, then loops. This is the pattern underlying most "agentic" systems — no pre-committed plan, just iterative reasoning.

When to use: tasks where the path is genuinely unpredictable. Research tasks, debugging, open-ended problem-solving. Failure modes: infinite loops, wandering, tool-call storms. Mitigations: strict step cap (max 10 steps before forcing a conclusion), cost cap (max $X per session), reflection prompts at key points ("are you making progress?"). See agents in production for the full loop discipline.
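A bare-bones loop with the step cap and cost cap described above might look like this. It is a sketch with assumptions: `decide` stands in for the model call, cost is approximated as a flat per-step figure, and the dollar values in the demo are arbitrary placeholders rather than recommendations.

```python
def react_loop(decide, tools, max_cost_usd, max_steps=10, cost_per_step_usd=0.02):
    """Think -> act -> observe until the model finishes or a cap is hit.

    `decide(history)` returns either
      {"type": "final", "answer": ...} or
      {"type": "tool", "tool": name, "args": {...}}.
    """
    history = []
    spent = 0.0
    for step in range(max_steps):
        if spent + cost_per_step_usd > max_cost_usd:
            return {"status": "cost_cap", "answer": None, "steps": step}
        spent += cost_per_step_usd
        action = decide(history)  # think
        if action["type"] == "final":
            return {"status": "done", "answer": action["answer"], "steps": step + 1}
        observation = tools[action["tool"]](**action["args"])  # act
        history.append({"action": action, "observation": observation})  # observe
    # Step cap reached: the caller should now force a conclusion from `history`.
    return {"status": "step_cap", "answer": None, "steps": max_steps}

# Hypothetical tool and a scripted stand-in for the model.
def lookup(term: str) -> str:
    return f"notes on {term}"

def scripted_decide(history):
    if len(history) < 2:
        return {"type": "tool", "tool": "lookup", "args": {"term": f"topic {len(history)}"}}
    return {"type": "final", "answer": f"answer from {len(history)} observations"}

outcome = react_loop(scripted_decide, {"lookup": lookup}, max_cost_usd=1.00)
print(outcome)
```

Returning a distinct status for each cap lets the caller tell a finished run from a truncated one and decide whether to ask the model for a best-effort answer.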

Choosing the pattern

Default to single-shot or sequential for 80% of use cases. Use ReAct only when the task genuinely can't be planned ahead. Use plan-then-execute when the audit trail matters (regulated contexts, user-facing research agents). Parallel is an optimization on top of the others, not its own pattern in practice.
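As a rough illustration, the defaults above can be written down as a small decision helper. The function name, its inputs, and the precedence between the rules are assumptions made for the sketch; real choices involve more judgment than three arguments capture.

```python
def choose_pattern(path_predictable: bool, needs_audit_trail: bool, tool_calls_needed: int) -> str:
    """Encode the defaults from this section as a starting point, not a rule."""
    if not path_predictable:
        # No way to plan ahead: iterate with a ReAct loop.
        return "react"
    if needs_audit_trail:
        # Regulated contexts, user-facing research agents.
        return "plan_then_execute"
    if tool_calls_needed <= 1:
        return "single_shot"
    # Parallel execution is layered on top wherever calls are independent.
    return "sequential"

print(choose_pattern(path_predictable=True, needs_audit_trail=False, tool_calls_needed=1))
```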

One hard-won rule: start simpler than you think you need. Most teams try to build ReAct agents when a single-shot tool call would do. The single-shot version ships faster, breaks less often, and tells you whether you actually needed the complexity.

Continue the thread.

AI agents in production: what actually breaks

LangGraph patterns we use in every agentic system

Making structured outputs actually reliable
