The loop at a glance
Every agent session follows the same cycle:
1. Receive prompt. Claude receives your prompt, along with the system prompt, tool definitions, and conversation history. The SDK yields a SystemMessage with subtype "init" containing session metadata.
2. Evaluate and respond. Claude evaluates the current state and determines how to proceed. It may respond with text, request one or more tool calls, or both. The SDK yields an AssistantMessage containing the text and any tool call requests.
3. Execute tools. The SDK runs each requested tool and collects the results. Each set of tool results feeds back to Claude for the next decision. You can use hooks to intercept, modify, or block tool calls before they run.
4. Repeat. Steps 2 and 3 repeat as a cycle. Each full cycle is one turn. Claude continues calling tools and processing results until it produces a response with no tool calls.
5. Return result. The SDK yields a final AssistantMessage with the text response (no tool calls), followed by a ResultMessage with the final text, token usage, cost, and session ID.
Glob and responding with the results. A complex task (“refactor the auth module and update the tests”) can chain dozens of tool calls across many turns, reading files, editing code, and running tests, with Claude adjusting its approach based on each result.
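The cycle can be sketched in plain Python without the SDK. This is a toy model, not the SDK's implementation: Response, run_loop, scripted_model, and the fake Glob tool are all hypothetical names invented for illustration, and the "model" is a hard-coded function rather than Claude.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """Stand-in for one model response: text plus zero or more tool calls."""
    text: str = ""
    tool_calls: list = field(default_factory=list)  # (tool_name, argument) pairs


def run_loop(prompt, respond, tools):
    """Repeat evaluate -> execute until a response contains no tool calls."""
    history = [("user", prompt)]
    turns = 0
    while True:
        response = respond(history)  # step 2: evaluate and respond
        history.append(("assistant", response))
        if not response.tool_calls:  # step 5: a text-only response ends the loop
            return response.text, turns
        # step 3: execute every requested tool and feed the results back
        results = [tools[name](arg) for name, arg in response.tool_calls]
        history.append(("tool_results", results))
        turns += 1  # step 4: one full cycle is one turn


def scripted_model(history):
    """Hypothetical model: calls Glob once, then answers from the result."""
    if not any(role == "tool_results" for role, _ in history):
        return Response(tool_calls=[("Glob", "*.py")])
    files = history[-1][1][0]
    return Response(text=f"Found {len(files)} files")


tools = {"Glob": lambda pattern: ["a.py", "b.py"]}  # fake tool, fixed output
final_text, turns = run_loop("List the Python files", scripted_model, tools)
print(final_text, turns)
```

The loop returns control to the caller only once, after the text-only response, which mirrors how turns run without yielding back to your code.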
Turns and messages
A turn is one round trip inside the loop: Claude produces output that includes tool calls, the SDK executes those tools, and the results feed back to Claude automatically. This happens without yielding control back to your code. Turns continue until Claude produces output with no tool calls, at which point the loop ends and the final result is delivered.
Consider what a full session might look like for the prompt “Fix the failing tests in auth.ts”. First, the SDK sends your prompt to Claude and yields a SystemMessage with the session metadata. Then the loop begins:
- Turn 1: Claude calls Bash to run npm test. The SDK yields an AssistantMessage with the tool call, executes the command, then yields a UserMessage with the output (three failures).
- Turn 2: Claude calls Read on auth.ts and auth.test.ts. The SDK returns the file contents and yields an AssistantMessage.
- Turn 3: Claude calls Edit to fix auth.ts, then calls Bash to re-run npm test. All three tests pass. The SDK yields an AssistantMessage.
- Final turn: Claude produces a text-only response with no tool calls: “Fixed the auth bug, all three tests pass now.” The SDK yields a final AssistantMessage with this text, then a ResultMessage with the same text plus cost and usage.
You can cap the loop with max_turns / maxTurns, which counts tool-use turns only. For example, max_turns=2 in the loop above would have stopped before the edit step. You can also use max_budget_usd / maxBudgetUsd to stop the loop once spend reaches a threshold.
Without limits, the loop runs until Claude finishes on its own, which is fine for well-scoped tasks but can run long on open-ended prompts (“improve this codebase”). Setting a budget is a good default for production agents. See Turns and budget below for the option reference.
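To make the turn cap concrete, the sketch below replays the auth.ts session as a list of tool-use turns and applies a limit. It is a simplified model: run_with_turn_limit is a hypothetical helper, and the exact point at which the SDK stops is an assumption based on the description above (the final text-only response is not counted as a turn).

```python
def run_with_turn_limit(planned_turns, max_turns=None):
    """Replay scripted tool-use turns, stopping once max_turns have executed.

    planned_turns: list of turns, each a list of tool names called in that turn.
    Returns (executed_turns, result_subtype).
    """
    executed = []
    for tools_in_turn in planned_turns:
        if max_turns is not None and len(executed) >= max_turns:
            return executed, "error_max_turns"  # limit hit before finishing
        executed.append(tools_in_turn)
    return executed, "success"  # the text-only final response is not a tool-use turn


# Turn 1: run tests, turn 2: read two files, turn 3: edit and re-run tests
session = [["Bash"], ["Read", "Read"], ["Edit", "Bash"]]

capped_turns, capped_subtype = run_with_turn_limit(session, max_turns=2)
full_turns, full_subtype = run_with_turn_limit(session)
print(capped_subtype, len(capped_turns))
print(full_subtype, len(full_turns))
```

With max_turns=2 the replay stops after the Read turn, so the Edit call never runs, matching the example in the text.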
Message types
As the loop runs, the SDK yields a stream of messages. Each message carries a type that tells you what stage of the loop it came from. The five core types are:
- SystemMessage: session lifecycle events. The subtype field distinguishes them: "init" is the first message (session metadata), and "compact_boundary" fires after compaction. In TypeScript, the compact boundary is its own SDKCompactBoundaryMessage type rather than a subtype of SDKSystemMessage.
- AssistantMessage: emitted after each Claude response, including the final text-only one. Contains text content blocks and tool call blocks from that turn.
- UserMessage: emitted after each tool execution with the tool result content sent back to Claude. Also emitted for any user inputs you stream mid-loop.
- StreamEvent: only emitted when partial messages are enabled. Contains raw API streaming events (text deltas, tool input chunks). See Stream responses.
- ResultMessage: the last message, always. Contains the final text result, token usage, cost, and session ID. Check the subtype field to determine whether the task succeeded or hit a limit. See Handle the result.
Handle messages
Which messages you handle depends on what you’re building:
- Final results only: handle ResultMessage to get the output, cost, and whether the task succeeded or hit a limit.
- Progress updates: handle AssistantMessage to see what Claude is doing each turn, including which tools it called.
- Live streaming: enable partial messages (include_partial_messages in Python, includePartialMessages in TypeScript) to get StreamEvent messages in real time. See Stream responses in real-time.
- Python: check message types with isinstance() against classes imported from claude_agent_sdk (for example, isinstance(message, ResultMessage)).
- TypeScript: check the type string field (for example, message.type === "result"). AssistantMessage and UserMessage wrap the raw API message in a .message field, so content blocks are at message.message.content, not message.content.
Example: Check message types and handle results
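A minimal sketch of the Python isinstance() pattern follows. So that it runs without the SDK installed, it defines simplified stand-in classes with the same names as the SDK's message and block types; the field sets shown are assumptions limited to what this page mentions. In a real agent you would import the classes from claude_agent_sdk instead and iterate over the SDK's message stream.

```python
from dataclasses import dataclass
from typing import Optional


# Simplified stand-ins (assumption): the real classes come from claude_agent_sdk.
@dataclass
class TextBlock:
    text: str


@dataclass
class ToolUseBlock:
    name: str
    input: dict


@dataclass
class AssistantMessage:
    content: list


@dataclass
class ResultMessage:
    subtype: str
    result: Optional[str] = None
    total_cost_usd: Optional[float] = None


def handle(message, log):
    """Record progress for assistant messages and the outcome for the result."""
    if isinstance(message, AssistantMessage):
        for block in message.content:
            if isinstance(block, ToolUseBlock):
                log.append(f"tool: {block.name}")
            elif isinstance(block, TextBlock):
                log.append(f"text: {block.text}")
    elif isinstance(message, ResultMessage):
        if message.subtype == "success":
            log.append(f"done: {message.result}")
        else:
            log.append(f"stopped: {message.subtype}")  # result text is not available


# Hypothetical stream resembling the auth.ts session
stream = [
    AssistantMessage([ToolUseBlock("Bash", {"command": "npm test"})]),
    AssistantMessage([TextBlock("Fixed the auth bug.")]),
    ResultMessage("success", result="Fixed the auth bug.", total_cost_usd=0.01),
]
log = []
for message in stream:
    handle(message, log)
print(log)
```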
Tool execution
Tools give your agent the ability to take action. Without tools, Claude can only respond with text. With tools, Claude can read files, run commands, search code, and interact with external services.
Built-in tools
The SDK includes the same tools that power Claude Code:
| Category | Tools | What they do |
|---|---|---|
| File operations | Read, Edit, Write | Read, modify, and create files |
| Search | Glob, Grep | Find files by pattern, search content with regex |
| Execution | Bash | Run shell commands, scripts, git operations |
| Web | WebSearch, WebFetch | Search the web, fetch and parse pages |
| Discovery | ToolSearch | Dynamically find and load tools on-demand instead of preloading all of them |
| Orchestration | Agent, Skill, AskUserQuestion, TodoWrite | Spawn subagents, invoke skills, ask the user, track tasks |
- Connect external services with MCP servers (databases, browsers, APIs)
- Define custom tools with custom tool handlers
- Load project skills via setting sources for reusable workflows
Tool permissions
Claude determines which tools to call based on the task, but you control whether those calls are allowed to execute. You can auto-approve specific tools, block others entirely, or require approval for everything. Three options work together to determine what runs:
- allowed_tools / allowedTools auto-approves listed tools. A read-only agent with ["Read", "Glob", "Grep"] in its allowed tools list runs those tools without prompting. Tools not listed are still available but require permission.
- disallowed_tools / disallowedTools blocks listed tools, regardless of other settings. See Permissions for the order that rules are checked before a tool runs.
- permission_mode / permissionMode controls what happens to tools that aren’t covered by allow or deny rules. See Permission mode for available modes.
"Bash(npm:*)" to allow only specific commands. See Permissions for the full rule syntax.
When a tool is denied, Claude receives a rejection message as the tool result and typically attempts a different approach or reports that it couldn’t proceed.
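How the three options interact can be modeled with a small decision function. This is a deliberately simplified sketch, not the SDK's evaluation order (see Permissions for that). The decide function and EDIT_TOOLS set are hypothetical, "plan" and "auto" modes are left out, and "bypassPermissions" is interpreted here as approving every tool that is not explicitly disallowed.

```python
EDIT_TOOLS = {"Edit", "Write"}  # assumption: which tools count as file edits


def decide(tool, allowed, disallowed, mode="default", callback=None):
    """Return "allow" or "deny" for one tool call under a simplified rule model."""
    if tool in disallowed:
        return "deny"  # deny rules win regardless of other settings
    if mode == "bypassPermissions":
        return "allow"  # assumption: everything not disallowed runs
    if tool in allowed:
        return "allow"  # auto-approved, no prompt
    if mode == "acceptEdits" and tool in EDIT_TOOLS:
        return "allow"  # file edits are auto-approved in this mode
    if mode == "dontAsk":
        return "deny"  # never prompts; anything not pre-approved is denied
    # "default" (and non-edit tools under "acceptEdits"): ask the approval callback
    if callback is None:
        return "deny"  # no callback means deny
    return "allow" if callback(tool) else "deny"


read_only = {"Read", "Glob", "Grep"}
print(decide("Read", read_only, set()))
print(decide("Bash", read_only, {"Bash"}, mode="bypassPermissions"))
print(decide("Edit", read_only, set(), mode="acceptEdits"))
print(decide("Bash", read_only, set(), callback=lambda tool: True))
```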
Parallel tool execution
When Claude requests multiple tool calls in a single turn, both SDKs can run them concurrently or sequentially depending on the tool. Read-only tools (like Read, Glob, Grep, and MCP tools marked as read-only) can run concurrently. Tools that modify state (like Edit, Write, and Bash) run sequentially to avoid conflicts.
Custom tools default to sequential execution. To enable parallel execution for a custom tool, mark it as read-only in its annotations: readOnly in TypeScript or readOnlyHint in Python.
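One way to picture the scheduling rule is to group a turn's tool calls into batches. The sketch below is illustrative only: plan_batches is a hypothetical helper, and batching consecutive read-only calls while preserving request order is an assumption about scheduling, not documented SDK behavior.

```python
READ_ONLY_TOOLS = {"Read", "Glob", "Grep"}


def plan_batches(tool_calls, read_only=READ_ONLY_TOOLS):
    """Group consecutive read-only calls together; state-changing calls run alone."""
    batches = []
    for name in tool_calls:
        if name in read_only:
            if batches and batches[-1]["mode"] == "concurrent":
                batches[-1]["tools"].append(name)  # join the open concurrent batch
            else:
                batches.append({"mode": "concurrent", "tools": [name]})
        else:
            batches.append({"mode": "sequential", "tools": [name]})
    return batches


plan = plan_batches(["Read", "Grep", "Edit", "Bash", "Glob"])
for batch in plan:
    print(batch["mode"], batch["tools"])
```

A custom tool would join the concurrent group only if its name were added to the read-only set, which corresponds to marking it read-only in its annotations.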
Control how the loop runs
You can limit how many turns the loop takes, how much it costs, how deeply Claude reasons, and whether tools require approval before running. All of these are fields on ClaudeAgentOptions (Python) / Options (TypeScript).
Turns and budget
| Option | What it controls | Default |
|---|---|---|
| Max turns (max_turns / maxTurns) | Maximum tool-use round trips | No limit |
| Max budget (max_budget_usd / maxBudgetUsd) | Maximum cost before stopping | No limit |
When either limit is hit, the loop ends with a ResultMessage carrying the corresponding error subtype (error_max_turns or error_max_budget_usd). See Handle the result for how to check these subtypes and ClaudeAgentOptions / Options for syntax.
Effort level
The effort option controls how much reasoning Claude applies. Lower effort levels use fewer tokens per turn and reduce cost. Not all models support the effort parameter. See Effort for which models support it.
| Level | Behavior | Good for |
|---|---|---|
| "low" | Minimal reasoning, fast responses | File lookups, listing directories |
| "medium" | Balanced reasoning | Routine edits, standard tasks |
| "high" | Thorough analysis | Refactors, debugging |
| "max" | Maximum reasoning depth | Multi-step problems requiring deep analysis |
If you don’t set effort, the Python SDK leaves the parameter unset and defers to the model’s default behavior. The TypeScript SDK defaults to "high".
effort trades latency and token cost for reasoning depth within each response. Extended thinking is a separate feature that produces visible chain-of-thought blocks in the output. They are independent: you can set effort: "low" with extended thinking enabled, or effort: "max" without it.
effort is set at the top-level query() options, not per-subagent.
Permission mode
The permission mode option (permission_mode in Python, permissionMode in TypeScript) controls whether the agent asks for approval before using tools:
| Mode | Behavior |
|---|---|
| "default" | Tools not covered by allow rules trigger your approval callback; no callback means deny |
| "acceptEdits" | Auto-approves file edits, other tools follow default rules |
| "plan" | No tool execution; Claude produces a plan for review |
| "dontAsk" | Never prompts. Tools pre-approved by permission rules run, everything else is denied |
| "auto" (TypeScript only) | Uses a model classifier to approve or deny each tool call. See Auto mode for availability and behavior |
| "bypassPermissions" | Runs all allowed tools without asking. Cannot be used when running as root on Unix. Use only in isolated environments where the agent’s actions cannot affect systems you care about |
"default" with a tool approval callback to surface approval prompts. For autonomous agents on a dev machine, "acceptEdits" auto-approves file edits while still gating Bash behind allow rules. Reserve "bypassPermissions" for CI, containers, or other isolated environments. See Permissions for full details.
Model
If you don’t set model, the SDK uses Claude Code’s default, which depends on your authentication method and subscription. Set it explicitly (for example, model="claude-sonnet-4-6") to pin a specific model or to use a smaller model for faster, cheaper agents. See models for available IDs.
The context window
The context window is the total amount of information available to Claude during a session. It does not reset between turns within a session. Everything accumulates: the system prompt, tool definitions, conversation history, tool inputs, and tool outputs. Content that stays the same across turns (system prompt, tool definitions, CLAUDE.md) is automatically prompt cached, which reduces cost and latency for repeated prefixes.
What consumes context
Here’s how each component affects context in the SDK:
| Source | When it loads | Impact |
|---|---|---|
| System prompt | Every request | Small fixed cost, always present |
| CLAUDE.md files | Session start, when settingSources is enabled | Full content in every request (but prompt-cached, so only the first request pays full cost) |
| Tool definitions | Every request | Each tool adds its schema; use MCP tool search to load tools on-demand instead of all at once |
| Conversation history | Accumulates over turns | Grows with each turn: prompts, responses, tool inputs, tool outputs |
| Skill descriptions | Session start (with setting sources enabled) | Short summaries; full content loads only when invoked |
Automatic compaction
When the context window approaches its limit, the SDK automatically compacts the conversation: it summarizes older history to free space, keeping your most recent exchanges and key decisions intact. The SDK emits a message with type: "system" and subtype: "compact_boundary" in the stream when this happens (in Python this is a SystemMessage; in TypeScript it is a separate SDKCompactBoundaryMessage type).
Compaction replaces older messages with a summary, so specific instructions from early in the conversation may not be preserved. Persistent rules belong in CLAUDE.md (loaded via settingSources) rather than in the initial prompt, because CLAUDE.md content is re-injected on every request.
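The effect of compaction on early instructions can be seen in a toy model. This sketch is not the SDK's algorithm: compact is a hypothetical function, character counts stand in for tokens, and the summarizer is a placeholder that only reports how many messages it replaced.

```python
def compact(history, limit, keep_recent, summarize):
    """Replace all but the last keep_recent messages with one summary when over limit."""
    total = sum(len(message) for message in history)  # characters as a token proxy
    if total <= limit or len(history) <= keep_recent:
        return history  # still fits, or nothing old enough to summarize
    cut = len(history) - keep_recent
    older, recent = history[:cut], history[cut:]
    return [summarize(older)] + recent


def placeholder_summary(messages):
    """Hypothetical summarizer: a real one would condense the content."""
    return f"[summary of {len(messages)} messages]"


history = [
    "Always use tabs." + "x" * 40,  # early instruction, 56 characters
    "y" * 50,
    "recent question",
    "recent answer",
]
compacted = compact(history, limit=100, keep_recent=2, summarize=placeholder_summary)
print(compacted)
```

After compaction the "Always use tabs." instruction survives only if the summarizer happens to keep it, which is why persistent rules are better placed in CLAUDE.md.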
You can customize compaction behavior in several ways:
- Summarization instructions in CLAUDE.md: The compactor reads your CLAUDE.md like any other context, so you can include a section telling it what to preserve when summarizing. The section header is free-form (not a magic string); the compactor matches on intent.
- PreCompact hook: Run custom logic before compaction occurs, for example to archive the full transcript. The hook receives a trigger field (manual or auto). See hooks.
- Manual compaction: Send /compact as a prompt string to trigger compaction on demand. (Slash commands sent this way are SDK inputs, not CLI-only shortcuts. See slash commands in the SDK.)
Example: Summarization instructions in CLAUDE.md
Add a section to your project’s CLAUDE.md telling the compactor what to preserve. The header name isn’t special; use any clear label.
CLAUDE.md
Keep context efficient
A few strategies for long-running agents:
- Use subagents for subtasks. Each subagent starts with a fresh conversation (no prior message history, though it does load its own system prompt and project-level context like CLAUDE.md). It does not see the parent’s turns, and only its final response returns to the parent as a tool result. The main agent’s context grows by that summary, not by the full subtask transcript. See What subagents inherit for details.
- Be selective with tools. Every tool definition takes context space. Use the tools field on AgentDefinition to scope subagents to the minimum set they need, and use MCP tool search to load tools on demand instead of preloading all of them.
- Watch MCP server costs. Each MCP server adds all its tool schemas to every request. A few servers with many tools can consume significant context before the agent does any work. The ToolSearch tool can help by loading tools on-demand instead of preloading all of them. See MCP tool search for configuration.
- Use lower effort for routine tasks. Set effort to "low" for agents that only need to read files or list directories. This reduces token usage and cost.
Sessions and continuity
Each interaction with the SDK creates or continues a session. Capture the session ID from ResultMessage.session_id (available in both SDKs) to resume later. The TypeScript SDK also exposes it as a direct field on the init SystemMessage; in Python it’s nested in SystemMessage.data.
When you resume, the full context from previous turns is restored: files that were read, analysis that was performed, and actions that were taken. You can also fork a session to branch into a different approach without modifying the original.
See Session management for the full guide on resume, continue, and fork patterns.
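A small helper can capture the session ID from whichever message carries it. This is a sketch using duck typing so it runs without the SDK: extract_session_id is a hypothetical name, the SimpleNamespace objects stand in for SDK messages, and the "session_id" key inside SystemMessage.data is an assumption about the nested layout.

```python
from types import SimpleNamespace


def extract_session_id(message):
    """Return the session ID from a result-style or init-style message, else None."""
    session_id = getattr(message, "session_id", None)  # e.g. ResultMessage.session_id
    if session_id:
        return session_id
    data = getattr(message, "data", None)  # e.g. Python SystemMessage.data
    if isinstance(data, dict):
        return data.get("session_id")  # assumed key name
    return None


# Stand-in messages with made-up IDs
init_message = SimpleNamespace(subtype="init", data={"session_id": "session-123"})
result_message = SimpleNamespace(subtype="success", session_id="session-123")
unrelated = SimpleNamespace(content=[])

print(extract_session_id(init_message))
print(extract_session_id(result_message))
print(extract_session_id(unrelated))
```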
In Python, ClaudeSDKClient handles session IDs automatically across multiple calls. See the Python SDK reference for details.
Handle the result
When the loop ends, the ResultMessage tells you what happened and gives you the output. The subtype field (available in both SDKs) is the primary way to check termination state.
| Result subtype | What happened | result field available? |
|---|---|---|
| success | Claude finished the task normally | Yes |
| error_max_turns | Hit the maxTurns limit before finishing | No |
| error_max_budget_usd | Hit the maxBudgetUsd limit before finishing | No |
| error_during_execution | An error interrupted the loop (for example, an API failure or cancelled request) | No |
| error_max_structured_output_retries | Structured output validation failed after the configured retry limit | No |
The result field (the final text output) is only present on the success variant, so always check the subtype before reading it. All result subtypes carry total_cost_usd, usage, num_turns, and session_id so you can track cost and resume even after errors. In Python, total_cost_usd and usage are typed as optional and may be None on some error paths, so guard before formatting them. See Tracking costs and usage for details on interpreting the usage fields.
The result also includes a stop_reason field (string | null in TypeScript, str | None in Python) indicating why the model stopped generating on its final turn. Common values are end_turn (model finished normally), max_tokens (hit the output token limit), and refusal (the model declined the request). On error result subtypes, stop_reason carries the value from the last assistant response before the loop ended. To detect refusals, check stop_reason === "refusal" (TypeScript) or stop_reason == "refusal" (Python). See SDKResultMessage (TypeScript) or ResultMessage (Python) for the full type.
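The checks described above can be combined into one function: read the subtype first, guard the optional cost, and test stop_reason for refusals. This is a sketch that works on any object exposing the fields named in this section; describe_result is a hypothetical helper and the SimpleNamespace objects are stand-ins for a real ResultMessage.

```python
from types import SimpleNamespace


def describe_result(message):
    """Summarize a result message without assuming optional fields are set."""
    cost = message.total_cost_usd
    cost_text = f"${cost:.4f}" if cost is not None else "cost unavailable"
    if message.subtype == "success":
        if getattr(message, "stop_reason", None) == "refusal":
            return f"refused ({cost_text})"
        return f"ok: {message.result} ({cost_text})"  # result only exists on success
    # Error subtypes: no result text, but the session ID still allows resuming
    return f"stopped: {message.subtype} ({cost_text}, session {message.session_id})"


# Stand-in results with made-up values
finished = SimpleNamespace(
    subtype="success", result="Fixed the auth bug", total_cost_usd=0.0123,
    stop_reason="end_turn", session_id="session-123",
)
hit_limit = SimpleNamespace(
    subtype="error_max_turns", total_cost_usd=None,
    stop_reason="end_turn", session_id="session-123",
)
print(describe_result(finished))
print(describe_result(hit_limit))
```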
Hooks
Hooks are callbacks that fire at specific points in the loop: before a tool runs, after it returns, when the agent finishes, and so on. Some commonly used hooks are:
| Hook | When it fires | Common uses |
|---|---|---|
| PreToolUse | Before a tool executes | Validate inputs, block dangerous commands |
| PostToolUse | After a tool returns | Audit outputs, trigger side effects |
| UserPromptSubmit | When a prompt is sent | Inject additional context into prompts |
| Stop | When the agent finishes | Validate the result, save session state |
| SubagentStart / SubagentStop | When a subagent spawns or completes | Track and aggregate parallel task results |
| PreCompact | Before context compaction | Archive full transcript before summarizing |
A PreToolUse hook that rejects a tool call prevents it from executing, and Claude receives the rejection message instead.
Both SDKs support all the events above. The TypeScript SDK includes additional events that Python does not yet support. See Control execution with hooks for the complete event list, per-SDK availability, and the full callback API.
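As an illustration of blocking dangerous commands, here is a sketch of a PreToolUse callback written as a plain async function. The three-argument signature, the tool_name and tool_input keys, and the hookSpecificOutput return shape are assumptions about the Python callback API and should be checked against Control execution with hooks; BLOCKED_PATTERNS and block_dangerous_bash are hypothetical names.

```python
import asyncio

BLOCKED_PATTERNS = ("rm -rf", "git push --force")  # hypothetical deny list


async def block_dangerous_bash(input_data, tool_use_id, context):
    """Deny Bash calls whose command contains a blocked pattern; allow the rest."""
    if input_data.get("tool_name") != "Bash":
        return {}  # empty result: no opinion, the call proceeds
    command = input_data.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED_PATTERNS:
        if pattern in command:
            return {
                "hookSpecificOutput": {  # assumed return shape
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "deny",
                    "permissionDecisionReason": f"Blocked pattern: {pattern}",
                }
            }
    return {}


# Exercise the callback directly with hand-built inputs
denied = asyncio.run(block_dangerous_bash(
    {"tool_name": "Bash", "tool_input": {"command": "rm -rf build"}}, None, None))
passed = asyncio.run(block_dangerous_bash(
    {"tool_name": "Bash", "tool_input": {"command": "npm test"}}, None, None))
print(denied)
print(passed)
```

Keeping the callback a pure function of its inputs makes it easy to test in isolation before registering it with the agent.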
Put it all together
This example combines the key concepts from this page into a single agent that fixes failing tests. It configures the agent with allowed tools (auto-approved so the agent runs autonomously), project settings, and safety limits on turns and reasoning effort. As the loop runs, it captures the session ID for potential resumption, handles the final result, and prints the total cost.
Next steps
Now that you understand the loop, here’s where to go depending on what you’re building:
- Haven’t run an agent yet? Start with the quickstart to get the SDK installed and see a full example running end to end.
- Ready to hook into your project? Load CLAUDE.md, skills, and filesystem hooks so the agent follows your project conventions automatically.
- Building an interactive UI? Enable streaming to show live text and tool calls as the loop runs.
- Need tighter control over what the agent can do? Lock down tool access with permissions, and use hooks to audit, block, or transform tool calls before they execute.
- Running long or expensive tasks? Offload isolated work to subagents to keep your main context lean.