python/claude-api/streaming.md
# Streaming — Python

## Quick Start

```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    messages=[{"role": "user", "content": "Write a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

### Async

```python
async with async_client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    messages=[{"role": "user", "content": "Write a story"}]
) as stream:
    async for text in stream.text_stream:
        print(text, end="", flush=True)
```

---

## Handling Different Content Types

Claude may return text, thinking blocks, or tool use. Handle each appropriately:

> **Opus 4.7 / Opus 4.6:** Use `thinking: {type: "adaptive"}`. On older models, use `thinking: {type: "enabled", budget_tokens: N}` instead.

```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    messages=[{"role": "user", "content": "Analyze this problem"}]
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "thinking":
                print("\n[Thinking...]")
            elif event.content_block.type == "text":
                print("\n[Response:]")

        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
```

---

## Streaming with Tool Use

The Python tool runner currently returns complete messages. If you need per-token streaming with tools, stream the individual API calls yourself in a manual loop:

```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    tools=tools,
    messages=messages
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    response = stream.get_final_message()
    # Continue with tool execution if response.stop_reason == "tool_use"
```

---

## Getting the Final Message

```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=64000,
    messages=[{"role": "user", "content": "Hello"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    # Get the full message after streaming completes
    final_message = stream.get_final_message()

print(f"\n\nTokens used: {final_message.usage.output_tokens}")
```

---

## Streaming with Progress Updates

```python
def stream_with_progress(client, **kwargs):
    """Stream a response with progress updates."""
    total_tokens = 0
    content_parts = []

    with client.messages.stream(**kwargs) as stream:
        for event in stream:
            if event.type == "content_block_delta":
                if event.delta.type == "text_delta":
                    text = event.delta.text
                    content_parts.append(text)
                    print(text, end="", flush=True)

            elif event.type == "message_delta":
                if event.usage and event.usage.output_tokens is not None:
                    total_tokens = event.usage.output_tokens

        final_message = stream.get_final_message()

    print(f"\n\n[Tokens used: {total_tokens}]")
    return "".join(content_parts)
```

---

## Error Handling in Streams

```python
import anthropic

try:
    with client.messages.stream(
        model="claude-opus-4-7",
        max_tokens=64000,
        messages=[{"role": "user", "content": "Write a story"}]
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
except anthropic.APIConnectionError:
    print("\nConnection lost. Please retry.")
except anthropic.RateLimitError:
    print("\nRate limited. Please wait and retry.")
except anthropic.APIStatusError as e:
    print(f"\nAPI error: {e.status_code}")
```

---

## Stream Event Types

| Event Type            | Description                 | When it fires                     |
| --------------------- | --------------------------- | --------------------------------- |
| `message_start`       | Contains message metadata   | Once at the beginning             |
| `content_block_start` | New content block beginning | When a text/tool_use block starts |
| `content_block_delta` | Incremental content update  | For each token/chunk              |
| `content_block_stop`  | Content block complete      | When a block finishes             |
| `message_delta`       | Message-level updates       | Contains `stop_reason`, usage     |
| `message_stop`        | Message complete            | Once at the end                   |

## Best Practices

1. **Always flush output** — Use `flush=True` so tokens appear immediately
2. **Handle partial responses** — If the stream is interrupted, you may be left with incomplete content
3. **Track token usage** — The `message_delta` event carries usage information
4. **Use timeouts** — Set timeouts appropriate for your application
5. **Default to streaming** — Even when you only need the complete response, stream and call `.get_final_message()`; this gives you timeout protection without handling individual events
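Best practices 2 and 4 above can be combined into a small retry wrapper. Below is a minimal sketch, not part of the SDK: `stream_with_retry` is a hypothetical helper, and `ConnectionLost` stands in for `anthropic.APIConnectionError` so the example runs without the SDK installed. It discards partial output from a failed attempt and restarts the stream from scratch, so the caller never receives a silently truncated response.

```python
import time


class ConnectionLost(Exception):
    """Stand-in for anthropic.APIConnectionError in this sketch."""


def stream_with_retry(client, max_retries=3, **kwargs):
    """Retry an interrupted stream from scratch with exponential backoff.

    Hypothetical helper: output collected during a failed attempt is
    discarded so the caller only ever sees a complete response.
    """
    for attempt in range(max_retries):
        parts = []
        try:
            with client.messages.stream(**kwargs) as stream:
                for text in stream.text_stream:
                    parts.append(text)
            return "".join(parts)
        except ConnectionLost:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(2 ** attempt)  # back off before the next attempt
```

In production you would catch `anthropic.APIConnectionError` instead of the stand-in exception, and likely cap the backoff delay.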