# Claude API — TypeScript

## Installation

```bash
npm install @anthropic-ai/sdk
```

## Client Initialization

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Default (uses ANTHROPIC_API_KEY env var)
const client = new Anthropic();

// Explicit API key
const client = new Anthropic({ apiKey: "your-api-key" });
```

---

## Basic Message Request

```typescript
const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

// response.content is ContentBlock[] — a discriminated union. Narrow by .type
// before accessing .text (TypeScript will error on content[0].text without this).
for (const block of response.content) {
  if (block.type === "text") {
    console.log(block.text);
  }
}
```

---

## System Prompts

```typescript
const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  system:
    "You are a helpful coding assistant. Always provide examples in Python.",
  messages: [{ role: "user", content: "How do I read a JSON file?" }],
});
```

---

## Vision (Images)

### URL

```typescript
const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: { type: "url", url: "https://example.com/image.png" },
        },
        { type: "text", text: "Describe this image" },
      ],
    },
  ],
});
```

### Base64

```typescript
import fs from "fs";

const imageData = fs.readFileSync("image.png").toString("base64");

const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: { type: "base64", media_type: "image/png", data: imageData },
        },
        { type: "text", text: "What's in this image?" },
      ],
    },
  ],
});
```
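If you attach local images often, the base64 plumbing above is worth factoring out. A minimal sketch, assuming the SDK's `Anthropic.ImageBlockParam` type; the helper name and media-type map are ours, not part of the SDK:

```typescript
import fs from "fs";
import path from "path";
import Anthropic from "@anthropic-ai/sdk";

// Hypothetical helper: maps common file extensions to media types and
// returns a base64 image block ready to drop into a message's content array.
const MEDIA_TYPES: Record<
  string,
  "image/png" | "image/jpeg" | "image/gif" | "image/webp"
> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
};

function imageBlockFromFile(filePath: string): Anthropic.ImageBlockParam {
  const mediaType = MEDIA_TYPES[path.extname(filePath).toLowerCase()];
  if (!mediaType) throw new Error(`Unsupported image type: ${filePath}`);
  return {
    type: "image",
    source: {
      type: "base64",
      media_type: mediaType,
      data: fs.readFileSync(filePath).toString("base64"),
    },
  };
}
```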
---

## Prompt Caching

**Caching is a prefix match** — any byte change anywhere in the prefix invalidates everything after it. For placement patterns, architectural guidance (frozen system prompt, deterministic tool order, where to put volatile content), and the silent-invalidator audit checklist, read `shared/prompt-caching.md`.

### Automatic Caching (Recommended)

Use top-level `cache_control` to automatically cache the last cacheable block in the request:

```typescript
const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  cache_control: { type: "ephemeral" }, // auto-caches the last cacheable block
  system: "You are an expert on this large document...",
  messages: [{ role: "user", content: "Summarize the key points" }],
});
```

### Manual Cache Control

For fine-grained control, add `cache_control` to specific content blocks:

```typescript
const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  system: [
    {
      type: "text",
      text: "You are an expert on this large document...",
      cache_control: { type: "ephemeral" }, // default TTL is 5 minutes
    },
  ],
  messages: [{ role: "user", content: "Summarize the key points" }],
});

// With explicit TTL (time-to-live)
const response2 = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  system: [
    {
      type: "text",
      text: "You are an expert on this large document...",
      cache_control: { type: "ephemeral", ttl: "1h" }, // 1 hour TTL
    },
  ],
  messages: [{ role: "user", content: "Summarize the key points" }],
});
```

### Verifying Cache Hits

```typescript
console.log(response.usage.cache_creation_input_tokens); // tokens written to cache (~1.25x cost)
console.log(response.usage.cache_read_input_tokens); // tokens served from cache (~0.1x cost)
console.log(response.usage.input_tokens); // uncached tokens (full cost)
```

If `cache_read_input_tokens` is zero across repeated identical-prefix requests, a silent invalidator is at work — `Date.now()` or a UUID in the system prompt, non-deterministic key ordering, or a varying tool set. See `shared/prompt-caching.md` for the full audit table.

---

## Extended Thinking

> **Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Opus 4.7 (sending it returns a 400 error); deprecated on Opus 4.6 and Sonnet 4.6.
> **Older models:** Use `thinking: { type: "enabled", budget_tokens: N }` (must be < `max_tokens`, min 1024).

```typescript
// Opus 4.7 / 4.6: adaptive thinking (recommended)
const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  thinking: { type: "adaptive" },
  output_config: { effort: "high" }, // low | medium | high | max
  messages: [
    { role: "user", content: "Solve this math problem step by step..." },
  ],
});

for (const block of response.content) {
  if (block.type === "thinking") {
    console.log("Thinking:", block.thinking);
  } else if (block.type === "text") {
    console.log("Response:", block.text);
  }
}
```

---

## Error Handling

Use the SDK's typed exception classes — never check error messages with string matching:

```typescript
import Anthropic from "@anthropic-ai/sdk";

try {
  const response = await client.messages.create({...});
} catch (error) {
  if (error instanceof Anthropic.BadRequestError) {
    console.error("Bad request:", error.message);
  } else if (error instanceof Anthropic.AuthenticationError) {
    console.error("Invalid API key");
  } else if (error instanceof Anthropic.RateLimitError) {
    console.error("Rate limited - retry later");
  } else if (error instanceof Anthropic.APIError) {
    console.error(`API error ${error.status}:`, error.message);
  }
}
```

All classes extend `Anthropic.APIError` with a typed `status` field. Check from most specific to least specific. See [shared/error-codes.md](../../shared/error-codes.md) for the full error code reference.
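For rate limits specifically, exponential backoff is the usual recovery. A sketch; `withBackoff` is our helper name, and the SDK already retries some failures on its own (tunable via the client's `maxRetries` option):

```typescript
// Hypothetical wrapper: retries only rate-limit errors, with exponential backoff.
async function withBackoff<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (!(error instanceof Anthropic.RateLimitError) || attempt >= maxAttempts) {
        throw error;
      }
      const delayMs = 1000 * 2 ** (attempt - 1); // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage
const response = await withBackoff(() =>
  client.messages.create({
    model: "claude-opus-4-7",
    max_tokens: 16000,
    messages: [{ role: "user", content: "Hello" }],
  }),
);
```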
---

## Multi-Turn Conversations

The API is stateless — send the full conversation history each time. Use `Anthropic.MessageParam[]` to type the messages array:

```typescript
const messages: Anthropic.MessageParam[] = [
  { role: "user", content: "My name is Alice." },
  { role: "assistant", content: "Hello Alice! Nice to meet you." },
  { role: "user", content: "What's my name?" },
];

const response = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  messages,
});
```

**Rules:**

- Consecutive same-role messages are allowed — the API combines them into a single turn
- First message must be `user`
- Use SDK types (`Anthropic.MessageParam`, `Anthropic.Message`, `Anthropic.Tool`, etc.) for all API data structures — don't redefine equivalent interfaces

---

### Compaction (long conversations)

> **Beta; Opus 4.7, Opus 4.6, and Sonnet 4.6.** When conversations approach the 200K context window, compaction automatically summarizes earlier context server-side. The API returns a `compaction` block; you must pass it back on subsequent requests — append `response.content`, not just the text.

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const messages: Anthropic.Beta.BetaMessageParam[] = [];

async function chat(userMessage: string): Promise<string> {
  messages.push({ role: "user", content: userMessage });

  const response = await client.beta.messages.create({
    betas: ["compact-2026-01-12"],
    model: "claude-opus-4-7",
    max_tokens: 16000,
    messages,
    context_management: {
      edits: [{ type: "compact_20260112" }],
    },
  });

  // Append full content — compaction blocks must be preserved
  messages.push({ role: "assistant", content: response.content });

  const textBlock = response.content.find(
    (b): b is Anthropic.Beta.BetaTextBlock => b.type === "text",
  );
  return textBlock?.text ?? "";
}

// Compaction triggers automatically when context grows large
console.log(await chat("Help me build a Python web scraper"));
console.log(await chat("Add support for JavaScript-rendered pages"));
console.log(await chat("Now add rate limiting and error handling"));
```

---

## Stop Reasons

The `stop_reason` field in the response indicates why the model stopped generating:

| Value           | Meaning                                                          |
| --------------- | ---------------------------------------------------------------- |
| `end_turn`      | Claude finished its response naturally                           |
| `max_tokens`    | Hit the `max_tokens` limit — increase it or use streaming        |
| `stop_sequence` | Hit a custom stop sequence                                       |
| `tool_use`      | Claude wants to call a tool — execute it and continue            |
| `pause_turn`    | Model paused and can be resumed (agentic flows)                  |
| `refusal`       | Claude refused for safety reasons — output may not match schema  |
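In code this is a plain switch over `response.stop_reason`. A minimal dispatch sketch; the `tool_use` and `pause_turn` branches assume you have a continuation loop elsewhere:

```typescript
switch (response.stop_reason) {
  case "end_turn":
    break; // finished naturally, nothing to do
  case "max_tokens":
    console.warn("Truncated: raise max_tokens or switch to streaming");
    break;
  case "tool_use":
    // Run the requested tool(s), then send the results back as a
    // tool_result user message and call messages.create again.
    break;
  case "pause_turn":
    // Resumable: send the conversation back to continue the turn.
    break;
  case "refusal":
    // Safety refusal: don't parse the output as structured data.
    break;
}
```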
"";279}280281// Compaction triggers automatically when context grows large282console.log(await chat("Help me build a Python web scraper"));283console.log(await chat("Add support for JavaScript-rendered pages"));284console.log(await chat("Now add rate limiting and error handling"));285```286287---288289## Stop Reasons290291The `stop_reason` field in the response indicates why the model stopped generating:292293| Value | Meaning |294| --------------- | --------------------------------------------------------------- |295| `end_turn` | Claude finished its response naturally |296| `max_tokens` | Hit the `max_tokens` limit — increase it or use streaming |297| `stop_sequence` | Hit a custom stop sequence |298| `tool_use` | Claude wants to call a tool — execute it and continue |299| `pause_turn` | Model paused and can be resumed (agentic flows) |300| `refusal` | Claude refused for safety reasons — output may not match schema |301302---303304## Cost Optimization Strategies305306### 1. Use Prompt Caching for Repeated Context307308```typescript309// Automatic caching (simplest — caches the last cacheable block)310const response = await client.messages.create({311model: "claude-opus-4-7",312max_tokens: 16000,313cache_control: { type: "ephemeral" },314system: largeDocumentText, // e.g., 50KB of context315messages: [{ role: "user", content: "Summarize the key points" }],316});317318// First request: full cost319// Subsequent requests: ~90% cheaper for cached portion320```321322### 2. Use Token Counting Before Requests323324```typescript325const countResponse = await client.messages.countTokens({326model: "claude-opus-4-7",327messages: messages,328system: system,329});330331const estimatedInputCost = countResponse.input_tokens * 0.000005; // $5/1M tokens332console.log(`Estimated input cost: $${estimatedInputCost.toFixed(4)}`);333```334