Build LLM-powered apps with the Anthropic Claude API or SDK across Python, TypeScript, Java, Go, Ruby, C#, and PHP.
shared/token-counting.md
# Token Counting

Use the `count_tokens` endpoint (`POST /v1/messages/count_tokens`) for accurate
token counts against Claude models. Token counts are **model-specific**: pass
the same model ID you'll use for inference.

**Do not use `tiktoken`.** It's OpenAI's tokenizer. It undercounts Claude
tokens by ~15–20% on typical text, and by much more on code or non-English
input. Any estimate from `tiktoken`, `gpt-tokenizer`, or similar is wrong for
Claude.

## Count a file or string

```python
from anthropic import Anthropic

client = Anthropic()
resp = client.messages.count_tokens(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": open("CLAUDE.md").read()}],
)
print(resp.input_tokens)
```

TypeScript: `await client.messages.countTokens({model, messages})` →
`.input_tokens`. See `{lang}/claude-api/README.md` for other SDKs.

## CLI

```sh
ant messages count-tokens --model claude-opus-4-8 \
  --message '{role: user, content: "@./CLAUDE.md"}' \
  --transform input_tokens -r
```

## Diffing a file across two versions

The endpoint is stateless, so count each version separately and subtract:

```python
from anthropic import Anthropic
import subprocess

client = Anthropic()
def count(text: str) -> int:
    return client.messages.count_tokens(
        model="claude-opus-4-8",
        messages=[{"role": "user", "content": text}],
    ).input_tokens

before = subprocess.check_output(["git", "show", "HEAD:CLAUDE.md"], text=True)
after = open("CLAUDE.md").read()
print(count(after) - count(before))
```

Full docs: see the Token Counting entry in `shared/live-sources.md`.
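
The count-and-subtract approach can be wrapped in a small helper that reports a delta per file. This is a minimal sketch, not part of the skill. `make_counter`, `token_deltas`, and the `Counter` alias are hypothetical names introduced here for illustration. The only SDK call used is `client.messages.count_tokens`, as in the examples above. Treating a missing or empty file as zero tokens without calling the endpoint is an assumption made in this sketch.

```python
from typing import Callable, Dict

# Hypothetical alias: any function mapping text to a token count.
Counter = Callable[[str], int]


def make_counter(model: str) -> Counter:
    """Build a counter backed by the count_tokens endpoint.

    Needs the anthropic package and an API key in the environment.
    """
    # Imported lazily so token_deltas can be used with a stub counter offline.
    from anthropic import Anthropic

    client = Anthropic()

    def count(text: str) -> int:
        return client.messages.count_tokens(
            model=model,
            messages=[{"role": "user", "content": text}],
        ).input_tokens

    return count


def token_deltas(
    before: Dict[str, str], after: Dict[str, str], count: Counter
) -> Dict[str, int]:
    """Return {path: tokens_after - tokens_before} for paths in either version."""
    deltas: Dict[str, int] = {}
    for path in sorted(set(before) | set(after)):
        old = before.get(path, "")
        new = after.get(path, "")
        # Assumption: a missing or empty file counts as 0 and is never sent.
        old_tokens = count(old) if old else 0
        new_tokens = count(new) if new else 0
        deltas[path] = new_tokens - old_tokens
    return deltas


# Usage (makes network calls, so it is left commented out):
# count = make_counter("claude-opus-4-8")
# print(token_deltas({"CLAUDE.md": old_text}, {"CLAUDE.md": new_text}, count))
```

Passing the counter in as an argument keeps the subtraction logic separate from the API call, so it can be exercised with a stand-in counter. Because both versions are counted the same way, any fixed per-message overhead cancels in the subtraction.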