Comprehensive Cloudflare platform skill covering Workers, D1, R2, KV, AI, Durable Objects, and security.
references/workers-ai/gotchas.md
# Workers AI Gotchas

## Critical: @cloudflare/ai is DEPRECATED

```typescript
// ❌ WRONG - Don't install @cloudflare/ai
import Ai from '@cloudflare/ai';

// ✅ CORRECT - Use the native binding
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { messages: [...] });
    return Response.json(result);
  },
};
```

## Development

### "AI inference doesn't work locally"
```bash
# ❌ Local AI doesn't work
wrangler dev
# ✅ Use remote
wrangler dev --remote
```

### "env.AI is undefined"
Add the binding to wrangler.jsonc:
```jsonc
{ "ai": { "binding": "AI" } }
```

## API Responses

### Embedding response shape varies
```typescript
// @cf/baai/bge-base-en-v1.5 returns: { data: [[0.1, 0.2, ...]] }
const embedding = response.data[0]; // Get the first embedding vector
```

### Stream returns ReadableStream
```typescript
const stream = await env.AI.run(model, { messages: [...], stream: true });
// Chunks are SSE-encoded bytes, not objects - pass the stream straight to the client
return new Response(stream, { headers: { 'content-type': 'text/event-stream' } });
```

## Rate Limits & Pricing

| Model Type | Neurons/Request |
|------------|-----------------|
| Small text (7B) | ~50-200 |
| Large text (70B) | ~500-2000 |
| Embeddings | ~5-20 |
| Image gen | ~10,000+ |

**Free tier**: 10,000 neurons/day

```typescript
// ❌ EXPENSIVE - 70B model
await env.AI.run('@cf/meta/llama-3.1-70b-instruct', ...);
// ✅ CHEAPER - Use the smallest model that works
await env.AI.run('@cf/meta/llama-3.1-8b-instruct', ...);
```

## Model-Specific

### Function calling
Only `@cf/meta/llama-3.1-*` and `mistral-7b-instruct-v0.2` support tools.

### Empty response
Check context limits (2K-8K tokens) and validate the input structure.

### Inconsistent responses
Set `temperature: 0` for deterministic outputs.

### Cold start latency
First request: 1-3s. Use AI Gateway caching for frequent prompts.

## TypeScript

```typescript
interface Env {
  AI: Ai; // From @cloudflare/workers-types
}

interface TextGenerationResponse { response: string; }
interface EmbeddingResponse { data: number[][]; shape: number[]; }
```

## Common Errors

### 7502: Model not found
Check the exact model name at developers.cloudflare.com/workers-ai/models/

### 7504: Input validation failed
```typescript
// Text generation requires a messages array
await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [{ role: 'user', content: 'Hello' }] // ✅
});

// Embeddings require text
await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: 'Hello' }); // ✅
```

## Vercel AI SDK Integration

Point the AI SDK's OpenAI provider at Cloudflare's OpenAI-compatible endpoint and pass a Workers AI model name, not an OpenAI one:

```typescript
import { createOpenAI } from '@ai-sdk/openai';

const workersai = createOpenAI({
  baseURL: 'https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/v1',
  apiKey: '<API_TOKEN>',
});
const model = workersai('@cf/meta/llama-3.1-8b-instruct');
```
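If you do need to consume a streamed Workers AI response server-side instead of forwarding it, the chunks arrive as SSE-encoded text (`data: {"response":"..."}` events terminated by `data: [DONE]`). A minimal parsing sketch - `parseSSEChunk` is a hypothetical helper name, and for simplicity it skips any JSON event that was split across chunk boundaries:

```typescript
// Extract the "response" text fragments from a decoded SSE chunk.
// Hypothetical helper; assumes events of the form: data: {"response":"..."}
function parseSSEChunk(chunk: string): string[] {
  const parts: string[] = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // Skip comments and blank lines
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') continue; // End-of-stream sentinel
    try {
      const parsed = JSON.parse(payload) as { response?: string };
      if (typeof parsed.response === 'string') parts.push(parsed.response);
    } catch {
      // Partial JSON split across chunk boundaries; a real consumer would buffer it
    }
  }
  return parts;
}
```

Pipe the `ReadableStream` through a `TextDecoderStream` first, then feed each decoded chunk to the helper and concatenate the fragments.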