Comprehensive Cloudflare platform skill covering Workers, D1, R2, KV, AI, Durable Objects, and security.
references/workers-ai/gotchas.md
# Workers AI Gotchas

## Critical: @cloudflare/ai is DEPRECATED

```typescript
// ❌ WRONG - Don't install @cloudflare/ai
import Ai from '@cloudflare/ai';

// ✅ CORRECT - Use the native binding
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const result = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', { messages: [...] });
    return Response.json(result);
  },
};
```

## Development

### "AI inference doesn't work locally"
```bash
# ❌ Local AI doesn't work
wrangler dev
# ✅ Use remote
wrangler dev --remote
```

### "env.AI is undefined"
Add the binding to wrangler.jsonc:
```jsonc
{ "ai": { "binding": "AI" } }
```

## API Responses

### Embedding response shape varies
```typescript
// @cf/baai/bge-base-en-v1.5 returns: { data: [[0.1, 0.2, ...]] }
const embedding = response.data[0]; // Get the first embedding vector
```

### Stream returns ReadableStream
```typescript
const stream = await env.AI.run(model, { messages: [...], stream: true });
// Chunks are SSE-encoded bytes, not objects - pass the stream straight to the client
return new Response(stream, { headers: { 'content-type': 'text/event-stream' } });
```

## Rate Limits & Pricing

| Model Type | Neurons/Request |
|------------|-----------------|
| Small text (7B) | ~50-200 |
| Large text (70B) | ~500-2000 |
| Embeddings | ~5-20 |
| Image gen | ~10,000+ |

**Free tier**: 10,000 neurons/day

```typescript
// ❌ EXPENSIVE - 70B model
await env.AI.run('@cf/meta/llama-3.1-70b-instruct', ...);
// ✅ CHEAPER - Use the smallest model that works
await env.AI.run('@cf/meta/llama-3.1-8b-instruct', ...);
```

## Model-Specific

### Function calling
Only `@cf/meta/llama-3.1-*` and `mistral-7b-instruct-v0.2` support tools.

### Empty response
Check context limits (2K-8K tokens) and validate the input structure.

### Inconsistent responses
Set `temperature: 0` for deterministic outputs.

### Cold start latency
First request: 1-3s. Use AI Gateway caching for frequent prompts.

## TypeScript

```typescript
interface Env {
  AI: Ai; // From @cloudflare/workers-types
}

interface TextGenerationResponse { response: string; }
interface EmbeddingResponse { data: number[][]; shape: number[]; }
```

## Common Errors

### 7502: Model not found
Check the exact model name at developers.cloudflare.com/workers-ai/models/

### 7504: Input validation failed
```typescript
// Text generation requires a messages array
await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
  messages: [{ role: 'user', content: 'Hello' }] // ✅
});

// Embeddings require text
await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: 'Hello' }); // ✅
```

## Vercel AI SDK Integration

Point the AI SDK's OpenAI provider at Cloudflare's OpenAI-compatible endpoint and pass a Workers AI model name, not an OpenAI one:

```typescript
import { createOpenAI } from '@ai-sdk/openai';

const workersai = createOpenAI({
  baseURL: 'https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai/v1',
  apiKey: '<API_TOKEN>',
});
const model = workersai('@cf/meta/llama-3.1-8b-instruct');
```
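If you do need to consume a streamed Workers AI response server-side instead of forwarding it, the chunks arrive as SSE-encoded text (`data: {"response":"..."}` events terminated by `data: [DONE]`). A minimal parsing sketch - `parseSSEChunk` is a hypothetical helper name, and for simplicity it skips any JSON event that was split across chunk boundaries:

```typescript
// Extract the "response" text fragments from a decoded SSE chunk.
// Hypothetical helper; assumes events of the form: data: {"response":"..."}
function parseSSEChunk(chunk: string): string[] {
  const parts: string[] = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // Skip comments and blank lines
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') continue; // End-of-stream sentinel
    try {
      const parsed = JSON.parse(payload) as { response?: string };
      if (typeof parsed.response === 'string') parts.push(parsed.response);
    } catch {
      // Partial JSON split across chunk boundaries; a real consumer would buffer it
    }
  }
  return parts;
}
```

Pipe the `ReadableStream` through a `TextDecoderStream` first, then feed each decoded chunk to the helper and concatenate the fragments.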