## What is Workers AI?
Workers AI lets you run AI models on Cloudflare's edge network — text generation, image classification, embeddings, speech-to-text, translation, and more. No GPU provisioning, no model hosting.
Free tier: 10,000 neurons per day, enough for roughly a hundred LLM requests.
## Quick Start

```sh
npm create cloudflare@latest my-ai-app -- --template worker-typescript
cd my-ai-app
```

Then bind Workers AI in your config:

```toml
# wrangler.toml
[ai]
binding = "AI"
```
## Text Generation (LLM)

```ts
export default {
  async fetch(request: Request, env: Env) {
    const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Explain WebAssembly in 3 sentences." },
      ],
      max_tokens: 256,
    });
    return Response.json(response);
  },
};
```
## Streaming Responses

Set `stream: true` to get a stream of server-sent events instead of waiting for the complete response:

```ts
const stream = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
  messages: [{ role: "user", content: "Write a poem about coding" }],
  stream: true,
});
return new Response(stream, {
  headers: { "Content-Type": "text/event-stream" },
});
```
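On the consumer side, each event arrives as a `data:` line carrying a JSON chunk, with a `[DONE]` sentinel at the end (the line format here follows the common SSE convention Workers AI uses; treat the exact field names as assumptions). A minimal sketch of parsing and accumulating the stream:

```ts
// Parse one SSE line into its text token.
// Expected shapes (assumed): `data: {"response":"Hello"}` and `data: [DONE]`.
function parseSSELine(line: string): string | null {
  if (!line.startsWith("data: ")) return null;
  const payload = line.slice("data: ".length).trim();
  if (payload === "[DONE]") return null;
  try {
    const parsed = JSON.parse(payload) as { response?: string };
    return parsed.response ?? null;
  } catch {
    return null; // ignore malformed or non-JSON lines
  }
}

// Accumulate all streamed tokens into one string.
async function collectStream(stream: ReadableStream<Uint8Array>): Promise<string> {
  const decoder = new TextDecoder();
  let buffer = "";
  let text = "";
  for await (const chunk of stream as any) {
    buffer += decoder.decode(chunk, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep a partial line for the next chunk
    for (const line of lines) {
      const token = parseSSELine(line);
      if (token !== null) text += token;
    }
  }
  return text;
}
```

In a browser you would drive the same parser from `response.body` of a `fetch` to your Worker.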
## Text Embeddings

```ts
const embeddings = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
  text: ["How to deploy a web app", "Best practices for CI/CD"],
});
console.log(embeddings.data[0].length); // 768 dimensions
```

Combine with Vectorize for semantic search.
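A minimal semantic-search sketch with Vectorize, assuming an index bound as `VECTORIZE` in `wrangler.toml` and the 768-dimension BGE model above (`upsert` and `query` are Vectorize binding methods; the `Env` interface here is hand-written for illustration):

```ts
// Hand-rolled binding types for this sketch; wrangler can generate real ones.
interface Env {
  AI: { run(model: string, input: unknown): Promise<any> };
  VECTORIZE: {
    upsert(vectors: { id: string; values: number[] }[]): Promise<unknown>;
    query(
      vector: number[],
      opts: { topK: number }
    ): Promise<{ matches: { id: string; score: number }[] }>;
  };
}

// Embed a batch of documents and store the vectors in the index.
async function indexDocs(env: Env, docs: { id: string; text: string }[]) {
  const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
    text: docs.map((d) => d.text),
  });
  await env.VECTORIZE.upsert(docs.map((d, i) => ({ id: d.id, values: data[i] })));
}

// Embed the query and return the closest document IDs with scores.
async function search(env: Env, query: string, topK = 3) {
  const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
  const { matches } = await env.VECTORIZE.query(data[0], { topK });
  return matches;
}
```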
## Image Classification

```ts
const imageData = await request.arrayBuffer();
const result = await env.AI.run("@cf/microsoft/resnet-50", {
  image: [...new Uint8Array(imageData)],
});
console.log(result); // [{label: "cat", score: 0.95}, ...]
```
## Image Generation

```ts
const image = await env.AI.run("@cf/stabilityai/stable-diffusion-xl-base-1.0", {
  prompt: "A futuristic city skyline at sunset, cyberpunk style",
});
return new Response(image, {
  headers: { "Content-Type": "image/png" },
});
```
## Translation

```ts
const translated = await env.AI.run("@cf/meta/m2m100-1.2b", {
  text: "Hello, how are you?",
  source_lang: "english",
  target_lang: "spanish",
});
// {translated_text: "Hola, ¿cómo estás?"}
```
## Speech-to-Text

```ts
const audioData = await request.arrayBuffer();
const transcription = await env.AI.run("@cf/openai/whisper", {
  audio: [...new Uint8Array(audioData)],
});
console.log(transcription.text);
```
## Text Summarization

```ts
const summary = await env.AI.run("@cf/facebook/bart-large-cnn", {
  input_text: longArticleText,
  max_length: 150,
});
```
## Available Models
| Category | Model | Use Case |
|---|---|---|
| LLM | Llama 3.1 8B | Chat, code gen |
| Embeddings | BGE Base | Semantic search |
| Image Gen | SDXL | Image creation |
| Vision | ResNet-50 | Image classification |
| Translation | M2M-100 | 100+ languages |
| Speech | Whisper | Audio transcription |
| Summarization | BART | Text summarization |
## REST API

Every model is also available over plain HTTP, no Worker required:

```sh
curl "https://api.cloudflare.com/client/v4/accounts/ACCOUNT_ID/ai/run/@cf/meta/llama-3.1-8b-instruct" \
  -H "Authorization: Bearer $CF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```
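The same call from a script, sketched with `fetch` (the `runUrl`/`runModel` helpers are illustrative names, not a Cloudflare SDK; the account ID and token are assumed to come from your environment):

```ts
// Build the per-model run endpoint from the account ID and model name.
function runUrl(accountId: string, model: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`;
}

// POST an input payload to a model and return the parsed JSON envelope.
async function runModel(
  accountId: string,
  token: string,
  model: string,
  input: unknown
): Promise<unknown> {
  const res = await fetch(runUrl(accountId, model), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(input),
  });
  if (!res.ok) throw new Error(`Workers AI request failed: ${res.status}`);
  return res.json();
}
```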
Need AI integration or edge computing setup?
📧 spinov001@gmail.com
🔧 My tools on Apify Store
Edge AI or centralized GPU — what's your approach?