DEV Community

Alex Spinov

Cloudflare Workers AI Has a Free API — Run LLMs at the Edge Without Paying OpenAI

You're paying OpenAI for every API call. Cloudflare Workers AI runs open models such as Llama 3 for free, within a daily allowance.

Not a toy. Not a demo. Production-ready AI inference at the edge: Llama 3, Stable Diffusion, Whisper, embeddings, 50+ models in all.

No GPU. No Docker. No infrastructure. One curl command.

What You Get for Free

Cloudflare Workers AI free tier includes:

  • 10,000 neurons/day (neurons are Cloudflare's unit of AI usage; enough for ~100-500 requests depending on model)
  • 50+ models: text generation, image generation, speech-to-text, translation, embeddings
  • Edge deployment: runs on Cloudflare's global network (300+ cities)
  • No cold starts: models are always warm
  • No credit card required
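
To see what the daily allowance means in practice, the arithmetic behind the "~100-500 requests" estimate can be sketched as follows. The per-request costs used here (20 and 100 neurons) are simply back-calculated from that estimate and are not official figures; real cost depends on the model and on input and output length.

```javascript
// Free-tier allowance quoted in the list above.
const DAILY_NEURONS = 10000;

// Rough number of requests that fit in one day's allowance.
// neuronsPerRequest is an assumption: measure it for your own model and prompts.
function requestsPerDay(neuronsPerRequest, dailyNeurons = DAILY_NEURONS) {
  if (!(neuronsPerRequest > 0)) {
    throw new RangeError("neuronsPerRequest must be a positive number");
  }
  return Math.floor(dailyNeurons / neuronsPerRequest);
}

console.log(requestsPerDay(100)); // 100 (heavier requests)
console.log(requestsPerDay(20));  // 500 (lighter requests)
```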

Quick Start: Text Generation with Llama 3

curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/meta/llama-3.1-8b-instruct \
  -H "Authorization: Bearer {API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Explain WebSockets in 2 sentences"}
    ]
  }'

Response:

{
  "result": {
    "response": "WebSockets provide a persistent, full-duplex communication channel between a client and server over a single TCP connection. Unlike HTTP, which follows a request-response pattern, WebSockets allow both parties to send data independently at any time."
  },
  "success": true
}

That's it. No SDK, no library, no setup.
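
If you would rather call the endpoint from JavaScript than from curl, a minimal sketch might look like this. It assumes a runtime with a global `fetch` (Node 18+ or a browser); the helper names `buildRunRequest` and `runModel` are made up for illustration, and the error handling relies only on the `success` and `result` fields shown in the response above.

```javascript
const API_BASE = "https://api.cloudflare.com/client/v4";

// Build the URL and fetch options for a Workers AI REST call.
function buildRunRequest(accountId, apiToken, model, input) {
  return {
    url: `${API_BASE}/accounts/${accountId}/ai/run/${model}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(input),
    },
  };
}

// Call the model and unwrap the { result, success } envelope.
// fetchImpl is injectable so the function can be exercised without network access.
async function runModel(accountId, apiToken, model, input, fetchImpl = fetch) {
  const { url, init } = buildRunRequest(accountId, apiToken, model, input);
  const res = await fetchImpl(url, init);
  const payload = await res.json();
  if (!res.ok || !payload.success) {
    throw new Error(`Workers AI request failed (HTTP ${res.status})`);
  }
  return payload.result;
}
```

Calling `runModel(accountId, apiToken, "@cf/meta/llama-3.1-8b-instruct", { messages: [...] })` with the same messages as the curl example should resolve to an object whose `response` field holds the generated text.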

Generate Images with Stable Diffusion

curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/stabilityai/stable-diffusion-xl-base-1.0 \
  -H "Authorization: Bearer {API_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{ "prompt": "a futuristic city at sunset, cyberpunk style" }' \
  --output image.png

You get a PNG back. No DALL-E subscription needed.
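
One thing to watch: `--output image.png` writes whatever the server sends back, so a failed request (a bad token, for example) can leave a JSON error body in that file instead of an image. A small check like the sketch below can catch that; `isPng` is a hypothetical helper name, and the usage lines assume Node.js.

```javascript
// The 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Return true if the given bytes begin with the PNG signature.
function isPng(bytes) {
  if (bytes.length < PNG_SIGNATURE.length) return false;
  return PNG_SIGNATURE.every((b, i) => bytes[i] === b);
}

// Node.js usage, checking the file that curl wrote:
// const fs = require("node:fs");
// console.log(isPng(fs.readFileSync("image.png")));
```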

Transcribe Audio with Whisper

curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/openai/whisper \
  -H "Authorization: Bearer {API_TOKEN}" \
  --data-binary @audio.mp3

Returns:

{
  "result": {
    "text": "Your transcribed text appears here",
    "word_count": 5,
    "words": [...]
  }
}

OpenAI's Whisper API costs $0.006/minute. On Cloudflare, transcription is free within the daily neuron allowance.

Text Embeddings for Semantic Search

curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/baai/bge-base-en-v1.5 \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{ "text": ["What is machine learning?", "How do neural networks work?"] }'

Returns 768-dimensional vectors. Store them in Cloudflare Vectorize (also has a free tier) and you have a complete semantic search system.
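
To make "semantic search" concrete, here is a small in-memory sketch of the ranking step that a vector database performs at scale: score each stored vector against the query vector by cosine similarity and sort. The `{ id, vector }` document shape and the function names are assumptions for illustration, and toy 3-dimensional vectors stand in for the 768-dimensional ones.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query, best match first.
function rankBySimilarity(queryVector, docs) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score);
}

const ranked = rankBySimilarity([1, 0, 0], [
  { id: "a", vector: [0, 1, 0] },
  { id: "b", vector: [1, 0.1, 0] },
]);
console.log(ranked[0].id); // "b"
```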

Why This Matters

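| Feature            | OpenAI              | Cloudflare Workers AI           |
|--------------------|---------------------|---------------------------------|
| Free tier          | $5 credit (expires) | 10,000 neurons/day (forever)    |
| Models             | GPT-4, DALL-E       | Llama 3, SD-XL, Whisper, 50+    |
| Latency            | ~500ms              | ~100ms (edge)                   |
| Cold starts        | Yes                 | No                              |
| Credit card        | Required            | Not required                    |
| Open source models | No                  | Yes                             |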

Available Models (Highlights)

Text Generation:

  • @cf/meta/llama-3.1-8b-instruct — best general-purpose
  • @cf/mistral/mistral-7b-instruct-v0.2 — fast and capable
  • @cf/google/gemma-7b-it — Google's open model

Image Generation:

  • @cf/stabilityai/stable-diffusion-xl-base-1.0
  • @cf/bytedance/stable-diffusion-xl-lightning

Speech:

  • @cf/openai/whisper — speech-to-text
  • @cf/openai/whisper-tiny-en — faster, English only

Embeddings:

  • @cf/baai/bge-base-en-v1.5
  • @cf/baai/bge-large-en-v1.5

Build a Complete App in 20 Lines

// worker.js — deploy with `wrangler deploy`
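// Requires a Workers AI binding named AI in your Wrangler configuration.
export default {
  async fetch(request, env) {
    // request.json() throws when there is no JSON body, so reject other methods early.
    if (request.method !== "POST") {
      return new Response("POST a JSON body with a prompt field", { status: 405 });
    }
    const { prompt } = await request.json();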

    const answer = await env.AI.run(
      "@cf/meta/llama-3.1-8b-instruct",
      { messages: [{ role: "user", content: prompt }] }
    );

    return new Response(JSON.stringify(answer), {
      headers: { "content-type": "application/json" }
    });
  }
};

That's a production AI API. Deployed globally. For free.
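
Before deploying, you can exercise the handler locally with a stand-in for the AI binding. The sketch below assumes Node 18+ (for the global `Request` and `Response`); `stubEnv` is a hypothetical fake that just echoes the prompt, so it checks the request and response plumbing only, not the model.

```javascript
// The Worker's core handler logic, held in a variable so it can be called directly.
const worker = {
  async fetch(request, env) {
    const { prompt } = await request.json();
    const answer = await env.AI.run(
      "@cf/meta/llama-3.1-8b-instruct",
      { messages: [{ role: "user", content: prompt }] }
    );
    return new Response(JSON.stringify(answer), {
      headers: { "content-type": "application/json" },
    });
  },
};

// Hypothetical stub standing in for the Workers AI binding; it echoes the prompt.
const stubEnv = {
  AI: {
    async run(model, input) {
      return { response: `stub reply to: ${input.messages[0].content}` };
    },
  },
};

// Send one fake POST through the handler and return the parsed JSON body.
async function demo() {
  const request = new Request("https://example.com/", {
    method: "POST",
    body: JSON.stringify({ prompt: "Hello" }),
  });
  const response = await worker.fetch(request, stubEnv);
  return response.json();
}

demo().then((body) => console.log(body.response)); // "stub reply to: Hello"
```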

Getting Started

  1. Create a Cloudflare account (free)
  2. Go to Workers & Pages → Workers AI
  3. Get your Account ID and API Token
  4. Start making requests

No waitlist. No approval. No credit card.


Building something with free APIs? I maintain a curated list of 130+ free web scraping tools and write about developer tools weekly. Follow for more.

Need a custom data pipeline or scraper? Email me at spinov001@gmail.com — I've built 77 production scrapers.