A developer wanted to add image generation to his app. AWS SageMaker: $400/month for a GPU instance. Replicate: pay per prediction, starting at fractions of a cent. For his 100 images/day, it cost $3/month.
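The arithmetic behind that comparison is easy to sketch. The per-image price below is not a published Replicate rate; it is only what the figures above imply, and the 30-day month is an assumption.

```javascript
// Figures from the story above; the 30-day month is an assumption.
const IMAGES_PER_DAY = 100;
const DAYS_PER_MONTH = 30;
const PAY_PER_USE_MONTHLY = 3;     // dollars, as reported
const DEDICATED_GPU_MONTHLY = 400; // dollars, as reported

// Implied price per image under pay-per-prediction billing.
function impliedPricePerImage(monthlyCost, imagesPerDay, days) {
  return monthlyCost / (imagesPerDay * days);
}

// Monthly volume at which a flat-rate instance costs the same as pay-per-use.
function breakEvenImagesPerMonth(flatMonthly, pricePerImage) {
  return flatMonthly / pricePerImage;
}

const pricePerImage = impliedPricePerImage(PAY_PER_USE_MONTHLY, IMAGES_PER_DAY, DAYS_PER_MONTH);
const breakEven = breakEvenImagesPerMonth(DEDICATED_GPU_MONTHLY, pricePerImage);
console.log(`~$${pricePerImage.toFixed(4)} per image`);
console.log(`break-even at ~${Math.round(breakEven)} images/month`);
```

Under these assumptions the flat-rate instance only pays off at roughly 400,000 images a month, far above this developer's 3,000 or so.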
What Replicate Offers
Replicate pricing and features:
- Free tier: some models are free to run
- Pay per prediction: most models cost $0.0001-$0.10 per run
- No GPUs to manage — models run on Replicate's infrastructure
- Thousands of open-source models — Stable Diffusion, Llama, Whisper, etc.
- Custom models — deploy your own with Cog
- Streaming — real-time output for LLMs
- Webhooks — async prediction notifications
Quick Start
npm install replicate
import Replicate from 'replicate';
const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });
// Generate an image with SDXL
const output = await replicate.run(
  'stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b',
  { input: { prompt: 'A serene mountain lake at sunset, photorealistic' } }
);
console.log(output); // ['https://replicate.delivery/...png']
REST API
# Create a prediction
curl -X POST 'https://api.replicate.com/v1/predictions' \
  -H 'Authorization: Bearer YOUR_API_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "version": "39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",
    "input": { "prompt": "A cat astronaut on Mars" }
  }'
# Get prediction result
curl 'https://api.replicate.com/v1/predictions/PREDICTION_ID' \
  -H 'Authorization: Bearer YOUR_API_TOKEN'
# List your predictions
curl 'https://api.replicate.com/v1/predictions' \
  -H 'Authorization: Bearer YOUR_API_TOKEN'
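A prediction created over REST does not finish immediately, so the "get prediction result" call usually has to be repeated until the status settles. A minimal polling sketch follows. `getPrediction` is a hypothetical function you supply, the interval and attempt limit are arbitrary choices, and the code assumes `succeeded`, `failed`, and `canceled` are the terminal statuses.

```javascript
// Statuses assumed to mean the prediction will not change any further.
const TERMINAL_STATUSES = new Set(['succeeded', 'failed', 'canceled']);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls getPrediction() until the returned object reaches a terminal status.
// getPrediction is a caller-supplied async function (hypothetical here) that
// returns the parsed JSON body of GET /v1/predictions/PREDICTION_ID.
async function waitForPrediction(getPrediction, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const prediction = await getPrediction();
    if (TERMINAL_STATUSES.has(prediction.status)) {
      return prediction;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs); // wait before asking again
    }
  }
  throw new Error(`Prediction still running after ${maxAttempts} attempts`);
}
```

On Node 18 or later, `getPrediction` could be something like `() => fetch(url, { headers }).then((r) => r.json())`, with `url` and `headers` matching the curl call above.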
Common Models
// Note: official models can be run by name alone; community models may need
// an explicit version, written as 'owner/name:version'.
// Image generation (SDXL)
const image = await replicate.run('stability-ai/sdxl', {
  input: { prompt: 'A futuristic city', width: 1024, height: 1024 }
});
// Speech to text (Whisper)
const transcript = await replicate.run('openai/whisper', {
  input: { audio: 'https://example.com/audio.mp3', model: 'large-v3' }
});
// Text generation (Llama)
const text = await replicate.run('meta/meta-llama-3-70b-instruct', {
  input: {
    prompt: 'Explain quantum computing in simple terms',
    max_tokens: 500
  }
});
// Image upscaling
const upscaled = await replicate.run('nightmareai/real-esrgan', {
  input: { image: 'https://example.com/low-res.jpg', scale: 4 }
});
// Remove background
const result = await replicate.run('cjwbw/rembg', {
  input: { image: 'https://example.com/photo.jpg' }
});
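The snippets so far use two identifier shapes: a bare `owner/name` (as in this section) and `owner/name:version` (as in Quick Start), while the curl example passes only the version hash. A small helper can split an identifier so the right piece goes to each call. This is a sketch that assumes those are the only shapes you need to handle; `parseModelRef` is a hypothetical name, not part of the Replicate client.

```javascript
// Splits 'owner/name' or 'owner/name:version' into its parts.
// Hypothetical helper; not part of the replicate package.
function parseModelRef(ref) {
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(ref);
  if (!match) {
    throw new Error(`Expected 'owner/name' or 'owner/name:version', got '${ref}'`);
  }
  const [, owner, name, version] = match;
  // version is null when the identifier has no ':version' suffix
  return { owner, name, version: version ?? null };
}

console.log(parseModelRef('meta/meta-llama-3-70b-instruct'));
console.log(
  parseModelRef('stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b').version
);
```

Failing loudly on a malformed identifier tends to be easier to debug than a 404 from the API later on.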
Streaming (LLMs)
// Stream tokens as they are generated
for await (const event of replicate.stream('meta/meta-llama-3-70b-instruct', {
  input: { prompt: 'Write a poem about coding' }
})) {
  // Print only token events; skip control events such as 'done'
  if (event.event === 'output') process.stdout.write(event.data);
}
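When the full text is needed afterwards, for example to store or post-process it, the events can be accumulated instead of printed. This sketch assumes each event exposes `event` and `data` fields and that token events are named `'output'`. `collectStream` is a hypothetical helper that accepts any async iterable, so it can be tried without calling the API.

```javascript
// Concatenates the data of 'output' events from any async iterable of
// { event, data } objects and returns the full text.
async function collectStream(events, onToken = () => {}) {
  let text = '';
  for await (const { event, data } of events) {
    if (event !== 'output') continue; // skip control events such as 'done'
    text += data;
    onToken(data); // optional hook, e.g. to forward tokens to a client
  }
  return text;
}
```

With the client from Quick Start, this would be called as `await collectStream(replicate.stream(model, { input }))`.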
Webhooks (Async)
// Start prediction with webhook callback
await replicate.predictions.create({
  version: 'stability-ai/sdxl:...',
  input: { prompt: 'A beautiful sunset' },
  webhook: 'https://yourapp.com/api/replicate-webhook',
  webhook_events_filter: ['completed']
});
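// Handle webhook (Express)
import express from 'express';
const app = express();
app.use(express.json()); // parse the JSON payload into req.body
app.post('/api/replicate-webhook', (req, res) => {
  const { output, status } = req.body;
  if (status === 'succeeded') {
    saveImage(output[0]); // saveImage is your own storage helper (not shown)
  }
  res.sendStatus(200);
});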
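Webhook deliveries may arrive more than once (for example after a retry) and can report failures as well as successes, so it can help to keep the payload handling in a small function that is testable apart from Express. The sketch below assumes the payload carries `id`, `status`, `output`, and `error` fields. `handlePredictionEvent` and the in-memory `seen` set are hypothetical; a real app would more likely track processed IDs in a database.

```javascript
// Decides what to do with one webhook payload.
// seen: Set of prediction IDs already processed (in-memory here for brevity).
// save: caller-supplied function that stores one output item.
function handlePredictionEvent(payload, seen, save) {
  const { id, status, output, error } = payload;
  if (seen.has(id)) return 'duplicate';
  if (status === 'succeeded') {
    // Output may be a single value or a list, depending on the model.
    const items = Array.isArray(output) ? output : [output];
    items.forEach((item) => save(item));
    seen.add(id);
    return 'saved';
  }
  if (status === 'failed' || status === 'canceled') {
    seen.add(id);
    console.error(`Prediction ${id} ended as ${status}: ${error ?? 'no error message'}`);
    return status;
  }
  return 'ignored'; // still starting or processing
}
```

Inside the route handler this would be called as `handlePredictionEvent(req.body, seen, saveImage)` before sending the 200 response.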
Deploy Custom Models
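# cog.yaml + predict.py: deploy any model
# predict.py
from cog import BasePredictor, Input

class Predictor(BasePredictor):
    def setup(self):
        # Runs once at container start; load_my_model() is a placeholder for your own loader
        self.model = load_my_model()

    def predict(self, text: str = Input(description="Input text")) -> str:
        # Runs for each prediction request
        return self.model.generate(text)

# Shell: build and push the model to Replicate
cog push r8.im/your-username/your-model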