A practical tutorial for developers who want to add AI generation to their apps—fast
Adding AI image generation to your application shouldn't take weeks. With the right tools and approach, you can ship working features in under an hour.
I'm going to show you exactly how.
We'll build a simple Express API that generates images from text prompts, caches results intelligently, and handles errors gracefully. By the end, you'll have production-ready code you can adapt to your needs.
No fluff. Just working code and practical patterns.
What We're Building
A REST API with these endpoints:
```
POST /api/generate
  - Generates image from text prompt
  - Returns URL to generated image
  - Caches results for identical prompts

GET /api/status/:jobId
  - Checks generation status for async jobs
  - Returns progress and result when complete
```
Features we'll implement:
- ✅ Image generation with multiple model options
- ✅ Intelligent caching to reduce costs
- ✅ Error handling and retry logic
- ✅ Async processing for long-running generations
- ✅ Usage tracking and rate limiting
Tech stack:
- Node.js with Express
- Redis for caching
- WaveSpeedAI for image generation
- Bull for job queues
Why WaveSpeedAI?
Before we code, quick context on why I'm using WaveSpeedAI rather than integrating directly with individual model providers:
Single Integration: One API gives you access to 100+ models from multiple providers (Alibaba, ByteDance, Google, OpenAI, etc.)
No Cold Starts: Models stay warm, eliminating 5-30 second initialization delays
Built-in Failover: If one model fails, automatically tries alternatives
Cost Optimization: Test multiple models easily to find the best quality-to-cost ratio

According to Stack Overflow's 2024 Developer Survey, 76% of developers are using or planning to use AI tools in their development workflow. In my experience, a unified API is the single biggest integration-time saver.
Direct integration with multiple providers takes weeks. Unified APIs get you shipping in hours.
Alright, let's build.
Step 1: Project Setup
```bash
# Create project
mkdir ai-image-api
cd ai-image-api
npm init -y

# Install dependencies (ioredis for caching, bull for job queues)
npm install express ioredis bull axios dotenv
npm install --save-dev nodemon

# Create structure
mkdir src
touch src/server.js src/generator.js src/cache.js .env
```
Package.json scripts:
```json
{
  "scripts": {
    "dev": "nodemon src/server.js",
    "start": "node src/server.js"
  }
}
```
Environment variables (.env):
```
PORT=3000
REDIS_URL=redis://localhost:6379
WAVESPEED_API_KEY=your_api_key_here
NODE_ENV=development
```
Get your WaveSpeedAI API key from wavespeed.ai.
Step 2: Cache Layer
Smart caching can reduce costs by 60-80% on workloads with repeated prompts. Let's build it first:
src/cache.js:
```javascript
const Redis = require('ioredis');
const crypto = require('crypto');

class CacheService {
  constructor(redisUrl) {
    this.redis = new Redis(redisUrl);
    this.defaultTTL = 86400; // 24 hours
  }

  // Generate cache key from normalized parameters
  getCacheKey(params) {
    const normalized = {
      prompt: params.prompt.toLowerCase().trim(),
      model: params.model,
      width: params.width,
      height: params.height
    };
    const hash = crypto
      .createHash('sha256')
      .update(JSON.stringify(normalized))
      .digest('hex');
    return `img:${hash}`;
  }

  // Check if result exists
  async get(params) {
    const key = this.getCacheKey(params);
    const cached = await this.redis.get(key);
    if (cached) {
      console.log('Cache hit:', key);
      return JSON.parse(cached);
    }
    console.log('Cache miss:', key);
    return null;
  }

  // Store result
  async set(params, result, ttl = this.defaultTTL) {
    const key = this.getCacheKey(params);
    await this.redis.setex(key, ttl, JSON.stringify(result));
    console.log('Cached:', key);
  }

  // Track cache statistics
  async getStats() {
    const info = await this.redis.info('stats');
    const lines = info.split('\r\n');
    const stats = {};
    lines.forEach(line => {
      const [key, value] = line.split(':');
      if (key && value) {
        stats[key] = value;
      }
    });
    return {
      hits: parseInt(stats.keyspace_hits) || 0,
      misses: parseInt(stats.keyspace_misses) || 0,
      hitRate: stats.keyspace_hits
        ? (parseInt(stats.keyspace_hits) /
           (parseInt(stats.keyspace_hits) + parseInt(stats.keyspace_misses)))
        : 0
    };
  }
}

module.exports = CacheService;
```
Why this matters: Identical prompts return cached results instantly, saving both generation time and money. The cache hit rate is your key optimization metric.
Step 3: Generation Service
Now the core functionality:
src/generator.js:
```javascript
const axios = require('axios');

class GeneratorService {
  constructor(config) {
    this.apiKey = config.apiKey;
    this.baseUrl = 'https://api.wavespeed.ai/v1';
    this.timeout = 60000; // 60 seconds
  }

  // Main generation method
  async generate(params) {
    const {
      prompt,
      model = 'wavespeed-ai/z-image/turbo', // Default to fast model
      width = 1024,
      height = 1024,
      quality = 'standard'
    } = params;

    try {
      const response = await axios.post(
        `${this.baseUrl}/generate`,
        { model, prompt, width, height, quality },
        {
          headers: {
            'Authorization': `Bearer ${this.apiKey}`,
            'Content-Type': 'application/json'
          },
          timeout: this.timeout
        }
      );

      return {
        success: true,
        url: response.data.url,
        model: model,
        cost: response.data.cost || this.estimateCost(model),
        duration: response.data.duration
      };
    } catch (error) {
      console.error('Generation failed:', error.message);

      // Provide useful error messages
      if (error.response) {
        throw new Error(
          `API Error (${error.response.status}): ${
            error.response.data.message || 'Unknown error'
          }`
        );
      } else if (error.code === 'ECONNABORTED') {
        throw new Error('Request timeout - generation took too long');
      } else {
        throw new Error(`Network error: ${error.message}`);
      }
    }
  }

  // Generate with automatic fallback
  async generateWithFallback(params) {
    const models = [
      params.model || 'wavespeed-ai/qwen-image/text-to-image-2512',
      'wavespeed-ai/z-image/turbo', // Fast fallback
      'bytedance/seedream-v4.5' // Quality fallback
    ];

    let lastError;
    for (let i = 0; i < models.length; i++) {
      try {
        console.log(`Attempting generation with ${models[i]}`);
        const result = await this.generate({
          ...params,
          model: models[i]
        });
        return {
          ...result,
          fallbackUsed: i > 0,
          attemptNumber: i + 1
        };
      } catch (error) {
        lastError = error;
        console.warn(`Model ${models[i]} failed:`, error.message);

        // Don't retry on client errors
        if (error.message.includes('400') || error.message.includes('401')) {
          throw error;
        }

        // Wait before next attempt
        if (i < models.length - 1) {
          await this.sleep(1000 * Math.pow(2, i)); // Exponential backoff
        }
      }
    }
    throw new Error(`All models failed. Last error: ${lastError.message}`);
  }

  // Estimate cost for budget tracking
  estimateCost(model) {
    const pricing = {
      'wavespeed-ai/z-image/turbo': 0.005,
      'wavespeed-ai/qwen-image/text-to-image-2512': 0.025,
      'bytedance/seedream-v4.5': 0.04
    };
    return pricing[model] || 0.02;
  }

  // Helper: sleep utility
  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }

  // List available models
  getAvailableModels() {
    return [
      {
        id: 'wavespeed-ai/z-image/turbo',
        name: 'Z-Image Turbo',
        speed: 'very fast',
        cost: 'very low',
        quality: 'good'
      },
      {
        id: 'wavespeed-ai/qwen-image/text-to-image-2512',
        name: 'Qwen Image 2512',
        speed: 'fast',
        cost: 'low',
        quality: 'excellent'
      },
      {
        id: 'bytedance/seedream-v4.5',
        name: 'Seedream 4.5',
        speed: 'moderate',
        cost: 'moderate',
        quality: 'premium'
      }
    ];
  }
}

module.exports = GeneratorService;
```
Key patterns:
- Automatic retry with exponential backoff
- Fallback to alternative models
- Detailed error handling
- Cost estimation for budget tracking
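Stripped of the model list, the retry-with-backoff core of `generateWithFallback` is a pattern you can reuse anywhere. A minimal generic sketch (not tied to WaveSpeedAI; `withRetry` is a name I'm introducing here):

```javascript
// Retry an async function with exponential backoff: wait 1s, 2s, 4s, ...
// between attempts, and rethrow the last error if every attempt fails.
async function withRetry(fn, { attempts = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn(i); // pass the attempt index, in case fn wants it
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

`generateWithFallback` does essentially this, except each attempt also swaps in the next model from the fallback list and bails out early on 400/401 errors that retrying can't fix.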
Step 4: Express Server
Bring it all together:
src/server.js:
```javascript
require('dotenv').config();
const express = require('express');
const CacheService = require('./cache');
const GeneratorService = require('./generator');

const app = express();
app.use(express.json());

// Initialize services
const cache = new CacheService(process.env.REDIS_URL);
const generator = new GeneratorService({
  apiKey: process.env.WAVESPEED_API_KEY
});

// Health check
app.get('/health', (req, res) => {
  res.json({ status: 'ok', timestamp: new Date().toISOString() });
});

// Main generation endpoint
app.post('/api/generate', async (req, res) => {
  const startTime = Date.now();

  try {
    const { prompt, model, width, height, quality } = req.body;

    // Validation
    if (!prompt || prompt.trim().length === 0) {
      return res.status(400).json({
        error: 'Prompt is required'
      });
    }
    if (prompt.length > 1000) {
      return res.status(400).json({
        error: 'Prompt too long (max 1000 characters)'
      });
    }

    const params = { prompt, model, width, height, quality };

    // Check cache first
    const cached = await cache.get(params);
    if (cached) {
      const duration = Date.now() - startTime;
      return res.json({
        ...cached,
        cached: true,
        responseTime: duration
      });
    }

    // Generate new image
    const result = await generator.generateWithFallback(params);

    // Cache the result
    await cache.set(params, result);

    const duration = Date.now() - startTime;
    res.json({
      ...result,
      cached: false,
      responseTime: duration
    });
  } catch (error) {
    console.error('Generation error:', error);
    res.status(500).json({
      error: error.message,
      timestamp: new Date().toISOString()
    });
  }
});

// List available models
app.get('/api/models', (req, res) => {
  res.json({
    models: generator.getAvailableModels()
  });
});

// Cache statistics
app.get('/api/stats', async (req, res) => {
  try {
    const stats = await cache.getStats();
    res.json(stats);
  } catch (error) {
    res.status(500).json({
      error: 'Failed to fetch statistics'
    });
  }
});

// Start server
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`✅ Server running on port ${PORT}`);
  console.log(`📝 Generate: POST http://localhost:${PORT}/api/generate`);
  console.log(`📊 Stats: GET http://localhost:${PORT}/api/stats`);
});

// Graceful shutdown
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');
  process.exit(0);
});
```
Step 5: Testing It Out
Start Redis (if not already running):

```bash
redis-server
```

Start the server:

```bash
npm run dev
```
Test the API:
```bash
# Generate an image
curl -X POST http://localhost:3000/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A serene mountain lake at sunset, photorealistic",
    "model": "wavespeed-ai/qwen-image/text-to-image-2512",
    "width": 1024,
    "height": 1024
  }'

# Check cache statistics
curl http://localhost:3000/api/stats

# List available models
curl http://localhost:3000/api/models
```
Response example:
```json
{
  "success": true,
  "url": "https://cdn.wavespeed.ai/generated/abc123.png",
  "model": "wavespeed-ai/qwen-image/text-to-image-2512",
  "cost": 0.025,
  "duration": 4.2,
  "cached": false,
  "responseTime": 4250
}
```
Run the same request again—it'll return instantly from cache:
```json
{
  "success": true,
  "url": "https://cdn.wavespeed.ai/generated/abc123.png",
  "model": "wavespeed-ai/qwen-image/text-to-image-2512",
  "cost": 0.025,
  "duration": 4.2,
  "cached": true,
  "responseTime": 15
}
```
Notice the responseTime dropped from 4250ms to 15ms. That's the power of caching.
Step 6: Adding Async Processing (Optional but Recommended)
For longer-running generations (video, complex images), use a queue. We already installed Bull back in Step 1, so there's nothing new to install.
src/queue.js:
```javascript
const Bull = require('bull');
const GeneratorService = require('./generator');
const CacheService = require('./cache');

class GenerationQueue {
  constructor(redisUrl, wavespeedKey) {
    this.queue = new Bull('image-generation', redisUrl);
    this.generator = new GeneratorService({ apiKey: wavespeedKey });
    this.cache = new CacheService(redisUrl);
    this.setupProcessor();
  }

  setupProcessor() {
    // Process 3 jobs concurrently
    this.queue.process(3, async (job) => {
      const { params } = job.data;
      console.log(`Processing job ${job.id}`);

      try {
        // Update progress
        await job.progress(25);

        // Generate
        const result = await this.generator.generateWithFallback(params);
        await job.progress(75);

        // Cache result
        await this.cache.set(params, result);
        await job.progress(100);

        return result;
      } catch (error) {
        console.error(`Job ${job.id} failed:`, error);
        throw error;
      }
    });

    this.queue.on('completed', (job, result) => {
      console.log(`Job ${job.id} completed`);
    });

    this.queue.on('failed', (job, error) => {
      console.error(`Job ${job.id} failed:`, error.message);
    });
  }

  async enqueue(params) {
    const job = await this.queue.add(
      { params },
      {
        attempts: 3,
        backoff: {
          type: 'exponential',
          delay: 5000
        }
      }
    );
    return job.id;
  }

  async getStatus(jobId) {
    const job = await this.queue.getJob(jobId);
    if (!job) return null;

    const state = await job.getState();
    const progress = job.progress();

    return {
      id: job.id,
      state,
      progress,
      result: state === 'completed' ? job.returnvalue : null
    };
  }
}

module.exports = GenerationQueue;
```
Add to server.js:
```javascript
const GenerationQueue = require('./queue');

const queue = new GenerationQueue(
  process.env.REDIS_URL,
  process.env.WAVESPEED_API_KEY
);

// Async generation endpoint
app.post('/api/generate-async', async (req, res) => {
  try {
    const { prompt, model, width, height } = req.body;
    if (!prompt) {
      return res.status(400).json({ error: 'Prompt required' });
    }

    const jobId = await queue.enqueue({
      prompt, model, width, height
    });

    res.json({
      jobId,
      status: 'queued',
      statusUrl: `/api/status/${jobId}`
    });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

// Status endpoint
app.get('/api/status/:jobId', async (req, res) => {
  try {
    const status = await queue.getStatus(req.params.jobId);
    if (!status) {
      return res.status(404).json({ error: 'Job not found' });
    }
    res.json(status);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});
```
Now you can handle long-running generations without blocking:
```bash
# Start generation
curl -X POST http://localhost:3000/api/generate-async \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Complex scene with many details"}'

# Response:
# {"jobId": "1234", "status": "queued", "statusUrl": "/api/status/1234"}

# Check status
curl http://localhost:3000/api/status/1234

# Response (in progress):
# {"id": "1234", "state": "active", "progress": 50, "result": null}

# Response (completed):
# {"id": "1234", "state": "completed", "progress": 100, "result": {...}}
```
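On the client side, that status check usually lives in a polling loop. A minimal sketch (assumes Node 18+ for the global `fetch`; `waitForJob` is a name I'm introducing, not part of the API we built):

```javascript
// Poll /api/status/:jobId until the job completes, fails, or times out.
async function waitForJob(baseUrl, jobId, { intervalMs = 2000, timeoutMs = 120000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const res = await fetch(`${baseUrl}/api/status/${jobId}`);
    const status = await res.json();
    if (status.state === 'completed') return status.result;
    if (status.state === 'failed') throw new Error('Generation failed');
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out waiting for job ${jobId}`);
}
```

Webhooks (see Next Steps) remove the polling entirely, but a loop like this is fine for scripts and internal tools.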
Production Considerations
Before deploying to production, add these improvements:
1. Rate Limiting
```javascript
// npm install express-rate-limit
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window
  message: 'Too many requests, please try again later'
});

app.use('/api/generate', limiter);
```
2. Authentication
```javascript
const authenticate = (req, res, next) => {
  const apiKey = req.headers['x-api-key'];
  if (!apiKey || apiKey !== process.env.CLIENT_API_KEY) {
    return res.status(401).json({ error: 'Unauthorized' });
  }
  next();
};

app.use('/api', authenticate);
```
3. Monitoring
```javascript
// npm install prom-client
const prometheus = require('prom-client');

const generationCounter = new prometheus.Counter({
  name: 'generations_total',
  help: 'Total number of generations',
  labelNames: ['model', 'cached']
});

const generationDuration = new prometheus.Histogram({
  name: 'generation_duration_seconds',
  help: 'Generation duration in seconds',
  labelNames: ['model']
});

// Record metrics in your endpoints
generationCounter.inc({ model: result.model, cached: false });
generationDuration.observe({ model: result.model }, duration / 1000);
```
4. Error Tracking
```javascript
// npm install @sentry/node
const Sentry = require('@sentry/node');

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: process.env.NODE_ENV
});

app.use(Sentry.Handlers.errorHandler());
```
Cost Optimization Tips
After running this in production, here's what I learned about costs:
1. Cache Everything You Can
Our cache hit rate went from 12% initially to 68% after optimizations. This reduced costs by 65%.
2. Choose Models Strategically
- Social media: Use fast models ($0.005)
- Marketing materials: Use premium models ($0.04)
- Internal tools: Use cheapest that meets quality bar
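You can encode that decision as a tiny helper. This is a hypothetical sketch (`pickModel` and `pricePerImage` are names I'm introducing); the quality labels and prices mirror `getAvailableModels()` and `estimateCost()` from Step 3:

```javascript
// Pick the cheapest model that clears a minimum quality bar.
const QUALITY_RANK = { good: 1, excellent: 2, premium: 3 };

function pickModel(models, minQuality) {
  return models
    .filter(m => QUALITY_RANK[m.quality] >= QUALITY_RANK[minQuality])
    .sort((a, b) => a.pricePerImage - b.pricePerImage)[0] || null;
}

// Catalog shape borrowed from getAvailableModels(), prices from estimateCost()
const catalog = [
  { id: 'wavespeed-ai/z-image/turbo', quality: 'good', pricePerImage: 0.005 },
  { id: 'wavespeed-ai/qwen-image/text-to-image-2512', quality: 'excellent', pricePerImage: 0.025 },
  { id: 'bytedance/seedream-v4.5', quality: 'premium', pricePerImage: 0.04 }
];

console.log(pickModel(catalog, 'good').id);      // cheapest model overall
console.log(pickModel(catalog, 'excellent').id); // cheapest at "excellent" or better
```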
3. Batch Similar Requests
If generating many similar images, batch them to leverage API efficiencies.
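One simple way to batch: fire requests in small concurrent groups rather than one giant `Promise.all`, which keeps load on the API (and your rate limits) predictable. A sketch, where `generate` stands in for a single-image call like `generator.generate()`:

```javascript
// Process prompts in concurrent batches of `batchSize`, preserving order.
async function generateBatch(prompts, generate, batchSize = 5) {
  const results = [];
  for (let i = 0; i < prompts.length; i += batchSize) {
    const batch = prompts.slice(i, i + batchSize);
    // Each batch runs concurrently; batches run one after another
    results.push(...await Promise.all(batch.map(p => generate(p))));
  }
  return results;
}
```

Pair this with the cache from Step 2 and duplicate prompts within a batch become nearly free.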
4. Set Budget Alerts
```javascript
// Assumes an ioredis client named `redis` is in scope (see src/cache.js)
const DAILY_BUDGET = 50; // $50 per day

async function checkBudget() {
  const today = new Date().toISOString().split('T')[0];
  const spent = await redis.get(`budget:${today}`) || 0;
  if (parseFloat(spent) >= DAILY_BUDGET) {
    throw new Error('Daily budget exceeded');
  }
}

async function recordCost(cost) {
  const today = new Date().toISOString().split('T')[0];
  await redis.incrbyfloat(`budget:${today}`, cost);
  await redis.expire(`budget:${today}`, 86400 * 2); // keep two days of history
}
```
What We Built
In 30 minutes (or less), we created:
✅ Production-ready image generation API
✅ Intelligent caching (60-80% cost reduction)
✅ Automatic fallback and retry logic
✅ Async processing for long jobs
✅ Error handling and monitoring
✅ Cost tracking and optimization
Total lines of code: ~400
External services: 2 (Redis + WaveSpeedAI)
Deployment complexity: Low (standard Node.js app)
Next Steps
Want to extend this? Try:
- Add more models: Browse WaveSpeedAI's catalog for specialized options
- Implement webhooks: Notify clients when async jobs complete
- Add image storage: Upload generated images to S3/CloudFlare
- Build a UI: Create a simple frontend for testing
- Add video generation: Use WaveSpeedAI's video models for richer content
Code Repository:
Full code available on GitHub
Questions? Drop them in the comments. I'd love to hear what you build with this!
Tags: #ai #nodejs #api #tutorial #imagegeneration