When I launched a free, unlimited AI image generator to the public, the first performance problem wasn't the AI inference. It was image delivery.
Generated images were arriving fine. Getting them to the user's browser efficiently was the bottleneck. Here's what I found and how I fixed it.
The Problem
Users generate an image. The model produces it. Then what?
The naive approach: return the image as a base64 string directly in the API response.
// Naive approach — works but not optimal
const imageBuffer = await runInference(prompt);
const base64 = imageBuffer.toString('base64');
return res.json({ image: `data:image/png;base64,${base64}` });
Problems with this:
- Large payload size (base64 encoding adds ~33% on top of the PNG bytes)
- No caching
- No CDN delivery
- Browser has to decode base64 before rendering
For a tool where image quality perception matters, this showed.
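The ~33% figure follows from how base64 works: every 3 input bytes become 4 output characters. A quick sketch to check this in Node, using an 800 KB zero-filled buffer as a stand-in for a generated PNG (the size is illustrative and `base64Length` is a made-up helper name):

```javascript
// Length of the base64 text for a binary payload of `byteLength` bytes.
// Base64 emits 4 characters per 3 input bytes and pads the final group.
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

const pngBytes = 800 * 1024; // illustrative size, not a measurement
const encoded = Buffer.alloc(pngBytes).toString('base64');

// Compare the formula with what Node's encoder actually produced.
console.log(encoded.length === base64Length(pngBytes));
// Ratio of encoded size to raw size.
console.log((encoded.length / pngBytes).toFixed(2));
```

The JSON wrapper and the `data:` prefix add a little more on top, and none of it can be cached separately from the API response.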
The Next.js Image Component Constraint
The Next.js <Image> component handles optimization automatically, but only for local images it can measure at build time or for remote images served from hosts you have explicitly allowed.
For dynamically generated images from external sources, you need to configure remotePatterns:
// next.config.js
const nextConfig = {
  images: {
    remotePatterns: [
      {
        protocol: 'https',
        hostname: 'res.cloudinary.com',
        pathname: '/your-cloud-name/**',
      },
    ],
  },
};

// Export the config so Next.js picks it up
module.exports = nextConfig;
Without this, <Image> throws an error for any remote URL whose host isn't configured.
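Since the error only surfaces at render time, it can help to check a URL before handing it to <Image>. The guard below is a simplified sketch that mirrors the single pattern in the config; it is not how Next.js matches patterns internally, and `isConfiguredImageUrl` and `ALLOWED` are hypothetical names:

```javascript
// Hypothetical guard mirroring the remotePatterns entry in next.config.js.
// A simplified prefix check, not Next.js's own matcher.
const ALLOWED = {
  protocol: 'https:',
  hostname: 'res.cloudinary.com',
  pathPrefix: '/your-cloud-name/', // placeholder, same as in the config
};

function isConfiguredImageUrl(src) {
  let url;
  try {
    url = new URL(src);
  } catch {
    return false; // relative or malformed strings are not remote image URLs
  }
  return (
    url.protocol === ALLOWED.protocol &&
    url.hostname === ALLOWED.hostname &&
    url.pathname.startsWith(ALLOWED.pathPrefix)
  );
}
```

A component could fall back to a plain placeholder when this returns false instead of letting the render fail.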
The Cloudinary Layer
I added Cloudinary as an intermediary between inference output and browser delivery.
The flow:
Inference API → Raw image → Upload to Cloudinary →
Cloudinary URL returned → Next.js <Image> renders →
Cloudinary CDN delivers WebP to browser
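The server side of that flow can be sketched as one small function. To keep it runnable without any SDK, the two external steps are passed in: `runInference` is the name from the naive example, while `uploadImage` and `generateImageUrl` are hypothetical. The sketch assumes the upload step resolves to an object with a `secure_url` field, as Cloudinary upload responses do:

```javascript
// Sketch of the server-side flow with both external steps injected.
// `uploadImage` is a hypothetical wrapper around the Cloudinary upload call.
async function generateImageUrl(prompt, { runInference, uploadImage }) {
  const imageBuffer = await runInference(prompt); // raw image bytes
  const { secure_url } = await uploadImage(imageBuffer); // hosted copy
  return secure_url; // the API response carries a URL, not the image itself
}
```

The route handler then returns something like `res.json({ imageUrl })`, which keeps the response tiny and lets the browser fetch the image from the CDN.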
Key Cloudinary transformations applied automatically:
// Cloudinary URL with transformations
const optimizedUrl = cloudinary.url(publicId, {
  secure: true, // https URL, to match remotePatterns
  fetch_format: 'auto', // Best format the browser supports (WebP where available)
  quality: 'auto', // Automatic quality optimization
  width: 1024,
  crop: 'limit', // Scale down to fit the width, never upscale
});
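It can help to see where those options end up. Cloudinary encodes transformations as a comma-separated segment in the delivery URL, so a hand-built equivalent looks roughly like the sketch below. `buildDeliveryUrl` is a hypothetical helper for illustration; the SDK's `cloudinary.url()` remains the right tool, and its output may order parameters differently or include a version segment:

```javascript
// Hand-built delivery URL, to show where the transformations end up.
function buildDeliveryUrl(cloudName, publicId) {
  const transformations = [
    'c_limit', // crop: 'limit' (fit within the width, never upscale)
    'w_1024', // width: 1024
    'q_auto', // quality: 'auto'
    'f_auto', // fetch_format: 'auto'
  ].join(',');
  return `https://res.cloudinary.com/${cloudName}/image/upload/${transformations}/${publicId}`;
}

console.log(buildDeliveryUrl('your-cloud-name', 'generated/abc123'));
```

Because the transformations are part of the URL, each distinct combination is cached as its own object at the edge.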
Results:
- PNG → WebP: ~40-60% file size reduction
- CDN delivery: edge nodes worldwide
- Caching: repeated requests served instantly
Handling Temporary vs Persistent Storage
Not all generated images need permanent storage. For a tool where prompts aren't saved:
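// Temporary upload: tagged so a cleanup job can find and delete it later
// (Cloudinary does not expire uploads on its own)
const dataUri = `data:image/png;base64,${imageBuffer.toString('base64')}`;
const uploadResult = await cloudinary.uploader.upload(dataUri, {
  folder: 'generated',
  resource_type: 'image',
  invalidate: true, // Only matters when an existing public ID is overwritten
  tags: ['temp', 'generated'], // Marks the asset for scheduled deletion
});
This keeps storage costs minimal, provided a scheduled job deletes the tagged assets once the delivery window (about an hour) has passed: images exist long enough to be downloaded, then get removed.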
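The deletion itself is not part of the upload call. A minimal sketch of a scheduled cleanup follows, assuming the Admin API methods `resources_by_tag` and `delete_resources` from the Cloudinary Node SDK, and that each listed resource carries `public_id` and `created_at`. The `api` object is injected so the logic can be exercised without credentials; `selectExpired` and `cleanupTempImages` are hypothetical names:

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Pick the public IDs of assets older than maxAgeMs.
// Each resource is expected to carry `public_id` and an ISO `created_at`.
function selectExpired(resources, now, maxAgeMs = ONE_HOUR_MS) {
  return resources
    .filter((r) => now - Date.parse(r.created_at) > maxAgeMs)
    .map((r) => r.public_id);
}

// `api` is injected; in a real job it would be cloudinary.api from the SDK.
async function cleanupTempImages(api, now = Date.now()) {
  // One page of up to 100 tagged assets per run.
  const { resources } = await api.resources_by_tag('temp', { max_results: 100 });
  const expired = selectExpired(resources, now);
  if (expired.length > 0) {
    await api.delete_resources(expired);
  }
  return expired;
}
```

This handles one page per run and ignores pagination, which is fine for a job that fires every few minutes but would need cursor handling at higher volume.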
LCP Impact
Largest Contentful Paint is the metric that matters most for perceived performance on a generation tool. The generated image IS the LCP element for most users.
Changes that improved LCP:
1. Preconnect to Cloudinary
<link rel="preconnect" href="https://res.cloudinary.com" />
2. Priority on the generated image
<Image
  src={generatedImageUrl}
  alt={altText}
  width={1024}
  height={1024}
  priority // Tells Next.js to preload this image
  sizes="(max-width: 768px) 100vw, 50vw"
/>
3. Skeleton loading state
Show a loading placeholder with the exact dimensions of the output image. Prevents layout shift when the image loads.
{isGenerating ? (
  <div className="w-full aspect-square bg-neutral-100 dark:bg-neutral-800 rounded-2xl animate-pulse" />
) : (
  <Image src={imageUrl} ... />
)}
The Aspect Ratio Problem
Users select different aspect ratios — 1:1, 16:9, 9:16, 4:3. The image container needs to match the selected ratio before the image loads, otherwise there's a layout shift when it arrives.
const aspectClasses = {
  '1:1': 'aspect-square',
  '16:9': 'aspect-video',
  '9:16': 'aspect-[9/16]',
  '4:3': 'aspect-[4/3]',
};
<div className={`w-full ${aspectClasses[selectedRatio]}`}>
  {isGenerating ? <Skeleton /> : <Image ... />}
</div>
Tailwind's arbitrary aspect ratio syntax handles the non-standard ratios cleanly.
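The container class is only half of it: <Image> also needs `width` and `height` props that agree with the selected ratio. One way to derive them is sketched below, assuming the longer edge is capped at 1024 pixels (the width used earlier). `dimensionsFor` is a hypothetical helper:

```javascript
// Hypothetical helper: derive width/height props from the selected ratio,
// keeping the longer edge at `longEdge` pixels.
function dimensionsFor(ratio, longEdge = 1024) {
  const [w, h] = ratio.split(':').map(Number);
  if (!(w > 0 && h > 0)) {
    throw new Error(`Unsupported aspect ratio: ${ratio}`);
  }
  return w >= h
    ? { width: longEdge, height: Math.round((longEdge * h) / w) }
    : { width: Math.round((longEdge * w) / h), height: longEdge };
}
```

Spreading the result into the component, as in `<Image {...dimensionsFor(selectedRatio)} ... />`, keeps the intrinsic size and the skeleton's aspect class in step.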
Results Summary
| Metric | Before | After |
|---|---|---|
| Average file size | ~800KB PNG | ~180KB WebP |
| LCP (median) | 4.2s | 1.8s |
| CDN cache hit rate | 0% | ~60% |
| Layout shift | Present | Eliminated |
The CDN cache hit rate at 60% surprised me — many users generate similar prompts, and Cloudinary serves cached versions of identical outputs instantly.
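The post does not spell out how identical prompts end up on the same URL. One way it could work, purely as an assumption, is to derive the Cloudinary public ID from the request parameters, so a repeated request maps to an asset that already exists. This only makes sense if serving a previous result for the same prompt is acceptable. `publicIdFor` is a hypothetical helper:

```javascript
const { createHash } = require('node:crypto');

// Hypothetical: derive a stable public ID from the request parameters so that
// identical requests map to the same Cloudinary asset and the same cached URL.
function publicIdFor(prompt, ratio) {
  // Normalize so trivial differences in case or spacing share an ID.
  const normalized = `${prompt.trim().toLowerCase()}|${ratio}`;
  const digest = createHash('sha256').update(normalized).digest('hex');
  return `generated/${digest.slice(0, 32)}`;
}
```

With an ID like this, the server could check for an existing asset before running inference at all, which would save compute as well as bandwidth.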
TL;DR
- Use Cloudinary as intermediary for dynamic image optimization
- Configure remotePatterns in next.config.js for approved domains
- Set priority on the LCP image element
- Pre-size containers with aspect ratio classes to prevent layout shift
- Temporary storage for generated images: delete after delivery window
For the full technical breakdown of the stack, I wrote a detailed architecture post here.
What's your approach to dynamic image delivery in Next.js? Comments open.