Using Bob to build a user interface for “x/flux2-klein” and “x/z-image-turbo” with Ollama
Introduction
It’s been a few days since I spotted two intriguing image generation models on Ollama, and the hype across LinkedIn and other tech circles regarding x/flux2-klein and z-image-turbo is definitely real. After testing them locally, I was impressed by their performance, but running commands in a terminal only goes so far — I wanted to wrap them in a polished, functional application with a proper UI. Naturally, I involved Bob in this task (as you’ve probably realized by now, I don’t tackle anything these days without my favorite AI partner 😉). Once again, Bob did an awesome job, and with just a few minor adjustments, we had a working interface. We will walk through the application and the build process below.
Building the Application: Features and Implementation

To bring these powerful models to life, Bob helped me architect a local-first web application that bridges the gap between the Ollama CLI and a user-friendly experience. The goal was to create something “clean and modern” that felt as responsive as the “Turbo” models it supports.
The implementation focuses on several key features that make the generation process seamless.
Key Features of the Ollama Image Generator
- 🎨 Clean and Modern User Interface: A streamlined, responsive web dashboard built with vanilla JavaScript and CSS, ensuring the focus remains on the generated art.
- 🤖 Multi-Model Support: Native integration for the two standout models currently trending: x/flux2-klein:4b, optimized for high-quality, detailed visual output, and x/z-image-turbo:fp8, designed for lightning-fast generation speeds.
- 📝 Real-Time Interaction: Text prompt input that connects directly to the Ollama backend for immediate feedback.
- ⬇️ Smart Downloads: Once an image is generated, you can save it locally with a custom filename (defaulting to a {sanitized-prompt}-{date}.png format; a sketch of this logic follows this list).
- 📊 Generation History: A persistent tracking system to keep record of your creative sessions.
- 🔄 Connection Monitoring: A real-time status indicator to ensure your local Ollama instance is connected and ready to process requests.
- 💾 100% Local Processing: Privacy is baked in — everything runs on your own hardware with no cloud dependency or external data transmission.
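To make the download naming concrete, here is a minimal frontend sketch of how a {sanitized-prompt}-{date}.png filename could be built and used to save the generated image. The helper names (sanitizePrompt, buildFilename, downloadImage) are illustrative assumptions, not the project's actual code.
// Minimal sketch with hypothetical helper names: build a download filename
// in the {sanitized-prompt}-{date}.png format described above.
function sanitizePrompt(prompt) {
  return prompt
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric into dashes
    .replace(/^-+|-+$/g, '')     // trim leading/trailing dashes
    .slice(0, 60);               // keep the filename reasonably short
}

function buildFilename(prompt) {
  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  return `${sanitizePrompt(prompt)}-${date}.png`;
}

// Trigger the browser download of the generated data URI
function downloadImage(dataUri, prompt) {
  const link = document.createElement('a');
  link.href = dataUri;
  link.download = buildFilename(prompt);
  link.click();
}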
Step-by-Step Installation and Deployment
With the architecture in place, getting the environment running is straightforward. Here is how you can set up the project and even take it to production using the automation Bob helped me build.
1. Local Setup
Before starting, ensure you have Node.js (v14+) and Ollama installed. You’ll need to pull the specific models mentioned earlier using your terminal:
ollama pull x/flux2-klein:4b
ollama pull x/z-image-turbo:fp8
# Verify that both models are now available locally:
ollama list
NAME ID SIZE MODIFIED
x/z-image-turbo:fp8 1053737ea587 12 GB 4 hours ago
x/flux2-klein:4b 8c7f37810489 5.7 GB 4 hours ago
llama3.2:latest a80c4f17acd5 2.0 GB 5 weeks ago
ibm/granite4:tiny-h 566b725534ea 4.2 GB 7 weeks ago
granite3.2-vision:2b 3be41a661804 2.4 GB 7 weeks ago
ibm/granite4:latest 98b5cfd619dd 2.1 GB 7 weeks ago
ministral-3:latest 77300ee7514e 6.0 GB 7 weeks ago
llama3.2-vision:latest 6f2f9757ae97 7.8 GB 7 weeks ago
embeddinggemma:latest 85462619ee72 621 MB 7 weeks ago
llama3:latest 365c0bd3c000 4.7 GB 7 weeks ago
granite3.3:latest fd429f23b909 4.9 GB 8 weeks ago
deepseek-r1:latest 6995872bfe4c 5.2 GB 2 months ago
llama3:8b-instruct-q4_0 365c0bd3c000 4.7 GB 2 months ago
mistral:7b 6577803aa9a0 4.4 GB 2 months ago
ibm/granite4:micro 89962fcc7523 2.1 GB 2 months ago
mxbai-embed-large:latest 468836162de7 669 MB 3 months ago
all-minilm:latest 1b226e2802db 45 MB 3 months ago
granite-embedding:latest eb4c533ba6f7 62 MB 3 months ago
qwen3-vl:235b-cloud 7fc468f95411 - 3 months ago
granite4:micro-h ba791654cc27 1.9 GB 3 months ago
granite4:latest 4235724a127c 2.1 GB 3 months ago
granite-embedding:278m 1a37926bf842 562 MB 3 months ago
nomic-embed-text:latest 0a109f422b47 274 MB 5 months ago
To launch the app locally:
- Navigate to the project directory: cd /xxx/ollama-image-generator
- Install dependencies: npm install
- Start the application using the provided script: ./start.sh
- Access the UI at http://localhost:3000
The core of the project is the Express backend, server.js, shown below:
const express = require('express');
const cors = require('cors');
const axios = require('axios');
const { exec } = require('child_process');
const util = require('util');
const path = require('path');

const execPromise = util.promisify(exec);
const app = express();
const PORT = process.env.PORT || 3000;
// Ollama base URL, configurable through the OLLAMA_URL environment variable
// (this is what the Dockerfile and Kubernetes ConfigMap later in this post set)
const OLLAMA_URL = process.env.OLLAMA_URL || 'http://localhost:11434';

app.use(cors());
app.use(express.json());
app.use(express.static('public'));

// Endpoint to generate image using Ollama HTTP API
// This endpoint handles image generation requests from the frontend
app.post('/api/generate', async (req, res) => {
  const { prompt, model } = req.body;

  if (!prompt || !model) {
    return res.status(400).json({ error: 'Prompt and model are required' });
  }

  try {
    console.log(`Generating image with model: ${model}`);
    console.log(`Prompt: ${prompt}`);

    // Call Ollama's HTTP API endpoint for image generation
    // Documentation: https://github.com/ollama/ollama/blob/main/docs/api.md
    //
    // Request format:
    // POST /api/generate
    // {
    //   "model": "x/flux2-klein:4b",
    //   "prompt": "your prompt here",
    //   "stream": false  // We use non-streaming for simplicity
    // }
    //
    // Response format: Newline-delimited JSON (NDJSON)
    // Each line is a separate JSON object showing generation progress
    // The final line contains the complete image in the 'image' field
    const response = await axios.post(`${OLLAMA_URL}/api/generate`, {
      model: model,
      prompt: prompt,
      stream: false // Non-streaming mode returns all data at once
    }, {
      timeout: 180000, // 3 minutes timeout (image generation is slow)
      maxContentLength: 50 * 1024 * 1024, // 50MB max response size
      maxBodyLength: 50 * 1024 * 1024 // 50MB max request size
    });

    console.log('Ollama API response received');
    console.log('Response data type:', typeof response.data);
    console.log('Is Buffer:', Buffer.isBuffer(response.data));

    let imageData = null;
    let ollamaResponse = '';

    // CRITICAL: Ollama returns newline-delimited JSON (NDJSON) for image generation
    // Format: Each line is a separate JSON object
    // Example:
    // {"model":"x/flux2-klein:4b","created_at":"...","response":"","done":false}
    // {"model":"x/flux2-klein:4b","created_at":"...","response":"","done":false}
    // {"model":"x/flux2-klein:4b","created_at":"...","done":true,"image":"base64data..."}
    //
    // The LAST line contains the complete image in the 'image' field (singular, not 'images')
    if (typeof response.data === 'string') {
      console.log('Response is a string, length:', response.data.length);
      console.log('First 200 chars:', response.data.substring(0, 200));
      try {
        // Split the newline-delimited JSON into individual lines
        const lines = response.data.trim().split('\n');
        console.log('Number of lines:', lines.length);

        // Parse the LAST line which contains the final response with image data
        const lastLine = lines[lines.length - 1];
        const parsed = JSON.parse(lastLine);
        console.log('Parsed response keys:', Object.keys(parsed));

        // IMPORTANT: Image generation models return a singular 'image' field
        // NOT an 'images' array. This is different from some other APIs.
        if (parsed.image) {
          // Convert base64 PNG data to a data URI for browser display
          imageData = `data:image/png;base64,${parsed.image}`;
          console.log('✓ Found image in parsed response (singular image field)');
        }
        // Fallback: Check for images array (some models might use this)
        else if (parsed.images && parsed.images.length > 0) {
          imageData = `data:image/png;base64,${parsed.images[0]}`;
          console.log('✓ Found image in parsed response images array');
        }
        // For text-based models, the response field contains the text
        else if (parsed.response) {
          ollamaResponse = parsed.response;
          console.log('Found response field, length:', ollamaResponse.length);
        }
      } catch (e) {
        console.log('Failed to parse as JSON:', e.message);
        ollamaResponse = response.data;
      }
    }
    // Fallback: Handle binary Buffer responses (rare for Ollama)
    else if (Buffer.isBuffer(response.data)) {
      const base64Data = response.data.toString('base64');
      imageData = `data:image/png;base64,${base64Data}`;
      console.log('✓ Converted Buffer to base64 image');
    }
    // Fallback: Handle pre-parsed JSON with images array
    else if (response.data.images && response.data.images.length > 0) {
      imageData = `data:image/png;base64,${response.data.images[0]}`;
      console.log('✓ Found image in images array');
    }
    // Fallback: Handle pre-parsed JSON with response field
    else if (response.data.response) {
      ollamaResponse = response.data.response;
      console.log('Response length:', ollamaResponse.length);
    }

    if (!imageData && !ollamaResponse) {
      console.log('✗ No image or response data found');
    }

    res.json({
      success: true,
      result: imageData || ollamaResponse || 'No image generated',
      model: model,
      hasImage: !!imageData,
      debug: {
        responseType: typeof response.data,
        isBuffer: Buffer.isBuffer(response.data),
        responseLength: ollamaResponse.length,
        hasImages: !!response.data.images
      }
    });
  } catch (error) {
    console.error('Error generating image:', error.message);
    console.error('Error details:', error.response?.data);
    res.status(500).json({
      error: 'Failed to generate image',
      details: error.message,
      response: error.response?.data || ''
    });
  }
});

// Endpoint to check available models
app.get('/api/models', async (req, res) => {
  try {
    const response = await axios.get(`${OLLAMA_URL}/api/tags`);
    const models = response.data.models || [];
    res.json({
      models: models.map(m => ({ name: m.name }))
    });
  } catch (error) {
    console.error('Error fetching models:', error.message);
    res.status(500).json({
      error: 'Failed to fetch models',
      details: error.message
    });
  }
});

// Health check endpoint
app.get('/api/health', async (req, res) => {
  try {
    await axios.get(`${OLLAMA_URL}/api/tags`, { timeout: 5000 });
    res.json({ status: 'ok', ollama: 'connected' });
  } catch (error) {
    res.status(503).json({ status: 'error', ollama: 'disconnected', details: error.message });
  }
});

app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`);
  console.log(`Using Ollama API for image generation`);
  console.log(`Supported models: x/flux2-klein:4b, x/z-image-turbo:fp8`);
});
// Made with Bob
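The client-side code served from the public/ folder is not reproduced in this post, but a rough sketch of how the UI could call these endpoints looks like this; the element IDs and function names below are illustrative assumptions, not the project's actual markup.
// Hypothetical frontend sketch: calling the backend endpoints defined above.
// Element IDs (prompt, model, result, status) are assumptions for illustration.
async function generateImage() {
  const prompt = document.getElementById('prompt').value;
  const model = document.getElementById('model').value;

  const res = await fetch('/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt, model })
  });
  const data = await res.json();

  // The backend returns a data URI in 'result' when an image was produced
  if (data.hasImage) {
    document.getElementById('result').src = data.result;
  } else {
    console.warn('No image returned:', data.result);
  }
}

// Connection monitoring: poll the health endpoint and update a status indicator
async function checkHealth() {
  try {
    const res = await fetch('/api/health');
    const data = await res.json();
    document.getElementById('status').textContent =
      data.ollama === 'connected' ? 'Ollama connected' : 'Ollama disconnected';
  } catch (err) {
    document.getElementById('status').textContent = 'Ollama disconnected';
  }
}
setInterval(checkHealth, 10000); // refresh the indicator every 10 seconds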
2. Dockerizing the Workflow
If you prefer a containerized approach, Bob assisted in creating a Dockerfile to simplify deployment.
- Build the image: docker build -t ollama-image-generator:latest .
- Run the container: use the docker run command, making sure the OLLAMA_URL environment variable points to your local Ollama host.
The Dockerfile is shown below:
# Multi-stage build for Ollama Image Generator
FROM node:18-alpine AS builder
# Set working directory
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install dependencies
RUN npm ci --only=production
# Production stage
FROM node:18-alpine
# Set working directory
WORKDIR /app
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001
# Copy dependencies from builder
COPY --from=builder /app/node_modules ./node_modules
# Copy application files
COPY --chown=nodejs:nodejs . .
# Set environment variables
ENV NODE_ENV=production \
    PORT=3000 \
    OLLAMA_URL=http://ollama:11434
# Expose port
EXPOSE 3000
# Switch to non-root user
USER nodejs
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD node -e "require('http').get('http://localhost:3000/api/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
# Start application
CMD ["node", "server.js"]
3. Scaling with Kubernetes
For those looking to run this in a more robust environment, the project includes full Kubernetes manifests.
- ConfigMap: Handles environment configuration like the Ollama service URL.
- Deployment: Manages the application lifecycle with support for scaling to multiple replicas.
- Service: A LoadBalancer service that exposes the interface on port 80.
You can deploy the entire stack to your cluster with a single command: kubectl apply -f k8s/. The three manifests are shown below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ollama-image-generator-config
  labels:
    app: ollama-image-generator
data:
  # Ollama service URL
  # Update this to point to your Ollama service
  # For local development: http://localhost:11434
  # For Kubernetes: http://ollama-service:11434
  ollama_url: "http://ollama-service:11434"
  # Application port
  port: "3000"
  # Node environment
  node_env: "production"
# Made with Bob
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-image-generator
  labels:
    app: ollama-image-generator
    version: v1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ollama-image-generator
  template:
    metadata:
      labels:
        app: ollama-image-generator
        version: v1
    spec:
      containers:
        - name: ollama-image-generator
          image: ollama-image-generator:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http
              protocol: TCP
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "3000"
            - name: OLLAMA_URL
              valueFrom:
                configMapKeyRef:
                  name: ollama-image-generator-config
                  key: ollama_url
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          securityContext:
            runAsNonRoot: true
            runAsUser: 1001
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: false
            capabilities:
              drop:
                - ALL
      restartPolicy: Always
      securityContext:
        fsGroup: 1001
# Made with Bob
apiVersion: v1
kind: Service
metadata:
  name: ollama-image-generator
  labels:
    app: ollama-image-generator
spec:
  type: LoadBalancer
  selector:
    app: ollama-image-generator
  ports:
    - name: http
      port: 80
      targetPort: 3000
      protocol: TCP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
# Made with Bob
How Bob Made it Happen
Using IBM Project Bob, I was able to quickly iterate on the server.js logic required to handle Ollama's unique newline-delimited JSON response format. Because Ollama streams the generation progress, Bob helped me implement a parser that specifically isolates the final base64-encoded PNG data from the stream's last line.
The resulting architecture is a robust Node.js and Express backend that communicates with the frontend via a REST API, transforming raw CLI power into a clickable, visual experience.
User Interface and Model Usage
The user interface that Bob orchestrated is designed for high-velocity experimentation. It features a streamlined dashboard where the end user can select between the x/flux2-klein:4b (High Quality) and x/z-image-turbo:fp8 (Fast Generation) models via a simple dropdown menu. Once a model is selected, you simply type your prompt into the text area — such as the test case “a whimsical forest full of bioluminescent plants and glowing creatures” — to trigger the generation.
The UI then provides a dedicated display area for the resulting art, complete with a button to download and save the image with a custom, sanitized filename.

As we can see in the test results, while the speed is impressive, the visual output can sometimes fall short; for instance, the model might struggle with complex text or hyper-specific details, which tempers some of the experimental “hype” currently surrounding these lightweight local models.
- Good ones: As seen below, some really good examples 👏 with prompts such as “generate an image of an astronaut gazing at earth during a space walk” or the sample provided on Ollama, “A storefront sign that says “BAKERY” in gold letters”, which understandably does not give the same image as the one on the Ollama model’s page.
- Bad ones: even after several tries, I was not able to generate an image with “IBM” as a logo on a building 🤷♂️
- We can always go back to previous prompts to reuse or rework them.
Conclusion
In conclusion, while these local models are exceptional for maintaining privacy and building custom applications without relying on cloud-based services, they still have room to grow in terms of versatility. For the moment, I would still continue using advanced services like Google’s Nano Banana and others; the primary reason being the ability to provide my own reference images — just like the one generated for this blog post — to compose and iterate on entirely new visual concepts. While the local stack is powerful and private, the multimodal flexibility of cloud-based suites remains hard to beat for complex creative workflows.
Thanks for reading 🔥
Links
- x/flux2-klein: https://ollama.com/x/flux2-klein
- x/z-image-turbo: https://ollama.com/x/z-image-turbo
- GitHub repository of this project: https://github.com/aairom/ollama-image-generator/tree/main










