Using Bob to build a user interface for “x/flux2-klein” and “x/z-image-turbo” with Ollama
Introduction
It’s been a few days since I spotted two intriguing image generation models on Ollama, and the hype across LinkedIn and other tech circles regarding x/flux2-klein and z-image-turbo is definitely real. After testing them locally, I was impressed by their performance, but running commands in a terminal only goes so far — I wanted to wrap them in a polished, functional application with a proper UI. Naturally, I involved Bob in this task (as you’ve probably realized by now, I don’t tackle anything these days without my favorite AI partner 😉). Once again, Bob did an awesome job, and with just a few minor adjustments, we had a working interface. We will walk through the application and the build process below.
Building the Application: Features and Implementation

To bring these powerful models to life, Bob helped me architect a local-first web application that bridges the gap between the Ollama CLI and a user-friendly experience. The goal was to create something “clean and modern” that felt as responsive as the “Turbo” models it supports.
The implementation focuses on several key features that make the generation process seamless.
Key Features of the Ollama Image Generator
- 🎨 Clean and Modern User Interface: A streamlined, responsive web dashboard built with vanilla JavaScript and CSS, ensuring the focus remains on the generated art.
- 🤖 Multi-Model Support: Native integration for the two standout models currently trending: x/flux2-klein:4b, optimized for high-quality, detailed visual output, and x/z-image-turbo:fp8, designed for lightning-fast generation speeds.
- 📝 Real-Time Interaction: Text prompt input that connects directly to the Ollama backend for immediate feedback.
- ⬇️ Smart Downloads: Once an image is generated, you can save it locally with a custom filename (defaulting to a {sanitized-prompt}-{date}.png format; a sketch of this logic follows this list).
- 📊 Generation History: A persistent tracking system to keep record of your creative sessions.
- 🔄 Connection Monitoring: A real-time status indicator to ensure your local Ollama instance is connected and ready to process requests.
- 💾 100% Local Processing: Privacy is baked in — everything runs on your own hardware with no cloud dependency or external data transmission.
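To make the download naming concrete, here is a minimal frontend sketch of how a {sanitized-prompt}-{date}.png filename could be built and used to save the generated image. The helper names (sanitizePrompt, buildFilename, downloadImage) are illustrative assumptions, not the project's actual code.
// Minimal sketch with hypothetical helper names: build a download filename
// in the {sanitized-prompt}-{date}.png format described above.
function sanitizePrompt(prompt) {
  return prompt
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric into dashes
    .replace(/^-+|-+$/g, '')     // trim leading/trailing dashes
    .slice(0, 60);               // keep the filename reasonably short
}

function buildFilename(prompt) {
  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  return `${sanitizePrompt(prompt)}-${date}.png`;
}

// Trigger the browser download of the generated data URI
function downloadImage(dataUri, prompt) {
  const link = document.createElement('a');
  link.href = dataUri;
  link.download = buildFilename(prompt);
  link.click();
}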
Step-by-Step Installation and Deployment
With the architecture in place, getting the environment running is straightforward. Here is how you can set up the project and even take it to production using the automation Bob helped me build.
1. Local Setup
Before starting, ensure you have Node.js (v14+) and Ollama installed. You’ll need to pull the specific models mentioned earlier using your terminal:
ollama pull x/flux2-klein:4b
ollama pull x/z-image-turbo:fp8
# Verify that both models are now available locally:
ollama list
NAME ID SIZE MODIFIED
x/z-image-turbo:fp8 1053737ea587 12 GB 4 hours ago
x/flux2-klein:4b 8c7f37810489 5.7 GB 4 hours ago
llama3.2:latest a80c4f17acd5 2.0 GB 5 weeks ago
ibm/granite4:tiny-h 566b725534ea 4.2 GB 7 weeks ago
granite3.2-vision:2b 3be41a661804 2.4 GB 7 weeks ago
ibm/granite4:latest 98b5cfd619dd 2.1 GB 7 weeks ago
ministral-3:latest 77300ee7514e 6.0 GB 7 weeks ago
llama3.2-vision:latest 6f2f9757ae97 7.8 GB 7 weeks ago
embeddinggemma:latest 85462619ee72 621 MB 7 weeks ago
llama3:latest 365c0bd3c000 4.7 GB 7 weeks ago
granite3.3:latest fd429f23b909 4.9 GB 8 weeks ago
deepseek-r1:latest 6995872bfe4c 5.2 GB 2 months ago
llama3:8b-instruct-q4_0 365c0bd3c000 4.7 GB 2 months ago
mistral:7b 6577803aa9a0 4.4 GB 2 months ago
ibm/granite4:micro 89962fcc7523 2.1 GB 2 months ago
mxbai-embed-large:latest 468836162de7 669 MB 3 months ago
all-minilm:latest 1b226e2802db 45 MB 3 months ago
granite-embedding:latest eb4c533ba6f7 62 MB 3 months ago
qwen3-vl:235b-cloud 7fc468f95411 - 3 months ago
granite4:micro-h ba791654cc27 1.9 GB 3 months ago
granite4:latest 4235724a127c 2.1 GB 3 months ago
granite-embedding:278m 1a37926bf842 562 MB 3 months ago
nomic-embed-text:latest 0a109f422b47 274 MB 5 months ago
To launch the app locally:
- Navigate to the project directory: cd /xxx/ollama-image-generator
- Install dependencies: npm install
- Start the application using the provided script: ./start.sh
- Access the UI at http://localhost:3000
The core of the project is the Express backend, server.js, shown below:
const express = require('express');
const cors = require('cors');
const axios = require('axios');
const { exec } = require('child_process');
const util = require('util');
const path = require('path');

const execPromise = util.promisify(exec);
const app = express();
const PORT = process.env.PORT || 3000;
// Ollama base URL, configurable through the OLLAMA_URL environment variable
// (this is what the Dockerfile and Kubernetes ConfigMap later in this post set)
const OLLAMA_URL = process.env.OLLAMA_URL || 'http://localhost:11434';

app.use(cors());
app.use(express.json());
app.use(express.static('public'));

// Endpoint to generate image using Ollama HTTP API
// This endpoint handles image generation requests from the frontend
app.post('/api/generate', async (req, res) => {
  const { prompt, model } = req.body;

  if (!prompt || !model) {
    return res.status(400).json({ error: 'Prompt and model are required' });
  }

  try {
    console.log(`Generating image with model: ${model}`);
    console.log(`Prompt: ${prompt}`);

    // Call Ollama's HTTP API endpoint for image generation
    // Documentation: https://github.com/ollama/ollama/blob/main/docs/api.md
    //
    // Request format:
    // POST /api/generate
    // {
    //   "model": "x/flux2-klein:4b",
    //   "prompt": "your prompt here",
    //   "stream": false  // We use non-streaming for simplicity
    // }
    //
    // Response format: Newline-delimited JSON (NDJSON)
    // Each line is a separate JSON object showing generation progress
    // The final line contains the complete image in the 'image' field
    const response = await axios.post(`${OLLAMA_URL}/api/generate`, {
      model: model,
      prompt: prompt,
      stream: false // Non-streaming mode returns all data at once
    }, {
      timeout: 180000, // 3 minutes timeout (image generation is slow)
      maxContentLength: 50 * 1024 * 1024, // 50MB max response size
      maxBodyLength: 50 * 1024 * 1024 // 50MB max request size
    });

    console.log('Ollama API response received');
    console.log('Response data type:', typeof response.data);
    console.log('Is Buffer:', Buffer.isBuffer(response.data));

    let imageData = null;
    let ollamaResponse = '';

    // CRITICAL: Ollama returns newline-delimited JSON (NDJSON) for image generation
    // Format: Each line is a separate JSON object
    // Example:
    // {"model":"x/flux2-klein:4b","created_at":"...","response":"","done":false}
    // {"model":"x/flux2-klein:4b","created_at":"...","response":"","done":false}
    // {"model":"x/flux2-klein:4b","created_at":"...","done":true,"image":"base64data..."}
    //
    // The LAST line contains the complete image in the 'image' field (singular, not 'images')
    if (typeof response.data === 'string') {
      console.log('Response is a string, length:', response.data.length);
      console.log('First 200 chars:', response.data.substring(0, 200));
      try {
        // Split the newline-delimited JSON into individual lines
        const lines = response.data.trim().split('\n');
        console.log('Number of lines:', lines.length);

        // Parse the LAST line which contains the final response with image data
        const lastLine = lines[lines.length - 1];
        const parsed = JSON.parse(lastLine);
        console.log('Parsed response keys:', Object.keys(parsed));

        // IMPORTANT: Image generation models return a singular 'image' field
        // NOT an 'images' array. This is different from some other APIs.
        if (parsed.image) {
          // Convert base64 PNG data to a data URI for browser display
          imageData = `data:image/png;base64,${parsed.image}`;
          console.log('✓ Found image in parsed response (singular image field)');
        }
        // Fallback: Check for images array (some models might use this)
        else if (parsed.images && parsed.images.length > 0) {
          imageData = `data:image/png;base64,${parsed.images[0]}`;
          console.log('✓ Found image in parsed response images array');
        }
        // For text-based models, the response field contains the text
        else if (parsed.response) {
          ollamaResponse = parsed.response;
          console.log('Found response field, length:', ollamaResponse.length);
        }
      } catch (e) {
        console.log('Failed to parse as JSON:', e.message);
        ollamaResponse = response.data;
      }
    }
    // Fallback: Handle binary Buffer responses (rare for Ollama)
    else if (Buffer.isBuffer(response.data)) {
      const base64Data = response.data.toString('base64');
      imageData = `data:image/png;base64,${base64Data}`;
      console.log('✓ Converted Buffer to base64 image');
    }
    // Fallback: Handle pre-parsed JSON with images array
    else if (response.data.images && response.data.images.length > 0) {
      imageData = `data:image/png;base64,${response.data.images[0]}`;
      console.log('✓ Found image in images array');
    }
    // Fallback: Handle pre-parsed JSON with response field
    else if (response.data.response) {
      ollamaResponse = response.data.response;
      console.log('Response length:', ollamaResponse.length);
    }

    if (!imageData && !ollamaResponse) {
      console.log('✗ No image or response data found');
    }

    res.json({
      success: true,
      result: imageData || ollamaResponse || 'No image generated',
      model: model,
      hasImage: !!imageData,
      debug: {
        responseType: typeof response.data,
        isBuffer: Buffer.isBuffer(response.data),
        responseLength: ollamaResponse.length,
        hasImages: !!response.data.images
      }
    });
  } catch (error) {
    console.error('Error generating image:', error.message);
    console.error('Error details:', error.response?.data);
    res.status(500).json({
      error: 'Failed to generate image',
      details: error.message,
      response: error.response?.data || ''
    });
  }
});

// Endpoint to check available models
app.get('/api/models', async (req, res) => {
  try {
    const response = await axios.get(`${OLLAMA_URL}/api/tags`);
    const models = response.data.models || [];
    res.json({
      models: models.map(m => ({ name: m.name }))
    });
  } catch (error) {
    console.error('Error fetching models:', error.message);
    res.status(500).json({
      error: 'Failed to fetch models',
      details: error.message
    });
  }
});

// Health check endpoint
app.get('/api/health', async (req, res) => {
  try {
    await axios.get(`${OLLAMA_URL}/api/tags`, { timeout: 5000 });
    res.json({ status: 'ok', ollama: 'connected' });
  } catch (error) {
    res.status(503).json({ status: 'error', ollama: 'disconnected', details: error.message });
  }
});

app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`);
  console.log(`Using Ollama API for image generation`);
  console.log(`Supported models: x/flux2-klein:4b, x/z-image-turbo:fp8`);
});
// Made with Bob
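The client-side code served from the public/ folder is not reproduced in this post, but a rough sketch of how the UI could call these endpoints looks like this; the element IDs and function names below are illustrative assumptions, not the project's actual markup.
// Hypothetical frontend sketch: calling the backend endpoints defined above.
// Element IDs (prompt, model, result, status) are assumptions for illustration.
async function generateImage() {
  const prompt = document.getElementById('prompt').value;
  const model = document.getElementById('model').value;

  const res = await fetch('/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt, model })
  });
  const data = await res.json();

  // The backend returns a data URI in 'result' when an image was produced
  if (data.hasImage) {
    document.getElementById('result').src = data.result;
  } else {
    console.warn('No image returned:', data.result);
  }
}

// Connection monitoring: poll the health endpoint and update a status indicator
async function checkHealth() {
  try {
    const res = await fetch('/api/health');
    const data = await res.json();
    document.getElementById('status').textContent =
      data.ollama === 'connected' ? 'Ollama connected' : 'Ollama disconnected';
  } catch (err) {
    document.getElementById('status').textContent = 'Ollama disconnected';
  }
}
setInterval(checkHealth, 10000); // refresh the indicator every 10 seconds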
2. Dockerizing the Workflow
If you prefer a containerized approach, Bob assisted in creating a Dockerfile to simplify deployment.
- Build the image: docker build -t ollama-image-generator:latest .
- Run the container: use the docker run command, making sure the OLLAMA_URL environment variable points to your local Ollama host.
The Dockerfile is shown below:
# Multi-stage build for Ollama Image Generator
FROM node:18-alpine AS builder
# Set working directory
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install dependencies
RUN npm ci --only=production
# Production stage
FROM node:18-alpine
# Set working directory
WORKDIR /app
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001
# Copy dependencies from builder
COPY --from=builder /app/node_modules ./node_modules
# Copy application files
COPY --chown=nodejs:nodejs . .
# Set environment variables
ENV NODE_ENV=production \
    PORT=3000 \
    OLLAMA_URL=http://ollama:11434
# Expose port
EXPOSE 3000
# Switch to non-root user
USER nodejs
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD node -e "require('http').get('http://localhost:3000/api/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
# Start application
CMD ["node", "server.js"]
3. Scaling with Kubernetes
For those looking to run this in a more robust environment, the project includes full Kubernetes manifests.
- ConfigMap: Handles environment configuration like the Ollama service URL.
- Deployment: Manages the application lifecycle with support for scaling to multiple replicas.
- Service: A LoadBalancer service that exposes the interface on port 80.
You can deploy the entire stack to your cluster with a single command: kubectl apply -f k8s/. The three manifests are shown below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: ollama-image-generator-config
  labels:
    app: ollama-image-generator
data:
  # Ollama service URL
  # Update this to point to your Ollama service
  # For local development: http://localhost:11434
  # For Kubernetes: http://ollama-service:11434
  ollama_url: "http://ollama-service:11434"
  # Application port
  port: "3000"
  # Node environment
  node_env: "production"
# Made with Bob
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-image-generator
  labels:
    app: ollama-image-generator
    version: v1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ollama-image-generator
  template:
    metadata:
      labels:
        app: ollama-image-generator
        version: v1
    spec:
      containers:
        - name: ollama-image-generator
          image: ollama-image-generator:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http
              protocol: TCP
          env:
            - name: NODE_ENV
              value: "production"
            - name: PORT
              value: "3000"
            - name: OLLAMA_URL
              valueFrom:
                configMapKeyRef:
                  name: ollama-image-generator-config
                  key: ollama_url
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          securityContext:
            runAsNonRoot: true
            runAsUser: 1001
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: false
            capabilities:
              drop:
                - ALL
      restartPolicy: Always
      securityContext:
        fsGroup: 1001
# Made with Bob
apiVersion: v1
kind: Service
metadata:
  name: ollama-image-generator
  labels:
    app: ollama-image-generator
spec:
  type: LoadBalancer
  selector:
    app: ollama-image-generator
  ports:
    - name: http
      port: 80
      targetPort: 3000
      protocol: TCP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
# Made with Bob
How Bob Made it Happen
Using IBM Project Bob, I was able to quickly iterate on the server.js logic required to handle Ollama's unique newline-delimited JSON response format. Because Ollama streams the generation progress, Bob helped me implement a parser that specifically isolates the final base64-encoded PNG data from the stream's last line.
The resulting architecture is a robust Node.js and Express backend that communicates with the frontend via a REST API, transforming raw CLI power into a clickable, visual experience.
User Interface and Model Usage
The user interface that Bob orchestrated is designed for high-velocity experimentation. It features a streamlined dashboard where the end user can select between the x/flux2-klein:4b (High Quality) and x/z-image-turbo:fp8 (Fast Generation) models via a simple dropdown menu. Once a model is selected, you simply type your prompt into the text area — such as the test case “a whimsical forest full of bioluminescent plants and glowing creatures” — to trigger the generation.
The UI then provides a dedicated display area for the resulting art, complete with a button to download and save the image with a custom, sanitized filename.

As we can see in the test results, while the speed is impressive, the visual output can sometimes fall short; for instance, the model might struggle with complex text or hyper-specific details, which tempers some of the experimental “hype” currently surrounding these lightweight local models.
- Good ones: As seen below, some really good examples 👏 with prompts such as “generate an image of an astronaut gazing at earth during a space walk” or the sample provided on Ollama, “A storefront sign that says “BAKERY” in gold letters”, which understandably does not give the same image as the one on the Ollama model’s page.
- Bad ones: even after several tries, I was not able to generate an image with “IBM” as a logo on a building 🤷♂️
- We can always go back to previous prompts to reuse or rework them.
Conclusion
In conclusion, while these local models are exceptional for maintaining privacy and building custom applications without relying on cloud-based services, they still have room to grow in terms of versatility. For the moment, I would still continue using advanced services like Google’s Nano Banana and others; the primary reason being the ability to provide my own reference images — just like the one generated for this blog post — to compose and iterate on entirely new visual concepts. While the local stack is powerful and private, the multimodal flexibility of cloud-based suites remains hard to beat for complex creative workflows.
Thanks for reading 🔥
Links
- x/flux2-klein: https://ollama.com/x/flux2-klein
- x/z-image-turbo: https://ollama.com/x/z-image-turbo
- GitHub repository of this project: https://github.com/aairom/ollama-image-generator/tree/main










