DEV Community

Cover image for OpenShift Virtualization Migration Advisor — Local-First, Powered by Gemma 4 26B MoE
Bharath Nelapatla
Bharath Nelapatla

Posted on

OpenShift Virtualization Migration Advisor — Local-First, Powered by Gemma 4 26B MoE

Gemma 4 Challenge: Build With Gemma 4 Submission

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

OpenShift Virtualization Migration Advisor — a local-first assessment tool that ingests legacy hypervisor configurations (VMware .vmx, libvirt domain XML, OVF, RHV/oVirt exports) and produces a structured migration report for moving workloads to Red Hat OpenShift Virtualization.

The problem it solves is specific and unglamorous: enterprises consolidating off vSphere and legacy KVM have a discovery bottleneck. Their VM inventories live in config files that contain infrastructure secrets — storage paths, VLAN topology, encryption key references, FIPS posture, licence keys. Sending those to a hosted LLM is a non-starter for regulated workloads.

So I built the assessment to run entirely on the host machine. Paste a config or upload an inventory → get a six-section migration report covering inventory mapping, OpenShift Virt primitive equivalents (VirtualMachine, DataVolume, NetworkAttachmentDefinition, StorageClass), compatibility risk flags, MTV-vs-virt-v2v tooling recommendation, effort sizing, and security posture preservation. Nothing leaves the box.

Demo

The tool generated this report from a single VMware .vmx file, fully on-device using Gemma 4 26B MoE via Ollama. Screenshots below; full generated output included so judges can see the actual model output, not a curated highlight.

Gradio UI with VMware sample loaded
Local Gradio UI. Pick a sample → click Generate → report streams in from local Gemma 4.

Generated migration report
Six-section migration report streamed from Gemma 4 26B MoE running locally. No cloud API calls.

Full Report (generated locally)

From samples/vmware-web-tier.vmx — 311.7s on a 16 GB-VRAM laptop with 85% CPU offload

> Source detected: VMware vSphere (.vmx)
> Model: gemma4:26b
> Generated locally in: 311.7s on a consumer laptop GPU + CPU offload

1. Inventory Summary

Name vCPU Memory Disk(s) Network(s) Guest OS Notes
web-tier-01 4 (2 sockets × 2 cores) 8192 MiB web-tier-01.vmdk VLAN-100-DMZ rhel9-64 Production Nginx/PHP-FPM; requires UEFI Secure Boot; vSphere encrypted.

2. OpenShift Virtualization Equivalents

Resource OpenShift Virtualization Primitive Configuration Details
Compute VirtualMachine spec.template.spec.domain.cpu: 2 sockets, 2 cores; spec.template.spec.domain.resources.requests.memory: 8Gi
Firmware VirtualMachine spec.template.spec.domain.firmware: efi (Secure Boot enabled)
Storage DataVolume storageClassName: [not specified in source]; requires conversion from pvscsi to virtio-scsi
Network NetworkAttachmentDefinition Target VLAN-100-DMZ via Multus CNI

3. Compatibility & Risk Flags

  • Critical — SCSI Controller Mismatch: Source uses pvscsi. OpenShift Virt requires virtio-scsi. Driver injection or kernel reconfiguration may be necessary if the RHEL 9 initramfs lacks virtio drivers (unlikely for RHEL 9).
  • Critical — Encryption / KMS Breakage: Source uses vSphere-specific KMS (vsphere-kms-key-001). This encryption cannot be migrated. Data must be decrypted during migration or re-encrypted using OpenShift-native methods (encrypted PVCs).
  • High — Network Driver Mismatch: vmxnet3 must be replaced with virtio for optimal performance and compatibility.
  • High — UEFI Secure Boot: OpenShift Virt supports UEFI; the migration pipeline must ensure the VirtualMachine object is correctly configured with firmware: efi and a bootloader compatible with the new virtual hardware.
  • Medium — CD-ROM dependency: The rhel-9.4.iso is attached. Must be manually re-attached via a ConfigMap or DataVolume if required for post-migration configuration.

4. Migration Path

web-tier-01: MTV (Migration Toolkit for Virtualization).
Justification: MTV is the recommended path for vSphere-to-OpenShift migrations — it automates the conversion of VMDK to DataVolume and handles the mapping of vSphere networks to NetworkAttachmentDefinition.

5. Effort & TCO Notes

  • Effort (web-tier-01): S Justification: Single, modern Linux guest (RHEL 9) with standard hardware requirements and no complex PCI passthrough.
  • TCO / Licensing:
    • Requires OpenShift Container Platform (OCP) subscription
    • RHEL guest licensing applies (standard RHEL / RHEL for Virtual Datacenters)
  • Consolidation: If other web-tier nodes exist in the source, migrate them in a single MTV plan to reduce migration window and operational overhead.

6. Security & Compliance

Source Security Feature OpenShift Virtualization Equivalent
vSphere Encryption (KMS) Encrypted PVCs (via StorageClass / CSI) or dm-crypt within the Guest OS
VLAN-100-DMZ Isolation NetworkAttachmentDefinition + NetworkPolicy for micro-segmentation
UEFI Secure Boot VirtualMachine spec firmware: efi with Secure Boot enabled
Production Workload Isolation Namespace-level isolation in OpenShift

Code

Repository: https://github.com/Bharathtrainer/openshift-migration-advisor

How I Used Gemma 4

I chose Gemma 4 26B MoE (gemma4:26b) after starting on 31B Dense and discovering it was the wrong tool for this workload.

The honest path: I picked 31B Dense first because the highest-quality reasoning seemed like the obvious choice for infrastructure assessment. Two problems surfaced on real-world inputs:

  1. Ollama Flash Attention prefill stall on Dense (ollama#15350) hangs the 31B variant on prompts beyond ~3–4K tokens. A multi-VM datacenter inventory blows past that on the first VM. The bug is specific to Dense's hybrid sliding+global attention; MoE handles the same prompts cleanly.
  2. Active-parameter efficiency. 26B MoE activates ~4B parameters per token versus 31B for Dense. On a consumer laptop GPU, that's the difference between a model that works (with some CPU offload) and one that doesn't fit at all.

What I kept from picking MoE over Dense:

  • 256K context window — enough to ingest an entire small-datacenter inventory in one shot
  • Stable long-prompt prefill on Ollama's current build
  • Native reasoning mode via the <|think|> system-prompt token
  • Workable throughput on consumer hardware — generation runs even when 85% of layers spill to CPU

Honest performance note: the report above generated in 311.7 seconds on a 16 GB-VRAM laptop GPU with 85% CPU offload (ollama ps confirms the split). On a workstation with 24+ GB VRAM the same generation should land in 30–60 seconds. This is exactly the kind of detail you want a tool to expose, not hide — local AI's pitch is data sovereignty, and the tradeoff is hardware-dependent latency. Field engineers running this for offline assessment will accept 5 minutes for a report they can't legally send to a cloud API.

When MoE is not the right pick: short, single-turn, hard math/code reasoning where Dense's per-token capacity matters more than throughput. For long, structured, enterprise-document reasoning over large configs, MoE wins. That's the call this build makes, and the rationale is documented in the README with the GitHub issue link, not vibes.

One Gemma 4-specific detail worth flagging: I follow the recommended sampling (temperature=1.0, top_p=0.95, top_k=64) and set OLLAMA_FLASH_ATTENTION=1 + OLLAMA_KV_CACHE_TYPE=q4_0 to keep the KV cache compact enough for a 16K context window. Those four config values are the difference between this running at usable speed and not running at all.


Built entirely on a laptop. No cloud API key was used at any point in the construction of this submission. The report you see above was generated by Gemma 4 running on the same machine.

Top comments (0)