I got into image generation the normal way: Midjourney → “wow” → “how does this work?”
And then you try to run something locally and reality gently taps you on the shoulder with a brick:
- Python-first stacks
- CUDA assumptions
- dependency graphs shaped like spaghetti
- “works on my machine” energy… on someone else’s machine
If you’re on Apple Silicon (or any ARM box), you get the bonus level: “second-class citizen mode.”
Now add Java to the mix and the typical solution becomes:
“Run a Python server next to the JVM and pretend that’s fine.”
I’ve done it. It works. It also feels like you’re building a tiny distributed system just to generate a PNG.
So I tried a different approach:
Load the model into the JVM process. Call native inference directly. Keep lifecycle and memory under Java’s control.
No REST sidecar. No subprocess. No “hope the port is free.”
Just Quarkus + FLUX.2-klein-4B + Java FFM (Project Panama).
This is the teaser. The full tutorial (with all the sharp edges and the “why does this crash when it can’t find a shader file?” moments) is linked at the end.
The pitch: native inference as a first-class Java citizen
Most local-gen setups treat the model like an external service.
That’s fine until you care about:
- startup time (loading weights every request… nope)
- memory ownership
- concurrency limits
- predictable failure behavior
- “what happens when it segfaults?”
With FFM, the JVM can load a shared library and call C functions directly in-process.
That means the Java app owns:
- model lifecycle (load once, reuse)
- native memory boundaries (explicit arenas, explicit lifetimes)
- concurrency policy (single context, pooled contexts, whatever you decide)
Basically: the JVM stops being a client and becomes the host.
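To make "call C functions directly in-process" concrete, here's a minimal FFM sketch (not from the tutorial) that calls libc's `strlen` with no native code of our own. The same `Linker` / `downcallHandle` / `Arena` machinery is what the FLUX wrapper calls go through. Requires JDK 22+, where the FFM API is final:

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

// Minimal FFM demo: call a C function (libc's strlen) in-process, no JNI.
public class FfmDemo {
    public static long strlen(String s) throws Throwable {
        Linker linker = Linker.nativeLinker();
        // Bind a Java MethodHandle to the native strlen symbol.
        MethodHandle strlen = linker.downcallHandle(
            linker.defaultLookup().find("strlen").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        // Confined arena: native memory lives exactly as long as this call.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(s); // NUL-terminated UTF-8 copy
            return (long) strlen.invoke(cString);
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(strlen("hello")); // prints 5
    }
}
```

Same shape, bigger payload: swap `strlen` for a generation function and the JVM is hosting the model, not calling out to it.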
The core trick: don’t bind the whole C project
If you point jextract at a large C codebase and hope for the best, it will do what all great tools do:
It will teach you humility.
So the tutorial uses a pattern that keeps you sane:
- Compile FLUX into a shared library (`.dylib`/`.so`)
- Define a tiny wrapper header that exposes only what Java needs
- Generate bindings for that header
Your wrapper becomes the stable “native boundary”:
- `init(model_path)`
- `generate(ctx, prompt, output_path, width, height, steps, guidance, seed)`
- `free(ctx)`
No giant structs crossing the boundary. No leaking internal headers into Java-land. No “FFM archaeology.”
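As a sketch of what that boundary looks like from the Java side, here are hand-rolled downcall descriptors for the three functions. The names and signatures below are illustrative assumptions, not the tutorial's actual jextract output:

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

// Hypothetical bindings for the tiny wrapper boundary: three functions, flat arguments.
public final class FluxWrapper {
    // void* flux_wrapper_init(const char* model_path)
    static final FunctionDescriptor INIT =
        FunctionDescriptor.of(ValueLayout.ADDRESS, ValueLayout.ADDRESS);

    // int flux_wrapper_generate(void* ctx, const char* prompt, const char* output_path,
    //                           int width, int height, int steps, float guidance, long seed)
    static final FunctionDescriptor GENERATE =
        FunctionDescriptor.of(ValueLayout.JAVA_INT,
            ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.ADDRESS,
            ValueLayout.JAVA_INT, ValueLayout.JAVA_INT, ValueLayout.JAVA_INT,
            ValueLayout.JAVA_FLOAT, ValueLayout.JAVA_LONG);

    // void flux_wrapper_free(void* ctx)
    static final FunctionDescriptor FREE =
        FunctionDescriptor.ofVoid(ValueLayout.ADDRESS);

    final MethodHandle init, generate, free;

    FluxWrapper(SymbolLookup lib) { // e.g. SymbolLookup.libraryLookup(path, arena)
        Linker linker = Linker.nativeLinker();
        init     = linker.downcallHandle(lib.find("flux_wrapper_init").orElseThrow(), INIT);
        generate = linker.downcallHandle(lib.find("flux_wrapper_generate").orElseThrow(), GENERATE);
        free     = linker.downcallHandle(lib.find("flux_wrapper_free").orElseThrow(), FREE);
    }
}
```

Everything crossing the boundary is a pointer or a primitive, which is exactly what keeps the bindings boring.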
The other trick: the model is… large
The model download is about 16 GB.
That’s not a typo. That’s Tuesday.
Which is why the tutorial is very explicit about:
- loading once at startup
- keeping weights resident
- exposing a simple REST endpoint that queues work safely
Also: yes, CPU inference can be viable if you pick the right model and resolution. The point is not “make it instant.”
The point is “make it predictable.”
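One way to make "queues work safely" concrete: a single fair permit in front of the one resident native context, with a bounded wait so callers fail fast instead of stacking up. A sketch with illustrative names, not the tutorial's code:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Serializes access to a single native context: one generation at a time,
// bounded wait, everything else gets a fast, predictable failure.
public class GenerationGuard {
    private final Semaphore permit = new Semaphore(1, true); // fair: FIFO callers

    public String generate(String prompt, GeneratorFn fn) throws Exception {
        if (!permit.tryAcquire(30, TimeUnit.SECONDS)) {
            throw new TimeoutException("generator busy, try again later");
        }
        try {
            return fn.run(prompt); // the actual FFM downcall goes here
        } finally {
            permit.release();
        }
    }

    @FunctionalInterface
    public interface GeneratorFn { String run(String prompt) throws Exception; }
}
```

Swap the semaphore for a pool of contexts later if you want parallelism; the point is that the policy lives in Java, not in "whatever the native code tolerates."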
A tiny taste: what the Java side looks like
The Java binding layer is intentionally boring (which is the highest compliment in this area):
- allocate prompt + paths in a confined `Arena`
- call `flux_wrapper_generate(...)`
- return a file path
- keep the native context alive until shutdown
And before we touch native code, we validate everything on the Java side, because:
Native code doesn’t throw exceptions. It throws your JVM out the window.
So the record that validates request parameters is not “nice to have.” It’s protective gear.
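A sketch of such a record, with hypothetical limits; the compact constructor rejects anything suspicious before it gets near native memory:

```java
// Hypothetical request record: validation happens at construction,
// so an invalid request can never reach the native boundary.
public record GenerateRequest(
        String prompt, int width, int height, int steps, float guidance, long seed) {

    public GenerateRequest {
        if (prompt == null || prompt.isBlank())
            throw new IllegalArgumentException("prompt required");
        if (width < 64 || width > 2048 || width % 16 != 0)
            throw new IllegalArgumentException("width must be 64-2048, multiple of 16");
        if (height < 64 || height > 2048 || height % 16 != 0)
            throw new IllegalArgumentException("height must be 64-2048, multiple of 16");
        if (steps < 1 || steps > 100)
            throw new IllegalArgumentException("steps must be 1-100");
    }
}
```

An `IllegalArgumentException` is a stack trace. A bad pointer handed to native code is a core dump. Pick the exception.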
Why this is fun (and useful)
This isn’t about building a Midjourney competitor.
It’s about proving a bigger point:
- Java can host modern native AI workloads locally
- FFM is practical for real integrations
- Quarkus is a great “container” for native inference (startup/shutdown, config, REST)
And once you have a local image generator in Java, a lot of other ideas get… tempting.
The full tutorial (all commands + code)
The teaser ends here. The full build includes:
- compiling FLUX into a shared library for FFM
- creating a minimal `flux_wrapper.h`/`.c`
- generating bindings with `jextract`
- patching library loading for macOS quirks
- Quarkus service lifecycle (load once, reuse)
- REST API to generate + serve the image
- logs, timings, and “it finally worked!” payoff
👉 https://www.the-main-thread.com/p/java-local-image-generation-quarkus-ffm-flux
Warning: you may end up explaining to your team why your “Java service” now makes art.