AIVisionsLab

I ran Flux Schnell + LLMs on a $50 GPU. No CUDA. No cloud. No ROCm.

All images in this article were generated locally on the RX 580 8GB described below.

The narrative was clear

In 2026, every guide says the same thing:

"Your AMD RX 580 can't run AI. Buy a new GPU."

AMD dropped ROCm support for Polaris/GCN4 in v5.x.
DirectML crashed with OpaqueTensorImpl errors.
OpenVINO failed silently.

So we had an 8GB GPU sitting at 0% utilization while the CPU burned through LLM responses at 3–5 tokens/second.

We refused to buy a new GPU.


The fix: Vulkan

The ggml project, the tensor library behind llama.cpp and stable-diffusion.cpp, supports Vulkan as a GPU backend. Vulkan is an open standard, and AMD's drivers have exposed it on the RX 580 since the card's 2017 release, so no vendor-specific compute stack is needed.

No CUDA. No ROCm. No DirectML. Just Vulkan.


Results (from real terminal logs, not synthetic benchmarks)

| Workload | Model | Speed |
|---|---|---|
| LLM inference | Mistral 7B Q4 | 15–16 tok/s |
| Image generation | DreamShaper 8 GGUF | ~72 s/image |
| FLUX.1 Schnell | flux1-schnell-q4_k (hybrid) | ~14 min @ 1024×1024 |

CPU baseline without GPU: 3–5 tok/s.
Vulkan uplift: 3–4× on a GPU that "doesn't support AI."
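The uplift figure is just GPU throughput divided by the CPU baseline. As a quick sanity check, the sketch below runs that arithmetic on the ranges quoted above. The function name is hypothetical, and pairing the extremes of each range is only one way to bound the ratio.

```python
def speedup_bounds(gpu_range, cpu_range):
    """Return (lowest, highest) speedup from two (min, max) throughput ranges."""
    gpu_lo, gpu_hi = gpu_range
    cpu_lo, cpu_hi = cpu_range
    # Worst case: slowest GPU run vs fastest CPU run. Best case: the reverse.
    return gpu_lo / cpu_hi, gpu_hi / cpu_lo


# Throughput ranges in tokens/second, taken from the figures above.
low, high = speedup_bounds((15, 16), (3, 5))
print(f"{low:.1f}x to {high:.1f}x")
```

On these numbers the ratio lands between 3.0× and about 5.3×, so the quoted 3–4× sits at the conservative end.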


Hardware

GPU:     AMD RX 580 2048SP — 8GB GDDR5 (Polaris / GCN4)
CPU:     Intel Xeon E5-2690 v3 — 12c/24t (2014)
RAM:     32GB DDR4 REG ECC
Storage: NVMe 1TB — 1.7–3.5 GB/s
OS:      Windows 10 Pro + WSL2 Ubuntu 22.04

The NVMe alone cut FLUX model load time from 25 minutes to 30 seconds, roughly a 50× improvement.
Storage is as critical as the GPU.


Build llama.cpp with Vulkan

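# Run in Developer PowerShell for VS.
# The Vulkan backend needs the Vulkan SDK (it provides the glslc shader compiler) installed first.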
cd E:\
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j20
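A Vulkan build usually fails at configure time when the Vulkan SDK is missing, so it can save a rebuild to check the prerequisites first. The sketch below is a minimal pre-flight check, assuming the SDK installer sets the VULKAN_SDK environment variable and puts glslc on PATH. The function name and the exact list of checks are illustrative, not part of llama.cpp.

```python
import os
import shutil


def vulkan_build_prereqs(env=None, which=shutil.which):
    """Report which build prerequisites appear to be present on this machine."""
    env = os.environ if env is None else env
    return {
        # Assumption: the Vulkan SDK installer exports VULKAN_SDK.
        "VULKAN_SDK set": bool(env.get("VULKAN_SDK")),
        # glslc compiles the Vulkan shaders during the build.
        "glslc on PATH": which("glslc") is not None,
        "cmake on PATH": which("cmake") is not None,
        "git on PATH": which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in vulkan_build_prereqs().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {name}")
```

The env and which parameters exist only so the check can be exercised without touching the real environment.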

Validate:

cd build\bin\Release
.\llama-cli.exe --list-devices
# Expected: Vulkan0: AMD Radeon RX 580 2048SP ✅
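The same check can be scripted, for example before launching a long job. The sketch below runs the binary with --list-devices and collects any Vulkan entries. The line format is an assumption modeled on the expected output shown above, and the function names are hypothetical.

```python
import re
import subprocess

# Assumed line shape: "  Vulkan0: AMD Radeon RX 580 2048SP (...)".
DEVICE_LINE = re.compile(r"^\s*(Vulkan\d+):\s*(.+?)\s*$")


def parse_vulkan_devices(listing):
    """Return {device_id: description} for every Vulkan device line in the text."""
    devices = {}
    for line in listing.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            devices[match.group(1)] = match.group(2)
    return devices


def list_vulkan_devices(exe=r".\llama-cli.exe"):
    """Run the binary with --list-devices and parse whatever it prints."""
    result = subprocess.run([exe, "--list-devices"], capture_output=True, text=True)
    # The listing may land on stdout or stderr depending on the build, so scan both.
    return parse_vulkan_devices(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    found = list_vulkan_devices()
    print(found if found else "No Vulkan device reported")
```

An empty result suggests the build fell back to CPU only, or that the driver is not exposing the card to Vulkan.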

Build stable-diffusion.cpp with Vulkan

git clone --recursive https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
mkdir build
cd build
# SD_VULKAN is the Vulkan switch documented in the stable-diffusion.cpp README
cmake .. -DSD_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release -j20

Run the server

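REM Run in cmd.exe: ^ is cmd's line-continuation character (PowerShell uses a backtick)
E:
cd "E:\stable-diffusion.cpp\build\bin\Release"
REM 0.0.0.0 listens on every network interface, so other machines on the LAN can connect
sd-server.exe --listen-ip 0.0.0.0 --listen-port 7860 ^
  -m "E:\models\dreamshaper_8.safetensors"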

To connect OpenWebUI, open Admin → Images, select Automatic1111 as the image engine, and set its URL to http://YOUR_LOCAL_IP:7860/
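To test the server without OpenWebUI in the loop, a request can be sent by hand. The sketch below assumes the server exposes the Automatic1111-style /sdapi/v1/txt2img route and returns base64-encoded images in an "images" list, as A1111 itself does. Whether a given sd-server build does so is an assumption worth verifying. Function names and default values are illustrative.

```python
import base64
import json
import urllib.request


def build_txt2img_request(base_url, prompt, steps=20, width=512, height=512):
    """Build (without sending) a POST request for an A1111-style txt2img route."""
    payload = {"prompt": prompt, "steps": steps, "width": width, "height": height}
    return urllib.request.Request(
        base_url.rstrip("/") + "/sdapi/v1/txt2img",  # assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(base_url, prompt, out_path="out.png", timeout=600):
    """Send the request and save the first returned image to out_path."""
    request = build_txt2img_request(base_url, prompt)
    # A long timeout, since a single image can take over a minute on this hardware.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    with open(out_path, "wb") as handle:
        handle.write(base64.b64decode(body["images"][0]))
    return out_path
```

Calling generate_image("http://YOUR_LOCAL_IP:7860/", "a lighthouse at dusk") should write out.png if the route exists. An HTTP 404 would indicate that the build does not expose it.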


⚠️ Critical: two types of GGUF

If FLUX fails to load with the error new_sd_ctx_t failed, you most likely downloaded the wrong GGUF.

| Source | Compatible with |
|---|---|
| city96 (HuggingFace) | ComfyUI only |
| leejet (HuggingFace) | stable-diffusion.cpp ✅ |

Always use: https://huggingface.co/leejet/FLUX.1-schnell-gguf
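Before blaming the GGUF flavor, it can help to rule out a truncated or mislabeled download. The sketch below reads the fixed-size GGUF header, assuming the version 2 or later layout (4-byte magic, uint32 version, uint64 tensor count, uint64 metadata count, all little-endian). It cannot tell the two sources in the table apart; it only confirms the file is a GGUF container. The function name is hypothetical.

```python
import struct


def read_gguf_header(path):
    """Return version, tensor count and metadata count from a GGUF file header."""
    with open(path, "rb") as handle:
        header = handle.read(24)
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError(f"{path} does not start with a GGUF header")
    # Assumed layout (GGUF v2+): uint32 version, uint64 tensors, uint64 metadata keys.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return {"version": version, "tensors": tensor_count, "metadata_keys": kv_count}
```

A ValueError here points to a bad download rather than an incompatible conversion. A valid header with a load failure is more consistent with the mismatch described above.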


What failed (documented)

| Attempt | Error | Why |
|---|---|---|
| DirectML | OpaqueTensorImpl | MS tensors can't talk to ComfyUI backends |
| ROCm | Kernel panics | GCN4 dropped in v5.x (permanent) |
| OpenVINO | No module 'ldm' | Extension targets old A1111 arch |
| CPU + HDD | 19 min/image | No GPU + mechanical I/O bottleneck |

Full documentation

📖 Complete guide (PT/EN/ES/FR/AR) with architecture diagrams, benchmarks, automation scripts:
👉 setup-ia-local-rx580-vulkan.web.app

📦 GitHub (scripts + docs):
👉 github.com/aivisionslab-studios/rx580-local-ai-guide


The problem was never the GPU.
