DEV Community

Dmytro Poliakov

Fixing the Missing think Tag Glitch When Running DeepSeek V3.2 GGUF on CPU

Recently, I deployed the new DeepSeek V3.2 GGUF (unsharded) on a CPU-only setup (32 cores, 768 GB RAM) using llama.cpp + Open WebUI. Everything ran smoothly until I noticed that the model's reasoning output was missing the opening `<think>` tag.

Because of this, Open WebUI couldn't collapse the thought block, and the UI rendered the reasoning as plain, unformatted text.
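To see why the missing tag breaks collapsing, here is a minimal sketch of the kind of parsing a UI does on reasoning output. This is a hypothetical simplification, not Open WebUI's actual parser: without the opening `<think>`, no complete tag pair is found, so the whole output falls through as plain text.

```python
import re

def extract_reasoning(text: str):
    """Split model output into (reasoning, answer) if a complete
    <think>...</think> pair is present; otherwise return the whole
    output as plain text. Simplified sketch, not Open WebUI's code."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No collapsible block detected: everything renders unformatted.
        return None, text
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Healthy output: the opening tag is present, so the block is extracted.
ok = "<think>Check the units first.</think>The answer is 42."
print(extract_reasoning(ok))
# -> ('Check the units first.', 'The answer is 42.')

# Buggy output: the opening <think> tag is missing, so no block is found
# and the raw reasoning (plus the stray closing tag) leaks into the answer.
broken = "Check the units first.</think>The answer is 42."
print(extract_reasoning(broken))
# -> (None, 'Check the units first.</think>The answer is 42.')
```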

After digging into the llama.cpp source and community discussions, I found the root cause: the internal chat template wasn't being loaded automatically for this particular GGUF. The fix is straightforward but wasn't immediately obvious from the default startup flags.

The Fix: Explicitly point llama-server to the official Jinja chat template:

```shell
--chat-template-file models/templates/deepseek-ai-DeepSeek-V3.2.jinja
```
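For context, a full `llama-server` invocation might look like the following. The model filename, context size, and port are assumptions for illustration; adjust paths and thread count to your own setup:

```shell
# Hypothetical llama-server startup (paths and sizes are assumptions):
./llama-server \
  --model models/DeepSeek-V3.2-Q4_K_M.gguf \
  --chat-template-file models/templates/deepseek-ai-DeepSeek-V3.2.jinja \
  --threads 32 \
  --ctx-size 8192 \
  --host 0.0.0.0 --port 8080
```

The key line is `--chat-template-file`, which forces llama.cpp to apply the official Jinja template instead of relying on template autodetection.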

I rebuilt llama.cpp from the latest master branch, updated the startup command, and the issue disappeared. The opening `<think>` tag is now preserved, and Open WebUI correctly collapses the reasoning block.

If you're running DeepSeek V3.2 locally on CPU and hitting the same glitch, here's the full breakdown and working command:

🔗 https://www.hiddenobelisk.com/deepseek-v3-2-on-cpu-fixing-the-missing-opening-tag-glitch/

Hope this saves you some debugging time! 🛠️
