DEV Community

# gguf

Posts

NeMo out, GGUF in: how parakeet.cpp ports NVIDIA ASR to C++

6 min read
llama-bench skipped FA on capable GPUs — b9437 corrects it

7 min read
Local LLM Security Best Practices: Beyond Basic Hashing

4 min read
How to Pick a GGUF Quant Level for Your VRAM Budget

3 min read
GGUF & Modelfile: The Power User's Guide to Local LLMs

5 min read
GGUF Quantization Explained: Q4_K_M vs Q5_K_M vs Q8 — Which to Pick (2026)

4 min read
Llama-Server Router Mode - Dynamic Model Switching Without Restarts

9 min read