GGUF - DEV Community

Pneumetron

Jul 19

Unsloth Releases Inkling-GGUF: A Multimodal MoE Model for Developers

#aiml #multimodalmodels #mixtureofexperts #gguf

3 min read

Pneumetron

Jul 18

GnLOLot Releases MiniCPM5-1B-Claude-Opus-Fable5-V2-Thinking-GGUF for Enhanced Local AI Development

#gguf #llamacpp #quantized #minicpm5

3 min read

Pneumetron

Jul 15

Bonsai-27B: A 1-Bit LLM for On-Device Inference with Llama.cpp and MLX

#llm #quantization #1bit #gguf

3 min read

Creeta

Jun 18

NeMo out, GGUF in: how parakeet.cpp ports NVIDIA ASR to C++

#parakeet #ggml #gguf #asr

6 min read

Creeta

Jun 18

llama-bench skipped FA on capable GPUs — b9437 corrects it

#llamacpp #llm #gguf #flashattention

7 min read

Jay Grider

Jun 13

Local LLM Security Best Practices: Beyond Basic Hashing

#llmsecurity #localai #supplychain #gguf

4 min read

Kunal

Jul 6

LLM Quantization Levels Compared: Q4_K_M vs Q8_0 vs FP16 [2026]

#localllm #quantization #gguf #ollama

15 min read

Lingdas1

May 23

GGUF & Modelfile: The Power User's Guide to Local LLMs

#gguf #llm #opensource #tutorial

5 min read

Jun 11

How to Pick a GGUF Quant Level for Your VRAM Budget

#localllm #gguf #quantization #gpu

4 min read

May 13

GGUF Quantization Explained: Q4_K_M vs Q5_K_M vs Q8 — Which to Pick (2026)

#llamacpp #gguf #quantization #localai

5 min read
