DEV Community

# llm

Posts

Production-Grade RAG: Why Vector Search Isn't Enough (and How Hybrid Search Fills the Gaps)

3
Comments
6 min read
Repair Agents, Memory OS, Interview Copilot, Alignment Insights, Multimodal Flow, and CVS AI Academy

Comments
2 min read
I Reduced My System Prompt Tokens by 70% Using a Custom Prompt DSL

2
Comments
6 min read
Echo: results so far

2
Comments
6 min read
The "open-source NotebookLM" lie — and the one repo that actually earns the label

Comments
4 min read
Google Releases DiffusionGemma: Parallel Block Decoding

2
Comments
6 min read
Stop Syncing Elasticsearch: Native Hybrid Search with Spring AI and Pgvector sparsevec

Comments
2 min read
We Gave AI the Keys. Nobody Asked If It Knows How to Drive.

Comments
4 min read
Gemma 4 QAT on a 1080 Ti: What 'Quantization-Aware' Actually Buys — and Fitting the 12B on 8 GB at 16k

Comments
5 min read
How I Fine-Tuned Llama 3 to Think Like DeepSeek — A Practical Guide to LoRA & QLoRA

Comments
6 min read
RAG vs Fine-Tuning: Which Approach Should You Choose?

Comments
3 min read
The Retrieval Failure That Looked Like a Model Problem

1
Comments
3 min read
Bifrost vs TrueFoundry: What changes when you go from OSS gateway to enterprise platform

1
Comments
6 min read
Quantization formats compared: GGUF vs GPTQ vs AWQ vs NF4

Comments
7 min read
What Are Tokens and Why Do They Matter in LLMs?

2
Comments
3 min read