DEV Community

# llm

Posts

KV cache and PagedAttention: what they do and why they matter

1
Comments
8 min read
CortexOps vs Langfuse: Open Source AI Observability Compared

Comments
3 min read
Understanding Retrieval-Augmented Generation (RAG): The AI Architecture That Makes LLMs Smarter

Comments
3 min read
Treat prompt libraries as first-class deliverables for reliable AI code assistance

Comments
5 min read
How to Build a RAG Pipeline for an Enterprise Knowledge Base That Actually Works in Production

1
Comments
7 min read
Qwen3.6-27B + vLLM + Hermes on 24GB VRAM: May 2026 Recipe

Comments
4 min read
Don't make the agent do the geometry

1
Comments
4 min read
Fine-tuning vs RAG: Two Ways to Teach an LLM

Comments
1 min read
We Let 40 Engineers Loose With Coding Agents. The Bill Was Brutal.

Comments
3 min read
SEO Isn't Dead — But GEO Is already eating Its lunch

2
Comments
4 min read
Unit Test AI Guide — Zero Hallucination, Cross-Stack Standard

Comments
11 min read
The Multi-Runtime Agent Problem: Why Your Team Needs More Than One Runtime

Comments
5 min read
Gemma 2's Architecture: More Performance from Less Model

Comments
3 min read
Why LLMs Hallucinate, and How to Reduce It

1
Comments
2 min read
The Context Window: an LLM's Short-Term Memory, Explained

1
Comments
1 min read