DEV Community

# llm

Posts

Three Chat Template Patterns That Silently Kill Your Prompt Cache

7 min read
SOC-in-a-Box: One LLM, Eight Hats, A Production-Bar AI SOC on a Single GPU

11 min read
KV cache quantization: what FP8/INT8 K and V actually buy you, and where they break

8 min read
OpenClaw Windows Node, MemPalace & NVIDIA Cosmos Boost Local AI & Open Models

3 min read
Why Most AI Agent Projects Fail in Production

4 min read
How to Build a Portfolio Chatbot With RAG on the Free Tier

11 min read
Friday Fixes: Housekeeping the Homelab and Hub

9 min read
NVIDIA’s new model on SageMaker, a CLI for AI pipelines, UK AI rules, and a worm threat

2 min read
The Essence

4 min read
Context Engineering Is the Skill That Actually Ships Reliable AI Agents

6 min read
How I Cut Agent Token Usage by 89% Without Touching the Agent

4 min read
Gemma 4 makes on-device multimodal AI good enough to ship

4 min read
How I added real-time Slack alerts to an open-source LLM gateway in one day

1 min read
Long-Term Memory for LLM Agents That Works

6 min read
Claude Fable 5 Is Here. Here's What Actually Matters for Developers 👨🏾‍💻

2 min read