DEV Community

# llm

Posts

Building a cost-efficient LLM caching layer in Python

5 min read
Gemma4 Apex GGUF, Ollama Context Optimization, & Llama3 Benchmarks

3 min read
Building Autonomous DevOps Agents with MCP and LangChain

5 min read
How Do You Fit a Trillion-Parameter Model Into a Kubernetes Cluster?

17 min read
Multi-Head Latent Attention (MLA)

9 min read
The Production Metric That Warns Us Before AI Failures Happen

3 min read
fftext: summarize, translate, and fact-check any text on your laptop. No API key.

3 min read
A Month with DeepSeek: What Happened When I Replaced Claude Opus for Real Work

7 min read
Reading Anthropic's Glasswing initial update

3 min read
Your Agent Just Called the Same Tool 47 Times. Here's the 20-Line Detector.

7 min read
The 34x Pricing Gap: Why AI Model Selection in 2026 Is a Math Problem, Not a Loyalty Problem

5 min read
Long-Context Models Killed RAG. Except for the 6 Cases Where They Made It Worse.

8 min read
Making LLM Calls Reliable: Retry, Semaphore, Cache, and Batch

4 min read
Compass v1.1.0 · we shipped a memory plugin that catches its own consumption drift

5 min read
How Claude Code Thinks: Inside Your AI Coding Assistant

5 min read