DEV Community

# llm

Posts

LLM Study Diary #2: Tokenization

Comments
2 min read
You Vibe-Coded Your SaaS Landing Page — Google Can't See It

Comments
2 min read
LLM Foundry on a tiny model: the stack still does the heavy lifting

Comments
1 min read
llama.cpp MTP Beta, Gemma GGUF Fixes, & Sentinel Local-First AI Coding App

Comments
3 min read
Why Dense Search Fails in Production RAG — And How Hybrid Search Fixes It

1
Comments 3
5 min read
Vision Models for OCR: When They Beat Tesseract and When They Don't

Comments
7 min read
How Should We Evaluate AI Coding Tools in Real Engineering Environments

Comments
4 min read
The LLM-shaped hole in your XGBoost pipeline

Comments
1 min read
How I cut my multi-turn LLM API costs by 90% (O(N²) → O(N))

Comments
2 min read
Six Principles in Practice: How an Agentic E2E Found 11 Production Bugs in 8 Runs

Comments
13 min read
Chunking in RAG: why your splitter matters more than your embedding model

2
Comments
5 min read
What MCP Really Is — A Demo You Can Run on Your Laptop in 5 Minutes

Comments
10 min read
The $47K agent loop: why logging, monitoring, and max_tokens all failed to stop it

2
Comments 1
6 min read
Simple A2A implementation with Strands

6
Comments
4 min read
Hearth: scale-to-zero LLM serving on Kubernetes — and you can hack on it without a GPU

2
Comments 1
3 min read