DEV Community

# llm

Posts

Prompt Caching Works. Your Prompt Assembly Code Does Not.
4 min read
Opus 4.7 vs GLM 5.1: is mixing models worth it?
13 min read
Upgrading Kiwi-chan’s Brain: Pushing a 30GB "Frankenstein" GPU Rig to the Limit with Qwen 3.6-35B-A3B
4 min read
Mistral Medium 3.5 GGUF, FlashQLA Boost for Qwen, & Ollama Playground
3 min read
When the Reranker Hurts: Recall@5 Cases Where Two-Stage Retrieval Loses to One
7 min read
Why Strict JSON Mode Doesn't Stop Hallucinated Tool Calls
7 min read
Every LLM Eval Library Has the Same Bug: Stochastic Judges Used as Deterministic Oracles
7 min read
Local AI Accessibility, JetBrains’ 2026 IDE Plans, and Agentic Architecture Pitfalls
2 min read
Announcing Cliche
3 min read
Why I Built an AI That Tries to Destroy Your Legal Argument
11 min read
Building an AI Agent That Owns Post-Call Execution: Architecture Decisions
6 min read
Anthropic Prompt Caching Saves 90% — Here's the One Caveat Nobody Mentions
7 min read
When 'Take a Deep Breath' Stopped Working: Prompt Tricks With an Expiry Date
7 min read
Cosine Similarity Lies. Here's What to Use When Your Embeddings All Cluster at 0.85
7 min read
Tokenizer Quirks: Claude, GPT, and Gemini Don't Count the Same Text the Same Way
6 min read