DEV Community

# llm

Posts

Tackle High Token Usage with GraphRAG

1
Comments
4 min read
High-Value If, Low-Value Foreach: Why Agents Trade in Judgment Structures, Not Models

2
Comments
23 min read
How to build a production RAG pipeline in Python (without a vector database)

1
Comments
5 min read
ModelChain: Measurable LLM Router with Adaptive Model Selection, Real-Time Scoring, Budget Guards and Failover for Node.js, Edge and Browser

Comments 1
3 min read
My AI Agent Kept Lying to Me. Then It Tried to Trick Me.

Comments 2
5 min read
How I Trained a Support AI on Real Customer Service History: From Raw Conversations to RAG in Production

1
Comments 1
11 min read
Stop Burning Cash on Long-Context RAG: Ephemeral Prompt Caching with Spring AI and JTokkit

Comments 1
2 min read
The Daimon Java SDK: Chat, Stream, and Query Memory from 3 Lines of Java

Comments
5 min read
Show HN: Needle distilled Gemini tool calling into 26M parameters: a technical read without the hype

Comments
9 min read
Stop Burning Tokens on Chat / Agent Loops — Here's What Actually Works

Comments 1
6 min read
When the LLM Refuses: A Fallback Chain That Salvages Most Refusals

Comments 1
5 min read
Welcome to the Slop KPI Era: How Tokenmaxxing Is Making AI Worse

1
Comments
4 min read
Your RAG Pipeline Is Failing 40% of Queries. Here's the Fix.

Comments
2 min read
Inworld TTS Paralinguistic Tags Don't Work — Here's What Does

Comments 1
4 min read
Qwen3.7 Max vs Open-Weight LLMs: Practical Migration Notes

2
Comments
5 min read