DEV Community

# llm

Posts

We ported how brains manage the cost of thinking to LLM systems

Comments 2
9 min read
AI Agents write code that compiles, but they still lie to the user. Here is how to fix the pipeline

Comments
1 min read
I Benchmarked 47 LLM Providers Against Real Queries - Here's What I Found 📊

Comments
8 min read
TitanCore Core-1 – Trillion-parameter LLM training infra in C++/CUDA with ZeRO-3

Comments
1 min read
0% vs 50%: Making a RAG Agent Refuse to Hallucinate

Comments 1
2 min read
Building and Running Llama.cpp on an Air-Gapped Mac

Comments
3 min read
AIMO: AI Mention Optimization — The Discipline of Being Recommended by AI Assistants

Comments
6 min read
Multi-Agent Kill Switch: Why Stopping the Orchestrator Doesn't Stop the Swarm

Comments 1
11 min read
I Built a Production-Oriented Multi-Provider AI Chatbot in Rust — Here's How

Comments 1
5 min read
How RAGScope Knows Which Chunks Your LLM Actually Used

Comments 2
4 min read
llama.cpp Optimizations & New Qwopus3.5-9B GGUF Model Boost Local AI Performance

Comments
3 min read
Managing LLM Token Limits in Long MDX Articles

Comments
5 min read
Why I used three different critic roles instead of one (and what the eval taught me)

Comments 2
6 min read
Show HN: Needle distilled Gemini tool calling into 26M parameters — technical read, zero hype

Comments
8 min read
Fitting LLM Reply Suggestions Into Every Provider's Prompt Cache — Without Structured Output

Comments 1
4 min read