DEV Community

# llm

Posts

I Cut My LLM API Bill by 73% — Here's the Exact Optimization Playbook

Comments
5 min read
What Production ML Systems Taught Me About AI Hallucinations

Comments
4 min read
AI Red-Teaming Techniques: A Practical Starting Point for Security Teams

Comments 1
4 min read
Local Inference Boost: Qwen 3.6 Benchmarks, KV Cache Quantization, & Ollama UI

Comments
3 min read
Kimi K2.6 Beats Frontier Models in Coding Benchmarks

Comments
6 min read
From Burnout to Building: One Indie Dev's Story Behind Mozart

Comments
5 min read
267 tok/s local inference on RTX 5090 – llama.cpp MTP + Qwen3-35B-A3B MoE

Comments
1 min read
Retrieval-Augmented Generation (RAG)

1
Comments
2 min read
Claude 4.7 Released with 1M Token Context

Comments
1 min read
Stop Hardcoding AI Prompts: A Developer’s Guide to PromptCache

Comments
8 min read
Building an AI Agent in Go: What I Learned

5
Comments 1
6 min read
How to Run LLM Evaluations in CI Without Paying $249/Month

1
Comments 1
3 min read
Google is embedding an agent in Android. Your app is now an API.

Comments
3 min read
Evaluating LLMs in Production Without Paying $249/Month for Braintrust

Comments
3 min read
I Made 4 LLMs Argue With Each Other to Write Better Runbooks. Here's What Happened.

Comments
5 min read