DEV Community

# llm

Posts

Brazilian Lawyers Fined R$84,000 for Prompt Injection in Court — Here's What Caught Them (and What Didn't)

Comments
5 min read
Your AI speed benchmark is measuring the one workload you don't run

Comments
3 min read
How the itrstats tax assistant works: one query, every layer

Comments
10 min read
The Shai-Hulud Worm Is Now Open Source — Here's How to Stop Self-Replicating Prompts Before They Reach Your LLM

Comments
5 min read
Three Months of Speed-Up Experiments on a 3090 Ti: Autoregressive DFlash MTP for Qwen3.6-27B

Comments
18 min read
Building llama.cpp from source on a Dell Precision T5820 with an RTX 3090 Ti (after seven power cycles)

Comments
16 min read
Inference Arbitrage: How I Route 200+ Daily LLM Calls Across Five Models

Comments
10 min read
LLM Benchmark Rankings 2026: 15 Models Tested on 38 Real Coding Tasks

Comments
28 min read
How I Track Claude, Codex, and Gemini Quotas from One Script

Comments
6 min read
The LLM Kept Saying “Fixed.” For Three Months, It Wasn’t.

Comments
7 min read
Why MTP doesn't speed up your llama.cpp inference (and how to actually fix it)

Comments
5 min read
Designing a Multi-Agent AI System for Content Analysis and Recommendations

Comments
7 min read
I Cut My LLM API Bill by 73% — Here's the Exact Optimization Playbook

Comments
5 min read
What Production ML Systems Taught Me About AI Hallucinations

Comments
4 min read
How LLMs Actually Work (And What That Means for Your Architecture Decisions)

Comments
6 min read