DEV Community

# llm

Posts

I Built an Adversarial Eval Framework and Attacked 5 LLMs — Every Single One Failed

Comments 5
8 min read
The AI Is a Mirror: What a Year of Naming My Agents Taught Me

Comments 2
5 min read
Scaffolding-Driven vs Model-Driven Planning: Where Agent Systems Actually Break *By Eyoel Nebiyu*

Comments
4 min read
Where your Claude Code bill actually goes — I measured 66 of my own sessions

Comments 3
5 min read
Pre-Build Existence Audit Rule: looking for the failure modes I'm still missing

Comments
4 min read
Couple Both Ways: bidirectional checks against silent drift

Comments 1
10 min read
CPU Inference on AMD EPYC 9334: Real Numbers for LLM and TTS Workloads

Comments
4 min read
Production LLM Guardrails: 8 Controls Every AI Team Needs

Comments
5 min read
My Agent Never Said "I Don't Know"

Comments 1
5 min read
Agentic Web Browsing Workflows with Python and Playwright

Comments 1
7 min read
What 12 LLMs Actually Cost in Production — Real Data from Benchwright

Comments
7 min read
Why Unit Tests Aren't Enough for LLM Features

Comments
6 min read
How I stopped Claude Code from hallucinating 42% of my React Code

Comments
14 min read
Where your AI budget is actually going (it’s not what you think)

Comments
3 min read
I Built a SaaS Risk Scanner That Collects 35+ Signals Per Vendor. Here's What I Learned About Scraping, LLMs, and Solo Engineering.

Comments 1
8 min read