DEV Community

# llm

Posts

Anthropic CVP Run 3 — Does Claude's Safety Stack Scale Down to Haiku 4.5?

Comments
3 min read
Stop Configuring the Same LLMs Over and Over: Introducing LLMC

Comments
3 min read
Introducing Batch Processing for ZeroGPU

1
Comments
3 min read
AI Weekly 4/17–4/24 | OpenAI Stack, Anthropic Politics, Figma Tumbles

Comments
11 min read
I Replaced $800/mo in API Costs with a Local Llama 4 Setup for E-Commerce

Comments
4 min read
GPU cloud servers for AI workloads: how to choose the right instance and deploy without waste

1
Comments
15 min read
About the Impostor Instinct, Superpower, and an Honest Pivot

1
Comments 2
3 min read
Qwen 3.6, llama.cpp Speculative Decoding, Deepseek TileKernels for Local AI on Consumer GPUs

Comments
3 min read
I built a new file format to cut AI token costs by 70% — here's how it works

1
Comments
5 min read
I evaluated the leaked system prompts of the biggest AI coding tools. Here's what I found.

Comments
4 min read
Doby: How I Cut Claude Code's Navigation Tokens by 95% with a Spec-First Workflow

Comments
1 min read
Best MCP Server Directories for Developers

2
Comments 1
17 min read
I open-sourced a 4-agent blood-panel triage workflow on heym, with a deterministic Python safety gate that runs BEFORE any LLM token

5
Comments 1
5 min read
Most RAG Problems Are R(etrieval) Problems

3
Comments 5
3 min read
AI Agents in Practice — Part 3: How the Control Loop Actually Works

Comments 4
12 min read