DEV Community

# llm

Posts

How Much VRAM Do You *Actually* Need for Local LLMs?

2 min read
Building Reliable AI Systems: Why Prompting Isn’t Enough

3 min read
Achieving Maximum Throughput on vLLM with a Single RTX 3090: A Production Guide for 7B LLMs

4 min read
DeepSeek-V4 is Here, and Yes — 1M Context Is Finally for Everyone

5 min read
The Agentic AI Revolution: What's Actually Happening in April 2026

2 min read
Stop Getting Rate-Limited: Building Bulletproof LLM API Consumption Patterns

3 min read
I switched from OpenAI to z.ai for AI coding review and I'm genuinely happy with it — honest review

3 min read
SimCore: I built a social simulation engine where LLM agents live on a real map of your city

1 min read
7 Platforms That Turn Agent Evals Into RL Training Data

8 min read
Local LLMs & Multimodal: Qwen GGUF, Nemotron-3-Nano-Omni, MiMo V2.5-Pro Released

3 min read
How I built a Go proxy that keeps your LLM conversation alive when cloud quota runs out

2 min read
TurboQuant on a MacBook Pro: two findings the upstream discussion missed

7 min read
How to Build a Local Agentic Search Pipeline That Actually Gets Facts Right

6 min read
AI Agents in Production Are Flying Blind — AgentLens Fixes That

2 min read
QA Bug Triage Pipeline: From App Reviews to Searchable Bug Reports

2 min read