DEV Community

# llm

Posts

What is Production-Grade Agentic AI?

3 min read
2026 Q1 is the year developers still build the agent harness. 2026 Q3 / 2027 is the year the LLM builds its own harness.

4 min read
How I Unified 14+ AI Models Behind One OpenAI-Compatible API

2 min read
Why Enterprises Should Not Let LLMs Execute SQL Directly?

1 min read
Reasoning happens before the response

5 min read
Qwen 3.6 27B and 35B MTP vs Standard on 16GB GPU

8 min read
AI Weekly — 2026-05-22 to 2026-05-29 | Anthropic's $965B Moment and the Infrastructure Bet

5 min read
One Open Source Project per Day #74: ai-engineering-from-scratch - Build AI Full-stack Skills from Ground Up

2 min read
Stop letting LLMs hallucinate dates — a tool for AI agents

2 min read
NVIDIA's Nemotron Diffusion: One Model, Three Generation Modes, 6x Faster

3 min read
Continuous batching wrecked our p99 latency. Here's the trace.

4 min read
LangChain JsonOutputParser: Fix Malformed JSON from LLMs

2 min read
Quantizing Gemma 4 on Mac with llama.cpp

4 min read
Building a cost-efficient LLM caching layer in Python

5 min read
Why Claude Code Sessions Diverge: A Mechanism Catalog

3 min read