DEV Community

# llm

Posts

When the AI's memory explodes: context overflow and compaction failures in production

Comments
3 min read
Why the f*** does AI always use em dashes — the involuntary AI watermark

5
Comments 15
2 min read
SGLang vs vLLM: Which is Better for Your Needs in 2026?

Comments
5 min read
6 JavaScript Patterns That Turn LLM APIs Into Production AI Systems

Comments
4 min read
What Is Tool Chaining in LLMs? Why It Breaks and How to Think About Orchestration

1
Comments
7 min read
Your MCP Agents Are Over-Privileged. Here's How to Fix It.

1
Comments
9 min read
Unleashing AI in Quantum Research: Why TensorCircuit-NG is the Ultimate Foundation for the Agent Era

1
Comments
3 min read
AI in machines: why the problem runs deeper than we think

3
Comments 2
3 min read
NVIDIA AI Releases Nemotron-Terminal: A Systematic Data Engineering Pipeline for Scaling LLM Terminal Agents

Comments
4 min read
Anthropic Built a 300K-Query Behavioral Auditing Tool Because Model Behavior Changes. Here's the Production Version.

Comments
4 min read
No GPU? No problem!, running local AI efficiently on my CPU.

Comments
5 min read
Understanding RAPTOR: A Powerful Architecture for Hierarchical Retrieval in RAG Systems

1
Comments
6 min read
The Orchestration Gap: Why Your AI Agents Can't Find Each Other

Comments
2 min read
How Bifrost's MCP Gateway Cuts AI Agent Token Costs by 92% Without Sacrificing Capability

11
Comments 3
8 min read
You're Shipping Untested Prompts to Production (Here's How to Fix It)

Comments
3 min read