DEV Community

# llm

Posts

RAG isn't memory. It's Ctrl+F with embeddings.

Comments
9 min read
From Cloud-First to Local-First: Migrating My AI Agent to a 32B Open-Source Model ($3/day → $0/day)

Comments
6 min read
The Amnesia Tax: Why Stateless Agents Are Eating Your Margins

Comments
3 min read
Escaping API Quotas: How I Built a Local 14B Multi-Agent Squad for 16GB VRAM (Qwen3.5 & DeepSeek-R1)

Comments 1
3 min read
The Context Window Is the New Memory Architecture

Comments
4 min read
GPT-5.1 scored 26%. Gemini 3 Flash scored 74%. Same prompt, same tools.

Comments
8 min read
NemoClaw practical guide for secure OpenClaw operations in 2026

Comments
12 min read
ChatGPT 5.4 v/s Claude Opus 4.6: Which Model Should You use?

Comments 3
10 min read
Compliance Without Comprehension

Comments
3 min read
I benchmarked 3 local LLMs on 50 factual questions -here's what failed

Comments
1 min read
LangChain Agents Deep Dive: The Ultimate Guide to Building Intelligent Agents in 2026

Comments
6 min read
How I Crashed My AI Agent Fleet in 30 Minutes (And Fixed It): VRAM Management on Apple Silicon

Comments
5 min read
My AI Agent Ate 178,000 Tokens in 30 Minutes — Here's Why (And How to Prevent It)

Comments
5 min read
Qwen3.5 in Pure C

Comments
5 min read
The liar's dividend has a second payout, and devs helped build it

Comments
4 min read