DEV Community

# llm

Posts

NER: Gemini vs Spacy vs Compromise

1
Comments
4 min read
How Developers Can Use AI for Smarter Google Search

Comments
3 min read
The 600x LLM Price Gap Is Your Biggest Optimization Opportunity

1
Comments
2 min read
I Built a Fully Local Paper RAG on an RTX 4060 8GB — BGE-M3 + Qwen2.5-32B + ChromaDB

Comments
10 min read
Running Qwen2.5-32B on RTX 4060 8GB — Beating M4 at 10.8 t/s with llama.cpp

1
Comments
7 min read
I built LLM Council: frontier models debating in an immersive 3D chamber

1
Comments
3 min read
Show HN: I built a private AI inference API in Australia — data sovereignty, Gemma3, live now

Comments 1
1 min read
AI Gateway Caching Explained — Why L1 + L2 Cache Layers Cut 90% of Your LLM Bill

5
Comments 1
6 min read
We built an AI that audits other AI agents (here's how A2A works in production)

Comments
4 min read
How LLMs Can Control Your Computer - Voice-Driven, Local, No API Keys

Comments
3 min read
What MCP Actually Is (And Why It Exists)

2
Comments 3
4 min read
Local LLMs vs Cloud APIs — A Real Cost Comparison (2026)

1
Comments
2 min read
Postman for AI – a tool that has been missing for a while

1
Comments
4 min read
Anatomy of a RAG System Architecture

Comments
5 min read
Transformer Architecture in 2026: From Attention to Mixture of Experts (MoE)

3
Comments
3 min read