DEV Community

# llm

Posts

When to Move Beyond LiteLLM (And When Not To)

1
Comments
6 min read
I Built a Python Agent That Uses a Vector DB as Memory, Not Retrieval

11
Comments 8
6 min read
Claude Code Source Analysis Series, Chapter 5: Tools Overview

1
Comments
12 min read
Tool-Response Engineering: The Frontier Beyond Prompt Engineering

Comments
17 min read
Most RAG failures don’t crash. They silently return bad answers. I built a repair layer for that.

Comments
1 min read
On-device LLM on iPhone: which runtime is fastest? MLX vs llama.cpp vs LiteRT-LM vs CoreML

1
Comments 1
4 min read
Agents assemble. One agent is a hire. Many agents are a workforce.

Comments
5 min read
Gemma 4: Frontier AI in Your Hands

1
Comments
2 min read
BeeLlama.cpp enhances llama.cpp, Qwen 35B hits 128K context, iOS local LLMs with Ollama

Comments
3 min read
Deterministic reliability stack for LLM pipelines

Comments
1 min read
LLM Token Counting and Cost Optimization: A Practical Guide

1
Comments
5 min read
Generation 1 — Standalone Models (2018–2022)

Comments
5 min read
Why Most WordPress SEO Plugins Are Not Ready for AI Search Yet

Comments
5 min read
A Survey of LLM-based Deep Search Agents Adaptive Path Planning via Weighted A* and Heuristic Rewards

Comments
4 min read
Evaluating LLM code reviewers: an offline harness for precision, recall, and routing

2
Comments
5 min read