DEV Community

# llm

Posts

Reduce API Costs for Large-Scale Document Analysis with Gemini Context Caching

Comments
2 min read
Building a 5-in-1 App with Local LLM and Flutter

Comments
2 min read
Gemini 2.5 Flash x Nemotron 9B — Optimal Division of Roles for Cloud LLM and Local LLM

Comments
3 min read
Building a Free Research Agent with DuckDuckGo Search + Local LLM

Comments
2 min read
Running NVIDIA Nemotron-Nano-9B-v2-Japanese Locally: Mamba SSM + Thinking Mode Support

Comments
2 min read
LoRA and FT Are Unnecessary: How to Approach Distilled Models

Comments
2 min read
Giving a 'Brain' to Minecraft NPCs with a Local LLM — Nemotron + Mineflayer Implementation Notes

Comments
3 min read
Fast Searching 4 Million Patent Records with FTS5

Comments
2 min read
Opik: Your Agent's Black Box Flight Recorder

1
Comments
5 min read
Demystifying Coding Agents

Comments
7 min read
Session 1: vLLM Overview and the User API

Comments
12 min read
vLLM — Session 2: The Engine Layer — Request Management

Comments
13 min read
Prompt management, RAG, and agents in NodeJS

5
Comments 1
6 min read
Built Reddit like community with AutoBe and AutoView (gpt-4.1-mini and qwen3-235b-a22b)

Comments
1 min read
Python QuickStart: Calling AnyAPI.ai for LLM Requests (2026 Edition)

Comments
2 min read