DEV Community

# llm

Posts

- Asking n People (n Generative AI Models) Simultaneously (2 reactions, 4 min read)
- The MCP Maturity Model: Evaluating Your Multi-Agent Context Strategy (2 reactions, 2 comments, 20 min read)
- AI That Shows Its Work: The Transparent Revolution of PALs (1 comment, 11 min read)
- Building AI Agents: Architecture, Tools & Implementation Guide (1 comment, 4 min read)
- The Real Cost of LLM Inference: Memory Bandwidth, Not FLOPs (3 min read)
- Running Local AI on Linux With GPU: Ollama + Open WebUI + Gemma (24 reactions, 2 comments, 4 min read)
- What You’re Getting Wrong When Building AI Applications in 2025 (7 min read)
- Long Long Ago — The History of Generative AI (5 min read)
- Skills, MCPs, and Commands are the same context engineering trend. (1 comment, 8 min read)
- DragonMemory: Neural Sequence Compression for Production RAG (3 reactions, 8 min read)
- The Scaling Arms Race Is Over - The Application Age Has Begun (7 min read)
- Shrinking Giants: A Word on Floating-Point Precision in LLM Domain for Faster, Cheaper Models (1 reaction, 2 comments, 8 min read)
- Building with LLMs at Scale: Part 3 - Higher-Level Abstractions (9 min read)
- Building with LLMs at Scale: Part 2 - Ergonomics and Observability (6 min read)
- Prompt Tracker: Turn Your Coding Sessions into a Star Wars Opening Crawl (8 min read)
- I Let an LLM Write JavaScript Inside My AI Runtime. Here’s What Happened (6 reactions, 1 comment, 5 min read)
- TOON: The Token Ninja (1 reaction, 3 min read)
- SCAR: A High-Trust Operating System for AI Coding Assistants (Stop Package Hallucinations in Your Repo) (2 min read)
- From Brilliant Interns to Reliable Experts: Why Enterprises Are Betting Big on RAG Systems (5 min read)
- 🏗️ Vector Database Architecture: How to Structure Your Data for Production RAG Systems (3 min read)
- How Large Language Models (LLMs) actually work (9 reactions, 5 min read)
- RAG Architecture for HR Applications: Building Context-Aware Interview Systems (12 min read)
- 🧰 Meet LM-Kit Tool Calling for Local Agents (12 min read)
- Stop Wasting LLM Tokens: Introducing CTON (Compact Token-Oriented Notation) (7 reactions, 1 comment, 3 min read)
- RAG vs MCP: Understanding AI Context Solutions (8 reactions, 4 comments, 6 min read)