DEV Community

# llm

Posts

Why your quantized LLM loses its MTP heads and how to keep them

5 min read
About Sharing Local Inference: A Marketplace for Renting Idle GPUs with an OpenAI-Compatible Backend

8 min read
Cut Your AI Agent Token Costs by 75% With One Skill Plugin

2 min read
Why most LLM API usage is quietly inefficient

4 min read
Qwen sky proof: compressed memory made a tiny model behave better — with the receipts

1 min read
The 8B Model That Punches at 32B Weight

2 min read
Hermes Agent CLI cheat sheet — commands, flags, and slash shortcuts

8 min read
Smarter Resource Allocation Beats Stronger Models

Comments 1
6 min read
The "Chat" API is a Token Tax: Why we must return to Stateless Completions

2 min read
Turning Your AI Into an Adversarial Security Agent: The SKILLS.md Framework

11 min read
Self-hosted LLM, same prompt, temperature zero - 6 different answers

1 min read
Behavioral Annotations: Why readonly and destructive guide LLM Planning

3 min read
KODA Format: A Schema-First Data Format to Reduce LLM Token Usage ( 40%)

Comments 1
3 min read
LLM Wire Format Benchmark: Which Format Can AI Actually Read and Write?

9 min read
AI For Security Review In Application Code

14 min read