DEV Community

# llm

Posts

Running 1M-token context on a single GPU (the math)

2 min read
I benchmarked identity drift across 5 AI agent memory architectures — here's what I found

3 min read
I Read a Paper That Genuinely Made Me Stop and Think — AI is Now Jailbreaking Other AI

3 min read
One line of Python to extend your LLM's context window 10x

1 min read
KV cache memory calculator: how much does your LLM actually use?

3 min read
Build Your Own AI-Powered Knowledge Base with LLMs and Obsidian

3 comments
6 min read
How Much GPU Memory Does NexusQuant Actually Save?

4 min read
What I Learned Testing 12 Compression Approaches That Failed

6 min read
The Math Behind E8 Lattice Quantization (with Code)

6 min read
Why Your RAG System Returns Garbage (And How to Actually Fix It)

5 min read
Six Characters Fixed My AI's Personality: A Fine-Tuning Story

4 min read
How to deploy NexusQuant in production (and what's missing)

4 min read
NexusQuant benchmarks: every number, honestly

5 min read
NexusQuant vs KVTC vs TurboQuant vs CommVQ — honest comparison

4 min read
Compress your LLM's KV cache 33x with zero training

2 min read