DEV Community

# llm

Posts

Pitfalls of Testing LLM Long-Term Memory: A 3‑Day Debugging Saga

Comments
4 min read
How do Chinese access Claude/GPT API at 0.2x pricing?

1
Comments 2
13 min read
Agent-Harness Scaling Law: Feedback Quality Predicts Success, Not Raw Compute: Effective Feedback Compute (EFC)

Comments
7 min read
Flash Attention: what it does and why it matters

Comments
8 min read
Same Lever, Opposite Intent: When Shared Agent Memory Backfires

1
Comments 4
2 min read
MCP Model Context Protocol in TypeScript: design tools that are portable across Claude, GPT, and local models

1
Comments
9 min read
Claude Fable 5 on Databricks is a step-change for agentic workflows

Comments
3 min read
Claude Fable 5: Anthropic's First Mythos-Class Model for General Use

Comments
3 min read
Claude Fable 5: the first Mythos-class model for general use

Comments
4 min read
Same question, three answers: a governed MCP server with receipts

1
Comments
3 min read
Gemma 4 QAT on 10GB Laptop: Local AI with 6.7GB VRAM

Comments
1 min read
Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

Comments
1 min read
Why We Added Rate Limits Between AI Agents

Comments
3 min read
Does Bad Memory Make AI More Cautious? We Ran the Experiment

Comments
8 min read