DEV Community

Machine Learning

A branch of artificial intelligence (AI) and computer science that uses data and algorithms to imitate the way humans learn, gradually improving in accuracy.

Posts

Your LLM-as-judge eval set is too small. Here is the math

7
Comments 1
4 min read
12 million tokens, linear cost: Subquadratic's bet against the attention tax

Comments
3 min read
The $1 Trillion Problem: How We're Building AI Agents for the Industry That Hates Software

8
Comments
3 min read
How FinOps Supports Responsible and Sustainable AI Development

Comments
2 min read
Convolutional Neural Networks - Landmark Image Classification

Comments
1 min read
DARKNET-53

Comments
2 min read
Promptfoo is a CI gate, not an eval framework. Treating it like one cost us $4,200

Comments 1
4 min read
I Tested Three AI Memory Retrieval Strategies. The Hard Failure Was Semantic

Comments
8 min read
Beyond the "Brute Force Beauty": A Modular, Brain-Inspired LLM Architecture (Thoughts on grand models: Part 2)

Comments
4 min read
Token-level eval harness for tool-calling agents: what we wired up

Comments 1
4 min read
Embeddings Explained: How AI Turns Words Into Numbers That Actually Mean Something

Comments
9 min read
We Like to Benchmark AI, But What If We've Been Using a Ruler to Measure Weight This Whole Time?

1
Comments
5 min read
Per-customer budget caps on our caption pipeline: 3 weeks with virtual keys

Comments 1
4 min read
Your Next Training Run Might Be Started by an Agent

Comments
3 min read
I Fine-Tuned a Compliance Judge and Beat the Stock Model by +29.6pp F1

Comments
6 min read