DEV Community
#evals
The Loop Is Only as Good as the Metric
David Aronchick
May 5
#ai #evals #machinelearning #data
7 min read
Why Most AI Teams Are Flying Blind: And What to Do About It
aasawari sahasrabuddhe
Apr 23
#ai #evals #genai #womenintech
1 comment
13 min read
Wait, you guys run evals?
Frank Brsrk
Apr 22
#ai #evals #llm
1 min read
Evaluate LLM code generation with LLM-as-judge evaluators
Scarlett Attensil for LaunchDarkly
Mar 26
#ai #evals #llm #agents
6 reactions
12 min read
From zero evals to a working multimodal evaluation in 30 minutes using LangWatch Skills
Manouk Draisma for LangWatch
Mar 24
#ai #agents #evals #claudecode
7 min read
Your coding agent already knows how to test your AI agent (we just turned it into a Skill)
Manouk Draisma
Mar 23
#agents #agentskills #evals #simulations
1 reaction
4 min read
Build an eval harness for 184 AI agent prompts with promptfoo
Russell Jones
Mar 30
#promptfoo #evals #aiagents #llm
8 min read
Self-improving Coding Agents
Raphael Porto
Mar 27
#agents #harness #ai #evals
1 reaction
1 comment
5 min read