DEV Community
#evaluation
How I Approach Evaluation When Building AI Features
Jamie Gray
Mar 23
#ai #machinelearning #testing #evaluation
6 min read
Evaluating Vendor Offerings: A Structured Approach to Identify High-Quality, Compatible Tools at Conferences
Alina Trofimova
Mar 19
#devops #kubecon #evaluation #kubernetes
13 min read
EVAL #006: LLM Evaluation Tools — RAGAS vs DeepEval vs Braintrust vs LangSmith vs Arize Phoenix
Ultra Dune
Mar 17
#llm #evaluation #ai #machinelearning
10 min read
Navigating AI Coding Tools: Strategies for Evaluating and Selecting Optimal Developer Solutions
Denis Lavrentyev
Mar 10
#ai #coding #evaluation #integration
12 min read
Building an LLM Evaluation Framework That Actually Works
Ritwika Kancharla
Mar 3
#evaluation #llm #ai
7 min read
Evals Aren’t a One-Time Report: Build a Living Test Suite That Ships With Every Release.
Lamhot Siagian
Feb 22
#llm #ai #evaluation
1 reaction
6 min read
LLM Evaluation and Testing: How to Build an Eval Pipeline That Actually Catches Failures Before Production
HK Lee
Mar 6
#ai #llm #evaluation
1 comment
14 min read
If you don't red-team your LLM app, your users will
Lamhot Siagian
Feb 22
#ai #llm #evaluation #security
1 reaction
7 min read
Go Ahead and Judge Me- Agent Evaluators in AWS AgentCore
mgbec for AWS Community Builders
Jan 25
#evaluation #agents #amazonbedrock
6 min read
Why Image Hallucination Is More Dangerous Than Text Hallucination
Priyam
Jan 6
#evaluation #ai #machinelearning #futureagi
1 min read
The Self-Evolving Agent (Part 3): The Human in the Loop
Imran Siddique
Jan 1
#architecture #aigovernance #evaluation #engineeringleadershi
4 min read