Jimmy Guerrero for Voxel51

Computer Vision Meetup: Evaluating RAG Models for LLMs - Key Metrics and Frameworks

Evaluating model performance is key to ensuring the effectiveness and reliability of LLMs. In this talk, we will look into RAG evaluation metrics and frameworks and explore the various approaches to assessing model performance. We will discuss key metrics such as relevance, diversity, coherence, and truthfulness, and examine evaluation frameworks ranging from traditional benchmarks to domain-specific assessments, highlighting their strengths, limitations, and potential implications for real-world applications.

About the Speaker

Abi Aryan is the founder of Abide AI and a machine learning engineer with over eight years of experience in the ML industry, building and deploying machine learning models in production for recommender systems, computer vision, and natural language processing across a wide range of industries such as ecommerce, insurance, and media and entertainment. Previously, she was a visiting research scholar at the Cognitive Sciences Lab at UCLA, where she worked on developing intelligent agents. She has also authored research papers on AutoML, multi-agent systems, and LLM cost modeling and evaluations, and is currently authoring LLMOps: Managing Large Language Models in Production for O’Reilly Publications.

Not a Meetup member? Sign up to attend the next event:

https://voxel51.com/computer-vision-events/

Recorded on Aug 8, 2024 at the AI, Machine Learning and Computer Vision Meetup.
