Jimmy Guerrero for Voxel51

Computer Vision Meetup: Who needs RLHF When You Have SFT?

This talk centers on Reinforcement Learning from Human Feedback (RLHF) and, more importantly, on why it is needed at all over Supervised Fine-Tuning (SFT). It also covers, in accessible terms, some open problems in current academic RLHF research.

Speaker: Srishti Gureja is an ML engineer and researcher broadly interested in two things: ML efficiency techniques, including but not limited to designing algorithms that make maximum use of the hardware at hand, and the alignment of LLMs using literature from RL. She is currently working with Eleuther AI and Alex Havrilla from Georgia Tech on better, simpler methods for aligning language models. Her full-time job is as an ML Engineer at Writesonic, a YC-backed startup.

Not a Meetup member? Sign up to attend the next event:

https://voxel51.com/computer-vision-ai-meetups/

Recorded on May 2, 2024 at the AI, Machine Learning and Data Science Meetup.
