Text-to-image (T2I) diffusion models such as Stable Diffusion XL and DALL-E 3 achieve state-of-the-art (SOTA) performance on various compositional T2I benchmarks, but at the cost of significant computational resources. For instance, the unCLIP (i.e., DALL-E 2) stack comprises a T2I prior and a diffusion image decoder. The T2I prior model alone adds a billion parameters, increasing both the compute and the high-quality data requirements. To combat these issues, Maitreya proposes ECLIPSE, a novel contrastive learning method that is both parameter- and data-efficient.
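To give a feel for the idea, here is a minimal sketch (not the authors' implementation) of training a compact T2I prior with a CLIP-style contrastive objective: a small network maps frozen CLIP text embeddings to predicted image embeddings and is trained with a symmetric InfoNCE loss. The class name `PriorMLP`, the embedding dimension, and the `temperature` value are illustrative assumptions, not details from the talk.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PriorMLP(nn.Module):
    """Small prior mapping CLIP text embeddings to predicted CLIP image embeddings."""
    def __init__(self, dim: int = 768, hidden: int = 2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        return self.net(text_emb)

def contrastive_prior_loss(pred_img_emb, true_img_emb, temperature=0.07):
    """Symmetric InfoNCE between predicted and ground-truth image embeddings."""
    pred = F.normalize(pred_img_emb, dim=-1)
    true = F.normalize(true_img_emb, dim=-1)
    logits = pred @ true.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# One training step on a batch of (placeholder) frozen CLIP embeddings.
prior = PriorMLP()
opt = torch.optim.AdamW(prior.parameters(), lr=1e-4)
text_emb = torch.randn(32, 768)  # stand-in for precomputed CLIP text embeddings
img_emb = torch.randn(32, 768)   # stand-in for precomputed CLIP image embeddings
loss = contrastive_prior_loss(prior(text_emb), img_emb)
loss.backward()
opt.step()
```

Because the prior here is a small mapping network rather than a billion-parameter diffusion model, both the parameter count and the amount of paired data needed drop substantially, which is the efficiency argument the talk centers on.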
Speaker: Maitreya Patel is a PhD student at Arizona State University focusing on model performance and efficiency. Whether in model training or inference, Maitreya strives to optimize AI to make it more accessible and powerful.
Not a Meetup member? Sign up to attend the next event:
https://voxel51.com/computer-vision-events/
Recorded on April 18, 2024 at the AI, Machine Learning and Data Science Meetup