DEV Community

Prashant Lakhera


📌 Why your GPU gets slower during training (Even though nothing changed!) 📌

I've come across many blogs on this topic, but most of them explain it in such a complicated way that they are hard to follow.
So I decided to break it down in simple, easy-to-understand language that actually helps you diagnose slowdowns and fix them.

Here are the reasons why your GPU starts fast but gradually becomes slower during training:
1️⃣ Your workload becomes memory-bound instead of compute-bound
2️⃣ Your workload becomes less parallelizable
3️⃣ Your tensor shapes stop aligning with GPU-friendly sizes
4️⃣ Thermal throttling: the GPU heats up and automatically slows down to protect itself
These excerpts are taken from my book "Building a Small Language Model from Scratch: A Practical Guide". If you'd like to dive deeper into the topic, feel free to check out the book.

✅ Gumroad: https://plakhera.gumroad.com/l/BuildingASmallLanguageModelfromScratch
✅ Amazon: https://www.amazon.com/dp/B0G64SQ4F8/
✅ Leanpub: https://leanpub.com/buildingasmalllanguagemodelfromscratch/

🔗 Blog link: https://www.linkedin.com/pulse/why-your-gpu-gets-slower-during-training-even-though-nothing-lakhera-wblsc/
