DEV Community

Discussion on: Running Local LLMs, CPU vs. GPU - a Quick Speed Test

Collapse
 
orlando_arroyo_1 profile image
Orlando Arroyo

Indeed, 100%GPU off-loading.

I also tested an Ryzen 7950X with 0% off loading, but there’s something odd. I set 32 threads but CPU use is not going beyond 60% and only gets 7tok/s. Any thoughts how about possible cause?

Just for fun, I’ll check with an Asus ROG Ally later (Z1 Extreme version).

Thread Thread
 
maximsaplin profile image
Maxim Saplin

Seems the threads param is ignored, I saw same behaviour when testing CPU inference