DEV Community

Discussion on: Running Local LLMs, CPU vs. GPU - a Quick Speed Test

Bharath B

Intel i7-14700K - 9.82 tokens/s with no GPU offloading (peaked at 35% CPU usage in LM Studio; I'm guessing there's an issue with multithreading)
Zotac Trinity non-OC 4080 Super - 71.61 tokens/s with max GPU offloading

All numbers were measured on a non-overclocked, factory-default setup.
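
For a rough sense of scale, the two figures above work out to roughly a 7x gain from GPU offloading. Here is a minimal Python sketch of that arithmetic; the function and variable names are illustrative, and the values are copied from the numbers in this comment:

```python
def speedup(gpu_tokens_per_s: float, cpu_tokens_per_s: float) -> float:
    """Return how many times faster the GPU run is than the CPU run."""
    return gpu_tokens_per_s / cpu_tokens_per_s


# Throughput figures reported above (tokens per second).
cpu_tps = 9.82   # i7-14700K, no GPU offloading
gpu_tps = 71.61  # 4080 Super, max GPU offloading

ratio = speedup(gpu_tps, cpu_tps)
print(f"{ratio:.2f}x")  # prints "7.29x"
```

This compares only these two runs on one machine, so it says nothing about other models, quantizations, or thread settings.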

Maxim Saplin

Thanks for sharing the numbers!

Orlando Arroyo

Indeed, there’s something odd with the CPU multithreading.