Discussion on: Running Local LLMs, CPU vs. GPU - a Quick Speed Test


Intel i7-14700K - 9.82 tokens/s with no GPU offloading (peaked at 35% CPU usage in LM Studio; I'm guessing it's a multithreading issue)
Zotac Trinity non-OC 4080 Super - 71.61 tokens/s with max GPU offloading

All numbers were measured on a non-overclocked, factory-default setup.
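
For anyone who wants to put the two figures side by side, here is a rough sketch of the arithmetic. The two rates are the ones reported above; the helper names and the 500-token response length are only illustrative assumptions, not something that was measured.

```python
# Rates reported above (tokens per second).
cpu_tokens_per_s = 9.82    # i7-14700K, no GPU offloading
gpu_tokens_per_s = 71.61   # 4080 Super, max GPU offloading


def speedup(fast_rate: float, slow_rate: float) -> float:
    """How many times faster fast_rate is than slow_rate."""
    return fast_rate / slow_rate


def seconds_for(tokens: int, rate: float) -> float:
    """Time in seconds to generate `tokens` tokens at `rate` tokens/s."""
    return tokens / rate


ratio = speedup(gpu_tokens_per_s, cpu_tokens_per_s)
print(f"GPU vs. CPU speedup: {ratio:.2f}x")  # about 7.29x

# Hypothetical 500-token response, chosen only for illustration.
example_tokens = 500
print(f"CPU: {seconds_for(example_tokens, cpu_tokens_per_s):.1f} s")
print(f"GPU: {seconds_for(example_tokens, gpu_tokens_per_s):.1f} s")
```

Since the CPU run peaked at only 35% usage, the CPU figure is probably a lower bound, and the real gap may be smaller than this ratio suggests.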

Maxim Saplin • Mar 17 '24

Thanks for sharing the numbers!

Orlando Arroyo • Mar 20 '24

Indeed, there’s something odd with the multithreading on the CPUs.