Just for fun, here are some additional results:
iPad Pro M1 256GB, using LLM Farm to load the model: 12.05 tok/s
Asus ROG Ally Z1 Extreme (CPU): 5.25 tok/s using the 25W preset, 5.05 tok/s using the 15W preset
Update:
Asked a friend with an M3 Pro (12-core CPU, 18GB). Running on CPU: 17.93 tok/s; GPU: 21.1 tok/s
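For context, the tok/s figures above are typically computed as the number of generated tokens divided by the wall-clock generation time. A minimal sketch (the token count and timing below are illustrative values chosen to roughly match the iPad Pro M1 figure, not measurements from the runs above):

```python
def tokens_per_second(num_tokens: int, elapsed_s: float) -> float:
    """Throughput: generated tokens divided by wall-clock generation time."""
    return num_tokens / elapsed_s

# Illustrative: 256 tokens generated in 21.24 s gives roughly the
# 12.05 tok/s reported for the iPad Pro M1 above.
print(round(tokens_per_second(256, 21.24), 2))
```

Note that tools report this differently: some count only generation throughput, while others fold prompt-processing time into the average, so numbers from different apps are not always directly comparable.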
The CPU result for the ROG Ally is close to the one from the 7840U; after all, they are nearly identical CPUs.
The ROG Ally has a Ryzen Z1 Extreme, which appears to be nearly identical to the 7840U, but from what I can discern, its NPU is disabled. So if/when LM Studio gets around to implementing support for that AI accelerator, the 7840U should be faster at inference workloads.
AMD GPUs seem to be underdogs in the ML world compared to Nvidia... I doubt that AMD's NPU will see better compatibility with the ML stack than its GPUs do.