One of the reasons I went for a Mac was the unified memory setup. With RAM shared between the CPU and GPU, Apple Silicon has a (theoretical) cost and power efficiency advantage over an NVIDIA setup. I'm not talking about the raw power of a 5900 setup, but compared to my earlier lab, a virtualized setup on a Ryzen xxx, I do expect big improvements.
Base case: old lab
[placeholder to set numbers]
Setting up LM Studio
First things first: my old setup was Ubuntu running on AMD with a laptop GPU. As a bit of a geek I prefer the terminal over a GUI for hobby projects, so Ollama was a logical starting point. But on Mac there is of course a more user-friendly (less geeky) option in LM Studio. Since my projects are hobbies, I value tinkering over ease of use, so terminal and Ollama it is (I may add mystify as a GUI later on for fun).
Unfortunately, at the time of writing Ollama's support for MLX (Apple's answer to CUDA, to oversimplify) was not 100% stable. As I've read that MLX can give a ~30% speed bump, I decided not to geek out and went for LM Studio :-)
20 tokens/second on DeepSeek Qwen3-8B, that's not bad.
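If you want to measure that number yourself, here's a minimal sketch. It assumes LM Studio's built-in local server is running (it exposes an OpenAI-compatible API, by default on http://localhost:1234), and the model id string is a placeholder; use whatever id LM Studio shows for your loaded model.

```python
import json
import time
import urllib.request

# LM Studio's OpenAI-compatible endpoint (default port; adjust if you changed it)
API_URL = "http://localhost:1234/v1/chat/completions"


def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Throughput: generated tokens divided by wall-clock time."""
    return completion_tokens / elapsed_seconds


def benchmark(prompt: str, model: str = "deepseek-qwen3-8b") -> float:
    # "deepseek-qwen3-8b" is a placeholder model id, not necessarily
    # the exact string LM Studio uses on your machine.
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode()
    req = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    elapsed = time.monotonic() - start
    # The response's usage block reports how many tokens were generated
    return tokens_per_second(body["usage"]["completion_tokens"], elapsed)


if __name__ == "__main__":
    print(f"{benchmark('Explain unified memory in one paragraph.'):.1f} tok/s")
```

Note this measures end-to-end time (prompt processing plus generation), so it will read a bit lower than the pure generation speed LM Studio reports in its UI.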