I had this idea to run an LLM locally on my own laptop. Just to see if I could. Ended up going with Ollama.
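If you haven't poked at it, Ollama runs a little HTTP server on localhost, so talking to a model from Python takes just a few lines. Here's a minimal sketch, assuming the default port and a model you've already pulled ("llama3" is just a placeholder name):

```python
# A minimal sketch of talking to a local Ollama server from Python.
# Assumes Ollama is running on its default port (11434); "llama3" is
# just a placeholder for whatever model you've actually pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Why is the sky blue?",
        "stream": False,  # one JSON blob back instead of a token stream
    },
    timeout=300,  # CPU-only inference can take a while
)
resp.raise_for_status()
print(resp.json()["response"])
```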
At first it was brutal — all CPU, no GPU, super slow. But I messed around, tweaked some stuff, and finally got it to actually run okay. Not fast, but okay.
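If you're wondering what kind of stuff there even is to tweak: most of the knobs live in the `options` field of the API request (or in a Modelfile). This is purely illustrative, the numbers below are made up and the right values depend entirely on your hardware:

```python
# Same /api/generate call as above, just with an "options" block added.
# These numbers are examples, not recommendations -- tune for your machine.
payload = {
    "model": "llama3",       # placeholder model name
    "prompt": "Hello!",
    "stream": False,
    "options": {
        "num_thread": 8,     # match your physical core count
        "num_ctx": 2048,     # smaller context window, less RAM pressure
    },
}
```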
Then I went down a rabbit hole. I wanted to know what the models were doing. Like, how hot is my CPU getting? How fast is it spitting out tokens? So I started building my own little monitoring setup. Used C for some low-level stuff, Dash for a live dashboard, Python to glue it all together. Oh and lm-sensors to watch the temps because this thing makes my laptop sweat.
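To give you the flavor of the glue layer (C bits aside): lm-sensors can dump readings as JSON with `sensors -j` (needs lm-sensors 3.5+), and Ollama already attaches token stats to every response, so a token rate falls straight out of `eval_count` and `eval_duration`. A rough sketch, with "llama3" again a placeholder:

```python
# Rough sketch of the Python glue: CPU temps from lm-sensors' JSON output,
# token rate from the stats Ollama attaches to every response. Assumes
# `sensors -j` works (lm-sensors 3.5+) and Ollama on its default port.
import json
import subprocess
import requests

def cpu_temps():
    """Return {chip/label: degrees_C} for every temp*_input sensors reports."""
    raw = subprocess.run(["sensors", "-j"], capture_output=True, text=True, check=True)
    temps = {}
    for chip, features in json.loads(raw.stdout).items():
        if not isinstance(features, dict):
            continue
        for label, readings in features.items():
            if not isinstance(readings, dict):
                continue
            for key, value in readings.items():
                if key.startswith("temp") and key.endswith("_input"):
                    temps[f"{chip}/{label}"] = value
    return temps

def tokens_per_second(model, prompt):
    """Ollama reports eval_count (tokens) and eval_duration (nanoseconds)."""
    stats = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=600,
    ).json()
    return stats["eval_count"] / (stats["eval_duration"] / 1e9)

if __name__ == "__main__":
    print(cpu_temps())
    print(f"{tokens_per_second('llama3', 'Say hi.'):.1f} tok/s")
```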
Now I can sit there and watch my models run in real time. Token rate, memory, core temps — all on a dashboard.
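The dashboard is honestly the least magic part. Dash has an `Interval` component that re-fires a callback every couple of seconds, and that's the whole trick. A stripped-down sketch, reusing a compact version of the lm-sensors helper from above:

```python
# A bare-bones live dashboard: Dash's Interval component re-fires a
# callback every two seconds, which re-reads the sensors and redraws.
# Sketch only -- no styling, one graph, hottest sensor wins.
import collections
import json
import subprocess
import time

import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.graph_objects as go

def cpu_temps():
    # Compact version of the lm-sensors helper from the glue snippet above.
    raw = subprocess.run(["sensors", "-j"], capture_output=True, text=True, check=True)
    return {
        f"{chip}/{label}": val
        for chip, feats in json.loads(raw.stdout).items() if isinstance(feats, dict)
        for label, readings in feats.items() if isinstance(readings, dict)
        for key, val in readings.items()
        if key.startswith("temp") and key.endswith("_input")
    }

history = collections.deque(maxlen=120)  # keep the last ~4 minutes of samples

app = dash.Dash(__name__)
app.layout = html.Div([
    html.H3("Local LLM monitor"),
    dcc.Graph(id="temp-graph"),
    dcc.Interval(id="tick", interval=2000),  # milliseconds between refreshes
])

@app.callback(Output("temp-graph", "figure"), Input("tick", "n_intervals"))
def refresh(_):
    temps = cpu_temps()
    history.append((time.time(), max(temps.values(), default=0.0)))
    xs, ys = zip(*history)
    fig = go.Figure(go.Scatter(x=list(xs), y=list(ys), mode="lines"))
    fig.update_layout(xaxis_title="unix time", yaxis_title="hottest sensor (°C)")
    return fig

if __name__ == "__main__":
    app.run(debug=True)  # then open http://127.0.0.1:8050
```

Polling is crude, but for temperatures and token rates a two-second refresh is plenty, and it keeps the whole thing to one file.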
Feels good having AI running offline. No cloud, no weird latency, just my machine. And a bunch of scripts I broke and fixed along the way.
If you're thinking about trying local AI, go for it. Just know you'll end up tinkering way more than you expect. Worth it though.