DEV Community

Solve Computer Science

Posted on • Originally published at youtube.com on

Ollama CPU: Model Manager Script & Inference with a Terminal UI

DuckDuckGo AI Chat lets you talk with AI models such as ChatGPT; however, it recently started to enforce per-IP limits on the number of daily interactions. This prompted me to look into local AI models to run with Ollama. The ones that have worked best so far (CPU only) are the 14-billion-parameter Qwen 2.5 Coder models; I specifically tried the variants with the least quantization. Smaller models with fewer parameters, such as the 1.5b ones, replied to me in Chinese, so they are not usable.
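For reference, downloading and smoke-testing one of these models from the terminal looks like this (the exact tag is an assumption; check the Ollama library for the quantization you want):

```shell
# Download the 14b Qwen 2.5 Coder model (tag is an assumption; the
# least-quantized variants carry explicit tags like :14b-instruct-q8_0).
ollama pull qwen2.5-coder:14b

# Quick smoke test: ask a coding question directly from the terminal.
ollama run qwen2.5-coder:14b "Write a minimal FastAPI hello-world app"
```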

To improve inference quality I wrote a custom Modelfile, specifically for coding assistance with Python, FastAPI and Svelte. These are by far the best Ollama models I have tried.
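A minimal Modelfile along those lines might look like this; the system prompt and parameter values below are illustrative, not the exact ones from the video:

```
# Base model: the 14b Qwen 2.5 Coder variant mentioned above.
FROM qwen2.5-coder:14b

# Lower temperature for more deterministic code output (value is an example).
PARAMETER temperature 0.2
# Larger context window so whole source files fit (value is an example).
PARAMETER num_ctx 8192

SYSTEM """You are a coding assistant specialized in Python, FastAPI and Svelte.
Prefer concise, idiomatic code and briefly explain non-obvious choices."""
```

You register it with `ollama create mycoder -f Modelfile` and then run it like any other model with `ollama run mycoder`.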

I’ve written a custom whiptail script to simplify Ollama administration via docker-compose. I can now list and upgrade all models, add (customize) new ones, as well as simply download and delete existing ones, all via a simple keyboard interface.
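A sketch of that kind of wrapper is below. This is not the actual script, just a minimal illustration of the pattern: a whiptail menu that maps each choice to an `ollama` subcommand executed inside the compose service. The service name `ollama`, the script filename, and the Modelfile path are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of a whiptail menu wrapping Ollama in docker-compose.
# Assumes the Ollama container is the "ollama" service in docker-compose.yml.

# Map a menu choice (plus optional model name) to an ollama subcommand.
action_to_cmd() {
  case "$1" in
    list)   echo "list" ;;
    pull)   echo "pull $2" ;;
    rm)     echo "rm $2" ;;
    create) echo "create $2 -f /root/Modelfile" ;;  # path is an assumption
    *)      return 1 ;;
  esac
}

main() {
  choice=$(whiptail --title "Ollama manager" --menu "Pick an action" 15 50 4 \
    list   "List installed models" \
    pull   "Download a model" \
    create "Build a model from a Modelfile" \
    rm     "Delete a model" \
    3>&1 1>&2 2>&3) || exit 1

  model=""
  case "$choice" in
    pull|rm|create)
      model=$(whiptail --inputbox "Model name:" 8 40 3>&1 1>&2 2>&3) || exit 1 ;;
  esac

  # shellcheck disable=SC2046  # word splitting of the subcommand is intended
  docker compose exec ollama ollama $(action_to_cmd "$choice" "$model")
}

# Only launch the UI when executed directly (not when sourced for testing).
if [ "${0##*/}" = "ollama-manager.sh" ]; then
  main
fi
```

Keeping the choice-to-command mapping in its own function makes the menu logic trivial to test without a terminal or a running Docker daemon.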

Instead of relying on Alpaca, this time I used a terminal UI called tgpt. The software is similar to ShellGPT, but it seems to work better: it’s fast and reliable, and the multi-line input is really convenient. You can use tgpt with other AI systems besides Ollama.
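Typical invocations look roughly like this; the flag names are assumptions based on tgpt’s help output, so verify them with `tgpt -h` for your version:

```shell
# One-shot question against a local Ollama model
# (--provider and --model flag names are assumptions).
tgpt --provider ollama --model "qwen2.5-coder:14b" "Explain Python decorators"

# Multi-line input mode, handy for pasting whole code snippets.
tgpt --provider ollama --model "qwen2.5-coder:14b" -m
```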

All the links to the code and resources are in the video description, as usual.

Note: post auto-generated from YouTube feeds.
