Everyone wants their own personal, customized, always ready-to-go assistant, like a desktop Jarvis. Or at least, I've always wanted one.
That's why I have spent the past few days building an early version of a terminal assistant powered by open-source LLMs hosted locally.
The objective is simple: a locally hosted LLM client that runs in a CLI, can answer system-specific questions, and has access to my environment.
To get it done, I identified the steps I had to work through:
- Get the Ollama server running locally ✅
- Prepare the program to run on a dedicated terminal session ✅
- Program the llm client and the interactions between it and the tools ✅
Disclaimer: This is the first kind-of-AI-agent software I have developed, so feel free to correct me if something could have been done in a better way 😶
First step: Ollama server 🦙
To set up the Ollama server, I just had to download Ollama from the web and get it running locally with the command ollama serve.
The other obvious requirement for running an LLM is downloading the model I'm going to use, with the command ollama pull followed by the model name.
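As a quick sanity check (not part of the post's own code), here's a minimal Python sketch that pings the local Ollama server and lists the models that have already been pulled. It assumes Ollama's default address, http://localhost:11434, and its /api/tags endpoint:

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # default Ollama server address

def list_local_models() -> list[str]:
    """Return the names of models already pulled into the local Ollama server."""
    # /api/tags is Ollama's REST endpoint for listing locally available models
    response = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
    response.raise_for_status()
    return [model["name"] for model in response.json().get("models", [])]

if __name__ == "__main__":
    try:
        print("Models available locally:", list_local_models())
    except requests.ConnectionError:
        print("Ollama server is not running - start it with `ollama serve`.")
```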
For my terminal assistant I had to try different models until I found the right balance between response quality, tool-calling capability and performance.
The models that I tried are the following:
- gemma3:1b
- gemma3:4b
- mistral:7b
- llama3.1:8b
- qwen3:8b
- qwen3:14b
And from that list, the two that gave the best results were the qwen3 models, at 8B and 14B parameters.
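If you want to run a similar comparison yourself, a loop like the following sketch works. This is not the benchmark I actually used, just a possible illustration with the official ollama Python package and one shared prompt:

```python
import ollama  # official Ollama Python client (pip install ollama)

CANDIDATES = ["gemma3:1b", "gemma3:4b", "mistral:7b",
              "llama3.1:8b", "qwen3:8b", "qwen3:14b"]
PROMPT = "Explain how you would find the three largest files in a Linux home directory."

for model in CANDIDATES:
    # Each model answers the same prompt so the responses are comparable
    reply = ollama.chat(model=model, messages=[{"role": "user", "content": PROMPT}])
    print(f"--- {model} ---")
    print(reply["message"]["content"][:300], "\n")
```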
App development: terminal session
The assistant is designed to run inside a CLI session (specifically KDE's terminal emulator, Konsole).
To do this, whenever the Python script is executed, it relaunches itself inside a Konsole window with the command konsole -e python3 <script_path> --interactive, where script_path is the absolute path to the assistant's source code and --interactive is the flag we use to run the interactive instance.
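A minimal sketch of that relaunch logic could look like the snippet below; the argument handling and the run_assistant entry point are my own illustration, not the project's exact code:

```python
import subprocess
import sys
from pathlib import Path

SCRIPT_PATH = Path(__file__).resolve()  # absolute path to this script

def main() -> None:
    if "--interactive" not in sys.argv:
        # First invocation: spawn a dedicated Konsole window running this
        # same script with the --interactive flag, then exit.
        subprocess.Popen(
            ["konsole", "-e", "python3", str(SCRIPT_PATH), "--interactive"]
        )
        return
    # Second invocation (now inside Konsole): run the assistant's chat loop.
    run_assistant()  # hypothetical entry point for the actual assistant

if __name__ == "__main__":
    main()
```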
App development: LLM client
The LLM client is the core of the application. It is designed as a class whose instances hold parameters such as the model name, the URL of the local Ollama server and other configuration.
The class also exposes a set of tools that the LLM can call when the user wants the assistant to interact with the system or fetch useful information from the web.
The output of each message is yielded in chunks from the generator function get_response_stream, which makes the response feel faster and nicer to read.
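Here's a rough sketch of how such a client could be wired up with LangChain's Ollama integration. The class layout, the example tool and the message handling are my assumptions about the design, not the project's actual code:

```python
import platform
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_ollama import ChatOllama

@tool
def system_info() -> str:
    """Return basic information about the host system."""
    return f"{platform.system()} {platform.release()} on {platform.machine()}"

class AssistantClient:
    def __init__(self, model: str = "qwen3:8b",
                 base_url: str = "http://localhost:11434"):
        self.tools = [system_info]
        # bind_tools lets the model emit structured tool calls
        # (dispatching those calls back to the functions is omitted here)
        self.llm = ChatOllama(model=model, base_url=base_url).bind_tools(self.tools)

    def get_response_stream(self, prompt: str):
        """Yield the model's answer chunk by chunk for a snappier feel."""
        for chunk in self.llm.stream([HumanMessage(content=prompt)]):
            if chunk.content:
                yield chunk.content

if __name__ == "__main__":
    client = AssistantClient()
    for piece in client.get_response_stream("What OS am I running?"):
        print(piece, end="", flush=True)
```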
Demo snapshot 👀
Here's a demonstration of what the current version looks like:
Learnings through the development 🧠
While working on this project I had to investigate many ways to implement tool use with an Ollama model, and I have become more comfortable with the LangChain framework, which I think is a really nice takeaway.
Features to improve or add 🛠️
- Memory between sessions
- More interaction with the system
- Compatibility with other operating systems
> Did my project catch your attention?
If so, you can check it out on my GitHub: https://github.com/nairec/compy
(Your star could be the first one that I get ever ⭐)
That's all! Have you ever considered having your own intelligent assistant just a command away? I would love to hear your thoughts, or your opinion on the project!