One interesting use case for local LLMs is a natural language interface for generating terminal commands. Instead of memorizing command syntax, flags and shell quirks, you can simply describe what you want to do, for example:
Find all PNG files larger than 5 MB and move them to archive
The tool uses an LLM to translate that request into a real shell command and then executes it.
In this article, I will build a minimal implementation of such a tool in Python - a lightweight command-line assistant called piko::ai. The entire Python code fits in fewer than 40 lines, yet it provides:
- natural language to bash conversion
- configurable LLM backends
- structured JSON responses
- dangerous command detection
- and automatic command execution
The implementation uses a local LLM through Ollama, but the architecture is flexible enough to support almost any provider.
Why build a tool like this?
Shell commands are incredibly powerful, but for complex or chained/piped cases they are also difficult to remember and prone to human error. Even experienced developers regularly search for things like rarely used command switches.
Large language models are surprisingly good at translating intent into shell syntax, so the goal of this tool is to let the user think in actions instead of commands. For example:
find all jpg files modified in the last 7 days and compress them into images.tar.gz
should generate:
find . -type f -name "*.jpg" -mtime -7 | tar -czf images.tar.gz -T -
For simple commands there is no justification for writing long sentences, but as soon as you start searching the internet or asking AI:
- is it -mtime or -dtime?
- was it -czf or -xvf for the archive creation?
- should there be -T or --files-from, or are both fine?
it may actually be quicker to just type the request directly in the terminal and get the output there as well.
The architecture
The entire flow is extremely simple:
- User types a natural language request.
- Python inserts that request into a prompt template.
- The prompt is sent to an LLM.
- The LLM returns JSON containing a shell command.
- Python checks whether the command is potentially dangerous.
- If approved, the command is executed.
If you have never written stuff like that, you will be surprised how little code is actually required.
The Python implementation
For TL;DR readers, the complete implementation is below. If you want to try it out, you can clone the piko_ai GitHub repository. The sections that follow explain what exactly this code does:
import os
import sys
import requests
import json
import subprocess

MAIN_FILE_PATH: str = os.path.join(os.path.dirname(os.path.abspath(__file__)))
PIKO_AI_CONFIG_FILE_PATH: str = os.path.join(MAIN_FILE_PATH, "..", "config", "pai_config.json")
PIKO_AI_PROMPT_FILE_PATH: str = os.path.join(MAIN_FILE_PATH, "..", "config", "pai_prompt.txt")

with open(PIKO_AI_CONFIG_FILE_PATH) as file:
    config = json.load(file)
with open(PIKO_AI_PROMPT_FILE_PATH) as file:
    prompt_template: str = file.read()

user_request: str = " ".join(sys.argv[1:])
prompt: str = prompt_template.replace("@USER_REQUEST@", user_request)
request_payload: dict = config["llm_request"]
request_payload["prompt"] = prompt

response = requests.post(config["llm_provider_url"], json=request_payload, timeout=config["llm_request_timeout"])
response.raise_for_status()
command: str = json.loads(response.json()["response"])["command"]
print(f"$ {command}")

for dangerous_command in config["dangerous_commands"]:
    if dangerous_command in command:
        user_response: str = input(f"WARNING: dangerous command detected ({dangerous_command})! Are you sure you want to run it? (y/n)")
        if user_response == "y":
            break
        sys.exit(0)

subprocess.run(command, shell=True)
Let’s break it down piece by piece.
Step 0: set up constants
import os # for joining paths
import sys # for command line arguments and exit
import requests # for sending requests to LLM
import json # for JSON parsing
import subprocess # for executing generated commands
# path to this file which serves as the reference for further paths
MAIN_FILE_PATH: str = os.path.join(os.path.dirname(os.path.abspath(__file__)))
# path to configuration file
PIKO_AI_CONFIG_FILE_PATH: str = os.path.join(MAIN_FILE_PATH, "..", "config", "pai_config.json")
# path to file with the prompt template
PIKO_AI_PROMPT_FILE_PATH: str = os.path.join(MAIN_FILE_PATH, "..", "config", "pai_prompt.txt")
Step 1: loading configuration and prompt files
with open(PIKO_AI_CONFIG_FILE_PATH) as file:
    config = json.load(file)
with open(PIKO_AI_PROMPT_FILE_PATH) as file:
    prompt_template: str = file.read()
Even in the smallest tools, I always like to separate:
- application logic
- configuration
- prompts
This separation makes the tool flexible enough to support:
- Ollama
- OpenAI-compatible APIs
- local inference servers
- cloud providers
- completely custom backends
For this example, however, I will only use a local Ollama model.
Step 2: reading the user request
The request is simply reconstructed from the command-line arguments. For more complex command-line interfaces argparse could come in handy, but for this tool a one-liner is enough to join all arguments into a single string, without forcing the user to wrap the request in quotes.
user_request: str = " ".join(sys.argv[1:])
So:
pai find all files larger than 55kB
becomes:
"find all files larger than 55kB"
That text is then injected into the prompt template and assigned to the prompt field of request_payload.
prompt: str = prompt_template.replace("@USER_REQUEST@", user_request)
request_payload: dict = config["llm_request"]
request_payload["prompt"] = prompt
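As a small illustration of the two steps above, the sketch below wraps the argument joining and the placeholder substitution into one function. The function name build_prompt and the shortened template are assumptions made for this example; only the @USER_REQUEST@ placeholder and the join over the arguments come from the code above.

```python
def build_prompt(prompt_template: str, argv: list[str]) -> str:
    """Join CLI arguments into one request and inject it into the template."""
    # argv[0] is the script path, so the request starts at index 1
    user_request = " ".join(argv[1:])
    return prompt_template.replace("@USER_REQUEST@", user_request)


# Shortened, hypothetical template used only for this illustration
template = "You are a shell command generator.\nUser request:\n@USER_REQUEST@"
prompt = build_prompt(template, ["pai", "find", "all", "files", "larger", "than", "55kB"])
print(prompt)
```

Keep in mind that the shell processes the arguments before Python sees them, so characters like * or | in a request still need quoting or escaping on the command line.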
Step 3: prompt engineering
The prompt template looks like this:
You are a shell command generator.
User requests from you a bash command in a human readable language and your job is to convert this request into a bash command that the user can immediately invoke.
Requirements:
- If not stated differently, always assume that the command must be executed in the current working directory (.)
- The command must be syntactically valid
- The command must be fully executable
- Prefer a single grep command over pipelines when possible
- Command must be usable out of the box, so don't provide any mock values, but e.g. when the user says "here", it means "."
User request:
@USER_REQUEST@
This prompt emphasizes several important aspects.
Establishing assumptions
This line:
always assume that the command must be executed in the current working directory
greatly improves usability with small models, because users naturally say things like:
find here all markdown files
instead of:
find all markdown files in /home/user/Documents/applications/private_projects/docs
Biasing command style
This instruction:
Prefer a single grep command over pipelines when possible
helps shape output quality. Small LLMs often overcomplicate shell commands and at the same time (because of their size), make mistakes in these overcomplicated commands. Prompt constraints can push them toward simpler solutions.
Preventing placeholders
Without instructions like:
don't provide any mock values
models often generate unusable outputs like:
grep -r "keyword" /path/to/directory
instead of:
grep -r "keyword" .
The prompt explicitly forces executable commands and not suggestions.
Step 4: sending the request to the LLM
response = requests.post(
    config["llm_provider_url"],
    json=request_payload,
    timeout=config["llm_request_timeout"]
)
The configuration file defines the request payload and the output schema.
{
    "llm_request": {
        "model": "qwen2.5-coder:1.5b",
        "format": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string"
                }
            },
            "required": ["command"]
        },
        "prompt": "TO BE REPLACED BY THE ACTUAL PROMPT",
        "options": {
            "temperature": 0.1,
            "seed": 42
        },
        "stream": false
    },
    "llm_provider_url": "http://localhost:11434/api/generate",
    "model_name": "qwen2.5-coder:1.5b",
    "llm_request_timeout": 60
}
Several details here are especially important.
Structured JSON output
This is one of the most important implementation details. Instead of asking the model for plain text, the tool requests structured JSON like the one below:
{
    "command": "grep -r \"TODO\" ."
}
Without structured output, models may generate additional explanations or markdown formatting. Structured generation constrains the model into machine-readable output.
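One detail that is easy to miss: the /api/generate endpoint wraps the model output in its own JSON envelope, and the generated text sits in the response field as a string. With structured output that string is itself a JSON document, which is why the implementation decodes twice. The sketch below shows this on a hand-written response body; the sample body and the extract_command name are assumptions for illustration, and no server is contacted.

```python
import json


def extract_command(response_body: dict) -> str:
    """Pull the shell command out of an /api/generate-style response body."""
    # First level: the HTTP body, already decoded (e.g. by response.json())
    generated_text = response_body["response"]
    # Second level: the model output, which is a JSON document inside a string
    generated = json.loads(generated_text)
    command = generated["command"]
    if not isinstance(command, str) or not command.strip():
        raise ValueError("model returned an empty or non-string command")
    return command


# Hand-written sample body; a real response contains more fields
sample_body = {"response": json.dumps({"command": 'grep -r "TODO" .'}), "done": True}
print(extract_command(sample_body))
```

The extra emptiness check is optional, but it turns a silent no-op into a clear error when the model returns a blank command.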
Low temperature and a fixed seed
"temperature": 0.1,
"seed": 42
Command generation is not creative writing, so we want outputs that are as deterministic, predictable and reproducible as possible. A low temperature and a fixed seed make the model far more likely to return the same command for the same request. That repeatability lets the user learn the tool: even if it misbehaves for some phrasings and works well for others, the user can consistently refine how they phrase requests. With unpredictable, non-reproducible outputs, the user would never be able to tune the inputs to get the expected outputs.
Why qwen2.5-coder:1.5b?
Such a tool is supposed to be a reasonable alternative to recalling the command yourself or searching for it and then invoking it. If processing a request took several minutes, any other approach (googling, copying, asking an AI chatbot etc.) would end up being faster than using this tool. I needed to select something small so that it runs fast locally. Fortunately, the task is highly specialized, so even small code-focused models are capable of handling it.
Step 5: parsing the model response
command: str = json.loads(response.json()["response"])["command"]
The returned command is the bash command extracted from the generated JSON. Then the tool prints it:
print(f"$ {command}")
Users should always see exactly what will be executed.
Step 6: dangerous command detection
This may actually be the most important part of the implementation. The tool scans the generated command for operations defined as dangerous in the config file:
for dangerous_command in config["dangerous_commands"]:
    if dangerous_command in command:
        user_response: str = input(f"WARNING: dangerous command detected ({dangerous_command})! Are you sure you want to run it? (y/n)")
        if user_response == "y":
            break
        sys.exit(0)
After a dangerous operation is detected, the command execution requires explicit confirmation from the user. Here, by "dangerous" I mean any command that actually modifies user data.
Warning: this implementation intentionally favors simplicity over perfect security. The current check is only substring matching, so it may overreact, e.g. if a folder name contains the letters "rm" next to each other.
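One possible way to reduce such false positives is to compare whole tokens instead of substrings. The sketch below is not what piko::ai does: the DANGEROUS_COMMANDS set and the is_dangerous name are hypothetical (the config listing in this article does not show the contents of dangerous_commands), and the check remains a heuristic rather than a security boundary.

```python
import os
import shlex

# Hypothetical list; the real one lives under "dangerous_commands" in the config
DANGEROUS_COMMANDS = {"rm", "mv", "dd", "chmod"}


def is_dangerous(command: str) -> bool:
    """Return True if any whitespace-separated token is a dangerous command name."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: the command cannot be analyzed, so be cautious
        return True
    # basename lets "/bin/rm" match "rm" while "./farm" stays harmless
    return any(os.path.basename(token) in DANGEROUS_COMMANDS for token in tokens)


print(is_dangerous("ls ./farm"))                        # contains "rm" only as a substring
print(is_dangerous("find . -name '*.tmp' | xargs rm"))  # rm appears as its own token
```

This still has blind spots: commands chained without spaces (for example a;rm b) are not split apart, and a file that is literally named rm would be flagged.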
Step 7: executing the command
Finally:
subprocess.run(command, shell=True)
executes the generated shell command. This is where the tool goes from an AI suggestion to an actual tool, because the generated command is not only displayed but actually run.
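A small optional refinement, sketched below under the assumption that the tool should behave like the command it ran: subprocess.run returns a CompletedProcess object whose returncode can be passed on as the tool's own exit status. The run_generated_command name is hypothetical and not part of the original script.

```python
import subprocess


def run_generated_command(command: str) -> int:
    """Run the command through the shell and return its exit status."""
    # shell=True is needed because generated commands may use pipes and globs
    result = subprocess.run(command, shell=True)
    return result.returncode


exit_code = run_generated_command("exit 3")
print(f"command finished with exit code {exit_code}")
```

In the script itself, ending with sys.exit(run_generated_command(command)) would let the tool take part in shell chains such as pai ... && next_step.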
Example usage
Command:
pai find all files larger than 1kB
Output:
$ find . -type f -size +1k
./main.py
Command:
pai list all .py files with find, but filter out findings from ./venv folder
Output:
$ find . -name '*.py' ! -path './venv/*'
./src/main.py
Command:
pai grep for all usages of MAIN_FILE_PATH
Output:
$ grep -r 'MAIN_FILE_PATH' .
./src/main.py:MAIN_FILE_PATH: str = os.path.join(os.path.dirname(os.path.abspath(__file__)))
./src/main.py:PIKO_AI_CONFIG_FILE_PATH: str = os.path.join(MAIN_FILE_PATH, "..", "config", "pai_config.json")
./src/main.py:PIKO_AI_PROMPT_FILE_PATH: str = os.path.join(MAIN_FILE_PATH, "..", "config", "pai_prompt.txt")
./scripts/install.sh:MAIN_FILE_PATH="$PIKO_AI_DIR/src/main.py"
./scripts/install.sh:ALIAS_LINE="alias pai='$VENV_DIR/bin/python3 $MAIN_FILE_PATH'"