Alain Airom

Posted on Oct 3

🪨 ‘Granite 4’ Meets ‘Kubectl-AI’: Building Your Own Local, Executable Kubernetes Assistant

#granite #google #kubernetes #ollama

Stop Googling Commands: A Personal Kubernetes AI Shell Powered by IBM Granite 4 and Google’s kubectl-ai

Introduction

What is the latest IBM Granite 4 Language Model

The IBM Granite 4.0 series represents the latest generation of large language models (LLMs) from IBM, specifically engineered for enterprise applications where efficiency and cost-effectiveness are paramount. Leveraging a novel hybrid Mamba-2/Transformer architecture, Granite 4.0 models drastically reduce memory requirements — often by over 70% compared to conventional LLMs — while maintaining competitive performance, particularly in complex tasks like Retrieval Augmented Generation (RAG) and tool-calling for agentic workflows. Available in various sizes, including Tiny and Micro variants optimized for local deployment and low-latency edge applications (such as the integration with Ollama), the Granite series is released under the Apache 2.0 license, emphasizing transparency, customizability, and flexible deployment across diverse infrastructure.
...
The Granite 4.0 collection comprises multiple model sizes and architecture styles to provide optimal production across a wide array of hardware constraints, including:

Granite-4.0-H-Small, a hybrid mixture of experts (MoE) model with 32B total parameters (9B active)
Granite-4.0-H-Tiny, a hybrid MoE with 7B total parameters (1B active)
Granite-4.0-H-Micro, a dense hybrid model with 3B parameters.
This release also includes Granite-4.0-Micro, a 3B dense model with a conventional attention-driven transformer architecture, to accommodate platforms and communities that do not yet support hybrid architectures.

Granite 4.0-H Small is a workhorse model for strong, cost-effective performance on enterprise workflows like multi-tool agents and customer support automation. The Tiny and Micro models are designed for low latency, edge and local applications, and can also serve as a building block within larger agentic workflows for fast execution of key tasks such as function calling.

Granite 4.0 benchmark performance shows substantial improvements over prior generations — even the smallest Granite 4.0 models significantly outperform Granite 3.3 8B, despite being less than half its size — but their most notable strength is a remarkable increase in inference efficiency. Relative to conventional LLMs, our hybrid Granite 4.0 models require significantly less RAM to run, especially for tasks involving long context lengths (like ingesting a large codebase or extensive documentation) and multiple sessions at the same time (like a customer service agent handling many detailed user inquiries simultaneously).

Most importantly, this dramatic reduction in Granite 4.0’s memory requirements entails a similarly dramatic reduction in the cost of hardware needed to run heavy workloads at high inference speeds. Our aim is to lower barriers to entry by providing enterprises and open-source developers alike with cost-effective access to highly competitive LLMs.

What is Google’s kubectl-ai?

Google’s kubectl-ai is an open-source, AI-powered command-line tool designed to act as an intelligent assistant **for managing Kubernetes clusters, effectively bridging the gap between human language and complex Kubernetes operations. Its core function is to take user requests articulated in **plain English (e.g., "Scale the deployment named 'web-api' to 5 replicas" or "Show me all pods in CrashLoopBackOff") and translate them into the precise, required kubectl commands or YAML configuration files. The tool significantly boosts productivity by eliminating the need for users to memorize intricate command flags, complex YAML syntax, or constant documentation lookups. Furthermore, kubectl-ai offers an interactive mode for conversational troubleshooting, supports multiple LLM providers (including Gemini, OpenAI, and local models via Ollama), and, most importantly, operates with a safety-first approach by asking for user confirmation before executing any command that modifies resources on the cluster.

There are several ways to install and use this tool. I chose a manual installation as described on the GitHub’s repository.

tar -zxvf kubectl-ai_Darwin_arm64.tar.gz
chmod a+x kubectl-ai
sudo mv kubectl-ai /usr/local/bin/

You need to have a Gemini API Key (which I use in the sample applications I made using Ollama locally).

export GEMINI_API_KEY=your_api_key_here

When I first got knowledge of kubectl-ai, it immediately struck me as an incredibly valuable asset for anyone working with Kubernetes. The traditional routine of constantly Googling command syntax, sifting through cheatsheets, or poring over documentation for specific kubectl operations became obsolete. This tool provides a far more efficient and intuitive way to interact with my clusters, effectively putting a knowledgeable Kubernetes expert right at my fingertips, ready to translate my natural language requests into precise, executable commands without any friction.

This gave me the idea to make the best of the tools I could find till now, a personal K8S assistant. ✌️

Personal Assistant Code

The core concept driving this application is to create a seamless, intuitive interface for managing Kubernetes. Instead of wrestling with complex command-line syntax or memorizing endless kubectl flags, the goal is to leverage the power of generative AI. This allows users to simply ask questions about Kubernetes commands in natural language, describing their intent in plain English (or other lanaguages… not tested yet), and have the AI translate those requests into the exact, executable commands needed to interact with their clusters. It transforms the often-steep learning curve of Kubernetes into a conversational experience, making cluster management more accessible and efficient.

Description of my environment

My local development environment is configured to provide a robust and flexible platform for containerization and Kubernetes cluster management. Specifically, I utilize Podman Desktop to offer an intuitive graphical interface for easily managing containers, images, and volumes. Underpinning this, the Podman engine handles the actual container runtime operations, ensuring efficient and daemonless management of my containerized workloads. For my Kubernetes needs, a local Minikube cluster serves as my development sandbox, allowing me to deploy, test, and experiment with Kubernetes applications without the overhead of a full cloud-based cluster, all within a seamlessly integrated local ecosystem.

In order to use Minikube alongside with Podman, you’ll need to do;

minikube start --driver=podman

Also in order to have a minimal application deployment on my Kubernetes cluster, I deployed an nginx server. The purpose of this deployment is/was just to be able to run some realistic commands in my environment.

Environment setup

I begin describing the steps to install nginx on top of the cluster and the I’ll jump into the application.

Pull the nginx image

podman pull nginx:latest
# or
docker pull nginx:latest

####
podman pull nginx:latest
Trying to pull docker.io/library/nginx:latest...
Getting image source signatures
Copying blob sha256:08b278be6f74e8daedcf9071f2b01c74ea2ebed0d2b4e0a0d87f63365fadffdf
Copying blob sha256:e2a2ff429ed911917cc7541492e92b083056c251a396562117f3259ea24ce388
Copying blob sha256:f4e51325a7cb57cd9ae67bd9540483838b96bf7c9b0bf18205d9d30819e9ca38
Copying blob sha256:4c5dff34614b43f1e7186ffe85a6de5af902a09c2231b06855a75314dbebb582
Copying blob sha256:f0838cc8bade63262b7ec7e9540ecd55f5f254e0bac6020f3362a811fa90bbf3
Copying blob sha256:bb2e7b5dc9bcdbfb0e8119c35fd33fb8580c93813561ebcdea708b98e82f42c7
Copying blob sha256:5f648e62e9ca7ad38d7eedfda371f0237f5114f7ad9998c3f1179a8ea7cec1ba
Copying config sha256:0777d15d89ecedd8739877d62d8983e9f4b081efa23140db06299b0abe7a985b
Writing manifest to image destination
0777d15d89ecedd8739877d62d8983e9f4b081efa23140db06299b0abe7a985b

Minikue — nginx setup ⚙️

kubectl create namespace nginx-space

apiVersion: v1
kind: Namespace
metadata:
  name: nginx-space

kubectl apply -f namespace.yaml

# --- Nginx Deployment ---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-space # Specify the target namespace
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
# --- Nginx Service (NodePort to expose externally) ---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: nginx-space # Specify the target namespace
  labels:
    app: nginx
spec:
  type: NodePort # Use NodePort for easy Minikube access
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80       # Port the service exposes
      targetPort: 80 # Port the container is listening on (Nginx default)
      # You can optionally specify a nodePort, e.g., nodePort: 30080

kubectl apply -f nginx-deployment-service.yaml

Verify that nginx works properly ✅

# Check the Deployment status in the namespace
kubectl get deployments -n nginx-space

# Check the Pods status in the namespace
kubectl get pods -n nginx-space

# Check the Service status in the namespace and note the port
kubectl get services -n nginx-space

Scale your nginx sever for instance;

# Scale to 2 replicas
kubectl scale deployment nginx-deployment -n nginx-space --replicas=2

# and scale down
kubectl scale deployment nginx-deployment -n nginx-space --replicas=0

Access your server!

minikube service nginx-service -n nginx-space --url

Also with the Podman desktop you can verify your server.

Application Implementation

Given that the Granite 4 series was just released, I immediately opted to integrate this cutting-edge model into my application. Granite 4, the latest iteration of IBM’s Granite models, stands out for its exceptional efficiency and economic performance, making it an ideal choice for local deployments. Its availability on platforms like Ollama and Hugging Face was a crucial factor, enabling me to seamlessly incorporate it with my local Ollama instance, thus powering my Kubernetes assistant with a highly capable and resource-friendly generative AI.

Let’s jump into the application. As usual prepare your environment and install all the requirements.

python3 -m venv venv
source venv/bin/activate

pip install --upgrade pip

pip install ollama

Pull the model from Ollama site;

ollama run granite4

And implement the code 📟

import tkinter as tk
from tkinter import ttk, scrolledtext
import subprocess
import threading
import ollama
from ollama import Client

OLLAMA_MODEL = "granite4:latest" # Ensure this model is pulled on your Ollama server
OLLAMA_HOST = "http://localhost:11434" # Default Ollama host
SYSTEM_PROMPT = """
You are an expert Kubernetes assistant. Your sole task is to translate natural language user requests into a single, executable 'kubectl' or 'docker' command.

RULES:
1. DO NOT include any explanatory text, markdown formatting, or notes, only the command itself.
2. The command MUST start with 'kubectl' or 'docker'.
3. Always respond with the command immediately.
"""

class KubectlApp(tk.Tk):
    """
    A desktop application using Tkinter to generate and locally execute
    kubectl/docker commands via Ollama.
    """
    def __init__(self):
        super().__init__()
        self.title("Kubectl AI Local Shell")
        self.geometry("800x600")

        # Initialize Ollama Client
        try:
            self.ollama_client = Client(host=OLLAMA_HOST)
            self.ollama_status = "Connected"
        except Exception:
            self.ollama_client = None
            self.ollama_status = "Connection Failed"

        # Current generated command storage
        self.generated_command = tk.StringVar(value="")

        self._setup_ui()
        self._set_status()

    def _setup_ui(self):
        """Sets up the main application layout and widgets."""


        self.grid_columnconfigure(0, weight=1)
        self.grid_rowconfigure(2, weight=1) # The result area gets priority height

        # --- Header Frame (Info) ---
        header_frame = ttk.Frame(self, padding="10")
        header_frame.grid(row=0, column=0, sticky="ew")

        ttk.Label(header_frame, text="💡 Kubectl AI Local Shell", font=("Arial", 16, "bold")).pack(side=tk.LEFT)
        self.status_label = ttk.Label(header_frame, text="", font=("Arial", 10))
        self.status_label.pack(side=tk.RIGHT)

        # --- Command Generation Frame (Input) ---
        input_frame = ttk.Frame(self, padding="10")
        input_frame.grid(row=1, column=0, sticky="ew")
        input_frame.grid_columnconfigure(1, weight=1)

        ttk.Label(input_frame, text="1. Natural Language Prompt:").grid(row=0, column=0, sticky="w", pady=5)
        self.prompt_entry = ttk.Entry(input_frame, width=80)
        self.prompt_entry.grid(row=0, column=1, sticky="ew", padx=(10, 10))
        self.prompt_entry.bind('<Return>', lambda event: self._start_generation_thread())

        self.generate_button = ttk.Button(input_frame, text="Generate Command", command=self._start_generation_thread, style="Accent.TButton")
        self.generate_button.grid(row=0, column=2, sticky="e")

        command_frame = ttk.Frame(self, padding="10")
        command_frame.grid(row=2, column=0, sticky="ew")
        command_frame.grid_columnconfigure(0, weight=1)

        ttk.Label(command_frame, text="2. Generated Command (Editable for safety):").grid(row=0, column=0, sticky="w", pady=5)

        self.command_entry = ttk.Entry(command_frame, textvariable=self.generated_command, font=("Courier", 11))
        self.command_entry.grid(row=1, column=0, sticky="ew", padx=(0, 10))

        self.execute_button = ttk.Button(command_frame, text="▶ Execute Locally", command=self._start_execution_thread, style="Execute.TButton")
        self.execute_button.grid(row=1, column=1, sticky="e")
        self.execute_button["state"] = "disabled"

        result_frame = ttk.Frame(self, padding="10")
        result_frame.grid(row=3, column=0, sticky="nsew")
        result_frame.grid_columnconfigure(0, weight=1)
        result_frame.grid_rowconfigure(1, weight=1)

        ttk.Label(result_frame, text="3. Local Execution Results:", font=("Arial", 11, "bold")).grid(row=0, column=0, sticky="w", pady=(10, 5))

        self.result_text = scrolledtext.ScrolledText(result_frame, wrap=tk.WORD, height=15, font=("Courier", 10), state=tk.DISABLED)
        self.result_text.grid(row=1, column=0, sticky="nsew")

        style = ttk.Style()
        style.theme_use('clam')
        style.configure("Accent.TButton", foreground="white", background="#4a90e2", font=("Arial", 10, "bold"))
        style.map("Accent.TButton", background=[("active", "#3a7bd5")])
        style.configure("Execute.TButton", foreground="white", background="#5cb85c", font=("Arial", 10, "bold"))
        style.map("Execute.TButton", background=[("active", "#4cae4c")])
        style.configure("Error.TLabel", foreground="red")
        style.configure("Success.TLabel", foreground="green")

    def _set_status(self):
        """Updates the status label in the header."""
        model_info = f"LLM: {OLLAMA_MODEL} | Status: {self.ollama_status}"
        if self.ollama_status == "Connected":
            self.status_label.config(text=model_info, style="Success.TLabel")
            self.generate_button["state"] = "normal"
        else:
            self.status_label.config(text=model_info, style="Error.TLabel")
            self.generate_button["state"] = "disabled"
            self.result_text.config(state=tk.NORMAL)
            self.result_text.delete(1.0, tk.END)
            self.result_text.insert(tk.END, "FATAL: Could not connect to Ollama. Please ensure it is running at http://localhost:11434.\n")
            self.result_text.config(state=tk.DISABLED)

    def _start_generation_thread(self):
        """Starts the LLM command generation in a separate thread."""
        prompt = self.prompt_entry.get().strip()
        if not prompt:
            return

        self._update_result("Generating command...", color="orange")
        self.generate_button["state"] = "disabled"
        self.execute_button["state"] = "disabled"

        thread = threading.Thread(target=self._generate_command, args=(prompt,))
        thread.start()

    def _start_execution_thread(self):
        """Starts the local command execution in a separate thread."""
        command = self.generated_command.get().strip()
        if not command:
            return

        self._update_result(f"Executing local command: {command}...", color="orange")
        self.execute_button["state"] = "disabled"
        self.generate_button["state"] = "disabled"

        thread = threading.Thread(target=self._execute_command, args=(command,))
        thread.start()

    def _generate_command(self, prompt):
        """Calls Ollama to generate the command."""
        try:
            messages_payload = [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": prompt}
            ]

            # Non-streaming call for simplicity in desktop app
            response = self.ollama_client.chat(
                model=OLLAMA_MODEL,
                messages=messages_payload,
                stream=False
            )

            full_response = response['message']['content']

            # Clean up the command
            final_command = full_response.strip().replace('`', '').replace('bash', '').replace('yaml', '')

            # Use root.after to safely update UI from the thread
            self.after(0, self._update_command_ui, final_command, "Command generated successfully.")

        except Exception as e:
            self.after(0, self._update_command_ui, "", f"Error during generation: {e}")

    def _execute_command(self, command):
        """Executes the command locally using subprocess."""
        try:
            # Use shell=True for path resolution and command chaining (like piping)
            # Use text=True to decode output as string
            result = subprocess.run(
                command, 
                shell=True, 
                capture_output=True, 
                text=True, 
                timeout=30 # Timeout in case command hangs
            )

            output = f"--- Command: {command} ---\n"

            if result.returncode == 0:
                output += f"Execution SUCCESS (Return Code: 0)\n\n{result.stdout}"
                color = "green"
            else:
                output += f"Execution FAILED (Return Code: {result.returncode})\n\n{result.stderr or result.stdout}"
                color = "red"

            self.after(0, self._update_result, output, color)

        except FileNotFoundError:
            self.after(0, self._update_result, f"Error: Command '{command.split(' ')[0]}' not found. Ensure 'kubectl' and 'docker' are in your system PATH.", "red")
        except subprocess.TimeoutExpired:
            self.after(0, self._update_result, f"Error: Command '{command}' timed out after 30 seconds.", "red")
        except Exception as e:
            self.after(0, self._update_result, f"An unexpected error occurred during execution: {e}", "red")

    # --- UI Update Helpers (must be called from the main thread via self.after) ---
    def _update_command_ui(self, command, status_message):
        """Updates the command entry field and status after generation."""
        self.generated_command.set(command)

        self.generate_button["state"] = "normal"
        self.execute_button["state"] = "normal"

        if "successfully" in status_message:
            self._update_result(f"Command Ready: {command}", color="green", append=False)
        else:
            self._update_result(status_message, color="red", append=False)


    def _update_result(self, text, color="black", append=True):
        """Safely updates the execution result text area."""
        self.result_text.config(state=tk.NORMAL)

        if not append:
            self.result_text.delete(1.0, tk.END)

        # Insert colored text
        self.result_text.tag_config(color, foreground=color)
        self.result_text.insert(tk.END, f"{text}\n\n", color)
        self.result_text.see(tk.END) # Scroll to bottom

        self.result_text.config(state=tk.DISABLED)

        # Re-enable buttons after execution/generation is done
        if not text.endswith("..."):
            self.execute_button["state"] = "normal"
            self.generate_button["state"] = "normal"


if __name__ == "__main__":
    app = KubectlApp()
    app.mainloop()

Then lauch the application and test it 😊

As we can see, not the only the couple “kubectl-ai/Granite4” can be used as a local assistant, but also we can execute and test the commands through the interface.

So now there we go with a nice little k8s helper 🏆

If you want to use other types of UI for Python applications, I also made a “Streamlit” version which for sure is not able to run the command lines locally (would necessite a lot’s of weird workarounds…), but mimics them!

pip install streamlit

The application’s code;

import streamlit as st
import ollama
from ollama import Client # Import Client for explicit host control
import sys # For checking Python version compatibility

# --- Configuration ---
OLLAMA_MODEL = "granite4:latest" # Ensure this model is pulled on your Ollama server
OLLAMA_HOST = "http://localhost:11434" # Default Ollama host

# System prompt to enforce the kubectl-ai behavior: outputting only the command.
SYSTEM_PROMPT = """
You are an expert Kubernetes assistant. Your sole task is to translate natural language user requests into a single, executable 'kubectl' command.

RULES:
1. DO NOT include any explanatory text, markdown formatting, or notes, only the command itself.
2. The command MUST start with 'kubectl'.
3. Always respond with the command immediately.
"""

# Initialize Ollama Client and cache the resource
@st.cache_resource
def get_ollama_client(host):
    """Initializes and returns the Ollama client."""
    try:
        # Basic check for Python version
        if sys.version_info < (3, 8):
            st.error("This application requires Python 3.8 or higher.")
            return None

        # Initialize the client with the specified host
        client = Client(host=host)
        return client
    except Exception as e:
        # Catch exceptions if the host is unreachable during initialization
        st.error(f"Error initializing Ollama client: {e}. Please ensure Ollama is running and accessible at {host}.")
        return None

client = get_ollama_client(OLLAMA_HOST)

# --- Streamlit UI Setup ---
# Changed layout to wide for better separation of sections
st.set_page_config(page_title="Kubectl AI Assistant", layout="wide") 

# --- State Management for Execution Simulation ---
if 'executor_command' not in st.session_state:
    st.session_state.executor_command = ""
if 'executor_output' not in st.session_state:
    st.session_state.executor_output = ""
if 'executor_status' not in st.session_state:
    st.session_state.executor_status = "info" # info, success, warning, error


# --- Function for Command Simulation ---
def run_simulation(cmd):
    """Simulates the execution of kubectl/docker commands."""
    cmd = cmd.strip()
    if not cmd:
        st.session_state.executor_output = "Error: No command entered."
        st.session_state.executor_status = "error"
        return

    # Simple simulation logic
    if cmd.startswith("kubectl get pods"):
        st.session_state.executor_output = (
            "NAME                           READY   STATUS    RESTARTS   AGE\n"
            "my-app-deployment-5c4d6f7b8-abcde   1/1     Running   0          5d\n"
            "nginx-deployment-6f5b9d4c7-vwxyz  1/1     Running   0          3d\n"
            "kube-dns-7b89d7d4c-qrst           3/3     Running   0          7d"
        )
        st.session_state.executor_status = "success"
    elif cmd.startswith("kubectl describe"):
        st.session_state.executor_output = "Simulating: Successfully fetched detailed description for the resource."
        st.session_state.executor_status = "success"
    elif cmd.startswith("kubectl delete"):
        st.session_state.executor_output = f"Simulating: resource '{cmd.split(' ')[-1]}' deleted"
        st.session_state.executor_status = "success"
    elif cmd.startswith("kubectl"):
        st.session_state.executor_output = f"Simulating: kubectl command executed successfully. Command: {cmd}"
        st.session_state.executor_status = "success"
    elif cmd.startswith("docker"):
         st.session_state.executor_output = f"Simulating: docker command executed successfully. Command: {cmd}\n(Assuming Docker is available locally.)"
         st.session_state.executor_status = "success"
    else:
        st.session_state.executor_output = f"Error: '{cmd.split(' ')[0]}' command not found or not supported in this simulation. Only 'kubectl' and 'docker' are simulated."
        st.session_state.executor_status = "error"


# --- Main UI Content ---
st.title("💡 Kubectl AI Assistant")
# Display model name and host prominently
st.caption(f"🤖 LLM Model: **{OLLAMA_MODEL}** | Ollama Host: {OLLAMA_HOST}")

if client is None:
    # Display a message if the client could not be initialized
    st.warning("Ollama connection failed to initialize. Check the console for errors and ensure the Ollama server is running locally.")
else:
    # Initialize chat history in Streamlit session state
    if "messages" not in st.session_state:
        # History stores display content (user query and formatted command block)
        st.session_state.messages = []

    # --- AI Chat Interface (Left Side) ---
    st.header("1️⃣ Generate Command")

    # Display previous chat messages
    for message in st.session_state.messages:
        with st.chat_message(message["role"]):
            st.markdown(message["content"], unsafe_allow_html=True)

    # Accept user input
    if prompt := st.chat_input("E.g., 'show all pods in the kube-system namespace'"):
        # 1. Add and display user message
        st.session_state.messages.append({"role": "user", "content": prompt})
        with st.chat_message("user"):
            st.markdown(prompt)

        # 2. Generate and display assistant response (the command)
        with st.chat_message("assistant"):
            # Use a placeholder for the streamed, raw text output
            message_placeholder = st.empty()
            full_response = ""

            # Construct the single-turn chat payload: System prompt + User query
            messages_payload = [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": prompt}
            ]

            try:
                response_stream = client.chat(
                    model=OLLAMA_MODEL,
                    messages=messages_payload,
                    stream=True,
                )

                for chunk in response_stream:
                    if chunk.get('message', {}).get('content'):
                        full_response += chunk['message']['content']
                        message_placeholder.text(full_response + "▌") 

                message_placeholder.empty()

                # Clean up the command
                final_command = full_response.strip().replace('`', '').replace('bash', '').replace('yaml', '')

                # Instruction to guide user to the simulation area
                st.info("✅ **Command Ready.** Use the copy button (top right) and paste this command into the **'2. Simulate Execution'** field below.")

                # Display the command with native copy button
                st.code(final_command, language='bash')

                # Prepare the response string for history (using markdown format)
                formatted_response = f"```
{% endraw %}
bash\n{final_command}\n
{% raw %}
```"

            except ollama.ResponseError as e:
                formatted_response = f"Error from Ollama model: {e}"
                st.error(formatted_response)
            except Exception as e:
                formatted_response = f"An unexpected error occurred: {e}"
                st.error(formatted_response)

            # 3. Add the final formatted response to the chat history
            st.session_state.messages.append({
                "role": "assistant", 
                "content": formatted_response
            })

    # --- Command Execution Simulation (Bottom Section) ---
    st.markdown("---")
    st.header("2️⃣ Simulate Execution")
    st.write("Paste a `kubectl` or `docker` command here to see a simulated shell execution and output.")

    # Text area for command input
    # Use the session state key to track changes for the button trigger
    command_input = st.text_area(
        "Enter Command:",
        height=70,
        value=st.session_state.executor_command,
        key="command_input_key"
    )

    # Button to trigger simulation
    if st.button("▶️ Run Command Simulation", type="primary", use_container_width=True):
        # Update session state with the current input value before running simulation
        st.session_state.executor_command = command_input 
        run_simulation(st.session_state.executor_command)

    # Display Simulation Output
    st.subheader("Simulated Output Log")
    if st.session_state.executor_output:
        if st.session_state.executor_status == "success":
            st.success("Command Executed Successfully (Simulation)")
            st.code(st.session_state.executor_output, language='text')
        elif st.session_state.executor_status == "error":
            st.error("Execution Error (Simulation)")
            st.code(st.session_state.executor_output, language='text')
        else:
            # Default or info status
            st.code(st.session_state.executor_output, language='text')
    else:
        st.info("Output will appear here after a command is run in the simulation field above.")

Conclusion

In conclusion, we’ve successfully crafted a powerful and user-friendly Personal Kubernetes AI Shell, transforming the way we interact with our local clusters. By harnessing the advanced capabilities of the newly released, efficient, and open-licensed IBM Granite 4.0 LLM via Ollama, we’ve built an intelligent assistant that takes inspiration from Google’s kubectl-ai philosophy. This application effectively bridges the gap between natural language and complex kubectl commands, eliminating the need for constant documentation lookups or cheatsheets. Operating within my local environment—comprising Podman Desktop and engine for container management, and Minikube as my Kubernetes cluster—this desktop tool provides a streamlined workflow: simply ask a question, generate the command, and execute it directly on your machine within a single, intuitive interface. This empowers users with an unprecedented level of efficiency and accessibility in Kubernetes cluster management.

DEV Community