<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Addo.Zhang</title>
    <description>The latest articles on DEV Community by Addo.Zhang (@addozhang).</description>
    <link>https://dev.to/addozhang</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F823473%2F9189c919-b7ec-4f33-a962-179024da0432.jpeg</url>
      <title>DEV Community: Addo.Zhang</title>
      <link>https://dev.to/addozhang</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/addozhang"/>
    <language>en</language>
    <item>
      <title>K8sGPT + Ollama - A Free Kubernetes Automated Diagnostic Solution</title>
      <dc:creator>Addo.Zhang</dc:creator>
      <pubDate>Wed, 19 Jun 2024 00:24:34 +0000</pubDate>
      <link>https://dev.to/addozhang/k8sgpt-ollama-a-free-kubernetes-automated-diagnostic-solution-3c8o</link>
      <guid>https://dev.to/addozhang/k8sgpt-ollama-a-free-kubernetes-automated-diagnostic-solution-3c8o</guid>
<description>&lt;p&gt;I checked my blog drafts over the weekend and found this one. I remember writing it alongside "Kubernetes Automated Diagnosis Tool: k8sgpt-operator" (posted in Chinese) about a year ago; my procrastination has clearly reached a critical level. Initially, I planned to pair K8sGPT with &lt;a href="https://localai.io" rel="noopener noreferrer"&gt;LocalAI&lt;/a&gt;. However, after trying &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;, I found it more user-friendly, and since Ollama also supports the &lt;a href="https://github.com/ollama/ollama/blob/main/docs/openai.md" rel="noopener noreferrer"&gt;OpenAI API&lt;/a&gt;, I decided to switch to it.&lt;/p&gt;




&lt;p&gt;After I published the article introducing k8sgpt-operator, some readers mentioned the high barrier to entry for using OpenAI. That problem is real, though not insurmountable; this article is not about solving it, however, but about an alternative to OpenAI: Ollama. As an aside, late last year &lt;a href="https://landscape.cncf.io/?item=observability-and-analysis--observability--k8sgpt" rel="noopener noreferrer"&gt;k8sgpt entered the CNCF Sandbox&lt;/a&gt;.&lt;/p&gt;
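&lt;p&gt;Because Ollama exposes an OpenAI-compatible endpoint under &lt;code&gt;/v1&lt;/code&gt;, any OpenAI-style client can talk to it. Here is a minimal sketch using only the Python standard library; it assumes Ollama's default address &lt;code&gt;localhost:11434&lt;/code&gt; and an already-pulled &lt;code&gt;llama3&lt;/code&gt; model, and the helper name &lt;code&gt;build_chat_request&lt;/code&gt; is mine:&lt;/p&gt;

```python
# Sketch: building an OpenAI-style chat-completion request for Ollama's
# /v1 endpoint with the standard library only. The address and model
# name are assumptions (Ollama defaults), not part of the original post.
import json
import urllib.request

OLLAMA_OPENAI_BASE = "http://localhost:11434/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Return a POST request shaped like an OpenAI chat-completion call."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        OLLAMA_OPENAI_BASE + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("llama3", "Why is the sky blue?")
# urllib.request.urlopen(req) would return an OpenAI-shaped JSON response
# once Ollama is running.
```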

&lt;h3&gt;1. Installing Ollama&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhi9td0axrurq5kvqm3to.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhi9td0axrurq5kvqm3to.png" alt="Ollama + Llama3"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ollama is an open-source tool for running large language models that lets you easily install and run &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;various large models&lt;/a&gt; locally or in the cloud. It is very user-friendly: everything works with a few simple commands. On macOS, you can install it with a single Homebrew command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;At the time of writing, the latest version is 0.1.44.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

ollama &lt;span class="nt"&gt;-v&lt;/span&gt; 
Warning: could not connect to a running Ollama instance
Warning: client version is 0.1.44


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On Linux, you can also install it with the official script:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://ollama.com/install.sh | sh


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Start Ollama, using the &lt;code&gt;OLLAMA_HOST&lt;/code&gt; environment variable to set the listening address to &lt;code&gt;0.0.0.0&lt;/code&gt; so it can be reached from containers or the K8s cluster.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

&lt;span class="nv"&gt;OLLAMA_HOST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0 ollama start

...
&lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2024-06-16T07:54:57.329+08:00 &lt;span class="nv"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;INFO &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;routes.go:1057 &lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Listening on 127.0.0.1:11434 (version 0.1.44)"&lt;/span&gt;
&lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2024-06-16T07:54:57.329+08:00 &lt;span class="nv"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;INFO &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;payload.go:30 &lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"extracting embedded files"&lt;/span&gt; &lt;span class="nb"&gt;dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/folders/9p/2tp6g0896715zst_bfkynff00000gn/T/ollama1722873865/runners
&lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2024-06-16T07:54:57.346+08:00 &lt;span class="nv"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;INFO &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;payload.go:44 &lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Dynamic LLM libraries [metal]"&lt;/span&gt;
&lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2024-06-16T07:54:57.385+08:00 &lt;span class="nv"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;INFO &lt;span class="nb"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;types.go:71 &lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"inference compute"&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;library&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;metal &lt;span class="nv"&gt;compute&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt; &lt;span class="nv"&gt;driver&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0 &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt; &lt;span class="nv"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"21.3 GiB"&lt;/span&gt; &lt;span class="nv"&gt;available&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"21.3 GiB"&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;2. Downloading and Running Large Models&lt;/h3&gt;

&lt;p&gt;Llama 3, one of the most popular open large models, was released by Meta in April 2024. It comes in two sizes: 8B and 70B parameters.&lt;/p&gt;

&lt;p&gt;Since I am running on macOS, I chose the 8B version. It is a 4.7 GB download, which takes 3-4 minutes on a fast internet connection.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

ollama run llama3


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On my M1 Pro with 32GB of memory, it takes about 12 seconds to start.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

time=2024-06-17T09:30:25.070+08:00 level=INFO source=server.go:572 msg="llama runner started in 12.58 seconds"


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each query takes about 14 seconds.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

....
"total_duration":14064009500,"load_duration":1605750,"prompt_eval_duration":166998000,"eval_count":419,"eval_duration":13894579000}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
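&lt;p&gt;All of the duration fields in that response are nanoseconds. Turning them into human-readable throughput is a one-liner; the sketch below simply replays the values from the response above:&lt;/p&gt;

```python
# The /api/generate response reports durations in nanoseconds.
# These values are copied from the response shown above.
response = {
    "total_duration": 14064009500,   # whole request, ns
    "eval_count": 419,               # generated tokens
    "eval_duration": 13894579000,    # decoding time, ns
}

NS_PER_S = 1_000_000_000

total_s = response["total_duration"] / NS_PER_S
tokens_per_s = response["eval_count"] / (response["eval_duration"] / NS_PER_S)

print(f"total: {total_s:.1f}s, decode speed: {tokens_per_s:.1f} tokens/s")
# → total: 14.1s, decode speed: 30.2 tokens/s
```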
&lt;h3&gt;3. Configuring K8sGPT CLI Backend&lt;/h3&gt;

&lt;p&gt;If you only want to test k8sgpt-operator, you can skip this step.&lt;/p&gt;

&lt;p&gt;We will use Ollama's REST API as the inference backend for k8sgpt. The backend type is set to &lt;code&gt;localai&lt;/code&gt; because &lt;a href="https://localai.io" rel="noopener noreferrer"&gt;LocalAI&lt;/a&gt; exposes an OpenAI-compatible API, which Ollama implements as well; the actual inference provider is still Ollama running Llama 3.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

k8sgpt auth add &lt;span class="nt"&gt;--backend&lt;/span&gt; localai &lt;span class="nt"&gt;--model&lt;/span&gt; llama3 &lt;span class="nt"&gt;--baseurl&lt;/span&gt; http://localhost:11434/v1


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Set it as the default provider.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

k8sgpt auth default &lt;span class="nt"&gt;--provider&lt;/span&gt; localai
Default provider &lt;span class="nb"&gt;set &lt;/span&gt;to localai


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Testing:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a pod using the nonexistent image &lt;code&gt;image-not-exist&lt;/code&gt; (for example, &lt;code&gt;kubectl run k8sgpt-test --image=image-not-exist&lt;/code&gt;); the image pull fails as expected.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

kubectl get po k8sgpt-test
NAME          READY   STATUS         RESTARTS   AGE
k8sgpt-test   0/1     ErrImagePull   0          6s


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Use k8sgpt to analyze the error.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

k8sgpt analyze &lt;span class="nt"&gt;--explain&lt;/span&gt; &lt;span class="nt"&gt;--filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Pod &lt;span class="nt"&gt;--namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;default &lt;span class="nt"&gt;--output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;json

&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"provider"&lt;/span&gt;: &lt;span class="s2"&gt;"localai"&lt;/span&gt;,
  &lt;span class="s2"&gt;"errors"&lt;/span&gt;: null,
  &lt;span class="s2"&gt;"status"&lt;/span&gt;: &lt;span class="s2"&gt;"ProblemDetected"&lt;/span&gt;,
  &lt;span class="s2"&gt;"problems"&lt;/span&gt;: 1,
  &lt;span class="s2"&gt;"results"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="s2"&gt;"kind"&lt;/span&gt;: &lt;span class="s2"&gt;"Pod"&lt;/span&gt;,
      &lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"default/k8sgpt-test"&lt;/span&gt;,
      &lt;span class="s2"&gt;"error"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
        &lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="s2"&gt;"Text"&lt;/span&gt;: &lt;span class="s2"&gt;"Back-off pulling image &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;image-not-exist&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;,
          &lt;span class="s2"&gt;"KubernetesDoc"&lt;/span&gt;: &lt;span class="s2"&gt;""&lt;/span&gt;,
          &lt;span class="s2"&gt;"Sensitive"&lt;/span&gt;: &lt;span class="o"&gt;[]&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
      &lt;span class="o"&gt;]&lt;/span&gt;,
      &lt;span class="s2"&gt;"details"&lt;/span&gt;: &lt;span class="s2"&gt;"Error: Back-off pulling image &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;image-not-exist&lt;/span&gt;&lt;span class="se"&gt;\"\n\n&lt;/span&gt;&lt;span class="s2"&gt;Solution: &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;1. Check if the image exists on Docker Hub or your local registry.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;2. If not, create the image using a Dockerfile and build it.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;3. If the image exists, check the spelling and try again.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;4. Verify the image repository URL in your Kubernetes configuration file (e.g., deployment.yaml)."&lt;/span&gt;,
      &lt;span class="s2"&gt;"parentObject"&lt;/span&gt;: &lt;span class="s2"&gt;""&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;4. Deploying and Configuring k8sgpt-operator&lt;/h3&gt;

&lt;p&gt;k8sgpt-operator automates running k8sgpt inside the cluster. You can install it with Helm:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

helm repo add k8sgpt https://charts.k8sgpt.ai/
helm repo update
helm &lt;span class="nb"&gt;install &lt;/span&gt;release k8sgpt/k8sgpt-operator &lt;span class="nt"&gt;-n&lt;/span&gt; k8sgpt &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;k8sgpt-operator provides two CRDs: &lt;code&gt;K8sGPT&lt;/code&gt; to configure k8sgpt, and &lt;code&gt;Result&lt;/code&gt; to expose analysis results.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

kubectl api-resources  | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; gpt
k8sgpts                                        core.k8sgpt.ai/v1alpha1                &lt;span class="nb"&gt;true         &lt;/span&gt;K8sGPT
results                                        core.k8sgpt.ai/v1alpha1                &lt;span class="nb"&gt;true         &lt;/span&gt;Result


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Create a &lt;code&gt;K8sGPT&lt;/code&gt; resource, setting &lt;code&gt;baseUrl&lt;/code&gt; to an address where Ollama is reachable from inside the cluster (the host's IP here, since pods cannot use &lt;code&gt;localhost&lt;/code&gt;).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

kubectl apply &lt;span class="nt"&gt;-n&lt;/span&gt; k8sgpt &lt;span class="nt"&gt;-f&lt;/span&gt; - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt-ollama
spec:
  ai:
    enabled: true
    model: llama3
    backend: localai
    baseUrl: http://198.19.249.3:11434/v1
  noCache: false
  filters: ["Pod"]
  repository: ghcr.io/k8sgpt-ai/k8sgpt
  version: v0.3.8
&lt;/span&gt;&lt;span class="no"&gt;EOF


&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After the &lt;code&gt;K8sGPT&lt;/code&gt; CR is created, the operator automatically creates a pod for it. Inspecting the &lt;code&gt;Result&lt;/code&gt; CR shows the same analysis the CLI produced earlier.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

kubectl get result &lt;span class="nt"&gt;-n&lt;/span&gt; k8sgpt &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.items[].spec}'&lt;/span&gt; | jq &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"backend"&lt;/span&gt;: &lt;span class="s2"&gt;"localai"&lt;/span&gt;,
  &lt;span class="s2"&gt;"details"&lt;/span&gt;: &lt;span class="s2"&gt;"Error: Kubernetes is unable to pull the image &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;image-not-exist&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; due to it not existing.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Solution: &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;1. Check if the image actually exists.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;2. If not, create the image or use an alternative one.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;3. If the image does exist, ensure that the Docker daemon and registry are properly configured."&lt;/span&gt;,
  &lt;span class="s2"&gt;"error"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
    &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="s2"&gt;"text"&lt;/span&gt;: &lt;span class="s2"&gt;"Back-off pulling image &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;image-not-exist&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;]&lt;/span&gt;,
  &lt;span class="s2"&gt;"kind"&lt;/span&gt;: &lt;span class="s2"&gt;"Pod"&lt;/span&gt;,
  &lt;span class="s2"&gt;"name"&lt;/span&gt;: &lt;span class="s2"&gt;"default/k8sgpt-test"&lt;/span&gt;,
  &lt;span class="s2"&gt;"parentObject"&lt;/span&gt;: &lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
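&lt;p&gt;Since the analysis now lives in &lt;code&gt;Result&lt;/code&gt; objects, it is easy to post-process them, e.g. for alerting. The sketch below condenses a spec fetched with &lt;code&gt;kubectl get result -n k8sgpt -o json&lt;/code&gt; into one line; the field layout mirrors the output above, and the &lt;code&gt;summarize&lt;/code&gt; helper is mine:&lt;/p&gt;

```python
# Sketch: condensing a k8sgpt Result spec (layout as shown above) into a
# one-line summary. The sample JSON is abridged from the post's output.
import json

raw = """
{
  "backend": "localai",
  "details": "Error: Kubernetes is unable to pull the image ...",
  "error": [{"text": "Back-off pulling image \\"image-not-exist\\""}],
  "kind": "Pod",
  "name": "default/k8sgpt-test",
  "parentObject": ""
}
"""

def summarize(spec: dict) -> str:
    """One line: kind and name of the object plus its first error text."""
    first = spec["error"][0]["text"] if spec.get("error") else "no error recorded"
    return f'{spec["kind"]} {spec["name"]}: {first}'

print(summarize(json.loads(raw)))
# → Pod default/k8sgpt-test: Back-off pulling image "image-not-exist"
```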

</description>
      <category>k8s</category>
      <category>k8sgpt</category>
      <category>ollama</category>
    </item>
  </channel>
</rss>
