DEV Community

George Lukas


Chapter 4: GitOps with Terraform + ArgoCD — Self-Hosting LLMs as a Platform Product

Each chapter in this series has been a deliberate step toward removing friction from LLM infrastructure management. Chapter 2 introduced Infrastructure as Code by mapping Kubernetes resources directly to Terraform — functional, but painfully verbose. Chapter 3 resolved the abstraction problem by bringing in the Helm provider, collapsing 500 lines of HCL into 50 and allowing Terraform to reason about applications rather than individual resources. Both approaches, however, shared the same fundamental constraint: every change still required a human to run terraform apply. The cluster had no awareness of Git, drift went undetected until someone noticed, and scaling that model to larger teams or more frequent deploys would inevitably make manual execution a bottleneck. Chapter 4 closes that loop by introducing GitOps with ArgoCD — making the cluster itself responsible for continuously reconciling its state against Git, without anyone needing to trigger a command.

The Four Principles of GitOps

GitOps is an operational paradigm built on four interconnected principles:

1. Declarative
Everything the cluster needs to run is expressed as YAML files. There is no procedural "run this command" — only a description of the desired end state.

2. Versioned
All state lives in Git. Every change has an author, a timestamp, and a diff. Rollback becomes as simple as reverting a commit.

3. Pull-based (key differentiator)

Chapter 3: Developer → terraform apply → push → Cluster
Chapter 4: Developer → git push → Git ← poll ← ArgoCD (in the cluster)

4. Continuous reconciliation
ArgoCD runs an infinite loop:

while true:
  git_state = fetch_from_git()
  cluster_state = fetch_from_kubernetes()
  if git_state != cluster_state:
    apply_changes()
  sleep(180)  # 3 minutes
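The same loop can be written as runnable Python. This is a sketch of the reconcile semantics only; `fetch_desired`, `fetch_live`, and `apply_changes` are illustrative stand-ins, not ArgoCD APIs:

```python
def reconcile_once(fetch_desired, fetch_live, apply_changes):
    """One iteration of the reconcile loop: compare the desired
    state (Git) with the live state (cluster) and converge."""
    desired = fetch_desired()
    live = fetch_live()
    if desired != live:
        apply_changes(desired)
        return "out-of-sync -> synced"
    return "in-sync"

# Example: the cluster drifted to 1 replica while Git declares 2.
applied = []
status = reconcile_once(
    fetch_desired=lambda: {"replicas": 2},
    fetch_live=lambda: {"replicas": 1},
    apply_changes=applied.append,
)
```

The key property is that the loop is idempotent: once the change is applied, every subsequent iteration reports the states as equal and does nothing.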

Context and Rationale

With the mechanics of GitOps established, it is worth stepping back to understand why this shift matters beyond a single team running git push. Removing terraform apply from the workflow is only the surface benefit — the deeper payoff is that GitOps creates the operational foundation for treating infrastructure as a product, with real ownership, roadmaps, and SLAs.

Growing Cloud Complexity

Cloud infrastructure promised simplicity. In 2010, AWS offered ~20 intuitive services, provisioned with clicks in the UI. The value proposition was clear: simplify infrastructure.

Today, reality is different:

AWS (2010):
├── ~20 services
├── Provisioning via UI
└── Simplicity as core value

AWS (2025):
├── 175+ services
├── Provisioning via IaC
└── Complexity as new reality

Organizational impact:

According to Max Griffiths (Thoughtworks):

"Rising cloud complexity is putting many organizations and their infrastructure teams right back to where they were 15 years ago — struggling to keep up with demand for new services and instances, and stay on top of an increasingly unmanageable infrastructure footprint."

Consequences:

  • Time to market increased (instead of decreasing)
  • Required skills grew exponentially
  • Self-service became impractical
  • Full-stack developer scope expanded to the point of becoming a disadvantage

Infrastructure as Product

Infrastructure as Product treats infrastructure not as a centralized service provider, but as a portfolio of internal products that enable product teams to deliver value quickly.

Paradigm shift:

Traditional (project-based):
Developer → Ticket → Ops Team → Wait → Provision → Deploy
Timeline: Days/weeks
Bottleneck: Ops team on critical path

Infrastructure as Product:
Developer → Self-service Platform → Provision → Deploy
Timeline: Minutes/hours
Enabler: Platform removes friction

Three Core Principles

1. Developer as Customer

Infrastructure should be designed around developer experience first. If the platform is hard to use, it won't be adopted — regardless of how technically sound it is.

Success metrics:

  • Developer satisfaction score
  • Time to provision (target: <30 min)
  • Platform adoption rate
  • Support tickets reduction

Anti-pattern:

# Requires learning a new language just to provision
developer:
  must_learn: [HCL, Pulumi, CloudFormation]
  to_do: "Same work as before"
  result: "Poor developer experience"

Correct pattern:

# Familiar interface, correct abstraction
developer:
  uses: [YAML, familiar APIs]
  self_service: true
  result: "High adoption, low friction"

2. Platform Teams With a Product Mindset

According to Sebastian Straube (Accenture), infrastructure teams shouldn't operate as a shared service that reacts to tickets — they should be restructured into dedicated platform product teams, each owning a slice of the internal platform with a roadmap, SLAs, and real accountability to their users.

Organization example:

Platform Team: Compute & Container
├── Product: Kubernetes-as-a-Service
├── Customers: All development teams
├── Roadmap: 
│   ├── Q1: Auto-scaling GPU nodes
│   ├── Q2: Service mesh integration
│   └── Q3: Cost optimization
├── SLA: 99.9% uptime, <5min provision
└── Metrics: Adoption rate, developer satisfaction

Platform Team: Observability
├── Product: Unified logs/metrics/traces
├── Customers: All teams (dev + ops)
├── Roadmap:
│   ├── Q1: 30-day retention
│   ├── Q2: AIOps integration
│   └── Q3: Cost attribution
├── SLA: <5s query response
└── Metrics: Query volume, alert accuracy

Characteristics:

  • Long-lived teams (not project-based)
  • Own product roadmap and backlog
  • Accountable for adoption and satisfaction
  • Measure success by customer outcomes

3. Self-Service as a Core Value

The platform must enable true self-service — meaning the platform team is never on the critical path of a deployment.

Anti-pattern:

Developer → Submit PR → Platform review → Approve → Deploy
Problem: Platform team on critical path (bottleneck)

Correct pattern:

Developer → Use platform API → Automated validation → Deploy
Platform: Monitor outcomes

Real-world example:

# Developer workflow
kubectl apply -f app.yaml

# Platform validates automatically:
# ✓ Security policies (OPA)
# ✓ Resource quotas
# ✓ Network policies
# ✓ Deploy without manual approval

GitOps as the Technical Foundation

GitOps is not just "automation" — it is the enabling layer that makes Infrastructure as Product operationally viable. Each of its four properties maps directly to a platform capability:

1. Declarative = Product Catalog

The Git repository becomes a catalog of available products that any team can consume, customize, and deploy independently.

k8s-apps/
├── apps/ollama/          # Product: LLM Inference
├── apps/librechat/       # Product: Chat Interface
└── apps/postgres/        # Product: Database

Developer "consumes" a product via:

git clone k8s-apps
cp -r apps/ollama apps/my-llm
vim apps/my-llm/values.yaml  # Customize
git commit && git push
# ArgoCD auto-deploy

2. Self-Service = Developer Autonomy

ArgoCD removes the platform team from the critical path:

Without ArgoCD:
Developer → Code → Request deploy → Platform team → Manual deploy

With ArgoCD:
Developer → Code → Git push → ArgoCD auto-sync → Deployed

3. API-Driven = Programmatic Access

ArgoCD Application CRDs are the deployment API:

# Developer creates workload via Kubernetes API
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-service
spec:
  source:
    repoURL: https://github.com/company/k8s-apps
    path: apps/my-service
  destination:
    server: https://kubernetes.default.svc
    namespace: my-namespace
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

4. Standardization With Flexibility

Helm charts serve as reusable product templates — the platform defines the structure, teams supply the configuration.

Platform provides templates:
├── web-app/          # Template for web apps
├── ml-service/       # Template for ML workloads
└── data-pipeline/    # Template for ETL

Developer customizes:
cp -r templates/web-app apps/my-app
vim apps/my-app/values.yaml  # Configuration only
git push  # Automatic deployment

Architectural Decision

App of Apps with Independent Wrappers

Rather than managing all services as a single unit, this architecture treats each service as an independent package within the App of Apps pattern. GitOps manages deployments, versions, and updates at the individual service level — not the stack level.

Each service (Ollama, LibreChat) has its own Helm chart wrapper that:

  • References the upstream chart via dependencies in Chart.yaml
  • Customizes configuration via a local values.yaml
  • Maintains independent versioning
  • Allows isolated evolution

Wrapper Structure

apps/
├── librechat/
│   ├── Chart.yaml      # Wrapper chart with dependency
│   └── values.yaml     # Specific configuration
└── ollama/
    ├── Chart.yaml      # Wrapper chart with dependency
    └── values.yaml     # Specific configuration

Chart.yaml defines the dependency:

dependencies:
  - name: ollama
    version: "1.42.0"                              # Fixed version
    repository: "https://otwld.github.io/ollama-helm/"

values.yaml customizes the upstream chart:

ollama:        # Wrapper namespace
  ollama:      # Chart namespace (double hierarchy)
    gpu:
      enabled: true
    models:
      pull:
        - llama3.2:3b

Benefits of This Approach

1. Service Independence

  • Each app evolves at its own pace
  • Deploying Ollama does not affect LibreChat
  • Reduces the risk of cross-service regressions

2. Granular Versioning

apps/ollama/Chart.yaml:      version: "1.42.0"
apps/librechat/Chart.yaml:   version: "1.9.7"
  • Upstream chart versions pinned individually
  • Upgrades tested and applied per service
  • Independent rollback per application

3. Isolated Customization

  • Values specific per service
  • No configuration conflicts
  • Individual testability

4. Per-Service Observability

  • ArgoCD Application per service
  • Isolated logs and events
  • Specific health checking

5. Automated Deployment

Git commit → ArgoCD detects → Helm processes wrapper → Deploy to cluster
  • ArgoCD manages each wrapper as an Application
  • Automatic sync per service
  • Independent self-healing

6. Complete Tracking

git log apps/ollama/values.yaml
# Complete change history for Ollama

git log apps/librechat/values.yaml
# Complete change history for LibreChat
  • Audit trail per service
  • Specific PRs per change
  • Granular rollback via Git

Types of Wrappers

Wrapper with Public Chart (HTTP):

# apps/ollama/Chart.yaml
dependencies:
  - name: ollama
    version: "1.42.0"
    repository: "https://otwld.github.io/ollama-helm/"

Wrapper with OCI Chart:

# apps/librechat/Chart.yaml
dependencies:
  - name: librechat
    version: "1.9.7"
    repository: "oci://ghcr.io/danny-avila/librechat-chart"

Wrapper with Sub-charts:

# apps/custom-app/Chart.yaml
dependencies:
  - name: app
    version: "1.0.0"
    repository: "https://..."
  - name: postgres
    version: "12.0.0"
    repository: "https://charts.bitnami.com/bitnami"
  - name: redis
    version: "17.0.0"
    repository: "https://charts.bitnami.com/bitnami"

Governance and Standardization

No tight coupling:

  • Each service maintains flexibility
  • Standards enforced via linting (CI/CD)
  • Common templates available, not mandatory

Example of suggested standard:

# All services follow label convention
metadata:
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    app.kubernetes.io/managed-by: argocd

But each service can add specific labels without breaking others.

Alternatives Considered and Rejected

❌ Single monorepo without wrappers:

# values.yaml (monolithic)
ollama:
  gpu: ...
librechat:
  config: ...
postgres:
  ...

Problems:

  • A change in Ollama affects LibreChat's versioning
  • Atomic all-or-nothing deployment
  • Rollback impacts all services

❌ Single custom chart:

# custom-chart/
templates/
  ollama.yaml
  librechat.yaml
  postgres.yaml

Problems:

  • Reinventing the wheel (official charts already exist)
  • Heavy maintenance burden
  • Complicates upstream upgrades

✅ Independent wrappers (chosen):

  • Reuses upstream charts
  • Independence between services
  • Easy maintenance
  • Flexible governance

Trade-offs

Advantages:

  • Full independence between services
  • Granular versioning
  • Individual rollback
  • Isolated observability
  • Scalability (adding an app = new directory)

Disadvantages:

  • Duplication of common configurations (e.g., ingress annotations)
  • Requires linting to enforce standards
  • More files to manage

Considerations

Use when:

  • Medium/large team (5+ people)
  • Multiple independent services
  • Frequent deploys per service
  • Need for granular rollback
  • Distributed governance (teams own services)

Do not use when:

  • Monolithic application (single service)
  • Very small team (1–2 people)
  • All services always deployed together
  • Preference for extreme simplicity

Implementing Terraform

This section implements main.tf. It does everything Terraform is responsible for in this architecture — which is intentionally narrow. Rather than managing applications directly, main.tf bootstraps the platform once and then hands control to ArgoCD. Concretely, it does four things in sequence: creates the three namespaces, provisions the credentials secret that LibreChat needs at runtime, installs ArgoCD via Helm, and registers the two ArgoCD Application CRDs that point at the Git repository. After that first terraform apply, Terraform is largely out of the picture — all subsequent application changes flow through Git.

The file is organised into four logical blocks, each separated by a comment header:

main.tf
├── Providers          — Kubernetes + Helm provider config
├── Infrastructure     — Namespaces + credentials secret
├── ArgoCD             — Helm installation + values reference
└── ArgoCD Applications — CRDs that register Ollama and LibreChat

The subsections below walk through each block in order.

Providers

(main.tf lines 1–27 — terraform {} block + provider declarations)

terraform {
  required_version = ">= 1.0"

  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.23"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.11"
    }
  }
}

provider "kubernetes" {
  config_path    = "~/.kube/config"
  config_context = "minikube"
}

provider "helm" {
  kubernetes {
    config_path    = "~/.kube/config"
    config_context = "minikube"
  }
}

The provider configuration is identical to Chapter 3: both the kubernetes and helm providers read from ~/.kube/config and target the minikube context. No changes needed here.

Namespaces

(main.tf — INFRASTRUCTURE block)

resource "kubernetes_namespace" "argocd" {
  metadata {
    name = "argocd"
    labels = {
      managed-by = "terraform"
      purpose    = "gitops"
    }
  }
}

resource "kubernetes_namespace" "ollama" {
  metadata {
    name = "ollama"
    labels = {
      managed-by = "terraform"
    }
  }
}

resource "kubernetes_namespace" "librechat" {
  metadata {
    name = "librechat"
    labels = {
      managed-by = "terraform"
    }
  }
}

Design decision: Namespaces are managed by Terraform rather than ArgoCD because they are prerequisites for everything else — ArgoCD itself cannot deploy into a namespace that does not yet exist. They also change rarely enough that the overhead of GitOps reconciliation is not justified. The purpose = "gitops" label distinguishes tooling namespaces from application namespaces at a glance.

Infrastructure Secrets

(main.tf — still inside the INFRASTRUCTURE block, immediately after the namespaces)

resource "kubernetes_secret" "librechat_credentials" {
  metadata {
    name      = "librechat-credentials-env"
    namespace = kubernetes_namespace.librechat.metadata[0].name
  }

  data = {
    JWT_SECRET         = var.jwt_secret
    JWT_REFRESH_SECRET = var.jwt_refresh_secret
    CREDS_KEY          = var.creds_key
    CREDS_IV           = var.creds_iv
    MONGO_URI          = "mongodb://librechat-mongodb:27017/LibreChat"
    MEILI_HOST         = "http://librechat-meilisearch:7700"
    OLLAMA_BASE_URL    = "http://ollama.ollama.svc.cluster.local:11434"
  }

  type = "Opaque"
}

Design decision: Credentials are managed by Terraform precisely because they must not appear in Git. Committing plaintext secrets to a repository — even a private one — violates the principle of least exposure. For production environments, the right path is to replace this with Sealed Secrets, External Secrets Operator, or HashiCorp Vault, all of which allow secrets to live in Git in an encrypted or referenced form. Chapter 5 covers this.

ArgoCD Installation

(main.tf — ARGOCD block)

resource "helm_release" "argocd" {
  name       = "argocd"
  repository = "https://argoproj.github.io/argo-helm"
  chart      = "argo-cd"
  namespace  = kubernetes_namespace.argocd.metadata[0].name
  version    = "5.51.6"

  values = [
    file("${path.module}/values/argocd-values.yaml")
  ]

  timeout       = 600
  wait          = true
  wait_for_jobs = true

  depends_on = [
    kubernetes_namespace.argocd
  ]
}

ArgoCD Applications (CRDs)

(main.tf — ARGOCD APPLICATIONS block, the last block in the file)

With ArgoCD installed, the final step is registering the applications it should manage. Each kubernetes_manifest block below creates an ArgoCD Application CRD — a declarative instruction telling ArgoCD which Git path to watch, which cluster namespace to deploy into, and how to handle sync and drift.

Ollama:

resource "kubernetes_manifest" "argocd_app_ollama" {
  manifest = {
    apiVersion = "argoproj.io/v1alpha1"
    kind       = "Application"
    metadata = {
      name      = "ollama"
      namespace = kubernetes_namespace.argocd.metadata[0].name
      labels = {
        managed-by = "terraform"
      }
    }
    spec = {
      project = "default"

      source = {
        repoURL        = var.git_repo_url
        targetRevision = var.git_branch
        path           = "apps/ollama"
        helm = {
          valueFiles = ["values.yaml"]
        }
      }

      destination = {
        server    = "https://kubernetes.default.svc"
        namespace = kubernetes_namespace.ollama.metadata[0].name
      }

      syncPolicy = {
        automated = {
          prune    = true
          selfHeal = true
        }
        syncOptions = [
          "CreateNamespace=false"
        ]
      }
    }
  }

  depends_on = [
    helm_release.argocd,
    kubernetes_namespace.ollama
  ]
}

Breaking it down:

spec.source:

source = {
  repoURL        = "https://github.com/<user>/k8s-apps.git"
  targetRevision = "main"
  path           = "apps/ollama"
  helm = {
    valueFiles = ["values.yaml"]
  }
}

repoURL: Git repository to watch

targetRevision: Branch, tag, or specific SHA

path: Directory within the repo

helm.valueFiles: Array of values files (merged in order)

spec.destination:

destination = {
  server    = "https://kubernetes.default.svc"
  namespace = "ollama"
}

server: API server of the target cluster

  • kubernetes.default.svc: Local cluster (where ArgoCD is)
  • External URL: Deploy to a remote cluster

namespace: Destination namespace in the cluster

spec.syncPolicy:

syncPolicy = {
  automated = {
    prune    = true
    selfHeal = true
  }
  syncOptions = [
    "CreateNamespace=false"
  ]
}

automated:

  • Present: ArgoCD syncs automatically upon detecting changes
  • Absent: Manual sync only

prune: true:

  • Resources deleted from Git are deleted from the cluster
  • false: Orphaned resources remain in the cluster

selfHeal: true:

  • Manual changes in the cluster are reverted
  • ArgoCD forces state = Git
  • false: Manual changes persist (drift)

syncOptions:

  • CreateNamespace=false: Do not create namespace (already exists via Terraform)
  • CreateNamespace=true: Create if it doesn't exist
  • Validate=false: Skip resource validation
  • PruneLast=true: Delete orphaned resources last

LibreChat:

resource "kubernetes_manifest" "argocd_app_librechat" {
  manifest = {
    apiVersion = "argoproj.io/v1alpha1"
    kind       = "Application"
    metadata = {
      name      = "librechat"
      namespace = kubernetes_namespace.argocd.metadata[0].name
      labels = {
        managed-by = "terraform"
      }
    }
    spec = {
      project = "default"

      source = {
        repoURL        = var.git_repo_url
        targetRevision = var.git_branch
        path           = "apps/librechat"
        helm = {
          valueFiles = ["values.yaml"]
        }
      }

      destination = {
        server    = "https://kubernetes.default.svc"
        namespace = kubernetes_namespace.librechat.metadata[0].name
      }

      syncPolicy = {
        automated = {
          prune    = true
          selfHeal = true
        }
        syncOptions = [
          "CreateNamespace=false"
        ]
      }
    }
  }

  depends_on = [
    helm_release.argocd,
    kubernetes_namespace.librechat,
    kubernetes_secret.librechat_credentials
  ]
}

The LibreChat Application CRD follows the same structure as Ollama. The only meaningful difference is that it carries an additional depends_on reference to kubernetes_secret.librechat_credentials, ensuring the credentials exist in the cluster before ArgoCD attempts its first sync.

Outputs

(main.tf — OUTPUTS block)

output "argocd_url" {
  value = "http://argocd.glukas.space"
}

output "argocd_admin_password" {
  value     = try(data.kubernetes_secret.argocd_initial_admin.data["password"], "")
  sensitive = true
}

output "argocd_password_command" {
  value = "minikube kubectl -- -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d"
}

output "applications_managed" {
  value = {
    ollama = {
      namespace = kubernetes_namespace.ollama.metadata[0].name
      status    = "Managed by ArgoCD"
    }
    librechat = {
      namespace = kubernetes_namespace.librechat.metadata[0].name
      status    = "Managed by ArgoCD"
    }
  }
}

try() function:

try(expression, fallback)

Evaluates expression and returns fallback if the evaluation fails. This prevents an error on the very first apply, when the ArgoCD secret does not yet exist in the cluster.
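The argocd_admin_password output reads from a data source that is not shown in the blocks above, so it must be declared somewhere in main.tf. A minimal sketch — argocd-initial-admin-secret is the secret name the ArgoCD Helm chart creates by default:

```hcl
data "kubernetes_secret" "argocd_initial_admin" {
  metadata {
    name      = "argocd-initial-admin-secret"
    namespace = kubernetes_namespace.argocd.metadata[0].name
  }

  # Only read the secret after the ArgoCD chart has created it
  depends_on = [helm_release.argocd]
}
```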

With the outputs block in place, main.tf is complete. The full file now captures the entire platform bootstrap in roughly 200 lines: provider config, namespaces, one credentials secret, one Helm release, two ArgoCD Application CRDs, and four outputs. Running terraform apply once produces a live ArgoCD instance watching your Git repository — everything after that point is GitOps.


Implementing Application Charts

The Git repository that ArgoCD watches is separate from the Terraform code. It contains only the Helm wrapper charts — one directory per application — and nothing else. This clean separation means developers working on application configuration never need to touch Terraform, and the platform team can evolve the infrastructure layer independently.

Structure

├── apps
│   ├── librechat
│   │   ├── Chart.yaml
│   │   └── values.yaml
│   └── ollama
│       ├── Chart.yaml
│       └── values.yaml
├── main.tf
├── values
│   └── argocd-values.yaml
└── variables.tf
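variables.tf appears in the structure but its contents are not listed. A minimal sketch consistent with the variables main.tf references (names taken from the code above; descriptions and defaults are illustrative):

```hcl
variable "git_repo_url" {
  description = "Git repository that ArgoCD watches"
  type        = string
}

variable "git_branch" {
  description = "Branch ArgoCD tracks"
  type        = string
  default     = "main"
}

variable "jwt_secret" {
  type      = string
  sensitive = true
}

variable "jwt_refresh_secret" {
  type      = string
  sensitive = true
}

variable "creds_key" {
  type      = string
  sensitive = true
}

variable "creds_iv" {
  type      = string
  sensitive = true
}
```

Marking the credential variables as sensitive keeps their values out of plan output and logs.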

ArgoCD Configuration (values/argocd-values.yaml)

global:
  domain: argocd.glukas.space

server:
  service:
    type: ClusterIP

  extraArgs:
    - --insecure  # HTTP only — no TLS (development only)

  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - argocd.glukas.space
    paths:
      - /
    pathType: Prefix
    tls: []

  config:
    timeout.reconciliation: 180s  # Polling interval

controller:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi

repoServer:
  resources:
    requests:
      cpu: 50m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi

applicationSet:
  enabled: true

notifications:
  enabled: false

dex:
  enabled: false

Critical parameters:

timeout.reconciliation: 180s

  • Git polling frequency
  • Trade-off: Lower value = faster detection, higher load
  • Recommendation: 180s (3 min) for most cases

extraArgs: [--insecure]

  • DEVELOPMENT ONLY
  • Production must use TLS with valid certificates

resources.limits

  • Controller: Component responsible for reconciliation
  • RepoServer: Component that reads Git
  • Values for ~10 applications; scale as needed

Ollama Configuration

# Chart.yaml
apiVersion: v2
name: ollama
description: Ollama deployment managed by ArgoCD
type: application
version: 1.0.0

dependencies:
  - name: ollama
    version: "1.42.0"
    repository: "https://otwld.github.io/ollama-helm/"

Fields:

apiVersion: v2: Helm 3 API version

name: Wrapper chart name

type: application: Deployable chart (vs library)

version: Wrapper version (local control)

dependencies:

name: ollama: Dependency name

  • Automatically creates namespace .Values.ollama

version: "1.42.0": Fixed version of the upstream chart

  • CRITICAL: Always pin the version
  • Avoid: version: "*" or no version

repository: Chart source

Ollama: values.yaml — Analysing its Double Hierarchy

ollama:  # Layer 1: dependency namespace (created automatically by Helm)

  ollama:  # Layer 2: chart's internal namespace
    gpu:
      enabled: true
      type: nvidia
      number: 1

    models:
      pull:
        - llama3.2:3b
        - deepseek-r1:14b

    ingress:
      enabled: true
      className: nginx
      annotations:
        nginx.ingress.kubernetes.io/proxy-body-size: "0"
        nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
        nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
        nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
      hosts:
        - host: ollama.glukas.space
          paths:
            - path: /
              pathType: Prefix
      tls: []

  service:
    type: ClusterIP
    port: 11434

  resources:
    requests:
      memory: "2Gi"
      cpu: "500m"
    limits:
      memory: "4Gi"
      cpu: "2000m"

Why Double Hierarchy?

This is one of the most common silent failures when working with Helm wrapper charts. The configuration appears to apply correctly — ArgoCD reports Synced, no errors surface — but the behaviour you expected (GPU enabled, specific models pulled) simply does not materialise. Understanding the mechanism once prevents hours of debugging later.

Helm automatically creates a value namespace when processing a dependency:

dependencies:
  - name: ollama

As a result, .Values.ollama is automatically created.

Upstream chart has an internal namespace:

# Add the upstream repo, then inspect its default values
helm repo add ollama https://otwld.github.io/ollama-helm/
helm show values ollama/ollama

Output:

ollama:  # Chart internal namespace
  gpu:
    enabled: false

Consequence:

Wrapper creates:  .Values.ollama
Chart expects:    .Values.ollama.xxx
Result:           .Values.ollama.ollama.xxx

Technical rules:

  1. Every dependency creates a namespace:

dependencies:
  - name: X

Helm always creates .Values.X

  2. Some charts have an internal namespace: if the first key in the upstream chart's values.yaml is the chart's own name, the chart nests all of its configuration under that key.

  3. Combination = duplication:

Layer   Origin    Path
1       Wrapper   .Values.ollama
2       Chart     .Values.ollama
Final   -         .Values.ollama.ollama

Validation:

# Verify internal namespace
helm show values <repo>/<chart>

# Render locally
helm template test apps/ollama/

# Search for specific configuration
helm template test apps/ollama/ | grep -A5 "nvidia.com/gpu"

Common mistake:

# ❌ INCORRECT (only one layer)
ollama:
  gpu:
    enabled: true

Result:

  • Helm looks for: .Values.ollama.ollama.gpu.enabled
  • Finds: .Values.ollama.gpu.enabled
  • Uses default: gpu.enabled: false
  • Symptom: No error, but GPU not enabled

Solution:

# ✅ CORRECT (double layer)
ollama:
  ollama:
    gpu:
      enabled: true
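The failure mode reduces to a nested-dictionary lookup. A plain Python sketch illustrating the lookup semantics (this mirrors how values resolve, not Helm's actual implementation):

```python
def chart_lookup(values, path, default=None):
    """Resolve a dotted path like 'ollama.ollama.gpu.enabled'
    against a nested values dict, falling back to the default."""
    node = values
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

# Single layer: the wrapper namespace exists, the chart's own key is missing.
single = {"ollama": {"gpu": {"enabled": True}}}
# The upstream chart reads gpu.enabled inside ITS own 'ollama' key, which the
# wrapper nests under the dependency namespace, so the full path is doubled:
broken = chart_lookup(single, "ollama.ollama.gpu.enabled", default=False)  # False

# Double layer: both namespaces present, the value is found.
double = {"ollama": {"ollama": {"gpu": {"enabled": True}}}}
fixed = chart_lookup(double, "ollama.ollama.gpu.enabled", default=False)  # True
```

Because the missing key silently resolves to the chart default, no error is raised — exactly the "Synced but not behaving" symptom described above.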

Specific Configurations

GPU:

ollama:
  ollama:
    gpu:
      enabled: true
      type: nvidia      # Alternative: amd (ROCm)
      number: 1

Generates:

resources:
  limits:
    nvidia.com/gpu: "1"

nodeSelector:
  nvidia.com/gpu: "true"

tolerations:
  - key: nvidia.com/gpu
    operator: Exists

Models:

models:
  pull:
    - llama3.2:3b
    - deepseek-r1:14b

Chart creates an init container:

initContainers:
  - name: pull-models
    command:
      - /bin/sh
      - -c
      - |
        ollama pull llama3.2:3b
        ollama pull deepseek-r1:14b

Ingress annotations:

annotations:
  nginx.ingress.kubernetes.io/proxy-body-size: "0"
  nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
  nginx.ingress.kubernetes.io/proxy-send-timeout: "600"

proxy-body-size: "0": No upload limit (models are large)

proxy-read-timeout: "600": 10min timeout (long inference)

proxy-send-timeout: "600": 10min timeout

Service and Resources (root layer):

Note: service and resources sit in the first ollama layer, not the second. The upstream chart does not expose these fields within its internal namespace, so they must be set at the dependency root level — not nested inside the chart's own namespace.

Full structure:

ollama:              # dependency namespace
  ollama:            # chart's internal namespace
    gpu: ...
    models: ...
    ingress: ...

  service: ...       # root level
  resources: ...     # root level

LibreChat Configuration

Chart.yaml

apiVersion: v2
name: librechat
description: LibreChat deployment managed by ArgoCD
type: application
version: 1.0.0

dependencies:
  - name: librechat
    version: "1.9.7"
    repository: "oci://ghcr.io/danny-avila/librechat-chart"

OCI Repository:

repository: "oci://ghcr.io/danny-avila/librechat-chart"

Syntax: oci://<registry>/<owner>/<chart>

Differences vs. HTTP repository:

  • Uses the same container registry infrastructure
  • Faster pull
  • Better integrated versioning
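The reference itself is just three path segments after the scheme. A quick Python sketch of pulling them apart (a hypothetical helper, for illustration only):

```python
def parse_oci_ref(ref: str):
    """Split an oci:// chart reference into (registry, owner, chart)."""
    prefix = "oci://"
    assert ref.startswith(prefix), "not an OCI reference"
    registry, owner, chart = ref[len(prefix):].split("/", 2)
    return registry, owner, chart

assert parse_oci_ref("oci://ghcr.io/danny-avila/librechat-chart") == (
    "ghcr.io", "danny-avila", "librechat-chart")
```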

values.yaml

librechat:  # Layer 1: dependency namespace

  librechat:  # Layer 2: chart's internal namespace (double hierarchy)

    configEnv:
      APP_TITLE: "LibreChat + Ollama (via Terraform)"
      HOST: "0.0.0.0"
      PORT: "3080"
      SEARCH: "true"
      MONGO_URI: "mongodb://librechat-mongodb:27017/LibreChat"
      MEILI_HOST: "http://librechat-meilisearch:7700"
      ALLOW_EMAIL_LOGIN: "true"
      ALLOW_REGISTRATION: "true"
      ALLOW_SOCIAL_LOGIN: "false"
      ALLOW_SOCIAL_REGISTRATION: "false"

    configYamlContent: |
      version: 1.1.5
      cache: true
      endpoints:
        custom:
          - name: "Ollama"
            apiKey: "ollama"
            baseURL: "http://ollama.ollama.svc.cluster.local:11434/v1"
            models:
              default:
                - "llama2:latest"
              fetch: true
            titleConvo: true
            titleModel: "current_model"
            modelDisplayLabel: "Ollama"

  ingress:
    enabled: true
    className: "nginx"
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "25m"
      nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
      nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
      nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    hosts:
      - host: librechat.glukas.space
        paths:
          - path: /
            pathType: Prefix
    tls: []

  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "500m"

  persistence:
    enabled: true
    size: 5Gi
    storageClass: "standard"

  replicaCount: 1

  mongodb:
    enabled: true
    image:
      registry: docker.io
      repository: bitnami/mongodb
      tag: "latest"
      pullPolicy: IfNotPresent
    auth:
      enabled: false
    persistence:
      enabled: true
      size: 8Gi
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
      limits:
        memory: "1Gi"
        cpu: "500m"

  meilisearch:
    enabled: true
    auth:
      enabled: false
    environment:
      MEILI_NO_ANALYTICS: "true"
      MEILI_ENV: "development"
    persistence:
      enabled: true
      size: 1Gi
    resources:
      requests:
        memory: "128Mi"
        cpu: "50m"
      limits:
        memory: "512Mi"
        cpu: "250m"

Breaking it down:

configEnv:
Environment variables converted to:

env:
  - name: APP_TITLE
    value: "LibreChat + Ollama (via Terraform)"

configYamlContent:
Multi-line YAML (pipe |) written into a ConfigMap and mounted as a file.

Helm processes:

configYamlContent: | ...

Creates:

apiVersion: v1
kind: ConfigMap
metadata:
  name: librechat-config
data:
  librechat.yaml: |
    version: 1.1.5
    cache: true
    ...

And mounts it into the container:

volumeMounts:
  - name: config
    mountPath: /app/librechat.yaml
    subPath: librechat.yaml

Sub-charts (mongodb, meilisearch):

Located in the first librechat layer, not the second.

Structure:

librechat:           # dependency namespace
  librechat:         # chart's internal namespace
    configEnv: ...
    configYamlContent: ...

  mongodb: ...       # sub-chart (root level)
  meilisearch: ...   # sub-chart (root level)
  ingress: ...       # root level
  resources: ...     # root level

.gitignore

# Helm
charts/
Chart.lock

# Secrets
*-secrets.yaml
*.secret.yaml

# Backups
*.bak
*.tmp

# IDE
.vscode/
.idea/
*.swp

Prevents accidental commits of:

  • Downloaded charts (regeneratable)
  • Plaintext secrets
  • Temporary files

Deployment

Prerequisites

# Minikube cluster
minikube start \
  --driver docker \
  --container-runtime docker \
  --gpus all \
  --memory 8192 \
  --cpus 4

minikube addons enable ingress

# Local DNS
echo "$(minikube ip) ollama.glukas.space" | sudo tee -a /etc/hosts
echo "$(minikube ip) librechat.glukas.space" | sudo tee -a /etc/hosts
echo "$(minikube ip) argocd.glukas.space" | sudo tee -a /etc/hosts

# Git repository created and populated
git clone https://github.com/usuario/k8s-apps.git
cd k8s-apps
# Copy Chart.yaml and values.yaml to apps/ollama/ and apps/librechat/
git add .
git commit -m "Initial commit"
git push origin main

Terraform: Configuration

# terraform.tfvars
cat > terraform.tfvars <<EOF
git_repo_url = "https://github.com/usuario/k8s-apps.git"
git_branch   = "main"

jwt_secret         = "$(openssl rand -hex 32)"
jwt_refresh_secret = "$(openssl rand -hex 32)"
creds_key          = "$(openssl rand -hex 32)"
creds_iv           = "$(openssl rand -hex 16)"
EOF

# .gitignore
echo "terraform.tfvars" >> .gitignore
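The differing `-hex` sizes are deliberate: `openssl rand -hex N` emits N random bytes as 2N hex characters, and `creds_iv` is half the size of the other values because it serves as a 16-byte IV while the rest act as 32-byte secrets (an assumption based on common LibreChat setups; check the docs for your version). The Python equivalent, if you prefer generating them in code:

```python
import secrets

# Python equivalent of the `openssl rand -hex N` calls above:
# N random bytes encoded as 2*N hex characters.
jwt_secret = secrets.token_hex(32)   # 32-byte secret -> 64 hex chars
creds_iv   = secrets.token_hex(16)   # 16-byte IV -> 32 hex chars (AES block size)

assert len(jwt_secret) == 64
assert len(creds_iv) == 32
```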

Terraform: Init

cd terraform/
terraform init

Output:

Initializing provider plugins...
- Installing hashicorp/kubernetes v2.23.0...
- Installing hashicorp/helm v2.11.0...

Terraform has been successfully initialized!

Terraform: Plan

terraform plan -out=tfplan

Planned resources:

Plan: 7 to add, 0 to change, 0 to destroy.

Resources:
  + kubernetes_namespace.argocd
  + kubernetes_namespace.ollama
  + kubernetes_namespace.librechat
  + kubernetes_secret.librechat_credentials
  + helm_release.argocd
  + kubernetes_manifest.argocd_app_ollama
  + kubernetes_manifest.argocd_app_librechat

Terraform: Apply

terraform apply tfplan

Timeline:

[00:00-00:02] Namespaces created
[00:02-00:03] Secret created
[00:03-01:06] ArgoCD installed (Helm chart deployment)
[01:06-01:07] ArgoCD Applications registered (CRDs)

Apply complete! Resources: 7 added, 0 changed, 0 destroyed.

Note: Terraform only creates the platform. Apps will be deployed by ArgoCD.

ArgoCD: Initial Access

# Get password
ARGOCD_PASSWORD=$(minikube kubectl -- -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d)
echo "URL: http://argocd.glukas.space"
echo "User: admin"
echo "Pass: $ARGOCD_PASSWORD"

Access the UI:

*(screenshot: ArgoCD UI)*

Initial state:

Applications:
  ollama      Status: Syncing...
  librechat   Status: OutOfSync

ArgoCD: Sync Process

ArgoCD runs automatically:

1. Clone the repository:

git clone https://github.com/usuario/k8s-apps.git
git checkout main

2. Change detection:

Current SHA: abc123def456...
Last synced: (none - first sync)
Action: Sync required

3. Helm processing:

# For apps/ollama/
helm dependency build apps/ollama/
helm template ollama apps/ollama/ --values apps/ollama/values.yaml

# Generates YAML manifests

4. Application to the cluster:

kubectl apply -f &lt;generated manifests&gt;

5. Health checking:

Waiting:
  - Pods: Ready
  - Deployments: Available
  - StatefulSets: Ready

Observable timeline:

# Terminal 2: Monitor Ollama
watch kubectl get pods -n ollama

# Output evolves as ArgoCD syncs:
NAME                      READY   STATUS
ollama-xxx-yyy           0/1     Pending
ollama-xxx-yyy           0/1     ContainerCreating
ollama-xxx-yyy           0/1     Running  # Init container: pulling models
ollama-xxx-yyy           1/1     Running  # Ready (~2-3 min)

# Terminal 3: Monitor LibreChat
watch kubectl get pods -n librechat

# Output evolves:
NAME                                    READY   STATUS
librechat-mongodb-0                     0/1     Pending
librechat-meilisearch-xxx               0/1     ContainerCreating
librechat-xxx-yyy                       0/1     Pending

librechat-mongodb-0                     1/1     Running  # ~30s
librechat-meilisearch-xxx               1/1     Running  # ~25s
librechat-xxx-yyy                       1/1     Running  # ~1 min

After 3–5 minutes, ArgoCD UI shows:

*(screenshot: ArgoCD UI with both applications Synced and Healthy)*

Verification

# Ollama
curl http://ollama.glukas.space/api/tags
{
  "models": [
    {"name": "llama3.2:3b", ...},
    {"name": "deepseek-r1:14b", ...}
  ]
}


*(screenshot: LibreChat UI)*


Operations

These five workflows cover the full operational lifecycle under GitOps: making a configuration change, upgrading a chart version, rolling back a broken deploy, understanding self-healing behaviour, and managing multiple environments. In every case the pattern is the same — edit files, push to Git, let ArgoCD do the rest. No kubectl apply, no terraform apply, no manual intervention required.

Workflow 1: Modify Configuration (Add a Model)

Objective: Add the mistral:latest model to Ollama.

Process:

# 1. Clone and branch
git clone https://github.com/usuario/k8s-apps.git
cd k8s-apps
git checkout -b add-mistral

# 2. Edit
vim apps/ollama/values.yaml

# Modify:
models:
  pull:
    - llama3.2:3b
    - deepseek-r1:14b
    - mistral:latest  # Added

# 3. Commit
git add apps/ollama/values.yaml
git commit -m "feat(ollama): Add mistral model"

# 4. Push
git push origin add-mistral

Pull Request:

  • Create PR on GitHub/GitLab
  • Visible diff:
 models:
   pull:
     - llama3.2:3b
     - deepseek-r1:14b
+    - mistral:latest
  • Review and approval
  • Merge to main

Automatic ArgoCD:

Timeline after merge:

[T+0 min]   Merge to main
[T+0-3 min] ArgoCD polling (waiting for next cycle)
[T+3 min]   ArgoCD detects new SHA
[T+3 min]   Calculates diff: + models.pull: mistral:latest
[T+3 min]   Helm upgrade ollama...
[T+4 min]   Rolling update initiated
[T+4-7 min] Init container: ollama pull mistral:latest
[T+7 min]   New pod Ready
[T+7 min]   Old pod Terminated
[T+7 min]   ArgoCD status: Synced ✓

Total time: ~7 minutes from merge to deploy.

Note: At no point did the developer execute any command against the cluster directly — the entire deployment was driven by a Git push.

Workflow 2: Chart Version Upgrade

Objective: Upgrade LibreChat from 1.9.7 to 1.10.0.

git checkout -b upgrade-librechat

vim apps/librechat/Chart.yaml

# Modify:
dependencies:
  - name: librechat
    version: "1.10.0"  # Was 1.9.7

git commit -am "chore(librechat): Upgrade to v1.10.0"
git push origin upgrade-librechat

CI/CD (optional):

# .github/workflows/helm-lint.yml
name: Helm Lint
on: pull_request

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: azure/setup-helm@v3

      - name: Lint
        run: |
          helm lint apps/*/

      - name: Template Test
        run: |
          helm template test apps/*/ > /dev/null

Pipeline runs:

  • Helm lint (syntax validation)
  • Template rendering (detects errors)

After approval and merge, ArgoCD deploys automatically.

Workflow 3: Rollback

Scenario: New deploy caused a problem in production.

Option 1: Git Revert

# View commits
git log --oneline apps/librechat/
# def456 chore(librechat): Upgrade to v1.10.0
# abc123 feat(ollama): Add mistral

# Revert
git revert def456
git push origin main

ArgoCD detects and applies the revert automatically.

Timeline: ~3–5 minutes.

Option 2: ArgoCD UI

1. Open http://argocd.glukas.space
2. Select the "librechat" application
3. Open the "History" tab
4. Sync list:
   Sync 5: def456 (current) ❌
   Sync 4: abc123 ✅
5. Click on Sync 4
6. Click "Rollback"
7. Confirm

Timeline: ~30 seconds.

Important: Rollback via UI is temporary. The next poll will re-sync with Git. For permanence, perform a git revert.

Option 3: ArgoCD CLI

# Install CLI
brew install argocd  # or your platform's equivalent

# Login
argocd login argocd.glukas.space --username admin

# View history
argocd app history librechat

# Rollback
argocd app rollback librechat <REVISION>

Workflow 4: Self-Healing

Scenario: Manual change in the cluster.

# Someone runs:
kubectl scale deployment ollama --replicas=3 -n ollama

ArgoCD response:

[T+0s]     kubectl scale executed
[T+0s]     Deployment: replicas=3
[T+0-180s] ArgoCD polling interval
[T+180s]   ArgoCD detects drift:
           Git: replicas=1
           Cluster: replicas=3
[T+181s]   Self-heal triggered
           kubectl apply -f deployment.yaml (from Git)
[T+182s]   Kubernetes: replicas=1
           3 extra pods terminated
[T+183s]   ArgoCD status: Synced ✓
           Event: "Self-healed: ollama deployment"

Manual change was automatically reverted.
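The decision ArgoCD makes at T+180s is simple to model: compare the two states and, when selfHeal is on, let Git win. A minimal sketch (illustrative, not ArgoCD's code):

```python
# Minimal sketch of the self-heal decision (illustrative, not ArgoCD's code).
def reconcile(git_state: dict, cluster_state: dict):
    """Return (new_cluster_state, healed): Git always wins when selfHeal is on."""
    if cluster_state != git_state:
        return dict(git_state), True   # apply the Git state over the drift
    return cluster_state, False

cluster = {"replicas": 3}              # after the manual `kubectl scale`
cluster, healed = reconcile({"replicas": 1}, cluster)
assert cluster == {"replicas": 1} and healed is True
```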

Responsible configuration:

syncPolicy = {
  automated = {
    selfHeal = true  # This parameter enables automatic revert
  }
}

Disable self-heal:

syncPolicy = {
  automated = {
    prune = true
    selfHeal = false  # Allows manual changes to persist
  }
}

Workflow 5: Multi-Environment

Structure:

k8s-apps/
├── apps/
│   └── ollama/
│       ├── Chart.yaml
│       ├── values-dev.yaml
│       ├── values-staging.yaml
│       └── values-prod.yaml

Differentiated values:

# values-dev.yaml
ollama:
  ollama:
    models:
      pull:
        - llama3.2:3b  # Lightweight model only
    resources:
      limits:
        memory: "2Gi"

# values-prod.yaml
ollama:
  ollama:
    models:
      pull:
        - llama3.2:3b
        - deepseek-r1:14b
        - mistral:latest
    resources:
      limits:
        memory: "8Gi"

ArgoCD Applications (Terraform):

# Dev
resource "kubernetes_manifest" "argocd_app_ollama_dev" {
  manifest = {
    spec = {
      source = {
        repoURL        = var.git_repo_url
        targetRevision = "develop"  # develop branch
        path           = "apps/ollama"
        helm = {
          valueFiles = ["values-dev.yaml"]
        }
      }
      destination = {
        namespace = "ollama-dev"
      }
    }
  }
}

# Prod
resource "kubernetes_manifest" "argocd_app_ollama_prod" {
  manifest = {
    spec = {
      source = {
        repoURL        = var.git_repo_url
        targetRevision = "main"  # main branch
        path           = "apps/ollama"
        helm = {
          valueFiles = ["values-prod.yaml"]
        }
      }
      destination = {
        namespace = "ollama-prod"
      }
    }
  }
}

Promotion flow:

Feature branch → develop (PR) → auto-deploy to Dev
              → staging (PR)  → auto-deploy to Staging
              → main (PR + approvals) → auto-deploy to Prod

Git branches map to environments.
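The mapping the two Terraform resources encode can be summarized in a few lines. A hypothetical helper (names are mine, for illustration):

```python
# Hypothetical helper mirroring the two Terraform resources above:
# each Git branch maps to a values file and a target namespace.
ENVIRONMENTS = {
    "develop": {"values": "values-dev.yaml",  "namespace": "ollama-dev"},
    "main":    {"values": "values-prod.yaml", "namespace": "ollama-prod"},
}

def target_for(branch: str) -> dict:
    return ENVIRONMENTS[branch]

assert target_for("develop")["namespace"] == "ollama-dev"
assert target_for("main")["values"] == "values-prod.yaml"
```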


Troubleshooting

This section covers the most common failure modes when running ArgoCD in practice — what they look like, why they happen, and how to fix them.

Problem 1: Application OutOfSync

An OutOfSync status means ArgoCD has detected a difference between what's in Git and what's running in the cluster, but hasn't been able to resolve it. This is usually the first sign that something went wrong during a sync — not necessarily a cluster problem, but worth investigating immediately.

Symptom:

kubectl get application -n argocd
NAME      SYNC STATUS   HEALTH STATUS
ollama    OutOfSync     Unknown

Diagnosis:

# Describe Application
kubectl describe application ollama -n argocd

# View events
kubectl get events -n argocd --sort-by='.lastTimestamp'

# repo-server logs
kubectl logs -n argocd deployment/argocd-repo-server

# application-controller logs
kubectl logs -n argocd statefulset/argocd-application-controller

Common causes:

  1. YAML syntax error

Error: YAML parse error line 15: mapping values are not allowed here

Solution: Fix syntax in values.yaml

  2. Chart version not found

Error: chart "ollama" version "1.42.0" not found

Solution: Check available versions:

helm search repo ollama --versions

  3. Repository unreachable

Error: failed to fetch https://github.com/usuario/k8s-apps.git: authentication required

Solution: Configure repository credentials in ArgoCD

Local validation:

# Test template rendering
cd k8s-apps/
helm dependency build apps/ollama/
helm template test apps/ollama/

# If there's an error, it will appear here

Problem 2: Pods CrashLoopBackOff

A CrashLoopBackOff means the pod is starting, failing, and being restarted repeatedly. ArgoCD may show the application as Synced — meaning the deployment was applied correctly — but Degraded on health, because the pod never reaches a running state. The problem is almost always in the container itself, not in ArgoCD.

Symptom:

kubectl get pods -n ollama
NAME            READY   STATUS             RESTARTS
ollama-xxx      0/1     CrashLoopBackOff   5

Diagnosis:

# Current pod logs
kubectl logs -n ollama ollama-xxx

# Previous container logs (if it has restarted)
kubectl logs -n ollama ollama-xxx --previous

# Pod events and conditions
kubectl describe pod -n ollama ollama-xxx

Common causes:

  1. GPU not available

Error: failed to initialize NVML: could not load NVML library

Solution:

# Temporarily disable GPU
ollama:
  ollama:
    gpu:
      enabled: false

  2. Insufficient memory

Error: OOMKilled

Solution:

resources:
  limits:
    memory: "8Gi"  # Increase

  3. Model does not exist

Error: pulling model: model 'llama4' not found

Solution: Check the model name in values.yaml

Problem 3: Double Hierarchy Not Applied

This is one of the trickier failure modes because ArgoCD reports everything as healthy — the sync succeeded, no errors are visible, but the configuration simply isn't taking effect. It typically happens when the Helm values file is missing one level of nesting, causing the GPU settings to silently fall back to defaults.

Symptom:

  • ArgoCD shows Synced
  • GPU not enabled
  • No visible errors

Diagnosis:

# Render full template
helm template test apps/ollama/

# Search for GPU configuration
helm template test apps/ollama/ | grep -A10 "nvidia.com/gpu"

# If not found, structure is wrong

Cause:

# ❌ Incorrect structure (one layer)
ollama:
  gpu:
    enabled: true

Solution:

# ✅ Correct structure (double layer)
ollama:
  ollama:
    gpu:
      enabled: true

Validation:

# After correction, check diff in ArgoCD UI
# Should show change in spec.template.spec.containers[].resources

Problem 4: Slow Sync

Unlike the previous problems, slow sync isn't a failure — it's expected behavior that becomes surprising when you first encounter it. ArgoCD doesn't watch Git in real time; it polls on a fixed interval, so there will always be a delay between a git push and a deployment.

Symptom:
ArgoCD takes >5 minutes to detect changes.

Cause:
Default polling interval is 3 minutes.

Solution 1: Adjust polling

# values/argocd-values.yaml
server:
  config:
    timeout.reconciliation: 60s  # 1 minute

Trade-off: More load on the cluster and Git repo.

Solution 2: Webhook

Configure a webhook in Git to notify ArgoCD:

# GitHub webhook URL
POST https://argocd.glukas.space/api/webhook

ArgoCD syncs immediately upon receiving a push.
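For GitHub, the webhook payload is authenticated with an HMAC carried in the X-Hub-Signature-256 header, which ArgoCD verifies when a webhook secret is configured. A sketch of the standard verification scheme:

```python
import hashlib
import hmac

# Sketch of GitHub's webhook signature scheme (X-Hub-Signature-256 header).
def verify_signature(secret: bytes, body: bytes, header: str) -> bool:
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels
    return hmac.compare_digest("sha256=" + digest, header)

secret, body = b"webhook-secret", b'{"ref": "refs/heads/main"}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
assert verify_signature(secret, body, header)
assert not verify_signature(b"wrong", body, header)
```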

Solution 3: Manual sync

# Via CLI
argocd app sync ollama

# Via UI
Click "Sync" on the application

Ch. 3 vs Ch. 4: When to Use Each

Both approaches are valid, and the right choice depends on team size, deploy frequency, and how much operational overhead you want to absorb upfront. The table below maps the key trade-offs to help you decide:

| | Ch. 3 (Terraform + Helm) | Ch. 4 (Terraform + ArgoCD) |
|---|---|---|
| Deploy trigger | Manual: `terraform apply` | Automatic: Git push |
| Latency | Immediate | 3 min (polling) |
| Reconciliation | Manual: `terraform plan` | Continuous: 3-min loop |
| Drift detection | Manual | Automatic |
| Self-healing | Does not exist | Configurable (`selfHeal`) |
| Rollback | `git revert` + `terraform apply` | ArgoCD UI (1 click) or `git revert` |
| Audit trail | Git + Terraform logs | Git + ArgoCD events |
| Multi-env | Duplicate code or workspaces | Branches + `valueFiles` |
| Permissions required | kubectl + Terraform | Git only |
| Disaster recovery | Re-run Terraform | Automatic ArgoCD re-sync |
| State management | Terraform state (central) | Git (distributed) |
| Initial complexity | Medium | High |
| Scalability (apps) | ~20 apps | Unlimited |
| Ideal team size | 1–10 | 10+ |

Chapter 3's approach is simpler to set up and sufficient for small teams with controlled deploy cadences — if a weekly terraform apply is acceptable, the added complexity of ArgoCD is not justified. Chapter 4 becomes the right choice once teams grow, deploy frequency increases, or compliance requirements demand an immutable audit trail and automated drift correction. The two are not mutually exclusive: many organisations start with Chapter 3 and migrate to Chapter 4 as their operational maturity grows.


Conclusion

Chapters 1 through 4 trace a deliberate progression — from manual kubectl commands to a fully automated, self-healing platform. Each chapter addressed a specific limitation of the one before it: verbosity, the need for manual execution, the absence of continuous reconciliation. The cumulative result is an architecture where Git is the single source of truth, and the cluster enforces that truth on its own.

The four GitOps principles are not just theoretical framing — each one translates directly into an operational guarantee. Declarative configuration means the desired state is always readable and auditable without touching the cluster. Version control means every change has an author, a rationale, and a rollback path. Pull-based deployment means no external system ever needs credentials to reach the cluster — the cluster reaches out to Git. Continuous reconciliation means drift is detected and corrected automatically, without anyone noticing or reacting.

The architecture also enforces a clean separation of concerns that makes each layer independently replaceable:

Terraform → Platform bootstrap (namespaces, secrets, ArgoCD)
Git       → Application desired state
ArgoCD    → Reconciliation engine
Helm      → Packaging and templating
Enter fullscreen mode Exit fullscreen mode

Changes to one layer do not cascade into the others. You could swap Helm for raw manifests, or replace Terraform with a different provisioner, without touching ArgoCD or the Git repository structure.

This foundation is deliberately extensible. The next steps — security, observability, multi-tenancy — build on top of it without requiring the core architecture to change.

Maturity Journey

Each chapter in this series represents a deliberate step up the maturity ladder — not just in tooling, but in ownership model, speed, and scale:

Stage 1: Manual Deployment (Ch. 1)

Maturity:  Ad-hoc
Ownership: Individuals
Speed:     Slow (days/weeks)
Scale:     Doesn't scale

Stage 2: Infrastructure as Code (Ch. 2–3)

Maturity:  Repeatable
Ownership: Ops team
Speed:     Medium (hours/days)
Scale:     Limited (manual execution)

Stage 3: GitOps Foundation (Ch. 4) ← We are here

Maturity:  Automated
Ownership: Shared (platform + dev)
Speed:     Fast (minutes/hours)
Scale:     Good (self-service enabled)

Stage 4: Infrastructure as Product (Next Steps)

Maturity:  Product-driven
Ownership: Platform teams (product owners)
Speed:     Very fast (minutes)
Scale:     Excellent (true self-service)
Metrics:   DORA, satisfaction, adoption

What Comes Next

Stage 3 is a foundation, not a destination. The architecture built in this chapter is intentionally minimal — one team, two applications, one cluster — and that is the right place to start. But the same GitOps primitives that make this setup work at small scale are exactly what allow it to grow.

The diagram below shows the current state: a single developer workflow, a flat namespace structure, and ArgoCD managing two specific workloads with no shared services, no multi-tenancy, and no separation between platform concerns and application concerns.

*(diagram: current architecture, as-is)*

The target looks substantially different. The cluster is split into two distinct layers: a Platform Layer of shared services — security, observability, secrets management — owned by a dedicated platform team with SLAs and roadmaps; and a Workload Layer where individual product teams deploy independently via git push, without ever touching the platform layer beneath them.

*(diagram: target architecture, to-be)*

The gap between the two diagrams is not a rewrite — it is an incremental build. Every component in the Platform Layer gets added as an ArgoCD-managed application in its own namespace, following the exact same wrapper-chart pattern introduced in this chapter. The core architecture does not change; it simply gains more managed services over time.

The next chapters will build out this platform layer starting with the highest-impact additions: security, observability, and secrets management.

| Initiative | Domain | Phase | Complexity | Impact | Dependencies | Time | Priority |
|---|---|---|---|---|---|---|---|
| Pomerium | SECURITY | Foundation | Intermediate | High | ArgoCD | 3-5d | P0 |
| Sealed Secrets | SECURITY | Foundation | Basic | High | None | 1d | P0 |
| Authentik | SECURITY | Foundation | Intermediate | High | PostgreSQL | 3-5d | P0 |
| Prometheus + Grafana | OBSERVABILITY | Foundation | Intermediate | High | None | 3-5d | P0 |
| MCP Servers | INTEGRATION | Foundation | Intermediate | High | None | 2-3d | P0 |
| RAG (Qdrant) | AI/LLM | Foundation | Advanced | High | None | 1w | P1 |
| LangSmith/Langfuse | AI/LLM | Scale | Advanced | High | Prometheus | 5-7d | P1 |
| Autoscaling | INFRA | Scale | Intermediate | High | Prometheus | 2-3d | P1 |
| Loki | OBSERVABILITY | Scale | Basic | Medium | Grafana | 1-2d | P1 |
| SearXNG | INTEGRATION | Scale | Basic | Medium | None | 1d | P1 |
| Web Scraper | INTEGRATION | Scale | Intermediate | Medium | None | 2d | P1 |
| Tilt | DEVEX | Scale | Basic | Medium | None | 1d | P1 |
| Jaeger | OBSERVABILITY | Production Excellence | Advanced | Medium | Prometheus | 3-5d | P2 |
| Model Registry | AI/LLM | Production Excellence | Intermediate | Medium | None | 3-5d | P2 |
| Multi-region | NETWORK | Production Excellence | Expert | Medium | ArgoCD | 2w+ | P3 |
| Fine-tuning | AI/LLM | Production Excellence | Expert | Low | Registry | 2w | P3 |

Platform Products (Shared Services):

Pomerium + Authentik:
  product: "Authentication & Authorization Platform"
  customers: "All applications"
  value: "SSO, MFA, zero-trust"
  sla: "99.9% uptime, <200ms auth latency"
  roadmap: ["Granular RBAC", "SAML support", "API keys"]

Prometheus + Grafana + Loki:
  product: "Observability Platform"
  customers: "All teams (dev + ops)"
  value: "Unified metrics/logs/traces"
  sla: "30d retention, <5s query time"
  roadmap: ["AIOps", "Cost attribution", "SLO management"]

Sealed Secrets:
  product: "Secrets Management Platform"
  customers: "All teams"
  value: "Git-native secrets, rotation, audit"
  sla: "Zero exposure, <1min sync"
  roadmap: ["Vault integration", "RBAC", "Expiration"]

Workload-Specific Products:

RAG (Qdrant):
  product: "Vector Search Service"
  customers: "AI/ML teams"
  value: "Semantic search, embeddings"
  sla: "<100ms p95 search latency"
  roadmap: ["Multi-model", "Hybrid search"]

MCP Servers:
  product: "Tool Integration Platform"
  customers: "LLM applications"
  value: "Connect LLMs to tools"
  sla: "<50ms tool invocation"
  roadmap: ["Custom tools", "Async execution"]

Developer Experience Products:

Tilt:
  domain: "[DEVEX]"
  phase: "Scale"
  complexity: "Basic"
  impact: "Medium"
  dependencies: ["None"]
  time: "1 day"
  priority: "P1"
  product: "Local Development Platform"
  customers: "All developers"
  value: "Hot-reload, real K8s environment, fast iteration"
  sla: "<5s code sync, <10s service restart"
  roadmap: ["Remote development", "Debugging tools", "Resource snapshots"]

Infrastructure Products:

Multi-region:
  domain: "[NETWORK]"
  phase: "Production Excellence"
  complexity: "Expert"
  impact: "Medium"
  dependencies: ["ArgoCD"]
  time: "2+ weeks"
  priority: "P3"
  product: "Global Load Balancing & Geo-distribution"
  customers: "All production workloads"
  value: "Low latency worldwide, compliance (data residency)"
  sla: "99.99% global availability, <100ms cross-region failover"
  roadmap: ["Active-active", "Traffic shaping", "Cost optimization", "DR automation"]

Recommended Production Extensions

  1. TLS/HTTPS:

# argocd-values.yaml
server:
  ingress:
    tls:
      - secretName: argocd-tls
        hosts:
          - argocd.empresa.com

  2. SSO/OIDC:

server:
  config:
    url: https://argocd.empresa.com
    oidc.config: |
      name: Okta
      issuer: https://empresa.okta.com
      clientID: $oidc.okta.clientId
      clientSecret: $oidc.okta.clientSecret

  3. RBAC:

server:
  rbacConfig:
    policy.csv: |
      p, role:developers, applications, get, */*, allow
      p, role:developers, applications, sync, */*, allow
      g, developers-group, role:developers

  4. Notifications:

notifications:
  enabled: true
  notifiers:
    service.slack: |
      token: $slack-token
  templates:
    template.app-deployed: |
      message: Application {{.app.metadata.name}} deployed
  triggers:
    trigger.on-deployed: |
      - when: app.status.operationState.phase in ['Succeeded']
        send: [app-deployed]

  5. Application Sets:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cluster-apps
spec:
  generators:
    - git:
        repoURL: https://github.com/empresa/k8s-apps.git
        revision: HEAD
        directories:
          - path: apps/*
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      source:
        repoURL: https://github.com/empresa/k8s-apps.git
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
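The git directory generator is essentially a loop: one Application per matching directory, named after its basename. A Python sketch of the expansion (conceptual, not Argo's implementation):

```python
# Conceptual sketch of the ApplicationSet git directory generator:
# one Application per directory matching apps/*, named after the basename.
paths = ["apps/ollama", "apps/librechat"]

apps = [
    {"name": p.rsplit("/", 1)[-1], "path": p, "namespace": p.rsplit("/", 1)[-1]}
    for p in paths
]

assert apps[0] == {"name": "ollama", "path": "apps/ollama", "namespace": "ollama"}
assert apps[1]["name"] == "librechat"
```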

Monitoring:

# ServiceMonitor for Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-metrics
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-metrics
  endpoints:
    - port: metrics

Technical Resources

References:

  1. Straube, S. (2025). "Infrastructure as a Product: The Key to Agile and Scalable IT". Medium/Elevate Tech.
  2. Griffiths, M. (2021). "Infrastructure as Product: Accelerating time to market through platform engineering". Thoughtworks Insights.
  3. Strope, L. (2026). "Why Infrastructure Is Becoming Product And How to Capitalize". Akava.

Documentation:

Reference Repositories:

Auxiliary Tools:

  • argocd CLI
  • kubectl-argo-rollouts (progressive delivery)
  • argocd-notifications (alerts)
  • argocd-image-updater (auto-update images)

End of Chapter 4
