In Chapter 2, we reached an uncomfortable conclusion: Terraform can manage Kubernetes, but that doesn't mean it should manage everything in Kubernetes.
We observed that:
- Terraform → Versioning, auditing, reproducibility
- Helm → Simplicity, lifecycle management
- Terraform + K8s Provider directly → Verbose, giant state, no rollbacks
The question that lingers: "Is there a way to have the best of both?"
Changing Abstraction Level
The problem in Chapter 2 wasn't Terraform itself; it was the level of abstraction we chose.
Wrong thinking:
Terraform → Manage Deployments, Services, Ingress, etc
(individual Kubernetes resources)
Right thinking:
Terraform → Manage Helm Releases
(complete applications as units)
It's a subtle but profound change. Instead of Terraform replacing Helm, Terraform orchestrates Helm.
Layered Architecture of Responsibilities
Let's visualize how responsibilities are divided:
┌──────────────────────────────────────────┐
│ You (Developer) │
│ Define desired state in code │
└────────────────┬─────────────────────────┘
│
↓
┌──────────────────────────────────────────┐
│ Terraform (Orchestrator) │
│ • Manages namespaces │
│ • Manages infrastructure secrets │
│ • Manages RBAC │
│ • Manages Helm Releases (pointers) │
└────────────────┬─────────────────────────┘
│
↓
┌──────────────────────────────────────────┐
│ Helm (Package Manager) │
│ • Renders templates │
│ • Applies resources to cluster │
│ • Maintains release history │
│ • Manages rollbacks │
└────────────────┬─────────────────────────┘
│
↓
┌──────────────────────────────────────────┐
│ Kubernetes (Runtime) │
│ • Runs containers │
│ • Manages storage │
│ • Routes traffic │
│ • Self-healing │
└──────────────────────────────────────────┘
Each layer does what it does best. No unnecessary overlap.
Project Structure:
cap3-helm-provider/
├── main.tf # Main configuration
├── variables.tf # Input variables
├── terraform.tfvars # Values (don't commit!)
├── outputs.tf # Useful outputs
├── .gitignore # Secret protection
│
├── values/ # Helm chart values
│ ├── ollama-values.yaml
│ └── librechat-values.yaml
│
└── README.md # Documentation
Benefits of this structure:
- Separation of code and configuration: logic (main.tf) is separated from values (values/), making changes easy to version and review
- Reusable: same structure for dev, staging, and prod; only terraform.tfvars changes
- Secure: .gitignore protects secrets, and values files can have public and private versions
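To make the same configuration reusable across environments, the cluster context itself can be parameterized. A minimal sketch of the idea (the variable name and context values are assumptions, not part of the chapter's code):

```hcl
# variables.tf — hypothetical environment switch
variable "kube_context" {
  description = "kubeconfig context to target (e.g. minikube, prod-cluster)"
  type        = string
  default     = "minikube"
}

# Provider consuming the variable instead of a hardcoded context
provider "kubernetes" {
  config_path    = "~/.kube/config"
  config_context = var.kube_context
}
```

With this in place, dev.tfvars and prod.tfvars would differ only in kube_context and the secret values.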
Part 1: Provider Declaration
# main.tf
terraform {
required_version = ">= 1.0"
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.23"
}
helm = {
source = "hashicorp/helm"
version = "~> 2.11"
}
}
}
provider "kubernetes" {
config_path = "~/.kube/config"
config_context = "minikube"
}
provider "helm" {
kubernetes {
config_path = "~/.kube/config"
config_context = "minikube"
}
}
Helm Provider
Now we have two providers:
- kubernetes: for base infrastructure resources
- helm: for managing application releases
Important: The Helm provider doesn't replace the Kubernetes provider. They work together:
- Kubernetes provider → Creates namespaces, secrets, RBAC
- Helm provider → Deploys applications in those namespaces
Semantic versioning (~> 2.23):
The pessimistic constraint ~> 2.23 means:
- 2.23.0, 2.23.1, 2.24.0 → accepted (any 2.x at or above 2.23)
- 3.0.0 → rejected (potential breaking change)
This picks up bug fixes and security updates without breaking compatibility.
Part 2: Base Infrastructure (Terraform Territory)
# Namespaces managed by Terraform
resource "kubernetes_namespace" "ollama" {
metadata {
name = "ollama"
labels = {
managed-by = "terraform"
}
}
}
resource "kubernetes_namespace" "librechat" {
metadata {
name = "librechat"
labels = {
managed-by = "terraform"
}
}
}
Why does Terraform manage namespaces?
Namespaces are infrastructure, not applications. They:
- Rarely change
- Are prerequisites for everything
- Define security boundaries
- Need to exist before applications
Label managed-by = "terraform":
# Useful for filtering
kubectl get ns -l managed-by=terraform
# Output:
NAME STATUS AGE
ollama Active 10m
librechat Active 10m
Makes it clear these resources shouldn't be edited manually.
Part 3: Infrastructure Secrets
# Secret managed by Terraform (infra-level)
resource "kubernetes_secret" "librechat_credentials" {
metadata {
name = "librechat-credentials-env"
namespace = kubernetes_namespace.librechat.metadata[0].name
}
data = {
JWT_SECRET = var.jwt_secret
JWT_REFRESH_SECRET = var.jwt_refresh_secret
CREDS_KEY = var.creds_key
CREDS_IV = var.creds_iv
MONGO_URI = "mongodb://librechat-mongodb:27017/LibreChat"
MEILI_HOST = "http://librechat-meilisearch:7700"
OLLAMA_BASE_URL = "http://ollama.ollama.svc.cluster.local:11434"
}
type = "Opaque"
}
Design decision: Why does Terraform manage this secret?
This secret contains infrastructure credentials that:
- Need to exist before application deployment
- Don't change frequently
- Are shared between environments (same structure, different values)
- Should have their structure versioned, but never their values (those live in terraform.tfvars)
Important dynamic reference:
namespace = kubernetes_namespace.librechat.metadata[0].name
Terraform ensures the namespace is created first, then the secret. Automatic dependency management!
Variables file (variables.tf):
variable "jwt_secret" {
description = "JWT secret for LibreChat"
type = string
sensitive = true
}
variable "jwt_refresh_secret" {
description = "JWT refresh secret"
type = string
sensitive = true
}
variable "creds_key" {
description = "Credentials encryption key"
type = string
sensitive = true
}
variable "creds_iv" {
description = "Credentials initialization vector"
type = string
sensitive = true
}
Values file (terraform.tfvars - DON'T COMMIT!):
jwt_secret = "abc123def456..." # generated with openssl rand -hex 32
jwt_refresh_secret = "ghi789jkl012..."
creds_key = "mno345pqr678..."
creds_iv = "stu901vwx234..."
.gitignore:
# Terraform
*.tfstate
*.tfstate.*
.terraform/
terraform.tfvars # ← CRITICAL!
# Sensitive files
values/*-secrets.yaml
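The project tree also lists outputs.tf, which this chapter doesn't show. A minimal sketch of what it might expose (the output names and choice of values are assumptions):

```hcl
# outputs.tf — hypothetical outputs for this setup
output "ollama_namespace" {
  description = "Namespace where Ollama is deployed"
  value       = kubernetes_namespace.ollama.metadata[0].name
}

output "librechat_release_status" {
  description = "Status of the LibreChat Helm release"
  value       = helm_release.librechat.status
}
```

Outputs like these are handy after terraform apply (or via terraform output) to confirm where things landed without digging through state.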
Part 4: Helm Releases
This is where the Chapter 3 approach shines.
Ollama deployment:
# Helm Release - Ollama
resource "helm_release" "ollama" {
name = "ollama"
repository = "https://otwld.github.io/ollama-helm/"
chart = "ollama"
namespace = kubernetes_namespace.ollama.metadata[0].name
values = [
file("${path.module}/values/ollama-values.yaml")
]
# Version control
version = "1.41.0" # Pin version for reproducibility
# Deployment settings
create_namespace = false # Already created by Terraform
wait = true # Wait for ready
timeout = 600 # 10 minutes max
# Dependency tracking
depends_on = [
kubernetes_namespace.ollama
]
}
Breaking it down:
1. Chart source:
repository = "https://otwld.github.io/ollama-helm/"
chart = "ollama"
2. Values file:
values = [
file("${path.module}/values/ollama-values.yaml")
]
The file() function reads the YAML file from a path relative to the module (path.module) and passes its raw contents to Helm as a values file.
3. Version pinning:
version = "1.41.0"
Critical for reproducibility! Without it, helm_release installs the latest chart version available, which changes over time.
4. Deployment controls:
wait = true # Don't return until ready
timeout = 600 # 10 min max
Terraform waits for Pods to be healthy before considering deployment successful.
5. Dependencies:
depends_on = [kubernetes_namespace.ollama]
Terraform creates the namespace first, then the release. Strictly speaking, the namespace attribute already references the namespace resource, which creates an implicit dependency; the explicit depends_on documents the intent.
The values file (values/ollama-values.yaml):
ollama:
gpu:
enabled: true
type: nvidia
number: 1
models:
- llama2
- codellama
resources:
requests:
cpu: 2
memory: 8Gi
service:
type: ClusterIP
port: 11434
ingress:
enabled: true
className: nginx
hosts:
- host: ollama.glukas.space
paths:
- path: /
pathType: Prefix
LibreChat deployment:
# Helm Release - LibreChat
resource "helm_release" "librechat" {
name = "librechat"
repository = "oci://ghcr.io/danny-avila/librechat-chart"
chart = "librechat"
namespace = kubernetes_namespace.librechat.metadata[0].name
values = [
file("${path.module}/values/librechat-values.yaml")
]
version = "1.5.0"
create_namespace = false
wait = true
timeout = 900 # 15 min (MongoDB initialization)
depends_on = [
kubernetes_namespace.librechat,
kubernetes_secret.librechat_credentials
]
}
Notice:
- Different repository (OCI registry)
- Longer timeout (MongoDB takes time)
- Depends on secret (must exist first)
LibreChat depends on Ollama. Terraform ensures order:
- Namespace
- Secret
- Ollama
- LibreChat
The values file (values/librechat-values.yaml):
config:
APP_TITLE: "LibreChat + Ollama (via Terraform)"
HOST: "0.0.0.0"
PORT: "3080"
SEARCH: "true"
MONGO_URI: "mongodb://librechat-mongodb:27017/LibreChat"
MEILI_HOST: "http://librechat-meilisearch:7700"
librechat:
configEnv:
ALLOW_REGISTRATION: "true"
configYamlContent: |
version: 1.1.5
cache: true
endpoints:
custom:
- name: "Ollama"
apiKey: "ollama"
baseURL: "http://ollama.ollama.svc.cluster.local:11434/v1"
models:
default:
- "llama2:latest"
fetch: true
titleConvo: true
titleModel: "llama2:latest"
summarize: false
summaryModel: "llama2:latest"
forcePrompt: false
modelDisplayLabel: "Ollama"
addParams:
temperature: 0.7
max_tokens: 2000
extraEnvVarsSecret: "librechat-credentials-env"
ingress:
enabled: true
className: "nginx"
annotations:
nginx.ingress.kubernetes.io/proxy-body-size: "25m"
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
hosts:
- host: librechat.glukas.space
paths:
- path: /
pathType: Prefix
mongodb:
enabled: true
auth:
enabled: false
image:
repository: bitnami/mongodb
tag: latest
persistence:
enabled: true
size: 8Gi
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "500m"
meilisearch:
enabled: true
auth:
enabled: false
environment:
MEILI_NO_ANALYTICS: "true"
MEILI_ENV: "development"
persistence:
enabled: true
size: 1Gi
resources:
requests:
memory: "128Mi"
cpu: "50m"
limits:
memory: "512Mi"
cpu: "250m"
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "500m"
persistence:
enabled: true
size: 5Gi
storageClass: "standard"
replicaCount: 1
Deployment Workflow
Now let's see this in action.
Initial Deployment
# Generate secrets
export TF_VAR_jwt_secret=$(openssl rand -hex 32)
export TF_VAR_jwt_refresh_secret=$(openssl rand -hex 32)
export TF_VAR_creds_key=$(openssl rand -hex 32)
export TF_VAR_creds_iv=$(openssl rand -hex 16)
# Or create terraform.tfvars
cat > terraform.tfvars <<EOF
jwt_secret = "$(openssl rand -hex 32)"
jwt_refresh_secret = "$(openssl rand -hex 32)"
creds_key = "$(openssl rand -hex 32)"
creds_iv = "$(openssl rand -hex 16)"
EOF
# 1. Initialize
terraform init
# Downloads both kubernetes and helm providers
# 2. Validate
terraform validate
# Checks HCL syntax
# 3. Plan
terraform plan
Plan output:
Terraform will perform the following actions:
# kubernetes_namespace.ollama will be created
+ resource "kubernetes_namespace" "ollama" {
+ id = (known after apply)
+ metadata {
+ generation = (known after apply)
+ name = "ollama"
+ labels = {
+ "managed-by" = "terraform"
}
}
}
# kubernetes_secret.librechat_credentials will be created
+ resource "kubernetes_secret" "librechat_credentials" {
+ data = (sensitive value)
+ id = (known after apply)
+ type = "Opaque"
+ metadata {
+ name = "librechat-credentials-env"
+ namespace = (known after apply)
}
}
# helm_release.ollama will be created
+ resource "helm_release" "ollama" {
+ id = (known after apply)
+ name = "ollama"
+ namespace = (known after apply)
+ repository = "https://otwld.github.io/ollama-helm/"
+ version = "1.41.0"
+ status = (known after apply)
+ values = [
+ <<-EOT
ollama:
gpu:
enabled: true
...
EOT,
]
}
Plan: 5 to add, 0 to change, 0 to destroy.
Notice: Only 5 resources in the plan!
- 2 namespaces
- 1 secret
- 2 helm_releases
Compare to Chapter 2: would be 50+ individual K8s resources.
# 4. Apply
terraform apply
# Output:
kubernetes_namespace.ollama: Creating...
kubernetes_namespace.librechat: Creating...
kubernetes_namespace.ollama: Creation complete after 1s
kubernetes_namespace.librechat: Creation complete after 1s
kubernetes_secret.librechat_credentials: Creating...
kubernetes_secret.librechat_credentials: Creation complete after 1s
helm_release.ollama: Creating...
helm_release.ollama: Still creating... [10s elapsed]
helm_release.ollama: Still creating... [20s elapsed]
helm_release.ollama: Creation complete after 45s [id=ollama]
helm_release.librechat: Creating...
helm_release.librechat: Still creating... [10s elapsed]
helm_release.librechat: Still creating... [20s elapsed]
...
helm_release.librechat: Creation complete after 2m15s [id=librechat]
Apply complete! Resources: 5 added, 0 changed, 0 destroyed.
What happened behind the scenes:
- Terraform created namespaces
- Terraform created secret
- Terraform told Helm: "Install ollama chart with these values"
- Helm rendered templates and created: Deployment, Service, PVC, Ingress, etc
- Terraform told Helm: "Install librechat chart with these values"
- Helm rendered templates and created: Deployment, Service, MongoDB StatefulSet, MeiliSearch Deployment, Ingress, etc
Terraform state only tracks the 5 high-level resources.
Helm manages all the detailed Kubernetes resources.
Verifying Deployment
# Check Terraform state
terraform state list
# kubernetes_namespace.librechat
# kubernetes_namespace.ollama
# kubernetes_secret.librechat_credentials
# helm_release.librechat
# helm_release.ollama
# Check Helm releases
helm list -A
# NAME NAMESPACE REVISION STATUS CHART APP VERSION
# ollama ollama 1 deployed ollama-1.41.0 0.1.20
# librechat librechat 1 deployed librechat-1.5.0 0.7.0
# Check actual Kubernetes resources
kubectl get all -n ollama
# NAME READY STATUS RESTARTS AGE
# pod/ollama-7d8f9c5b6d-xk2p4 1/1 Running 0 2m
#
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# service/ollama ClusterIP 10.96.245.12 <none> 11434/TCP 2m
#
# NAME READY UP-TO-DATE AVAILABLE AGE
# deployment.apps/ollama 1/1 1 1 2m
kubectl get all -n librechat
# (Shows MongoDB, MeiliSearch, LibreChat deployments/services)
# Testing Ollama
curl http://ollama.glukas.space/api/tags
Output:
{
"models": [
{
"name": "llama2:latest",
"modified_at": "2025-02-07T13:30:00.000Z",
"size": 3826793677,
"digest": "sha256:abc123...",
"details": {
"format": "gguf",
"family": "llama",
"families": ["llama"],
"parameter_size": "7B",
"quantization_level": "Q4_0"
}
}
]
}
Ollama Working!
# Testing LibreChat
curl -I http://librechat.glukas.space
Output:
HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
...
LibreChat Working!
Open in browser: http://librechat.glukas.space
- Sign in
- Select the Ollama model
- Start chatting!
All Good!
Operations
Now let's see how daily operations are different.
Upgrading Chart Version
Scenario: New Ollama chart version available.
# 1. Update version
vim main.tf
resource "helm_release" "ollama" {
# ...
version = "1.42.0" # was 1.41.0
}
# 2. Plan
terraform plan
Output:
~ resource "helm_release" "ollama" {
id = "ollama"
name = "ollama"
~ version = "1.41.0" -> "1.42.0"
# (15 unchanged attributes hidden)
}
Plan: 0 to add, 1 to change, 0 to destroy.
Only 1 change: chart version!
# 3. Apply
terraform apply
Helm does rolling update, zero downtime!
Changing Configuration
Scenario: Add CodeLlama model.
# 1. Edit values
vim values/ollama-values.yaml
ollama:
models:
- llama2
- codellama # ← NEW
# 2. Plan
terraform plan
Output:
~ resource "helm_release" "ollama" {
~ values = [
~ <<-EOT
ollama:
models:
- llama2
+ - codellama
EOT,
]
}
Terraform detects diff in YAML!
# 3. Apply
terraform apply
Helm updates Deployment → Pod restarts → Downloads CodeLlama → Ready!
Rollback
Scenario: New version broke.
Option 1: Via Terraform
# Revert commit in Git
git revert HEAD
# Apply previous version
terraform apply
Option 2: Via Helm (faster)
# View history
helm history ollama -n ollama
# REVISION UPDATED STATUS CHART DESCRIPTION
# 1 Thu Feb 07 10:29:30 2025 superseded ollama-1.41.0 Install complete
# 2 Thu Feb 07 11:15:20 2025 deployed ollama-1.42.0 Upgrade complete
# Rollback
helm rollback ollama -n ollama
# Rollback was a success! Happy Helming!
# Terraform detects on next plan
terraform plan
# (the plan will show drift in the release version; update main.tf to match, or re-apply to roll forward)
Best practice: Always use Terraform, but Helm is available for emergencies.
Comparison: Chapter 2 vs Chapter 3
Let's put side by side to visualize the gain.
Code Required
| Metric | Ch 2 (TF + K8s) | Ch 3 (TF + Helm) | Reduction |
|---|---|---|---|
| Lines of HCL | ~500 | ~100 | 80% ↓ |
| Lines of YAML | 0 | ~100 | - |
| Total code | 500 | 200 | 60% ↓ |
| Files | 1 monolith | 5 organized | - |
State Management
| Aspect | Ch 2 | Ch 3 |
|---|---|---|
| Resources in state | 50+ | 5 |
| State size | 2.3 MB | 15 KB |
| Plan time | 2 minutes | 10 seconds |
| Detectable drift | Partially | Yes (via Helm) |
Operations
| Operation | Ch 2 | Ch 3 |
|---|---|---|
| Initial deploy | terraform apply (5 min) | terraform apply (2 min) |
| Version upgrade | Edit multiple blocks | Change 1 line |
| Rollback | git revert + apply | helm rollback (instant) |
| View status | terraform state list | helm list |
| Debug | kubectl + state inspection | helm status |
Maintainability
| Factor | Ch 2 | Ch 3 |
|---|---|---|
| Learning curve | High (HCL + K8s) | Medium (HCL + familiar YAML) |
| New dev onboarding | Difficult | Reasonable |
| Code review | Complex (many changes) | Simple (clear diff) |
| Reusability | Low | High (public charts) |
The Advantages Scale
Now imagine you don't have 2 applications, but 20:
Chapter 2:
20 applications × 250 lines = 5,000 lines of HCL
20 applications × 50 resources = 1,000 resources in state
terraform plan = 10+ minutes
State file = 50+ MB
Chapter 3:
20 applications × 15 lines = 300 lines of HCL
20 applications × 100 lines YAML = 2,000 lines (familiar)
20 releases in state
terraform plan = 30 seconds
State file = 300 KB
The difference becomes even more dramatic at scale.
Chapter 3 solves 90% of Chapter 2's problems, and in many scenarios that's sufficient.
Terraform + Helm is the sweet spot for managing Kubernetes applications in a reproducible and versioned way.
Recapping what we achieved:
- Total versioning — Everything in Git
- Reproducibility — terraform apply = identical environment
- Separation of responsibilities — Terraform (infra) + Helm (apps)
- Manageable state — Few tracked resources
- Possible rollbacks — Via Helm or Git
- Less code — ~80% less HCL than Chapter 2 (and even more at scale)
- Ecosystem — Thousands of public charts
- Maintainable — Familiar YAML, minimal HCL
But there are still limitations:
1. Deployment is Manual
# Always needs someone executing
terraform apply
There's no real continuous deployment. Git isn't the single source of truth; it's an input that requires manual action.
2. No Continuous Reconciliation
# If someone does this:
kubectl edit deployment ollama -n ollama
# Terraform only detects on next plan
# Until then, there's divergence
There's no automatic process ensuring cluster = code.
3. Limited Auditing
Who deployed version 1.42.0?
git log # Shows commit
# But who executed terraform apply?
# There's no central record
Terraform state has some information, but it's not a complete audit log.
4. Approvals and Gates
How to ensure production deployment:
- Passed automated tests?
- Was approved by PO/PM?
- Has automatic rollback if it fails?
Terraform doesn't have this built-in. You need to build custom pipelines.
5. Complex Multi-Tenancy
How to allow:
- Team A to manage their apps in namespace team-a
- Team B to manage their apps in namespace team-b
- But both use the same cluster?
- Without giving access to the entire Terraform state?
Possible, but requires complex architecture.
What's Missing: GitOps
GitOps principles:
- Declarative: Desired state described declaratively (we have this!)
- Versioned: Everything in Git (we have this!)
- Pull-based: Cluster pulls changes from Git automatically (we don't have this)
- Continuous reconciliation: Agent ensures cluster = Git always (we don't have this)
What would change with GitOps:
Without GitOps (Ch 3):
Developer → commits → Git
Developer → terraform apply → Cluster
(Push-based, manual)
With GitOps (Ch 4):
Developer → commits → Git
ArgoCD (agent in cluster) → polls Git → applies changes
(Pull-based, automatic)
Additional benefits:
- Zero-touch deployment: Commit = automatic deploy
- Auto-healing: Cluster self-corrects if it diverges
- Complete audit: Each deploy is a commit
- Approvals: PR process = deployment approval
- Multi-tenancy: Each team has their repo/branch
Next Chapter:
In Chapter 4, we'll discover how Kubernetes infrastructure is managed at scale:
- ArgoCD: Real continuous deployment
- Application Sets: Deploy to multiple clusters
- Granular RBAC: Secure multi-tenancy
- Sync waves: Dependency orchestration
- Auto-sync: Git → Cluster automatic
- Automatic rollbacks: Integrated health checks