Muskan

Posted on Jun 15 • Originally published at zop.dev

ChromaDB Helm values.yaml: the 2026 production setup

#kubernetes #helm #ai #devops

Quick take

ChromaDB 1.0.0 rewrote the server in Rust and quietly broke most of the values.yaml configs you find online. Auth keys are ignored, logging keys are ignored, and the persistence semantics changed. Here is a values.yaml that actually works in production, with every block explained.

If you only have 90 seconds, this is the shape:

Use amikos-tech/chromadb-chart, the most maintained chart on Artifact Hub.
For ChromaDB 1.0.0+ assume chromadb.auth.* is dead. Network-level security is the new model.
Persistence is non-optional, since the Rust server always writes to disk and no PVC means no durability.

Why the old guides will burn you in 2026

I spent half a Saturday last quarter migrating a ChromaDB deployment from 0.4.x to 1.0.x and watched four tutorials silently break. Three things changed that almost nothing on Google has caught up to.

The Rust server killed the in-memory mode. Before 1.0.0, IS_PERSISTENT=false was a valid setup for ephemeral testing. The Rust server always writes to disk. If you do not mount a PVC, you do not have data after the first restart.

Auth got deprecated, not removed. The chart still accepts chromadb.auth.* values. It just ignores them. The official guidance is to use network-level controls: private networking, ingress auth, API gateway, or mTLS. The default chromadb-auth secret is generated at install but only used by legacy clients.

Telemetry and logging keys are no-ops. chromadb.logging.*, chromadb.anonymizedTelemetry, and chromadb.maintenance.* are all ignored in 1.0.0+. The Rust server reads its config from the new chromadb.extraConfig YAML injection block instead.

The result: every blog post written before 2025-Q4 about ChromaDB on K8s is either wrong or dangerously incomplete.

A production-ready values.yaml

Here is what actually deploys cleanly today. Skim it first, then read the section-by-section below.

chromadb:
  image:
    repository: ghcr.io/chroma-core/chroma   # full image path; a separate registry key would duplicate the ghcr.io/chroma-core prefix
    tag: "1.0.5"
    pullPolicy: IfNotPresent

  serverHttpPort: 8000
  persistDirectory: /data

  data:
    volumeSize: 50Gi
    storageClass: gp3
    accessMode: ReadWriteOnce
    retentionPolicy: Retain

  serviceAccount:
    create: true
    annotations: {}

  extraConfig:
    cors_allow_origins:
      - "https://app.example.com"
    open_telemetry:
      endpoint: "http://otel-collector.observability:4317"
      service_name: "chromadb-prod"

  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      cpu: 2000m
      memory: 8Gi

  podSecurityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000

service:
  type: ClusterIP
  port: 8000

ingress:
  enabled: false

Under 50 lines, no junk, every block load-bearing. Now the explanation.

What each block does

Image and version pinning

Pin the tag explicitly to 1.0.5 or whatever your last verified version is. Chart.AppVersion floats with chart releases and silently upgrades you across minor versions. The Rust rewrite happened mid-2025, so assume any unpinned tag will eventually cross that boundary.

Persistence

persistDirectory: /data is the path inside the pod where Chroma writes its segments and SQLite WAL. The PVC mounts here. retentionPolicy: Retain is the single most important line in the file. Default is Delete, which means uninstalling the chart wipes your vectors. I have seen this happen twice. Always Retain in prod.

volumeSize: 50Gi is conservative. Vector embeddings are small per row, but they add up: assuming float32, 768 dimensions × 4 bytes × 1,000,000 documents is about 3.07 GB of raw vectors before metadata or index overhead. Plan for 10x your current corpus.
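One thing the values file does not control is the reclaim policy of the PersistentVolume behind the claim, because that comes from the StorageClass. Here is a minimal sketch, assuming the AWS EBS CSI driver. The class name gp3-retain is hypothetical; you would point storageClass at it in the values above.

```yaml
# Sketch only: assumes the AWS EBS CSI driver is installed.
# "gp3-retain" is a made-up name for illustration.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-retain
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain                    # keep the EBS volume when the PVC is deleted
allowVolumeExpansion: true               # lets you grow the PVC later without recreating it
volumeBindingMode: WaitForFirstConsumer  # provision in the zone where the pod lands
```

The reclaim policy is stamped onto each volume when it is provisioned, so changing the class later does not touch volumes that already exist.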

Network security via extraConfig

Since chromadb.auth.* is ignored, the practical move is to keep the service as ClusterIP and put auth at the ingress layer. The extraConfig.cors_allow_origins block scopes browser access to your app domain, which is the chart's only built-in network control in 1.0. Keep in mind that CORS is enforced by browsers only; it does nothing to stop curl or a backend client that can reach the pod.
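Inside the cluster, a NetworkPolicy is the cheapest way to make "ClusterIP only" mean something. This is a sketch, not the chart's own output: the namespaces (chroma, app), the pod labels, and the rag-api client are all assumptions, so check your real labels with kubectl get pods --show-labels first.

```yaml
# Sketch only: namespaces, labels, and the client workload are assumed names.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: chromadb-allow-app-only
  namespace: chroma                        # assumed namespace of the release
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: chromadb     # assumed label on the ChromaDB pod
  policyTypes:
    - Ingress
  ingress:
    - from:
        # namespaceSelector and podSelector in the same entry are ANDed:
        # only rag-api pods in the app namespace are allowed in.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: app
          podSelector:
            matchLabels:
              app.kubernetes.io/name: rag-api   # hypothetical client workload
      ports:
        - protocol: TCP
          port: 8000
```

This only has an effect if your CNI plugin enforces NetworkPolicy. On a CNI that does not, the object is accepted and quietly does nothing.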

For service-to-service auth, the 2026 pattern is a service mesh sidecar (Istio, Linkerd) with mTLS, or an API gateway that injects a bearer token. The chart does not opine, which is honest of it.
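If you go the mesh route with Istio, the pattern roughly looks like the sketch below: require mTLS on the ChromaDB workload, then allow one client identity. The namespace, labels, and the rag-api service account are assumptions for illustration, and both pods need sidecars injected for this to work.

```yaml
# Sketch only: assumes Istio is installed and sidecars are injected.
# Namespace, labels, and the client service account are assumed names.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: chromadb-strict-mtls
  namespace: chroma
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: chromadb
  mtls:
    mode: STRICT                 # reject plaintext connections to this workload
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: chromadb-allow-rag-api
  namespace: chroma
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: chromadb
  action: ALLOW
  rules:
    - from:
        - source:
            # SPIFFE-style identity of the hypothetical client service account
            principals:
              - cluster.local/ns/app/sa/rag-api
      to:
        - operation:
            ports: ["8000"]
```

Once an ALLOW policy selects the workload, anything that does not match a rule is denied, which is the behavior you want here.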

Resources

Vector search is memory-hungry, not CPU-hungry. 2Gi request and 8Gi limit is a reasonable floor for a single-tenant ChromaDB serving a million-vector index. If you are running HNSW with high ef_construction, double the limit.

CPU under-allocation is the second most common failure I see. The Rust server is happy on 500m for inserts but will throttle during bulk queries. Give it room.

Pod security context

runAsNonRoot: true is required by the restricted Pod Security Standard, and fsGroup: 1000 keeps the mounted volume writable by the non-root user. The chart used to default to root, which broke namespaces that enforce restricted PSS. Set these explicitly.
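For reference, this is roughly what the restricted profile checks on the rendered Pod, plus the namespace label that turns enforcement on. It is a sketch of plain Kubernetes fields, not chart values; how the chart exposes the container-level settings is something to confirm against its own values.yaml. The namespace and container names are assumptions.

```yaml
# Sketch only: namespace and container names are assumed.
apiVersion: v1
kind: Namespace
metadata:
  name: chroma
  labels:
    pod-security.kubernetes.io/enforce: restricted
---
# Fragment of the rendered Pod spec that the restricted profile inspects.
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: chromadb
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```

A quick way to compare is kubectl get pod -o yaml on the running pod and looking for the same fields.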

The four pitfalls that wreck a fresh install

Every ChromaDB-on-K8s incident I have helped debug has been one of these:

1. The phantom auth

The chart accepts chromadb.auth.basic and silently ignores it on 1.0.0+. Your client connects without credentials and you assume security is on. It is not. Verify with a curl from a different namespace.
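If you prefer a manifest over an ad hoc curl, a throwaway Pod does the same check. This is a sketch under assumptions: the service DNS name (chromadb in namespace chroma) is a guess at your release, and the collections path assumes the v2 API with the default tenant and database.

```yaml
# Sketch only: service name, namespaces, and the API path are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: chroma-auth-probe
  namespace: default             # any namespace other than ChromaDB's
spec:
  restartPolicy: Never
  containers:
    - name: curl
      image: curlimages/curl:8.8.0
      command: ["curl"]
      args:
        - "-sS"
        - "-o"
        - "/dev/null"
        - "-w"
        - "%{http_code}\n"       # print only the HTTP status code
        - "http://chromadb.chroma.svc.cluster.local:8000/api/v2/tenants/default_tenant/databases/default_database/collections"
```

Apply it, then read the result with kubectl logs chroma-auth-probe. A 200 with no credentials sent means anyone who can reach the service can list your collections. Delete the Pod when you are done.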

2. The non-Retain PVC

Default retentionPolicy: Delete plus a helm uninstall equals zero vectors. Always set Retain even in staging.

3. The 1Gi default volume

The chart default is 1Gi. A 50k-doc index fills that in about a week and the pod starts failing in a non-obvious way: writes fail, reads still work, and your dashboards look green until the WAL corrupts.

4. The OOM at query time

Setting memory: limits: 2Gi is fine for ingestion and catastrophic for queries. Set the limit at 4x the working set, not at the resting set. Watch container_memory_working_set_bytes, not container_memory_usage_bytes.
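An alert on that ratio catches the problem before the OOM killer does. Here is a minimal Prometheus rule sketch. It assumes you scrape cAdvisor metrics and run kube-state-metrics, and that the container is named chromadb in a namespace called chroma; the 80% threshold and 10 minute window are starting points, not recommendations from the chart.

```yaml
# Sketch only: namespace, container name, and thresholds are assumptions.
groups:
  - name: chromadb-memory
    rules:
      - alert: ChromaDBMemoryNearLimit
        # working set divided by the configured memory limit, per pod
        expr: |
          max by (namespace, pod) (
            container_memory_working_set_bytes{namespace="chroma", container="chromadb"}
          )
          /
          max by (namespace, pod) (
            kube_pod_container_resource_limits{namespace="chroma", container="chromadb", resource="memory"}
          )
          > 0.8
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "ChromaDB working set above 80% of its memory limit"
```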

Where this setup still falls short

This values.yaml is single-replica. ChromaDB OSS does not support multi-replica writes in 1.0 because the storage layer assumes one writer. If you need HA, you have two options. One, run it as a StatefulSet with leader election (complex, custom). Two, use ChromaDB Cloud or a managed vector database.

For multi-tenant SaaS use cases, the chart also does not isolate tenants at the persistence layer. You will need one ChromaDB instance per tenant or one collection per tenant with row-level auth at the application layer.

Frequently asked questions

Which Helm chart should I actually use?
amikos-tech/chromadb-chart is the most maintained. The official chroma-core repo does not ship a chart, so this community one is the de facto standard.

What about the chromadb-auth secret the chart creates?
For 1.0.0+ it is a legacy compatibility helper. It is generated, but the Rust server does not consume it. Treat it as dead code until the chart drops it.

How do I expose ChromaDB to a frontend app?
Set ingress.enabled: true, use cert-manager for TLS, and put your auth at the ingress layer (OAuth proxy, Cloudflare Access, or an API gateway). Do not expose port 8000 directly.
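As a rough picture of the end state, here is a standalone Ingress sketch for ingress-nginx with cert-manager and an external auth endpoint such as oauth2-proxy. The hostnames, the issuer name, the auth URLs, and the chromadb service name are all assumptions; the chart's own ingress values may render something slightly different.

```yaml
# Sketch only: hosts, issuer, auth endpoints, and service name are assumed.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chromadb
  namespace: chroma
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # ingress-nginx asks this endpoint to approve every request
    nginx.ingress.kubernetes.io/auth-url: "https://auth.example.com/oauth2/auth"
    nginx.ingress.kubernetes.io/auth-signin: "https://auth.example.com/oauth2/start?rd=$escaped_request_uri"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - chroma.example.com
      secretName: chromadb-tls     # cert-manager creates and renews this secret
  rules:
    - host: chroma.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: chromadb
                port:
                  number: 8000
```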

Does this work on GKE Autopilot or EKS Fargate?
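Not as written on EKS Fargate: Fargate pods cannot mount EBS volumes, so the gp3 PVC above will not attach. Fargate does support statically provisioned EFS volumes, though a network filesystem is a risky home for a SQLite WAL. GKE Autopilot blocks runAsUser: 1000 on some nodepools, so set it to a higher UID like 65532 if you hit the restriction.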

How do I monitor it?
Probe /api/v2/heartbeat for liveness and add the extraConfig.open_telemetry block to ship traces. The 1.0.0 server can export OTel traces natively.
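In plain Kubernetes terms the heartbeat check is a pair of HTTP probes on the container. The chart may already render something like this, so treat the fragment as a sketch to compare against, with timings that are my assumptions rather than chart defaults.

```yaml
# Sketch only: container-level fragment; timings are assumed, not chart defaults.
livenessProbe:
  httpGet:
    path: /api/v2/heartbeat
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3      # restart after three consecutive failures
readinessProbe:
  httpGet:
    path: /api/v2/heartbeat
    port: 8000
  periodSeconds: 5
```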

What does your ChromaDB on K8s look like?

If you are running ChromaDB in production, the question I would want answered is: how are you handling auth now that chromadb.auth.* is gone? Mesh, ingress, or API gateway? Drop your stack in the comments. I will reply with whatever I have seen work or break.

Full disclosure on tooling: I publish more cloud-native infra walkthroughs over at zop.dev/blog, including the original of this post.
