Siddarth

Multi-tenant Loki on Kubernetes

I'm known as the "monitoring guy" in the company I work at, and this post is about how I deployed Grafana Loki in SimpleScalable mode on Kubernetes and used Promtail to ingest logs per tenant.

What's Loki?

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus.

If you are using Loki for the first time, I highly recommend checking out the Loki docs.

What we need

  1. A Loki deployment (SimpleScalable via Helm)
  2. Promtail
  3. S3 bucket

Loki setup (SimpleScalable + S3 storage)

The deployment relies on object storage, so you need an S3-compatible bucket and its credentials to store the logs.

Our Helm chart is pretty simple.

loki-helm-chart/
  ├── templates/
  │   └── loki-secrets.yaml
  ├── Chart.yaml
  └── values.yaml

Now, loki-secrets.yaml holds the Secret with our S3 credentials:

apiVersion: v1
kind: Secret
metadata:
  name: loki-s3-credentials
  namespace: monitoring
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "<id>"
  AWS_SECRET_ACCESS_KEY: "<access-key>"


Loki values.yaml

deploymentMode: SimpleScalable

loki:
  auth_enabled: true

  extraEnvFrom:
    - secretRef:
        name: loki-s3-credentials

  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: loki_index_
          period: 24h

  storage_config:
    tsdb_shipper:
      index_gateway_client:
        server_address: '{{ include "loki.indexGatewayAddress" . }}'


  storage:
    type: s3
    s3:
      endpoint: rook-ceph-rgw-store.rook-ceph.svc.cluster.local:80
      s3ForcePathStyle: true
      insecure: true


    bucketNames:
      chunks: loki-logs
      ruler: loki-ruler
      admin: loki-admin


  commonConfig:
    replication_factor: 3

  ingester:
    chunk_encoding: snappy

  querier:
    multi_tenant_queries_enabled: true
    max_concurrent: 4

  pattern_ingester:
    enabled: true

  limits_config:
    allow_structured_metadata: true
    volume_enabled: true


A couple of important notes:

  • auth_enabled: true means Loki expects a tenant identifier (via X-Scope-OrgID) for multi-tenancy.

  • Retention in Loki is implemented by the Compactor. If retention isn’t enabled there, logs can live forever even if retention_period is set. (Log Retention)
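
To make the retention note concrete, here is a minimal sketch of what enabling it could look like in the same values.yaml. It assumes Loki 3.x and that the chart passes loki.compactor and loki.limits_config straight into the Loki config. The 744h period and the delete_request_store value are illustrative assumptions, not settings taken from the deployment above.

```yaml
loki:
  compactor:
    # Retention is only applied when the compactor has it switched on.
    retention_enabled: true
    # Loki 3.x expects a store for delete requests once retention is enabled
    # (assumed here to be the same S3 backend).
    delete_request_store: s3
  limits_config:
    # Assumed global retention of roughly 31 days; pick what suits you.
    retention_period: 744h
```

With only retention_period set and retention_enabled left at its default, the compactor would keep compacting indexes but never delete old chunks.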

Rest of the values

Loki components still need local disk for things like WAL / cache / working state so they can perform well. This is why you’ll see PVCs on the read/write/backend pods.

ceph-block is our expandable storage class, by the way.

backend:
  replicas: 2
  persistence:
    enabled: true
    size: 2Gi
    storageClass: ceph-block
    accessModes:
      - ReadWriteOnce
  extraEnvFrom:
    - secretRef:
        name: loki-s3-credentials

write:
  replicas: 3
  persistence:
    enabled: true
    size: 2Gi
    storageClass: ceph-block
    accessModes:
      - ReadWriteOnce
  extraEnvFrom:
    - secretRef:
        name: loki-s3-credentials

read:
  replicas: 2
  extraEnvFrom:
    - secretRef:
        name: loki-s3-credentials


Gateway config (what it does and does not do)

gateway:
  nginx:
    customHeaders:
      - name: X-Scope-OrgID
        value: $http_x_scope_orgid

This does not enforce isolation by itself. It just forwards whatever tenant header the caller provides.

Grafana’s docs recommend that X-Scope-OrgID be set by an authenticating reverse proxy (so users can’t spoof other tenants). (Grafana Labs)

In my setup, clients do not directly access Loki. Only Promtail (inside the cluster) pushes logs, and I control the tenant mapping, so forwarding this header is fine for now.

Great! Now, moving on to Promtail.

Promtail setup

Promtail runs as a DaemonSet on each node and ships container logs to Loki:

daemonset:
  enabled: true
deployment:
  enabled: false

config:
  clients:
    - url: http://loki-gateway.monitoring.svc.cluster.local:80/loki/api/v1/push
      timeout: 60s
      batchwait: 1s
      batchsize: 1048576

  serverPort: 3101

The important part: setting the tenant correctly

I mapped the Kubernetes namespace into a namespace label (in the Promtail chart, snippets is nested under config):

snippets:
  common:
    - action: replace
      source_labels: [__meta_kubernetes_namespace]
      target_label: namespace

Now in the pipeline, set tenant from the label:

snippets:
  pipelineStages:
    - cri: {}
    - tenant:
        label: namespace
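
One thing to keep in mind: with auth_enabled: true, Loki rejects pushes that carry no tenant at all. As far as I understand the Promtail docs, entries for which the tenant stage resolves nothing fall back to the client's tenant_id. A minimal sketch of such a fallback, assuming the Promtail chart passes extra client keys through unchanged; the tenant name unassigned is hypothetical:

```yaml
config:
  clients:
    - url: http://loki-gateway.monitoring.svc.cluster.local:80/loki/api/v1/push
      # Hypothetical fallback tenant, used only when the tenant stage sets nothing.
      tenant_id: unassigned
  snippets:
    pipelineStages:
      - cri: {}
      # Per-entry tenant taken from the namespace label, as above.
      - tenant:
          label: namespace
```

Every pod has a namespace, so in this setup the fallback should rarely trigger. It is mostly a safety net for scrape jobs that do not carry the label.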

And voila! We have a multi-tenant Loki setup.

Now we can add the Loki data source in Grafana.

Set the X-Scope-OrgID header on the data source. I set its value to monitoring.

And now we only see logs from the monitoring namespace.
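
If you prefer provisioning over clicking through the UI, the same data source can be sketched as a Grafana provisioning file. This is a minimal example under assumptions: the data source name is hypothetical, and the URL assumes Grafana runs in the same cluster and reaches Loki through the gateway service used above.

```yaml
apiVersion: 1
datasources:
  - name: Loki (monitoring)   # hypothetical data source name
    type: loki
    access: proxy
    url: http://loki-gateway.monitoring.svc.cluster.local
    jsonData:
      # Custom HTTP header sent with every query.
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      # Tenant ID, which here is the namespace name.
      httpHeaderValue1: monitoring
```

Since multi_tenant_queries_enabled is true in the Loki values, the header value can also list several tenants separated by a pipe (for example monitoring|default) to query them together.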

A HUGE shoutout to Ben Ye (GitHub profile) for his help!

Hope you found this useful. I'm always up for a quick chat, so connect with me on LinkedIn or Twitter.
