Who am I?
Hello dear users, my name is Alex Umansky aka TheBlueDrara, I'm 25 years young and I've been working in a big Neo Cloud HPC Data Center for quite a while now.
One of my biggest projects is being the eyes and ears of the DC, and over time I've found myself adding more and more tools to my monitoring stack so I can see as much data as possible.
And you guessed it: there aren't many guides or much detailed documentation out there to help accomplish this task.
So I decided to take the time and share my knowledge and try to make my dear readers' lives easier in setting up their own monitoring system.
Going forward, I'll be documenting and writing guides about my monitoring projects.
You can start by viewing my first monitoring guide on how to use SNMP exporter with Prometheus and Grafana to pull server and hardware health state via BMCs in an Out of Band network here
Overview
In this guide I will talk about how to pull system, service and kernel logs from the hosts on the network via in-band networking, using Alloy and Loki as our main stack, and visualize the logs with Grafana.
We will start with a shallow dive into what Alloy and Loki are, our main tools for capturing logs and making them usable, and then move on to deploying the tools as containers in our environment.
So let's not delay any further and jump into the guide.
Note: I will not show how to deploy Grafana, as it's quite basic. To keep this guide from getting too long, I will focus only on Alloy and Loki.
Prerequisites
For this setup we will need three main tools. For each, I've listed the image you'll need.
- Loki — grafana/loki:2.9.17 image
- Alloy — grafana/alloy:v1.11.3 image
- Grafana — grafana/grafana:12.0.0 image
Architecture
The architecture is quite simple: a client-server, push-based model.
On each node we want to monitor, we run an Alloy container that collects the host's logs and pushes them to one central Loki server, which listens for incoming logs.
Grafana then queries Loki and visualizes the logs.
[Host 1 + Alloy] ──┐
[Host 2 + Alloy] ──┼──push──> [Loki] <──query── [Grafana]
[Host N + Alloy] ──┘
The How To
Run a Loki Server
We will start by creating a Loki config file config.yaml.
This file is responsible for configuring where to store the logs and for how long Loki will store them.
To keep it short and simple, I'll briefly go over each block:

- auth_enabled: false — Disables multi-tenancy; uses a single default tenant.
- server — Sets the HTTP port Loki listens on; the default is 3100.
- common — Shared default configs for all components.
- schema_config — Defines how data is indexed and stored from a given date.
- storage_config — Specifies where chunks (the actual log data) are physically written on disk when using filesystem storage. This is important: I recommend creating a volume so the data won't get deleted if the container fails.
- limits_config — Per-tenant limits: for example, the 7-day retention.
- chunk_store_config — Caps how far back queries can look, preventing reads beyond the retention window.
auth_enabled: false
server:
http_listen_port: 3100
common:
path_prefix: /loki
replication_factor: 1
ring:
kvstore:
store: inmemory
schema_config:
configs:
- from: 2024-01-01
store: boltdb-shipper
object_store: filesystem
schema: v13
index:
prefix: index_
period: 24h
storage_config:
filesystem:
directory: /loki
limits_config:
retention_period: 168h # 7 days
allow_structured_metadata: false
chunk_store_config:
max_look_back_period: 168h
Now that we've created the config file, let's run the container:
docker run -d \
--name loki \
-p 3100:3100 \
-v $(pwd)/config.yaml:/etc/loki/config.yaml \
-v loki-data:/loki \
grafana/loki:2.9.17 \
-config.file=/etc/loki/config.yaml
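Once the container is up, a quick smoke test never hurts. The sketch below assumes you run it on the Loki host itself with port 3100 published as in the run command above; it probes the readiness endpoint and pushes one hand-made log line (note that Loki's push API expects nanosecond timestamps):

```shell
# Assumes Loki is reachable on localhost:3100
LOKI_URL="http://localhost:3100"

# 1. Readiness probe: returns "ready" once startup has finished
curl -s "$LOKI_URL/ready"

# 2. Push one test log line; timestamps are nanoseconds since the epoch
NOW_NS="$(date +%s%N)"
curl -s -X POST "$LOKI_URL/loki/api/v1/push" \
  -H "Content-Type: application/json" \
  -d "{\"streams\":[{\"stream\":{\"job\":\"smoke_test\"},\"values\":[[\"${NOW_NS}\",\"hello from curl\"]]}]}"
```

A successful push returns an empty 204 response, and the smoke_test stream becomes queryable right away.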
Running Alloy on the host
Before running the Alloy container, we need to tell Alloy what logs to pull and how. That is the job of the config.alloy file.
The hardest part of this stack is that the config file is written in Grafana's own configuration DSL, originally called "River" (nowadays usually just called the Alloy configuration syntax).
But there is also a simple solution: you can use this generator to create a simple config file for your needs.
For example, you can use this file. I will break it up a little as we need to understand what we are up to.
loki.write "local_host" {
endpoint {
url = "http://<LOKI_SERVER_IP>:3100/loki/api/v1/push"
}
}
loki.relabel "journal" {
forward_to = []
rule {
source_labels = ["__journal__systemd_unit"]
target_label = "service_name"
}
rule {
source_labels = ["__journal__transport"]
target_label = "transport"
}
rule {
source_labels = ["__journal_priority_keyword"]
target_label = "level"
}
rule {
source_labels = ["__journal__hostname"]
target_label = "host_name"
}
}
loki.source.journal "read" {
forward_to = [loki.write.local_host.receiver]
relabel_rules = loki.relabel.journal.rules
labels = { job = "log_collection" }
}
loki.write
This block defines the endpoint we push the logs to. We need to give it our Loki server's DNS name or IP address.
You can change local_host to anything you like; it's just a block label. You'll see the same pattern in the other blocks too.
loki.write "local_host" {
endpoint {
url = "http://<Loki_Server_IP>:3100/loki/api/v1/push"
}
}
loki.relabel
This block is all about relabeling.
When Alloy reads journal logs, each entry carries internal labels prefixed with __journal_, like __journal__systemd_unit. Labels starting with a double underscore are dropped before the push, so to make our lives easier we create rules that relabel the ones we care about into our own label names, giving us a simpler way to query them later.
Each rule copies a source label into a target_label. You can change the value to any label name you want, e.g. target_label = "ninja".
loki.relabel "journal" {
forward_to = []
rule {
source_labels = ["__journal__systemd_unit"]
target_label = "service_name"
}
rule {
source_labels = ["__journal__transport"]
target_label = "transport"
}
rule {
source_labels = ["__journal_priority_keyword"]
target_label = "level"
}
rule {
source_labels = ["__journal__hostname"]
target_label = "host_name"
}
}
loki.source.journal
The final block is the input. Under the hood it reads from the systemd journal to collect our logs.
It configures which block to forward the data to (forward_to), which rules to relabel it with (relabel_rules), and finally adds one small static label (job), which will be the main label for all data coming from this Alloy instance.
loki.source.journal "read" {
forward_to = [loki.write.local_host.receiver]
relabel_rules = loki.relabel.journal.rules
labels = { job = "log_collection" }
}
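If you're wondering where those __journal_* names come from: the journal source exposes each systemd journal field as an internal label built from "__journal_" plus the lowercased field name, so the field _SYSTEMD_UNIT becomes __journal__systemd_unit (hence the double underscore). A tiny sketch of the naming rule, plus a peek at your own journal's fields:

```shell
# Illustrates the field-name-to-label convention used by the journal source
to_journal_label() {
  printf '__journal_%s' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

to_journal_label "_SYSTEMD_UNIT"; echo   # __journal__systemd_unit
to_journal_label "_TRANSPORT"; echo      # __journal__transport
to_journal_label "_HOSTNAME"; echo       # __journal__hostname

# To see which fields your own journal actually carries, peek at one entry:
journalctl -o json -n 1 --no-pager 2>/dev/null | head -c 300
```

(A few labels, like __journal_priority_keyword, are computed rather than taken verbatim from a field, but the pattern above covers the common ones.)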
After we created our config file, we will need to pull the Alloy image to the host. I will leave this part to you, as it differs by environment.
I will jump straight into running the container.
IMPORTANT NOTE! If you use automation to deploy many Alloy containers at once, you may hit Loki's max active streams limit and start losing logs, so start with one container and then roll out the rest gradually.
To let the Alloy container read the journal logs, we need to mount the journal log directories and add the container's user to the host's journal group ID, so it has read permissions.
So we run this command:
Note: the journald group ID may vary, please give the parameter the correct ID. To find the ID on the host, run:
getent group systemd-journal | cut -d: -f3
docker run -d \
--name alloy \
--network host \
--restart unless-stopped \
--group-add <JOURNAL_GROUP_ID> \
-v <Path_To_Config_File>:/etc/alloy/config.alloy:ro \
-v /run/log/journal:/run/log/journal:ro \
-v /var/log/journal:/var/log/journal:ro \
-v /etc/machine-id:/etc/machine-id:ro \
grafana/alloy:v1.11.3 \
run --server.http.listen-addr=0.0.0.0:12345 /etc/alloy/config.alloy
Verify it works
Before moving on to Grafana, let's make sure everything is running as expected.
On the Loki host, check that Loki is ready:
curl http://<LOKI_SERVER_IP>:3100/ready
You should get back ready.
On the Alloy host, check the container logs for any errors:
docker logs alloy
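You can also probe Alloy's own HTTP server, on the listen address we set in the run command above. A small sketch, assuming the default 12345 port from this guide; the /-/ready endpoint answers once the configuration has loaded, and the web UI on the same port shows per-component health:

```shell
# Assumes Alloy's HTTP server listens on 0.0.0.0:12345 as configured above
ALLOY_ADDR="localhost:12345"

# Readiness probe: answers once the config has been loaded
curl -s "http://${ALLOY_ADDR}/-/ready"; echo

# The web UI at http://localhost:12345 shows the health of each component
```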
If Alloy is shipping logs successfully, you can confirm Loki is receiving them by querying for our job label:
curl -G -s "http://<LOKI_SERVER_IP>:3100/loki/api/v1/labels" | grep job
If all three checks pass, you're good to go.
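As a final sanity check, you can pull actual log lines straight out of Loki's HTTP API with a LogQL selector. This assumes the job="log_collection" label from the Alloy config above; replace the address with your Loki server's:

```shell
LOKI_URL="http://localhost:3100"      # replace with your Loki server address
QUERY='{job="log_collection"}'

# Ask for up to 5 recent matching lines (the range defaults to the last hour)
curl -G -s "$LOKI_URL/loki/api/v1/query_range" \
  --data-urlencode "query=$QUERY" \
  --data-urlencode "limit=5"
```

If the JSON response contains entries under "result", logs are flowing end to end.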
Grafana
From now on, you can set Loki as a Grafana data source and create dashboards for the logs.
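If you prefer configuration over clicking, Grafana can also pick up the data source automatically through its provisioning mechanism. A minimal sketch; the file path and the <LOKI_SERVER_IP> placeholder are assumptions to adjust for your environment:

```shell
# Write a minimal Grafana data source provisioning file for Loki.
# <LOKI_SERVER_IP> is a placeholder; fill in your own server address.
mkdir -p provisioning/datasources
cat > provisioning/datasources/loki.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://<LOKI_SERVER_IP>:3100
    isDefault: true
EOF

# Mount the directory into the Grafana container at /etc/grafana/provisioning
# and the Loki data source appears on startup.
```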
Thank You
And that's a wrap!
Thank you so much for taking the time to read my guide — it really means a lot. I hope it saved you some of the headache I went through figuring this out, and that you now have a working Alloy + Loki stack pulling logs from your hosts.
If you spotted something that could be improved, have a question, or just want to share how your own monitoring setup looks, I'd love to hear from you in the comments.
Stay tuned — more monitoring guides are on the way.
— Alex (TheBlueDrara)