Who am I?
Hello dear users, my name is Alex Umansky aka TheBlueDrara, I'm 25 years young and I've been working in a big Neo Cloud HPC Data Center for quite a while now.
One of my biggest projects is being the eyes and ears of the DC, and over time I've found myself adding more and more tools to my monitoring stack so I can see as much data as possible.
And you guessed it: there aren't many guides or much detailed documentation out there to help accomplish this task.
So I decided to take the time and share my knowledge and try to make my dear readers' lives easier in setting up their own monitoring system.
Going forward, I'll be documenting and writing guides about my monitoring projects.
You can start by viewing my first monitoring guide on how to use SNMP exporter with Prometheus and Grafana to pull server and hardware health state via BMCs in an Out of Band network here
Overview
In this guide I will talk about how to pull system, service and kernel logs from the hosts on the network via in-band networking, using Alloy and Loki as our main stack, and visualize the logs with Grafana.
We will start with a shallow dive into what Alloy and Loki are, our main tools for capturing logs and making them usable, and then move on to deploying the tools as containers in our environment.
So let's not delay any further and jump into the guide.
Note: I will not show how to deploy Grafana, as it's quite basic. To keep this guide from getting too long, I will focus only on Alloy and Loki.
Prerequisites
For this setup we will need three main tools. For each, I've listed the image you'll need.
- Loki — grafana/loki:2.9.17 image
- Alloy — grafana/alloy:v1.11.3 image
- Grafana — grafana/grafana:12.0.0 image
Architecture
The architecture is quite simple: a client-server, push-based model.
On each node we want to monitor, we run an Alloy container that collects the host's logs and pushes them to one central Loki server, which listens for incoming logs.
Grafana then queries Loki and visualizes the logs.
[Host 1 + Alloy] ──┐
[Host 2 + Alloy] ──┼──push──> [Loki] <──query── [Grafana]
[Host N + Alloy] ──┘
The How To
Run a Loki Server
We will start by creating a Loki config file config.yaml.
This file is responsible for configuring where to store the logs and for how long Loki will store them.
To keep it short and simple, I'll briefly go over each block:

- auth_enabled: false — Disables multi-tenancy; uses a single default tenant.
- server — Sets the HTTP port Loki listens on; the default is 3100.
- common — Shared default configs for all components.
- schema_config — Defines how data is indexed and stored from a given date.
- storage_config — Specifies where chunks (the actual log data) are physically written on disk when using filesystem storage. This is important: I recommend creating a volume so the data won't get deleted if the container fails.
- limits_config — Per-tenant limits: for example, the 7-day retention.
- chunk_store_config — Caps how far back queries can look, preventing reads beyond the retention window.
auth_enabled: false
server:
http_listen_port: 3100
common:
path_prefix: /loki
replication_factor: 1
ring:
kvstore:
store: inmemory
schema_config:
configs:
- from: 2024-01-01
store: boltdb-shipper
object_store: filesystem
schema: v13
index:
prefix: index_
period: 24h
storage_config:
filesystem:
directory: /loki
limits_config:
retention_period: 168h # 7 days
allow_structured_metadata: false
chunk_store_config:
max_look_back_period: 168h
Now that we've created the config file, let's run the container:
docker run -d \
--name loki \
-p 3100:3100 \
-v $(pwd)/config.yaml:/etc/loki/config.yaml \
-v loki-data:/loki \
grafana/loki:2.9.17 \
-config.file=/etc/loki/config.yaml
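Once the container is up, a quick smoke test never hurts. The sketch below assumes you run it on the Loki host itself with port 3100 published as in the run command above; it probes the readiness endpoint and pushes one hand-made log line (note that Loki's push API expects nanosecond timestamps):

```shell
# Assumes Loki is reachable on localhost:3100
LOKI_URL="http://localhost:3100"

# 1. Readiness probe: returns "ready" once startup has finished
curl -s "$LOKI_URL/ready"

# 2. Push one test log line; timestamps are nanoseconds since the epoch
NOW_NS="$(date +%s%N)"
curl -s -X POST "$LOKI_URL/loki/api/v1/push" \
  -H "Content-Type: application/json" \
  -d "{\"streams\":[{\"stream\":{\"job\":\"smoke_test\"},\"values\":[[\"${NOW_NS}\",\"hello from curl\"]]}]}"
```

A successful push returns an empty 204 response, and the smoke_test stream becomes queryable right away.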
Running Alloy on the host
Before running the Alloy container, we need to tell Alloy what logs to pull and how. That is the job of the config.alloy file.
The hardest part of this stack is that the config file is written in Grafana's own configuration DSL, originally called "River" (nowadays usually just called the Alloy configuration syntax).
But there is also a simple solution: you can use this generator to create a simple config file for your needs.
For example, you can use this file. I will break it up a little as we need to understand what we are up to.
loki.write "local_host" {
endpoint {
url = "http://<LOKI_SERVER_IP>:3100/loki/api/v1/push"
}
}
loki.relabel "journal" {
forward_to = []
rule {
source_labels = ["__journal__systemd_unit"]
target_label = "service_name"
}
rule {
source_labels = ["__journal__transport"]
target_label = "transport"
}
rule {
source_labels = ["__journal_priority_keyword"]
target_label = "level"
}
rule {
source_labels = ["__journal__hostname"]
target_label = "host_name"
}
}
loki.source.journal "read" {
forward_to = [loki.write.local_host.receiver]
relabel_rules = loki.relabel.journal.rules
labels = { job = "log_collection" }
}
loki.write
This block defines the endpoint we push the logs to. We need to give it our Loki server's DNS name or IP address.
You can change local_host to anything you like; it's just a block label. You'll see the same pattern in the other blocks too.
loki.write "local_host" {
endpoint {
url = "http://<Loki_Server_IP>:3100/loki/api/v1/push"
}
}
loki.relabel
This block is all about relabeling.
When Alloy reads journal logs, each entry carries internal labels prefixed with __journal_, like __journal__systemd_unit. Labels starting with a double underscore are dropped before the push, so to make our lives easier we create rules that relabel the ones we care about into our own label names, giving us a simpler way to query them later.
Each rule copies a source label into a target_label. You can change the value to any label name you want, e.g. target_label = "ninja".
loki.relabel "journal" {
forward_to = []
rule {
source_labels = ["__journal__systemd_unit"]
target_label = "service_name"
}
rule {
source_labels = ["__journal__transport"]
target_label = "transport"
}
rule {
source_labels = ["__journal_priority_keyword"]
target_label = "level"
}
rule {
source_labels = ["__journal__hostname"]
target_label = "host_name"
}
}
loki.source.journal
The final block is the input. Under the hood it reads from the systemd journal to collect our logs.
It configures which block to forward the data to (forward_to), which rules to relabel it with (relabel_rules), and finally adds one small static label (job), which will be the main label for all data coming from this Alloy instance.
loki.source.journal "read" {
forward_to = [loki.write.local_host.receiver]
relabel_rules = loki.relabel.journal.rules
labels = { job = "log_collection" }
}
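If you're wondering where those __journal_* names come from: the journal source exposes each systemd journal field as an internal label built from "__journal_" plus the lowercased field name, so the field _SYSTEMD_UNIT becomes __journal__systemd_unit (hence the double underscore). A tiny sketch of the naming rule, plus a peek at your own journal's fields:

```shell
# Illustrates the field-name-to-label convention used by the journal source
to_journal_label() {
  printf '__journal_%s' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

to_journal_label "_SYSTEMD_UNIT"; echo   # __journal__systemd_unit
to_journal_label "_TRANSPORT"; echo      # __journal__transport
to_journal_label "_HOSTNAME"; echo       # __journal__hostname

# To see which fields your own journal actually carries, peek at one entry:
journalctl -o json -n 1 --no-pager 2>/dev/null | head -c 300
```

(A few labels, like __journal_priority_keyword, are computed rather than taken verbatim from a field, but the pattern above covers the common ones.)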
After we created our config file, we will need to pull the Alloy image to the host. I will leave this part to you, as it differs by environment.
I will jump straight into running the container.
IMPORTANT NOTE! If you use automation to deploy many Alloy containers at once, you may hit Loki's max active streams limit and start losing logs, so start with one container and then roll out the rest gradually.
To let the Alloy container read the journal logs, we need to mount the journal log directories and add the container's user to the host's journal group ID, so it has read permissions.
So we run this command:
Note: the journald group ID may vary, please give the parameter the correct ID. To find the ID on the host, run:
getent group systemd-journal | cut -d: -f3
docker run -d \
--name alloy \
--network host \
--restart unless-stopped \
--group-add <JOURNAL_GROUP_ID> \
-v <Path_To_Config_File>:/etc/alloy/config.alloy:ro \
-v /run/log/journal:/run/log/journal:ro \
-v /var/log/journal:/var/log/journal:ro \
-v /etc/machine-id:/etc/machine-id:ro \
grafana/alloy:v1.11.3 \
run --server.http.listen-addr=0.0.0.0:12345 /etc/alloy/config.alloy
Verify it works
Before moving on to Grafana, let's make sure everything is running as expected.
On the Loki host, check that Loki is ready:
curl http://<LOKI_SERVER_IP>:3100/ready
You should get back ready.
On the Alloy host, check the container logs for any errors:
docker logs alloy
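You can also probe Alloy's own HTTP server, on the listen address we set in the run command above. A small sketch, assuming the default 12345 port from this guide; the /-/ready endpoint answers once the configuration has loaded, and the web UI on the same port shows per-component health:

```shell
# Assumes Alloy's HTTP server listens on 0.0.0.0:12345 as configured above
ALLOY_ADDR="localhost:12345"

# Readiness probe: answers once the config has been loaded
curl -s "http://${ALLOY_ADDR}/-/ready"; echo

# The web UI at http://localhost:12345 shows the health of each component
```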
If Alloy is shipping logs successfully, you can confirm Loki is receiving them by querying for our job label:
curl -G -s "http://<LOKI_SERVER_IP>:3100/loki/api/v1/labels" | grep job
If all three checks pass, you're good to go.
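As a final sanity check, you can pull actual log lines straight out of Loki's HTTP API with a LogQL selector. This assumes the job="log_collection" label from the Alloy config above; replace the address with your Loki server's:

```shell
LOKI_URL="http://localhost:3100"      # replace with your Loki server address
QUERY='{job="log_collection"}'

# Ask for up to 5 recent matching lines (the range defaults to the last hour)
curl -G -s "$LOKI_URL/loki/api/v1/query_range" \
  --data-urlencode "query=$QUERY" \
  --data-urlencode "limit=5"
```

If the JSON response contains entries under "result", logs are flowing end to end.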
Grafana
From now on, you can set Loki as a Grafana data source and create dashboards for the logs.
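If you prefer configuration over clicking, Grafana can also pick up the data source automatically through its provisioning mechanism. A minimal sketch; the file path and the <LOKI_SERVER_IP> placeholder are assumptions to adjust for your environment:

```shell
# Write a minimal Grafana data source provisioning file for Loki.
# <LOKI_SERVER_IP> is a placeholder; fill in your own server address.
mkdir -p provisioning/datasources
cat > provisioning/datasources/loki.yaml <<'EOF'
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://<LOKI_SERVER_IP>:3100
    isDefault: true
EOF

# Mount the directory into the Grafana container at /etc/grafana/provisioning
# and the Loki data source appears on startup.
```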
Thank You
And that's a wrap!
Thank you so much for taking the time to read my guide — it really means a lot. I hope it saved you some of the headache I went through figuring this out, and that you now have a working Alloy + Loki stack pulling logs from your hosts.
If you spotted something that could be improved, have a question, or just want to share how your own monitoring setup looks, I'd love to hear from you in the comments.
Stay tuned — more monitoring guides are on the way.
— Alex (TheBlueDrara)