Prometheus: Alertmanager Web UI alerts Silence

Arseny Zinchenko
DevOps, cloud and infrastructure engineer. Love Linux, OpenSource, and AWS.
Originally published at rtfm.co.ua · 4 min read

The frequency of re-sending active alerts via Alertmanager is configured with the repeat_interval option in the /etc/alertmanager/config.yml file.

We have this interval set to 15 minutes, and as a result, we get notifications about active alerts in our Slack every fifteen minutes.
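For reference, here is a minimal sketch of where repeat_interval lives in the Alertmanager's routing configuration; the receiver name and the other intervals are example values, not necessarily ours:

```yaml
route:
  # receiver name is an example
  receiver: 'slack-notifications'
  group_wait: 30s
  group_interval: 5m
  # re-send notifications for still-firing alerts every 15 minutes
  repeat_interval: 15m
```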

Still, some alerts are "known issues": we have already started investigating or fixing them, but the alert keeps being sent to Slack.

To mute those alerts and prevent them from being sent over and over, they can be marked as "silenced".

An alert can be silenced via the Alertmanager's Web UI, see the documentation.

So, what we will do in this post:

  • update Alertmanager's startup options to enable the Web UI
  • update an NGINX virtual host to get access to the Alertmanager's Web UI
  • check and configure the Prometheus server to send alerts
  • add a test alert to check how to silence it

Alertmanager Web UI configuration

We have our Alertmanager running from a Docker Compose file, so let's add two parameters to its command field: web.route-prefix, which specifies a URI for the Alertmanager Web UI, and web.external-url, which sets the full URL.

The full URL will look like dev.monitor.example.com/alertmanager. Add the parameters:

...
  alertmanager:
    image: prom/alertmanager:v0.21.0
    networks:
      - prometheus
    ports:
      - 9093:9093
    volumes:
      - /etc/prometheus/alertmanager_config.yml:/etc/alertmanager/config.yml
    command:
      - '--config.file=/etc/alertmanager/config.yml'
      - '--web.route-prefix=/alertmanager'
      - '--web.external-url=https://dev.monitor.example.com/alertmanager'
...

Alertmanager is working in a Docker container and is accessible via localhost:9093 from the monitoring host:

root@monitoring-dev:/home/admin# docker ps | grep alert
24ae3babd644 prom/alertmanager:v0.21.0 "/bin/alertmanager -…" 3 seconds ago Up 1 second 0.0.0.0:9093->9093/tcp prometheus_alertmanager_1

In the NGINX virtual host config, add a new upstream pointing to the Alertmanager's Docker container:

...
upstream alertmanager {
    server 127.0.0.1:9093;
}
...

Also, add a new location block in this file to proxy all requests for dev.monitor.example.com/alertmanager to this upstream:

...
    location /alertmanager {

        proxy_redirect off;            
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://alertmanager$request_uri;
    }
...

Save and reload NGINX and Alertmanager.

Now, open the https://dev.monitor.example.com/alertmanager URL, and you should see the Alertmanager Web UI:

There are no alerts here yet; wait for Prometheus to send new ones.

Prometheus: "Error sending alert" err="bad response status 404 Not Found"

After a new alert appears in the Prometheus server, you can see the following error in its log:

caller=notifier.go:527 component=notifier alertmanager=http://alertmanager:9093/api/v1/alerts count=3 msg="Error sending alert" err="bad response status 404 Not Found"

It happens because currently we have the alertmanagers configured as:

...
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager:9093
...

So, we need to add the URI of the Alertmanager by using the path_prefix setting:

...
alerting:
  alertmanagers:
  - path_prefix: "/alertmanager/"
    static_configs:
    - targets:
      - alertmanager:9093
...

Restart Prometheus, and wait for alerts again:

This time, you should see them in the Alertmanager Web UI too:

Alertmanager: an alert Silence

Now, let's add a Silence for an alert to stop it from being re-sent.

For example, to disable re-sending of the alertname="APIendpointProbeSuccessCritical" alert, click the + button on its right side:

Then click the Silence button:

The alertname label was added to the silencing condition with the default duration of 2 hours; add an author and a description of why the alert was silenced:

Click Create, and it's done:
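The same silence can also be created without the Web UI, by POSTing to the Alertmanager's v2 silences API. A sketch for the same alert; the createdBy and comment values here are placeholders:

```shell
# Build a 2-hour silence payload for the Alertmanager's v2 silences API;
# the author and comment are example values.
START=$(date -u +%Y-%m-%dT%H:%M:%SZ)
END=$(date -u -d '+2 hours' +%Y-%m-%dT%H:%M:%SZ)
PAYLOAD=$(cat <<EOF
{
  "matchers": [
    {"name": "alertname", "value": "APIendpointProbeSuccessCritical", "isRegex": false}
  ],
  "startsAt": "${START}",
  "endsAt": "${END}",
  "createdBy": "admin",
  "comment": "Known issue, fix in progress"
}
EOF
)
echo "$PAYLOAD"
# to actually create the silence:
# curl -s -XPOST -H 'Content-Type: application/json' \
#   -d "$PAYLOAD" https://dev.monitor.example.com/alertmanager/api/v2/silences
```

On success, the API returns the new silence's ID in a silenceID field.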

You can check this alert via API now:

root@monitoring-dev:/home/admin# curl -s http://localhost:9093/alertmanager/api/v1/alerts | jq '.data[1]'
{
  "labels": {
    "alertname": "APIendpointProbeSuccessCritical",
    "instance": "http://push.example.com",
    "job": "blackbox",
    "monitor": "monitoring-dev",
    "severity": "critical"
  },
  "annotations": {
    "description": "Cant access API endpoint http://push.example.com!",
    "summary": "API endpoint down!"
  },
  "startsAt": "2020-12-30T11:25:25.953289015Z",
  "endsAt": "2020-12-30T11:43:25.953289015Z",
  "generatorURL": "https://dev.monitor.example.com/prometheus/graph?g0.expr=probe_success%7Binstance%21%3D%22https%3A%2F%2Fokta.example.com%22%2Cjob%3D%22blackbox%22%7D+%21%3D+1&g0.tab=1",
  "status": {
    "state": "suppressed",
    "silencedBy": [
      "ec11c989-f66e-448e-837c-d788c1db8aa4"
    ],
    "inhibitedBy": null
  },
  "receivers": [
    "critical"
  ],
  "fingerprint": "01e79a8dd541cf69"
}

So, this alert will not be sent to Slack or anywhere else, because of the "state": "suppressed" field:

…
  "status": {
    "state": "suppressed",
    "silencedBy": [
      "ec11c989-f66e-448e-837c-d788c1db8aa4"
    ],
…


Done.
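As a side note, silences can also be listed and expired over the same API, without the Web UI. A sketch, assuming the Alertmanager is reachable on localhost:9093 behind the /alertmanager prefix as configured above:

```shell
# Base URL of the Alertmanager API; host and prefix are the ones used above.
AM="http://localhost:9093/alertmanager"
# list all silences:
#   curl -s "$AM/api/v2/silences"
# expire (delete) a silence by its ID, e.g. the one seen in "silencedBy":
#   curl -s -XDELETE "$AM/api/v2/silence/ec11c989-f66e-448e-837c-d788c1db8aa4"
echo "Silences endpoint: $AM/api/v2/silences"
```

An expired silence stays visible in the Web UI for a while under the Expired tab, so the change is easy to verify.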


