amirreza valizade

Manage Prometheus TSDB in a better way!

Prometheus is a powerful monitoring system that handles the retention of old data with two flags, --storage.tsdb.retention.size and --storage.tsdb.retention.time. These flags set the maximum size and age of the data kept in the Prometheus database. They apply to the whole TSDB, though, so they cannot keep some metrics longer than others, and the global retention must still be at least as long as the longest period you want to keep. If you need to store certain metrics long-term and delete the unnecessary ones sooner, you can label your Prometheus targets and delete old data by label through the Admin API. This article walks through both steps.

Labeling Prometheus Targets

To retain specific metrics for a longer period, label the targets from which Prometheus scrapes data. Add a new label, such as retention_time, in the scrape configuration of each job. Its value names the period for which you want to keep the data, for example "one-month", "three-month", "twelve-month", or any other value that suits your needs. Prometheus does not interpret the value; it is only a marker that the cleanup requests match on later.

Here is an example scrape job that adds the retention_time label to every target it discovers:

  - job_name: 'node_exporter'
    file_sd_configs:
    - files:
      - node_exporter.yml
    relabel_configs:
      - target_label: retention_time
        replacement: "one-month"
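
Before deleting anything, it can help to confirm that the label actually reaches the stored series. The following is a minimal sketch, assuming Prometheus listens on localhost:9090 as in the rest of this article. The names retention_selector and list_labeled_series are hypothetical helpers, not part of Prometheus; the request itself goes to the standard /api/v1/series query endpoint.

```shell
#!/bin/sh
# Assumption: Prometheus is reachable at localhost:9090 unless PROM_URL is set.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Build the series selector for a given retention_time label value.
retention_selector() {
  printf '{retention_time="%s"}' "$1"
}

# List the series that carry the given retention_time value.
# -G makes curl send the URL-encoded data as query parameters of a GET request.
list_labeled_series() {
  curl -s -G "$PROM_URL/api/v1/series" \
    --data-urlencode "match[]=$(retention_selector "$1")"
}
```

Running list_labeled_series one-month after a few scrapes should list the node_exporter series. Series that were scraped before the label was added do not carry it, so they will not match.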

Deleting Labeled Data Using the Admin API

Once you have labeled the targets, you can use the Prometheus Admin API to delete old data that is no longer required. The Admin API is disabled by default; enable it by starting Prometheus with --web.enable-admin-api. The DeleteSeries endpoint deletes data for a selection of series in a time range. The deleted data still exists on disk until a future compaction removes it, or until you clean it up explicitly with the CleanTombstones endpoint.

To delete data for a particular time range, use the match[] URL query parameter to select the series to delete, along with the start and end timestamps. Here is an example that uses DeleteSeries to delete samples older than one month from series labeled retention_time="one-month" (the -d option requires GNU date):

$ curl -X PUT \
  -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="one-month"}&end='"$(date +%s -d '1 month ago')"

URL query parameters:

  • match[]=<series_selector>: Repeated label matcher argument that selects the series to delete. At least one match[] argument must be provided.
  • start=<rfc3339 | unix_timestamp>: Start timestamp. Optional and defaults to minimum possible time.
  • end=<rfc3339 | unix_timestamp>: End timestamp. Optional and defaults to maximum possible time. Omitting both start and end deletes all data for the matched series in the database.
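
The three parameters can also be combined to delete a bounded window rather than everything before a cutoff. The sketch below is only an illustration under assumptions: delete_series_url is a hypothetical helper, the host is the localhost:9090 used elsewhere in this article, and the timestamps are plain Unix seconds.

```shell
#!/bin/sh
# Assumption: Prometheus is reachable at localhost:9090 unless PROM_URL is set.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Print a delete_series URL for one selector and an optional time window.
# $1 = series selector, $2 = start (may be empty), $3 = end (may be empty)
delete_series_url() {
  url="$PROM_URL/api/v1/admin/tsdb/delete_series?match[]=$1"
  if [ -n "$2" ]; then url="$url&start=$2"; fi
  if [ -n "$3" ]; then url="$url&end=$3"; fi
  printf '%s\n' "$url"
}

# Example (not executed here): delete January 2023 (UTC) for the one-month series.
#   curl -X PUT -g "$(delete_series_url '{retention_time="one-month"}' 1672531200 1675209600)"
```

Leaving the second argument empty reproduces the earlier request, where start falls back to the minimum possible time.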

Note that the DeleteSeries endpoint marks the samples from the selected series as deleted, but it does not prevent the associated series metadata from still being returned in metadata queries for the affected time range. You can use the CleanTombstones endpoint to remove the deleted data from disk and clean up the existing tombstones:

$ curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones
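
Both admin endpoints answer with HTTP 204 and an empty body when they succeed, so a script has to look at the status code rather than the output. A small sketch, assuming the same localhost:9090 instance; admin_post, is_success, and clean_tombstones are hypothetical helper names:

```shell
#!/bin/sh
# Assumption: Prometheus is reachable at localhost:9090 unless PROM_URL is set.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Succeeds only for the 204 status that the admin endpoints return on success.
is_success() {
  [ "$1" = "204" ]
}

# POST to an admin endpoint (by name) and print only the HTTP status code.
admin_post() {
  curl -s -o /dev/null -w '%{http_code}' -X POST "$PROM_URL/api/v1/admin/tsdb/$1"
}

# Run clean_tombstones and report a failure on stderr.
clean_tombstones() {
  status="$(admin_post clean_tombstones)"
  is_success "$status" || { echo "clean_tombstones failed: HTTP $status" >&2; return 1; }
}
```

Any other status, for instance an error because the Admin API is not enabled, makes clean_tombstones return a non-zero exit code that a calling script can act on.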

We now have labels that define how long each metric should be kept, and a request that deletes data based on those labels. Put together, the script looks like this:

#!/bin/sh

# Delete samples older than each label's retention period.
# start is omitted, so it defaults to the minimum possible time; -d requires GNU date.
curl -X PUT -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="one-month"}&end='"$(date +%s -d '1 month ago')"
curl -X PUT -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="three-month"}&end='"$(date +%s -d '3 months ago')"
curl -X PUT -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="twelve-month"}&end='"$(date +%s -d '12 months ago')"

# Remove the deleted data from disk and clean up the tombstones.
curl -X PUT 'http://localhost:9090/api/v1/admin/tsdb/clean_tombstones'

Don't forget to automate this script, for example with a scheduled job, so that old metrics are deleted regularly without any manual intervention.
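
One possible way to automate this is to drive the deletions from a single label-to-age mapping and run the script from cron. The sketch below assumes GNU date (for the -d option, as in the script above), the same localhost:9090 instance, and a hypothetical install path /usr/local/bin/prometheus-retention.sh; age_for_label and run_retention are illustrative names.

```shell
#!/bin/sh
# Assumption: Prometheus is reachable at localhost:9090 unless PROM_URL is set.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Map a retention_time label value to a GNU date expression.
age_for_label() {
  case "$1" in
    one-month)    echo '1 month ago' ;;
    three-month)  echo '3 months ago' ;;
    twelve-month) echo '12 months ago' ;;
    *)            return 1 ;;
  esac
}

# Delete expired samples for every known label value, then clean tombstones.
run_retention() {
  for label in one-month three-month twelve-month; do
    end="$(date +%s -d "$(age_for_label "$label")")" || return 1
    curl -s -X PUT -g \
      "$PROM_URL/api/v1/admin/tsdb/delete_series?match[]={retention_time=\"$label\"}&end=$end"
  done
  curl -s -X PUT "$PROM_URL/api/v1/admin/tsdb/clean_tombstones"
}

# Example crontab entry (daily at 03:00), assuming the installed script
# ends with a call to run_retention:
#   0 3 * * * /usr/local/bin/prometheus-retention.sh
```

Keeping the mapping in one function means a new retention period only needs a new case branch and a new entry in the loop.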
