Backing Up StarRocks to Hetzner Object Storage

#starrocks #s3

A step-by-step guide to configuring an S3-compatible repository on Hetzner and running full backups and restores with StarRocks.

Prerequisites

Before you begin, make sure you have the following ready:

A running StarRocks cluster (v3.x recommended)
A Hetzner Object Storage bucket created in your preferred region (e.g. hel1)
An S3-compatible access key and secret key from the Hetzner Console
The name of the database you want to back up and a name for its snapshot

1. Create the backup repository

StarRocks uses named repositories to abstract over storage backends. Register your Hetzner bucket as an S3 repository using CREATE REPOSITORY. The key settings are path_style_access = true (required for Hetzner) and the Hetzner endpoint URL for your region.

CREATE REPOSITORY s3_backup
WITH BROKER
ON LOCATION "s3://<bucket>"
PROPERTIES (
    "aws.s3.access_key"               = "<access_key>",
    "aws.s3.secret_key"               = "<secret_key>",
    "aws.s3.region"                   = "us-east-1",
    "aws.s3.enable_path_style_access" = "true",
    "aws.s3.endpoint"                 = "hel1.your-objectstorage.com"
);

Note: Replace <bucket>, <access_key>, and <secret_key> with your actual values. Set the region field to us-east-1 even for Hetzner: the S3 protocol requires a region value, but requests are routed by the endpoint URL.
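If one of the values turns out to be wrong, one way to correct it is to drop the repository and register it again. The statement below is a minimal sketch that assumes the repository name s3_backup used above. As the StarRocks documentation describes it, dropping a repository removes only the mapping inside StarRocks and leaves the objects in the bucket alone, but confirm that for your version before relying on it.

```sql
-- Remove the repository definition from StarRocks.
-- Run this only when the registration has to be redone,
-- then re-run CREATE REPOSITORY with the corrected values.
DROP REPOSITORY s3_backup;
```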

2. Verify the repository

Confirm the repository was registered correctly and is reachable:

SHOW REPOSITORIES;
SHOW SNAPSHOT ON s3_backup;

SHOW REPOSITORIES lists all configured repositories and their status. SHOW SNAPSHOT ON lists the snapshots already stored in the repository; on a fresh repository the result is empty.
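Once the repository holds more than a few snapshots, the listing can be narrowed to a single snapshot name. A small sketch, where <snapshot_name> is a placeholder and not a value from this guide:

```sql
-- List only the entries for one snapshot in the repository.
-- <snapshot_name> is a placeholder for the snapshot you are looking for.
SHOW SNAPSHOT ON s3_backup
WHERE SNAPSHOT = "<snapshot_name>";
```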

3. Run a full backup

Trigger a full backup of the database to the repository. The snapshot name is scoped to the database, and the snapshot is stored as an immutable point-in-time copy.

BACKUP SNAPSHOT <db_name>._snapshot
TO s3_backup
PROPERTIES (
    "type" = "full"
);

Backups run asynchronously. Poll the job status with SHOW BACKUP until the state shows FINISHED. Once the job is complete, re-run SHOW SNAPSHOT ON s3_backup to confirm the snapshot appears and note its timestamp, which you will need for restores.
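The polling step could look like the sketch below. It assumes the same <db_name> placeholder as the BACKUP statement. The CANCEL statement is included only as an option for a job that was started by mistake.

```sql
-- Show the most recent backup job for one database.
-- Re-run until the State column reads FINISHED.
SHOW BACKUP FROM <db_name>;

-- Optional: cancel the backup job currently running in that database.
CANCEL BACKUP FROM <db_name>;
```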

4. Restore from a snapshot

To restore, specify the snapshot name, the source repository, the target database, and the exact backup timestamp returned by SHOW SNAPSHOT. The target database does not need to pre-exist.

-- Restores into <db_name> from a named snapshot
RESTORE SNAPSHOT _snapshot
FROM s3_backup
DATABASE <db_name>
PROPERTIES (
    "backup_timestamp" = "2026-06-08-17-57-29-297",
    "replication_num"  = "1"
);

Note: Set replication_num to match your cluster's replication factor: 1 for single-node dev clusters, 3 for production. A value the cluster cannot satisfy, for example more replicas than there are BE nodes, will cause the restore job to fail.

Like backups, restores are asynchronous. Monitor progress with SHOW RESTORE and verify table data once the job reaches FINISHED.
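A rough sketch of that monitoring and verification step follows. <table_name> is a placeholder for any table you expect to find in the snapshot; it is not a name taken from this guide.

```sql
-- Show the most recent restore job for the target database.
-- Re-run until the State column reads FINISHED.
SHOW RESTORE FROM <db_name>;

-- After the job has finished, spot-check one restored table.
-- <table_name> is a placeholder for a table contained in the snapshot.
SELECT COUNT(*) FROM <db_name>.<table_name>;
```

Comparing the row count with the source database is a quick sanity check, not a full data validation.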

Summary

In four steps you have a fully functional backup pipeline: create the repository once, verify connectivity, back up on demand or on a schedule, and restore to any database target using the snapshot timestamp. For production setups, consider automating the BACKUP statement via a cron job or orchestration tool, and storing the snapshot timestamps in a log for quick disaster recovery reference.

Top comments (1)

giveitatry • Jun 8

A few things worth highlighting from the configuration:

path_style_access = true is essential for Hetzner — without it, StarRocks will try virtual-hosted-style URLs that Hetzner doesn't support.
The region field (us-east-1) is a required placeholder for the S3 protocol. Hetzner ignores it and routes based on the endpoint URL, but StarRocks will reject the statement if it's missing.
Snapshot timestamps are returned by SHOW SNAPSHOT ON s3_backup — always grab one after a backup completes and store it somewhere safe, since you need the exact string to restore.