Darian Vance

Solved: Automated Log Rotation and Archiving to Google Cloud Storage

🚀 Executive Summary

TL;DR: Effectively managing application logs is crucial for system health and compliance. This guide provides an automated solution using logrotate for local log management and a custom Python script to archive compressed logs to Google Cloud Storage, ensuring scalable, cost-effective, and centralized historical data retention.

🎯 Key Takeaways

  • logrotate can be configured with directives like daily, rotate 7, compress, delaycompress, and postrotate to manage local log files efficiently before archiving.
  • Authentication to Google Cloud Storage for log uploads is securely handled via a Service Account with the Storage Object Creator role, using a downloaded JSON key file.
  • A Python script, integrated into logrotate’s postrotate section, uses the google-cloud-storage library to upload rotated log files to a specified GCS bucket, gzipping them first when logrotate has not yet compressed them, and optionally deleting the local copies.


Introduction

Managing logs effectively is a cornerstone of robust system administration and application development. Unmanaged logs can quickly consume valuable disk space, degrade system performance, and complicate troubleshooting efforts. Furthermore, retaining logs for compliance or auditing purposes often necessitates a durable and scalable storage solution that goes beyond local file systems.

This tutorial addresses these challenges by outlining a comprehensive, automated solution for log rotation and archiving to Google Cloud Storage (GCS). By leveraging logrotate for local management and a custom Python script for GCS integration, you can ensure your logs are efficiently rotated, compressed, and stored in a highly available, cost-effective, and scalable cloud repository. This approach not only frees up local disk space but also centralizes your historical log data for easier access, analysis, and compliance.

Prerequisites

Before diving into the setup, ensure you have the following in place:

  • A Google Cloud Platform (GCP) project with billing enabled.
  • A Google Cloud Storage bucket created within your GCP project.
  • A Service Account with the Storage Object Creator role on your GCS bucket (or the broader Storage Object Admin role, if you also need to overwrite or delete objects) for uploading objects to the bucket. Download the JSON key file for this service account; we’ll refer to its path as ~/.gcp/service-account-key.json in this guide.
  • The gcloud command-line interface installed and authenticated on your system. While not strictly required for the Python script, it’s useful for managing GCP resources.
  • logrotate utility installed on your Linux server (most Linux distributions include it by default).
  • Python 3 and pip installed on your server.
  • The google-cloud-storage Python library installed:
  pip3 install google-cloud-storage
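To confirm the library is importable by the same interpreter that will run the upload script, a quick check (the __version__ attribute is exposed by recent releases of the library):

python3 -c "import google.cloud.storage; print(google.cloud.storage.__version__)"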

Step-by-Step Guide

Step 1: Configure Log Rotation with logrotate

logrotate is a powerful utility designed to simplify the administration of log files on systems that generate a large number of logs. It allows for automatic rotation, compression, removal, and mailing of log files. We’ll set up logrotate to manage a hypothetical application log, say /var/log/myapp/myapp.log.

Create a new logrotate configuration file for your application. We’ll place it in /etc/logrotate.d/myapp:

/var/log/myapp/myapp.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 0640 user group
    postrotate
        # This section will be updated in Step 4 to call our GCS upload script
        true
    endscript
}

Let’s break down this configuration:

  • /var/log/myapp/myapp.log: Specifies the log file to be rotated. You can specify multiple files or use wildcards.
  • daily: Rotates the log file daily. Other options include weekly or monthly.
  • rotate 7: Keeps 7 rotated log files. After the 7th rotation, the oldest log will be removed.
  • compress: Compresses the rotated log files using gzip.
  • delaycompress: Postpones compression of the most recently rotated file until the next rotation cycle, so immediately after a rotation the newest archive (e.g., myapp.log.1) is still uncompressed. This avoids problems when an application briefly keeps writing to the old file, and it also means our postrotate script (Step 4) receives an uncompressed file to work with.
  • missingok: If the log file is missing, do not issue an error message.
  • notifempty: Do not rotate the log file if it is empty.
  • create 0640 user group: After rotation, a new empty log file is created with specified permissions, owner, and group. Replace user and group with appropriate values for your application.
  • postrotate … endscript: Commands placed here are executed after the log file is rotated. We will use this in a later step to trigger our GCS upload script. true is a placeholder for now.

Test your configuration (without actually performing a rotation) by running:

logrotate -d /etc/logrotate.d/myapp

This command shows you what logrotate would do, without touching any files.
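If the example log file doesn’t exist yet, create it first so there is something for the debug run to inspect (user and group are the same placeholders used in the create directive above):

sudo mkdir -p /var/log/myapp
sudo touch /var/log/myapp/myapp.log
sudo chown user:group /var/log/myapp/myapp.log
sudo chmod 0640 /var/log/myapp/myapp.log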

Step 2: Create a GCS Bucket and Service Account

(If you’ve already completed the prerequisites, you can skip to verifying your setup).

  1. Create a GCS Bucket: Ensure your bucket is created and properly configured. You can do this via the GCP Console or with the gcloud CLI:
gcloud storage buckets create gs://your-gcs-log-archive-bucket --project=[YOUR_GCP_PROJECT_ID] --location=US-CENTRAL1

Remember to replace your-gcs-log-archive-bucket and [YOUR_GCP_PROJECT_ID] with your specific values.

  2. Create a Service Account and JSON Key: The Python script will authenticate to GCP using a service account key.
# Create the service account
gcloud iam service-accounts create log-archiver-sa --display-name="Log Archiver Service Account" --project=[YOUR_GCP_PROJECT_ID]

# Grant the service account upload permissions on your bucket
# (Storage Object Creator is the least-privilege role for uploads; use
# Storage Object Admin only if you also need to overwrite or delete objects)
gcloud storage buckets add-iam-policy-binding gs://your-gcs-log-archive-bucket \
    --member="serviceAccount:log-archiver-sa@[YOUR_GCP_PROJECT_ID].iam.gserviceaccount.com" \
    --role="roles/storage.objectCreator"

# Create a directory for your keys and download the JSON key
mkdir -p ~/.gcp/
gcloud iam service-accounts keys create ~/.gcp/service-account-key.json \
    --iam-account="log-archiver-sa@[YOUR_GCP_PROJECT_ID].iam.gserviceaccount.com" \
    --project=[YOUR_GCP_PROJECT_ID]

Security Note: The ~/.gcp/service-account-key.json file contains sensitive credentials. Ensure it’s protected with appropriate file permissions (e.g., chmod 400 ~/.gcp/service-account-key.json) and is not publicly accessible.
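As an optional sanity check, you can authenticate as the service account and upload a small test object (uploading rather than listing, because Storage Object Creator alone cannot list bucket contents). Note this temporarily switches your active gcloud account; the account in the last line is a placeholder for your own user:

gcloud auth activate-service-account --key-file="$HOME/.gcp/service-account-key.json"
echo "sanity check" > /tmp/gcs-sanity.txt
gcloud storage cp /tmp/gcs-sanity.txt gs://your-gcs-log-archive-bucket/rotated_logs/
# Switch back to your own account afterwards
gcloud config set account your-user@example.com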

Step 3: Develop a Python Script for GCS Upload

This Python script takes the path of a rotated log file as an argument, gzips it if logrotate has not compressed it yet (which is the case under delaycompress), uploads the result to your designated GCS bucket under a timestamped object name, and deletes the local copy after a successful upload. Note that deleting the rotated file means the rotate 7 local retention window no longer applies; drop the os.remove call if you prefer to keep local copies as well.

Create a file named upload_gcs.py in a suitable location, for example, /home/user/scripts/:

import datetime
import gzip
import os
import shutil
import sys

from google.cloud import storage

# --- Configuration ---
# Path to your Service Account JSON key file
SERVICE_ACCOUNT_KEY_PATH = os.path.expanduser("~/.gcp/service-account-key.json")
# Your GCS bucket name
GCS_BUCKET_NAME = "your-gcs-log-archive-bucket"
# --- End Configuration ---

def upload_to_gcs(file_path, bucket_name, credentials_path):
    """Gzips (if needed) and uploads a log file to GCS, then deletes it locally."""
    try:
        if not os.path.exists(file_path):
            print(f"Error: Log file not found at {file_path}. Skipping upload.")
            return

        # With delaycompress, the freshly rotated file (e.g. myapp.log.1) is
        # not compressed yet. Gzip it to a temporary sibling file so the
        # object stored in GCS is always compressed.
        upload_path = file_path
        temp_gz = None
        if not file_path.endswith(".gz"):
            temp_gz = f"{file_path}.gz"
            with open(file_path, "rb") as src, gzip.open(temp_gz, "wb") as dst:
                shutil.copyfileobj(src, dst)
            upload_path = temp_gz

        # Explicitly pass credentials for the Storage client
        client = storage.Client.from_service_account_json(credentials_path)
        bucket = client.bucket(bucket_name)

        # Build the blob name under a rotated_logs/ prefix. A UTC timestamp
        # is included because logrotate reuses names like myapp.log.1, and
        # identical blob names would otherwise overwrite earlier uploads.
        timestamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        base_filename = os.path.basename(upload_path)
        blob_name = f"rotated_logs/{timestamp}-{base_filename}"

        blob = bucket.blob(blob_name)

        print(f"Attempting to upload {upload_path} to gs://{bucket_name}/{blob_name}...")
        blob.upload_from_filename(upload_path)
        print(f"File {upload_path} uploaded successfully to gs://{bucket_name}/{blob_name}.")

        # Clean up: remove the temporary gzip (if any) and the local rotated file
        if temp_gz is not None:
            os.remove(temp_gz)
        os.remove(file_path)
        print(f"Local file {file_path} deleted.")

    except Exception as e:
        print(f"Error uploading {file_path}: {e}")
        sys.exit(1)  # Non-zero exit so logrotate reports the failure

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 upload_gcs.py <path_to_rotated_log_file>")
        sys.exit(1)

    log_file_to_upload = sys.argv[1]

    # Ensure the service account key exists before attempting the upload
    if not os.path.exists(SERVICE_ACCOUNT_KEY_PATH):
        print(f"Error: Service account key not found at {SERVICE_ACCOUNT_KEY_PATH}")
        print("Please ensure your service account key is correctly placed and has read permissions.")
        sys.exit(1)

    upload_to_gcs(log_file_to_upload, GCS_BUCKET_NAME, SERVICE_ACCOUNT_KEY_PATH)

Replace "your-gcs-log-archive-bucket" with the actual name of your GCS bucket.

Make the script executable:

chmod +x /home/user/scripts/upload_gcs.py
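Before wiring the script into logrotate, it’s worth a manual end-to-end test with a throwaway file (the path below is arbitrary; the script will gzip it, upload it, and delete the local copy):

echo "test entry $(date)" > /tmp/myapp.log.1
python3 /home/user/scripts/upload_gcs.py /tmp/myapp.log.1

# Listing requires your own credentials (the service account can only create objects)
gcloud storage ls gs://your-gcs-log-archive-bucket/rotated_logs/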

Step 4: Integrate Log Rotation with the GCS Upload Script

Now, we’ll modify the logrotate configuration from Step 1 to call our Python script. Commands in the postrotate section run after the log file has been rotated. When sharedscripts is not set, logrotate passes the absolute path of the original log file, not the rotated one, as the first argument: $1 is /var/log/myapp/myapp.log, and because of delaycompress the freshly rotated (and still uncompressed) file is "$1.1".

Edit /etc/logrotate.d/myapp again and update the postrotate section:

/var/log/myapp/myapp.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 0640 user group
    postrotate
        # logrotate passes the path of the original log file as $1
        # (e.g., /var/log/myapp/myapp.log). Because of delaycompress, the
        # freshly rotated file is "$1.1" and is not compressed yet; the
        # Python script gzips it before uploading.
        python3 /home/user/scripts/upload_gcs.py "$1.1"
    endscript
}

Remember to replace user and group with the appropriate values.

The logrotate utility typically runs daily via a cron job (usually /etc/cron.daily/logrotate) or, on many systemd-based distributions, a systemd timer. When it processes your myapp configuration, it will rotate myapp.log to myapp.log.1 (compression of that file is deferred to the next cycle by delaycompress) and then execute the postrotate command, which triggers our Python script to gzip myapp.log.1 and upload it to GCS.
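How logrotate is scheduled varies by distribution; two quick ways to check (paths and unit names may differ on your system):

# Classic daily cron entry (Debian/Ubuntu, RHEL derivatives)
cat /etc/cron.daily/logrotate

# systemd timer, where used
systemctl list-timers logrotate.timer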

Step 5: Testing and Verification

It’s crucial to test the entire flow to ensure everything works as expected.

  1. Simulate log rotation: You can force logrotate to run for your specific configuration (in debug mode first, then for real):
# Simulate with debug (won't actually rotate or run postrotate scripts)
logrotate -d -f /etc/logrotate.d/myapp

# Force rotation and execute postrotate scripts (use with caution in production)
logrotate -f /etc/logrotate.d/myapp

Before running the force command, ensure you have some content in /var/log/myapp/myapp.log so notifempty doesn’t prevent rotation. You might need to temporarily comment out notifempty for testing.
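One quick way to seed the log (using the example path from Step 1):

echo "test log entry $(date)" | sudo tee -a /var/log/myapp/myapp.log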

  2. Verify local files:

    Check /var/log/myapp/ to see if the log file has been rotated and if the rotated file (e.g., myapp.log.1) has been deleted after upload.

  3. Verify GCS bucket:

    Navigate to your GCS bucket in the GCP Console or use the gcloud CLI to list objects:

gcloud storage ls gs://your-gcs-log-archive-bucket/rotated_logs/

You should see your rotated log file listed under a timestamped name (e.g., 20240101T063001Z-myapp.log.1.gz).
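Since the bucket now accumulates one object per rotation, a lifecycle policy keeps long-term storage cheap. A minimal sketch (the 30- and 365-day thresholds are illustrative; adjust them to your retention requirements, and note that --lifecycle-file requires a reasonably recent gcloud):

cat > /tmp/lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 30}
    },
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365}
    }
  ]
}
EOF
gcloud storage buckets update gs://your-gcs-log-archive-bucket --lifecycle-file=/tmp/lifecycle.json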

Common Pitfalls

  • Incorrect Service Account Permissions: Ensure your service account has at least Storage Object Creator role on the specific GCS bucket. Without it, the Python script will fail with an authentication or permission error.
  • Incorrect Paths or Permissions:
    • Double-check the SERVICE_ACCOUNT_KEY_PATH in your Python script and ensure the file exists and is readable by the user running logrotate (often root or the logrotate user).
    • Verify the path to your Python script (/home/user/scripts/upload_gcs.py) in the logrotate configuration and ensure it’s executable.
    • Make sure the logrotate configuration file (/etc/logrotate.d/myapp) has correct permissions.
  • logrotate not running postrotate actions: If you’re using logrotate -d (debug mode), postrotate commands are not actually executed. Use logrotate -f for a real test, but be cautious in production environments. Also, if notifempty is set and the log file is empty, no rotation occurs, and thus no postrotate action.
  • Python Environment Issues: Ensure the google-cloud-storage library is installed for the Python interpreter being used by the logrotate script (e.g., python3). If logrotate runs as root, make sure the library is installed in root’s environment or globally accessible.
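To diagnose the last two pitfalls, run the import check as the same user logrotate runs under, and use logrotate’s verbose mode to see what it actually does:

# Can root's python3 import the library? (logrotate's cron job usually runs as root)
sudo python3 -c "import google.cloud.storage; print('ok')"

# Verbose forced run: shows each rotation decision and script invocation
sudo logrotate -v -f /etc/logrotate.d/myapp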

Conclusion

By following this guide, you have implemented an automated system that rotates your application logs locally and archives them to Google Cloud Storage. This solution not only helps you maintain healthy disk space on your servers but also establishes a durable, scalable, and cost-effective repository for historical log data. This centralized approach simplifies troubleshooting, facilitates compliance, and enables further analysis of your operational data. Remember to regularly review your logrotate configurations and GCS bucket policies to align with your evolving operational and compliance requirements.

