Khaing Zin

Posted on May 27

Automating MongoDB Auditlogs Cleanup & Restore Workflow with S3 Backup

Managing database growth is one of the most overlooked challenges in backend infrastructure. Everything works smoothly in the beginning, but over time collections like auditlogs grow rapidly. Every login attempt, API request, status change, and system activity gets stored continuously.

Eventually, the database becomes larger, queries become slower, storage usage increases, and backups take longer to complete.

This is exactly the kind of operational issue many production systems face.

In the production environment, the goal was simple:

Automatically clean old auditlogs
Keep a backup before deletion
Store backups safely in Amazon S3
Ensure the restore workflow works correctly
Automate the entire process with Cron

Instead of manually exporting logs and cleaning collections repeatedly, the process was converted into a fully automated backup and cleanup workflow using Bash scripting, MongoDB tools, AWS S3, and Cron Jobs.

This setup creates a safer and more production-ready maintenance process for MongoDB operations.

Why Auditlogs Need Special Handling

Audit logs are important because they provide traceability inside applications.

They help teams:

Track user activities
Monitor system events
Investigate incidents
Debug production issues
Maintain compliance records

However, auditlogs are also one of the fastest-growing collections in most systems.

Unlike business-critical collections such as users or transactions, auditlogs are usually append-only data. New records keep getting inserted, but older records are rarely accessed.

If these logs are never cleaned:

Database size grows continuously
Disk usage increases
Backup size becomes larger
Query performance may degrade
Infrastructure costs increase

This is why organizations commonly implement:

Log retention policies
Scheduled cleanup jobs
Backup archiving systems

In this workflow, Amazon S3 is used as long-term backup storage before deleting auditlogs from MongoDB.

Designing the Backup & Cleanup Workflow

The workflow was designed around one important principle:

Never delete data before creating a recoverable backup.

The process follows this sequence:

Dump the auditlogs collection
Compress the backup
Upload it to Amazon S3
Verify upload success
Delete auditlogs from MongoDB
Remove temporary local files

This ensures backups remain recoverable even after cleanup happens.
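
One way to enforce this ordering in Bash is to run each step through a small helper that stops at the first failure, so the delete step is never reached unless everything before it succeeded. The sketch below is illustrative only: run_pipeline and the step names in the usage comment are hypothetical and do not come from the original script.

```shell
# Hypothetical helper: run the given step functions in order and stop
# at the first one that fails.
run_pipeline() {
  local step
  for step in "$@"; do
    echo "running: $step"
    if ! "$step"; then
      echo "step failed: $step (later steps were skipped)" >&2
      return 1
    fi
  done
}

# Usage sketch, assuming each name is a function wrapping one command:
# run_pipeline dump_collection compress_backup upload_to_s3 \
#   verify_upload delete_auditlogs remove_temp_files
```

Because the delete step comes after the upload and verification steps in the argument list, a failed upload leaves the collection untouched.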

The automation was validated on the mongodb-testing server before production rollout.

Creating an S3 Bucket for Auditlogs Backup

The first step is preparing an Amazon S3 bucket.

The bucket acts as centralized cloud storage for compressed MongoDB backups.

Example bucket:

s3://incoming-auditlogs-backup

Using S3 provides several advantages:

Durable backup storage
Low-cost archival
Easy recovery process
Secure cloud-based retention
Separation between production DB and backup storage

Because backups are stored in S3 rather than locally on the server, they remain protected even if the EC2 instance fails.

Building the Bash Automation Script

The heart of the automation is a Bash script called:

/home/ubuntu/auditlogs-dump.sh

This script automates the complete lifecycle:

Backup
Compression
Upload
Cleanup

The script begins with several configuration variables.

MONGO_HOST="localhost"
MONGO_PORT="27017"   # MongoDB's default port
MONGO_DB="incoming"
MONGO_COLLECTION="auditlogs"

These variables define:

MongoDB host
MongoDB port
Database name
Collection name

This makes the script reusable and easier to maintain later.

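The S3 destination, AWS CLI profile, and backup naming logic are also configured. DUMP_PATH and AWS_PROFILE are used by later steps; the values below are inferred from the tar and aws configure commands shown later.

S3_BUCKET="s3://incoming-auditlogs-backup"
AWS_PROFILE="s3"
BACKUP_NAME="auditlogs_backup_$(date +%F_%H-%M-%S)"
DUMP_PATH="/tmp/$BACKUP_NAME"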

Using timestamps inside filenames is extremely important because:

Every backup becomes unique
Older backups are preserved
Restores become easier
Chronological tracking becomes possible

A generated backup name may look like:

auditlogs_backup_2025-08-19_02-00-01

This naming strategy is commonly used in production backup systems.
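
Because the timestamp runs from year down to second, a plain lexical sort of the names is also a chronological sort. A minimal sketch of picking the newest backup by name, assuming the names follow the pattern above; latest_backup is a hypothetical helper, not part of the original script.

```shell
# Hypothetical helper: read backup names on stdin (one per line)
# and print the most recent one.
latest_backup() {
  grep '^auditlogs_backup_' | LC_ALL=C sort | tail -n 1
}

# Sample names for illustration only.
printf '%s\n' \
  "auditlogs_backup_2025-08-17_02-00-01.tar.gz" \
  "auditlogs_backup_2025-08-19_02-00-01.tar.gz" \
  "auditlogs_backup_2025-08-15_02-00-02.tar.gz" | latest_backup
```

With the AWS CLI, the names could be fed in from a bucket listing, for example: aws s3 ls "$S3_BUCKET/" --profile "$AWS_PROFILE" | awk '{print $4}' | latest_backup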

Dumping the MongoDB Collection

The first operational step inside the script is creating a MongoDB dump using mongodump.

mongodump \
  --host "$MONGO_HOST" \
  --port "$MONGO_PORT" \
  --db "$MONGO_DB" \
  --collection "$MONGO_COLLECTION" \
  --out "$DUMP_PATH"

mongodump exports the collection into BSON files that can later be restored using mongorestore.

Instead of backing up the entire database, only the auditlogs collection is exported.

This approach is efficient because:

Backup size remains smaller
Upload time becomes faster
Restore operations become simpler

This is especially useful for collections that grow aggressively.
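
Since everything downstream depends on the dump, it can help to confirm that mongodump actually produced a file before moving on. A minimal sketch, assuming the configuration variables shown earlier and that DUMP_PATH is the directory passed to --out; dump_collection is a hypothetical wrapper.

```shell
# Hypothetical wrapper: run mongodump and confirm the BSON file exists.
dump_collection() {
  mongodump \
    --host "$MONGO_HOST" \
    --port "$MONGO_PORT" \
    --db "$MONGO_DB" \
    --collection "$MONGO_COLLECTION" \
    --out "$DUMP_PATH" || return 1

  # mongodump writes <out>/<db>/<collection>.bson.
  # The -s test also rejects a zero-byte file.
  local bson="$DUMP_PATH/$MONGO_DB/$MONGO_COLLECTION.bson"
  if [ ! -s "$bson" ]; then
    echo "dump missing or empty: $bson" >&2
    return 1
  fi
}
```

Returning non-zero here gives the calling script a clear point at which to abort before compression and upload.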

Compressing the Backup

After the dump completes, the backup is compressed into a .tar.gz archive.

tar -czf "$DUMP_PATH.tar.gz" -C /tmp "$BACKUP_NAME"

Compression is important because:

Storage consumption decreases
Upload speed improves
Transfer bandwidth is reduced

In production systems where backups happen frequently, compression significantly reduces infrastructure costs over time.
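
A truncated or corrupt archive is only discovered at restore time unless it is checked when it is created. The sketch below shows one way to test an archive right after the tar step; archive_is_valid is a hypothetical helper, not part of the original script.

```shell
# Hypothetical helper: succeed only if the file is a readable gzip stream
# and tar can list its contents.
archive_is_valid() {
  local archive="$1"
  gzip -t "$archive" 2>/dev/null && tar -tzf "$archive" >/dev/null 2>&1
}

# Usage sketch:
# archive_is_valid "$DUMP_PATH.tar.gz" || { echo "bad archive" >&2; exit 1; }
```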

Uploading Backup to Amazon S3

Once compressed, the backup is uploaded to Amazon S3 using AWS CLI.

aws s3 cp "$DUMP_PATH.tar.gz" "$S3_BUCKET/$BACKUP_NAME.tar.gz" --profile "$AWS_PROFILE"

This step creates an offsite backup copy before cleanup begins.

Using AWS CLI profiles is also a good operational practice because credentials remain separated and manageable.

Example:

aws configure --profile s3

This allows the script to securely authenticate with AWS without hardcoding access keys directly inside the script.
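
The workflow lists "Verify upload success" as its own step. A minimal sketch of one way to do that, assuming the aws s3 cp exit code plus a size comparison is sufficient evidence; upload_verified is a hypothetical helper, and the profile argument defaults to the s3 profile configured above.

```shell
# Hypothetical helper: upload a file, then compare the object's size in S3
# with the local file size.
upload_verified() {
  local file="$1" bucket="$2" key="$3" profile="${4:-s3}"
  local local_size remote_size

  aws s3 cp "$file" "$bucket/$key" --profile "$profile" || return 1

  local_size="$(wc -c < "$file" | tr -d '[:space:]')"
  # s3api expects the bare bucket name, so strip the s3:// prefix.
  remote_size="$(aws s3api head-object \
    --bucket "${bucket#s3://}" --key "$key" \
    --query ContentLength --output text --profile "$profile")" || return 1

  [ "$local_size" = "$remote_size" ]
}

# Usage sketch:
# upload_verified "$DUMP_PATH.tar.gz" "$S3_BUCKET" "$BACKUP_NAME.tar.gz" "$AWS_PROFILE" || exit 1
```

Only when this returns success should the script continue to the delete step.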

Cleaning the Auditlogs Collection

After the backup upload succeeds, the script cleans the MongoDB collection.

mongosh --host "$MONGO_HOST" --port "$MONGO_PORT" "$MONGO_DB" \
  --eval "db.getCollection('$MONGO_COLLECTION').deleteMany({})"

This removes all documents inside the auditlogs collection.

At this stage:

Backup already exists in S3
Data remains recoverable
Database size gets reduced

This creates a controlled cleanup workflow instead of dangerous direct deletion.
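
Note that deleteMany({}) removes every document, including any written after the dump started, and those would not be in the backup. One possible refinement, which is not what the original script does, is to compute a cutoff once and use it for both the dump and the delete. The sketch assumes every audit document has a Date field named createdAt; that field name is hypothetical.

```shell
# Assumption: documents carry a Date field named createdAt (hypothetical).
RETENTION_SECONDS=$((2 * 24 * 60 * 60))
# date -d "@<epoch>" is GNU date syntax, as found on Ubuntu.
CUTOFF="$(date -u -d "@$(( $(date +%s) - RETENTION_SECONDS ))" +%Y-%m-%dT%H:%M:%SZ)"

# Extended JSON filter for mongodump --query.
DUMP_QUERY="{\"createdAt\": {\"\$lt\": {\"\$date\": \"$CUTOFF\"}}}"
# Equivalent filter in mongosh syntax for deleteMany.
DELETE_FILTER="{ createdAt: { \$lt: new Date('$CUTOFF') } }"

echo "$DUMP_QUERY"
echo "$DELETE_FILTER"
```

The same cutoff would then be passed to both commands: mongodump with --query "$DUMP_QUERY", and mongosh with --eval "db.getCollection('$MONGO_COLLECTION').deleteMany($DELETE_FILTER)". Anything newer than the cutoff is neither dumped nor deleted.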

Removing Temporary Files

The final step removes temporary backup files from the local server.

rm -rf "$DUMP_PATH" "$DUMP_PATH.tar.gz"

This is extremely important because backup files can consume large amounts of disk space if left on the server continuously.

Cleanup helps:

Prevent storage exhaustion
Maintain server hygiene
Reduce operational risks

At the end, the script prints a success message confirming completion.
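
If the script aborts partway through, an rm at the very end never runs and the temporary files stay on disk. A common Bash pattern is to register the cleanup with trap so it runs on any exit. The sketch below assumes DUMP_PATH lives under /tmp, which matches the tar -C /tmp step; cleanup is a hypothetical function name.

```shell
# Sketch: register cleanup once, near the top of the script, so temporary
# files are removed whether the script finishes or aborts midway.
BACKUP_NAME="auditlogs_backup_$(date +%F_%H-%M-%S)"
DUMP_PATH="/tmp/$BACKUP_NAME"   # assumed location

cleanup() {
  rm -rf "$DUMP_PATH" "$DUMP_PATH.tar.gz"
}
trap cleanup EXIT
```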

Making the Script Executable

Before execution, the script needs executable permission.

sudo chmod +x /home/ubuntu/auditlogs-dump.sh

Without this permission, Linux will not allow direct execution of the script.

Manual Testing Before Automation

Before scheduling automation, the script should always be tested manually.

/home/ubuntu/auditlogs-dump.sh

This validation step is important because it helps verify:

MongoDB dump works correctly
Compression succeeds
S3 upload succeeds
Collection cleanup works
Temporary files are removed

In production operations, testing automation manually before enabling Cron Jobs is a critical best practice.
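
For a first manual run it can be useful to see which commands would execute without actually deleting anything. A minimal sketch, assuming the script is modified to route its destructive commands through a wrapper; DRY_RUN and run are hypothetical names, not part of the original script.

```shell
# Hypothetical dry-run switch: when DRY_RUN=1, print the command
# instead of executing it.
DRY_RUN="${DRY_RUN:-0}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "DRY RUN: $*"
  else
    "$@"
  fi
}

# Usage sketch inside the script:
# run mongosh --host "$MONGO_HOST" --port "$MONGO_PORT" "$MONGO_DB" \
#   --eval "db.getCollection('$MONGO_COLLECTION').deleteMany({})"
```

The script could then be invoked as DRY_RUN=1 /home/ubuntu/auditlogs-dump.sh to review the delete command before running it for real.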

Automating Cleanup with Cron

Once validated, the process can be scheduled using Cron.

The requirement here is:

Run every 2 days
Execute at 2:00 AM MMT

Cron configuration:

crontab -e

Add:

0 2 */2 * * /home/ubuntu/auditlogs-dump.sh

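This transforms the process into a fully automated maintenance workflow. Two details are worth knowing: */2 in the day-of-month field matches days 1, 3, 5, and so on, so after a 31-day month the job runs on both the 31st and the 1st, and cron evaluates the schedule in the server's local time zone, so the 2:00 AM MMT target assumes the server clock is set to MMT.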

Running during early morning hours minimizes production impact because traffic is usually lower.
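
By default cron does not write a job's output to a file, and it will start a new run even if the previous one is still going. A sketch of an entry that appends output to a log and uses flock to skip overlapping runs; the log and lock paths are hypothetical, /usr/bin/flock is assumed to be where util-linux installs it on this server, and install_cron_entry is a hypothetical helper.

```shell
SCRIPT="/home/ubuntu/auditlogs-dump.sh"
LOG_FILE="/home/ubuntu/auditlogs-dump.log"   # hypothetical path
LOCK_FILE="/tmp/auditlogs-dump.lock"         # hypothetical path

# Same schedule as above, with locking and logging added.
CRON_ENTRY="0 2 */2 * * /usr/bin/flock -n $LOCK_FILE $SCRIPT >> $LOG_FILE 2>&1"

# Hypothetical helper: replace any existing entry for the script with
# CRON_ENTRY, keeping the rest of the crontab unchanged.
install_cron_entry() {
  { crontab -l 2>/dev/null | grep -vF "$SCRIPT"; echo "$CRON_ENTRY"; } | crontab -
}

echo "$CRON_ENTRY"
```

The entry can also be pasted by hand through crontab -e; the helper is only a convenience for repeatable setup.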

Why Restore Testing Is Critical

Creating backups is only half the job.

The real question is:

Can the backup actually be restored successfully?

Many teams create backups regularly but never verify recovery workflows until a disaster happens.

This workflow includes a dedicated restore testing process on the mongodb-testing server.

This is an extremely important operational practice.

Restore Testing Workflow

The testing process begins by downloading the latest full database backup from S3.

aws s3 cp s3://incoming-mongodb-daily-backup/mongo_backup_2025-08-19_02-00-01.tar.gz /home/ubuntu/tmp/ --profile s3

The backup is then extracted:

tar -xzvf /home/ubuntu/tmp/mongo_backup_2025-08-19_02-00-01.tar.gz -C /home/ubuntu/tmp/

After extraction, the database is restored using mongorestore.

mongorestore --drop --db incoming /home/ubuntu/tmp/mongo_backup_2025-08-19_02-00-01/incoming

The --drop option drops each collection in the target database before restoring it from the backup. Collections that are not part of the backup are left untouched.

This prevents duplicate or inconsistent data.

Verifying Data Restoration

After restore completes, verification is performed using:

mongosh incoming --eval "db.auditlogs.countDocuments({})"

Comparing the count against the expected number of documents confirms:

Data exists
Restore succeeded
No documents were lost during backup or restore

Verification steps like this are critical in real-world disaster recovery procedures.
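
A count is most meaningful when it is compared with a known value. A minimal sketch, assuming the count is recorded before the test deletion and checked again after the restore; count_auditlogs and verify_restore are hypothetical helpers.

```shell
# Hypothetical helper: print the number of documents in auditlogs.
count_auditlogs() {
  mongosh incoming --quiet --eval "db.auditlogs.countDocuments({})"
}

# Hypothetical helper: fail unless the current count equals the expected one.
verify_restore() {
  local expected="$1" actual
  actual="$(count_auditlogs)" || return 1
  if [ "$actual" != "$expected" ]; then
    echo "count mismatch: expected $expected, got $actual" >&2
    return 1
  fi
  echo "restore verified: $actual documents"
}

# Usage sketch:
# before="$(count_auditlogs)"   # record before deleting
# ... delete and restore ...
# verify_restore "$before"
```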

Simulating Cleanup & Recovery

The workflow then intentionally deletes auditlogs:

mongosh incoming --eval "db.auditlogs.deleteMany({})"

This simulates a cleanup scenario.

Next, the latest auditlogs-only backup is restored from S3.

aws s3 cp s3://incoming-auditlogs-backup/auditlogs_backup_2025-08-19_02-00-01.tar.gz /home/ubuntu/tmp/ --profile s3

After extraction, the auditlogs collection is restored using:

mongorestore --drop --db incoming /home/ubuntu/tmp/auditlogs_backup_2025-08-19_02-00-01/incoming

This validates that:

Auditlogs backups are usable
Partial collection recovery works correctly
Data remains recoverable after cleanup
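
The download, extract, and restore steps for an auditlogs-only backup can be combined into one function. This is a sketch under stated assumptions: the extraction step is assumed to mirror the tar command used for the full-database restore, the bucket defaults to the example bucket named earlier, and restore_auditlogs, WORKDIR, and the default values are hypothetical.

```shell
# Hypothetical helper: fetch one auditlogs backup from S3 and restore it.
restore_auditlogs() {
  local backup_name="$1"   # e.g. auditlogs_backup_2025-08-19_02-00-01
  local bucket="${S3_BUCKET:-s3://incoming-auditlogs-backup}"
  local workdir="${WORKDIR:-/home/ubuntu/tmp}"

  aws s3 cp "$bucket/$backup_name.tar.gz" "$workdir/" \
    --profile "${AWS_PROFILE:-s3}" || return 1

  # The archive contains a top-level directory named after the backup.
  tar -xzf "$workdir/$backup_name.tar.gz" -C "$workdir" || return 1

  mongorestore --drop --db incoming "$workdir/$backup_name/incoming"
}

# Usage sketch:
# restore_auditlogs auditlogs_backup_2025-08-19_02-00-01
```

Each step returns early on failure, so mongorestore never runs against a missing or partially extracted directory.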

Cleaning the Testing Environment

Finally, the testing database is cleaned:

mongosh incoming --eval "db.dropDatabase()"

This ensures:

No leftover test data remains
Testing environments stay clean
Future restore tests remain isolated

Operational cleanliness is a small detail that becomes very important in long-running infrastructure environments.

Final Thoughts

This MongoDB auditlogs automation workflow demonstrates several important DevOps and infrastructure engineering concepts:

Backup automation
Cloud storage integration
Infrastructure scripting
Database maintenance
Disaster recovery testing
Operational safety
Scheduled automation with Cron

Instead of relying on manual database cleanup, the system now:

Creates recoverable backups automatically
Stores backups securely in S3
Cleans unnecessary auditlogs
Verifies restore capability safely

DEV Community