Michael

Posted on Jun 14 • Originally published at gbase.cn

GBase 8a Backup and Recovery Guide: gcrcman from Basics to Production

#gbase #database #数据库 #operations

GBase 8a, as an MPP analytical database, does not use WAL transaction logs. Instead, it relies on the dedicated gcrcman tool for snapshot backups. This guide covers cluster‑level, database‑level, and table‑level backup/recovery, along with scheduling strategies and common issues.

1. gcrcman Fundamentals

GBase 8a omits transaction logs for maximum scan throughput, so point‑in‑time recovery is not supported. You must use gcrcman for periodic snapshots. The tool is located at $GCLUSTER_BASE/server/bin/gcrcman.py.

Backup Types

Scope	Full (level 0)	Incremental (level 1)	Online?
Cluster	✅	✅	❌ (switch to readonly)
Database	✅	✅	❌ (switch to readonly)
Table	✅	✅	✅ (locks the table only)

Cycle and Point Model

A full backup starts a new cycle; subsequent incrementals continue the same cycle. You can restore to any snapshot by specifying cycle_id and point_id.
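
To make the numbering concrete, here is a minimal sketch that replays a sequence of backup levels and prints the cycle and point each one would land on. It is only a model of the rule described above, not gcrcman's own code; the function name `cycle_points` is made up for illustration, and it assumes that both cycle ids and point ids start at 0 (the article only shows `cycle 0` and `point 1` explicitly).

```shell
#!/usr/bin/env bash
# Model of the cycle/point rule, NOT gcrcman's implementation.
# Assumption: cycle ids and point ids both start at 0.

# Print "cycle_id point_id" for each backup level given, in order.
cycle_points() {
    local cycle=-1 point=0 level
    for level in "$@"; do
        case "$level" in
            0)  # a full backup opens a new cycle
                cycle=$((cycle + 1))
                point=0
                ;;
            1)  # an incremental extends the current cycle
                if [ "$cycle" -lt 0 ]; then
                    echo "incremental backup without a prior full backup" >&2
                    return 1
                fi
                point=$((point + 1))
                ;;
            *)
                echo "unknown level: $level" >&2
                return 1
                ;;
        esac
        echo "$cycle $point"
    done
}

# A full backup, three incrementals, then the next full and one incremental.
cycle_points 0 1 1 1 0 1
```

Under these assumptions the last two lines printed are `1 0` and `1 1`: the second full backup opened cycle 1, and the incremental after it became point 1 of that cycle.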

2. Pre‑Backup Checks

# 1. Cluster must be ACTIVE
gcadmin

# 2. All nodes must have synchronized time (drift < 1 second); run on every node and compare
date

# 3. Create backup directory on every node
mkdir -p /data/backup/gbase8a
chown gbase:gbase /data/backup/gbase8a

# 4. Verify pexpect is available
python -c "import pexpect; print('OK')"
# If missing, copy it from the installation package
cp $GCLUSTER_BASE/../gcinstall/pexpect.py $GCLUSTER_BASE/server/bin/

Do not place backup directories under $GCLUSTER_BASE, $GBASE_BASE, or $GCWARE_BASE.
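
Two of these checks are easy to script. The sketch below is one possible way to do it, not part of the GBase tooling: `backup_dir_is_safe` and `clock_drift_seconds` are hypothetical helper names, and the drift check only has whole-second resolution because it works on `date +%s` values.

```shell
#!/usr/bin/env bash
# Hypothetical pre-backup helpers; names and structure are illustrative.

# Succeed only if directory $1 is NOT inside any of the base directories
# passed as the remaining arguments. Empty arguments are skipped, so unset
# environment variables do not cause false alarms.
backup_dir_is_safe() {
    local dir="${1%/}" base
    shift
    for base in "$@"; do
        base="${base%/}"
        [ -n "$base" ] || continue
        case "$dir/" in
            "$base"/*) return 1 ;;   # dir is the base itself or below it
        esac
    done
    return 0
}

# Print the spread (max - min) of the epoch timestamps given as arguments.
clock_drift_seconds() {
    local min="$1" max="$1" t
    for t in "$@"; do
        if [ "$t" -lt "$min" ]; then min="$t"; fi
        if [ "$t" -gt "$max" ]; then max="$t"; fi
    done
    echo $((max - min))
}
```

A possible call is `backup_dir_is_safe /data/backup/gbase8a "$GCLUSTER_BASE" "$GBASE_BASE" "$GCWARE_BASE" || echo "move the backup directory"`. For the clock check, collect `date +%s` from every node (for example over ssh) and pass the values in; collection latency adds noise, so treat a spread of 1 as borderline rather than as proof of drift.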

3. Cluster‑Level Backup and Recovery

Full Backup

# 1. Switch to readonly mode
gcadmin switchmode readonly
gcadmin  # verify READONLY

# 2. Run full backup interactively
python $GCLUSTER_BASE/server/bin/gcrcman.py \
    -d /data/backup/gbase8a \
    -p your_db_password
gcrcman> backup level 0
gcrcman> exit

# 3. Switch back to normal
gcadmin switchmode normal

Incremental Backup (requires a prior full backup)

gcadmin switchmode readonly
python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password
gcrcman> backup level 1
gcrcman> exit
gcadmin switchmode normal
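
Both cluster backups follow the same readonly, backup, normal sequence, so it can be wrapped in one function. This is a sketch under a few assumptions: `cluster_backup` is a hypothetical name, `gcadmin` and `python` are on the PATH, `GCLUSTER_BASE` is set, and gcrcman accepts its commands on standard input (the same assumption the heredoc in the scheduling script makes).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual steps shown above.
# Usage: cluster_backup <level> <backup_dir> <password>

cluster_backup() {
    local level="$1" backup_dir="$2" password="$3" status=0

    # Do not start the backup unless the mode switch succeeded.
    gcadmin switchmode readonly || return 1

    # Feed the interactive commands on stdin and remember the exit status.
    python "$GCLUSTER_BASE/server/bin/gcrcman.py" -d "$backup_dir" -p "$password" <<EOF || status=$?
backup level $level
exit
EOF

    # Leave readonly mode whether or not the backup worked.
    gcadmin switchmode normal || return 1
    return "$status"
}
```

The point of the wrapper is ordering: a failed backup still returns the cluster to normal mode, and the function reports the backup's own exit status so a caller can alert on it.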

View Backup History

python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password
gcrcman> show backup

Cluster‑Level Recovery

# 1. Switch to recovery mode
gcadmin switchmode recovery

# 2. Restore to the latest point, or to a specific cycle/point
python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password
gcrcman> recover           # latest
gcrcman> recover 0         # last point of cycle 0
gcrcman> recover 0 1       # point 1 of cycle 0
gcrcman> exit

# 3. Restart gcware (as root)
service gcware restart

# 4. Switch back to normal
gcadmin switchmode normal

⚠️ Recovery rolls the entire cluster back to the snapshot. All data written after the backup point is lost.
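
Because a mistyped id rolls the whole cluster back to the wrong snapshot, it can help to build the recover command from validated arguments instead of typing it. The helper below is a hypothetical sketch (`recover_command` is not a gcrcman feature); it only produces the three command forms shown above and does not talk to the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a gcrcman "recover" command from optional ids.
# Usage: recover_command [cycle_id [point_id]]

recover_command() {
    local cycle="${1:-}" point="${2:-}" id
    for id in "$cycle" "$point"; do
        case "$id" in
            *[!0-9]*)
                echo "ids must be non-negative integers: $id" >&2
                return 1
                ;;
        esac
    done
    if [ -z "$cycle" ] && [ -n "$point" ]; then
        echo "a point id requires a cycle id" >&2
        return 1
    fi
    # Append each id only when it was supplied.
    echo "recover${cycle:+ $cycle}${point:+ $point}"
}
```

For example, `recover_command 0 1` prints `recover 0 1`, which can then be pasted at the `gcrcman>` prompt after a second person has confirmed the target snapshot.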

4. Table‑Level Backup and Recovery

Table‑level backups lock only the target table, so they can run while the cluster stays online, which makes them well suited to production use.

Full and Incremental Table Backup

python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password
gcrcman> backup table sales_db.orders level 0    # full
gcrcman> backup table sales_db.orders level 1    # incremental
gcrcman> exit
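
When several tables need the same treatment, the command list can be generated rather than typed. The following is a sketch only: `table_backup_commands` is a made‑up helper, `sales_db.customers` in the usage note is a hypothetical table, and piping the output into gcrcman assumes it reads commands from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print gcrcman commands that back up each table.
# Usage: table_backup_commands <level> <db.table>...

table_backup_commands() {
    local level="$1" table
    shift
    case "$level" in
        0|1) ;;
        *) echo "level must be 0 or 1" >&2; return 1 ;;
    esac
    for table in "$@"; do
        case "$table" in
            ?*.?*) echo "backup table $table level $level" ;;
            *) echo "expected db.table, got: $table" >&2; return 1 ;;
        esac
    done
    # Close the gcrcman session once all tables are queued.
    echo "exit"
}
```

A possible use is `table_backup_commands 1 sales_db.orders sales_db.customers | python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password`. Remember that a level 1 backup of a table still needs an earlier level 0 backup of that same table.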

Table‑Level Recovery (no cluster state change needed)

python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password
gcrcman> recover table sales_db.orders           # latest point
gcrcman> recover table sales_db.orders 0 1       # specific point
gcrcman> exit

5. Cleaning Up Expired Backups

python $GCLUSTER_BASE/server/bin/gcrcman.py -d /data/backup/gbase8a -p password
gcrcman> show backup
gcrcman> delete 0      # Delete cycle 0 (keep at least one complete cycle)
gcrcman> clean          # Remove orphaned files from interrupted backups
gcrcman> exit
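
A simple retention rule is "keep the newest N cycles, delete the rest". The sketch below turns that rule into `delete` commands. It is hypothetical in two ways: `cycles_to_delete` is an invented name, and the cycle ids are passed in by hand (read them from `show backup`, whose output format is not parsed here). It also assumes that a higher cycle id means a newer cycle.

```shell
#!/usr/bin/env bash
# Hypothetical retention helper.
# Usage: cycles_to_delete <keep_count> <cycle_id>...
# Prints one "delete <cycle_id>" line for every cycle except the newest
# <keep_count>. Assumes higher ids are newer.

cycles_to_delete() {
    local keep="$1" total drop
    shift
    if [ "$keep" -lt 1 ]; then
        echo "keep at least one complete cycle" >&2
        return 1
    fi
    total=$#
    drop=$((total - keep))
    # Nothing to delete when there are no more cycles than we want to keep.
    [ "$drop" -gt 0 ] || return 0
    printf '%s\n' "$@" | sort -n | head -n "$drop" | sed 's/^/delete /'
}
```

With cycles 0 through 3 on disk, `cycles_to_delete 2 0 1 2 3` prints `delete 0` and `delete 1`. Review the list before running it, since deletions cannot be undone.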

6. Production Scheduling

Recommended cadence: a full cluster backup (level 0) every Sunday, an incremental cluster backup (level 1) on every other day, and a daily table‑level backup of important tables.

Sample automation script (registered in cron):

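#!/bin/bash
# Cron does not load the login profile, so fail early if the GBase
# environment (GCLUSTER_BASE, and gcadmin on the PATH) is missing.
: "${GCLUSTER_BASE:?GCLUSTER_BASE is not set}"

BACKUP_DIR=/data/backup/gbase8a
DB_PASS=your_db_password
LOG_DIR=/home/gbase/logs
LOG_FILE=$LOG_DIR/backup_$(date +%Y%m%d).log
GCRCMAN=$GCLUSTER_BASE/server/bin/gcrcman.py

mkdir -p "$LOG_DIR"

# Full backup on Sunday (%u = 7), incremental on every other day
DOW=$(date +%u)
if [ "$DOW" -eq 7 ]; then
    LEVEL=0
else
    LEVEL=1
fi

# Do not start the backup unless the cluster switched to readonly
if ! gcadmin switchmode readonly >> "$LOG_FILE" 2>&1; then
    echo "[$(date)] Could not switch to readonly, backup skipped" >> "$LOG_FILE"
    exit 1
fi

python "$GCRCMAN" -d "$BACKUP_DIR" -p "$DB_PASS" << EOF >> "$LOG_FILE" 2>&1
backup level $LEVEL
exit
EOF
BACKUP_STATUS=$?

# Always return to normal mode, even if the backup failed
gcadmin switchmode normal >> "$LOG_FILE" 2>&1

if [ "$BACKUP_STATUS" -eq 0 ]; then
    echo "[$(date)] Backup succeeded" >> "$LOG_FILE"
else
    echo "[$(date)] Backup failed!" >> "$LOG_FILE"
    exit 1
fi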

Crontab entry for the gbase user:

0 2 * * * /home/gbase/scripts/daily_backup.sh

7. After Scaling or Replacing Nodes

Backups taken before a topology change (scaling the cluster or replacing a node) are unusable afterwards. Immediately run a new full backup (level 0) to start a new cycle.

8. Common Issues

Another gcrcman process is running: Check with gcadmin showlock. If a stale lock remains, ask a DBA to evaluate and clean it.
Disk full during backup: A full backup needs ~1.2× the data directory size; an incremental needs ~1.5× the change volume.
Restored data doesn't match expectations: Cluster‑level recovery rolls back everything. If you only dropped a single table, use table‑level recovery instead.
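
The disk‑space rule of thumb above can be checked before a backup starts. This is a minimal sketch using the 1.2× and 1.5× factors from the list; `required_backup_kb` and `has_free_kb` are hypothetical helper names, sizes are in KB, and results are rounded up.

```shell
#!/usr/bin/env bash
# Hypothetical disk-space helpers based on the 1.2x / 1.5x rule of thumb.

# Print the free space (KB) a backup is expected to need.
#   $1: "full" or "incr"
#   $2: data directory size (full) or changed-data volume (incr), in KB
required_backup_kb() {
    local kind="$1" size_kb="$2"
    case "$kind" in
        full) echo $(( (size_kb * 12 + 9) / 10 )) ;;   # 1.2x, rounded up
        incr) echo $(( (size_kb * 15 + 9) / 10 )) ;;   # 1.5x, rounded up
        *) echo "kind must be full or incr" >&2; return 1 ;;
    esac
}

# Succeed if the filesystem holding directory $2 has at least $1 KB free.
has_free_kb() {
    local need_kb="$1" dir="$2" free_kb
    # df -P gives the portable one-line-per-filesystem layout; field 4 is "Available".
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge "$need_kb" ]
}
```

A possible pre‑check, with `DATA_DIR` standing in for wherever your node data lives: `has_free_kb "$(required_backup_kb full "$(du -sk "$DATA_DIR" | cut -f1)")" /data/backup/gbase8a || echo "not enough space"`. Run it on every node, since each node writes its own backup files.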

A well‑planned backup strategy is essential for any GBase database. Using gcrcman with the right cadence and validation ensures you can recover quickly when needed.
