Fast Fragment Scanning for Tables in GBase 8a

#cli #database #linux #performance

Disk fragmentation inside a database can silently eat up storage and degrade I/O performance. In GBase 8a, the China-domestically developed MPP cluster database from GBASE, frequent inserts and deletes leave "holes" in table files over time. Spotting these fragmented tables quickly is a common operational challenge—traditional methods that scan metadata tables are slow and don't scale well across large clusters.

This article introduces a lightweight, shell-based approach that uses the Linux filefrag utility to scan for fragmented tables directly from the filesystem layer, significantly reducing detection time in a gbase database environment.

Why Fast Fragment Detection Matters

GBase 8a's data files can become heavily fragmented under write-intensive workloads. Fragmentation not only wastes disk space but also adds random I/O overhead during scans. Conventional approaches that query internal metadata tables struggle with performance at scale. By contrast, filefrag reads filesystem-level extent maps, making it orders of magnitude faster for bulk scanning.

How It Works

The script iterates through all user tables (obtained from gbase.table_distribution), locates the data directory for a specific shard suffix (e.g., n1), and runs filefrag -v against each file. It extracts the number of extents (contiguous disk regions) per file and calculates the average. If a table's average extent count exceeds the threshold (default: 2), the table is flagged as fragmented.

Deploying and Running the Script

1. Installation

Place the sweep.sh script under /home/gbase/sweep on a composite node.

2. Configuration

Adjust the variables at the top of the script:

user — database user (typically gbase)
passwd — corresponding password
threads_num — number of parallel scanning threads (default: 2)
suffix — shard suffix to scan (usually n1)
avg_sum — average extent threshold to trigger a flag (default: 2)

3. Execution

sh sweep.sh

After the scan finishes, the /home/gbase/sweep/log/ directory will contain one file per flagged table, named database_name.table_name. These files serve as a direct to-do list for defragmentation operations.

Full Script Reference

#!/bin/bash
# author: Bryan zhao
# date: 2025-07-29

user="gbase"
passwd="******"
threads_num=2      # Concurrency
suffix="n1"        # Typically scan one shard only
avg_sum=2          # Mark table if avg extents exceed this

> tb.list
if [ ! -d "`pwd`/log" ]; then
    mkdir -p "`pwd`/log"
else
    rm -rf "`pwd`/log"
    mkdir -p "`pwd`/log"
fi

cli="${GCLUSTER_HOME}/bin/gbase -u${user} -p${passwd} -N"
$cli -e "SELECT dbname,tbname FROM gbase.table_distribution where dbname not in ('information_schema','performance_schema','gbase','gctmpdb','gclusterdb')" > tb.list

child() {
    local dbname=$1 tbname=$2 suffix=$3 avg_sum=$4
    local mulu="${GBASE_BASE}/userdata/gbase/${dbname}/sys_tablespace/${tbname}_${suffix}"
    [ ! -d "$mulu" ] && { echo "Error: Directory $mulu not found"; return 1; }
    local avg_extents=$(find "$mulu" -type f -exec filefrag -v {} \; 2>/dev/null | grep "extent" | awk '{sum+=$2} END {print (NR>0)? sum/NR : 0}')
    if (( $(echo "$avg_extents > $avg_sum" | bc -l) )); then
        touch "`pwd`/log/${dbname}.${tbname}"
    fi
}

tmp_fifofile="/tmp/$$_`date +%N`.fifo"
mkfifo $tmp_fifofile
exec 6<>$tmp_fifofile
rm -rf $tmp_fifofile

for i in `seq 1 ${threads_num}`
do
    echo >&6
done

while read -r db table
do
    read -u6
    {
        child ${db} ${table} ${suffix} ${avg_sum}
        echo >&6
    }&
done < tb.list

wait
exec 6>&-

Why This Works

The key insight is offloading the work to the filesystem layer. GBase 8a's gbase.table_distribution system table gives the authoritative list of table names, and filefrag handles the low-level extent counting without touching the gbase database engine. The script then parallelizes the scanning using a named-pipe-based concurrency model, making it practical to run even across hundreds of tables.

GBASE continues to provide robust tools for managing GBase 8a clusters, and combining OS-level utilities with database metadata is a pattern that can be extended to other operational tasks.