Michael

Posted on • Originally published at gbase.cn

How to Troubleshoot Performance Degradation in GBase 8a

When queries start slowing down in your GBase 8a cluster, a structured approach helps you identify the root cause quickly. Work through these six areas in order, from cluster health to deep log analysis.

Step 1: Check Overall Cluster Health

Start with the cluster status command:

gcadmin showcluster vc <vc_name>
  • CLUSTER STATE must be ACTIVE. If not, the metadata service (GCware) is likely having issues and the cluster may be locked.
  • VIRTUAL CLUSTER MODE should be NORMAL. If it's READONLY or RECOVERY, slow performance is expected.
  • Confirm that gbased, gcluster, and syncserver on each node show OPEN. A CLOSE or OFFLINE status means the node is out of service, shifting load to the remaining nodes.
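The status check above can be partly automated. The following is a minimal sketch, not an official tool: count_down_lines is a hypothetical helper name, and it assumes the saved output of gcadmin showcluster prints the states CLOSE or OFFLINE as whole words, which may differ between versions.

```shell
# Hypothetical helper: count output lines that mention a down state.
# Assumption: CLOSE and OFFLINE appear as whole words in the
# `gcadmin showcluster` output; adjust the pattern if yours differs.
count_down_lines() {
  # Reads command output on stdin and prints the number of matching lines.
  # `|| true` keeps the exit status at 0 when nothing matches.
  grep -Ewc 'CLOSE|OFFLINE' || true
}
```

Pipe the output of the cluster status command into count_down_lines; any result other than 0 means at least one service line needs a closer look.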

Step 2: Analyze Resource Usage and Contention

Resource contention is one of the most common causes of sudden performance drops in a GBase 8a cluster.

  • Check resource pool load:
  SHOW RESOURCE POOL USAGE ON COORDINATORS WHERE vc_name='<vc_name>';

Watch for cpu_usage_percent near 100%, active_task hitting the max_activetask limit, and mem_usage_mb close to max_memory.

  • Look for rejected or queued tasks:
  SHOW RESOURCE POOL EVENTS WHERE vc_name='<vc_name>' AND event_time > DATE_SUB(NOW(), INTERVAL 1 HOUR);

A spike in WAITING or REJECTED events is a strong sign of resource contention.

  • System-level metrics: SSH into nodes and use top, free -m, iostat to check CPU, memory, and disk I/O. Pay special attention to swap usage.
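For the swap check in the last bullet, a small helper can turn the free -m output into a single number that is easy to compare across nodes. This is a sketch under the assumption that free prints its usual "Swap:" row with total and used in the second and third columns; swap_used_pct is a hypothetical name.

```shell
# Hypothetical helper: print swap usage as a whole-number percentage.
# Reads `free -m` output on stdin; assumes the standard "Swap:" row
# layout (label, total, used, free).
swap_used_pct() {
  awk '/^Swap:/ {
    if ($2 > 0) printf "%d\n", ($3 * 100) / $2   # used / total
    else print 0                                  # no swap configured
  }'
}
```

Run it as free -m | swap_used_pct on each node. A value that keeps climbing while queries slow down suggests memory pressure rather than a purely SQL-level problem.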

Step 3: Find the Slow Queries

A single heavy query can impact the whole cluster.

  • Use GDOM’s “Slow SQL” feature if available.
  • Identify long-running queries currently executing:
  SELECT * FROM information_schema.processlist 
  WHERE COMMAND = 'EXECUTING' AND TIME > 60 
  ORDER BY TIME DESC LIMIT 10;

Check the STATE column (e.g., Sending data, Sorting result) and the SQL text in INFO.

  • Inspect slow query logs at $GCLUSTER_HOME/log/gcluster/express.log and $GBASE_HOME/gnode/log/gbase/express.log.
  • Check for lock contention:
  gcadmin showlock

Look for locks held for an unusually long time.

Step 4: Examine Data Distribution and Storage

Storage-level issues can degrade performance significantly.

  • Data skew causes uneven load across nodes.
  SELECT table_name, node_ip, SUM(segment_size) 
  FROM information_schema.cluster_table_segments 
  WHERE table_schema='<db_name>' 
  GROUP BY table_name, node_ip 
  ORDER BY SUM(segment_size) DESC;

If one node holds a disproportionately large amount of data, adjust the distribution key.

  • Disk usage: Once the data filesystem is more than about 80% full, performance can degrade severely.
  df -h /opt/gbase
  • Table fragmentation: Frequent inserts/deletes can cause fragmentation. Check through GDOM health reports or system tables.
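The skew and disk-usage checks above can be reduced to two numbers. The sketch below uses hypothetical helpers: skew_ratio expects one "node_ip size" pair per line for a single table (for example, exported from the skew query above) and prints the largest node's size divided by the average, while disk_used_pct reads df -P output and prints the usage percentage. The input layout for skew_ratio is an assumption about how you export the query result.

```shell
# Hypothetical helper: ratio of the largest per-node size to the average.
# Input on stdin: "<node_ip> <size>" per line, for one table.
# A result near 1.00 means even distribution; higher values mean skew.
skew_ratio() {
  awk '{ sum += $2; if ($2 > max) max = $2; n++ }
       END {
         if (n > 0 && sum > 0) printf "%.2f\n", max / (sum / n)
         else print "0.00"
       }'
}

# Hypothetical helper: usage percentage from `df -P` output on stdin.
# `df -P` prints one header row, then the capacity in column 5 (e.g. "85%").
disk_used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, df -P /opt/gbase | disk_used_pct prints a bare number that can be compared against the 80% guideline, and a skew_ratio well above 1 points at the distribution key.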

Step 5: Inspect Network and Hardware

Infrastructure problems can mimic database performance issues.

  • Test latency and bandwidth between nodes with ping and iperf3. An abnormal increase in sync.log size often hints at network trouble.
  • Check server hardware logs (RAID controllers, disk SMART). Frequent core dumps may indicate unstable hardware:
  ls -lrt /opt/gbase/gcluster/userdata/gcluster/core.* 2>/dev/null
  ls -lrt /opt/gbase/gnode/userdata/gbase/*.dump 2>/dev/null
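To judge whether dumps are "frequent", it helps to count only the recent ones. This is a minimal sketch with a hypothetical helper, count_recent; the 7-day window in the usage note is an arbitrary choice, and it relies on find supporting -maxdepth, as the GNU and BSD versions do.

```shell
# Hypothetical helper: count files matching a name pattern that were
# modified within the last N days, directly inside one directory.
#   $1: directory   $2: file name pattern   $3: age limit in days
count_recent() {
  find "$1" -maxdepth 1 -type f -name "$2" -mtime "-$3" 2>/dev/null \
    | wc -l | tr -d ' '
}
```

For example, count_recent /opt/gbase/gcluster/userdata/gcluster 'core.*' 7 prints how many core files appeared in the past week. A missing directory simply yields 0.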

Step 6: Dig Into Logs and Configuration

When the above steps don't pinpoint the issue, logs are your best friend.

  • Check system.log (service starts/stops, critical errors) and gcware.log (metadata operations). Search for ERROR, WARNING, timeout, slow.
  • Verify that key performance parameters haven't been changed recently, such as gbase_parallel_degree and memory settings like gbase_heap_*.
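The keyword search from the first bullet can be wrapped in a small function. This is a sketch only: scan_log is a hypothetical name, and the match is case-insensitive, so it will also catch variants such as "Timeout" or "slowly".

```shell
# Hypothetical helper: list log lines that mention any of the keywords
# named above, prefixed with their line numbers.
#   $1: path to a log file
scan_log() {
  # `|| true` keeps the exit status at 0 when the log has no hits.
  grep -Ein 'error|warning|timeout|slow' "$1" || true
}
```

Point it at your copy of system.log or gcware.log, and pipe the result through tail to focus on the most recent entries.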

Quick Reference Checklist

| Area | Key Command/Method | Likely Fix |
|------|--------------------|------------|
| Cluster state | gcadmin showcluster | Restore GCware if not ACTIVE |
| Resource contention | SHOW RESOURCE POOL USAGE | Adjust resource plans or scale out |
| Slow queries | processlist, express.log, GDOM | Optimize SQL, add indexes, adjust distribution keys |
| Lock contention | gcadmin showlock | Tune transactions, avoid long locks |
| Data skew | cluster_table_segments | Rebuild table or change distribution column |
| Disk space | df -h | Clean logs/data, expand storage |
| Network/hardware | ping, iostat, core dumps | Engage sysadmin |

A solid performance baseline combined with regular monitoring makes troubleshooting far easier. When performance deviates from that baseline, use this structured approach to bring your GBase 8a cluster back to full speed quickly.
