This article explains the core high‑availability mechanisms of a GBase 8a cluster: how gcware arbitration works, how multi‑replica consistency is maintained, what happens during automatic node failover, and how to handle common replica anomalies.
1. Three‑Tier HA Architecture
GBase 8a's high availability relies on three cooperating layers:
- gcware (arbitration layer): Based on Corosync/Pacemaker, deployed on an odd number of nodes (3 or 5). Responsible for heartbeats, split‑brain prevention, and leader election.
- gcluster (coordination layer): Multi‑node deployment; any node can serve external requests. Metadata is synchronised across gcluster nodes.
- gnode (data layer): Each piece of data has 1 primary + N replicas. The primary handles reads/writes; replicas sync from the primary. gcware arbitrates the primary role.
2. gcware: The Arbitration Core
gcware uses a quorum principle: the cluster works only when more than half the gcware nodes are alive.
| gcware Nodes | Tolerated Failures | Minimum Alive |
|---|---|---|
| 3 | 1 | 2 |
| 5 | 2 | 3 |
| 7 | 3 | 4 |
Deploying an even number (e.g., 4) buys nothing and adds risk: quorum for 4 nodes is 3, so the cluster still tolerates only one failure, the same as with 3 nodes. If a network partition splits the nodes 2/2, neither side holds a majority, and the cluster refuses service to avoid a split‑brain. Always deploy gcware on an odd number of nodes.
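The majority rule behind the table can be checked with a few lines of arithmetic. This is a minimal sketch of the quorum math only, not gcware's implementation; the function names are made up for illustration.

```python
def quorum(total_nodes: int) -> int:
    """Smallest number of live nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can fail while a majority is still alive."""
    return total_nodes - quorum(total_nodes)


def has_quorum(alive: int, total_nodes: int) -> bool:
    """True if the live nodes are more than half of the configured nodes."""
    return alive >= quorum(total_nodes)


# Print node count, minimum alive, and tolerated failures for a few sizes.
for n in (3, 4, 5, 7):
    print(n, quorum(n), tolerated_failures(n))

# A 4-node deployment split 2/2 by a partition: neither half is a majority.
print(has_quorum(2, 4))
```

Under this rule a fourth node raises the quorum to 3 without raising the number of tolerated failures, which is why odd node counts are preferred.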
From V9.5.3 onwards, gcware can be deployed independently — you can run it on lightweight VMs, saving data‑node resources, and gcluster scaling is no longer constrained by the odd‑node requirement.
Each gnode periodically reports its status to gcware. When a gnode fails, gcware detects the heartbeat timeout, marks the node DOWN, promotes the replica with the highest data version (LSN) to primary, and notifies gcluster to update the routing table.
3. Data Replica Mechanism
Segments and Replicas
Specify the replica count when creating a distribution:
# p 2 = 2 primary segments per node, d 1 = 1 duplicate of each segment → 1 primary + 1 replica
gcadmin distribution gcChangeInfo.xml p 2 d 1 pattern 1
View segment placement:
gcadmin showdistribution node
Each segment's primary and replica reside on different nodes. When a node fails, its primary segments are taken over by replicas on other nodes.
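The placement rule can be expressed as a small check. The sketch below is illustrative only: the `layout` dictionary and node names are hypothetical and are not parsed from real `gcadmin showdistribution` output.

```python
# Hypothetical layout: segment id -> (primary node, list of replica nodes).
layout = {
    1: ("node1", ["node2"]),
    2: ("node2", ["node3"]),
    3: ("node3", ["node1"]),
}


def colocated_segments(layout):
    """Segments whose primary shares a node with one of its own replicas."""
    return [seg for seg, (primary, replicas) in layout.items()
            if primary in replicas]


def takeover_plan(layout, failed_node):
    """For each primary segment on failed_node, list the nodes holding a replica."""
    plan = {}
    for seg, (primary, replicas) in layout.items():
        if primary == failed_node:
            plan[seg] = [node for node in replicas if node != failed_node]
    return plan


print(colocated_segments(layout))
print(takeover_plan(layout, "node2"))
```

A segment that shows up in `colocated_segments` would lose both copies in a single node failure; an empty takeover list for a segment means the same thing from the failover side.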
Replication Mode
Primary‑replica sync is asynchronous: the primary returns to the client immediately after a write, and the change is pushed to replicas in the background. Replicas can therefore lag briefly, and if the primary crashes right after a write, that write may not yet exist on any replica. gcware compares Log Sequence Numbers (LSNs) and promotes the most up‑to‑date replica.
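As a rough illustration of LSN-based selection, the sketch below picks the live replica with the highest LSN and reports how far it trails the failed primary. It assumes LSNs are plain increasing integers; the `Replica` class and the sample values are hypothetical, not gcware data structures.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    node: str
    lsn: int            # assumed: last log sequence number applied on this copy
    alive: bool = True


def pick_promotion_candidate(replicas):
    """Return the live replica with the highest LSN, or None if none is alive."""
    live = [r for r in replicas if r.alive]
    if not live:
        return None
    return max(live, key=lambda r: r.lsn)


def lsn_gap(primary_lsn, candidate):
    """LSN distance between the failed primary and the promoted replica."""
    return max(0, primary_lsn - candidate.lsn)


replicas = [
    Replica("node2", 1040),
    Replica("node3", 1047),
    Replica("node4", 1050, alive=False),  # fully caught up, but unreachable
]
best = pick_promotion_candidate(replicas)
print(best.node, lsn_gap(1050, best))
```

A non-zero gap is the window in which acknowledged writes can be lost under asynchronous replication.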
Checking Replica Consistency
SELECT segment_id, node_name, is_primary, data_state, version
FROM gclusterdb.segment_info
ORDER BY segment_id, is_primary DESC;
data_state values: 0 = consistent, 1 = replica catching up, 2 = severely lagging — manual intervention needed.
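A monitoring script might post-process the query result along these lines. This is a sketch under assumptions: the rows are hard-coded tuples in the column order of the query above, and fetching them from the database is left out.

```python
# Meaning of data_state, as listed in the text.
DATA_STATE = {0: "consistent", 1: "replica catching up", 2: "severely lagging"}


def summarize(rows):
    """Count rows per data_state label."""
    counts = {label: 0 for label in DATA_STATE.values()}
    for row in rows:
        label = DATA_STATE.get(row[3], "unknown")
        counts[label] = counts.get(label, 0) + 1
    return counts


def segments_needing_attention(rows):
    """Return (segment_id, node_name) pairs whose data_state is 2."""
    return [(seg, node) for seg, node, _is_primary, state, _version in rows
            if state == 2]


# Hypothetical rows: (segment_id, node_name, is_primary, data_state, version)
rows = [
    (1, "node1", 1, 0, 120),
    (1, "node2", 0, 0, 120),
    (2, "node2", 1, 0, 98),
    (2, "node3", 0, 2, 71),
]
print(summarize(rows))
print(segments_needing_attention(rows))
```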
4. Node Failover Process
Automatic Failover
- gcware detects heartbeat timeout (default 5 s)
- gcware marks the node DOWN
- Promotes the most up‑to‑date replica to primary
- The new primary starts serving reads and writes
- gcluster updates its internal routing table
- Subsequent SQL is automatically routed to the new primary — transparent to applications
The whole process typically completes in 5–30 seconds.
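The sequence above can be sketched as a toy simulation: find nodes whose last heartbeat is older than the timeout, then re-point each affected segment at its most up-to-date surviving replica. The data structures and names are hypothetical and only model the steps; they do not reflect gcware or gcluster internals.

```python
HEARTBEAT_TIMEOUT_S = 5.0  # default timeout quoted in the text


def detect_down(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT_S):
    """Nodes whose most recent heartbeat is older than the timeout."""
    return sorted(node for node, ts in last_heartbeat.items()
                  if now - ts > timeout)


def fail_over(routing, replica_lsn, down_nodes):
    """Return a new routing table with primaries moved off the down nodes.

    routing:     segment id -> current primary node
    replica_lsn: segment id -> {replica node: LSN}
    """
    new_routing = dict(routing)
    for seg, primary in routing.items():
        if primary not in down_nodes:
            continue
        candidates = {node: lsn
                      for node, lsn in replica_lsn.get(seg, {}).items()
                      if node not in down_nodes}
        # Promote the surviving replica with the highest LSN, if there is one.
        new_routing[seg] = max(candidates, key=candidates.get) if candidates else None
    return new_routing


last_heartbeat = {"node1": 100.0, "node2": 93.5, "node3": 99.2}
down = detect_down(last_heartbeat, now=100.0)
routing = {1: "node1", 2: "node2"}
replica_lsn = {1: {"node2": 500}, 2: {"node1": 410, "node3": 415}}
print(down, fail_over(routing, replica_lsn, down))
```

A segment mapped to `None` has no live copy left, which is the case automatic failover cannot cover.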
Handling Primary‑Replica Inconsistency
Configure the behaviour when inconsistency is detected:
# gbase.cnf on gcluster
# 0 = refuse service (conservative)
# 1 = auto‑select a new primary (may lose a small amount of data)
gcluster_suffix_consistency_resolve = 1
Evaluate data‑loss tolerance carefully in production before enabling automatic promotion.
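Before a change window it can help to confirm which mode a node is actually configured with. The helper below is a sketch that reads the value from cnf-style text; the `[gbased]` section name in the sample is an assumption, and the parser only handles simple `key = value` lines.

```python
RESOLVE_MODES = {"0": "refuse service", "1": "auto-select a new primary"}


def read_cnf_value(text, key):
    """Return the last value assigned to key in cnf-style text, or None."""
    value = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            value = val.strip()
    return value


# Sample text; the section header is assumed for illustration.
sample = """
[gbased]
# 0 = refuse service, 1 = auto-select a new primary
gcluster_suffix_consistency_resolve = 1
"""
mode = read_cnf_value(sample, "gcluster_suffix_consistency_resolve")
print(mode, RESOLVE_MODES.get(mode, "unknown"))
```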
Data Resync After Node Recovery
When a failed node restarts, it automatically re‑synchronises with the current primary:
# Check sync progress
gcadmin showdistribution node
# Force a resync if stuck
gcadmin resync node <node_name>
5. Common HA Troubleshooting
Fault 1: gcware won't start — "can not connect to any server"
Cause: gcware service not running, or Corosync port (UDP 5405) blocked by firewall.
# Check gcware process
ps -ef | grep gcware
# Check Corosync port
netstat -tunlp | grep 5405
# Manually start gcware
gcware_services all start
# Inspect gcware log
tail -200 $GCWARE_BASE/log/gcware.log
Fault 2: gnode status CLOSE, log shows memory limit exceeded
Cause: gnode heap memory parameters are too low.
Fix: edit gbase.cnf on the affected node:
gbase_memory_pct_target = 0.75
gbase_heap_data = 4096M
gbase_heap_temp = 2048M
gbase_heap_large = 4096M
Restart and verify:
gcluster_services all restart
gcadmin # confirm node status returns to OPEN
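When raising the heap parameters, a quick sanity check against physical memory avoids trading one memory error for another. The rule used below (the three heaps together should fit within `gbase_memory_pct_target` of physical RAM) is an assumption made for this sketch, not a documented formula; verify the exact relationship in the product documentation.

```python
def to_mib(size):
    """Convert a size such as '4096M' or '2G' to MiB (a bare number is bytes)."""
    units = {"K": 1 / 1024, "M": 1, "G": 1024}
    size = size.strip().upper()
    if size[-1] in units:
        return float(size[:-1]) * units[size[-1]]
    return float(size) / (1024 * 1024)


def heap_budget_ok(physical_mib, pct_target, heaps):
    """Return (fits, total_mib) under the assumed rule: sum(heaps) <= RAM * pct."""
    total = sum(to_mib(h) for h in heaps)
    return total <= physical_mib * pct_target, total


heaps = ["4096M", "2048M", "4096M"]  # the values from the fix above
print(heap_budget_ok(16384, 0.75, heaps))  # node with 16 GiB of RAM
print(heap_budget_ok(8192, 0.75, heaps))   # node with 8 GiB of RAM
```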
Fault 3: Cluster INACTIVE — more than half the gcware nodes unreachable
When over half the gcware nodes are down, the cluster enters INACTIVE state and rejects all writes (protecting data consistency). Do not attempt forced writes. First restore gcware to a quorum majority, then check gnodes one by one.
6. HA Operations Best Practices
| Recommendation | Reason |
|---|---|
| Deploy gcware on odd numbers (3 or 5) | Prevents split‑brain; ensures quorum arbitration |
| Separate gcware from data nodes (V9.5.3+) | Avoids data‑node failures impacting the arbitration layer |
| Place primary/replica on different physical machines/racks | Prevents a single hardware fault from taking down both |
| Periodically check data_state in segment_info | Catches replica lag early |
| Keep at least 2 copies of each segment (1 primary + 1 replica) | Survives single‑node failures without service impact |
7. Quick Command Reference
# Overall cluster status
gcadmin
# Segment distribution and replica state per node
gcadmin showdistribution node
# Start gcware on all gcware nodes
gcware_services all start
# Start gcluster/gnode on all nodes
gcluster_services all start
# Follow gcware log
tail -f $GCWARE_BASE/log/gcware.log
# Follow gcluster log
tail -f $GCLUSTER_BASE/log/gcluster/system.log
# Follow gnode log
tail -f $GNODE_BASE/log/gbase/system.log
Understanding these HA mechanisms is essential for keeping a GBase 8a cluster reliable. The quorum‑based gcware layer, asynchronous replica sync, and automatic failover work together to provide continuous service even when individual nodes fail, as long as the cluster is deployed with the right topology and monitored proactively.