Cong Li

Introduction to Distributed High Availability in GBase 8c

GBase 8c is a multi-modal database that supports standalone, master-slave, and distributed deployment modes. Both the master-slave and distributed modes support high availability (HA). In distributed mode, HA is built on component-level redundancy, so every type of node can be deployed in an HA configuration. This article briefly introduces the distributed high availability of GBase 8c. The distributed architecture is illustrated below:

[Figure: GBase 8c Distributed Architecture]

A GBase 8c distributed database consists of CN, DN, GTM, and HA management components, each described below:

CN: Coordinator Node

  • Deployment: Fully peer-to-peer.
  • Function: Provides the SQL interface: parses and optimizes SQL, generates execution plans, and coordinates the data nodes for queries and writes. It stores global metadata but no business data.
  • High Availability: Multiple CN nodes can be deployed, all presenting the same database view at all times. They can be placed in the same data center, in the same city, or across different locations.

DN: Data Node

  • Deployment: Master-slave HA architecture with synchronous or asynchronous configurations.
  • Function: Stores local metadata and business data shards, executing requests from the CN.
  • High Availability: Can be deployed as a single master, one master with one slave, or one master with multiple slaves, using either synchronous or asynchronous replication.
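
The difference between synchronous and asynchronous replication can be illustrated with a small toy model. This is a generic sketch and not GBase 8c code: the `Primary` and `Standby` classes and their methods are hypothetical names invented for this example, and real systems ship log records over the network rather than calling a method directly.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    """Toy slave node (hypothetical): stores the log records it has received."""
    log: list = field(default_factory=list)

    def apply(self, record: str) -> None:
        self.log.append(record)


@dataclass
class Primary:
    """Toy master node (hypothetical): ships log records to its standbys."""
    standbys: list
    synchronous: bool
    log: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # records not yet shipped

    def commit(self, record: str) -> None:
        self.log.append(record)
        if self.synchronous:
            # Synchronous: the commit returns only after every standby
            # in this toy model has received the record.
            for standby in self.standbys:
                standby.apply(record)
        else:
            # Asynchronous: the commit returns immediately and the
            # record is shipped later, so standbys may lag behind.
            self.pending.append(record)

    def ship_pending(self) -> None:
        """Deliver records that were committed asynchronously."""
        for record in self.pending:
            for standby in self.standbys:
                standby.apply(record)
        self.pending.clear()
```

In this model, a slave promoted right after an asynchronous commit may be missing the most recent records, which is the usual trade-off: synchronous replication favors data safety, asynchronous replication favors commit latency.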

GTM: Global Transaction Manager

  • Deployment: Master-slave HA architecture with synchronous or asynchronous configurations.
  • Function: Manages distributed transactions, generates and maintains global timestamps to ensure data consistency.
  • High Availability: Deployed in the same way as a DN, with a master and multiple slave nodes.

HA Center (Equivalent to etcd)

  • Function: Manages cluster state using the Raft replication protocol. It stores the HA status of each node and is used to assess node status when failures occur.
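
Because the HA Center relies on Raft, it stays available only while a majority of its members can communicate. The short sketch below works out that arithmetic for a given cluster size. The function names are hypothetical and chosen for this example; the majority rule itself is a general property of Raft, not something specific to GBase 8c.

```python
def quorum_size(members: int) -> int:
    """Smallest number of members that forms a majority."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum_size(members)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

A 3-member and a 4-member cluster both tolerate a single failure, which is why Raft-based stores are usually deployed with an odd number of members.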

GHA Server (Equivalent to Patroni)

  • Deployment: Master-slave HA architecture with synchronous or asynchronous configurations.
  • Function: Manages the HA status of all nodes in the cluster (master and slave roles, node availability). Leader information is stored in the HA Center.

GHA Agent

  • Deployment: Deployed on each CN and DN.
  • Function: Acts as a local agent, receiving and handling messages from the GHA Server.

High Availability Processes on Server

gha_server: Main process on management nodes, managing inter-node processes and cluster state.

  • rpc: Handles messaging between gha_ctl, gha_server, and gha_agent.
  • arbiter: Decision-making process, assessing node status.
  • leader checker: Prevents dual master scenarios by verifying leader status.
  • cluster_info_publisher: Communicates network topology to gha_agent.
  • dcs_updater: Ensures the latest and correct state is written to the HA Center.

gha_agent Processes:

  • rpc: Communicates with gha_server.
  • reporter: Periodically reports database status to gha_server.
  • state reporter: Reports node state to gha_server.
  • ha state reporter: Prevents dual-master situations by ensuring that the DN backup nodes reach consensus before a master-slave switchover is triggered.
  • leader checker: Surrenders leader status and restarts if the node becomes isolated.
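
The two dual-master safeguards in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the gha_agent implementation: the article does not say how consensus among backups is measured or how isolation is detected, so the strict-majority vote and the lease timeout used here are assumptions, and `backups_agree_master_down` and `LeaderLease` are hypothetical names.

```python
import time


def backups_agree_master_down(votes: dict) -> bool:
    """Return True only if a strict majority of backups report the master
    as unreachable. `votes` maps a backup name to True when that backup
    cannot reach the master. (Majority voting is an assumption.)"""
    if not votes:
        return False
    down = sum(1 for unreachable in votes.values() if unreachable)
    return down > len(votes) // 2


class LeaderLease:
    """Hypothetical time-limited leadership lease. A leader that cannot
    renew the lease (for example because it is isolated from the HA
    Center) should give up leadership once the lease has expired."""

    def __init__(self, ttl_seconds: float, now=time.monotonic):
        self.ttl = ttl_seconds
        self._now = now  # injectable clock, which makes testing easy
        self.renewed_at = now()

    def renew(self) -> None:
        """Call after each successful contact with the HA Center."""
        self.renewed_at = self._now()

    def should_surrender(self) -> bool:
        """True once the lease has gone unrenewed for longer than the TTL."""
        return self._now() - self.renewed_at > self.ttl
```

Taken together, a switchover in this sketch proceeds only when most backups agree that the master is gone, while an isolated master steps down on its own once its lease expires, so the two sides do not both act as master.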

HA Module References:

  • etcd: Runs the Raft protocol and stores the internal state of the HA groups.
  • patroni: Python-based tool for native PostgreSQL HA, adapted in this system to support multiple HA groups.

As this overview shows, GBase 8c's distributed architecture provides HA for each of its components. In practice, a deployment can be designed flexibly around the available server resources to ensure data reliability and security.