HPE Morpheus Enterprise Disaster Recovery with MySQL InnoDB ClusterSet: A Production-Grade Guide
In a previous build-out we deployed two separate MySQL InnoDB Clusters, one on the Master site and one on the DR site, as the backing database for our Morpheus Enterprise platform. This guide is the disaster-recovery follow-up: we merge those two clusters into a single InnoDB ClusterSet so Morpheus Enterprise keeps serving traffic even if an entire site is lost. It covers TLS-encrypted async replication between the sites, re-bootstrapping MySQL Router so the Morpheus app layer doesn't need any connection-string changes, and both planned switchovers and emergency failovers, including the nasty errant-GTID cleanup that most guides skip. Every step is executed without interrupting live Morpheus traffic.
Introduction: Why InnoDB ClusterSet?
MySQL InnoDB Cluster gives you solid high availability inside a single data center. Group Replication handles node failures transparently. What it doesn't give you is protection against a site level disaster: if the DC goes dark, a fiber gets cut, or a regional outage hits, your cluster vanishes with it.
InnoDB ClusterSet fills that gap:
- A Primary Cluster serves write traffic (Master Site).
- A Replica Cluster is fed via async replication (DR Site).
- MySQL Router picks up topology changes automatically, so there are no application side connection string changes after failover.
- Planned switchovers give you zero data loss; emergency failovers keep RPO in the single-digit-seconds range.
This article walks through the exact steps I run in production, with the error messages you will hit, the security posture you should aim for, and the operational tips most tutorials skip. It is specifically written to layer ClusterSet on top of two existing InnoDB Clusters rather than a greenfield install.
One caveat worth stating up front: ClusterSet is not a replacement for backups. Logical corruption (a `DROP TABLE`, a bad migration, an application bug) replicates to DR just fine. You still need MySQL Enterprise Backup or Percona XtraBackup on an independent schedule.
Starting Point and Target Architecture
Current State
| Component | Master Site | DR Site |
|---|---|---|
| InnoDB Cluster | Installed and running | Installed (will be dropped) |
| MySQL Router | Registered and active | Registered (will be re-bootstrapped) |
| Database | Preserved (source of truth) | Doesn't matter (will be overwritten by clone) |
Requirements and Pre-Flight Checks
Software Versions
| Component | Version | Notes |
|---|---|---|
| MySQL Server | 8.0.27+ (8.4 LTS recommended) | Identical version across every node |
| MySQL Shell | Same major as server | Required on every node |
| MySQL Router | 8.0.x (8.2+ for R/W splitting) | At least one instance per site |
Firewall Port List
Cross-site (Master site ↔ DR site, both directions):
| Port | Protocol | Service | Purpose |
|---|---|---|---|
| 3306 | TCP | MySQL Classic | ClusterSet async replication channel (clusterset_replication) + AdminAPI |
ClusterSet uses standard async replication on port 3306 between the primary and replica clusters. Group Replication's internal port (33061) is intra-cluster only and does not carry cross-site traffic.
Intra-cluster (within the same site, between GR members):
| Port | Protocol | Service | Purpose |
|---|---|---|---|
| 3306 | TCP | MySQL Classic | Client + recovery donor connections |
| 33061 | TCP | Group Replication | Inter-node GR local_address (configurable via group_replication_local_address) |
| 33062 | TCP | XCom (recommended to keep free) | Used when `communicationStack: 'XCOM'`. On the default MYSQL stack it isn't required, but reserving it avoids surprises if you ever switch stacks |
Operator workstation → any MySQL node:
| Port | Protocol | Purpose |
|---|---|---|
| 3306 | TCP | `mysqlsh` classic connection |
| 33060 | TCP | `mysqlsh` X Protocol (default for newer Shells) |
A common mistake: port 33060 is not required between MySQL nodes (cross-site or intra-cluster). It is only needed from the operator/DBA workstation. Leaving it open cross-site is unnecessary attack surface.
Application → MySQL Router:
| Port | Protocol | Purpose |
|---|---|---|
| 6446 | TCP | Classic R/W, routes to primary |
| 6447 | TCP | Classic R/O, load-balanced across secondaries |
| 6448 / 6449 | TCP | X Protocol R/W and R/O (optional) |
| 6450 | TCP | R/W splitting endpoint (MySQL Router 8.2+, optional) |
Verify connectivity before you start:
# Cross-site: only 3306 is required
nc -zv <remote_site_node_ip> 3306
# Intra-cluster (within same site): 3306 and 33061
nc -zv <local_peer_ip> 3306
nc -zv <local_peer_ip> 33061
Server-Level Consistency Checks (often forgotten)
Any of the following drifting between Master and DR will eventually break replication or corrupt query semantics. Run on one node per cluster and diff the output.
-- Character set and collation
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';
-- Timezone (critical for TIMESTAMP columns)
SELECT @@global.time_zone, @@global.system_time_zone;
-- SQL mode
SELECT @@global.sql_mode;
-- Lower case table names (cannot be changed after initialization)
SELECT @@global.lower_case_table_names;
-- Clone plugin availability (required for recoveryMethod: 'clone')
SHOW PLUGINS WHERE Name = 'clone';
-- Expected: Status = ACTIVE. If missing:
-- INSTALL PLUGIN clone SONAME 'mysql_clone.so';
Watch out for `lower_case_table_names`. If Master and DR were initialized with different values, replication will break on any mixed-case table name. The value is set at `mysqld --initialize` time and cannot be changed afterwards, so the only fix is a full re-init of the divergent side.
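To make the diff step concrete, here is a minimal shell sketch. The `dump_vars` helper name is my own invention, and it assumes the `mysql` client is on the operator host with clusterAdmin credentials in `~/.my.cnf`:

```shell
# Hypothetical helper: dump the drift-prone server variables for one node
# as "name<TAB>value" lines, suitable for diffing between sites.
dump_vars() {  # usage: dump_vars <host>
  mysql -h "$1" -N -B -e "
    SELECT 'character_set_server', @@character_set_server
    UNION ALL SELECT 'collation_server',       @@collation_server
    UNION ALL SELECT 'time_zone',              @@global.time_zone
    UNION ALL SELECT 'sql_mode',               @@global.sql_mode
    UNION ALL SELECT 'lower_case_table_names', @@global.lower_case_table_names;"
}

# Compare one node per cluster; any diff output means drift to fix first:
# diff <(dump_vars master_node1) <(dump_vars dr_node1) && echo 'No drift'
```

Running the commented `diff` before creating the ClusterSet turns "eyeball the SHOW VARIABLES output" into a one-line pass/fail check.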
Step 1: Configure server_id and report_host on the Existing Cluster
ClusterSet requires every MySQL node to have a unique server_id (across both sites) and a report_host that advertises its real address. These parameters don't take effect until MySQL restarts, so on a live cluster you perform a rolling restart. Never shut all nodes down at once.
Critical: restart secondaries first and the primary last. Otherwise you risk losing quorum.
1.1 Inspect Current Values
mysql -u clusterAdmin -p \
-e "SELECT @@server_id, @@report_host, @@report_port, @@hostname;"
Target values:
| Node | server_id | report_host |
|---|---|---|
| master_node1 (Primary) | 1 | master_node1 IP |
| master_node2 (Secondary) | 2 | master_node2 IP |
| master_node3 (Secondary) | 3 | master_node3 IP |
| dr_node1 | 4 | dr_node1 IP |
| dr_node2 | 5 | dr_node2 IP |
| dr_node3 | 6 | dr_node3 IP |
Why this matters: `server_id` must be unique across the entire ClusterSet, not just per cluster. Two nodes sharing an id will drop replication events or create a silent data divergence.
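Uniqueness across six nodes is easy to get wrong by hand, so it is worth scripting the check. A sketch with a hypothetical `check_unique_ids` helper (the commented collection loop assumes clusterAdmin credentials in `~/.my.cnf`):

```shell
# Hypothetical check: read "host server_id" pairs on stdin and fail loudly
# on any duplicate id. Exit status is non-zero if a duplicate is found.
check_unique_ids() {
  awk '{ if (seen[$2]++) { print "DUPLICATE server_id " $2 ": " $1; bad = 1 } }
       END { exit bad }'
}

# Collect the pairs from every node in both sites and pipe them in:
# for h in master_node1 master_node2 master_node3 dr_node1 dr_node2 dr_node3; do
#   echo "$h $(mysql -h "$h" -N -e 'SELECT @@server_id')"
# done | check_unique_ids || echo 'FIX server_id BEFORE creating the ClusterSet'
```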
1.2 Edit my.cnf
[mysqld]
# ── Required for ClusterSet ──────────────────────────────────────
server_id = 1 # Must be UNIQUE across BOTH sites
report_host = 10.10.1.11 # Real IP of this node
report_port = 3306
# ── Should already be present on an InnoDB Cluster ───────────────
gtid_mode = ON
enforce_gtid_consistency = ON
binlog_format = ROW
log_replica_updates = ON # Renamed from log_slave_updates in 8.0.26
plugin_load_add = group_replication.so
# ── Auto-rejoin after restart (optional but recommended) ─────────
loose-group_replication_start_on_boot = ON
A note on hostnames: use IP addresses for `report_host`. Hostname-based setups frequently hit `Server address configuration error` when DNS is inconsistent.
1.3 Rolling Restart
The sub-steps below follow the safe order: identify the primary, restart each secondary in turn, restart the primary last, move the primary role back to master_node1, then verify Router.
1.3.1 Identify the Primary
mysqlsh clusterAdmin@master_node1:3306
var cluster = dba.getCluster()
cluster.status()
The node with memberRole: "PRIMARY" is restarted last.
1.3.2 Restart Secondaries One at a Time
# A) (Optional) Gracefully remove from cluster
mysqlsh clusterAdmin@<secondary_node>:3306
# >> dba.getCluster().removeInstance('<secondary_node>:3306', {force: false})
# B) Restart MySQL
systemctl restart mysqld
# C) Verify new parameters
mysql -u clusterAdmin -p -e "SELECT @@server_id, @@report_host;"
# D) Re-add
mysqlsh clusterAdmin@master_node1:3306
# >> dba.getCluster().addInstance('<secondary_node>:3306', {recoveryMethod: 'incremental'})
Shortcut: with `loose-group_replication_start_on_boot = ON`, a plain `systemctl restart mysqld` is enough. The node rejoins on its own within 30 to 60 seconds.
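That 30-to-60-second rejoin window is worth scripting rather than eyeballing. A minimal polling sketch with a hypothetical `wait_for` helper (the example MySQL query assumes clusterAdmin credentials in `~/.my.cnf` on the restarted node):

```shell
# Hypothetical helper: poll a command until it prints the expected string,
# or give up after a timeout. Returns 0 on success, 1 on timeout.
wait_for() {  # usage: wait_for <expected> <timeout_seconds> <command...>
  local expected=$1 timeout=$2 elapsed=0
  shift 2
  until [ "$("$@" 2>/dev/null)" = "$expected" ]; do
    elapsed=$((elapsed + 2))
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep 2
  done
}

# Usage after restarting a secondary (run on the restarted node):
# wait_for ONLINE 120 mysql -N -e \
#   "SELECT MEMBER_STATE FROM performance_schema.replication_group_members
#    WHERE MEMBER_HOST = @@report_host;"
```

Gating each secondary's restart on this check keeps the rolling restart honest: you never touch the next node until the previous one is back `ONLINE`.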
1.3.3 Restart the Primary Last
# Confirm every secondary is ONLINE first
systemctl restart mysqld
When mysqld exits, Group Replication performs a view change and promotes a healthy secondary (5–15 seconds).
1.3.4 Move the Primary Role Back
mysqlsh clusterAdmin@<any_node>:3306
dba.getCluster().setPrimaryInstance('master_node1:3306')
This is a coordinated primary handover, not an election. The current primary drains in-flight transactions, sets the target node writable, and transfers the role. Router picks up the change via its metadata TTL (5 to 15 seconds in practice).
1.3.5 Verify Router Routing
mysql -h <router_ip> -P 6446 -u clusterAdmin -p \
-e "SELECT @@hostname, @@report_host;"
Output should show master_node1.
1.4 Full Verification
| Check | Expected |
|---|---|
| `MEMBER_STATE` | `ONLINE` for every node |
| `@@server_id` | Unique (1–6) |
| `@@report_host` | Real IP |
| `cluster.status()` | `statusText: "Cluster is ONLINE and can tolerate up to ONE failure"` |
Step 2: Verify Server Certificates Are Present (Pre-Flight for TLS)
Cross-site async replication without TLS is a serious exposure in production. Before you create the ClusterSet, confirm that every node already has working server certificates so the replication channel can negotiate TLS as soon as it's created in §4.1. The actual enforcement option (clusterSetReplicationSslMode) is set at ClusterSet creation time, not here.
2.1 Confirm Certificates Exist
MySQL auto-generates self-signed certs under the datadir on first init. For production, replace them with certs issued by your internal CA. Verify they exist on every node:
SHOW VARIABLES LIKE 'ssl_%';
-- ssl_ca, ssl_cert, ssl_key should all point to valid files
Test TLS is actually offered:
mysql -u clusterAdmin -p -h master_node1 \
--ssl-mode=REQUIRED \
-e "SHOW STATUS LIKE 'Ssl_cipher';"
# Ssl_cipher should be non-empty (e.g., TLS_AES_256_GCM_SHA384)
2.2 How TLS Is Enforced (Preview of §4.1)
TLS on the ClusterSet replication channel is controlled by the clusterSetReplicationSslMode option, set at ClusterSet creation time via cluster.createClusterSet() (see §4.1). Possible values:
| Value | Behavior |
|---|---|
| `AUTO` (default) | Enables encryption when the server supports it, disables otherwise |
| `REQUIRED` | Enables encryption for all ClusterSet replication channels |
| `DISABLED` | Disables encryption. Avoid in production |
| `VERIFY_CA` (Shell 8.0.33+) | Like `REQUIRED`, plus verifies the server certificate against the CA |
| `VERIFY_IDENTITY` (Shell 8.0.33+) | Like `VERIFY_CA`, plus verifies the certificate identity matches the server hostname |
The internal replication user (typically mysql_innodb_cs_<hex>) is created by AdminAPI and the channel is configured to match this option. Optionally you can also restrict the account server-side post-creation as defense-in-depth:
-- Defense in depth; the channel is already configured per clusterSetReplicationSslMode
SELECT User, Host FROM mysql.user WHERE User LIKE 'mysql_innodb_cs_%';
ALTER USER 'mysql_innodb_cs_xxxx'@'<scoped_host>' REQUIRE SSL;
TLS version note: MySQL 8.0 negotiates TLSv1.2 and TLSv1.3 by default. Pinning to `tls_version = TLSv1.3` is good hardening only when every client supports it; keeping `TLSv1.2,TLSv1.3` is the safer default.
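If you do decide to pin, the change is a my.cnf edit plus a restart. A sketch of the relevant fragment (values are illustrative, not a recommendation for every environment):

```ini
[mysqld]
# Safer default: allow both modern versions (MySQL 8.0 default behavior)
tls_version = TLSv1.2,TLSv1.3

# Hardened variant: enable only if EVERY client, Router instance, and
# replication peer is known to support TLSv1.3
# tls_version = TLSv1.3
```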
Step 3: Remove the Existing DR Cluster
The legacy InnoDB Cluster on DR cannot be added to a ClusterSet as a replica. It has to be fully dismantled first.
Data loss warning: this wipes DR. Clone re-seeds from Master afterwards. Confirm nothing on DR needs to be preserved before continuing.
3.1 Method A: dissolve() (Recommended)
mysqlsh clusterAdmin@dr_node1:3306
var oldCluster = dba.getCluster()
oldCluster.status()
oldCluster.dissolve({force: true})
dissolve() connects to every node, stops Group Replication, and drops the mysql_innodb_cluster_metadata schema. force: true lets it proceed even with OFFLINE nodes.
Verify on every DR node:
SELECT @@group_replication_group_name; -- empty
SHOW STATUS LIKE 'group_replication_primary_member'; -- empty
dba.getCluster() // should throw
3.2 Method B: dropMetadataSchema() (Fallback)
If dissolve() fails:
mysqlsh clusterAdmin@dr_node1:3306
\sql
STOP GROUP_REPLICATION;
-- RESET REPLICA ALL is a no-op for a pure InnoDB Cluster (GR uses its own
-- group_replication_recovery channel, not SQL-level replication). It's only
-- useful here if the node was *previously* a ClusterSet replica or had some
-- manual async channel configured. Skip unless `SHOW REPLICA STATUS\G` shows
-- a channel. Running it on a clean node is harmless but unnecessary.
-- RESET REPLICA ALL;
\js
dba.dropMetadataSchema({force: true})
Repeat on dr_node2 and dr_node3. If you still see errant GTIDs blocking clone later, a controlled reset is sometimes necessary:
-- Use only if you're 100% sure DR data is disposable
RESET MASTER; -- MySQL 8.0.x
-- RESET BINARY LOGS AND GTIDS; -- MySQL 8.4+ (new spelling; RESET MASTER is deprecated there)
3.3 Clean Up the DR Router
systemctl stop mysqlrouter
rm -rf /etc/mysqlrouter/*
rm -rf /var/lib/mysqlrouter/*
3.4 Confirm DR Nodes Are Standalone
mysqlsh clusterAdmin@<dr_nodeX>:3306
dba.getCluster() // ERROR expected
dba.checkInstanceConfiguration('clusterAdmin@<dr_nodeX>:3306') // "ok"
Step 4: Create the ClusterSet and Join DR
4.1 Create ClusterSet from Master (TLS hardened)
Both TLS enforcement and the replication user host scope are configured at ClusterSet creation time. They belong on cluster.createClusterSet(), not on createReplicaCluster() later. Setting them here means every replica cluster you add inherits the secure defaults.
Assumed naming convention used throughout this guide (substitute your own names if different):
- Primary cluster name (created earlier when you ran `dba.createCluster('morpheus-cluster')` on Master): `morpheus-cluster`
- ClusterSet name (created in this step): `MorpheusClusterSet`
- Replica cluster name (created in §4.2): `MorphDrCluster`
You can confirm the primary cluster name with `dba.getCluster().getName()` before proceeding.
mysqlsh clusterAdmin@master_node1:3306
var cluster = dba.getCluster()
// Sanity check: this should return 'morpheus-cluster' in this guide
cluster.getName()
var clusterSet = cluster.createClusterSet('MorpheusClusterSet', {
clusterSetReplicationSslMode: 'REQUIRED', // Force TLS on the ClusterSet replication channel
replicationAllowedHost: '10.0.0.0/8' // MUST cover BOTH primary and replica subnets (see note below)
})
clusterSet.status()
Version requirements: `clusterSetReplicationSslMode` is available in MySQL Shell 8.0.27 and later. The values `VERIFY_CA` and `VERIFY_IDENTITY` (stronger than `REQUIRED`) were added in 8.0.33. The `replicationAllowedHost` option requires MySQL Shell 8.0.28 or later.
⚠️ `replicationAllowedHost` is bidirectional. Per the official docs, the host pattern you set must match the nodes in both the primary and replica clusters. The internal replication user (`mysql_innodb_cs_*`) is granted with this host pattern and will be used by:
- The replica cluster when it connects to the primary (normal-state replication)
- The original primary's nodes if they later need to connect to a new primary after failover or switchover
Scoping the pattern to only the DR subnet (for example `10.20.0.0/24`) will work as long as the primary never changes, but `setPrimaryCluster()` or `forcePrimaryCluster()` will break replication for the old primary's nodes because they can't satisfy the host grant. Use a CIDR or wildcard that covers every MySQL node in every cluster, for example `10.0.0.0/8` covering both sites, or a pair of wildcards like `10.10.%.%` combined with `10.20.%.%`.
Host pattern syntax: per the official docs, `replicationAllowedHost` accepts CIDR notation (for example `192.0.2.0/24`). The MySQL `Host` column also accepts wildcards like `10.%.%.%` and the netmask form `10.20.0.0/255.255.0.0` if your network requires those. Avoid `'%'` in production, since it lets the replication user connect from any IP.
Already created? If you see `MYSQLSH 51601 — function is not available … already belongs to a ClusterSet`, fetch the existing ClusterSet instead:
var clusterSet = dba.getClusterSet()
// To change the host scope after the fact:
clusterSet.setOption('replicationAllowedHost', '10.0.0.0/8')
4.2 Add DR as Replica Cluster
TLS and host scope were already set at the ClusterSet level in §4.1, so the replica cluster inherits them automatically. Here you only need the provisioning options that belong on createReplicaCluster():
var clusterSet = dba.getClusterSet()
var replicaCluster = clusterSet.createReplicaCluster(
'clusterAdmin@dr_node1:3306',
'MorphDrCluster',
{
recoveryMethod: 'clone', // AUTO | CLONE | INCREMENTAL
recoveryProgress: 1, // 0 = silent, 1 = static, 2 = progress bars
timeout: 600 // Seconds to wait for post-provisioning sync
// communicationStack: 'MYSQL' // Default on 8.0.27+; set only to pin explicitly
}
)
What's actually happening under the hood, per the official docs: `createReplicaCluster()` creates the `clusterset_replication` asynchronous channel, generates an internal replication user with a random password, and configures encryption on the channel according to the `clusterSetReplicationSslMode` you set on the ClusterSet in §4.1. It also sets `skip_replica_start = ON` and `super_read_only = ON`, and enables `mysql_start_failover_channels_if_primary` for async failover on the channel.
Add remaining DR nodes:
replicaCluster.addInstance('clusterAdmin@dr_node2:3306', {recoveryMethod: 'clone'})
replicaCluster.addInstance('clusterAdmin@dr_node3:3306', {recoveryMethod: 'clone'})
Note on `cloneDonor`: by default AdminAPI picks a secondary from the primary cluster as clone donor (and falls back to the primary if no secondary is available). For cross-WAN clones of large datasets, consider pinning an explicit donor with `cloneDonor: 'host:port'` to control which node sources the snapshot, and therefore which link the traffic flows through.
4.3 Common Errors
| Error | Cause | Fix |
|---|---|---|
| `Server address configuration error` | `report_host` missing/wrong | Add to my.cnf, restart mysqld |
| `Target instance already part of an InnoDB Cluster` | DR not cleaned up | Re-run Step 3 |
| `Errant GTIDs detected` | Old transactions on DR | Use `recoveryMethod: 'clone'` (overwrites) |
| `Can't connect to primary cluster` | Firewall / `report_host` | Test with `nc -zv` |
| clone plugin not installed | Missing plugin | `INSTALL PLUGIN clone SONAME 'mysql_clone.so';` on both sides |
| `Access denied for user 'mysql_innodb_cs_*'` | TLS mismatch or host restriction | Verify `replicationAllowedHost` matches donor IP |
Step 5: Re-bootstrap MySQL Router
DR's metadata was rebuilt, so Router on both sites must re-learn the new topology.
5.1 DR Router
Always bootstrap against a Master site node. Router reads metadata from the primary.
systemctl stop mysqlrouter
rm -rf /etc/mysqlrouter/*
rm -rf /var/lib/mysqlrouter/*
mysqlrouter --bootstrap clusterAdmin@master_node1:3306 \
--account routeruser \
--user=mysqlrouter \
--conf-use-gr-notifications \
--force
# Note: bootstrap enables the Router REST API by default; add --disable-rest
# here if you do NOT want the metrics/health endpoints exposed
systemctl start mysqlrouter
5.2 Master Router
systemctl stop mysqlrouter
mysqlrouter --bootstrap clusterAdmin@master_node1:3306 \
--account routeruser \
--user=mysqlrouter \
--conf-use-gr-notifications \
--force
systemctl start mysqlrouter
--conf-use-gr-notifications makes Router react to GR state changes without waiting for a metadata TTL, which meaningfully shortens failover detection.
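For reference, the bootstrap writes its choices into `mysqlrouter.conf`. A sketch of the relevant generated section (the section suffix, `router_id`, and cluster name are illustrative and vary per deployment):

```ini
# Excerpt of a generated /etc/mysqlrouter/mysqlrouter.conf (illustrative)
[metadata_cache:bootstrap]
cluster_type=gr
router_id=1
user=routeruser
metadata_cluster=morpheus-cluster
ttl=0.5                  # metadata refresh interval in seconds
use_gr_notifications=1   # written by --conf-use-gr-notifications
```

With `use_gr_notifications=1`, the `ttl` refresh becomes a fallback rather than the primary detection mechanism.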
5.3 Keyring Error (Keyring decryption failed)
systemctl stop mysqlrouter
find / -name '*.keyring' 2>/dev/null
rm -f /var/lib/mysqlrouter/*.keyring
rm -f /var/lib/mysqlrouter/keyring*
rm -f /etc/mysqlrouter/*.key
mysqlrouter --bootstrap clusterAdmin@master_node1:3306 \
--account routeruser --user=mysqlrouter --force
5.4 mysql_native_password Warning
`Failed changing the authentication plugin: mysql_native_password is deprecated` is a warning, not an error. In MySQL 8.4 the plugin is disabled by default. Upgrade the account cleanly:
ALTER USER 'routeruser'@'%' IDENTIFIED WITH caching_sha2_password BY '<password>';
Step 6: Verification and Monitoring
6.1 ClusterSet Health
mysqlsh clusterAdmin@master_node1:3306
var clusterSet = dba.getClusterSet()
clusterSet.status({extended: 1})
Expected:
{
"primaryCluster": "morpheus-cluster",
"status": "HEALTHY",
"replicaClusters": {
"MorphDrCluster": {
"clusterRole": "REPLICA",
"clusterSetReplicationStatus": "OK",
"transactionSetConsistencyStatus": "OK"
}
}
}
If transactionSetConsistencyStatus is anything other than OK (especially INCONSISTENT), stop and investigate before letting traffic continue.
6.2 Replication Channel Queries
A handful of queries cover the vast majority of troubleshooting on a running ClusterSet:
-- 1) ClusterSet replication channel status (run on DR primary)
SELECT CHANNEL_NAME, SERVICE_STATE, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE
FROM performance_schema.replication_connection_status
WHERE CHANNEL_NAME = 'clusterset_replication';
-- 2) Applier worker status (look for LAST_ERROR on any worker)
SELECT CHANNEL_NAME, WORKER_ID, SERVICE_STATE, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE
FROM performance_schema.replication_applier_status_by_worker
WHERE CHANNEL_NAME = 'clusterset_replication';
-- 3) GTID gap check: the DR executed set should be a subset of Master's
SELECT @@global.gtid_executed; -- run on both, compare
-- 4) Classic lag view (still useful for a quick eyeball)
SHOW REPLICA STATUS\G
-- Seconds_Behind_Source (8.0.22+) / Seconds_Behind_Master (older)
-- 5) Group Replication member view
SELECT MEMBER_HOST, MEMBER_ROLE, MEMBER_STATE
FROM performance_schema.replication_group_members
ORDER BY MEMBER_ROLE;
6.3 Router Registration and Ports
clusterSet.listRouters()
| Router | Hostname | R/W Port | R/O Port |
|---|---|---|---|
| Master Router | `<master_server>` | 6446 | 6447 |
| DR Router | `<dr_server>` | 6446 | 6447 |
ss -tlnp | grep mysqlrouter
# LISTEN 0 70 0.0.0.0:6446 # R/W
# LISTEN 0 70 0.0.0.0:6447 # R/O
6.4 Metrics to Pipe Into Prometheus
⚠️ Exporter version caveat: the metric names below are indicative and follow `prometheus/mysqld_exporter` historical naming. Metric names have changed across versions (e.g. `slave` → `replica` in newer releases, and some Group Replication metrics moved between collector flags). Before copy-pasting into your alerts, verify the exact names against the exporter version you run. Query your Prometheus with `{__name__=~"mysql_.*(replica|slave).*"}` and `{__name__=~"mysql_.*group_replication.*"}` and adjust. You may also need to enable collectors like `--collect.slave_status` / `--collect.replica_status` and `--collect.perf_schema.replication_group_member_stats`.
| Metric (indicative name; verify against your exporter) | Threshold | Action |
|---|---|---|
| `mysql_slave_status_seconds_behind_master` (or `mysql_replica_status_seconds_behind_source` on newer exporters) | > 10 s sustained | Check WAN bandwidth, DR apply throughput |
| `mysql_perf_schema_replication_group_member_info{member_role="PRIMARY"}` or equivalent | Must match expected primary | Alert on unexpected primary change |
| `mysql_up` on any node | = 0 for > 60 s | Page on-call |
| `transactionSetConsistencyStatus` (via custom exporter scraping `clusterSet.status()`) | != `OK` | Freeze writes, investigate |
The last row does not have a built-in exporter. A common pattern is a small sidecar that periodically runs mysqlsh --js -e "print(JSON.stringify(dba.getClusterSet().status({extended:1})))" and exposes the parsed fields as a Prometheus textfile.
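A minimal sketch of that sidecar's parsing step, assuming `jq` is installed and the JSON shape shown in §6.1. The metric name and helper name are made up for this sketch; pick your own conventions:

```shell
# Hypothetical sidecar step: turn clusterSet.status() JSON (on stdin) into
# Prometheus textfile-collector lines, one per replica cluster.
clusterset_to_textfile() {
  jq -r '
    (.replicaClusters // {}) | to_entries[] |
    "mysql_clusterset_replication_ok{cluster=\"\(.key)\"} " +
    (if .value.clusterSetReplicationStatus == "OK" then "1" else "0" end)
  '
}

# Cron / systemd-timer usage (path assumes node_exporter textfile collector):
# mysqlsh --js -e 'print(JSON.stringify(dba.getClusterSet().status({extended:1})))' \
#   | clusterset_to_textfile > /var/lib/node_exporter/textfile/clusterset.prom
```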
Step 7: Failover Procedures
7.1 Planned Switchover
For maintenance windows, DC migrations, and version upgrades. Zero data loss.
var cs = dba.getClusterSet()
// Promote DR
cs.setPrimaryCluster('MorphDrCluster')
// Switch back
cs.setPrimaryCluster('morpheus-cluster')
Realistic RTO is roughly 30 seconds to 2 minutes, dominated by Router metadata refresh and the in-flight transaction drain. With --conf-use-gr-notifications enabled, the Router side is typically under 10 seconds.
7.2 Emergency Failover (forcePrimaryCluster)
Use this when Master is completely unreachable.
Warning: `forcePrimaryCluster()` skips consistency checks. Transactions committed on Master but not yet replicated to DR will be lost. Expect a small RPO measured in seconds, not zero.
mysqlsh clusterAdmin@dr_node1:3306
var cs = dba.getClusterSet()
cs.forcePrimaryCluster('MorphDrCluster')
Critical: Fence the Old Master
Before or immediately after the force failover, make sure the old Master cannot accept writes when it comes back up. Otherwise the application may briefly hit the old Master through its Router and create divergent data. The preferred approach is to use the AdminAPI fencing commands (MySQL Shell 8.0.27 and later). They coordinate the fence across every node in the old Master cluster and instruct Router to stop routing to it, rather than relying on each DBA to toggle super_read_only by hand.
// Connect to ANY reachable node of the old (ex-primary) cluster
mysqlsh clusterAdmin@master_node1:3306
var cluster = dba.getCluster()
// Hard fence: blocks BOTH writes and reads via Router. Use when you want
// the old cluster completely out of traffic while you investigate.
cluster.fenceAllTraffic()
// Softer alternative: only blocks writes, keeps the cluster available for R/O.
// Useful if the old Master can safely keep serving reads from stale data.
// cluster.fenceWrites()
// After you've repaired the old Master and are ready to rejoin it as a REPLICA:
// cluster.unfenceWrites() // only needed if you used fenceWrites()
What the AdminAPI fencing actually does, per the official Shell docs:
- `fenceAllTraffic()` sets `super_read_only = ON` on every member, stops Group Replication on every member, and updates the metadata so Router stops routing any traffic to the cluster.
- `fenceWrites()` sets `super_read_only = ON` on the primary and updates metadata so Router stops sending R/W traffic but still allows R/O. The cluster keeps replicating from the new primary through ClusterSet.
- `unfenceWrites()` reverses `fenceWrites()`, so the cluster comes back online as a healthy replica.
If AdminAPI is unreachable (for example the old Master cluster is completely partitioned away), fall back in this order:
- Remove the old Master's VIP, DNS entry, or load balancer registration so the application can't reach it.
- Firewall-block ports 3306 and 6446 at the old site from the application subnet.
- As a last resort, `SET GLOBAL super_read_only = ON;` manually on each old Master node. This is a per-node toggle, easy to forget on one host, and it does not update ClusterSet metadata.
Skipping fencing is how you end up with a split-brain when the old Master briefly reconnects to the network.
After Master Recovers, Rejoin It
// Run from the DR primary (now the ClusterSet primary)
var cs = dba.getClusterSet()
cs.rejoinCluster('morpheus-cluster')
7.3 The Most Common Rejoin Failure: Errant GTIDs on the Old Master
If the old Master committed transactions during the outage that never made it to DR, rejoinCluster() will fail with an errant-GTID error. In practice this is by far the most common post-failover pain point.
Diagnose:
-- On old Master primary
SELECT @@global.gtid_executed;
-- On new primary (DR)
SELECT @@global.gtid_executed;
-- Compute the diff:
SELECT GTID_SUBTRACT(
'<old_master_executed>',
'<dr_executed>'
) AS errant_gtids;
Any non-empty errant_gtids means the old Master has transactions DR never saw. You have three options, ordered from safest to fastest:
- Extract and replay manually. Use `mysqlbinlog` on the old Master to dump only the errant GTIDs and then replay them against the new primary (DR). The slowest option, but no data loss.
# Step 1: On the OLD Master, find the binlog files covering the outage window
ls -lh /var/lib/mysql/binlog.*
# Step 2: Dump only the errant GTIDs to a SQL file.
# --include-gtids takes the exact GTID set printed by GTID_SUBTRACT above.
mysqlbinlog \
--read-from-remote-server \
--host=<old_master_ip> --port=3306 \
--user=clusterAdmin --password \
--include-gtids='<errant_gtids_from_GTID_SUBTRACT>' \
--skip-gtids=false \
binlog.000123 binlog.000124 binlog.000125 \
> errant.sql
# Step 3: Inspect errant.sql. Confirm the statements are safe to re-apply
# (no DROP/TRUNCATE you didn't expect, no migrations already run on DR).
less errant.sql
# Step 4: On the NEW primary (DR), replay as clusterAdmin
mysql -h dr_node1 -u clusterAdmin -p < errant.sql
# Step 5: Re-run the GTID diff from §7.3. errant_gtids should now be empty.
Why `--skip-gtids=false`: it preserves the original GTIDs so the new primary records them as executed. Otherwise a later `rejoinCluster()` of the old Master will still see them as errant.
- Accept the loss and re-clone. Treat the old Master cluster as disposable, `dissolve()` it, and re-add it as a new replica cluster against the current primary (DR). The fastest option, but loses the errant transactions.
- Forcibly reset and rejoin. On each old Master node:
STOP GROUP_REPLICATION;
RESET MASTER; -- MySQL 8.0.x; on 8.4+ use RESET BINARY LOGS AND GTIDS
-- Then rejoinCluster() and let clone re-seed
Only safe if you have the original data preserved and audited elsewhere. This wipes the entire GTID history on the target node.
This is why backups are non-negotiable. Option 1 depends on binary log retention that covers the outage window. Set `binlog_expire_logs_seconds` accordingly. Seven days is a reasonable floor for most shops.
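As a concrete starting point, a my.cnf fragment for that seven-day floor (tune upward for your binlog volume and disk budget):

```ini
[mysqld]
# 7 days = 604800 seconds. Retention must cover your worst-case
# outage-plus-recovery window, or option 1 in §7.3 stops being available.
binlog_expire_logs_seconds = 604800
```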
7.4 Router Behavior After Failover
| State | Master Router | DR Router |
|---|---|---|
| Normal | Master Cluster → R/W (6446) | Master Cluster → R/W (WAN) |
| After failover | DR Cluster → R/W (auto) | DR Cluster → R/W (local) |
No application connection string change is needed. Router reacts to ClusterSet topology changes via GR notifications and its metadata cache.
Architecture Summary
| Component | Master Site Role | DR Site Role |
|---|---|---|
| Cluster type | Primary (R/W) | Replica (R/O until failover) |
| Intra-cluster replication | Group Replication (sync) | Group Replication (sync) |
| Cross-site replication | Source (donor) | Async + TLS from Master |
| Failover method | `setPrimaryCluster()` (planned) | `forcePrimaryCluster()` (emergency) |
| Router behavior | Always routes to the primary | Auto-updates after failover |
| Realistic RTO | 30 s to 2 min (planned) | 2 to 5 min (forced, includes fence + rejoin) |
| RPO | Zero (planned) | Near-zero; non-zero on force (typically 1 to 5 s) |
Operational Tips and Best Practices
A switchover drill every month is the single highest-leverage habit. Failover should never be tested for the first time during a real disaster; a monthly setPrimaryCluster() round-trip proves the procedure still works and keeps on-call muscle memory sharp.
Log clusterSet.status({extended: 1}) every 60 seconds to a file. A simple cron-scraped JSON log often beats "enterprise" monitoring when you are trying to reconstruct what happened during an incident.
Put Routers behind a load balancer. A single Router IP is a single point of failure. At minimum, run two Routers per site behind HAProxy, Keepalived, or a cloud load balancer.
Back up independently of the cluster. ClusterSet protects against infrastructure failure, not logical failure. A DROP TABLE replicates. Use MySQL Enterprise Backup or Percona XtraBackup on an independent schedule, ideally to a third location that is neither Master nor DR.
Keep the rolling approach for MySQL version upgrades: upgrade DR first, observe for a week, switch over, upgrade the old Master, then switch back.
Watch cross-site RTT. Async replication tolerates latency well, but GTID apply throughput drops meaningfully above roughly 50 ms RTT for write-heavy workloads. Benchmark before assuming it will work for your traffic shape.
Rotate the internal mysql_innodb_cs_* replication account credentials on a schedule. AdminAPI creates them with strong random passwords but never rotates them on its own. Build a runbook for the rotation.
Keep binlog_expire_logs_seconds generous. Seven days is a reasonable floor for most production environments. It is the difference between manual recovery (option 1 in §7.3) and data loss (option 2).
Don't share credentials between clusterAdmin and routeruser. They have different privilege needs, and scoping them tightly limits blast radius.
Document your fencing procedure before you need it. When a real emergency failover happens, you do not want to be figuring out "how do we stop the old Master from accepting writes" on the fly.
Wrap-Up
InnoDB ClusterSet is a production-grade multi-site DR solution that can be introduced into a live environment without downtime, provided you respect three disciplines:
- Rolling restart discipline, so you never break quorum.
- Clean DR teardown followed by a TLS-hardened ClusterSet creation. Use `dissolve()` (or the fallback `dropMetadataSchema()`) to clean DR, then create the ClusterSet with `clusterSetReplicationSslMode: 'REQUIRED'` (or `VERIFY_IDENTITY` on Shell 8.0.33+) and a scoped `replicationAllowedHost`. Never use `%` in production.
Get those three right and your application keeps talking to the same Router address while your infrastructure is ready for the worst case: a full data center loss.