ScaleGrid for ScaleGrid

Posted on Jan 8, 2019 • Edited on Dec 26, 2019 • Originally published at scalegrid.io

MySQL High Availability Framework Explained – Part II

#mysql #replication #highavailability #database

In Part I, we introduced a High Availability (HA) framework for MySQL hosting and discussed various components and their functionality. Now in Part II, we will discuss the details of MySQL semisynchronous replication and the related configuration settings that help us ensure redundancy and consistency of the data in our HA setup. Make sure to check back in for Part III where we will review various failure scenarios that could arise and the way the framework responds and recovers from these conditions.

What is MySQL Semisynchronous Replication?

Simply put, in a MySQL semisynchronous replication configuration, the master commits transactions to the storage engine only after receiving acknowledgement from at least one of the slaves. The slaves would provide acknowledgement only after the events are received and copied to the relay logs and also flushed to the disk. This guarantees that for all transactions committed and returned to the client, the data exists on at least 2 nodes. The term ‘semi’ in semisynchronous (replication) is due to the fact that the master commits the transactions once the events are received and flushed to relay log, but not necessarily committed to the data files on the slave. This is in contrast to fully synchronous replication, where the transaction would have been committed on both the slave and the master before the session returns to the client.

Semisynchronous replication, which is natively available in MySQL, helps the HA framework to ensure data consistency and redundancy for committed transactions. In the event of a master failure, all transactions committed on the master would have been replicated to at least one of the slaves (saved to the relay logs). As a result, failover to that slave would be lossless because the slave is up to date (after the relay logs of the slave are fully drained).

Replication and Semisynchronous Related Settings

Let’s discuss some of the key MySQL settings used to ensure optimal behavior for high availability and data consistency in our framework.

Managing The Execution Speed of The Slaves

The first consideration is to handle the ‘semi’ behavior of semisynchronous replication which only guarantees that the data has been received and flushed to the relay logs by the I/O thread of the slave, but not necessarily committed by the SQL thread. By default, the SQL thread in a MySQL slave is single-threaded and will not be able to keep pace with the master which is multi-threaded. The obvious impact of this is that in the event of a master failure, the slave will not be up-to-date as its SQL thread is still processing the events in the relay log. This will delay the failover process as our framework expects the slave to be fully up-to-date before it can be promoted. This is necessary to preserve data consistency. To address this issue, we enable multi-threaded slaves with the option slave_parallel_workers to set the number of parallel SQL threads to process events in the relay logs.

In addition, we configure the below settings which ensure that the slave does not enter any state that the master was not in:

slave-parallel-type = LOGICAL_CLOCK
slave_preserve_commit_order = 1

This provides us with stronger data consistency. With these settings, we will be able to get better parallelization and speed on the slave, but if there are too many parallel threads, the overhead involved in coordinating between the threads will also increase and can unfortunately offset the benefits.

Another configuration we can use to increase the efficiency of parallel execution on the slaves is to tune binlog_group_commit_sync_delay on the master. By setting this on master, the binary log entries on the master and hence the relay log entries on the slave will have batches of transactions that can be processed parallelly by the SQL threads. This is explained in detail in J-F Gagné's blog where he refers to this behavior as ‘slowing down the master to speed-up the slave’.

If you’re managing your MySQL deployments through the ScaleGrid console, you have the ability to continuously monitor and receive real-time alerts on the replication lag of the slaves. It also allows you to dynamically tune the above parameters to ensure the slaves are working hand in hand with the master, therefore, minimizing your time involved in a failover process.

Important Semisynchronous Replication Options

MySQL semisynchronous replication, by design, can fall back to asynchronous mode based on the slave acknowledgement timeout settings or based on the number of semisynchronous-capable slaves available at any point in time. Asynchronous mode, by definition, does not provide guarantees that committed transactions are replicated to the slave and hence a master loss would lead to losing the data that has not been replicated. The default design of the ScaleGrid HA framework is to avoid falling back to asynchronous mode. Let’s review the configurations that influence this behavior.

rpl_semi_sync_master_wait_for_slave_count

This option is used to configure the number of slaves that must send an acknowledgment before a semisynchronous master can commit the transaction. In the 3-node master-slave configuration, we set this to 1 so we always have an assurance that the data is available in at least one slave while avoiding any performance impact involved in waiting for acknowledgment from both of the slaves.
rpl_semi_sync_master_timeout

This option is used to configure the amount of time that a semisynchronous master waits for acknowledgment from a slave before switching back to asynchronous mode. We set this to a relatively high timeout value so there is no fallback to asynchronous mode.

Since we are operating with 2 slaves and the rpl_semi_sync_master_wait_for_slave_count is set to 1, we’ve noticed that at least one of the slaves does acknowledge within a reasonable amount of time and the master does not switch over to asynchronous mode during temporary network disruptions.
rpl_semi_sync_master_wait_no_slave

This controls whether the master waits for the timeout period configured by rpl_semi_sync_master_timeout to expire, even if the slave count drops to less than the number of slaves configured by rpl_semi_sync_master_wait_for_slave_count during the timeout period. We retain the default value of ON so that the master does not fall back to asynchronous replication.

Impact of Losing All The Semisynchronous Slaves

As we saw above, our framework prevents the master from switching to asynchronous replication if all the slaves go down or become unreachable from the master. The direct impact of this is that writes get stalled on the master impacting the availability of the service. This is essentially as described by the CAP theorem about the limitations of any distributed system. The theorem states that, in the presence of a network partition, we will have to choose either availability or consistency, but not both. Network partition, in this case, can be considered as MySQL slaves disconnected from the master because they are either down or unreachable.

Our consistency goal is to ensure that for all committed transactions, the data is available on at least 2 nodes. As a result in such cases, the ScaleGrid HA framework favors consistency over availability. Further writes will not be accepted from clients though the MySQL master will still be serving the read requests. This is a conscious design decision we have made as the default behavior which is, of course, configurable based on the application requirements.

Make sure to subscribe to the ScaleGrid blog so you don’t miss Part III where we will discuss more failure scenarios and recovery abilities of the MySQL HA framework. Stay tuned!!