In this blog we will go over the basics of replication in MySQL, and why it's useful.
What is Replication?
Replication is the ability to have multiple copies of the same data stored on different machines.
Replication in MySQL:
Allows the synchronization of two servers state, one server is referred to as the “master” and the other “slave”. But for now we will be referring to them as “parent” and “child”. When using replication your app can connect to a child in order to read its data, but can only modify this data on the parent server. Here’s a little overview of how statement replication works:
The client will first connect to the parent server and will execute the modifying statement. After it’s executed it will be written to a binlog file, the parent server then sends a response to the client and continues with other transactions.
Parent:
The parent records all modifying statements (creating, deleting, updating statements) to what is called a binlog. Every statement is saved with a timestamp, sequence number. Once the statement is in the binlog it can then be sent to the child.
Child:
When the child server is connected to the parent, it can ask for the updated binlog file. During this process the child sends the last sequence number it saw, the parent sends its binlog from that number. The child then saves these statements into its own copy called relay log, once it's written onto it, it is executed on its data set and the last seen sequence number is updated.
Because replication is asynchronous, this means that the parent server doesn’t wait for the child server to get the statements copied. The parent will continue to write statements to its binlog even if the child server is not connected. Asynchronous replication also allows for decoupling meaning that a parent can connect or disconnect a child anytime without affecting the parent. The child makes sure to keep track of the last statement sequence number so it can retrieve the correct statements. Once a child is disconnected from the parent, the parent will forget all about that child. Since MySQL replication is asynchronous, it allows decoupling which means the master can connect and disconnect slaves anytime without it affecting the master, and the master does not need to keep track of its slaves, which allows for interesting topologies. Some of these topologies are:
Having Many Children:
Instead of having only one child server you can create many child replicas and distribute read queries around them. By having multiple children the children don't interact with each other, and can be disconnected or connected at any time without affecting other servers. By having multiple children, you can:
- Distribute read only statements among more servers. This is scaling by adding clones applied to database engines. You can use different children for different queries, for example one would be for regular application queries while another one is for long-running reports. By having a separate server you can insulate your app from input/output intensive queries.
- Use MySQL replication to perform 0 downtime backups. All you need to do to backup a child machine is shut down the MySQL process, copy files to your archive location, and then start MySQL again. As soon as it connects back it will catch up with missed statements.
- If one of the child servers stops working, you can stop sending requests to that server which will put it out of rotation until it is rebuilt. Losing a slave server doesn’t affect other servers since all the information saved by them is available to the master or other servers. When a server fails it has to be detected on the client side, so you can implement it with your own app logic or use a smart proxy/load balancer to detect it.
Master-Master:
In this scenario you have two servers in which they both accept writes, there is a parent A and a parent B. Parent A replicates from parent B and vice versa. Here we have all writes being sent to parent A, where they are being recorded to its binlog. Parent B replicates the same writes to its relay log and executes them to its own copy of data. Parent B then writes these statements to its own binlog in case other children want to replicate them. The same way parent B did this, parent A will also replicate parent B’s statements from its binlog by adding them to its own relay log, executing them and then logging them to its binlog.
- This topology can be used for faster master failover and more transparent maintenance. If parent A fails at any point, your app can be configured to to direct all writes to parent B.
- Switch between groups with minimal downtime. You can create two identical server groups by having parent A and parent B having the same amount of children. This allows your app to run with equal capacity using either groups. If parent A fails you can switch to parent B's group. This also helps with updating software or hardware on your parent databases. You can start upgrading parent B first and its children, once it is done stop the writes going to parent A which starts the downtime. Wait for all writes in parent A replicate to parent B. Once that is done you can direct all writes to parent B while you upgrade parent A and its children
Thank you for reading!!!
I hope this small overview of replication helps you :)
Top comments (1)
Great overview. I have implemented them mysql