Replication Principles

Aug 16, 2009 Author: SQLmaster

Replication can take various forms. The MySQL implementation is a directional master-slave relationship. One server is called the master. One or more other servers are called slaves to a particular master. The master controls what data is stored in the system while the slaves try to mirror that content.

The replication process relies on the binary log on the master. This log stores details of every query executed on the server since logging was enabled. Slaves are sent queries from the master's binary log to apply to their own stored data.

You would generally perform all write operations directly on the master and share read operations among all the slaves or even the master and the slaves. This is usually achieved by designing this logic into your application.

It is important to note that if you are adding replication to an existing database with stored data, the binary log may be incomplete. Binary logging is not enabled by default, so the server may not have been logging when you began adding data to the system. To start replicating, you need all slaves to have the same data that the master had when binary logging began. We will look at this subject in more detail later in this chapter.

After being started with consistent data, the slaves connect to the master and apply any changes appearing in the master's binary log to their own data. A thread on the slave connects to a thread on the master and requests new events. These are stored in a relay log on the slave. A separate thread on the slave reads events from the local relay log and executes the queries on the local mirror of the data.

Because the master and slaves can start at different times (because new servers can be added to the system while queries are being performed and because network connections can fail or become a bottleneck), slaves need to be able to keep track of where they are in the log of updates to be performed. It is important that atomic transactions are honored and updates are performed in order. For most applications, as long as the database moves from one consistent state to another, it is less important if the data being read is a few seconds or minutes out of date.

If you are considering using replication, you need to understand how it works. Updates are asynchronous and do not happen in real time. Queries sent to different servers can give different results for some time after an update is made. This can be seen as a negative, but the positive side is that if you have a slave running on a portable device or an unreliable network, it will happily operate for long periods between updating data from the master.

A Note on Versions Replication was added in a fairly recent version of MySQL (3.23.15). It is therefore a feature that still improves with nearly every version that is released. If you intend to use replication, it would be a good idea to be using an up-to-date version of MySQL on all machines.

It is possible to have some combinations of versions running in master-slave relationships together, but this adds an extra level of uncertainty and is best avoided where possible. You can run into problems in which functions, such as PASSWORD(), have changed between versions if you try to make different versions work together.

If you really need to have different versions on the same system, a matrix of combinations of master and slave versions that can work together is available in the documentation

views 4009
  1. Add New Comment