What happens to your MongoDB replica set during failures such as network partitions, restarts, or reconfiguration of the existing topology? The question is especially important these days because of the popularity of the multi-cloud model, where such scenarios are quite realistic.
However, is there a solution, preferably a free one, for testing such cases that would obviate the need for writing manual scripts and poring over the official documentation? As software developers, we would be better off preparing our applications in advance to survive these failures.
Looking at MongoDB Atlas and its free Shared Clusters, we cannot reconfigure the replica set given by default, which is a Primary with Two Secondary Members (hereinafter the P-S-S), not to mention test features like restartPrimaries. All of that is reserved for Dedicated Clusters, which are billed at hourly rates.
While cloud solutions seem to be the de facto choice for production systems, some software developers are likely to need free local alternatives, as evidenced by projects like LocalStack, act, etc.
Given that, what about developing a solution that runs locally some of the functionality offered by the paid version of MongoDB Atlas? One possible way is to opt for the following:
- Docker to run MongoDB containers;
- Testcontainers to handle the containers via a programming language like Java;
- Toxiproxy to simulate network conditions.
All good, but to build a cluster of more than one node, MongoDB requires that "each member of a replica set is accessible by way of resolvable DNS or hostnames". If you run Docker version 18.03+ on hosts like Mac and Win or Docker 20.10+ on Linux, you are free to use the special DNS name host.docker.internal. During installation Docker might add it to your OS host file. However, for those who cannot upgrade their Docker for some reason, we can employ a special container which redirects traffic to the host, for instance Qoomon docker-host. In that case, we add dockerhost to the OS host file and run the container with the NET_ADMIN and NET_RAW kernel capabilities.
The good news is that we do not have to connect all the above-mentioned pieces together manually, but instead can take the free MongoDBReplicaSet.
Let us get started by creating a new Gradle project with Java and adding some dependencies from Maven Central:
testImplementation("com.github.silaev:mongodb-replica-set:0.4.3")
testImplementation("org.mongodb:mongodb-driver-sync:4.2.1")
1) Note that it is up to you which MongoDB driver to use here; mongodb-driver-sync is just an example.
Then we need to check and possibly change our OS host file. As for host.docker.internal, it might already be there; otherwise add 127.0.0.1 host.docker.internal. As for dockerhost, add 127.0.0.1 dockerhost. You only need one of the two; use the same one within your test execution.
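Before running the test, it may help to confirm that the chosen alias actually resolves on your machine. The following is a minimal sketch using only the JDK; the class and method names are made up for illustration and are not part of MongoDBReplicaSet.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: checks which host alias the OS resolver knows about.
final class HostAliasCheck {

    /** Returns true if the OS resolver (host file or DNS) can resolve the name. */
    static boolean resolves(String hostName) {
        try {
            InetAddress.getByName(hostName);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /** Returns the first candidate that resolves, if any. */
    static Optional<String> firstResolvable(List<String> candidates) {
        return candidates.stream().filter(HostAliasCheck::resolves).findFirst();
    }

    public static void main(String[] args) {
        Optional<String> alias =
            firstResolvable(List.of("host.docker.internal", "dockerhost"));
        System.out.println(alias
            .map(a -> "Resolvable alias: " + a)
            .orElse("Neither alias resolves; check the OS host file"));
    }
}
```

If the result is host.docker.internal, useHostDockerInternal(true) is the matching builder setting; if it is dockerhost, pass false.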
Our journey begins: we are ready to write a test that simulates network partitioning in the P-S-S. Here is the replica set configuration:
try (
    final MongoDbReplicaSet mongoReplicaSet = MongoDbReplicaSet.builder()
        .mongoDockerImageName("mongo:4.4.4")
        .useHostDockerInternal(true)
        .addToxiproxy(true)
        .replicaSetNumber(3)
        .commandLineOptions(Arrays.asList("--oplogSize", "50"))
        .build()
) {
1) Use mongo:4.4.4, the latest MongoDB Docker image at the time of writing;
2) If useHostDockerInternal is true, use Docker's host.docker.internal; otherwise use dockerhost of Qoomon docker-host;
3) Put a ToxiproxyContainer.ContainerProxy in front of each MongoDB node;
4) Set 3 members (up to 7 are possible) to construct the P-S-S;
5) Optionally, add some command line options, for example set the size of the replication operation log (oplog) to 50 MB;
6) Auto-close all the containers via try-with-resources on completion or on any exception.
Now we can start the replica set with mongoReplicaSet.start(), get its URL and make some assertions:
final String replicaSetUrl = mongoReplicaSet.getReplicaSetUrl();
assertThat(
    mongoReplicaSet.nodeStates(mongoReplicaSet.getMongoRsStatus().getMembers())
).containsExactlyInAnyOrder(PRIMARY, SECONDARY, SECONDARY);
1) Internally, getMongoRsStatus() calls rs.status() in the MongoDB shell.
Then we can, for instance, create a MongoClient to insert some data and subsequently assert it (see the full example on GitHub).
try (
    final MongoClient mongoSyncClient = MongoClients.create(new ConnectionString(replicaSetUrl))
) {
1) Note that we can also pass the more flexible MongoClientSettings to the create method in order to set timeouts and read/write concerns or to turn off retries at the connection level.
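As an alternative to MongoClientSettings, the same kind of settings can be expressed as standard MongoDB connection string options (w, journal, readConcernLevel, retryWrites, retryReads, serverSelectionTimeoutMS). The sketch below appends such options to a URL using only the JDK. The helper class is hypothetical, and the URL in main is a made-up placeholder, since the exact format returned by getReplicaSetUrl() is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: appends options to a MongoDB connection string.
final class ConnectionStringOptions {

    /** Appends the options, inserting "/?", "?" or "&" as the URL requires. */
    static String withOptions(String url, Map<String, String> options) {
        if (options.isEmpty()) {
            return url;
        }
        // Values are not URL-encoded; that is fine for simple values like these.
        String query = options.entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining("&"));
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) {
            throw new IllegalArgumentException("Not a connection string: " + url);
        }
        int pathStart = url.indexOf('/', schemeEnd + 3);
        if (pathStart < 0) {
            return url + "/?" + query;   // host list only
        }
        if (url.indexOf('?', pathStart) < 0) {
            return url + "?" + query;    // path present, no options yet
        }
        boolean needsSeparator = !(url.endsWith("?") || url.endsWith("&"));
        return url + (needsSeparator ? "&" : "") + query;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("w", "majority");
        options.put("journal", "true");
        options.put("readConcernLevel", "majority");
        options.put("retryWrites", "false");
        options.put("retryReads", "false");
        options.put("serverSelectionTimeoutMS", "5000");
        // Placeholder URL for illustration only.
        System.out.println(withOptions("mongodb://host.docker.internal:27017/test", options));
    }
}
```

The helper does not detect options that are already present in the URL, so it is only suitable when the keys being added are new.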
Here comes the first failure: let our replica set survive the disconnection of the master node:
// TODO: Insert a document here to assert total number at the end
final MongoNode masterNodeBeforeFailure1 = mongoReplicaSet.getMasterMongoNode(
    mongoReplicaSet.getMongoRsStatus().getMembers()
);
mongoReplicaSet.disconnectNodeFromNetwork(masterNodeBeforeFailure1);
mongoReplicaSet.waitForMasterReelection(masterNodeBeforeFailure1);
assertThat(
    mongoReplicaSet.nodeStates(mongoReplicaSet.getMongoRsStatus().getMembers())
).containsExactlyInAnyOrder(PRIMARY, SECONDARY, DOWN);
1) We wait for a new master node to be elected by passing the previous master node to the waitForMasterReelection(...) method.
Going further, the next accident cuts off the newly elected master node:
// TODO: Insert a document here to assert total number at the end
final MongoNode masterNodeBeforeFailure2 = mongoReplicaSet.getMasterMongoNode(
    mongoReplicaSet.getMongoRsStatus().getMembers()
);
mongoReplicaSet.disconnectNodeFromNetwork(masterNodeBeforeFailure2);
mongoReplicaSet.waitForMongoNodesDown(2);
assertThat(
    mongoReplicaSet.nodeStates(mongoReplicaSet.getMongoRsStatus().getMembers())
).containsExactlyInAnyOrder(SECONDARY, DOWN, DOWN);
1) We wait until our single remaining secondary detects that the other 2 nodes are down.
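The SECONDARY, DOWN, DOWN outcome follows from MongoDB's election rule: a primary needs the votes of a majority of the voting members, so a lone survivor of a three-member set cannot be elected. The arithmetic can be sketched as follows; the class is purely illustrative and not part of any library.

```java
// Illustrative arithmetic for replica set elections (not library code).
final class ReplicaSetMajority {

    /** Votes needed to elect a primary: more than half of the voting members. */
    static int majority(int votingMembers) {
        if (votingMembers < 1) {
            throw new IllegalArgumentException("At least one voting member is required");
        }
        return votingMembers / 2 + 1;
    }

    /** How many voting members may be unreachable while a primary can still be elected. */
    static int faultTolerance(int votingMembers) {
        return votingMembers - majority(votingMembers);
    }

    /** True if the reachable members are enough to elect a primary. */
    static boolean canElectPrimary(int votingMembers, int reachableMembers) {
        return reachableMembers >= majority(votingMembers);
    }

    public static void main(String[] args) {
        for (int members : new int[] {3, 5, 7}) {
            System.out.printf("members=%d majority=%d faultTolerance=%d%n",
                members, majority(members), faultTolerance(members));
        }
    }
}
```

For the P-S-S, the majority is 2 and the fault tolerance is 1, which is why the first disconnection still yields a primary and the second one does not.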
Our journey is drawing to a close, so let us bring all the disconnected nodes back for a happy ending:
// TODO: Insert a document here to assert total number at the end
mongoReplicaSet.connectNodeToNetwork(masterNodeBeforeFailure1);
mongoReplicaSet.connectNodeToNetwork(masterNodeBeforeFailure2);
mongoReplicaSet.waitForAllMongoNodesUp();
mongoReplicaSet.waitForMaster();
assertThat(
    mongoReplicaSet.nodeStates(mongoReplicaSet.getMongoRsStatus().getMembers())
).containsExactlyInAnyOrder(PRIMARY, SECONDARY, SECONDARY);
1) The waitForAllMongoNodesUp(...) method waits for all the disconnected nodes to be up and running;
2) Then the waitForMaster() method waits for the elections to complete.
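The library's wait methods cover node states. For conditions of your own, such as waiting until a collection reaches an expected document count, a generic polling helper might be enough. The sketch below is an assumption about how such a helper could look; it is not how MongoDBReplicaSet implements its waits.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper for custom test conditions.
final class Await {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean until(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        final long start = System.nanoTime();
        boolean met = until(
            () -> System.nanoTime() - start > 200_000_000L, // true after about 200 ms
            Duration.ofSeconds(2),
            Duration.ofMillis(50)
        );
        System.out.println(met ? "condition met" : "timed out");
    }
}
```

The condition is always checked once more after the last sleep, so a condition that becomes true right at the deadline is still observed.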
Conclusion
Why write such tests? To address this question, let us set the write concern to majority with journaling enabled and the read concern to majority as well. Then we can replace each // TODO:... in the code examples above with a call to the mongoSyncClient.insertOne(…) method, handling possible exceptions, to add a new document 3 times. Running this test 120 times and waiting for a while at the end, I found that approximately half the time the total number of documents was 2 and the other half it was 3. Therefore, the idea behind these tests is to be ready for such corner cases beforehand.
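One way to reason about a result like this is to treat every insert that throws as having an unknown outcome: the write may or may not have been applied before the failure. The sketch below keeps such a tally using only the JDK. The class is hypothetical, and the bounds rest on the assumption that writes acknowledged with a majority write concern and journaling are not rolled back.

```java
import java.util.concurrent.Callable;

// Hypothetical tally of insert attempts made during failure scenarios.
final class InsertTally {
    private int acknowledged;
    private int uncertain;

    /** Runs one insert attempt; an exception leaves its outcome unknown. */
    void attempt(Callable<?> insert) {
        try {
            insert.call();
            acknowledged++;
        } catch (Exception e) {
            uncertain++;
        }
    }

    /** Lower bound on the final document count: acknowledged inserts only. */
    int minExpected() {
        return acknowledged;
    }

    /** Upper bound: every uncertain insert turned out to be applied. */
    int maxExpected() {
        return acknowledged + uncertain;
    }

    public static void main(String[] args) {
        InsertTally tally = new InsertTally();
        tally.attempt(() -> "acknowledged");
        tally.attempt(() -> "acknowledged");
        tally.attempt(() -> { throw new IllegalStateException("simulated failure"); });
        System.out.println("Expected total between " + tally.minExpected()
            + " and " + tally.maxExpected());
    }
}
```

In the test, each // TODO could become something like tally.attempt(() -> collection.insertOne(doc)), where collection and doc are placeholders, and the final assertion would check that the document count lies between minExpected() and maxExpected().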
Links:
- Find the example from this article here;
- MongoDBReplicaSet on GitHub.