Ever tried migrating a ScyllaDB cluster when traditional replication tools are off the table?
I went a little "mad scientist" and pulled off what I call a cluster brain transplant.
The idea: copy the raw data files while the source cluster keeps running, then cut over with minimal downtime.
Risky? Yes. Crazy? Definitely. But it worked — three times in a row. Here's the story of how I did it, why I had no other choice, and what I learned along the way.
Sometimes, traditional migration methods just don't work.
That was the situation I found myself in when moving a ScyllaDB cluster that was running inside managed Kubernetes. Two-way connectivity between the old and new clusters wasn't possible, which meant I had to come up with something non-traditional.
After a few experiments, I pulled off what I now call the "brain transplant" migration. It's not the official way, but it worked — and worked surprisingly well.
I migrated a production database with minimal downtime and 100% data consistency.
The Challenge
Normally, ScyllaDB migrations rely on tools like sstableloader or replication between clusters. But when your cluster runs in managed Kubernetes, networking and connectivity restrictions can get in the way. In my case, it wasn't possible to directly link the old and new clusters in both directions.
That left me with a crazy idea: what if I just copied the entire brain of the cluster — all the data files, commitlogs, and system state — into a brand-new cluster, and then carefully brought it to life?
The Migration Steps
Here's how it went down:
Prepare the destination cluster
This is a one-to-one copy solution, so the source and destination clusters must have the same number of nodes and identical configuration.
- Install the destination cluster. Follow the official ScyllaDB documentation to install a brand-new cluster.
- Configuration. The most crucial parameter is the cluster name. Make sure the source and destination clusters use the same name in their config files.
- Stop the destination. Shut down all nodes in the destination cluster and delete ALL data directories (a sketch of these steps follows below).
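For reference, here is a rough sketch of that preparation on each destination node. It assumes a systemd-managed Scylla install with the default paths (/etc/scylla/scylla.yaml and /var/lib/scylla) and a placeholder cluster name; adjust both to match your source cluster.

```bash
# Run on every destination node. Assumes systemd-managed Scylla and default paths.

# 1. Make sure cluster_name in scylla.yaml matches the SOURCE cluster exactly.
grep '^cluster_name' /etc/scylla/scylla.yaml
# e.g. sudo sed -i "s/^cluster_name:.*/cluster_name: 'my-source-cluster'/" /etc/scylla/scylla.yaml
#      ('my-source-cluster' is a placeholder - use your real source cluster name)

# 2. Stop Scylla on this node.
sudo systemctl stop scylla-server

# 3. Wipe ALL data directories so the rsync'ed files become the only state.
sudo rm -rf /var/lib/scylla/data/* \
            /var/lib/scylla/commitlog/* \
            /var/lib/scylla/hints/* \
            /var/lib/scylla/view_hints/*
```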
1. Full rsync copy
I started with a one-to-one rsync of all Scylla data from the source cluster to the destination cluster. This took a long time (not surprising, given the dataset size), but it was straightforward. Importantly, the source cluster stayed online and continued serving applications during this step. Here are the exact rsync commands I used, assuming we need to migrate three nodes:
On source node1: rsync -val --exclude 'commitlog/*' /var/lib/scylla/* destinationsrv1:/var/lib/scylla
On source node2: rsync -val --exclude 'commitlog/*' /var/lib/scylla/* destinationsrv2:/var/lib/scylla
On source node3: rsync -val --exclude 'commitlog/*' /var/lib/scylla/* destinationsrv3:/var/lib/scylla
2. Incremental rsync runs
After the initial heavy lift, I ran multiple incremental rsyncs. Each one was much faster than the last, because only changed data needed to be copied. Again, the source cluster kept working during this step, so downtime was still zero.
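The incremental passes are simply the same rsync command rerun, since rsync only transfers what changed. As a sketch, something like this loop (hostnames are placeholders matching the example above) can drive the passes from any host that can SSH to the source nodes:

```bash
# Re-run the same rsync from each source node to its matching destination node.
# sourcesrvN / destinationsrvN are placeholder hostnames.
for i in 1 2 3; do
  ssh "sourcesrv${i}" \
    "rsync -val --exclude 'commitlog/*' /var/lib/scylla/* destinationsrv${i}:/var/lib/scylla"
done
```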
3. The cutover
When it was time to switch, I stopped the applications pointing to the old cluster. The source cluster was still technically alive, but no longer serving traffic. This was the official "downtime" moment.
4. Booting up the new cluster
On the destination side, I started the seed node first, waited for it to come up, then started the remaining nodes one by one. This part took some patience. The logs were noisy with strange-looking system messages, but eventually all nodes settled down and came online.
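A minimal sketch of that boot sequence, assuming systemd-managed Scylla and that destinationsrv1 is configured as the seed node:

```bash
# On the seed node (destinationsrv1) first:
sudo systemctl start scylla-server
nodetool status          # wait until the seed reports UN (Up/Normal)

# Then on destinationsrv2, and only afterwards on destinationsrv3:
sudo systemctl start scylla-server
nodetool status          # wait for this node to show UN before starting the next one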
5. Validating the cluster
With all nodes running, nodetool status confirmed that the new cluster was healthy. I could connect with cqlsh, query some tables, and see real data.
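The validation itself was nothing fancy; roughly the following, where the keyspace and table names are placeholders for your own schema:

```bash
nodetool status                                            # all nodes should show UN
cqlsh -e "DESCRIBE KEYSPACES;"                             # schema came across with the data files
cqlsh -e "SELECT * FROM my_keyspace.my_table LIMIT 10;"    # spot-check real rows
```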
6. Repairing
To make sure everything was consistent, I ran nodetool repair on each destination node, one by one. This is a normal part of cluster maintenance, and it completed without errors.
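In practice this was a plain nodetool repair executed on each destination node in turn, for example:

```bash
# Run on destinationsrv1, then destinationsrv2, then destinationsrv3 - one at a time.
nodetool repair
# Optionally, primary-range repair on each node avoids repairing the same ranges twice:
# nodetool repair -pr
```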
7. Final shutdown of the source
Once I was confident the destination cluster was working correctly, I shut down the old Kubernetes-based cluster for good.
8. Phantom memories (The final step)
Since we did a brain transplant, the new system will have "phantom memories" of the previous nodes in the cluster. nodetool status will show a clean cluster with only the new nodes, and you will be able to read and write data, but you won't be able to make metadata changes or add and remove nodes.
The reason is that information about nodes from the source cluster, which no longer exist, is still present, and Scylla keeps trying to connect to them.
The symptoms are log messages like these:
scylla: [shard 0:main] raft_group_registry - (rate limiting dropped 2999 similar messages) Raft server id d9756728-be49-4cbf-8e2c-417aa8b917c1 cannot be translated to an IP address.
scylla: [shard 0:main] raft_group_registry - (rate limiting dropped 2999 similar messages) Raft server id e671084b-c41f-4eec-a73c-4c2eaf48ac38 cannot be translated to an IP address.
scylla: [shard 0:main] raft_group_registry - (rate limiting dropped 2999 similar messages) Raft server id 96edc6f8-4e36-4044-ab53-c0a95a3873f7 cannot be translated to an IP address.
Solution
Monitor the logs for messages like the ones above and run nodetool removenode ID on any cluster member node:
nodetool removenode d9756728-be49-4cbf-8e2c-417aa8b917c1
nodetool removenode e671084b-c41f-4eec-a73c-4c2eaf48ac38
nodetool removenode 96edc6f8-4e36-4044-ab53-c0a95a3873f7
On success, nothing is printed to stdout.
Continue monitoring the logs and remove all phantom nodes.
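If you want to automate the hunt, something like this pulls the unresolvable Raft server IDs out of the log so you can feed them to nodetool removenode. It assumes Scylla logs to journald; adjust the log source if yours differs.

```bash
# Extract the Raft server IDs that can no longer be resolved to an IP (assumes journald).
journalctl -u scylla-server --since "1 hour ago" \
  | grep -oP 'Raft server id \K[0-9a-f-]+' \
  | sort -u
# Then, for each ID that belongs to an old source node:
# nodetool removenode <ID>
```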
Manual recovery and Raft reset
This procedure is needed because the nodes have changed their IDs, but the Raft database has not been cleared. Even after removing the phantom nodes as described above, your cluster may (and most probably will) still keep the old nodes in Raft, so manual removal is required.
During this procedure your cluster will be rolling-restarted and put into RECOVERY mode.
Procedure
- Perform the following query on every alive node in the cluster, using e.g. cqlsh:
cqlsh> UPDATE system.scylla_local SET value = 'recovery' WHERE key = 'group0_upgrade_state';
- Perform a rolling restart of your alive nodes.
- Verify that all the nodes have entered RECOVERY mode when restarting; look for one of the following messages in their logs:
group0_client - RECOVERY mode.
raft_group0 - setup_group0: Raft RECOVERY mode, skipping group 0 setup.
raft_group0_upgrade - RECOVERY mode. Not attempting upgrade.
- Remove all your dead nodes using the node removal procedure.
- Remove existing Raft cluster data by performing the following queries on every alive node in the cluster, using e.g. cqlsh:
cqlsh> TRUNCATE TABLE system.topology;
cqlsh> TRUNCATE TABLE system.discovery;
cqlsh> TRUNCATE TABLE system.group0_history;
cqlsh> DELETE value FROM system.scylla_local WHERE key = 'raft_group0_id';
- Make sure that the schema is synchronized across the cluster by executing nodetool describecluster on each node and verifying that the schema version is the same on all nodes.
- We can now leave RECOVERY mode. On every alive node, perform the following query:
cqlsh> DELETE FROM system.scylla_local WHERE key = 'group0_upgrade_state';
- Perform a rolling restart of your alive nodes.
- The Raft upgrade procedure will start anew. Verify that it finishes successfully.
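Condensed into commands, the recovery cycle on each alive node looks roughly like this. It is only a sketch of the procedure above, assuming a three-node cluster, systemd-managed Scylla, and local cqlsh access on each node:

```bash
# 1. Enter RECOVERY mode on every alive node, then rolling-restart (one node at a time).
cqlsh -e "UPDATE system.scylla_local SET value = 'recovery' WHERE key = 'group0_upgrade_state';"
sudo systemctl restart scylla-server

# 2. After removing the dead nodes, clear the old Raft state on every alive node.
cqlsh -e "TRUNCATE TABLE system.topology;"
cqlsh -e "TRUNCATE TABLE system.discovery;"
cqlsh -e "TRUNCATE TABLE system.group0_history;"
cqlsh -e "DELETE value FROM system.scylla_local WHERE key = 'raft_group0_id';"

# 3. Check schema agreement, leave RECOVERY mode, and rolling-restart again.
nodetool describecluster      # schema version must match on all nodes
cqlsh -e "DELETE FROM system.scylla_local WHERE key = 'group0_upgrade_state';"
sudo systemctl restart scylla-server
```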
Why This Worked
At first glance, this approach sounds risky — copying live data files and commitlogs while the source cluster is still running. And yet, Scylla's design and eventual consistency model made it surprisingly resilient.
By repeatedly syncing the data and commitlogs, then repairing the new cluster after startup, I ended up with a clean and working copy. It's a bit like pausing a brain, moving it into a new body, and jump-starting it again.
Lessons Learned
- It's not the official method. This was a pragmatic hack, not a documented procedure. If you can use sstableloader or proper replication, do that instead.
- Incremental rsync is a lifesaver. Each run got faster and gave me confidence that the final cutover would be smooth.
- Expect noisy logs. Don't panic if the new nodes shout a lot when starting. Let them stabilize.
- Repair is mandatory. Running nodetool repair at the end ensures consistency across the new cluster.
Final Thoughts
Would I recommend this approach for everyone? Probably not. But in constrained environments, sometimes you need to think outside the box.
For me, the "brain transplant" worked — three times in fact, with consistent results.
It's one of those migration war stories worth sharing.
If you're ever stuck without traditional migration paths, maybe this story gives you a bit of inspiration (and courage) to try something unconventional.
✅ TL;DR: I migrated a ScyllaDB cluster by rsync'ing its data and commitlogs into a new cluster, booting it up, repairing it, and cutting over apps — a pragmatic "brain transplant" that worked when standard tools weren't an option.

