
Zepher Ashe

Ceph Public Network Migration on Proxmox (No Downtime)

172.16.0.0/16 → 10.50.0.0/24

No service downtime, no data loss

📌 Context

This procedure documents a live Ceph public network migration performed on a Proxmox-backed Ceph cluster.

The goal was to eliminate management-network congestion while maintaining cluster availability and data integrity.



🎯 Objective

Migrate all Ceph traffic (MON, MGR, MDS, OSD front + back) from a congested management network to a dedicated Ceph fabric (e.g. 2.5 GbE switch), while keeping the cluster healthy and online.


🧱 Key Concepts (Read Once)

public_network

  • Client ↔ OSD traffic
  • MON / MGR control plane
  • CephFS metadata traffic

cluster_network

  • OSD ↔ OSD replication & recovery (data plane)

Important behaviours

  • MON & MGR enforce address validation
  • OSDs bind addresses at restart
  • /etc/pve/ceph.conf is not authoritative on its own; Ceph also reads its internal config database (see ceph config dump)

1️⃣ Prepare the New Ceph Network

Create a dedicated bridge on each node (example: vmbr-ceph):

vim /etc/network/interfaces
# Ceph fabric bridge
auto vmbr-ceph
iface vmbr-ceph inet static
    address 10.50.0.20/24
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0

Assign IPs on the new subnet:

  • pve2 → 10.50.0.20/24
  • pve3 → 10.50.0.30/24
  • pve4 → 10.50.0.40/24

Ensure this network is isolated (no gateway required).
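Before touching Ceph itself, the address plan can be sanity-checked with a short script. A minimal sketch, assuming the three example node IPs above (adjust the list to your own fabric):

```shell
#!/bin/sh
# Confirm each planned node address sits in the new 10.50.0.0/24 fabric
# before any Ceph configuration is changed. The IPs are the examples
# from the plan above.
ok=0
for addr in 10.50.0.20 10.50.0.30 10.50.0.40; do
  case "$addr" in
    10.50.0.*) echo "$addr: OK"; ok=$((ok + 1)) ;;
    *)         echo "$addr: WRONG SUBNET" ;;
  esac
done
echo "$ok/3 addresses in 10.50.0.0/24"
```

A trivial check, but catching a fat-fingered subnet here is much cheaper than catching it after the MONs have been recreated.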

Verify connectivity

ping 10.50.0.30
iperf3 -s              # on one node
iperf3 -c <peer>       # on another node

2️⃣ Add the New Public Network (Dual-Network Phase)

NOTE: Back up the file first

cp /etc/pve/ceph.conf /etc/pve/ceph.conf.bak

Edit /etc/pve/ceph.conf:

public_network = 10.50.0.0/24, 172.16.0.0/16
cluster_network = 10.50.0.0/24, 172.16.0.0/16

⚠️ Do NOT remove the old network yet

Confirm:

  • Proxmox UI → Ceph → Nodes
  • ceph config dump
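The dual-network phase can also be confirmed from a script by checking that both subnets appear in the effective setting. A sketch, with a sample line standing in for the real ceph.conf entry (on a live node you would read the file or `ceph config dump` instead):

```shell
#!/bin/sh
# Verify both subnets are present during the dual-network phase.
# CFG mimics the public_network line from /etc/pve/ceph.conf.
CFG='public_network = 10.50.0.0/24, 172.16.0.0/16'
case "$CFG" in
  *10.50.0.0/24*172.16.0.0/16*) phase=dual ;;
  *)                            phase=incomplete ;;
esac
echo "network phase: $phase"
```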

3️⃣ Recreate MONs (One by One)

MONs validate their bind address against public_network, so each must be destroyed and recreated to move onto the new subnet.

For each node:

pveceph mon destroy <node>
pveceph mon create
ceph -s

✔ Ensure quorum after each step.
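The quorum check between MON recreations can be scripted rather than eyeballed. A sketch that parses a sample quorum_status-style fragment; on a live cluster you would feed it the output of `ceph quorum_status` instead of the hard-coded JSON:

```shell
#!/bin/sh
# Count MONs in quorum. The JSON fragment mimics the quorum_names field
# of `ceph quorum_status`; node names are the examples from this post.
QUORUM_JSON='{"quorum_names":["pve2","pve3","pve4"]}'
count=$(printf '%s' "$QUORUM_JSON" | tr ',' '\n' | grep -c pve)
echo "MONs in quorum: $count"
# With 3 MONs, quorum needs a majority (>= 2).
[ "$count" -ge 2 ] && echo "quorum OK" || echo "QUORUM AT RISK - stop and investigate"
```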


4️⃣ Recreate MGRs (One by One)

  • Recreate standby managers first
  • Leave the active manager for last
pveceph mgr destroy <node>
pveceph mgr create

Verify:

ceph mgr dump

🔧 Recovery Tip

If a manager fails to start:

systemctl reset-failed ceph-mgr@<node>
systemctl start ceph-mgr@<node>

5️⃣ Recreate CephFS Metadata Servers (MDS)

MDS daemons bind their address at creation time.

pveceph mds destroy <node>
pveceph mds create

✔ Verify CephFS health before proceeding.


6️⃣ Remove the Old Public Network

Edit /etc/pve/ceph.conf and remove 172.16.0.0/16:

public_network = 10.50.0.0/24
cluster_network = 10.50.0.0/24

7️⃣ Recreate MONs, MGRs, and MDS (Again)

This ensures all control-plane daemons bind exclusively to the new network.

Order:

  1. MONs (one by one)
  2. MGRs (standbys first, active last)
  3. MDS (one by one)
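The full recreation order can be sketched as a dry-run loop. This prints the commands instead of running them (drop the leading `echo` on a real node); node names are the examples used in this post, and note the loop does not encode the MGR standby-before-active ordering, which you must still apply by hand:

```shell
#!/bin/sh
# Dry-run of the control-plane recreation order: all MONs first,
# then MGRs, then MDS, one node at a time.
plan=""
for daemon in mon mgr mds; do
  for node in pve2 pve3 pve4; do
    echo "pveceph $daemon destroy $node && pveceph $daemon create"
    plan="$plan $daemon:$node"
  done
done
```

Between every destroy/create pair, check `ceph -s` before moving on, exactly as in the earlier per-daemon sections.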

8️⃣ Protect the Cluster Before Touching OSDs

ceph osd set noout

9️⃣ Restart OSDs (Data Plane Migration)

Restart one OSD at a time:

systemctl restart ceph-osd@<id>
ceph -s

Wait for:

PGs: active+clean

Repeat for all OSDs.
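The rolling restart is a natural loop. A dry-run sketch (DRY_RUN=echo prints commands instead of executing them; unset it on a real node), with OSD ids 0-2 as placeholders for your actual ids:

```shell
#!/bin/sh
# Rolling OSD restart sketch: one OSD at a time, never in parallel.
DRY_RUN="echo"
restarted=0
for id in 0 1 2; do
  $DRY_RUN systemctl restart "ceph-osd@$id"
  # On a live cluster: poll `ceph -s` here and do not continue until
  # all PGs report active+clean.
  restarted=$((restarted + 1))
done
echo "restarted (dry-run): $restarted OSDs"
```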


🔟 Remove Protection

ceph osd unset noout

🔎 Verification (Critical)

1️⃣ Verify Ceph daemon addresses

ceph osd metadata <id> | grep -E 'front_addr|back_addr'

Expected:

  • front_addr → 10.50.0.x
  • back_addr → 10.50.0.x
  • ❌ No 172.16.x.x
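This address check can be automated per OSD. A sketch where a hard-coded fragment mimics the relevant fields of `ceph osd metadata <id>` after a successful migration; on a live cluster you would substitute the real command output:

```shell
#!/bin/sh
# Scan one OSD's bound addresses for leftovers of the old subnet.
METADATA='"front_addr": "10.50.0.20:6802/12345", "back_addr": "10.50.0.20:6803/12345"'
if printf '%s\n' "$METADATA" | grep -q '172\.16\.'; then
  result=FAIL
else
  result=PASS
fi
echo "$result: old-subnet check"
```

Run the equivalent against every OSD id; a single FAIL means that daemon still needs a restart (or the troubleshooting steps below).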

2️⃣ Verify traffic is using the Ceph fabric

While Ceph is under load:

ip -s link show vmbr-ceph

RX/TX counters should increase, confirming traffic is not using the management network.


3️⃣ Verify raw network performance (iperf3)

⚠️ Important: iperf3 must be installed on all Ceph nodes to test the fabric correctly.

apt install iperf3

Correct testing method:

  • Server on one node:
iperf3 -s
  • Client on a different node:
iperf3 -c <peer_ip> -P 4

Expected for 2.5 GbE Ceph fabric:

  • ~2.1–2.4 Gbit/s
  • Minimal or zero retransmits
  • Stable throughput across multiple streams
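Checking the measured figure against that floor can also be scripted. A sketch where a sample summary line stands in for real iperf3 output (the numbers are illustrative, not from this cluster):

```shell
#!/bin/sh
# Extract the Gbit/s figure from an iperf3 summary line and compare it
# against the ~2.1 Gbit/s floor expected on a 2.5 GbE fabric.
LINE="[SUM]   0.00-10.00  sec  2.74 GBytes  2.35 Gbits/sec  0  sender"
gbits=$(printf '%s\n' "$LINE" | awk '{for(i=1;i<=NF;i++) if($(i+1)=="Gbits/sec") print $i}')
awk -v g="$gbits" 'BEGIN { exit !(g >= 2.1) }' \
  && echo "throughput OK ($gbits Gbit/s)" \
  || echo "throughput LOW ($gbits Gbit/s) - check cabling/switch/NIC offloads"
```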

🚨 Troubleshooting: “OSDs Not Reachable / Wrong Subnet”

Symptom

osd.X's public address is not in '172.16.x.x/16' subnet

Cause

Ceph config DB or MON/MGR cache still references the old network.

Fix (Critical)

Restart ALL MONs (mandatory)

systemctl restart ceph-mon@pve2
systemctl restart ceph-mon@pve3
systemctl restart ceph-mon@pve4

Restart ALL MGRs (mandatory)

systemctl restart ceph-mgr@pve2
systemctl restart ceph-mgr@pve3
systemctl restart ceph-mgr@pve4

(Optional) Clean config DB

ceph config rm global public_network
ceph config rm global cluster_network
ceph config set global public_network 10.50.0.0/24
ceph config set global cluster_network 10.50.0.0/24

Restart OSDs again (one by one).

✔ This should resolve any “OSDs missing / wrong subnet” cases.


⚠️ Risks Considered

Why this change is risky

Changing Ceph cluster networking affects quorum, OSD availability, replication traffic, and client IO. Incorrect sequencing can cause data unavailability or permanent loss.

Failure modes considered

  • MON quorum loss
  • OSD flapping
  • Client IO stalls
  • Backfill storms
  • Split-brain conditions

Assumptions

  • Single Ceph cluster
  • Dedicated replication network (fabric)
  • Change executed during low IO window

✅ Final State

  • Dedicated Ceph fabric (2.5 GbE)
  • No Ceph traffic on management NIC
  • MON / MGR / MDS / OSD fully migrated
  • No data loss
  • Stable cluster

🙏 Acknowledgements

This migration approach was heavily informed by a Proxmox forum discussion that proved critical in resolving address-binding and daemon-recreation issues during the Ceph public network transition.

In particular, the guidance around:

  • Temporarily running dual public networks
  • Recreating MON, MGR, and MDS daemons to force address rebinding
  • Avoiding full cluster downtime during network migration

was instrumental in achieving a clean, no–data-loss migration.

Many thanks to the contributors in that thread for sharing real-world operational experience.
