<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zepher Ashe</title>
    <description>The latest articles on DEV Community by Zepher Ashe (@safesploit).</description>
    <link>https://dev.to/safesploit</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2933548%2Fdcd46cea-f504-4b29-ac9a-2897cb63540c.png</url>
      <title>DEV Community: Zepher Ashe</title>
      <link>https://dev.to/safesploit</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/safesploit"/>
    <language>en</language>
    <item>
      <title>Stop Writing Python Like JavaScript: Common Language-Switching Mistakes</title>
      <dc:creator>Zepher Ashe</dc:creator>
      <pubDate>Tue, 14 Apr 2026 22:18:06 +0000</pubDate>
      <link>https://dev.to/safesploit/stop-writing-python-like-javascript-common-language-switching-mistakes-2kj7</link>
      <guid>https://dev.to/safesploit/stop-writing-python-like-javascript-common-language-switching-mistakes-2kj7</guid>
      <description>&lt;ul&gt;
&lt;li&gt;1. Respect Language Idioms&lt;/li&gt;
&lt;li&gt;2. Mind the Type System&lt;/li&gt;
&lt;li&gt;3. Handle Errors the Right Way&lt;/li&gt;
&lt;li&gt;4. Manage Dependencies Correctly&lt;/li&gt;
&lt;li&gt;5. Be Conscious of Execution Context&lt;/li&gt;
&lt;li&gt;6. Security Practices Don’t Always Translate&lt;/li&gt;
&lt;li&gt;7. Testing &amp;amp; Tooling&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;When moving between languages, the main risks are &lt;strong&gt;forcing habits from one language onto another&lt;/strong&gt;, &lt;strong&gt;misusing idioms&lt;/strong&gt;, or &lt;strong&gt;forgetting environment-specific behaviours&lt;/strong&gt;. The following practices help minimise friction and avoid subtle bugs.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Respect Language Idioms
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Don’t force a&lt;/strong&gt; &lt;code&gt;main()&lt;/code&gt; &lt;strong&gt;everywhere&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bash, Perl, PHP, JavaScript → natural to write procedural top-down code.&lt;/li&gt;
&lt;li&gt;Python, Java, Go, Rust, C/C++ → explicit entry point is expected.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Follow the &lt;strong&gt;ecosystem’s coding style&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python → &lt;code&gt;snake_case&lt;/code&gt; for variables/functions, &lt;code&gt;PascalCase&lt;/code&gt; (PEP 8 “CapWords”) for classes.&lt;/li&gt;
&lt;li&gt;JavaScript → &lt;code&gt;camelCase&lt;/code&gt; for variables/functions, &lt;code&gt;PascalCase&lt;/code&gt; for classes.&lt;/li&gt;
&lt;li&gt;PowerShell → &lt;code&gt;Verb-Noun&lt;/code&gt; (&lt;code&gt;Get-Process&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Java → &lt;code&gt;camelCase&lt;/code&gt; for methods/fields, &lt;code&gt;PascalCase&lt;/code&gt; for classes, verbose OOP structure.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
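To make the first point concrete, here is a minimal Python sketch (the class and function names are invented for illustration): an explicit entry point guarded by `__name__`, with PEP 8 naming, rather than JavaScript-style top-level procedural code.

```python
# Idiomatic Python: snake_case functions, PascalCase classes,
# and an explicit entry point guarded by __name__.
class ReportBuilder:  # PascalCase for classes (PEP 8)
    def __init__(self, title: str) -> None:
        self.title = title

    def build_summary(self) -> str:  # snake_case for methods
        return f"Report: {self.title}"


def main() -> None:
    builder = ReportBuilder("quarterly")
    print(builder.build_summary())


if __name__ == "__main__":
    main()  # runs only when executed as a script, not on import
```

Running the file executes `main()`; importing it does not, which keeps the module reusable and testable.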




&lt;h2&gt;
  
  
  2. Mind the Type System
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dynamic languages&lt;/strong&gt; (Python, Bash, Perl, PHP, JavaScript):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Expect runtime type errors; validate inputs early.&lt;/li&gt;
&lt;li&gt;Use linters/type checkers if available (&lt;code&gt;mypy&lt;/code&gt; for Python, &lt;code&gt;tsc&lt;/code&gt; for TypeScript).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Static languages&lt;/strong&gt; (C, C++, Java, Go, Rust):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embrace compiler feedback; it saves runtime pain.&lt;/li&gt;
&lt;li&gt;Prefer explicit over inferred types when readability matters.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Hybrid&lt;/strong&gt; (PowerShell):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dynamically typed by default, with optional type constraints and liberal implicit conversions → always declare expected types in &lt;code&gt;Param()&lt;/code&gt; blocks.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
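As a sketch of “validate inputs early” in a dynamic language (the function is illustrative, not from any particular library), the annotations below are also exactly what `mypy` would check statically:

```python
from typing import Any


def parse_port(value: Any) -> int:
    """Validate untrusted input at the boundary instead of letting a
    bad type propagate and fail deep inside the program."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        raise ValueError(f"not an integer: {value!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```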




&lt;h2&gt;
  
  
  3. Handle Errors the Right Way
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bash/Perl&lt;/strong&gt;: check the &lt;code&gt;$?&lt;/code&gt; exit status (in Perl, &lt;code&gt;$!&lt;/code&gt; holds the last system error, not an exit code); use &lt;code&gt;set -euo pipefail&lt;/code&gt; in Bash.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Python/Java/JavaScript/PHP&lt;/strong&gt;: structured &lt;code&gt;try/except&lt;/code&gt; or &lt;code&gt;try/catch&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rust&lt;/strong&gt;: pattern-match on &lt;code&gt;Result&lt;/code&gt;/&lt;code&gt;Option&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Go&lt;/strong&gt;: check returned &lt;code&gt;err&lt;/code&gt; explicitly (&lt;code&gt;if err != nil { ... }&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PowerShell&lt;/strong&gt;: &lt;code&gt;try/catch/finally&lt;/code&gt; with terminating errors.&lt;/li&gt;
&lt;/ul&gt;
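A minimal Python illustration of the structured style (the `load_config` helper is hypothetical): catch the narrowest exception and chain it, rather than checking return codes as you would in Bash or Go.

```python
import json


def load_config(text: str) -> dict:
    # Python idiom: catch the specific exception, not a bare `except`,
    # and re-raise with context rather than returning error codes.
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid config: {exc}") from exc
```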




&lt;h2&gt;
  
  
  4. Manage Dependencies Correctly
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python → &lt;code&gt;pip&lt;/code&gt;/&lt;code&gt;poetry&lt;/code&gt;/&lt;code&gt;venv&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;JavaScript → &lt;code&gt;npm&lt;/code&gt;/&lt;code&gt;yarn&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;PHP → &lt;code&gt;composer&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Go → &lt;code&gt;go mod&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Rust → &lt;code&gt;cargo&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Perl → &lt;code&gt;CPAN&lt;/code&gt;/&lt;code&gt;cpanm&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Avoid hardcoding library paths; use ecosystem tools for portability.&lt;/li&gt;
&lt;/ul&gt;
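One portable habit the last bullet implies, sketched in Python (assuming Python ≥ 3.8, where `importlib.metadata` entered the standard library): query the environment for what is installed instead of hardcoding library paths.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it
    is not installed in the current environment."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```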




&lt;h2&gt;
  
  
  5. Be Conscious of Execution Context
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CLI scripts&lt;/strong&gt;: Bash, Perl, Python — make sure you handle arguments (&lt;code&gt;$1&lt;/code&gt;, &lt;code&gt;sys.argv&lt;/code&gt;, &lt;code&gt;@ARGV&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web scripts&lt;/strong&gt;: PHP, JavaScript — expect interaction with HTTP state, superglobals, or events.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compiled binaries&lt;/strong&gt;: Go, Rust, C/C++ — distribute with care; consider static builds for portability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed models&lt;/strong&gt;: PowerShell, Python, Node.js → scripts can be interactive or part of automation pipelines.&lt;/li&gt;
&lt;/ul&gt;
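For the CLI case, a small Python sketch (`greet.py` and its argument are invented): handle `sys.argv` explicitly and return an exit status, mirroring `$1` in Bash or `@ARGV` in Perl.

```python
import sys


def main(argv: list) -> int:
    """CLI entry point: handle arguments explicitly instead of
    assuming an interactive caller."""
    if len(argv) < 2:
        print("usage: greet.py NAME", file=sys.stderr)
        return 2  # conventional non-zero status for usage errors
    print(f"hello, {argv[1]}")
    return 0


# In a real script: sys.exit(main(sys.argv))
main(["greet.py", "world"])
```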




&lt;h2&gt;
  
  
  6. Security Practices Don’t Always Translate
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;String handling&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Perl, Bash, PHP → watch out for injection risks (shell eval, SQL injection).&lt;/li&gt;
&lt;li&gt;Python, Go, Rust → their standard libraries steer you toward safer APIs (parameterised queries, no implicit shell evaluation) by default.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Secrets handling&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never hardcode credentials; environment variables or vaults are standard.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Input validation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;JavaScript/PHP → sanitize all external input (XSS/SQLi risks).&lt;/li&gt;
&lt;li&gt;Bash/Perl → validate before using in system commands.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
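A short Python sketch of injection-safe input handling using the standard-library `sqlite3` driver (the table and data are illustrative): the parameterised placeholder treats attacker-controlled text as data, never as SQL.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    # Parameterised query: the driver escapes `username`, so a value
    # like "x' OR '1'='1" is treated as data, not SQL.
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    )
    return cur.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
print(find_user(conn, "alice"))         # the real row
print(find_user(conn, "x' OR '1'='1"))  # injection attempt matches nothing
```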




&lt;h2&gt;
  
  
  7. Testing &amp;amp; Tooling
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Always use &lt;strong&gt;linters/formatters&lt;/strong&gt; when switching (helps enforce idioms):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python → &lt;code&gt;black&lt;/code&gt;, &lt;code&gt;flake8&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;JavaScript → &lt;code&gt;eslint&lt;/code&gt;, &lt;code&gt;prettier&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Go → &lt;code&gt;gofmt&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Rust → &lt;code&gt;clippy&lt;/code&gt;, &lt;code&gt;rustfmt&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;PowerShell → &lt;code&gt;PSScriptAnalyzer&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Write small &lt;strong&gt;unit tests&lt;/strong&gt; in the ecosystem’s framework:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python → &lt;code&gt;pytest&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Java → &lt;code&gt;JUnit&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;PHP → &lt;code&gt;PHPUnit&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Go → built-in &lt;code&gt;go test&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Rust → built-in &lt;code&gt;cargo test&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
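A minimal example of ecosystem-native testing in Python (the `slugify` function is invented): with `pytest`, plain `assert` statements in `test_*` functions are the whole idiom — no `assertEqual` boilerplate.

```python
def slugify(title: str) -> str:
    """Tiny function under test (illustrative)."""
    return "-".join(title.lower().split())


# pytest discovers test_* functions automatically; bare asserts
# produce rich failure diffs without any framework classes.
def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"


def test_slugify_collapses_whitespace():
    assert slugify("  a   b ") == "a-b"
```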




&lt;p&gt;Switching isn’t just syntax — it’s mindset. Learn the idioms, adopt ecosystem tools, and adjust error-handling and security practices to the language’s expectations.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>softwareengineering</category>
      <category>productivity</category>
      <category>devops</category>
    </item>
    <item>
      <title>Linux Observability Tools — A Practical Guide</title>
      <dc:creator>Zepher Ashe</dc:creator>
      <pubDate>Mon, 09 Mar 2026 09:45:12 +0000</pubDate>
      <link>https://dev.to/safesploit/linux-observability-tools-a-practical-guide-5152</link>
      <guid>https://dev.to/safesploit/linux-observability-tools-a-practical-guide-5152</guid>
      <description>&lt;h2&gt;
  
  
  Linux Observability Tools
&lt;/h2&gt;

&lt;p&gt;Observability is the ability to understand what a Linux system is doing &lt;em&gt;internally&lt;/em&gt; by examining the signals it emits — metrics, logs, traces, and events.&lt;br&gt;&lt;br&gt;
This guide provides a structured overview of Linux observability tools, grouped by the system layers they inspect. It is designed as a practical reference for troubleshooting, performance engineering, capacity planning, and DevSecOps workflows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🧱 1. Application &amp;amp; User-Space Observability&lt;/li&gt;
&lt;li&gt;🧩 2. System Libraries &amp;amp; Syscall Interface&lt;/li&gt;
&lt;li&gt;🧬 3. Kernel Subsystems Observability&lt;/li&gt;
&lt;li&gt;🔩 4. Device Drivers &amp;amp; Block Layer Observability&lt;/li&gt;
&lt;li&gt;📦 5. Storage &amp;amp; Swap Observability&lt;/li&gt;
&lt;li&gt;🌐 6. Network Stack &amp;amp; NIC Observability&lt;/li&gt;
&lt;li&gt;🖥️ 7. Hardware Observability (CPU, RAM, Buses)&lt;/li&gt;
&lt;li&gt;📊 8. System-Wide Observability Tools&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  📊 Overview Diagram
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffuywmgql9gos0lwtso3l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffuywmgql9gos0lwtso3l.png" alt="Linux Observability Tools" width="800" height="562"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Diagram © Brendan Gregg — used here with attribution for educational and informational purposes.  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The diagram maps common observability tools to layers of the Linux operating system, from user-space applications down to hardware, providing a mental model for selecting the right tool during analysis or incident response.&lt;/p&gt;




&lt;h1&gt;
  
  
  🧱 1. Application &amp;amp; User-Space Observability
&lt;/h1&gt;

&lt;p&gt;These tools inspect behaviour at the &lt;em&gt;process and application&lt;/em&gt; level, including interactions with system libraries.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;strace&lt;/strong&gt; – traces system calls made by an application.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ltrace&lt;/strong&gt; – traces dynamic library calls.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ss&lt;/strong&gt; – modern socket statistics (replacement for &lt;code&gt;netstat&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;netstat&lt;/strong&gt; – legacy but still useful for connection state overview.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sysdig&lt;/strong&gt; – system-wide syscall/event capture and filtering.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;lsof&lt;/strong&gt; – lists open files, sockets, pipes, etc.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pidstat&lt;/strong&gt; – per-process CPU, memory, I/O, threads.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pcstat&lt;/strong&gt; – page cache statistics for specific files.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Debugging why an application is slow or blocked.
&lt;/li&gt;
&lt;li&gt;Identifying network usage per process.
&lt;/li&gt;
&lt;li&gt;Auditing open files and ports.
&lt;/li&gt;
&lt;li&gt;Understanding syscall patterns for performance tuning.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🧩 2. System Libraries &amp;amp; Syscall Interface
&lt;/h1&gt;

&lt;p&gt;This layer sits between applications and the kernel. Tools here help examine transitions between user-space and kernel-space.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;strace / ltrace&lt;/strong&gt; – observe execution flow into syscalls and libraries.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;perf&lt;/strong&gt; – syscall latency, profiling, hotspots.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ftrace&lt;/strong&gt; – built-in kernel tracer for syscalls and function calls.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SystemTap (stap)&lt;/strong&gt; – programmable probes for syscalls.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LTTng&lt;/strong&gt; – high-performance tracing for production systems.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eBPF / bpftrace&lt;/strong&gt; – modern, safe kernel-level instrumentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Diagnosing syscall bottlenecks.
&lt;/li&gt;
&lt;li&gt;Monitoring unexpected kernel interactions.
&lt;/li&gt;
&lt;li&gt;High-resolution production tracing with low overhead (eBPF).&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🧬 3. Kernel Subsystems Observability
&lt;/h1&gt;

&lt;p&gt;The kernel handles filesystems, memory management, scheduling, and networking. Tools here inspect these internal mechanisms.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;perf&lt;/strong&gt; – scheduler behaviour, CPU cycles, kernel hotspots.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tcpdump&lt;/strong&gt; – raw packet capture at the IP/Ethernet layers.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;iptraf&lt;/strong&gt; – lightweight network utilisation monitor.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vmstat&lt;/strong&gt; – processes, memory, swap, I/O, interrupts.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;slabtop&lt;/strong&gt; – kernel slab allocator usage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;free&lt;/strong&gt; – memory allocation breakdown.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pidstat&lt;/strong&gt; – scheduler awareness and per-thread stats.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tiptop&lt;/strong&gt; – per-thread metrics using hardware counters.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Identifying memory pressure, leaks, or slab exhaustion.
&lt;/li&gt;
&lt;li&gt;Determining network packet loss or congestion.
&lt;/li&gt;
&lt;li&gt;Analysing scheduler-induced latency.
&lt;/li&gt;
&lt;li&gt;Understanding kernel-side performance issues.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🔩 4. Device Drivers &amp;amp; Block Layer Observability
&lt;/h1&gt;

&lt;p&gt;These tools examine I/O as it flows through the Linux block subsystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;iostat&lt;/strong&gt; – block device throughput and latency.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;iotop&lt;/strong&gt; – per-process disk I/O usage.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;blktrace&lt;/strong&gt; – very detailed block layer tracing.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;perf / tiptop&lt;/strong&gt; – device driver profiling.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Troubleshooting slow disk I/O.
&lt;/li&gt;
&lt;li&gt;Detecting I/O starvation or noisy-neighbour workloads.
&lt;/li&gt;
&lt;li&gt;Analysing LVM/RAID performance issues.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  📦 5. Storage &amp;amp; Swap Observability
&lt;/h1&gt;

&lt;p&gt;Tools focusing on physical disks, logical volumes, controllers, and swap usage.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;iostat&lt;/strong&gt; – read/write performance.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;iotop&lt;/strong&gt; – which processes are causing I/O.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;blktrace&lt;/strong&gt; – kernel-level I/O event tracing.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;swapon --show&lt;/strong&gt; – view swap devices and utilisation (&lt;code&gt;swapon -s&lt;/code&gt; is the deprecated short form).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Swap thrash detection.
&lt;/li&gt;
&lt;li&gt;Disk queue depth analysis.
&lt;/li&gt;
&lt;li&gt;Understanding storage behaviour under load.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🌐 6. Network Stack &amp;amp; NIC Observability
&lt;/h1&gt;

&lt;p&gt;These tools examine network interfaces, Ethernet drivers, ports, and NIC statistics.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;tcpdump&lt;/strong&gt; – packet-level visibility.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ss / netstat&lt;/strong&gt; – connections and sockets.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;iptraf&lt;/strong&gt; – per-interface traffic charts.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ethtool&lt;/strong&gt; – NIC driver settings and link state.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;nicstat&lt;/strong&gt; – interface utilisation.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;lldptool&lt;/strong&gt; – LLDP neighbour discovery.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;snmpget&lt;/strong&gt; – SNMP-based network metrics.&lt;/li&gt;
&lt;/ul&gt;
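To show what `ss` is reading under the hood, here is a Linux-only Python sketch that parses the kernel's raw TCP socket table from `/proc/net/tcp` (state `0A` is LISTEN; the helper name is ours, and the field layout follows the kernel's proc documentation):

```python
def listening_tcp_ports(path: str = "/proc/net/tcp") -> list:
    """Return the sorted local ports in LISTEN state, parsed from the
    kernel's raw socket table (Linux-only). Each line's local_address
    field is hex IP:PORT; column 4 (index 3) is the socket state."""
    ports = []
    with open(path) as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            local, state = fields[1], fields[3]
            if state == "0A":  # 0A == TCP_LISTEN
                ports.append(int(local.split(":")[1], 16))
    return sorted(set(ports))
```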

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Packet drops, retransmits, or MTU mismatches.
&lt;/li&gt;
&lt;li&gt;NIC offload tuning (TSO, GRO, etc.).
&lt;/li&gt;
&lt;li&gt;Link speed/duplex mismatch troubleshooting.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🖥️ 7. Hardware Observability (CPU, RAM, Buses)
&lt;/h1&gt;

&lt;p&gt;These tools provide insights into how the &lt;strong&gt;hardware itself&lt;/strong&gt; behaves — including CPU frequency, power states, performance counters, NUMA locality, memory pressure, cache behaviour, and bus throughput.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔧 CPU Tools
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;mpstat&lt;/strong&gt; – Reports CPU usage per core, showing utilisation, steal time, IRQ time, and more.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;top&lt;/strong&gt; – Real-time process monitoring with CPU, load average, and per-thread breakdowns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ps&lt;/strong&gt; – Snapshot of process states, CPU usage, memory usage, and scheduling information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pidstat&lt;/strong&gt; – Per-thread and per-process CPU utilisation, context switching, and scheduling metrics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;perf&lt;/strong&gt; – Hardware performance counter profiler (cycles, cache misses, branch mispredictions).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;turbostat&lt;/strong&gt; – Intel-specific tool showing CPU frequencies, C-states, P-states, and turbo boost behaviour.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;rdmsr&lt;/strong&gt; – Reads CPU model-specific registers (MSRs) for extremely low-level introspection.&lt;/li&gt;
&lt;/ul&gt;
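The counters that `mpstat` and `top` interpret come straight from `/proc/stat`. A Linux-only Python sketch of reading them (the helper name is ours; the field order follows the kernel's proc documentation):

```python
def cpu_times(path: str = "/proc/stat") -> dict:
    """Read the aggregate CPU counters (Linux-only).
    The first line is: cpu  user nice system idle iowait irq softirq ...
    and all values are in clock ticks since boot."""
    with open(path) as f:
        fields = f.readline().split()
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
    return dict(zip(names, map(int, fields[1:8])))
```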




&lt;h2&gt;
  
  
  🔧 Memory Tools
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;vmstat&lt;/strong&gt; – Shows paging, swapping, memory pressure, interrupts, and system-wide throughput.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;free&lt;/strong&gt; – Reports total, used, cached, and available system memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;slabtop&lt;/strong&gt; – Displays kernel slab allocator statistics (caches, objects, memory used).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;numastat&lt;/strong&gt; – NUMA locality, node memory distribution, and remote memory access counts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;perf (memory events)&lt;/strong&gt; – Analyses hardware counters related to RAM, cache, and memory bus traffic.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 When to use
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;NUMA locality and cross-node memory access debugging.
&lt;/li&gt;
&lt;li&gt;CPU throttling, frequency scaling, or thermal throttling investigations.
&lt;/li&gt;
&lt;li&gt;Memory pressure analysis, leaking workloads, or kernel slab issues.
&lt;/li&gt;
&lt;li&gt;High-performance tuning for compute-heavy or latency-sensitive workloads.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  📊 8. System-Wide Observability Tools
&lt;/h1&gt;

&lt;p&gt;These tools cover multiple layers at once.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔧 Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;sar&lt;/strong&gt; – historic performance logs across CPU, memory, I/O, network.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;dstat&lt;/strong&gt; – live multi-metric system aggregation.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sysdig&lt;/strong&gt; – holistic tracing across syscalls, network, containers.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;/proc&lt;/strong&gt; – raw kernel data for metrics, states, drivers, and interfaces.&lt;/li&gt;
&lt;/ul&gt;
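As the last bullet suggests, `/proc` is the raw source many of these tools merely format. A Linux-only Python sketch parsing `/proc/meminfo` (the same data `free` reports; values are in kB, and the helper name is ours):

```python
def meminfo(path: str = "/proc/meminfo") -> dict:
    """Parse /proc/meminfo into a {field: value-in-kB} mapping
    (Linux-only). Lines look like 'MemTotal:  16318544 kB'."""
    out = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            out[key] = int(rest.split()[0])  # numeric value, in kB
    return out
```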

&lt;h3&gt;
  
  
  🧠 When to use
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Incident response and baselining.
&lt;/li&gt;
&lt;li&gt;Long-term trending and anomaly detection.
&lt;/li&gt;
&lt;li&gt;System-wide correlation across resources.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  🛠️ Practical Use Cases
&lt;/h1&gt;

&lt;h3&gt;
  
  
  ✔ Root Cause Analysis (RCA)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Identify if a slowdown is CPU, memory, network, or storage related.
&lt;/li&gt;
&lt;li&gt;Trace a misbehaving process through syscalls into the kernel.
&lt;/li&gt;
&lt;li&gt;Compare observed performance against baseline.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✔ Performance Tuning
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Scheduler tracing for latency-sensitive workloads.
&lt;/li&gt;
&lt;li&gt;NIC tuning via &lt;code&gt;ethtool&lt;/code&gt; for high-throughput environments.
&lt;/li&gt;
&lt;li&gt;Storage insight for LVM/RAID/SSD/HDD tuning.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ✔ DevSecOps / Security
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;eBPF tools for detecting suspicious syscalls.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;lsof&lt;/code&gt; for auditing unexpected open sockets/files.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sysdig&lt;/code&gt; rules for behavioural anomaly detection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A secure system is one that is &lt;em&gt;understood&lt;/em&gt;, not just hardened.&lt;/p&gt;




&lt;h1&gt;
  
  
  🔐 Observability in DevSecOps
&lt;/h1&gt;

&lt;p&gt;Observability is not just operational — it is &lt;em&gt;security-critical&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detect unusual syscall patterns (possible intrusion).
&lt;/li&gt;
&lt;li&gt;Identify crypto miners via CPU and scheduler patterns.
&lt;/li&gt;
&lt;li&gt;Spot exfiltration via abnormal NIC or TCP behaviour.
&lt;/li&gt;
&lt;li&gt;Validate that hardening changes do not degrade performance.&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  📚 References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Brendan Gregg — &lt;em&gt;Linux Performance Tools&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Kernel documentation — &lt;a href="https://www.kernel.org/doc/" rel="noopener noreferrer"&gt;https://www.kernel.org/doc/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Sysdig, LTTng, SystemTap official docs
&lt;/li&gt;
&lt;li&gt;eBPF / bpftrace reference guides
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>monitoring</category>
      <category>performance</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Ceph Public Network Migration (No Downtime)</title>
      <dc:creator>Zepher Ashe</dc:creator>
      <pubDate>Sun, 08 Mar 2026 00:43:10 +0000</pubDate>
      <link>https://dev.to/safesploit/ceph-public-network-migration-no-downtime-47bj</link>
      <guid>https://dev.to/safesploit/ceph-public-network-migration-no-downtime-47bj</guid>
      <description>&lt;h2&gt;
  
  
  Ceph Public Network Migration (Proxmox)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;172.16.0.0/16 → 10.50.0.0/24&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;No service downtime, no data loss&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  📌 Context
&lt;/h2&gt;

&lt;p&gt;This procedure documents a live Ceph public network migration performed on a Proxmox-backed Ceph cluster.&lt;br&gt;&lt;br&gt;
The goal was to eliminate management-network congestion while maintaining cluster availability and data integrity.&lt;/p&gt;



&lt;ul&gt;
&lt;li&gt;🎯 Objective&lt;/li&gt;
&lt;li&gt;🧱 Key Concepts (Read Once)&lt;/li&gt;
&lt;li&gt;🚨 Troubleshooting&lt;/li&gt;
&lt;li&gt;⚠️ Risks Considered&lt;/li&gt;
&lt;li&gt;✅ Final State&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  🎯 Objective
&lt;/h2&gt;

&lt;p&gt;Migrate &lt;strong&gt;all Ceph traffic&lt;/strong&gt; (MON, MGR, MDS, OSD front + back) from a congested management network to a &lt;strong&gt;dedicated Ceph fabric&lt;/strong&gt; (e.g. 2.5 GbE switch), while keeping the cluster healthy and online.&lt;/p&gt;


&lt;h2&gt;
  
  
  🧱 Key Concepts (Read Once)
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;code&gt;public_network&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Client ↔ OSD traffic
&lt;/li&gt;
&lt;li&gt;MON / MGR control plane
&lt;/li&gt;
&lt;li&gt;CephFS metadata traffic&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  &lt;code&gt;cluster_network&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;OSD ↔ OSD replication &amp;amp; recovery (data plane)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Important behaviours
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;MON &amp;amp; MGR enforce address validation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OSDs bind addresses at restart&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/etc/pve/ceph.conf&lt;/code&gt; is &lt;strong&gt;not authoritative alone&lt;/strong&gt; — Ceph also uses its &lt;strong&gt;internal config database&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  1️⃣ Prepare the New Ceph Network
&lt;/h2&gt;

&lt;p&gt;Create a &lt;strong&gt;dedicated bridge&lt;/strong&gt; on each node (example: &lt;code&gt;vmbr-ceph&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vim /etc/network/interfaces
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="err"&gt;auto&lt;/span&gt; &lt;span class="err"&gt;vmbr-ceph&lt;/span&gt;
&lt;span class="err"&gt;iface&lt;/span&gt; &lt;span class="err"&gt;vmbr-ceph&lt;/span&gt; &lt;span class="err"&gt;inet&lt;/span&gt; &lt;span class="err"&gt;static&lt;/span&gt;
    &lt;span class="err"&gt;address&lt;/span&gt; &lt;span class="err"&gt;10.50.0.20/24&lt;/span&gt;
    &lt;span class="err"&gt;bridge-ports&lt;/span&gt; &lt;span class="err"&gt;eno2&lt;/span&gt;
    &lt;span class="err"&gt;bridge-stp&lt;/span&gt; &lt;span class="err"&gt;off&lt;/span&gt;
    &lt;span class="err"&gt;bridge-fd&lt;/span&gt; &lt;span class="err"&gt;0&lt;/span&gt;
&lt;span class="c"&gt;# Ceph (Fabric)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assign IPs on the new subnet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;pve2 → 10.50.0.20/24&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pve3 → 10.50.0.30/24&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pve4 → 10.50.0.40/24&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ensure this network is &lt;strong&gt;isolated&lt;/strong&gt; (no gateway required).&lt;/p&gt;

&lt;h3&gt;
  
  
  Verify connectivity
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ping 10.50.0.30
iperf3 &lt;span class="nt"&gt;-s&lt;/span&gt; / &lt;span class="nt"&gt;-c&lt;/span&gt; &amp;lt;peer&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  2️⃣ Add the New Public Network (Dual-Network Phase)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; Back up the file first&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; /etc/pve/ceph.conf /etc/pve/ceph.conf.bak
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Edit &lt;code&gt;/etc/pve/ceph.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;public_network&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.50.0.0/24, 172.16.0.0/16&lt;/span&gt;
&lt;span class="py"&gt;cluster_network&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.50.0.0/24, 172.16.0.0/16&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;⚠️ &lt;strong&gt;Do NOT remove the old network yet&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Confirm:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Proxmox UI → &lt;strong&gt;Ceph → Nodes&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ceph config dump&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3️⃣ Recreate MONs (One by One)
&lt;/h2&gt;

&lt;p&gt;MONs enforce network validation.&lt;/p&gt;

&lt;p&gt;For each node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pveceph mon destroy &amp;lt;node&amp;gt;
pveceph mon create
ceph &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✔ Ensure quorum after each step.&lt;/p&gt;




&lt;h2&gt;
  
  
  4️⃣ Recreate MGRs (One by One)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Recreate &lt;strong&gt;standby managers first&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Leave the &lt;strong&gt;active manager for last&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pveceph mgr destroy &amp;lt;node&amp;gt;
pveceph mgr create
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ceph mgr dump
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔧 Recovery Tip
&lt;/h3&gt;

&lt;p&gt;If a manager fails to start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl reset-failed ceph-mgr@&amp;lt;node&amp;gt;
systemctl start ceph-mgr@&amp;lt;node&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  5️⃣ Recreate CephFS Metadata Servers (MDS)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;MDS binds its address &lt;strong&gt;at creation time&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pveceph mds destroy &amp;lt;node&amp;gt;
pveceph mds create
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;✔ Verify CephFS health before proceeding.&lt;/p&gt;




&lt;h2&gt;
  
  
  6️⃣ Remove the Old Public Network
&lt;/h2&gt;

&lt;p&gt;Edit &lt;code&gt;/etc/pve/ceph.conf&lt;/code&gt; and remove &lt;code&gt;172.16.0.0/16&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;public_network&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.50.0.0/24&lt;/span&gt;
&lt;span class="py"&gt;cluster_network&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;10.50.0.0/24&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  7️⃣ Recreate MONs, MGRs, and MDS (Again)
&lt;/h2&gt;

&lt;p&gt;This ensures &lt;strong&gt;all control-plane daemons bind exclusively&lt;/strong&gt; to the new network.&lt;/p&gt;

&lt;p&gt;Order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;MONs (one by one)&lt;/li&gt;
&lt;li&gt;MGRs (standbys first, active last)&lt;/li&gt;
&lt;li&gt;MDS (one by one)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  8️⃣ Protect the Cluster Before Touching OSDs
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ceph osd &lt;span class="nb"&gt;set &lt;/span&gt;noout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  9️⃣ Restart OSDs (Data Plane Migration)
&lt;/h2&gt;

&lt;p&gt;Restart &lt;strong&gt;one OSD at a time&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl restart ceph-osd@&amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
ceph &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PGs: active+clean
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repeat for all OSDs.&lt;/p&gt;
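&lt;p&gt;The “wait for &lt;code&gt;active+clean&lt;/code&gt;” gate can be expressed as a small check. A sketch in Python over the &lt;code&gt;pgmap&lt;/code&gt; section that &lt;code&gt;ceph -s -f json&lt;/code&gt; typically reports (the &lt;code&gt;pgs_by_state&lt;/code&gt;/&lt;code&gt;num_pgs&lt;/code&gt; field names are assumptions to confirm on your Ceph version):&lt;br&gt;
&lt;/p&gt;

```python
def all_active_clean(pgmap: dict) -> bool:
    """True only when every placement group reports active+clean."""
    states = pgmap.get("pgs_by_state", [])
    counted = sum(s["count"] for s in states)
    clean = sum(s["count"] for s in states
                if s["state_name"] == "active+clean")
    return counted == pgmap.get("num_pgs") and clean == counted

# Hypothetical pgmap snapshots taken mid-migration and after settling.
migrating = {"num_pgs": 128,
             "pgs_by_state": [{"state_name": "active+clean", "count": 120},
                              {"state_name": "active+recovering", "count": 8}]}
settled = {"num_pgs": 128,
           "pgs_by_state": [{"state_name": "active+clean", "count": 128}]}

print(all_active_clean(migrating), all_active_clean(settled))  # False True
```

&lt;p&gt;Feed it the live &lt;code&gt;pgmap&lt;/code&gt; between OSD restarts and only move to the next OSD when it returns &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;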




&lt;h2&gt;
  
  
  🔟 Remove Protection
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ceph osd &lt;span class="nb"&gt;unset &lt;/span&gt;noout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔎 Verification (Critical)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1️⃣ Verify Ceph daemon addresses
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ceph osd metadata &amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; | egrep &lt;span class="s1"&gt;'front_addr|back_addr'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;code&gt;front_addr → 10.50.0.x&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;✅ &lt;code&gt;back_addr → 10.50.0.x&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;❌ No &lt;code&gt;172.16.x.x&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
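&lt;p&gt;Eyeballing addresses across many OSDs is error-prone, so the subnet check can be scripted. A hedged sketch using Python’s &lt;code&gt;ipaddress&lt;/code&gt; module; it extracts the first IPv4 address from a &lt;code&gt;front_addr&lt;/code&gt;/&lt;code&gt;back_addr&lt;/code&gt; field, whichever of the legacy or &lt;code&gt;v2:&lt;/code&gt;-style formats your version prints:&lt;br&gt;
&lt;/p&gt;

```python
import ipaddress
import re

NEW_NET = ipaddress.ip_network("10.50.0.0/24")
OLD_NET = ipaddress.ip_network("172.16.0.0/16")

def classify(addr_field: str) -> str:
    """Classify the first IPv4 address found in an OSD address field."""
    m = re.search(r"\d+\.\d+\.\d+\.\d+", addr_field)
    if not m:
        return "no address found"
    ip = ipaddress.ip_address(m.group())
    if ip in NEW_NET:
        return "new fabric"
    if ip in OLD_NET:
        return "old network"
    return "unexpected subnet"

print(classify("[v2:10.50.0.3:6802/12345,v1:10.50.0.3:6803/12345]"))  # new fabric
print(classify("172.16.0.3:6802/12345"))                              # old network
```

&lt;p&gt;Any OSD whose field classifies as “old network” still needs a restart (or a config-DB clean-up, as covered in the troubleshooting section).&lt;/p&gt;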




&lt;h3&gt;
  
  
  2️⃣ Verify traffic is using the Ceph fabric
&lt;/h3&gt;

&lt;p&gt;While Ceph is under load:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ip &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nb"&gt;link &lt;/span&gt;show vmbr-ceph
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;RX/TX counters should increase while Ceph is under load, confirming that replication and client traffic are flowing over the dedicated fabric and &lt;strong&gt;not&lt;/strong&gt; the management network.&lt;/p&gt;
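&lt;p&gt;The same check can be done from the standard sysfs counters (&lt;code&gt;/sys/class/net/vmbr-ceph/statistics/rx_bytes&lt;/code&gt; and &lt;code&gt;tx_bytes&lt;/code&gt;). A minimal Python sketch comparing two snapshots taken a few seconds apart; the sample numbers are hypothetical:&lt;br&gt;
&lt;/p&gt;

```python
def bridge_is_carrying(before: dict, after: dict, min_bytes: int = 1_000_000) -> bool:
    """Compare two counter snapshots from /sys/class/net/IFACE/statistics."""
    rx = after["rx_bytes"] - before["rx_bytes"]
    tx = after["tx_bytes"] - before["tx_bytes"]
    return rx > min_bytes or tx > min_bytes

# Hypothetical snapshots taken a few seconds apart under Ceph load.
t0 = {"rx_bytes": 10_000_000, "tx_bytes": 12_000_000}
t1 = {"rx_bytes": 310_000_000, "tx_bytes": 415_000_000}
print(bridge_is_carrying(t0, t1))  # True
```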




&lt;h3&gt;
  
  
  3️⃣ Verify raw network performance (iperf3)
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important:&lt;/strong&gt; &lt;code&gt;iperf3&lt;/code&gt; must be installed on &lt;strong&gt;all Ceph nodes&lt;/strong&gt; to test the fabric correctly.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;iperf3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Correct testing method:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server on one node:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iperf3 &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Client on a &lt;em&gt;different&lt;/em&gt; node:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;iperf3 &lt;span class="nt"&gt;-c&lt;/span&gt; &amp;lt;peer_ip&amp;gt; &lt;span class="nt"&gt;-P&lt;/span&gt; 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected for 2.5 GbE Ceph fabric:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;~2.1–2.4 Gbit/s&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Minimal or zero retransmits&lt;/li&gt;
&lt;li&gt;Stable throughput across multiple streams&lt;/li&gt;
&lt;/ul&gt;
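&lt;p&gt;When run with &lt;code&gt;--json&lt;/code&gt;, the iperf3 result can be scored against these expectations automatically. A sketch in Python, assuming the &lt;code&gt;end.sum_received.bits_per_second&lt;/code&gt; and &lt;code&gt;end.sum_sent.retransmits&lt;/code&gt; fields a TCP run normally reports (verify against your iperf3 version; the sample result is hypothetical):&lt;br&gt;
&lt;/p&gt;

```python
def fabric_ok(iperf_json: dict, expect_gbit: float = 2.1) -> tuple:
    """Summarise an `iperf3 --json` run as (Gbit/s, retransmits, meets target?)."""
    gbit = iperf_json["end"]["sum_received"]["bits_per_second"] / 1e9
    retr = iperf_json["end"]["sum_sent"]["retransmits"]
    return round(gbit, 2), retr, gbit >= expect_gbit

# Hypothetical result from a 4-stream run on a 2.5 GbE link.
sample = {"end": {"sum_received": {"bits_per_second": 2.34e9},
                  "sum_sent": {"retransmits": 0}}}
print(fabric_ok(sample))  # (2.34, 0, True)
```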




&lt;h2&gt;
  
  
  🚨 Troubleshooting: “OSDs Not Reachable / Wrong Subnet”
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Symptom
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;osd.X's public address is not in '172.16.x.x/16' subnet
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cause
&lt;/h3&gt;

&lt;p&gt;Ceph config DB or MON/MGR cache still references the old network.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fix (Critical)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Restart ALL MONs (mandatory)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl restart ceph-mon@pve2
systemctl restart ceph-mon@pve3
systemctl restart ceph-mon@pve4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Restart ALL MGRs (mandatory)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl restart ceph-mgr@pve2
systemctl restart ceph-mgr@pve3
systemctl restart ceph-mgr@pve4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  (Optional) Clean config DB
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ceph config &lt;span class="nb"&gt;rm &lt;/span&gt;global public_network
ceph config &lt;span class="nb"&gt;rm &lt;/span&gt;global cluster_network
ceph config &lt;span class="nb"&gt;set &lt;/span&gt;global public_network 10.50.0.0/24
ceph config &lt;span class="nb"&gt;set &lt;/span&gt;global cluster_network 10.50.0.0/24
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Restart OSDs again (one by one).&lt;/p&gt;

&lt;p&gt;✔ This should resolve any “OSDs missing / wrong subnet” cases.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚠️ Risks Considered
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why this change is risky
&lt;/h3&gt;

&lt;p&gt;Changing Ceph cluster networking affects quorum, OSD availability, replication traffic, and client IO. Incorrect sequencing can cause data unavailability or permanent loss.&lt;/p&gt;

&lt;h3&gt;
  
  
  Failure modes considered
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;MON quorum loss&lt;/li&gt;
&lt;li&gt;OSD flapping&lt;/li&gt;
&lt;li&gt;Client IO stalls&lt;/li&gt;
&lt;li&gt;Backfill storms&lt;/li&gt;
&lt;li&gt;Split-brain conditions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Assumptions
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Single Ceph cluster&lt;/li&gt;
&lt;li&gt;Dedicated replication network (fabric)&lt;/li&gt;
&lt;li&gt;Change executed during low IO window&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ✅ Final State
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Dedicated Ceph fabric (2.5 GbE)&lt;/li&gt;
&lt;li&gt;No Ceph traffic on management NIC&lt;/li&gt;
&lt;li&gt;MON / MGR / MDS / OSD fully migrated&lt;/li&gt;
&lt;li&gt;No data loss&lt;/li&gt;
&lt;li&gt;Stable cluster&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🙏 Acknowledgements
&lt;/h2&gt;

&lt;p&gt;This migration approach was heavily informed by the following Proxmox forum discussion, which proved critical in resolving address-binding and daemon recreation issues during the Ceph public network transition:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Proxmox Forum – “Ceph: changing public network”&lt;/strong&gt;
&lt;a href="https://forum.proxmox.com/threads/ceph-changing-public-network.119116/" rel="noopener noreferrer"&gt;https://forum.proxmox.com/threads/ceph-changing-public-network.119116/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In particular, the guidance around:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Temporarily running &lt;strong&gt;dual public networks&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recreating MON, MGR, and MDS daemons&lt;/strong&gt; to force address rebinding&lt;/li&gt;
&lt;li&gt;Avoiding full cluster downtime during network migration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;was instrumental in achieving a clean, no-data-loss migration.&lt;/p&gt;

&lt;p&gt;Many thanks to the contributors in that thread for sharing real-world operational experience.&lt;/p&gt;

</description>
      <category>distributedsystems</category>
      <category>linux</category>
      <category>networking</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Why Online Disk Expansion Is Safe on Linux (SAN, LVM2, XFS/Ext4)</title>
      <dc:creator>Zepher Ashe</dc:creator>
      <pubDate>Sun, 08 Mar 2026 00:41:02 +0000</pubDate>
      <link>https://dev.to/safesploit/why-online-growth-is-safe-for-lvm2-m03</link>
      <guid>https://dev.to/safesploit/why-online-growth-is-safe-for-lvm2-m03</guid>
      <description>&lt;h2&gt;
  
  
  Why Online Growth Is Safe for SAN (Fibre Channel), LVM2, and File System (XFS)
&lt;/h2&gt;

&lt;p&gt;Modern enterprise Linux storage stacks are designed for &lt;strong&gt;non-disruptive, online capacity expansion&lt;/strong&gt;, even while filesystems are mounted and under active I/O.&lt;br&gt;&lt;br&gt;
This document explains why each layer—&lt;strong&gt;SAN/FC, LVM2, and XFS&lt;/strong&gt;—fully supports this behaviour.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;ME4 LUN → dm-multipath → LVM2 (PV/VG/LV) → XFS/ext4&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  1. SAN / Fibre Channel (FC) – Non-Disruptive LUN Expansion
&lt;/h2&gt;

&lt;p&gt;Modern enterprise SAN systems (such as &lt;strong&gt;Dell EMC ME4&lt;/strong&gt;) support &lt;strong&gt;live, nondisruptive LUN expansion&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Designed for &lt;strong&gt;nondisruptive expansion&lt;/strong&gt;; brief I/O stalls may occur.&lt;/li&gt;
&lt;li&gt;No unmount or downtime
&lt;/li&gt;
&lt;li&gt;Host simply rescans the SCSI bus
&lt;/li&gt;
&lt;li&gt;Multipath handles updated path geometry automatically
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dell’s ME4 Linux best-practices guide states:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Resize tasks can be done online without disrupting the applications.”&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Source:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://dl.dell.com/manuals/common/powervault-me4-and-linux-best-practices_en-us.pdf" rel="noopener noreferrer"&gt;https://dl.dell.com/manuals/common/powervault-me4-and-linux-best-practices_en-us.pdf&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;After expanding the LUN, the host only needs a rescan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rescan-scsi-bus.sh &lt;span class="nt"&gt;--resize&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the multipath device still reports the old size after the rescan, resize the map from the multipathd interactive console:&lt;br&gt;
&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;multipathd &lt;span class="nt"&gt;-k&lt;/span&gt;
&lt;span class="c"&gt;# &amp;gt; multipathd&amp;gt; "resize map mpathX"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;SAN LUN expansion is explicitly engineered to occur online&lt;/strong&gt;, with no interruption to servers or I/O.&lt;/p&gt;


&lt;h2&gt;
  
  
  2. LVM2 – Online PV and LV Expansion (Safe While Mounted &amp;amp; Under I/O)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ Do not proceed unless all paths and the multipath device report the new size.&lt;/p&gt;

&lt;p&gt;Ensure the following report ALL paths are resized:&lt;/p&gt;


&lt;pre class="highlight shell"&gt;&lt;code&gt;multipath &lt;span class="nt"&gt;-ll&lt;/span&gt; mpathX
blockdev &lt;span class="nt"&gt;--getsize64&lt;/span&gt; /dev/sdX
blockdev &lt;span class="nt"&gt;--getsize64&lt;/span&gt; /dev/mapper/mpathX
&lt;/code&gt;&lt;/pre&gt;

&lt;/blockquote&gt;
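&lt;p&gt;Agreement across paths is easy to check mechanically. A minimal Python sketch over &lt;code&gt;blockdev --getsize64&lt;/code&gt; readings (device names and byte counts below are hypothetical):&lt;br&gt;
&lt;/p&gt;

```python
def paths_agree(sizes: dict) -> bool:
    """Every device (each sd path plus the mpath device) must report one size."""
    return len(set(sizes.values())) == 1

# Hypothetical readings after a LUN grow: sdb and mpatha not yet rescanned.
stale = {"sda": 2199023255552, "sdb": 1099511627776, "mpatha": 1099511627776}
# After `rescan-scsi-bus.sh --resize` and the multipath map resize.
ready = {"sda": 2199023255552, "sdb": 2199023255552, "mpatha": 2199023255552}

print(paths_agree(stale), paths_agree(ready))  # False True
```

&lt;p&gt;Only proceed to &lt;code&gt;pvresize&lt;/code&gt; once the live readings agree.&lt;/p&gt;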
&lt;h3&gt;
  
  
  2.1 PV Resize (&lt;code&gt;pvresize&lt;/code&gt;) – Safe While Mounted
&lt;/h3&gt;

&lt;p&gt;According to Red Hat’s LVM developers:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“&lt;code&gt;pvresize&lt;/code&gt; simply updates the metadata to make LVM aware of the new size.”&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;“The data area does not change; only the PV extent map is updated.”&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
— Jonathan Brassow, Red Hat LVM developer&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Source:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.redhat.com/archives/linux-lvm/2009-May/msg00040.html" rel="noopener noreferrer"&gt;https://www.redhat.com/archives/linux-lvm/2009-May/msg00040.html&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because it modifies &lt;strong&gt;only metadata&lt;/strong&gt;, &lt;code&gt;pvresize&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does not touch data blocks
&lt;/li&gt;
&lt;li&gt;Does not affect the filesystem
&lt;/li&gt;
&lt;li&gt;Is safe under ongoing I/O

&lt;ul&gt;
&lt;li&gt;Provided the block device has been correctly resized on ALL paths &lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Is routinely run on &lt;strong&gt;root filesystems&lt;/strong&gt;, which cannot be unmounted
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Cloud vendor documentation confirming online PV resizing
&lt;/h3&gt;

&lt;p&gt;All three major cloud providers document &lt;code&gt;pvresize&lt;/code&gt; as part of their &lt;strong&gt;“expand disk without downtime”&lt;/strong&gt; workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AWS – Expand EBS volumes&lt;/strong&gt;
&lt;a href="https://docs.aws.amazon.com/ebs/latest/userguide/recognize-expanded-volume-linux.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/ebs/latest/userguide/recognize-expanded-volume-linux.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Cloud – Resize persistent disks&lt;/strong&gt;
&lt;a href="https://cloud.google.com/compute/docs/disks/resize-persistent-disk" rel="noopener noreferrer"&gt;https://cloud.google.com/compute/docs/disks/resize-persistent-disk&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Azure – Expand disks without downtime&lt;/strong&gt;
&lt;a href="https://learn.microsoft.com/en-us/azure/virtual-machines/linux/expand-disks?tabs=ubuntu#expand-without-downtime" rel="noopener noreferrer"&gt;https://learn.microsoft.com/en-us/azure/virtual-machines/linux/expand-disks?tabs=ubuntu#expand-without-downtime&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These platforms are conservative in what they document, so their listing &lt;code&gt;pvresize&lt;/code&gt; as an online step is strong evidence that it is &lt;strong&gt;safe and supported&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;code&gt;pvresize&lt;/code&gt; is a &lt;strong&gt;safe, online, non-disruptive&lt;/strong&gt; operation on modern LVM2.&lt;/p&gt;


&lt;h3&gt;
  
  
  2.2 LV Resize (&lt;code&gt;lvextend&lt;/code&gt;) – Online and Atomic
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;lvextend&lt;/code&gt; updates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LVM metadata
&lt;/li&gt;
&lt;li&gt;device-mapper mappings
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These operations are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Atomic: metadata is committed via a device-mapper table swap
&lt;/li&gt;
&lt;li&gt;Safe during active I/O
&lt;/li&gt;
&lt;li&gt;Non-disruptive to mounted filesystems
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cloud vendors (AWS, GCP, Azure) all document &lt;code&gt;lvextend&lt;/code&gt; as &lt;strong&gt;online&lt;/strong&gt;, immediately after &lt;code&gt;pvresize&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
LVM2 PV and LV can be grown &lt;strong&gt;online&lt;/strong&gt;, even during live writes, with no unmount required.&lt;/p&gt;


&lt;h2&gt;
  
  
  3. Online Filesystem Growth
&lt;/h2&gt;

&lt;p&gt;After expanding the block device (SAN → PV → LV), the filesystem must grow to use the new space.&lt;br&gt;&lt;br&gt;
Modern Linux filesystems support &lt;strong&gt;online, mounted filesystem expansion&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Filesystem&lt;/th&gt;
&lt;th&gt;Online Grow&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;XFS v4+&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✔&lt;/td&gt;
&lt;td&gt;Designed for online growth; cannot shrink&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ext4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✔&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;resize2fs&lt;/code&gt; supports online grow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
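
&lt;p&gt;One detail worth encoding: &lt;code&gt;xfs_growfs&lt;/code&gt; takes the &lt;em&gt;mount point&lt;/em&gt;, while &lt;code&gt;resize2fs&lt;/code&gt; takes the &lt;em&gt;block device&lt;/em&gt;. A small Python sketch capturing that rule (the device and mount paths are placeholders):&lt;br&gt;
&lt;/p&gt;

```python
def grow_command(fstype: str, device: str, mountpoint: str) -> list:
    """Pick the online-grow tool for a freshly extended LV."""
    if fstype == "xfs":
        return ["xfs_growfs", mountpoint]  # XFS grows via the mounted path
    if fstype == "ext4":
        return ["resize2fs", device]       # ext4 grows via the block device
    raise ValueError(f"no online grow rule for {fstype}")

print(grow_command("xfs", "/dev/vg0/data", "/srv/data"))
print(grow_command("ext4", "/dev/vg0/data", "/srv/data"))
```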


&lt;h3&gt;
  
  
  3.1 XFS – Designed for Online Filesystem Expansion
&lt;/h3&gt;

&lt;p&gt;Red Hat’s official XFS documentation states:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“The filesystem must be mounted to be grown.”&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Source:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/storage_administration_guide/xfsgrow" rel="noopener noreferrer"&gt;https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/storage_administration_guide/xfsgrow&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is one of the clearest vendor statements that XFS online growth is not just supported, but &lt;strong&gt;required&lt;/strong&gt; while mounted.&lt;/p&gt;

&lt;p&gt;Why XFS online growth is safe:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Journaled metadata operations
&lt;/li&gt;
&lt;li&gt;Allocation groups expand in place
&lt;/li&gt;
&lt;li&gt;Grows only while mounted (required)
&lt;/li&gt;
&lt;li&gt;Designed for SAN arrays, RAID, HPC
&lt;/li&gt;
&lt;li&gt;Handles heavy concurrent I/O
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
XFS is one of the safest and most robust filesystems for &lt;strong&gt;online, mounted, high-throughput&lt;/strong&gt; growth.&lt;/p&gt;


&lt;h2&gt;
  
  
  4. Network Share Clients Automatically See Grown Filesystems
&lt;/h2&gt;

&lt;p&gt;When an XFS or ext4 filesystem exported over NFSv4 is grown on the server, clients will automatically reflect the new size without requiring remounts.&lt;/p&gt;

&lt;p&gt;NFSv4 is stateful and revalidates filesystem attributes, so the updated capacity becomes visible to clients transparently.&lt;/p&gt;

&lt;p&gt;Example on the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;df&lt;/span&gt; &lt;span class="nt"&gt;-h&lt;/span&gt; /mnt/nfs_share
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🔴 Older Requirements (Why Some Admins Avoid Online Resize)
&lt;/h2&gt;

&lt;p&gt;On older kernels (before &lt;strong&gt;2.6.31&lt;/strong&gt;) and with early LVM2 or LVM1:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;pvresize&lt;/code&gt; sometimes failed to detect new sizes
&lt;/li&gt;
&lt;li&gt;Multipath resizing was unreliable
&lt;/li&gt;
&lt;li&gt;PV resizing sometimes required &lt;code&gt;pvcreate --restorefile&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;ext2/ext3 could corrupt if device size changed underneath &lt;/li&gt;
&lt;li&gt;SAN rescan tools were buggy
&lt;/li&gt;
&lt;li&gt;UNIX systems generally required unmounting for geometry changes
&lt;/li&gt;
&lt;li&gt;LVM1 had no reliable online extension
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This produced the legacy rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“Never resize storage while mounted.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Modern RHEL9 (2021+) systems:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Online PV expansion = &lt;strong&gt;safe&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Online LV expansion = &lt;strong&gt;safe&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Online XFS growth = &lt;strong&gt;officially supported&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;SAN growth = &lt;strong&gt;nondisruptive&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The entire modern stack is designed for online operation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Safe While Mounted / Under I/O?&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SAN / FC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Expand LUN&lt;/td&gt;
&lt;td&gt;✔ Yes&lt;/td&gt;
&lt;td&gt;Engineered for nondisruptive growth&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LVM2 PV&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pvresize&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✔ Yes&lt;/td&gt;
&lt;td&gt;Only metadata updated; no data block changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LVM2 LV&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lvextend&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✔ Yes&lt;/td&gt;
&lt;td&gt;Atomic metadata update; safe under I/O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Filesystems&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Online growth&lt;/td&gt;
&lt;td&gt;✔ Yes&lt;/td&gt;
&lt;td&gt;XFS/ext4 support mounted expansion&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

</description>
      <category>linux</category>
      <category>infrastructure</category>
    </item>
  </channel>
</rss>
