DEV Community

alok-38
Getting acquainted with basic Linux commands commonly used for debugging slow servers

Getting Started with Linux for DevOps: Monitoring CPU, Memory, and More

I keep hearing that Linux remains the dominant operating system for servers, containers, cloud instances, Kubernetes nodes, CI/CD runners, monitoring agents, and virtually every production environment that DevOps engineers touch in 2026 and beyond. Because of this, strong practical Linux skills are expected in almost every serious DevOps, SRE, or Platform Engineering role.

In this post, we won't tackle complex production issues just yet, such as a containerized application not responding or nginx underperforming. Instead, we'll focus on the preliminary checks you would run first in a realistic troubleshooting scenario: system uptime, load average, CPU usage, and memory usage.


Prerequisites

  • A Linux distribution (examples here use Rocky Linux on VirtualBox)

  • SSH access to the virtual machine


1. Installing the required tools

sudo dnf install -y sysstat strace perf iotop
  • sysstat → provides sar (System Activity Reporter), a tool that collects, reports, and saves system performance metrics over time. The same package also ships iostat, which is used later in this post.

Live example of sar in action: CPU usage sampled at 1-second intervals, 5 times

Let's break it down

  • %user → CPU used by user processes (your applications) → very low (0%)

  • %nice → CPU used by nice/low-priority processes → 0%

  • %system → CPU used by kernel/system processes → small (0.5–5%)

  • %iowait → time the CPU sat idle while disk I/O was still outstanding → 0% → disk is not a bottleneck

  • %steal → CPU time taken by hypervisor for other VMs → 0%

  • %idle → CPU idle → very high (94–99%) → CPU is mostly free
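The report broken down above is the kind produced by a command such as sar -u 1 5 (an assumption on my part, since the exact command is not shown in the text). As a minimal sketch, the helper below pulls the average %idle out of that report. The function name sar_avg_idle is hypothetical, and it assumes the default sar -u layout, where %idle is the last column of the "Average:" line.

```shell
# Print the average %idle from `sar -u` output read on stdin.
# Assumption: default `sar -u` layout, where the summary line starts with
# "Average:", the second field is "all", and %idle is the last column.
sar_avg_idle() {
    awk '$1 == "Average:" && $2 == "all" { print $NF }'
}

# Usage (requires the sysstat package; LC_ALL=C keeps the labels in English):
#   LC_ALL=C sar -u 1 5 | sar_avg_idle
```

A value close to 100 means the CPU was mostly free during the sampling window.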


2. Check System Load Safely

uptime
top -b -n 1
vmstat 1 5
iostat -xz 1 5
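  • top -b -n 1 → -b runs top in batch mode, which prints plain text instead of taking over the terminal; -n 1 stops after one iteration.
  • vmstat 1 5 and iostat -xz 1 5 → five reports at 1-second intervals. The first report from each shows averages since boot, so the later ones reflect current activity.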
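Because the first vmstat row reports averages since boot rather than live activity, it can be handy to drop it when looking for a current problem. This is a minimal sketch: the function name drop_since_boot_row is hypothetical, and it assumes the default vmstat layout of two header lines followed by one row per sample.

```shell
# Drop the first data row (averages since boot) from vmstat output on stdin.
# Assumption: default vmstat layout, i.e. two header lines, then one row
# per sample, so the since-boot row is line 3.
drop_since_boot_row() {
    awk 'NR != 3'
}

# Usage:
#   vmstat 1 5 | drop_since_boot_row
```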

Let's view the output of two commands at a time

Let's carefully decode my uptime output

10:26:36 up 57 min,  2 users,  load average: 0.08, 0.02, 0.01
  • 10:26:36 → the current system time when the command was run.
  • up 57 min → the system has been running continuously for 57 minutes.
  • 2 users → there are 2 active sessions currently logged into the system.
  • load average: 0.08, 0.02, 0.01 → three numbers representing the average system load over the last 1, 5, and 15 minutes (I'm ignoring the details for now)

Decoding the top command: top -b -n 1

The top command shows a live, interactive view of system processes, CPU, memory, swap, and load averages. I won't try to learn the entire output. I will ignore the table for now.

Here we run top in batch mode (the -b switch) with just one iteration, as set by the -n 1 switch.

Let's start with the header line

top - 10:27:26 up 58 min,  2 users,  load average: 0.03, 0.01, 0.00
  • 10:27:26 → current system time.

  • up 58 min → the system has been running continuously for 58 minutes.

  • 2 users → 2 logged-in users/sessions.

  • load average: 0.03, 0.01, 0.00 → the system load averages over 1, 5, and 15 minutes.

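    • Very low numbers → almost nothing is waiting to run, so the CPU is mostly idle.
    • Rule of thumb: a load average up to the number of CPUs (2 on this VM) means the system is keeping up. Here, 0.03 is negligible.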
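The rule of thumb above can be sketched as a small check that compares a load value with a CPU count. The function name load_status and the "ok"/"high" labels are hypothetical choices for illustration, not a standard tool.

```shell
# Compare a load average with the number of CPUs.
# Prints "ok" when load <= CPU count, otherwise "high".
# This is only the rule of thumb from the text, not a hard threshold.
load_status() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (load + 0 <= cpus + 0) print "ok"; else print "high"
    }'
}

# With the values from the top header on this VM:
load_status 0.03 2

# On a live Linux system (assumes /proc/loadavg and nproc are available):
#   load_status "$(cut -d ' ' -f1 /proc/loadavg)" "$(nproc)"
```

On Linux the load average also counts tasks blocked in uninterruptible sleep (typically disk I/O), so a "high" result is a cue to look at both CPU and I/O.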

2. Tasks

Tasks: 123 total,   1 running, 122 sleeping,   0 stopped,   0 zombie

System is healthy: only one task is running, the rest are sleeping (waiting for work), and there are no stopped or zombie processes.
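To see where a Tasks summary like this comes from, a rough equivalent can be built from ps. This is a sketch under assumptions: the function name count_states is hypothetical, and the mapping to top's categories is only approximate (top groups several states under "sleeping").

```shell
# Count processes per state from `ps -eo stat=` output read on stdin.
# The first character of the STAT column is the process state, e.g.
# R running, S sleeping, D uninterruptible sleep, T stopped, Z zombie.
count_states() {
    awk '{ n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage:
#   ps -eo stat= | count_states
```

A non-zero Z count that keeps growing usually points at a parent process that is not reaping its children.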


3. CPU Usage

%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 95.2 id,  0.0 wa,  4.8 hi,  0.0 si,  0.0 st

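CPU is almost completely idle (95.2 id); the only non-idle time in this sample is the 4.8% spent servicing hardware interrupts (hi). No performance pressure.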
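A quick way to turn this line into a single "how busy is the CPU" number is to subtract the idle value from 100. The sketch below does that for the line format shown above. The function name cpu_busy is hypothetical, and it assumes top's default English layout with a dot as the decimal separator.

```shell
# Print total non-idle CPU percentage (100 - id) from top's "%Cpu(s):" line
# read on stdin. Assumption: fields look like " 95.2 id" separated by commas.
cpu_busy() {
    awk -F'[:,]' '/^%Cpu/ {
        for (i = 2; i <= NF; i++) {
            split($i, part, " ")          # part[1] = value, part[2] = label
            if (part[2] == "id") printf "%.1f\n", 100 - part[1]
        }
    }'
}

# Usage:
#   LC_ALL=C top -b -n 1 | cpu_busy
```

For the line above this gives 4.8, which matches the hi value because every other non-idle field is 0.0.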


4. Memory Usage

MiB Mem :   3653.4 total,   2960.2 free,    439.7 used,    470.5 buff/cache

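Memory is abundant: about 2,960 MiB of 3,653 MiB is free, and most of the 470.5 MiB of buff/cache can be reclaimed if applications need it. The system is far from memory pressure!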
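The "free" figure alone can look alarming on a busy server because Linux uses spare memory for cache. The kernel's own estimate of usable memory is the MemAvailable field in /proc/meminfo. The sketch below reports it as a percentage of total memory; the function name mem_available_pct is hypothetical.

```shell
# Print MemAvailable as a whole-number percentage of MemTotal.
# Reads /proc/meminfo-style text on stdin (values are in kB).
mem_available_pct() {
    awk '$1 == "MemTotal:"     { total = $2 }
         $1 == "MemAvailable:" { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# Usage on a live Linux system:
#   mem_available_pct < /proc/meminfo
```

free -m shows the same estimate in its "available" column.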

5. Swap Usage

I’m still clarifying my understanding of swap and will write a follow-up post once I’ve learned more.

That wraps things up for now—more to come soon.
