The mental checklist I use when troubleshooting Linux servers

#linux #beginners #tutorial #productivity

When a Linux server broke or something didn't go to plan, I used to panic and jump between random commands. Over time, I realized that almost every issue fits the same pattern. This is the mental checklist I now fall back on, finally written down.

Step 1: What is broken?

Service not running?
Server unreachable?
Performance issue?
Permission issue?
Always define the failure first.

Step 2: Is the system alive?

Can I SSH in?
Is the server responsive?
Is the disk full?
Is RAM exhausted?
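
The disk and RAM questions can be answered with a few lines of Python. This is a minimal sketch, not a monitoring tool: it assumes a Linux host where `/proc/meminfo` exposes `MemAvailable` (kernel 3.14 or newer), and the function names are my own. The disk figure is close to, but not identical to, the `Use%` column of `df`, because `df` excludes blocks reserved for root.

```python
import shutil


def disk_used_percent(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo-filesystems can report a zero size
    return 100.0 * usage.used / usage.total


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_available_percent(meminfo_text):
    """Percentage of RAM the kernel estimates is still available."""
    info = parse_meminfo(meminfo_text)
    return 100.0 * info["MemAvailable"] / info["MemTotal"]
```

On a live server I would call `disk_used_percent("/")` and pass the contents of `/proc/meminfo` to `memory_available_percent`. A disk near 100% or available memory near 0% is often the whole answer.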

Step 3: Is the service running?

Is the process running?
Did it fail to start?
Did it crash?
In my experience, this step alone explains roughly half of the issues.
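
To check whether a process is running at all, one option is to scan `/proc` directly. The sketch below assumes Linux and matches on the `comm` file, which the kernel truncates to 15 characters, so long process names need the truncated form. The `find_processes` name and its `proc_root` parameter are hypothetical, added so the function can be pointed at a test directory.

```python
import os


def find_processes(name, proc_root="/proc"):
    """Return the PIDs whose command name (the `comm` file) equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return sorted(pids)
```

An empty list from `find_processes("nginx")` only says the process is not there. Whether it failed to start or crashed later is a question for the service manager (for example `systemctl status` on systemd hosts) and for the logs.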

Step 4: Check logs

Logs usually tell you:

Why it failed
What it tried to do
What it couldn't access
Learn to scan logs, not read every line
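
Scanning instead of reading can be as simple as filtering for a handful of keywords and keeping only the most recent hits. This is a rough sketch: the keyword list is my own assumption and should be adjusted to what your services actually log.

```python
import re

# Assumed keyword list; tune it to the services you run.
PATTERN = re.compile(r"error|fail|denied|fatal|refused|timed? ?out", re.IGNORECASE)


def scan_log(lines, pattern=PATTERN, last=20):
    """Return up to the `last` most recent lines that match `pattern`."""
    hits = [line.rstrip("\n") for line in lines if pattern.search(line)]
    return hits[-last:] if last > 0 else []
```

`lines` can be any iterable of strings, such as an open log file. The log path varies by distribution, and on systemd hosts the output of `journalctl` can be piped in instead.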

Step 5: What changed last?

Most issues come from:

Updates
Config edits
Permission changes
New files
Always ask: what changed?
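
One way to answer "what changed?" for config edits and new files is to list everything under a directory that was modified recently, roughly what `find <dir> -mtime -1` does. A minimal sketch with hypothetical names follows. Note that it looks at modification time only, so a pure permission change (which updates `ctime`, not `mtime`) will not show up.

```python
import os
import time


def recently_changed(root, hours=24, now=None):
    """Return (mtime, path) pairs for files under `root` modified in the last `hours`."""
    now = time.time() if now is None else now
    cutoff = now - hours * 3600
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime >= cutoff:
                changed.append((mtime, path))
    return sorted(changed, reverse=True)  # newest first
```

Running `recently_changed("/etc")` right after something breaks is a quick way to spot a config file that was touched an hour ago. Package updates are better checked through the package manager's own history.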

Step 6: Narrow scope

Is it one user or all users?
One service or the whole system?
One port or all networking?
This prevents panic
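
The "one port or all networking?" question can be tested by trying a TCP connection to several ports on the same host. The sketch below is an illustration with names of my own choosing. It only tells you whether something accepts connections, not whether the service behind the port is healthy.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, unreachable, or name lookup failed


def check_ports(host, ports, timeout=2.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port, timeout) for port in ports}
```

If `check_ports(host, [22, 80, 443])` shows SSH open but the web ports closed, the problem is likely the web service or a firewall rule, not networking as a whole.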

Step 7: Test ONE thing at a time

Make a small change
Restart service
Observe
Never shotgun-fix

Step 8: Confirm + document

Is it fixed?
Why?
What would I do faster next time?
That's real troubleshooting
