Author's Note:
After spending 2 years managing Linux systems in production, I've collected a set of real-world issues and battle-tested solutions.
This is not just a command dump; it's a thinking framework plus a practical toolkit to troubleshoot smarter and faster.
Let's dive into 14+ issues ranging from network outages and disk failures to memory leaks and corrupted filesystems.
Issue 1: Server Not Reachable
Objective:
Determine why a server can't be reached from the network or another system.
Step-by-Step Guide:
1. Check Network Reachability:
ping server_ip
ping server_hostname
- If both hostname & IP respond:
  - The server is reachable; the issue may lie on the client side or higher-layer services (e.g., SSH, app).
- If hostname fails but IP works:
  - Likely a DNS resolution problem.
  - Check these files:
cat /etc/hosts
cat /etc/resolv.conf
cat /etc/nsswitch.conf
  - Validate DNS server entries and name resolution order.
- If both hostname & IP fail:
  - Check whether the issue is isolated or widespread.
ping another_server_in_same_network
2. Access Server via Console (if available):
Check network interface status:
ip addr
nmcli device status
ifconfig # legacy; needs the net-tools package on newer distros
ip link show
Check if default gateway is reachable:
ip route
ping default_gateway
3. Check Security Settings:
- SELinux:
getenforce
- Firewalls:
iptables -L -n -v # -n skips DNS lookups, which can hang when name resolution is broken
firewall-cmd --list-all
ufw status
4. Physical Layer (if bare metal):
- Cable connected?
- NIC link lights blinking?
- Hardware errors in the kernel log:
dmesg
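The decision tree in step 1 can be sketched as a small shell helper. This is a minimal sketch, not a definitive implementation: `probe` and `classify_reachability` are hypothetical names, and the `-c`/`-W` flags assume the Linux iputils `ping`.

```shell
# Hypothetical helpers illustrating the step 1 decision tree.

# probe TARGET: prints "yes" if TARGET answers ping, otherwise "no".
probe() {
    # -c 2: send two echo requests; -W 2: wait up to 2 s per reply (iputils ping)
    if ping -c 2 -W 2 "$1" >/dev/null 2>&1; then echo yes; else echo no; fi
}

# classify_reachability IP_OK HOST_OK: map the two probe results to a next step.
classify_reachability() {
    case "$1/$2" in
        yes/yes) echo "reachable: look at the client side or higher-layer services" ;;
        yes/no)  echo "dns: check /etc/hosts, /etc/resolv.conf, /etc/nsswitch.conf" ;;
        no/yes)  echo "mismatch: the hostname resolves to a different address than the one tested" ;;
        *)       echo "unreachable: ping a neighbor to see whether the outage is isolated" ;;
    esac
}
```

A typical call would be `classify_reachability "$(probe server_ip)" "$(probe server_hostname)"`. Keep in mind that a firewall dropping ICMP also yields "no", so treat the result as a hint rather than proof.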
Issue 2: Cannot Connect to Website or App
Objective:
You can reach the server, but the web app or service is unreachable.
Troubleshooting Steps:
1. Ping Server:
ping app_server_ip
If it fails, follow Issue #1.
2. Check Port Reachability:
telnet server_ip 80 # For HTTP
nc -zv server_ip 443 # For HTTPS
3. If Port is Closed:
- Is the app/service running?
systemctl status nginx
service apache2 status
- Restart service:
systemctl restart nginx
4. Check App Logs:
journalctl -u nginx
tail -f /var/log/nginx/error.log
5. Confirm Listening Ports:
ss -tulnp
netstat -tulnp
6. Check Firewall/SELinux:
As shown in Issue #1
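Step 5 can be scripted so that a check for one port does not depend on reading the table by eye. The helper below is a sketch under stated assumptions: `is_listening` is a hypothetical name, and it expects the column layout printed by `ss -tuln` (it also tolerates `ss -tln`, where the local address sits one column earlier).

```shell
# is_listening PORT: reads `ss -tuln` output on stdin and succeeds (exit 0)
# if any socket has a local address ending in :PORT.
is_listening() {
    awk -v port="$1" '
        BEGIN { re = ":" port "$" }                       # e.g. ":443$"
        NR > 1 && ($4 ~ re || $5 ~ re) { found = 1 }      # NR > 1 skips the header row
        END { exit !found }
    '
}
```

For example, `ss -tuln | is_listening 443 || echo "nothing is bound to 443"`. If the port is bound only to 127.0.0.1, the service is running but not reachable from other hosts.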
Issue 3: SSH Fails (Root or User Login)
Objective:
Fix issues preventing SSH login for root or users.
Checklist:
1. Network Connectivity:
ping server_ip
If ping fails, go to Issue #1.
2. Check SSH Port:
nc -zv server_ip 22
3. If Port is Open:
- Check SSH daemon:
systemctl status sshd
systemctl restart sshd
- Check SSH Config:
grep -i PermitRootLogin /etc/ssh/sshd_config
- Verify user shell:
grep '^username:' /etc/passwd
Shell must not be /sbin/nologin or /bin/false.
- Logs:
tail -f /var/log/secure # RHEL/CentOS/Fedora
tail -f /var/log/auth.log # Debian/Ubuntu
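The login-shell check can be automated with a few lines of shell. This is only a sketch: `shell_allows_login` is a hypothetical helper, and it takes one passwd-format line (as printed by `getent passwd username` or the `grep` on /etc/passwd) as its argument.

```shell
# shell_allows_login PASSWD_LINE: succeeds unless the last field (the login
# shell) is a nologin or false binary.
shell_allows_login() {
    local shell="${1##*:}"          # everything after the last colon
    case "$shell" in
        */nologin|*/false) return 1 ;;
        *)                 return 0 ;;
    esac
}
```

Usage might look like `shell_allows_login "$(getent passwd username)" || echo "login shell blocks SSH"`.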
Issue 4: Disk Full or Add Storage
Symptoms:
- Apps crash, logs fail to write, or server feels sluggish.
Steps:
1. Check Usage:
df -h
2. Find Large Files:
du -sh /* 2>/dev/null | sort -h # largest entries last
du -sh /var/* 2>/dev/null | sort -h
3. Cleanup Suggestions:
- Log rotation:
logrotate -f /etc/logrotate.conf
- Delete unused logs or archives
- Move files to other mounts
4. Disk Health:
badblocks -v /dev/sda # read-only scan by default; slow on large disks
5. Monitor I/O:
iostat -xz 1
iotop
dstat
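The usage check in step 1 is easy to turn into an alert. The sketch below assumes the POSIX output format of `df -P` (one line per filesystem, capacity in the fifth column) and mount points without spaces; `over_threshold` is a hypothetical name.

```shell
# over_threshold LIMIT: reads `df -P` output on stdin and prints every mount
# point whose usage is at or above LIMIT percent.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                              # skip the header row
            use = $5
            sub(/%$/, "", use)                # "91%" -> "91"
            if (use + 0 >= limit + 0) print $6, use "%"
        }
    '
}
```

Running `df -P | over_threshold 90` from cron or a monitoring agent lists only the filesystems that need attention.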
Issue 5: Filesystem Corrupted
Symptoms:
System won't boot or throws mount errors.
Fix:
- Boot with Live CD or ISO into rescue mode.
- Enter the installed system (RHEL-family rescue media mount it under /mnt/sysimage):
chroot /mnt/sysimage
- Check Logs:
dmesg | grep -i error
tail /var/log/messages
- Run fsck on the affected filesystem while it is unmounted:
fsck /dev/sdX1
Issue 6: /etc/fstab Missing or Invalid
Recovery:
- Boot into rescue mode.
- Enter the installed system:
chroot /mnt/sysimage
- View device UUIDs:
blkid
- Rebuild /etc/fstab. Example:
UUID=xxxxx / ext4 defaults 0 1
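Before rebooting on a rebuilt /etc/fstab, a quick structural check can catch truncated lines. The helper below is a rough sketch, not a full validator: `fstab_lint` is a hypothetical name, and it only verifies that each entry has between four and six whitespace-separated fields (the last two are optional).

```shell
# fstab_lint FILE: report entries with an implausible field count.
# Exits non-zero if any such line is found.
fstab_lint() {
    awk '
        /^[ \t]*(#|$)/ { next }               # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4-6 fields, got %d\n", NR, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Run it as `fstab_lint /etc/fstab`. On systems with a recent util-linux, `findmnt --verify` performs a more thorough check, and `mount -a` confirms that the entries actually mount.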
Issue 7: Cannot cd to Directory (Even as Root)
Common Causes:
- Path doesn't exist:
ls -ld /path/to/dir
- Missing execute (search) bit, which blocks non-root users (root normally bypasses permission bits):
chmod +x /dir
- Ownership issues:
chown user:group /dir
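When the failing path is deep, the problem is often in a parent directory rather than the final one. The following sketch walks an absolute path one component at a time and reports the first that cannot be traversed. `first_blocked_component` is a hypothetical helper; the `-x` test reflects the permissions of the user running it.

```shell
# first_blocked_component /abs/path: print the first path component that is
# missing, not a directory, or not searchable by the current user.
first_blocked_component() {
    local rest="${1#/}" current="" part
    while [ -n "$rest" ]; do
        part="${rest%%/*}"                     # text before the first slash
        case "$rest" in
            */*) rest="${rest#*/}" ;;          # drop that component and its slash
            *)   rest="" ;;
        esac
        if [ -z "$part" ]; then continue; fi   # tolerate doubled slashes
        current="$current/$part"
        if [ ! -e "$current" ]; then echo "$current: does not exist"; return 1; fi
        if [ ! -d "$current" ]; then echo "$current: not a directory"; return 1; fi
        if [ ! -x "$current" ]; then echo "$current: no execute (search) permission"; return 1; fi
    done
    echo "$1: traversable"
}
```

For example, `first_blocked_component /path/to/dir`. The `namei -l /path/to/dir` command from util-linux prints the owner and mode of every component if you prefer to inspect them yourself.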
Issue 8: Cannot Create Symlink or Hard Link
Reasons:
- Target doesn't exist (a hard link needs an existing file; a symlink is still created but dangles)
- You're linking across filesystems (hard links require same device)
- Permission issues
Examples:
ln -s /actual/path /shortcut/path
ln /file1 /file2 # Hard link
Issue 9: Server Running Out of Memory
Check Usage:
free -h
top
htop
ps aux --sort=-%mem
Analyze /proc/meminfo:
grep -i active /proc/meminfo
Actions:
- Kill high memory consumers
- Adjust priority (this lowers CPU scheduling priority only; it does not free memory):
renice -n 10 -p <pid>
- Add or extend swap (see next)
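For a single number to watch or alert on, the `MemAvailable` field in /proc/meminfo (present on kernels 3.14 and later) is usually more useful than "free" memory. A minimal sketch, with `mem_available_pct` as a hypothetical helper name:

```shell
# mem_available_pct [FILE]: print MemAvailable as a whole-number percentage
# of MemTotal. FILE defaults to /proc/meminfo.
mem_available_pct() {
    awk '
        /^MemTotal:/     { total = $2 }        # values are in kB
        /^MemAvailable:/ { avail = $2 }
        END {
            if (total > 0) printf "%d\n", avail * 100 / total
            else exit 1                        # no MemTotal line found
        }
    ' "${1:-/proc/meminfo}"
}
```

A check such as `[ "$(mem_available_pct)" -lt 10 ] && echo "low memory"` can then drive the actions listed above.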
Issue 10: Add or Extend Swap Space
Add Swap File:
dd if=/dev/zero of=/swapfile bs=1M count=4096 # 4 GiB; a small block size avoids a 1 GiB buffer on a low-memory host
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
Persistent Config:
echo '/swapfile none swap sw 0 0' >> /etc/fstab
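Running the `echo ... >> /etc/fstab` line twice leaves a duplicate entry. One way to make the step safe to repeat is to append only when no matching swap entry exists. This is a sketch with a hypothetical helper name, `add_swap_entry`; the second argument exists so it can be tried against a copy of fstab first.

```shell
# add_swap_entry SWAPFILE [FSTAB]: append a swap entry unless one for
# SWAPFILE is already present. FSTAB defaults to /etc/fstab.
add_swap_entry() {
    local swapfile="$1" fstab="${2:-/etc/fstab}"
    if awk -v p="$swapfile" '$1 == p && $3 == "swap" { found = 1 } END { exit !found }' "$fstab"; then
        echo "already present"
    else
        printf '%s none swap sw 0 0\n' "$swapfile" >> "$fstab"
        echo "added"
    fi
}
```

After `add_swap_entry /swapfile`, `swapon --show` confirms that the swap file is active.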
Issue 11: Can't Run Certain Commands
Check:
- Is command installed?
which command
- In $PATH?
echo $PATH
- File executable?
ls -l /usr/bin/command
chmod +x /usr/bin/command
- Shared libraries:
ldd /usr/bin/command
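The checks above can be chained into one helper. This is a sketch, assuming a POSIX-style shell with `local`; `diagnose_cmd` is a hypothetical name. It uses the shell's own `command -v` lookup instead of `which`, then scans `ldd` output for unresolved libraries.

```shell
# diagnose_cmd NAME: report whether NAME resolves, where it lives, and
# whether any shared library it needs is missing.
diagnose_cmd() {
    local name="$1" path
    if ! path=$(command -v "$name"); then
        echo "$name: not found in PATH"
        return 1
    fi
    case "$path" in
        /*) ;;                                              # a real file on disk
        *)  echo "$name: shell builtin, function, or alias"; return 0 ;;
    esac
    echo "$name: found at $path"
    # ldd prints "libfoo.so => not found" for each unresolved dependency
    if command -v ldd >/dev/null 2>&1 && ldd "$path" 2>/dev/null | grep -q 'not found'; then
        echo "$name: missing shared libraries:"
        ldd "$path" 2>/dev/null | grep 'not found'
        return 1
    fi
}
```

Call it as `diagnose_cmd command_name`. Avoid running `ldd` on binaries you do not trust, since some implementations execute the target to resolve its dependencies.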
Issue 12: Unexpected Reboot or Crashes
Root Causes:
- Overheating
- Kernel panic
- Power/hardware issue
- Out Of Memory (OOM) killer
Logs:
journalctl -xe
dmesg | less
uptime
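To narrow down which root cause applies, the kernel log can be scanned for a few well-known phrases. The sketch below counts matching lines, not distinct events, and the patterns are a starting set rather than an exhaustive list; `crash_hints` is a hypothetical name.

```shell
# crash_hints: read kernel log text on stdin and count lines that point to
# the OOM killer, a kernel panic, or thermal throttling.
crash_hints() {
    awk '
        { line = tolower($0) }
        line ~ /out of memory|oom-killer/    { oom++ }
        line ~ /kernel panic/                { panic++ }
        line ~ /temperature above threshold/ { thermal++ }
        END { printf "oom=%d panic=%d thermal=%d\n", oom, panic, thermal }
    '
}
```

For the boot that crashed, `journalctl -k -b -1 | crash_hints` reads the previous boot's kernel messages, which requires a persistent journal. An abrupt power loss usually leaves no log lines at all, so all-zero counts plus a missing shutdown record point toward power or hardware.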
Issue 13: Server Has No IP Address
Checklist:
- View interfaces:
ip addr
- NIC status:
nmcli device
- Restart networking:
systemctl restart NetworkManager
Issue 14: Backup & Restore File Permissions
Backup Permissions:
getfacl -R /var/www > www.acl
Restore:
cd / # getfacl strips the leading "/" from paths, so restore relative to the root directory
setfacl --restore=/path/to/www.acl
Pro Tip: Take VM snapshots before major permission changes!
Bonus: Useful Disk Partitioning Tips
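- Rescan an existing disk after it has been resized:
echo 1 > /sys/block/sdX/device/rescan
- Detect a newly attached disk (hostN is the SCSI host adapter):
echo "- - -" > /sys/class/scsi_host/hostN/scan
- Extend LVM: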
pvcreate /dev/sdX
vgextend my_vg /dev/sdX
lvextend -l +100%FREE /dev/my_vg/my_lv
resize2fs /dev/my_vg/my_lv # ext2/3/4 only; for XFS run xfs_growfs on the mount point
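The last step depends on the filesystem type, because `resize2fs` handles only ext2/3/4. A small dispatcher makes that choice explicit. This is a sketch: `grow_fs_command` is a hypothetical helper that prints the command to run instead of running it.

```shell
# grow_fs_command FSTYPE LV MOUNTPOINT: print the grow command that matches
# the filesystem type; fail for types this sketch does not cover.
grow_fs_command() {
    local fstype="$1" lv="$2" mountpoint="$3"
    case "$fstype" in
        ext2|ext3|ext4) echo "resize2fs $lv" ;;
        xfs)            echo "xfs_growfs $mountpoint" ;;
        *)              echo "unsupported filesystem: $fstype" >&2; return 1 ;;
    esac
}
```

For example, `grow_fs_command "$(findmnt -n -o FSTYPE /data)" /dev/my_vg/my_lv /data`, where /data is an assumed mount point. Alternatively, `lvextend -r` asks LVM to resize the filesystem in the same step.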
Final Thoughts
Every Linux problem has a root cause. Don't just reboot, investigate!
Use this guide like a runbook: step-by-step, methodical, and confident.
Keep calm, troubleshoot smart, and script your way to stability.