
Qimin Zhao


How to investigate suspicious SSH logins without giving AI a shell

A lot of Linux incident response starts with a login question, not a malware sample.

Someone sees a spike of failed SSH attempts. A root login appears in the wrong time window. A service account logs in from an address nobody recognizes. A helpdesk ticket says "the server looks weird" and the only concrete clue is a username or IP address.

At that point, the useful question is not "is this host compromised?" It is more boring and more important:

  • Did anyone actually authenticate?
  • Which account was involved?
  • Was the access through a password, a key, sudo, su, or a scheduled task?
  • Was the same IP seen in web logs, current sockets, process context, or command history?
  • Did persistence, services, packages, or recent files change near the same time?
  • Can another responder review exactly what evidence was collected?

That last point matters. If you let an AI assistant freely run shell commands during the first pass, you can get speed, but you also create a new risk: the model may over-collect, mutate the host, or produce a confident answer that nobody can audit later.

For a login anomaly, I prefer a read-only evidence loop.

A practical first pass

Start with the narrow clue if you have one.

If the alert names a user:

oi login --user root -s 7d

If the alert names an IP address:

oi login --ip 203.0.113.44 -s 7d

If the alert is vague, start wider:

oi login -s 7d
oi scan -s 7d

The goal of the first pass is not to prove every detail. The goal is to build a timeline that a human responder can challenge.

For a suspicious SSH login, I want the initial report to answer five things.

1. Authentication pattern

Look for the difference between noise and access.

A server can receive thousands of failed SSH attempts from the internet. That is useful background, but it is not the same as a successful session. The first split should be:

  • failed attempts only
  • successful login after many failures
  • accepted key from an unusual source
  • login by an account that normally should not be interactive
  • root login where root SSH should be disabled

A good report should show the exact log lines or normalized evidence behind that assessment, not just say "suspicious login observed."
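To make the split concrete, here is a minimal Python sketch that groups sshd events by source IP and labels each source while keeping the raw lines as evidence. It is not how Open Investigator does it. The sample lines, the regular expression, the `classify_sources` name, and the threshold of three failures are all assumptions for illustration; real sshd log formats vary by distribution and version.

```python
import re
from collections import defaultdict

# Illustrative lines in the common OpenSSH syslog style (assumed format).
SAMPLE_LOG = """\
Jan 10 03:12:01 web1 sshd[2201]: Failed password for root from 203.0.113.44 port 50122 ssh2
Jan 10 03:12:04 web1 sshd[2203]: Failed password for invalid user admin from 203.0.113.44 port 50130 ssh2
Jan 10 03:12:09 web1 sshd[2207]: Failed password for root from 203.0.113.44 port 50144 ssh2
Jan 10 03:12:15 web1 sshd[2210]: Accepted password for root from 203.0.113.44 port 50160 ssh2
Jan 10 08:30:00 web1 sshd[3100]: Accepted publickey for deploy from 198.51.100.7 port 40022 ssh2
Jan 10 09:00:00 web1 sshd[3200]: Failed password for root from 192.0.2.99 port 41000 ssh2
"""

EVENT_RE = re.compile(
    r"sshd\[\d+\]: (?P<result>Failed|Accepted) (?P<method>\S+) for "
    r"(?:invalid user )?(?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def classify_sources(log_text, failure_threshold=3):
    """Group auth events by source IP and label each source's pattern."""
    per_ip = defaultdict(lambda: {"failed": 0, "accepted": [], "lines": []})
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if not match:
            continue
        entry = per_ip[match["ip"]]
        entry["lines"].append(line)  # keep the exact log line as evidence
        if match["result"] == "Failed":
            entry["failed"] += 1
        else:
            # Record how many failures from this IP preceded the success.
            entry["accepted"].append(
                (match["user"], match["method"], entry["failed"])
            )

    report = {}
    for ip, entry in per_ip.items():
        if not entry["accepted"]:
            label = "failed_only"
        elif any(prior >= failure_threshold for _, _, prior in entry["accepted"]):
            label = "success_after_failures"
        else:
            label = "accepted"
        report[ip] = {"label": label, **entry}
    return report
```

Calling `classify_sources(SAMPLE_LOG)` returns one entry per source IP, each with a label and the log lines that justify it, so a reviewer can check the label against the evidence.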

2. Account context

The username is only one part of the story.

For the involved account, collect read-only account facts:

  • UID and groups
  • shell
  • home directory
  • recent password-change or account metadata, if available
  • sudo-related context
  • whether the account looks like a human, service, or automation identity

This helps separate a noisy brute-force event from a potentially meaningful access event.
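As a rough illustration, the sketch below parses one line in the standard seven-field passwd format and guesses what kind of identity it is. The `AccountFacts` type, the `kind` labels, the list of non-interactive shells, and the UID cutoff of 1000 are assumptions; the cutoff is a common default on many distributions, not a rule, so treat the result as a hint to verify.

```python
from dataclasses import dataclass

# Shells that usually indicate a non-interactive account (assumed list).
NON_INTERACTIVE_SHELLS = {"/usr/sbin/nologin", "/sbin/nologin", "/bin/false"}


@dataclass
class AccountFacts:
    name: str
    uid: int
    gid: int
    home: str
    shell: str
    kind: str  # hypothetical label, a hint for the responder only


def parse_passwd_line(line, min_human_uid=1000):
    """Parse a passwd-format line and attach a rough account-type guess."""
    name, _password, uid, gid, _gecos, home, shell = line.strip().split(":")
    uid, gid = int(uid), int(gid)
    if uid == 0:
        kind = "root"
    elif shell in NON_INTERACTIVE_SHELLS:
        kind = "service"
    elif uid >= min_human_uid:
        kind = "human_or_automation"
    else:
        # Low UID with a login shell: worth a closer look after a login alert.
        kind = "system_with_shell"
    return AccountFacts(name, uid, gid, home, shell, kind)
```

For example, `parse_passwd_line("www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin")` yields an account labelled `service`, which makes an interactive login under that name stand out.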

3. Session and process context

If the login was successful, ask what was active around the same time.

Useful read-only checks include:

  • current sessions
  • process snapshot
  • parent/child process hints
  • network sockets
  • command history if available and appropriate
  • recent files in likely working directories

None of these alone proves compromise. Together, they can show whether the login became a shell, touched a web root, started a process, connected outbound, or did nothing observable.
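One way to put these checks side by side is to keep only the observations that fall in a window around the login and sort them into a timeline. The sketch below assumes each observation is a dictionary with a `time` field; the field names, the sample data, and the default window of five minutes before to one hour after are illustrative choices, not values from any tool.

```python
from datetime import datetime, timedelta


def observations_near_login(observations, login_time,
                            before=timedelta(minutes=5),
                            after=timedelta(hours=1)):
    """Return observations inside the window around login_time, oldest first."""
    start, end = login_time - before, login_time + after
    nearby = [obs for obs in observations if start <= obs["time"] <= end]
    return sorted(nearby, key=lambda obs: obs["time"])


# Illustrative data only; timestamps and details are made up.
LOGIN_TIME = datetime(2024, 1, 10, 3, 12, 15)
OBSERVATIONS = [
    {"time": datetime(2024, 1, 10, 3, 40, 0), "kind": "socket",
     "detail": "outbound connection to 203.0.113.44:443"},
    {"time": datetime(2024, 1, 10, 3, 13, 2), "kind": "process",
     "detail": "bash started under sshd"},
    {"time": datetime(2024, 1, 9, 22, 0, 0), "kind": "file",
     "detail": "file modified under the web root"},
]
```

With this data, `observations_near_login(OBSERVATIONS, LOGIN_TIME)` keeps the process and socket entries in time order and drops the file change from the previous evening. An empty result is also a finding worth recording.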

4. Persistence and service changes

Login anomalies often matter because of what follows.

For the same time window, check persistence surfaces:

  • cron and timer entries
  • systemd services
  • shell startup files
  • recently modified files
  • unusual listeners
  • package changes
  • container changes if the host runs containers

The first pass should be explicit about gaps. For example: "auth logs show a successful login, but no matching persistence change was found in the collected window" is much more useful than a dramatic conclusion.
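For the recently modified files check, a read-only pass can be as small as the following sketch. It walks a directory and reports files whose modification time falls inside the window, using `os.lstat` so symlinks are not followed. The function name and the epoch-seconds arguments are assumptions. Two caveats: modification times can be forged by an attacker, and listing directories may still update directory access times on some mounts.

```python
import os


def files_modified_between(root, start_ts, end_ts):
    """Return (path, mtime) pairs under root with start_ts <= mtime <= end_ts."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: inspect the link, not its target
            except OSError:
                # Unreadable or vanished file; a real tool should record the gap.
                continue
            if start_ts <= info.st_mtime <= end_ts:
                hits.append((path, info.st_mtime))
    # Oldest first, so the output reads as a timeline.
    return sorted(hits, key=lambda item: item[1])
```

An empty list here supports a statement like "no file changes found under this path in the collected window", which is narrower and more honest than "no persistence found".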

5. Reviewable output

The output should not be an opaque chat answer.

For a real handoff, I want a case folder that contains:

  • evidence.jsonl for normalized observations
  • commands.log for what was collected
  • report.json for structured findings
  • report.md for the human-readable narrative

This lets a second responder inspect the evidence, rerun missing checks, disagree with the conclusion, or attach the report to a ticket.
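To show what a normalized observation might look like, here is a minimal sketch that builds one record and appends it as a line of JSON. The field names are hypothetical and are not Open Investigator's actual schema. The idea is only that each record carries the collection command, the raw output, and a SHA-256 digest so a reviewer can detect later edits.

```python
import hashlib
import json


def evidence_record(source, command, raw_text, collected_at):
    """Build one observation; all field names here are illustrative."""
    return {
        "collected_at": collected_at,  # ISO 8601 string supplied by the caller
        "source": source,
        "command": command,
        "raw": raw_text,
        "sha256": hashlib.sha256(raw_text.encode("utf-8")).hexdigest(),
    }


def append_jsonl(stream, record):
    """Write the record as a single line of JSON to an open text stream."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")
```

A caller would open a file such as `evidence.jsonl` in append mode and pass it as `stream`. Because every line is standalone JSON, a second responder can filter or recheck digests with ordinary tools.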

Why the AI boundary matters

An AI assistant is useful for correlation. It can connect login times, accounts, IPs, files, services, and network state faster than a tired human staring at separate terminals.

But during first-pass incident response, the assistant should not have authority to remediate by default.

It should not:

  • kill processes
  • delete files
  • disable accounts
  • restart services
  • block IPs
  • change firewall rules
  • install packages
  • upload or download tools

Those actions may be appropriate later, but they belong in an approved response step, not in evidence collection.

The safer pattern is: collect bounded evidence, write an auditable report, then let the responder decide what to do next.
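One way to enforce that boundary is to let the assistant pick from named tools with fixed arguments instead of composing command lines. The sketch below is an assumption about how such a gate could look, not Open Investigator's implementation; the tool names and argument lists are illustrative. Fixing the full argument list matters because some otherwise read-only commands also have flags that change state.

```python
# Hypothetical sealed tools: each name maps to a fixed, read-only argv.
SEALED_TOOLS = {
    "sessions": ["who", "-a"],
    "recent_logins": ["last", "-n", "50"],
    "processes": ["ps", "-eo", "pid,ppid,user,lstart,args"],
    "sockets": ["ss", "-tunap"],
}


def resolve_tool(name):
    """Return the fixed argv for a sealed tool, or refuse anything else."""
    if name not in SEALED_TOOLS:
        raise PermissionError(f"tool not allowed: {name!r}")
    # Return a copy so callers cannot alter the sealed definition.
    return list(SEALED_TOOLS[name])
```

The caller, not the model, would run the returned argv and log it. A request for anything outside the table, such as a hypothetical `kill_process` tool, fails with `PermissionError` before any command is built.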

Where Open Investigator fits

I maintain Open Investigator at Arvanta Cyber. It is an Apache-2.0-licensed, local AI incident investigation CLI for Linux and Windows hosts. The project is built around this read-only first-pass model: the AI gets sealed investigation tools, not arbitrary remediation authority.

For login anomalies, WebShell clues, suspicious IPs, Java service issues, and vague server alerts, the point is to turn a loose signal into a reviewable case folder.

Project:
https://github.com/SEc-123/open-investigator

Overview:
https://www.arvantacyber.com/open-investigator/

Related read-only AI incident response guide:
https://www.arvantacyber.com/open-investigator/articles/local-ai-server-incident-response/
