A suspected WebShell is awkward because the first clue is often weak.
You may have one odd request in an access log, one newly modified file under a web root, a process running as the web user, or an outbound connection that does not fit the service.
The dangerous move is to jump straight into cleanup. Before deleting files or restarting services, a responder needs a small evidence map:
- what request or path started the suspicion
- whether related files changed recently
- whether the web user has unusual processes
- whether those processes have network connections
- whether persistence or scheduled tasks changed nearby
- whether auth logs show a related login or privilege event
- what evidence is still missing
That is the point where local, read-only AI can be useful.
Start from the web root, not a verdict
A practical first pass can start with the web root and a recent time window:
oi web --root /var/www -s 7d
The goal is not to ask AI to decide whether the host is compromised. The goal is to collect enough context that a human can challenge the next step.
For a suspected WebShell, I would want the first-pass collector to look at:
- recently changed files under the web root
- suspicious script extensions or unexpected upload locations
- web access/error log entries around those paths
- requests with unusual parameters, encodings, or user agents
- processes owned by the web service user
- listening sockets and outbound connections
- persistence files, services, cron entries, and recent auth activity
None of those require remediation authority.
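To make the first item on that list concrete, here is a minimal Python sketch of a read-only scan for recently changed script files. It is not how Open Investigator implements the check; the function name `recent_files`, the `SCRIPT_EXTENSIONS` set, and the output fields are all assumptions chosen for illustration.

```python
import os
import time
from pathlib import Path

# Hypothetical extension list; tune it to the stack actually deployed.
SCRIPT_EXTENSIONS = {".php", ".phtml", ".jsp", ".jspx", ".asp", ".aspx"}


def recent_files(root, days=7, now=None):
    """Return script-like files under root modified within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.lstat()  # lstat: inspect the link itself, do not follow it
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_mtime >= cutoff and path.suffix.lower() in SCRIPT_EXTENSIONS:
                findings.append(
                    {"path": str(path), "mtime": st.st_mtime, "size": st.st_size}
                )
    # Newest first, so the most recent change is reviewed first.
    return sorted(findings, key=lambda f: f["mtime"], reverse=True)
```

The scan only reads metadata. Modification times can be reset by an attacker, so an empty result is a missing clue, not proof that nothing changed.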
Correlate weak clues
A single clue is noisy. A PHP file changed yesterday may be a normal deploy. A POST request to an upload directory may be normal application behavior. An outbound connection may belong to a legitimate integration.
The value comes from correlation.
For example:
- A new file appears in an upload directory.
- Web logs show a POST to that file shortly after creation.
- A process owned by the web user opens an outbound connection.
- The same time window shows a service or cron change.
That chain is not a final incident report, but it is much more useful than a vague alert.
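One way to express that chain in code is to group observations by time and keep only the groups that span more than one evidence source. The sketch below assumes each observation is a dict with a `time` (a `datetime`) and a `source` label; those field names, the `correlate` function, and the ten-minute window are illustrative choices, not part of any tool.

```python
from datetime import datetime, timedelta


def correlate(events, window=timedelta(minutes=10)):
    """Group events into chains where each event follows the previous within `window`."""
    chains, current = [], []
    for ev in sorted(events, key=lambda e: e["time"]):
        # A gap larger than the window closes the current chain.
        if current and ev["time"] - current[-1]["time"] > window:
            chains.append(current)
            current = []
        current.append(ev)
    if current:
        chains.append(current)
    # A chain is only interesting if it spans more than one evidence source.
    return [c for c in chains if len({e["source"] for e in c}) > 1]
```

A lone file change or a lone POST falls out of the result, which matches the point above: single clues stay noise until something else lines up with them.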
Keep the AI boundary boring
If an AI investigation tool can run arbitrary shell commands on a production host, it is too easy for the model to cross from investigation into response.
For this workflow, I want the model to request sealed read-only checks instead of improvising commands:
- web log checks
- recent file checks
- process snapshots
- network snapshots
- persistence snapshots
- auth/account context
- package and container context when relevant
The tool should log every command and every denial. It should write evidence separately from the summary, so a responder can reproduce or reject the conclusion.
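A rough sketch of that boundary, assuming a fixed allowlist: the model can only name a check, never supply a command line, and both allowed and denied requests land in the audit trail. The check names, the argument vectors, and the `request_check` function are hypothetical and are not Open Investigator's actual interface.

```python
import json
import subprocess
import time

# Hypothetical sealed checks: fixed argument vectors, no shell, no model-supplied text.
SEALED_CHECKS = {
    "process_snapshot": ["ps", "-eo", "pid,ppid,user,lstart,args"],
    "network_snapshot": ["ss", "-tunap"],
    "persistence_snapshot": ["ls", "-la", "/etc/cron.d"],
}


def request_check(name, audit, runner=subprocess.run):
    """Run a sealed check by name; record the request in `audit` either way."""
    entry = {"ts": time.time(), "check": name}
    argv = SEALED_CHECKS.get(name)
    if argv is None:
        # Unknown names are denied and logged, never executed.
        entry["decision"] = "denied"
        audit.append(json.dumps(entry))
        return None
    entry.update(decision="allowed", argv=argv)
    audit.append(json.dumps(entry))
    # argv is passed as a list, so no shell ever interprets it.
    result = runner(argv, capture_output=True, text=True, timeout=30, check=False)
    return result.stdout
```

Here `audit` is a plain list for brevity; in practice each entry would be appended to a file such as commands.log. A request like `request_check("rm -rf /var/www", audit)` returns `None` and leaves a denial record behind.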
What a useful output looks like
A useful first pass should produce a case folder, not just a paragraph.
The artifacts I want are:
- evidence.jsonl for structured observations
- commands.log for the audit trail
- report.json for machine-readable handoff
- report.md for human review
The report should say what was found, why it matters, what is missing, and what a human responder should verify next. It should not delete the suspected file, block the IP, kill the process, or restart the web service.
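As a small illustration of keeping evidence separate from the summary, the sketch below writes observations one per line to evidence.jsonl and a short summary to report.json. The file names come from the list above; the `write_case` function and the fields inside each file are assumptions, not Open Investigator's actual schema.

```python
import json
from pathlib import Path


def write_case(case_dir, observations, summary):
    """Write evidence and summary as separate files in a case folder."""
    case = Path(case_dir)
    case.mkdir(parents=True, exist_ok=True)
    # One JSON object per line, so evidence can be streamed, diffed, and grepped.
    with open(case / "evidence.jsonl", "w", encoding="utf-8") as fh:
        for obs in observations:
            fh.write(json.dumps(obs, sort_keys=True) + "\n")
    # Hypothetical report fields; the summary points at the evidence, it does not replace it.
    report = {
        "summary": summary,
        "evidence_count": len(observations),
        "evidence_file": "evidence.jsonl",
    }
    (case / "report.json").write_text(json.dumps(report, indent=2), encoding="utf-8")
    return case
```

Because the summary only references the evidence file, a responder can reject the summary and still keep every observation.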
Where Open Investigator fits
I maintain Open Investigator at Arvanta Cyber. It is an Apache-2.0 licensed CLI for local, AI-assisted server investigation on Linux and Windows hosts.
The WebShell use case is one of the scenarios it is designed for: start from a weak web clue, collect read-only host evidence, and produce a reviewable evidence bundle.
WebShell investigation page:
https://www.arvantacyber.com/open-investigator/webshell-investigation-tool/
Project overview:
https://www.arvantacyber.com/open-investigator/
GitHub:
https://github.com/SEc-123/open-investigator
If you do incident response, blue-team work, web operations, or SRE on Linux servers, the design question I care about is this: what evidence would you require before trusting an AI-assisted first pass on a suspected WebShell?