How to triage Java memory-shell clues without unsafe default heap dumps

#security #java #incidentresponse #opensource

Disclosure: I maintain Open Investigator at Arvanta Cyber.

A suspected Java memory shell is an awkward incident-response starting point. You may not have a clean IOC. You may only have a strange request path, a servlet that should not exist, a web process that opened an unexpected connection, or a Java service that suddenly behaves differently after a deploy window.

The risky move is to jump straight to heavy production diagnostics. Heap dumps and flight recordings can be useful, but they can also be large, sensitive, and disruptive if a team treats them as the first button to press.

For a first pass, I want a safer question:

What read-only evidence can I collect before asking for heavier JVM inspection?

The first-pass Java triage loop

I would start with low-impact context:

Identify Java processes and service owners.

Look at process command lines, working directories, users, service managers, and how the JVM was launched.
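
As a rough illustration of this step, here is a minimal Python sketch that reads process metadata from a Linux /proc tree without touching the JVM. The function name and the rule "the executable name contains java" are my assumptions for the example; services started through wrappers would need a broader match, and this is not how Open Investigator implements its checks.

```python
import os
from pathlib import Path


def list_java_processes(proc_root="/proc"):
    """Return read-only metadata for processes whose executable name contains 'java'."""
    found = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directory names are PIDs
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited or access was denied
        # /proc/<pid>/cmdline separates arguments with NUL bytes
        argv = [a.decode("utf-8", "replace") for a in raw.split(b"\0") if a]
        if not argv or "java" not in os.path.basename(argv[0]):
            continue
        try:
            cwd = os.readlink(entry / "cwd")
        except OSError:
            cwd = None  # often unreadable without privileges
        try:
            uid = entry.stat().st_uid  # usually the process's effective UID
        except OSError:
            uid = None
        found.append({"pid": int(entry.name), "uid": uid, "cwd": cwd, "argv": argv})
    return sorted(found, key=lambda p: p["pid"])
```

Passing the /proc root as a parameter keeps the function easy to run against a copied or mocked tree, which is handy when the live host is off limits.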

Review JVM flags and attach-style clues.

Useful clues include unexpected -javaagent or -agentlib options, JDWP debugger settings, -Xbootclasspath overrides, suspicious classpaths, unusual temp directories, and command lines that do not match the expected service unit.
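
A small sketch of how those flag checks could look in Python, given an argument list already collected from the process table. The prefix table and the temp-directory list are illustrative assumptions, not a complete detection rule. I added -agentpath and the legacy -Xrunjdwp spelling because both are standard JVM options in the same family.

```python
# Illustrative prefixes only; tune this table to your own fleet.
SUSPICIOUS_PREFIXES = {
    "-javaagent:": "Java agent loaded at startup",
    "-agentlib:": "native agent library (JDWP is usually -agentlib:jdwp)",
    "-agentpath:": "native agent loaded by full path",
    "-Xrunjdwp": "legacy JDWP debugger option",
    "-Xbootclasspath": "boot class path override",
}
TEMP_DIRS = ("/tmp/", "/var/tmp/", "/dev/shm/")


def find_jvm_flag_clues(argv):
    """Return (argument, reason) pairs worth a human look. A hit is a clue, not a verdict."""
    clues = []
    for arg in argv:
        for prefix, meaning in SUSPICIOUS_PREFIXES.items():
            if arg.startswith(prefix):
                clues.append((arg, meaning))
        if any(tmp in arg for tmp in TEMP_DIRS):
            clues.append((arg, "references a temp directory"))
    return clues
```

Many legitimate services run with agents (APM tools are a common case), so the output only becomes meaningful when compared with what the service unit is expected to launch.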

Correlate with web and app evidence.

A memory shell often matters because it is reachable through a web surface. I want nearby web logs, request paths, status codes, user agents, reverse proxy logs, app logs, and recent changes under web roots or application directories.
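
To make the correlation concrete, here is a hedged Python sketch that pulls requests from an access log within a few minutes of a suspicious event. It assumes the common or combined log format used by default in Apache httpd and nginx; the function name and the five-minute default window are my own choices for the example.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a common/combined log format line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')


def requests_near(lines, center, window_seconds=300):
    """Return parsed requests within window_seconds of center (a timezone-aware datetime)."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines in another format rather than guess
        ip, stamp, method, path, status = match.groups()
        try:
            when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # unparseable timestamp
        if abs(when - center) <= window:
            hits.append({"time": when, "ip": ip, "method": method,
                         "path": path, "status": int(status)})
    return hits
```

The same window can then be reused for app logs and file changes, so every evidence source is compared against one timeline.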

Expand to process and network context.

Which ports are listening? Are there new outbound connections from the Java user? Are there child processes that should not exist? Did the Java service touch files or directories at the same time as the suspicious request?
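
On Linux, the first two questions can be answered from /proc/net/tcp without running anything against the JVM. The sketch below parses that file's text for IPv4 sockets. The function names are mine, the state table is abbreviated, and the address decoding assumes a little-endian host such as x86-64.

```python
import socket

# Abbreviated state table; other codes are passed through as raw hex.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def _decode_addr(hex_addr):
    """Turn '0100007F:1F90' into ('127.0.0.1', 8080)."""
    ip_hex, port_hex = hex_addr.split(":")
    # The kernel prints the IPv4 address in host byte order (little-endian assumed).
    ip = socket.inet_ntoa(bytes.fromhex(ip_hex)[::-1])
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Parse the contents of /proc/net/tcp into a list of socket dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue
        rows.append({
            "local": _decode_addr(fields[1]),
            "remote": _decode_addr(fields[2]),
            "state": TCP_STATES.get(fields[3], fields[3]),
            "uid": int(fields[7]),  # owning user ID of the socket
        })
    return rows
```

Filtering the rows by the Java service's UID gives a quick view of its listeners and outbound connections; IPv6 sockets live in /proc/net/tcp6 and need a separate decoder.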

Check persistence around the same window.

Systemd units, cron entries, startup scripts, modified service files, package changes, and recently changed files can explain whether this looks transient or durable.
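
For the "recently changed files" part, a minimal Python sketch could walk a few directories and report files whose modification time falls inside the incident window. The function name is hypothetical and the directories you pass in (unit directories, cron directories, web roots) depend on the host.

```python
import os


def files_changed_between(roots, start_ts, end_ts):
    """Return sorted (mtime, path) pairs for files modified within [start_ts, end_ts]."""
    hits = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    info = os.lstat(path)  # lstat: do not follow symlinks
                except OSError:
                    continue  # file vanished or is unreadable
                if start_ts <= info.st_mtime <= end_ts:
                    hits.append((info.st_mtime, path))
    return sorted(hits)
```

Modification times can be forged, so an empty result is weak evidence of a clean host; treat hits as leads and absences as unknowns.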

Where AI can help safely

AI is useful here when it correlates many small pieces of evidence. It is less useful, and more dangerous, when it gets broad production-changing authority.

For this type of investigation, I want the AI to ask for bounded read-only checks and produce a report a human can challenge. I do not want it to kill the Java process, delete files, restart the service, block IPs, disable accounts, or start dumping memory without approval.
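
One way to picture that boundary is a gate that every requested command must pass before anything runs. The Python sketch below is a hypothetical illustration, not Open Investigator's actual mechanism: the allowlist, the metacharacter rule, and the log shape are all assumptions, and the function only decides and records, it never executes.

```python
import shlex

# Hypothetical allowlist of commands that only read state.
READ_ONLY_COMMANDS = {"ps", "ss", "cat", "ls", "stat"}
# Reject shell metacharacters so "cat x; rm y" cannot ride on an allowed command.
SHELL_METACHARACTERS = set(";|&><`$")


def review_command(command_line, log):
    """Record an allow/deny decision in log and return True only for allowed commands."""
    try:
        argv = shlex.split(command_line)
    except ValueError:
        argv = []  # unbalanced quotes: treat as denied
    allowed = (
        bool(argv)
        and argv[0] in READ_ONLY_COMMANDS
        and not (SHELL_METACHARACTERS & set(command_line))
    )
    log.append({"command": command_line,
                "decision": "allowed" if allowed else "denied"})
    return allowed
```

Logging denials alongside approvals matters as much as the gate itself, because a reviewer can then see what the model tried to do and was refused.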

With Open Investigator, the low-impact starting point looks like this:

oi java -s 14d
oi mem -s 14d

Those checks focus on Java process metadata, JVM launch clues, related web/app evidence, process context, network context, recent files, and persistence evidence.

If the production owner approves deeper JVM inspection, the deeper path is explicit:

oi mem -s 14d -m inv --java-deep
oi java -s 14d -m inv --java-deep

And heavy artifacts are separate decisions:

oi mem -s 14d -m inv --java-deep --heap-dump
oi mem -s 14d -m inv --java-deep --jfr-dump

That separation matters. A first-pass report should not quietly turn into a heap dump workflow.

What the output should achieve

The goal is not "AI says this is a memory shell." The useful output is a case folder with evidence and uncertainty preserved:

evidence.jsonl for individual evidence records
commands.log for commands and denials
report.json for structured handoff
report.md for responder review

A good report should say what was observed, what is suspicious, what looks normal, what evidence is missing, and what needs manual confirmation.
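
To show what "uncertainty preserved" can mean at the record level, here is a small Python sketch of an evidence record serialized as JSON Lines. The field names and allowed assessment values are hypothetical and chosen for illustration; they are not the actual schema of Open Investigator's evidence.jsonl.

```python
import json

# Hypothetical assessment vocabulary; "unknown" is a first-class answer.
ALLOWED_ASSESSMENTS = {"suspicious", "normal", "unknown"}


def make_record(source, observed, assessment, needs_manual_confirmation, note=""):
    """Build one evidence record that keeps the observation separate from the judgment."""
    if assessment not in ALLOWED_ASSESSMENTS:
        raise ValueError(f"unsupported assessment: {assessment!r}")
    return {
        "source": source,          # where the observation came from
        "observed": observed,      # the raw fact, stated without interpretation
        "assessment": assessment,  # the judgment, kept apart from the fact
        "needs_manual_confirmation": bool(needs_manual_confirmation),
        "note": note,
    }


def to_jsonl(records):
    """Serialize records as JSON Lines, one JSON object per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)
```

Keeping "observed" and "assessment" in separate fields lets a responder disagree with the judgment without losing the underlying fact.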

That is the practical value: faster first-pass triage without pretending the tool has finished the incident.

Where Open Investigator fits

Open Investigator is an Apache-2.0 licensed AI investigator that runs locally on Linux and Windows servers. It gives the model sealed read-only tools for auth, accounts, processes, network, persistence, services, web logs, Java clues, recent files, packages, containers, and history.

The boundary is deliberate: it investigates, but it does not remediate.

Related Arvanta page on Java memory-shell investigation:
https://www.arvantacyber.com/open-investigator/java-memory-shell-investigation/

Open Investigator overview:
https://www.arvantacyber.com/open-investigator/

Open-source repo:
https://github.com/SEc-123/open-investigator

I would be interested in feedback from Java operators, SREs, DFIR teams, and blue-team engineers: what evidence would you require before approving deeper JVM inspection on a production service?