Most incident-response writeups focus on the detection moment: a suspicious IP, a strange login, a web-root file, a Java service behaving oddly, or a process listening on a port it should not expose.
That first clue matters, but the next problem is usually more operational:
How do you turn the first 15-30 minutes of host triage into something another responder can trust, review, and continue?
A useful first-pass handoff is not just a paragraph that says "probably compromised." It should preserve the evidence trail, the commands or collectors used, the confidence level, the gaps, and the next manual checks.
Here is the handoff shape I like for AI-assisted host investigation.
Start with a bounded question
Instead of giving an AI agent a production shell, start with a narrow investigation question and a time window.
For example:
oi ask "Investigate this server for suspicious login, web, process, network, persistence, and recent-file clues over the last 7 days. Do not remediate." -s 7d
Or if the alert starts from one external address:
oi ip 203.0.113.77 -s 7d
The important part is not the exact command. The important part is the boundary: collect and correlate host evidence, but do not mutate the host.
Collect evidence by category, not by vibes
For a server case, a first pass should usually cover at least these areas:
- authentication events and failed/successful login patterns
- summaries of local accounts, groups, sudoers, and authorized keys
- process snapshots and suspicious command lines
- listening sockets and active network connections
- systemd, cron, startup items, and other persistence points
- web logs and recent web-root file changes where relevant
- Java process and memory-shell perimeter clues where relevant
- recent files, packages, containers, and shell history clues
The goal is not to prove the whole incident in one pass. The goal is to avoid handing off a vague summary with no supporting trail.
Keep the case directory as the handoff object
Open Investigator writes every run into a case folder like this:
.oi/cases/<case-id>/
  case.json        # investigation input and mode
  evidence.jsonl   # evidence records with evidence_id
  commands.log     # allowed/denied command audit
  report.json      # structured report
  report.md        # human-readable report
That structure is useful because different responders need different artifacts.
The analyst reading quickly wants report.md.
The person validating claims wants evidence.jsonl.
The reviewer checking what the tool did wants commands.log.
The team integrating the result into a ticket, case system, or downstream tool wants report.json.
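Because each responder depends on a different file, it can help to confirm that the case folder is complete before handing it off. The following is a minimal sketch, not part of Open Investigator: the file names come from the layout above, while the function name `missing_artifacts` and the rule that an empty file counts as missing are my own assumptions.

```python
from pathlib import Path

# File names are taken from the case layout described above.
EXPECTED_ARTIFACTS = (
    "case.json",
    "evidence.jsonl",
    "commands.log",
    "report.json",
    "report.md",
)


def missing_artifacts(case_dir):
    """Return expected artifact names that are absent or empty in case_dir."""
    case_path = Path(case_dir)
    missing = []
    for name in EXPECTED_ARTIFACTS:
        artifact = case_path / name
        # Assumption: a zero-byte artifact is as unhelpful as a missing one.
        if not artifact.is_file() or artifact.stat().st_size == 0:
            missing.append(name)
    return missing
```

An empty list means every artifact is present and non-empty. It says nothing about whether the contents are any good, which is what the rest of this post is about.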
What should go into report.md
A good first-pass report should separate conclusions from evidence. I would expect sections like:
- executive summary
- observed signals
- timeline or chain of suspicious activity
- key evidence IDs
- confidence and risk level
- evidence gaps
- recommended manual follow-up
- explicit non-actions taken by the tool
The phrase "explicit non-actions" matters. If the tool did not block an IP, kill a process, delete a file, disable an account, restart a service, change firewall state, or isolate the host, the report should make that clear.
That is not just legal caution. It helps the next responder understand that the system was used for investigation, not remediation.
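One way to check that the "key evidence IDs" section is doing real work is to compare the IDs in evidence.jsonl against the text of report.md. The sketch below assumes only that each evidence record is a JSON object with an `evidence_id` field and that the report quotes those IDs verbatim. The function names are hypothetical, and the real ID format may differ from the one used in the usage example.

```python
import json
import re


def load_evidence_ids(evidence_jsonl_text):
    """Collect evidence_id values from JSON Lines text, skipping blank lines."""
    ids = []
    for line in evidence_jsonl_text.splitlines():
        if line.strip():
            record = json.loads(line)
            if "evidence_id" in record:
                ids.append(str(record["evidence_id"]))
    return ids


def citation_coverage(report_text, evidence_ids):
    """Split evidence IDs into those the report mentions and those it never cites."""
    cited, uncited = [], []
    for evidence_id in evidence_ids:
        # Match the ID as a whole token so "ev-1" does not match inside "ev-10".
        pattern = r"(?<![\w-])" + re.escape(evidence_id) + r"(?![\w-])"
        if re.search(pattern, report_text):
            cited.append(evidence_id)
        else:
            uncited.append(evidence_id)
    return cited, uncited
```

For example, with made-up IDs:

```python
ids = load_evidence_ids('{"evidence_id": "ev-1"}\n{"evidence_id": "ev-10"}\n')
cited, uncited = citation_coverage("Failed logins from one address (ev-10).", ids)
# cited is ["ev-10"], uncited is ["ev-1"]
```

Uncited evidence is not automatically a problem, but a report that cites nothing at all is a summary with no supporting trail.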
What should go into commands.log
If an AI system can ask for host checks, the audit trail should show what was allowed and what was denied.
For example, I want to know whether the run used only sealed read-only collectors, whether a policy-filtered read-only command fallback was used, and whether anything was denied because it looked destructive or outside scope.
A useful command audit answers:
- Which collector or command ran?
- Why was it allowed?
- Was anything denied?
- Which case did the action belong to?
- Did it write only to the case directory?
This is one of the places where AI incident tooling should be boring on purpose.
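To make those questions concrete, here is one possible shape for an audit record, written as one JSON object per line. This is an illustration only: I am not describing the actual commands.log format, and every field name below is a hypothetical chosen to mirror the questions above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AuditEntry:
    # Hypothetical fields; they are not the real commands.log schema.
    case_id: str   # which case the action belonged to
    action: str    # collector name or command line
    decision: str  # "allowed" or "denied"
    reason: str    # why the decision was made
    wrote_outside_case_dir: bool = False


def to_log_line(entry):
    """Serialize one audit entry as a single JSON line."""
    if entry.decision not in ("allowed", "denied"):
        raise ValueError(f"unexpected decision: {entry.decision!r}")
    return json.dumps(asdict(entry), sort_keys=True)


def summarize(log_lines):
    """Count decisions and list any actions that wrote outside the case directory."""
    entries = [json.loads(line) for line in log_lines if line.strip()]
    return {
        "allowed": sum(e["decision"] == "allowed" for e in entries),
        "denied": sum(e["decision"] == "denied" for e in entries),
        "outside_writes": [
            e["action"] for e in entries if e["wrote_outside_case_dir"]
        ],
    }
```

A reviewer who can run something like `summarize` over the log gets a quick answer to "was anything denied?" and "did anything write outside the case?" before reading the entries one by one.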
Do not let the summary outrun the evidence
The report should not pretend to know what it did not inspect.
Examples of honest gaps:
- outbound traffic was not proven
- EDR telemetry was not correlated yet
- cloud control-plane logs were not reviewed
- packet captures were not available
- application-level impact was not validated
- Java deep diagnostics were not enabled
- heap or JFR artifacts were intentionally not collected
Those gaps are not failures. They are the checklist for the next responder.
Why read-only matters for handoff
In an incident, the first tool that touches the host can accidentally change the evidence story.
For a first-pass AI investigator, I want the default boundary to be:
- read logs, metadata, process snapshots, network state, persistence config, and relevant file metadata
- write only case artifacts and audit records
- avoid remediation authority by default
- require explicit flags for heavier diagnostics
- preserve enough raw evidence for a human to challenge the summary
That boundary makes the result more useful to a real DFIR/SOC workflow because another person can pick up the case without guessing what the AI changed.
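As a rough illustration of what a policy-filtered read-only fallback could look like, the sketch below allows a command only if its first token is on a small allowlist and the line contains no shell metacharacters. This is my own simplified example, not Open Investigator's policy. The allowlist and the function name `evaluate_command` are assumptions, and a name-only check is not sufficient in practice because some otherwise harmless tools have flags that change state.

```python
import shlex

# Hypothetical allowlist for illustration; a real policy would also vet arguments.
READ_ONLY_COMMANDS = {"ps", "last", "stat", "ls", "id"}
# Characters that enable chaining, redirection, or substitution in a shell.
SHELL_METACHARACTERS = set(";|&><`$")


def evaluate_command(command_line):
    """Return (decision, reason) for a requested host command."""
    if any(ch in SHELL_METACHARACTERS for ch in command_line):
        return "denied", "shell metacharacters are not permitted"
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Raised for input such as an unterminated quote.
        return "denied", "command could not be parsed"
    if not tokens:
        return "denied", "empty command"
    if tokens[0] not in READ_ONLY_COMMANDS:
        return "denied", f"{tokens[0]} is not on the read-only allowlist"
    return "allowed", "command is on the read-only allowlist"
```

The default here is deny: anything the policy does not recognize is refused and the reason is returned, so it can be written to the audit trail alongside the allowed actions.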
A practical handoff checklist
Before sending the case to a teammate, I would check:
- Are the original question and time window clear in case.json?
- Does report.md cite evidence IDs instead of unsupported claims?
- Does evidence.jsonl include the raw or summarized observations needed to challenge the conclusion?
- Does commands.log show allowed and denied actions?
- Are gaps written as follow-up tasks?
- Are heavy artifacts, if any, explicitly requested and stored under the case directory?
- Is the report clear that investigation happened but remediation did not?
That is the difference between "AI wrote a summary" and "the team received a case they can work."
Where Open Investigator fits
I maintain Open Investigator at Arvanta Cyber. It is an Apache-2.0 local AI server investigator for Linux and Windows incident response.
The design goal is narrow: let AI collect and correlate host evidence through sealed read-only tools, then produce reviewable case artifacts. It is not an EDR replacement, not a SIEM/SOAR replacement, and not an automated remediation system.
Useful starting points:
- Product overview: https://www.arvantacyber.com/open-investigator/
- AI DFIR report page: https://www.arvantacyber.com/open-investigator/ai-dfir-reporting-tool/
- Open-source repo: https://github.com/SEc-123/open-investigator
I would be interested in how other DFIR, SOC, SRE, and security engineering teams structure first-pass handoff reports. The report format is often where tooling either becomes operationally useful or turns into one more summary to distrust.