DEV Community

pete

ForensicDetective: When You Don’t Have Time to Read 80,000 Lines of Garbage

System: MatrixSwarm Infrastructure
Component: forensic_detective agent
Purpose: Correlate, analyze, and elevate status reports from the hive into structured, intelligent incident forensics.


What Is ForensicDetective?

forensic_detective is not a logger.

It’s a real-time forensic correlation engine inside your swarm. When other agents report warnings, failures, or breaches, forensic_detective builds a story from the noise:

  • Who triggered what.
  • What other events happened in the same window.
  • Why it matters.
  • And what to do about it (with or without AI).

Core Capabilities

1. Structured Event Ingestion

Listens to incoming reports via:

role: hive.forensics.data_feed@cmd_ingest_status_report

Receives events from agents like:

  • gatekeeper
  • nginx_watchdog
  • system_health
  • network_health
  • ghost_wire

2. Event Hashing + Buffering

Every event is hashed (excluding its timestamp) so that duplicates are dropped, then stored in a rotating buffer:

  • Default: last 100 events
  • Retention: 120 seconds

This creates a live memory of what just happened across your infrastructure.
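
Here is a minimal sketch of how a dedupe-and-buffer step like this could look, assuming events are plain dicts with a `timestamp` key. The class name `EventBuffer` and its methods are hypothetical and are not taken from the MatrixSwarm source; only the defaults (100 events, 120 seconds) come from the description above.

```python
import hashlib
import json
import time
from collections import deque


class EventBuffer:
    """Hypothetical rotating buffer; not the actual MatrixSwarm implementation."""

    def __init__(self, max_events=100, retention_sec=120):
        self.retention_sec = retention_sec
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.events = deque(maxlen=max_events)

    @staticmethod
    def event_hash(event):
        # Hash everything except the timestamp so repeats of the same
        # report collapse to one digest.
        body = {k: v for k, v in event.items() if k != "timestamp"}
        canonical = json.dumps(body, sort_keys=True, default=str)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def add(self, event, now=None):
        """Store the event unless an identical one is still buffered."""
        now = time.time() if now is None else now
        self._expire(now)
        digest = self.event_hash(event)
        if any(h == digest for h, _, _ in self.events):
            return False
        self.events.append((digest, now, event))
        return True

    def _expire(self, now):
        # Entries are appended in arrival order, so the oldest is on the left.
        while self.events and now - self.events[0][1] > self.retention_sec:
            self.events.popleft()
```

Serializing with `sort_keys=True` keeps the digest stable regardless of key order in the incoming report.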

3. Critical Event Triggering

When an incoming report has:

"severity": "CRITICAL"

It:

  • Checks cooldown per service (default: 300s)
  • Assigns a unique incident UUID
  • Triggers the full forensic process
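
The gate described above can be sketched roughly as follows. This is an illustration under assumptions, not the agent's real code: `CriticalTrigger`, `maybe_open_incident`, and the `service` key on the event are hypothetical names, while the `severity` field and the 300-second default come from the text.

```python
import time
import uuid


class CriticalTrigger:
    """Hypothetical cooldown gate; names are illustrative only."""

    def __init__(self, cooldown_sec=300):
        self.cooldown_sec = cooldown_sec
        self.last_fired = {}  # service name -> time of last incident

    def maybe_open_incident(self, event, now=None):
        """Return a new incident ID, or None if the event is ignored."""
        if event.get("severity") != "CRITICAL":
            return None
        now = time.time() if now is None else now
        # Assumption: the report names its service under a "service" key.
        service = event.get("service", "unknown")
        last = self.last_fired.get(service)
        if last is not None and now - last < self.cooldown_sec:
            return None  # still cooling down for this service
        self.last_fired[service] = now
        return str(uuid.uuid4())
```

Keeping the cooldown per service means a flapping nginx cannot suppress an unrelated critical report from another agent.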

4. Event Correlation

All buffered events in the last 120s are pulled in. This is how you get context:

"Nginx crashed" alone means nothing.
But "Nginx crashed + disk 95% full + CPU spike + login from China" means everything.
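
As a rough sketch of that window lookup, assuming each buffered event carries a numeric `timestamp` in epoch seconds (the function name `correlate` and the event keys are hypothetical):

```python
def correlate(buffered_events, critical_event, window_sec=120):
    """Return buffered events within window_sec before the critical event."""
    t0 = critical_event["timestamp"]
    related = [
        e for e in buffered_events
        if e is not critical_event and 0 <= t0 - e["timestamp"] <= window_sec
    ]
    # Oldest first, so the report reads as a timeline.
    return sorted(related, key=lambda e: e["timestamp"])
```

Sorting by time is a design choice for readability; the real agent may order or group events differently.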

5. Forensic Report Generation

Loads a per-service Python module dynamically:

from forensic_detective.factory.watchdog.nginx.investigator import Investigator

If found, it runs:

add_specific_findings(findings)

And appends insight to the incident.

No module? No problem. It still builds a default report.
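
One way to get this "load if present, otherwise fall back" behaviour is Python's standard `importlib`. The sketch below assumes the `Investigator` class can be constructed without arguments, which the source does not confirm; `run_investigator` is a hypothetical helper name.

```python
import importlib


def run_investigator(module_path, findings):
    """Try a per-service Investigator; fall back to the findings as given."""
    try:
        module = importlib.import_module(module_path)
        # Assumption: Investigator takes no constructor arguments.
        investigator = module.Investigator()
    except (ImportError, AttributeError):
        # No module (or no Investigator class): keep the default report.
        return findings
    return investigator.add_specific_findings(findings)
```

A call might look like `run_investigator("forensic_detective.factory.watchdog.nginx.investigator", findings)`.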

6. Alert Broadcasting

Once analysis is done:

  • Formats an alert embed (title, description, incident ID)
  • Sends it to: hive.alert@cmd_send_alert_msg
  • Includes full findings, even if the analysis says: "Your disk is full, genius."
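
To make the embed idea concrete, here is a small sketch of assembling such a payload. The dict keys and the `build_alert` name are assumptions for illustration; the actual packet format expected by `hive.alert@cmd_send_alert_msg` is not shown in this post.

```python
def build_alert(incident_id, service, findings):
    """Assemble a hypothetical alert payload; field names are illustrative."""
    return {
        "title": f"CRITICAL: {service}",
        # One bullet per finding keeps the embed readable in chat clients.
        "description": "\n".join(f"- {line}" for line in findings),
        "incident_id": incident_id,
    }
```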

7. Oracle Integration (Optional)

If enabled:

"oracle_analysis": {
  "enable_oracle": 1,
  "role": "hive.oracle"
}

Then the detective sends a GPT-style prompt like:

"CPU high. Nginx failed. Disk at 99%. What happened? What should we do?"

Oracle responds with a full RCA + command steps.

It gets rebroadcast under:

## AI-Enhanced Analysis

In short, with the Oracle option enabled:

  • The detective sends a structured prompt (the critical event plus its preceding context) to the Oracle AI.

  • Oracle responds with a root cause and remediation steps.

  • forensic_detective rebroadcasts the result as an AI-Enhanced Analysis alert, which can be piped to Slack, Telegram, Discord, email, etc., and logs it.
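
The prompt-building step could look something like the sketch below. It assumes events are dicts with `service` and `status` keys; those keys and the `build_oracle_prompt` name are hypothetical, and the real prompt sent to the Oracle may be structured differently.

```python
def build_oracle_prompt(critical_event, context_events):
    """Render the critical event and its context as a plain-text prompt."""
    lines = ["Critical event:"]
    lines.append(f"  {critical_event['service']}: {critical_event['status']}")
    lines.append("Preceding events:")
    for e in context_events:
        lines.append(f"  {e['service']}: {e['status']}")
    lines.append("What happened? What should we do?")
    return "\n".join(lines)
```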

8. Postmortem File Saved

Every triggered incident gets saved to:

/swarm/sessions/your_agent/summary/YYYYMMDD-nginx-failure.json

Contents:

  • Incident ID
  • Timestamp
  • Critical event
  • All correlated events
  • Full forensic report (Oracle + local)
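
A minimal sketch of writing such a file, assuming the date prefix is taken in UTC and the rest of the filename is a short slug such as `nginx-failure`. The `save_postmortem` helper and the keys of the incident dict are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_postmortem(summary_dir, slug, incident):
    """Write the incident as YYYYMMDD-<slug>.json and return the path."""
    summary_dir = Path(summary_dir)
    summary_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the date prefix uses UTC; the source does not say.
    day = datetime.now(timezone.utc).strftime("%Y%m%d")
    path = summary_dir / f"{day}-{slug}.json"
    path.write_text(json.dumps(incident, indent=2, default=str), encoding="utf-8")
    return path
```

With a date-only prefix, two incidents with the same slug on the same day would overwrite each other in this sketch, so a real implementation would likely add the incident ID or a time component.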

Why It Matters

| Problem | Without FD | With ForensicDetective |
| --- | --- | --- |
| Nginx went down | You grep logs for 10 mins | Instant report w/ root cause |
| Alerts spam you | All say "HIGH CPU" | One incident. One cause. Full stack trace. |
| No traceability | Logs rotated | JSON archive with context + command steps |
| No escalation | You miss the trend | FD ties related events into a single actionable failure |

How To Enable It

  1. In your directive:
{
  "universal_id": "forensic-detective-1",
  "name": "forensic_detective",
  "config": {
    "oracle_analysis": {
      "enable_oracle": 1,
      "role": "hive.oracle"
    },
    "alert_to_role": "hive.alert"
  },
  "service-manager": [
    {
      "role": ["hive.forensics.data_feed@cmd_ingest_status_report"]
    }
  ]
}
  2. Make your agents report structured events using:
send_status_report(status, severity, details, metrics)
  3. Optional: Drop in your own custom factory:
forensic_detective/factory/watchdog/nginx/investigator.py

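With an Investigator class that defines this method:

class Investigator:
    def add_specific_findings(self, findings):
        # Append service-specific insight to the shared findings list.
        findings.append("nginx failed due to repeated 502 errors")
        return findings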

Resources

GitHub: https://github.com/matrixswarm/matrixos

GitHub: https://github.com/matrixswarm/phoenix

Docs: https://matrixswarm.com

Discord: https://discord.gg/CyngHqDmku

Telegram: https://t.me/matrixswarm

Python: pip install matrixswarm

Codex: /agents/gatekeeper

X/Twitter: @matrixswarm

💬 Join the Hive:
Join the Swarm: https://discord.gg/CyngHqDmku
Report bugs, fork the swarm, or log your own Codex banner.
