DEV Community

GnomeMan4201
SHENRON v0.3.3: From Telemetry Generator to Blue-Team Reasoning Instrument

What changed between "here is synthetic telemetry" and "here is what your validation claims to prove."

Repo: https://github.com/GnomeMan4201/shenron


The first article described what SHENRON is: a defensive telemetry simulation platform that generates adversarial-shaped synthetic events without producing payloads, shellcode, subprocess execution, or portable adversarial procedure.

This article describes what it became.


The gap the first version couldn't close

Generating synthetic telemetry is useful. It lets you test whether your SIEM ingests the right fields, whether your detection rules are pointed at the right signal vocabulary, and whether your analysts recognize the event sequences they need to recognize.

But it doesn't answer the harder question: what exactly did your validation claim to prove?

A detection stack validated against persistence-shaped telemetry has not been tested against C2 beaconing, lateral movement, or anti-forensics. That is not a failure — it is a scope boundary. The problem is when the scope boundary is invisible.

v0.3.3 makes it visible.

In plain terms: SHENRON does not ask whether a detector is "good." It asks whether the evidence produced by a synthetic telemetry run actually supports the claim being made about that run. That makes it less of a pass/fail simulator and more of a scope-control instrument for blue-team validation.


What ships in one command

git clone https://github.com/GnomeMan4201/shenron
cd shenron
python3 shenron.py --release-demo

That produces a 9-file artifact bundle:

shenron_demo_run.jsonl              40 synthetic events
shenron_demo_report.md              run report
safety_verification.md              safety contract verification
navigator_layer.json                ATT&CK Navigator layer (synthetic)
shenron_demo_run_ecs.json           ECS-formatted events
shenron_demo_run_ecs_bulk.ndjson    Elastic bulk API format
shenron_demo_run_splunk_hec.json    Splunk HEC format
narrative.md                        tactic profile narrative
charts/                             5 dark-mode PNGs
MANIFEST.md                         bundle index

Every record in the JSONL carries an explicit safety contract:

{
  "phase": "OBSERVE",
  "layer": "beacon_emitter_cloak",
  "signal": "periodic_beacon",
  "mitre_technique": "T1071.001",
  "safety": {
    "simulation_only": true,
    "executable": false,
    "payload_present": false,
    "portable_adversarial_procedure": false,
    "network_connection": false,
    "subprocess_spawned": false,
    "real_file_written": false,
    "shell_invoked": false
  }
}
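A contract like this is only useful if something enforces it. Below is a minimal sketch of how a consumer could verify the contract on every record before trusting a JSONL file; the field names come from the record above, but the function names and file-handling are illustrative, not SHENRON's actual `--verify-safety` implementation.

```python
import json

# Fields that must be exactly false in every record's safety block,
# per the contract shown above.
REQUIRED_FALSE = (
    "executable", "payload_present", "portable_adversarial_procedure",
    "network_connection", "subprocess_spawned", "real_file_written",
    "shell_invoked",
)

def contract_intact(record: dict) -> bool:
    """True only if the record asserts simulation-only semantics."""
    safety = record.get("safety", {})
    if safety.get("simulation_only") is not True:
        return False
    return all(safety.get(field) is False for field in REQUIRED_FALSE)

def verify_file(path: str) -> bool:
    """Reject the whole bundle if any single record breaks the contract."""
    with open(path) as fh:
        return all(contract_intact(json.loads(line))
                   for line in fh if line.strip())
```

The strictness matters: a missing field fails the check rather than passing by default, so a truncated or hand-edited record cannot silently pass as safe.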

The ECS export goes directly into Elastic:

curl -X POST 'http://localhost:9200/_bulk' \
     -H 'Content-Type: application/x-ndjson' \
     --data-binary @shenron_demo_run_ecs_bulk.ndjson

Every ECS event carries event.dataset: shenron.synthetic, labels.simulation_only: true, and [SHENRON SYNTHETIC] in the message field. These are not real events. Your detection rules firing or not firing on them tells you something about your rules, not about real adversarial behavior.


The coverage gap feature

The most important feature in v0.3.3 is --narrate.

After running two different scenarios, compare them:

python3 shenron.py --scenario apt_kill_chain --dry-run
python3 shenron.py --scenario persistence_runbook --dry-run
python3 shenron.py --compare <apt_id> <persistence_id> --narrate

The terminal output:

  [NARRATIVE]   apt_kill_chain → persistence_runbook

  Coverage gap families (4):
    ✗  Command-and-Control
    ✗  Defense Evasion
    ✗  Lateral Movement
    ✗  Discovery

  Primary concern:
    If C2-shaped telemetry is not in your validation set, your detectors
    have not been tested against the phase where most APT campaigns are
    first visible — initial callback after compromise.

The full narrative report names every missing signal by tactic family:

### Command-and-Control

MITRE descriptors not present in Run B: T1071, T1132
Signal shapes absent from Run B: DNS-based C2 signaling,
encoded URI C2 parameter, and periodic C2 beaconing.

> If C2-shaped telemetry is not in your validation set, your detectors
> have not been tested against the phase where most APT campaigns are
> first visible — initial callback after compromise.

And produces a concrete recommendation:

To close the Command-and-Control, Defense Evasion, Lateral Movement,
and Discovery gaps, run a scenario that includes those signal families
alongside persistence_runbook. Suggested: apt_kill_chain (covers C2,
lateral movement, persistence, and evasion) or evasion_stress_test
(covers masquerading, log deletion, and anti-forensics).

This is deterministic and template-based. No LLM. The narration engine classifies signals into tactic families using a static taxonomy of 80+ signal names and 35 MITRE technique IDs, then assembles analyst-language prose from that classification.
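The core of such an engine is small. Here is a sketch of the classification step, assuming a static signal-to-family mapping; the three taxonomy entries shown are an illustrative fragment, not the real 80+ entry table.

```python
# Illustrative fragment of a static signal -> tactic-family taxonomy.
# The real engine maps 80+ signal names and 35 MITRE technique IDs.
SIGNAL_TAXONOMY = {
    "periodic_beacon":         "Command-and-Control",
    "scheduled_task_creation": "Persistence",
    "log_deletion":            "Defense Evasion",
}

def tactic_families(signals):
    """Collapse a run's signal names into the tactic families they belong to."""
    return {SIGNAL_TAXONOMY[s] for s in signals if s in SIGNAL_TAXONOMY}

def coverage_gap(run_a_signals, run_b_signals):
    """Families present in run A but absent from run B, sorted for stable output."""
    return sorted(tactic_families(run_a_signals) - tactic_families(run_b_signals))
```

Because the mapping is a plain lookup table, the same two runs always produce the same gap list, which is what makes the narration auditable.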


Coverage assumption auditing

Define what you believe your detection stack covers:

name: persistence_coverage_assumption
claims:
  - "We can observe persistence-shaped telemetry"
  - "We can detect suspicious scheduled task behavior"
expected_techniques:
  - T1053.005
  - T1547.001
expected_signals:
  - scheduled_task_creation
  - hidden_temp_directory
expected_phases:
  - EXECUTE

Audit it:

python3 shenron.py --assumption assumptions/my_assumption.yaml \
  --events artifacts/demo/shenron_demo_run.jsonl
  [ASSUMPTION]  persistence_coverage_assumption
  [RECORDS]     40

  Claims        0 supported  0 partial  2 unsupported
  Techniques    2 observed   1 missing
  Signals       1 observed   1 missing

  [COVERAGE]    45.0%
  [VERDICT]     FAIL

SHENRON is not telling you your detection stack fails. It is telling you that your assumption, as written, is not supported by the evidence in this artifact. That is a different question, and a more useful one.

The assumption auditor asks: what exactly did your validation claim to prove, and does the evidence support that claim?
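The scoring behind a verdict like the 45% above can be sketched simply: count how many expected techniques and signals actually appear in the event stream. The field names match the JSONL record shown earlier; the pass threshold is a hypothetical parameter, not SHENRON's documented cutoff.

```python
def audit(expected_techniques, expected_signals, events, threshold=80.0):
    """Score an assumption against observed evidence.

    events: iterable of dicts with 'mitre_technique' and 'signal' fields,
    as in the JSONL records shown earlier. Returns (coverage %, verdict).
    """
    observed_t = {e.get("mitre_technique") for e in events}
    observed_s = {e.get("signal") for e in events}

    hits = sum(t in observed_t for t in expected_techniques) \
         + sum(s in observed_s for s in expected_signals)
    total = len(expected_techniques) + len(expected_signals)

    coverage = 100.0 * hits / total if total else 0.0
    return coverage, ("PASS" if coverage >= threshold else "FAIL")
```

Note what this does and does not measure: it checks whether the evidence file contains what the assumption predicted, nothing about whether any detector would have fired.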


Coverage drift tracking

python3 shenron.py --coverage-history --out-dir reports/history

With 256 runs across 8 campaigns in the timeline:

  [HISTORY]     256 runs · 8 campaigns

  apt_kill_chain              7 runs   16 techniques
  c2_shape_detection_test    79 runs   16 techniques
  full_stack_adversarial     26 runs   28 techniques
  persistence_pressure_test 108 runs   10 techniques

  [DRIFT]       No technique drift detected

No drift is good news for scenario consistency: the campaigns are producing stable technique sets across runs. It does not mean the detection stack is effective; it means the validation artifact has not silently changed shape. The tracker becomes more useful over time as you modify scenarios, update layer configurations, and run across different environments. If a configuration change silently drops technique coverage, the history report shows it.
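Drift detection of this kind reduces to set comparison over per-run technique lists. A minimal sketch, assuming a history keyed by campaign name (the data shape is mine, not SHENRON's internal storage format):

```python
def technique_drift(history):
    """Flag campaigns whose technique coverage changed between runs.

    history: campaign name -> chronological list of per-run technique lists.
    Returns {campaign: {"gained": [...], "lost": [...]}} for drifted campaigns.
    """
    drifted = {}
    for campaign, runs in history.items():
        baseline, latest = set(runs[0]), set(runs[-1])
        if baseline != latest:
            drifted[campaign] = {
                "gained": sorted(latest - baseline),
                "lost":   sorted(baseline - latest),
            }
    return drifted
```

An empty result corresponds to the "[DRIFT] No technique drift detected" line above; a non-empty "lost" list is the silent coverage drop the history report exists to catch.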


Mutation variants

Test whether your analysis pipeline is brittle:

python3 shenron.py --mutate \
  --events artifacts/demo/shenron_demo_run.jsonl \
  --out-dir artifacts/mutations

Seven safe variants:

  [field_drop          ]   40 →  40 records    40 changes
  [timing_jitter       ]   40 →  40 records    40 changes
  [label_ambiguity     ]   40 →  40 records     3 changes
  [signal_density_high ]   40 → 120 records    80 changes
  [signal_density_low  ]   40 →  17 records    23 changes
  [phase_imbalance     ]   40 →  40 records    30 changes
  [technique_noise     ]   40 →  40 records    11 changes

Run --verify-safety on any variant to confirm the safety contract is intact:

python3 shenron.py --verify-safety artifacts/mutations/mutation_label_ambiguity.jsonl

The question each mutation is designed to answer:

  • field_drop: Does your pipeline depend on optional fields being present?
  • timing_jitter: Do your correlation rules break on ±5 minute timing variance?
  • label_ambiguity: Do your rules fire on specific signal names or on signal patterns?
  • signal_density_high: Can your SIEM handle 3× the expected event volume without dropping records?
  • signal_density_low: Does your detection still work when 50% of events are missing?
  • phase_imbalance: Does your phase-aware analysis handle imbalanced runs?
  • technique_noise: Do your MITRE-based correlations produce false positives under noisy technique labels?
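To make the mechanics concrete, here is what a timing_jitter-style mutation could look like: shift each timestamp by up to ±5 minutes while leaving every other field untouched. The `@timestamp` field name and the seeded RNG are assumptions for the sketch, not SHENRON's actual mutation code.

```python
import random
from datetime import datetime, timedelta

def timing_jitter(events, max_minutes=5.0, seed=0):
    """Return copies of events with timestamps shifted by up to +/- max_minutes.

    Assumes each event carries an ISO-8601 '@timestamp' field. Seeding the
    RNG keeps the mutation deterministic and therefore reproducible.
    """
    rng = random.Random(seed)
    mutated = []
    for event in events:
        event = dict(event)  # shallow copy; originals stay untouched
        ts = datetime.fromisoformat(event["@timestamp"])
        shift = timedelta(minutes=rng.uniform(-max_minutes, max_minutes))
        event["@timestamp"] = (ts + shift).isoformat()
        mutated.append(event)
    return mutated
```

If a correlation rule that fired on the original run goes quiet on the jittered variant, the rule was keyed to exact timing rather than to the beacon-shaped pattern it claims to detect.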

What this does not prove

Every report SHENRON produces includes this section. It is not boilerplate — it is load-bearing.

  • That real adversarial techniques were executed
  • That real detection rules fired on these signals
  • That a SIEM or EDR would catch the described behaviors
  • That coverage in SHENRON equals coverage in production
  • That closing a SHENRON gap closes the same gap in your environment

SHENRON tests the telemetry pipeline layer. It is complementary to adversarial emulation, not a substitute. The value is in making assumptions inspectable, not in replacing the work of actually testing your stack against real execution.


The core identity

SHENRON does not ask "is your detector good?"

It asks: what exactly did your validation claim to prove?

That question is more honest, more tractable, and often more useful than vague detection coverage claims.


Repo: https://github.com/GnomeMan4201/shenron
Tag: v0.3.3 — 50 layers · 283 tests · zero hardcoded paths · PASS verdict

gnomeman4201 / badBANANA Research Collective

Observable adversarial behavior, not portable adversarial procedure.
