What changed between "here is synthetic telemetry" and "here is what your validation claims to prove."
Repo: https://github.com/GnomeMan4201/shenron
The first article described what SHENRON is: a defensive telemetry simulation platform that generates adversarial-shaped synthetic events with no payloads, no shellcode, no subprocess execution, and no portable adversarial procedure.
This article describes what it became.
The gap the first version couldn't close
Generating synthetic telemetry is useful. It lets you test whether your SIEM ingests the right fields, whether your detection rules are pointed at the right signal vocabulary, and whether your analysts recognize the event sequences they need to recognize.
But it doesn't answer the harder question: what exactly did your validation claim to prove?
A detection stack validated against persistence-shaped telemetry has not been tested against C2 beaconing, lateral movement, or anti-forensics. That is not a failure — it is a scope boundary. The problem is when the scope boundary is invisible.
v0.3.3 makes it visible.
In plain terms: SHENRON does not ask whether a detector is "good." It asks whether the evidence produced by a synthetic telemetry run actually supports the claim being made about that run. That makes it less of a pass/fail simulator and more of a scope-control instrument for blue-team validation.
What ships in one command
git clone https://github.com/GnomeMan4201/shenron
cd shenron
python3 shenron.py --release-demo
That produces a 9-file artifact bundle:
- shenron_demo_run.jsonl: 40 synthetic events
- shenron_demo_report.md: run report
- safety_verification.md: safety contract verification
- navigator_layer.json: ATT&CK Navigator layer (synthetic)
- shenron_demo_run_ecs.json: ECS-formatted events
- shenron_demo_run_ecs_bulk.ndjson: Elastic bulk API format
- shenron_demo_run_splunk_hec.json: Splunk HEC format
- narrative.md: tactic profile narrative
- charts/: 5 dark-mode PNGs
- MANIFEST.md: bundle index
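If you script against the bundle, a completeness check is cheap insurance. A minimal sketch, assuming the bundle lands in an artifacts/demo directory (the output path is an assumption; point it wherever --release-demo writes on your machine):

```python
from pathlib import Path

# File names taken from the bundle listing above; the artifacts/demo
# output directory is an assumption, not a documented default.
EXPECTED_FILES = [
    "shenron_demo_run.jsonl", "shenron_demo_report.md",
    "safety_verification.md", "navigator_layer.json",
    "shenron_demo_run_ecs.json", "shenron_demo_run_ecs_bulk.ndjson",
    "shenron_demo_run_splunk_hec.json", "narrative.md", "MANIFEST.md",
]

bundle = Path("artifacts/demo")
missing = [name for name in EXPECTED_FILES if not (bundle / name).is_file()]
if not (bundle / "charts").is_dir():
    missing.append("charts/")
print("bundle complete" if not missing else f"missing: {missing}")
```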
Every record in the JSONL carries an explicit safety contract:
{
  "phase": "OBSERVE",
  "layer": "beacon_emitter_cloak",
  "signal": "periodic_beacon",
  "mitre_technique": "T1071.001",
  "safety": {
    "simulation_only": true,
    "executable": false,
    "payload_present": false,
    "portable_adversarial_procedure": false,
    "network_connection": false,
    "subprocess_spawned": false,
    "real_file_written": false,
    "shell_invoked": false
  }
}
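SHENRON ships --verify-safety for this, but the contract is simple enough to re-check independently. A minimal external sketch using the field names from the record above (the strict-field list mirrors that record; this is not SHENRON's internal implementation):

```python
import json

# Safety fields that must be exactly false in every synthetic record,
# taken from the contract shown above.
STRICT_FALSE = [
    "executable", "payload_present", "portable_adversarial_procedure",
    "network_connection", "subprocess_spawned", "real_file_written",
    "shell_invoked",
]

def verify_safety_contract(path: str) -> bool:
    """Return True only if every record carries an intact safety contract."""
    with open(path, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            safety = json.loads(line).get("safety", {})
            if safety.get("simulation_only") is not True:
                print(f"line {line_no}: simulation_only is not true")
                return False
            for field in STRICT_FALSE:
                if safety.get(field) is not False:
                    print(f"line {line_no}: {field} is not false")
                    return False
    return True

print(verify_safety_contract("artifacts/demo/shenron_demo_run.jsonl"))
```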
The ECS export goes directly into Elastic:
curl -X POST 'http://localhost:9200/_bulk' \
  -H 'Content-Type: application/x-ndjson' \
  --data-binary @shenron_demo_run_ecs_bulk.ndjson
Every ECS event carries event.dataset: shenron.synthetic, labels.simulation_only: true, and [SHENRON SYNTHETIC] in the message field. These are not real events. Your detection rules firing or not firing on them tells you something about your rules, not about real adversarial behavior.
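Before pointing rules at that index, it is worth asserting the markers really are on every document. A small pre-ingest sanity check, assuming the standard _bulk layout of alternating action and source lines and nested ECS objects (if the export uses dotted keys like "event.dataset" instead, adjust the lookups):

```python
import json

def check_synthetic_markers(path: str) -> None:
    """Fail loudly if any event in the bulk file lacks the synthetic markers."""
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            doc = json.loads(raw)
            if "index" in doc or "create" in doc:
                continue  # bulk action metadata line, not an event
            assert doc["event"]["dataset"] == "shenron.synthetic"
            # Lenient check: the label may serialize as a boolean or a string.
            assert doc["labels"]["simulation_only"] in (True, "true")
            assert "[SHENRON SYNTHETIC]" in doc["message"]

check_synthetic_markers("shenron_demo_run_ecs_bulk.ndjson")
```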
The coverage gap feature
The most important feature in v0.3.3 is --narrate.
After running two different scenarios, compare them:
python3 shenron.py --scenario apt_kill_chain --dry-run
python3 shenron.py --scenario persistence_runbook --dry-run
python3 shenron.py --compare <apt_id> <persistence_id> --narrate
The terminal output:
[NARRATIVE] apt_kill_chain → persistence_runbook
Coverage gap families (4):
✗ Command-and-Control
✗ Defense Evasion
✗ Lateral Movement
✗ Discovery
Primary concern:
If C2-shaped telemetry is not in your validation set, your detectors
have not been tested against the phase where most APT campaigns are
first visible — initial callback after compromise.
The full narrative report names every missing signal by tactic family:
### Command-and-Control
MITRE descriptors not present in Run B: T1071, T1132
Signal shapes absent from Run B: DNS-based C2 signaling,
encoded URI C2 parameter, and periodic C2 beaconing.
> If C2-shaped telemetry is not in your validation set, your detectors
> have not been tested against the phase where most APT campaigns are
> first visible — initial callback after compromise.
And produces a concrete recommendation:
To close the Command-and-Control, Defense Evasion, Lateral Movement,
and Discovery gaps, run a scenario that includes those signal families
alongside persistence_runbook. Suggested: apt_kill_chain (covers C2,
lateral movement, persistence, and evasion) or evasion_stress_test
(covers masquerading, log deletion, and anti-forensics).
This is deterministic and template-based. No LLM. The narration engine classifies signals into tactic families using a static taxonomy of 80+ signal names and 35 MITRE technique IDs, then assembles analyst-language prose from that classification.
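The mechanics are easy to picture. A toy version of the classification and gap computation, with a few illustrative taxonomy entries standing in for SHENRON's actual 80+ signal table (the signal names here are made up for the example):

```python
# Illustrative static taxonomy: signal name -> tactic family.
SIGNAL_TAXONOMY = {
    "periodic_beacon": "Command-and-Control",
    "dns_c2_signaling": "Command-and-Control",
    "scheduled_task_creation": "Persistence",
    "hidden_temp_directory": "Defense Evasion",
}

def tactic_families(signals: set[str]) -> set[str]:
    """Map observed signal names onto tactic families via the taxonomy."""
    return {SIGNAL_TAXONOMY[s] for s in signals if s in SIGNAL_TAXONOMY}

def coverage_gap(run_a: set[str], run_b: set[str]) -> set[str]:
    """Tactic families present in run A's telemetry but absent from run B's."""
    return tactic_families(run_a) - tactic_families(run_b)

gaps = coverage_gap(
    {"periodic_beacon", "dns_c2_signaling", "scheduled_task_creation"},
    {"scheduled_task_creation", "hidden_temp_directory"},
)
print(sorted(gaps))  # ['Command-and-Control']
```

Everything downstream of that set difference is template assembly: each gap family selects canned analyst-language prose, which is why the narration is reproducible run to run.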
Coverage assumption auditing
Define what you believe your detection stack covers:
name: persistence_coverage_assumption
claims:
  - "We can observe persistence-shaped telemetry"
  - "We can detect suspicious scheduled task behavior"
expected_techniques:
  - T1053.005
  - T1547.001
expected_signals:
  - scheduled_task_creation
  - hidden_temp_directory
expected_phases:
  - EXECUTE
Audit it:
python3 shenron.py --assumption assumptions/my_assumption.yaml \
  --events artifacts/demo/shenron_demo_run.jsonl
[ASSUMPTION] persistence_coverage_assumption
[RECORDS] 40
Claims 0 supported 0 partial 2 unsupported
Techniques 2 observed 1 missing
Signals 1 observed 1 missing
[COVERAGE] 45.0%
[VERDICT] FAIL
SHENRON is not telling you your detection stack fails. It is telling you that your assumption, as written, is not supported by the evidence in this artifact. That is a different question, and a more useful one.
The assumption auditor asks: what exactly did your validation claim to prove, and does the evidence support that claim?
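A back-of-the-envelope version of that audit, using the field names from the JSONL record shown earlier (the flat coverage formula is an assumption; SHENRON's real scoring also grades claims as supported/partial/unsupported and checks expected phases):

```python
import json

def audit(events_path: str, expected_techniques: set[str],
          expected_signals: set[str]):
    """Which expected techniques and signals actually appear in the run."""
    seen_techniques, seen_signals = set(), set()
    with open(events_path, encoding="utf-8") as fh:
        for line in fh:
            rec = json.loads(line)
            seen_techniques.add(rec.get("mitre_technique"))
            seen_signals.add(rec.get("signal"))
    missing_t = expected_techniques - seen_techniques
    missing_s = expected_signals - seen_signals
    total = len(expected_techniques) + len(expected_signals)
    covered = total - len(missing_t) - len(missing_s)
    pct = 100.0 * covered / total if total else 0.0
    return missing_t, missing_s, pct

print(audit(
    "artifacts/demo/shenron_demo_run.jsonl",
    {"T1053.005", "T1547.001"},
    {"scheduled_task_creation", "hidden_temp_directory"},
))
```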
Coverage drift tracking
python3 shenron.py --coverage-history --out-dir reports/history
With 256 runs across 8 campaigns in the timeline:
[HISTORY] 256 runs · 8 campaigns
apt_kill_chain 7 runs 16 techniques
c2_shape_detection_test 79 runs 16 techniques
full_stack_adversarial 26 runs 28 techniques
persistence_pressure_test 108 runs 10 techniques
[DRIFT] No technique drift detected
No drift is good news for scenario consistency: the campaigns are producing stable technique sets across runs. It does not mean the detection stack is effective; it means the validation artifact has not silently changed shape. The tracker becomes more useful over time as you modify scenarios, update layer configurations, and run across different environments. If a configuration change silently drops technique coverage, the history report shows it.
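The drift check itself reduces to a per-campaign set comparison across consecutive runs. A sketch, where run_history is a hypothetical stand-in for SHENRON's stored timeline:

```python
from collections import defaultdict

def detect_drift(run_history: list[dict]) -> dict[str, list[tuple]]:
    """run_history items look like {"campaign": str, "techniques": set[str]}."""
    last_seen: dict[str, set] = {}
    drift = defaultdict(list)
    for run in run_history:
        name, techniques = run["campaign"], set(run["techniques"])
        if name in last_seen and techniques != last_seen[name]:
            drift[name].append((
                sorted(last_seen[name] - techniques),  # silently dropped
                sorted(techniques - last_seen[name]),  # newly added
            ))
        last_seen[name] = techniques
    return dict(drift)  # empty dict == "No technique drift detected"
```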
Mutation variants
Test whether your analysis pipeline is brittle:
python3 shenron.py --mutate \
  --events artifacts/demo/shenron_demo_run.jsonl \
  --out-dir artifacts/mutations
Seven safe variants:
[field_drop ] 40 → 40 records 40 changes
[timing_jitter ] 40 → 40 records 40 changes
[label_ambiguity ] 40 → 40 records 3 changes
[signal_density_high ] 40 → 120 records 80 changes
[signal_density_low ] 40 → 17 records 23 changes
[phase_imbalance ] 40 → 40 records 30 changes
[technique_noise ] 40 → 40 records 11 changes
Run --verify-safety on any variant to confirm the safety contract is intact:
python3 shenron.py --verify-safety artifacts/mutations/mutation_label_ambiguity.jsonl
The question each mutation is designed to answer:
- field_drop: Does your pipeline depend on optional fields being present?
- timing_jitter: Do your correlation rules break on ±5 minute timing variance?
- label_ambiguity: Do your rules fire on specific signal names or on signal patterns?
- signal_density_high: Can your SIEM handle 3× the expected event volume without dropping records?
- signal_density_low: Does your detection still work when 50% of events are missing?
- phase_imbalance: Does your phase-aware analysis handle imbalanced runs?
- technique_noise: Do your MITRE-based correlations produce false positives under noisy technique labels?
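The repo excerpt here does not show the mutation implementations, but the shape of a safe variant is easy to sketch. A hypothetical timing_jitter-style pass, assuming each record carries an ISO-8601 timestamp field (not visible in the record excerpt above) and copying every other field, including the safety block, through untouched:

```python
import json
import random
from datetime import datetime, timedelta
from pathlib import Path

def timing_jitter(in_path: str, out_path: str, seed: int = 1337) -> None:
    """Shift each record's timestamp by up to ±5 minutes, reproducibly."""
    rng = random.Random(seed)  # seeded so the variant is reproducible
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            rec = json.loads(line)
            if "timestamp" in rec:  # assumed field name
                ts = datetime.fromisoformat(rec["timestamp"])
                ts += timedelta(seconds=rng.randint(-300, 300))
                rec["timestamp"] = ts.isoformat()
            dst.write(json.dumps(rec) + "\n")

timing_jitter("artifacts/demo/shenron_demo_run.jsonl",
              "artifacts/mutations/timing_jitter_sketch.jsonl")
```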
What this does not prove
Every report SHENRON produces includes this section. It is not boilerplate; it is load-bearing. A SHENRON run does not prove:
- That real adversarial techniques were executed
- That real detection rules fired on these signals
- That a SIEM or EDR would catch the described behaviors
- That coverage in SHENRON equals coverage in production
- That closing a SHENRON gap closes the same gap in your environment
SHENRON tests the telemetry pipeline layer. It is complementary to adversarial emulation, not a substitute. The value is in making assumptions inspectable, not in replacing the work of actually testing your stack against real execution.
The core identity
SHENRON does not ask "is your detector good?"
It asks: what exactly did your validation claim to prove?
That question is more honest, more tractable, and often more useful than vague detection coverage claims.
Repo: https://github.com/GnomeMan4201/shenron
Tag: v0.3.3 — 50 layers · 283 tests · zero hardcoded paths · PASS verdict
gnomeman4201 / badBANANA Research Collective
Observable adversarial behavior, not portable adversarial procedure.