DEV Community

Vinay

Posted on • Originally published at vinayvobbili.github.io

detflow: A Detection-Engineering Copilot You Can pip install

TL;DR 🚀

I shipped detflow to PyPI: an open-source, vendor-neutral detection-engineering copilot. It does the four things I found myself re-implementing inside every detection-as-code workflow: draft a detection from plain English (as Sigma or Cortex XSIAM XQL), lint it offline, find overlaps against the rules you already run, and review it like a senior detection engineer. 🛡️

  • 2 formats: draft & review in Sigma or Cortex XQL (one portable, one native)
  • 1 protocol: bring any model, either an OpenAI-compatible endpoint or a LangChain failover chain
  • 0 crashes: lint & overlap need no model; review degrades to a deterministic floor

This is the detection-side sibling of iocflow. iocflow handles the indicator lifecycle; detflow handles the rule lifecycle. Same design DNA: deterministic primitives first, the LLM as an enhancement that can fail without taking the tool down with it. 🧱

The itch

A detection-as-code pipeline (the kind that turns a rule into a reviewed, tested merge request) has a handful of stages that have nothing to do with your SIEM vendor:

  • Is this rule even valid? (lint / schema-check)
  • An analyst can describe the behavior but doesn't write Sigma fluently. Can we draft the first version?
  • Are we about to ship coverage we already have? (dedup against the catalog)
  • Would a senior engineer approve this, and what would they flag? (quality, false-positive risk, ATT&CK mapping, gaps)

I'd written those four stages more than once. They're generic: the only vendor-specific parts of a real pipeline are compiling to your query language and dry-running against your tenant. So I carved the generic four out of a detection-as-code workbench I'd built and made them a clean, public library. 🧰

What it looks like

Draft a detection from a sentence, in either language:

import detflow

sigma = detflow.draft("powershell with an encoded command spawned from a Word macro")
print(sigma.rule)                       # a full Sigma rule, ready to lint

xql = detflow.draft("same thing, but for Cortex XSIAM", fmt="cortex-xql")
print(xql.rule)                         # dataset = ... | filter ... | limit 100

Lint it offline, with no model, no network, and no keys:

report = detflow.lint(sigma.rule)       # or lint_sigma / lint_xql
print(report.status, report.summary)    # "pass" / "warn" / "fail"
for f in report.findings:
    print(f.level, f.message)
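
To make "lint offline" concrete, here is a minimal sketch of the kind of structural check an offline Sigma linter might run. It is not detflow's rule set: `Finding`, `Report`, and `lint_sigma_dict` are hypothetical names, the input is an already-parsed rule (a plain dict) so the example needs nothing beyond the standard library, and the only checks are the top-level keys the Sigma specification requires plus one illustrative best-practice warning.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical names throughout; this is not detflow's implementation.
# The Sigma specification requires these top-level keys.
REQUIRED_KEYS = ("title", "logsource", "detection")


@dataclass
class Finding:
    level: str    # "error" or "warning"
    message: str


@dataclass
class Report:
    findings: List[Finding] = field(default_factory=list)

    @property
    def status(self) -> str:
        # Any error fails the rule; warnings alone only downgrade it to "warn".
        levels = {f.level for f in self.findings}
        if "error" in levels:
            return "fail"
        return "warn" if "warning" in levels else "pass"


def lint_sigma_dict(rule: Dict[str, Any]) -> Report:
    """Check an already-parsed Sigma rule for required structure."""
    report = Report()
    for key in REQUIRED_KEYS:
        if key not in rule:
            report.findings.append(Finding("error", f"missing required key: {key}"))
    detection = rule.get("detection")
    if isinstance(detection, dict) and "condition" not in detection:
        report.findings.append(Finding("error", "detection has no condition"))
    if "level" not in rule:
        # Illustrative best-practice check, chosen for this sketch only.
        report.findings.append(Finding("warning", "no severity level set"))
    return report
```

Because nothing here touches a network or a model, a check like this can gate a merge request on `status == "fail"` with zero secrets in the CI job.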

Review it like a senior engineer, deduped against your own inventory:

catalog = [
    {"name": "Encoded PowerShell", "source": "crowdstrike", "techniques": ["T1059.001"]},
    {"name": "WMI Process Create",  "source": "sigma",       "techniques": ["T1047"]},
]
result = detflow.review(sigma.rule, catalog=catalog)
print(result.quality_score, result.false_positive_risk, result.verdict)
for o in result.overlaps:               # "you may already cover this"
    print(" •", o.source, o.name, "-", o.reason)
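
One plausible way to compute a catalog overlap without a model is to compare ATT&CK technique IDs. The sketch below assumes the rule carries Sigma-style tags such as `attack.t1059.001` and that catalog entries look like the dicts above; `techniques_from_tags` and `find_overlaps_sketch` are hypothetical helpers, not detflow's matching logic, which may weigh other signals.

```python
import re
from typing import Dict, Iterable, List, Set

# Sigma tags name techniques as "attack.t1059" or "attack.t1059.001".
TECHNIQUE_TAG = re.compile(r"attack\.(t\d{4}(?:\.\d{3})?)", re.IGNORECASE)


def techniques_from_tags(tags: Iterable[str]) -> Set[str]:
    """Extract upper-cased technique IDs from Sigma-style tags."""
    found = set()
    for tag in tags:
        match = TECHNIQUE_TAG.fullmatch(tag.strip())
        if match:
            found.add(match.group(1).upper())
    return found


def find_overlaps_sketch(rule_tags: Iterable[str], catalog: List[Dict]) -> List[Dict[str, str]]:
    """Return catalog entries that share at least one exact technique ID."""
    wanted = techniques_from_tags(rule_tags)
    overlaps = []
    for entry in catalog:
        shared = wanted & {t.upper() for t in entry.get("techniques", [])}
        if shared:
            overlaps.append({
                "source": entry["source"],
                "name": entry["name"],
                "reason": "shares " + ", ".join(sorted(shared)),
            })
    return overlaps
```

This version matches exact IDs only, so a rule tagged `attack.t1059` would not overlap an entry listing `T1059.001`; whether parent and sub-technique should count as the same coverage is a policy choice left out of the sketch.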

The whole flow, end to end:

flowchart LR
    NL([plain English]) -->|draft| RULE[Sigma / XQL rule]
    RULE -->|lint| LINT[schema + best-practice findings]
    RULE -->|find_overlaps| OV[catalog dedup]
    LINT --> REV{{review}}
    OV --> REV
    REV --> V([quality · FP risk · ATT&CK · verdict])

There's a CLI too, for the terminal-and-CI crowd:

detflow draft "credential dumping via comsvcs MiniDump" -f cortex-xql
detflow lint rule.yml
detflow review rule.yml --catalog catalog.json --json

Model-agnostic on purpose 🔌

detflow doesn't import an SDK or hard-code a provider. A "model" is anything with one method:

def complete(self, system: str, user: str, *, json: bool = False) -> str: ...
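
A one-method contract like this is easy to satisfy with a stub, which is useful in tests. The sketch below is an assumption-laden illustration: `CompletionModel`, `CannedModel`, and `ask` are hypothetical names that are not part of detflow, and only the `complete` signature is taken from the line above.

```python
from json import dumps, loads
from typing import List, Protocol, Tuple


class CompletionModel(Protocol):  # hypothetical name for the one-method contract
    def complete(self, system: str, user: str, *, json: bool = False) -> str: ...


class CannedModel:
    """Returns a fixed reply; handy for tests with no network and no keys."""

    def __init__(self, reply: dict) -> None:
        self._reply = reply
        self.calls: List[Tuple[str, str, bool]] = []

    def complete(self, system: str, user: str, *, json: bool = False) -> str:
        # Inside this method `json` is the keyword flag, not the json module,
        # which is why dumps/loads are imported by name at the top.
        self.calls.append((system, user, json))
        return dumps(self._reply) if json else str(self._reply.get("text", ""))


def ask(model: CompletionModel, question: str) -> dict:
    """Request a JSON reply from any object that satisfies the contract."""
    raw = model.complete("You review detection rules.", question, json=True)
    return loads(raw)
```

Since the contract is structural, `CannedModel` never has to inherit from anything; any object with a matching `complete` method works.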

That gives you three ways in. A built-in OpenAIChatModel talks to any OpenAI-compatible endpoint: OpenAI, Azure, a local vLLM/Ollama server, a gateway. default_model() builds one from DETFLOW_LLM_* env vars. Or you wrap any LangChain chat model:

from langchain_failover import FailoverChatModel
from detflow.llm import LangChainModel

chain = FailoverChatModel(models=[primary, local_fallback])
model = LangChainModel(chain)
detflow.review(rule, catalog=catalog, model=model)   # rides the failover chain

That FailoverChatModel is langchain-failover, another package I extracted and published, so a primary-model outage transparently falls back to a secondary mid-review. Three of my OSS packages quietly eating each other's dog food. 🐕
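
The failover idea itself fits in a few lines when expressed against the same one-method contract. This is a simplified sketch of the pattern, not how langchain-failover is implemented: `FirstAvailableModel`, `DownModel`, and `UpModel` are hypothetical names, and a real chain would likely add timeouts, retries, and health tracking.

```python
from typing import List, Sequence


class FirstAvailableModel:
    """Try each wrapped model in order and return the first successful reply."""

    def __init__(self, models: Sequence) -> None:
        if not models:
            raise ValueError("need at least one model")
        self._models = list(models)

    def complete(self, system: str, user: str, *, json: bool = False) -> str:
        errors: List[str] = []
        for model in self._models:
            try:
                return model.complete(system, user, json=json)
            except Exception as exc:  # any provider failure moves on to the next model
                errors.append(f"{type(model).__name__}: {exc}")
        raise RuntimeError("all models failed: " + "; ".join(errors))


class DownModel:
    """Stand-in for a provider that is currently unreachable."""

    def complete(self, system: str, user: str, *, json: bool = False) -> str:
        raise ConnectionError("primary unreachable")


class UpModel:
    """Stand-in for a healthy fallback."""

    def complete(self, system: str, user: str, *, json: bool = False) -> str:
        return "ok"
```

The wrapper exposes the same `complete` method as the models it wraps, so callers cannot tell whether they are talking to one model or a chain.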

Never-raises, deterministic floor

The contract I care about most: detflow degrades, it doesn't break. 🎯

  • Lint and overlap need no model at all. They're pure, stdlib-plus-PyYAML, and run in CI with zero secrets.
  • Drafting requires a model (you're asking it to write), but a model error comes back as a result with an error field, not an exception.
  • Review uses a model when one is present and falls back to a deterministic floor when it isn't. You still get the lint results, the catalog overlaps, and the parsed ATT&CK techniques. review() never raises.

So detflow is safe to drop into a pipeline that sometimes has an LLM available and sometimes doesn't. The boring, testable parts stay up regardless; the AI adds judgment when it can.
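
The degrade-don't-break contract can be sketched as follows: run the deterministic checks first, then treat the model call as optional and record any failure on the result instead of raising. Every name here (`ReviewResult`, `review_sketch`, `missing_condition`, `BrokenModel`) is hypothetical and the fields are simplified; the sketch also assumes the deterministic checks themselves are pure and do not raise.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ReviewResult:  # hypothetical, simplified result shape
    findings: List[str] = field(default_factory=list)
    verdict: Optional[str] = None   # filled in only when a model answers
    error: Optional[str] = None     # set instead of raising
    used_model: bool = False


def review_sketch(rule: str,
                  checks: List[Callable[[str], List[str]]],
                  model=None) -> ReviewResult:
    result = ReviewResult()
    # Deterministic floor: always runs, needs no model.
    for check in checks:
        result.findings.extend(check(rule))
    if model is None:
        return result
    # Optional enhancement: a model failure becomes data, not an exception.
    try:
        result.verdict = model.complete("Review this detection rule.", rule)
        result.used_model = True
    except Exception as exc:
        result.error = f"{type(exc).__name__}: {exc}"
    return result


def missing_condition(rule: str) -> List[str]:
    """Toy deterministic check used only for this example."""
    return [] if "condition" in rule else ["rule has no condition"]


class BrokenModel:
    """Stand-in for a model that is down."""

    def complete(self, system: str, user: str, *, json: bool = False) -> str:
        raise TimeoutError("model unreachable")
```

Calling `review_sketch("title: x", [missing_condition], model=BrokenModel())` still returns the deterministic finding, with the timeout recorded in `error` and `verdict` left as `None`.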

Why two formats

Sigma is the portable, reviewable, vendor-neutral standard: it lints cleanly and ports across SIEMs. Cortex XSIAM XQL is what actually runs on that platform. Supporting both means you can author once in Sigma for portability, or go straight to XQL when you want the platform's full expressiveness, and detflow lints and reviews either one. The drafting prompts are language-aware (the XQL prompt knows XQL has no startswith/endswith and uses | filter, not SQL where), so you don't get SQL-shaped hallucinations back. 🧠
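
The two constraints named above (no startswith/endswith, and `| filter` rather than SQL `where`) can also be checked deterministically after drafting. The following is a rough sketch under that assumption, not detflow's XQL linter: `sqlisms_in` is a hypothetical helper, and its regexes are deliberately naive, so a legitimate string literal containing the word "where" would be flagged too.

```python
import re
from typing import List

# Each pattern pairs with the message reported when it matches.
SQLISMS = [
    (re.compile(r"\bwhere\b", re.IGNORECASE),
     "SQL-style 'where' found; XQL stages use '| filter'"),
    (re.compile(r"\b(?:starts|ends)with\s*\(", re.IGNORECASE),
     "startswith/endswith call found; not available in XQL per the constraints above"),
]


def sqlisms_in(query: str) -> List[str]:
    """Return one message per SQL-shaped construct spotted in the query text."""
    return [message for pattern, message in SQLISMS if pattern.search(query)]
```

A pipeline-style query such as `dataset = xdr_data | filter action_process_image_name contains "powershell"` passes with no messages, while `select * from xdr_data where name startswith("a")` trips both checks.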

The bigger pattern

This is the same lesson as the IOC work: when you want to show AI in your engineering, the junior move is to make everything an LLM call. The stronger, more deployable story is deterministic primitives plus optional AI: the schema checks and dedup are boring and tested, the model writes and reviews where judgment helps, and nothing falls over when the model is slow or absent.

detflow runs on Python 3.9+, keeps import detflow dependency-light (the LLM client is an extra), ships py.typed for downstream type-checking, and every piece is independently useful.

If you run a detection-as-code pipeline, I'd love to know which query language you'd want next. 👋
