SentinelCraft | Sentinel Detection-as-Code | R.A.H.S.I. Framework™ Analysis
A SOC Engineering Blueprint for Managing Microsoft Sentinel Detections as Code
Microsoft Sentinel detections should not live only as portal edits.
They should be engineered, versioned, reviewed, tested, deployed, measured, and improved like production security code.
That is the purpose of SentinelCraft.
SentinelCraft is a Detection-as-Code framework for managing Microsoft Sentinel content through source control, infrastructure-as-code, CI/CD pipelines, controlled promotion, and SOC engineering discipline.
This includes:
- Analytics rules
- Scheduled KQL detections
- Automation rules
- Hunting queries
- Parsers
- Playbooks
- Workbooks
- MITRE ATT&CK mappings
- Entity mappings
- Custom alert details
- Incident grouping logic
- Deployment workflows
- Rollback procedures
- Detection lifecycle metadata
A Sentinel rule is not mature because it exists.
A Sentinel rule becomes mature when the SOC can answer:
- Where is the source-controlled rule?
- Who reviewed the change?
- Which KQL logic was tested?
- Which MITRE tactic or technique does it map to?
- Which entities are mapped?
- Which custom details are surfaced?
- Which automation rule or playbook responds?
- Which workspace receives deployment?
- How is rollback handled?
- How is detection value measured?
Without this structure, Microsoft Sentinel content becomes portal drift.
With this structure, Microsoft Sentinel becomes an engineered detection platform.
1. Why Detection-as-Code Matters in Microsoft Sentinel
Most SOC teams start with portal-based configuration.
That is normal.
They create analytics rules, adjust KQL, enable templates, configure automation rules, and build hunting queries directly in the Sentinel interface.
But as the environment matures, portal-only management creates problems.
Common issues include:
- No version history
- No peer review
- No approval workflow
- No rollback plan
- No test gate
- No deployment consistency
- No clear ownership
- No environment promotion
- No reliable change tracking
- Workspace drift
- Duplicate rules
- Broken KQL after edits
- Inconsistent naming
- Unmapped entities
- Unclear MITRE coverage
- Weak documentation
This is where Detection-as-Code becomes necessary.
Detection-as-Code means detection content is treated like software.
It is written, reviewed, stored, tested, deployed, and maintained through engineering workflows.
For Microsoft Sentinel, this means detection content should move from isolated portal configuration into a controlled source repository.
2. What SentinelCraft Means
SentinelCraft is the discipline of building and operating Microsoft Sentinel detection content as code.
It connects:
Threat Requirement
↓
KQL Detection Logic
↓
Rule Metadata
↓
MITRE ATT&CK Mapping
↓
Entity Mapping
↓
Automation and Response
↓
Source Control
↓
Pull Request Review
↓
CI/CD Validation
↓
Workspace Deployment
↓
Monitoring and Tuning
↓
Versioned Improvement
The repository becomes the source of truth.
The Sentinel portal becomes the runtime surface.
This distinction matters.
If the portal is the source of truth, changes become hard to govern.
If the repository is the source of truth, detections become manageable, reviewable, portable, and recoverable.
3. Core Principle: The Repository Is the Source of Truth
In a SentinelCraft model, detection content should be stored in a source control repository such as GitHub or Azure DevOps.
The repository should contain the deployable definition of Sentinel content, not just screenshots or notes.
This may include:
- Bicep files
- ARM templates
- Analytics rule definitions
- Automation rule definitions
- Hunting query definitions
- Parser definitions
- Playbook templates
- Workbook templates
- Parameter files
- Deployment scripts
- Documentation
- Test data
- Rule ownership metadata
When the repository is the source of truth, changes can be reviewed before deployment.
This creates better control over the SOC content lifecycle.
4. Sentinel Content That Should Be Managed as Code
Detection-as-Code should not be limited to analytics rules.
A mature Microsoft Sentinel environment has many connected content types.
| Content Type | Why It Should Be Managed as Code |
|---|---|
| Analytics rules | Core detection logic and alert generation |
| Automation rules | Incident handling, routing, tagging, and response control |
| Hunting queries | Reusable threat hunting logic |
| Parsers | Field normalization and reusable query abstraction |
| Playbooks | Automated response and enrichment workflows |
| Workbooks | SOC visibility, dashboards, and coverage reporting |
| Watchlists | Detection context, allow lists, critical assets, and enrichment data |
| Parameter files | Environment-specific deployment values |
| Documentation | Rule purpose, ownership, testing, and response guidance |
When these assets are managed separately through manual changes, the SOC loses control.
When they are managed together as code, the SOC gains engineering discipline.
5. Detection-as-Code Pipeline
A strong SentinelCraft pipeline should follow a clear workflow.
Developer or Detection Engineer
↓
Feature Branch
↓
KQL and Template Update
↓
Local Validation
↓
Pull Request
↓
Peer Review
↓
Automated Checks
↓
CI/CD Deployment
↓
Non-Production Sentinel Workspace
↓
Validation and Tuning
↓
Production Promotion
↓
Monitoring and Feedback
The goal is not to make detection engineering slower.
The goal is to make detection engineering safer, repeatable, and measurable.
6. Recommended Repository Structure
A clean repository structure helps the SOC scale detection content.
Example structure:
```
sentinelcraft/
│
├── analytics-rules/
│   ├── identity/
│   ├── endpoint/
│   ├── cloud/
│   ├── network/
│   └── saas/
│
├── automation-rules/
│
├── hunting-queries/
│
├── parsers/
│
├── playbooks/
│
├── workbooks/
│
├── watchlists/
│
├── parameters/
│   ├── dev/
│   ├── test/
│   └── prod/
│
├── docs/
│   ├── detection-standards.md
│   ├── naming-conventions.md
│   └── review-checklist.md
│
└── pipelines/
    ├── github-actions/
    └── azure-devops/
```
This structure separates content by function and environment.
It also makes it easier for detection engineers, SOC analysts, cloud engineers, and security architects to collaborate.
7. Bicep and ARM Templates for Sentinel Content
Microsoft Sentinel content can be represented using infrastructure-as-code formats such as Bicep or ARM templates.
Bicep is often easier to read and maintain than raw ARM JSON.
A Detection-as-Code approach should define Sentinel content in reusable templates that can be deployed consistently across environments.
This is especially useful when managing:
- Multiple Sentinel workspaces
- Dev, test, and production environments
- Regional SOC deployments
- MSSP or multi-tenant operations
- Standard detection baselines
- Repeatable automation patterns
The key is consistency.
A production analytics rule should not be manually different from the tested version unless the difference is intentional, documented, and parameterized.
8. Analytics Rule Engineering
A production Microsoft Sentinel analytics rule should define more than a KQL query.
It should include the complete detection behavior.
Important components include:
- Rule name
- Description
- KQL query
- Query frequency
- Query period
- Severity
- MITRE ATT&CK tactic
- MITRE ATT&CK technique
- Entity mapping
- Custom details
- Alert grouping
- Incident creation settings
- Suppression logic
- Automation rule linkage
- Response playbook linkage
- Owner
- Version
- Validation status
A rule without metadata is hard to operate.
A rule with strong metadata becomes a managed detection asset.
9. Example Analytics Rule Metadata
Detection-as-Code should store engineering context alongside detection logic.
```yaml
rule_name: Suspicious Privileged Role Assignment
platform: Microsoft Sentinel
category: Identity
severity: High
status: Production
owner: SOC Detection Engineering
mitre_tactic: Privilege Escalation
mitre_technique_id: T1078
mitre_technique_name: Valid Accounts
data_sources:
  - AuditLogs
  - AzureActivity
query_frequency: 15m
query_period: 1h
entity_mappings:
  - Account
  - IP
  - AzureResource
custom_details:
  - RoleName
  - TargetUser
  - InitiatingUser
automation:
  - Enrich identity context
  - Notify SOC channel
  - Create high-priority incident
last_validated: 2026-05-12
version: 1.0.0
```
This metadata helps the SOC understand what the rule does, why it exists, who owns it, and how it should behave.
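A metadata gate like this can be enforced in CI before a pull request merges. The sketch below (Python; the required field names follow the example above and should be adjusted to your repository's own schema) reports which operational fields a rule definition is missing.

```python
# Minimal metadata gate: verify a detection's metadata carries the fields
# the SOC needs to operate it. Field names mirror the example above and
# are illustrative, not a fixed schema.
REQUIRED_FIELDS = {
    "rule_name", "severity", "status", "owner",
    "mitre_tactic", "mitre_technique_id",
    "entity_mappings", "version",
}

def missing_metadata(rule: dict) -> list[str]:
    """Return the required fields that are absent or empty, sorted."""
    return sorted(field for field in REQUIRED_FIELDS if not rule.get(field))

rule = {
    "rule_name": "Suspicious Privileged Role Assignment",
    "severity": "High",
    "status": "Production",
    "owner": "SOC Detection Engineering",
    "mitre_tactic": "Privilege Escalation",
    "mitre_technique_id": "T1078",
    "entity_mappings": ["Account", "IP", "AzureResource"],
    "version": "1.0.0",
}

print(missing_metadata(rule))                          # complete rule -> []
print(missing_metadata({"rule_name": "Draft rule"}))   # everything else missing
```

Failing the pipeline when this list is non-empty keeps ownerless or unmapped rules out of production by construction.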
10. KQL as Detection Logic
KQL is not just a query language.
In SentinelCraft, KQL is the expression of adversary behavior.
A weak query searches for keywords.
A strong detection models behavior.
Example concept:
```kql
AuditLogs
| where OperationName has_any ("Add member to role", "Add eligible member to role")
| extend InitiatingUser = tostring(InitiatedBy.user.userPrincipalName)
| extend TargetUser = tostring(TargetResources[0].userPrincipalName)
| extend RoleName = tostring(TargetResources[0].modifiedProperties[0].newValue)
| where RoleName has_any ("Global Administrator", "Privileged Role Administrator", "Security Administrator")
| project
    TimeGenerated,
    OperationName,
    InitiatingUser,
    TargetUser,
    RoleName,
    Result
```
This query is not simply searching logs.
It is modeling a security-relevant behavior: privileged role assignment.
That behavior can become an analytics rule, hunting query, workbook panel, and response workflow.
11. Rule Naming Convention
A consistent naming convention improves readability and triage.
Recommended format:
[Category][ATT&CK-ID] Detection Name
Examples:
[Identity][T1078] Suspicious Privileged Role Assignment
[Endpoint][T1059.001] Suspicious Encoded PowerShell Command
[Cloud][T1098] Unusual Application Consent Grant
A good name should help analysts understand the detection before they open the query.
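The recommended format can be checked mechanically in CI. A minimal sketch (the regex and function name are illustrative) that accepts technique IDs and sub-technique IDs such as T1059.001:

```python
import re

# Check for the recommended "[Category][ATT&CK-ID] Detection Name" format.
# The technique pattern allows sub-techniques such as T1059.001.
NAME_PATTERN = re.compile(
    r"^\[(?P<category>[A-Za-z]+)\]"      # e.g. [Identity]
    r"\[(?P<technique>T\d{4}(?:\.\d{3})?)\]"  # e.g. [T1078] or [T1059.001]
    r" (?P<name>\S.*)$"                  # the human-readable detection name
)

def check_rule_name(name: str) -> bool:
    """Return True when a rule name follows the naming convention."""
    return NAME_PATTERN.match(name) is not None

print(check_rule_name("[Identity][T1078] Suspicious Privileged Role Assignment"))  # True
print(check_rule_name("Suspicious Privileged Role Assignment"))                    # False
```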
12. Rule Versioning
Every production detection should have a version.
Versioning helps the SOC understand the history and maturity of a rule.
Example version model:
| Version | Meaning |
|---|---|
| 0.1.0 | Draft logic |
| 0.5.0 | Lab-tested rule |
| 0.9.0 | Pilot deployment |
| 1.0.0 | Production-ready |
| 1.1.0 | Minor tuning update |
| 2.0.0 | Major logic redesign |
Versioning is especially useful when rules are tuned after false positives, incident reviews, red team findings, or threat intelligence updates.
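The version model above maps naturally onto semantic versioning. A small helper (illustrative, assuming `major.minor.patch` strings) can make bumps consistent across the repository:

```python
def bump_version(version: str, change: str) -> str:
    """Advance a semantic version according to the change type.

    'major' -> major logic redesign (resets minor and patch)
    'minor' -> tuning update (resets patch)
    'patch' -> metadata or cosmetic fix
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")

print(bump_version("1.0.0", "minor"))  # 1.1.0 - minor tuning update
print(bump_version("1.1.0", "major"))  # 2.0.0 - major logic redesign
```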
13. Pull Request Review for Detections
Every production detection change should go through review.
A pull request should answer:
- What changed?
- Why is the change needed?
- Which threat behavior does it detect?
- Which data source does it require?
- Which KQL logic changed?
- Which MITRE mapping applies?
- Which entities are mapped?
- What false positives are expected?
- Was the detection tested?
- What is the rollback plan?
- Which workspace will receive deployment?
This turns detection updates into controlled engineering changes.
14. Detection Review Checklist
Before merging a Sentinel rule, reviewers should verify:
| Review Area | Question |
|---|---|
| Purpose | Does the rule have a clear detection objective? |
| KQL quality | Is the query efficient, readable, and accurate? |
| Telemetry | Are required tables available and reliable? |
| ATT&CK mapping | Are tactics and techniques correctly mapped? |
| Entity mapping | Are users, hosts, IPs, files, or resources mapped? |
| Alert details | Does the alert explain why it fired? |
| Severity | Is severity justified by risk and confidence? |
| False positives | Are expected benign patterns documented? |
| Automation | Is response automation appropriate? |
| Testing | Has the rule been validated? |
| Ownership | Is an owner assigned? |
| Rollback | Can the change be reverted safely? |
This checklist improves quality before deployment.
15. CI/CD for Sentinel Content
A SentinelCraft CI/CD workflow should validate and deploy content automatically.
The pipeline can perform checks such as:
- Template syntax validation
- Required metadata validation
- File naming validation
- KQL formatting checks
- Parameter validation
- Environment targeting
- Deployment preview
- Non-production deployment
- Production deployment after approval
- Deployment logging
- Failure notification
A simple pipeline flow may look like this:
Pull Request Opened
↓
Static Validation
↓
KQL Review
↓
Template Validation
↓
Peer Approval
↓
Merge to Main
↓
Deploy to Test Workspace
↓
Validation
↓
Manual Approval
↓
Deploy to Production Workspace
CI/CD does not replace detection expertise.
It protects detection expertise from unsafe deployment practices.
16. Workspace Promotion Strategy
Sentinel content should move through controlled environments.
Recommended pattern:
Development Workspace
↓
Testing Workspace
↓
Production Workspace
Each environment has a purpose.
| Environment | Purpose |
|---|---|
| Development | Build and experiment with KQL logic |
| Testing | Validate rule behavior and false positives |
| Production | Generate operational incidents for SOC response |
This prevents untested detection logic from creating noisy production incidents.
17. Parameterization
Different workspaces often require different values.
Examples include:
- Workspace ID
- Resource group
- Subscription ID
- Rule enabled state
- Severity overrides
- Query frequency
- Suppression settings
- Watchlist names
- Logic App resource IDs
- Environment-specific thresholds
Parameter files allow the same detection logic to be deployed across environments without hardcoding values.
Example:
```json
{
  "workspaceName": {
    "value": "sentinel-prod"
  },
  "ruleEnabled": {
    "value": true
  },
  "severity": {
    "value": "High"
  },
  "queryFrequency": {
    "value": "PT15M"
  }
}
```
Parameterization reduces duplication and improves maintainability.
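ARM-style parameter files wrap each value in a `{"value": ...}` object, as in the example above. A small helper (illustrative) can flatten one into a plain dictionary so tooling can feed the same rule template environment-specific values:

```python
import json

def load_parameters(text: str) -> dict:
    """Flatten an ARM-style parameter file into {name: value} pairs."""
    raw = json.loads(text)
    return {name: entry["value"] for name, entry in raw.items()}

prod = load_parameters("""
{
  "workspaceName":  {"value": "sentinel-prod"},
  "ruleEnabled":    {"value": true},
  "severity":       {"value": "High"},
  "queryFrequency": {"value": "PT15M"}
}
""")
print(prod["workspaceName"], prod["queryFrequency"])  # sentinel-prod PT15M
```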
18. Import and Export Strategy
Import and export features can help move Sentinel analytics rules and automation rules between environments.
However, export should not become the long-term operating model.
Export is useful for:
- Capturing existing portal-created rules
- Migrating content into source control
- Creating a baseline
- Recovering rule definitions
- Converting manual content into managed content
After export, the SOC should clean, normalize, document, and store the content in the repository.
The repository should then become the long-term source of truth.
19. Avoiding Portal Drift
Portal drift happens when someone edits Sentinel content directly in the portal while the repository contains a different version.
This creates confusion.
Symptoms of portal drift include:
- Rule behavior differs from repository definition
- Production rule has unreviewed changes
- KQL differs across workspaces
- Automation is disconnected from code
- Rollback overwrites unknown changes
- Analysts cannot explain why a rule changed
To reduce portal drift:
- Treat the repository as authoritative
- Limit direct portal edits
- Export emergency changes back into source control
- Review deployment logs
- Tag repository-managed content
- Document change ownership
- Use pull requests for rule updates
Portal edits may still happen during emergencies.
But they should not become the normal operating model.
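Drift can also be detected mechanically: compare a normalized hash of the KQL stored in the repository with the KQL currently deployed. The sketch below is illustrative; in practice the deployed text would be fetched through the Sentinel API rather than hardcoded. Normalizing whitespace first avoids false drift alarms from formatting-only differences.

```python
import hashlib

def kql_fingerprint(query: str) -> str:
    """Hash a KQL query after collapsing whitespace, so that formatting
    differences do not register as drift but logic changes do."""
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

repo_kql = """
AuditLogs
| where OperationName has "Add member to role"
"""
deployed_kql = 'AuditLogs | where OperationName has "Add member to role"'
drifted_kql = 'AuditLogs | where OperationName has "Add member"'

print(kql_fingerprint(repo_kql) == kql_fingerprint(deployed_kql))  # True: no drift
print(kql_fingerprint(repo_kql) == kql_fingerprint(drifted_kql))   # False: drift
```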
20. Automation Rules as Code
Automation rules should also be managed as code.
Automation rules can control:
- Incident assignment
- Incident tagging
- Severity adjustment
- Playbook execution
- Incident suppression
- Routing logic
- Status updates
- SOC workflow triggers
If automation rules are not versioned, response behavior can change without review.
That is risky.
A detection may be well engineered, but a poorly governed automation rule can still route, suppress, or escalate incidents incorrectly.
Detection-as-Code must include response logic.
21. Playbooks as Code
Microsoft Sentinel playbooks are commonly based on Azure Logic Apps.
They should be treated as response code.
Playbooks may perform actions such as:
- Enrich IP addresses
- Enrich user identity
- Pull device risk
- Create ITSM tickets
- Notify SOC channels
- Disable users
- Isolate endpoints
- Block indicators
- Collect evidence
- Request approval
- Update incidents
Because playbooks can take operational or containment actions, they need strong governance.
Playbook changes should be reviewed, tested, and deployed through controlled processes.
22. Hunting Queries as Code
Hunting queries are often treated informally.
That is a mistake.
Hunting queries represent reusable investigative logic.
They should be stored, reviewed, tagged, and maintained.
A hunting query should include:
- Hunt name
- Description
- KQL logic
- Required tables
- ATT&CK mapping
- Expected output
- Frequency
- Owner
- Last reviewed date
- Promotion status
Some hunting queries should remain exploratory.
Others should eventually become analytics rules.
Detection-as-Code helps manage that lifecycle.
23. Parser Management
Parsers are critical for reusable detection logic.
A parser can normalize source-specific fields and hide complexity from analysts.
For example, instead of writing different queries for every firewall vendor, the SOC can query a normalized parser function.
Parser changes should be managed carefully.
A parser update can affect:
- Analytics rules
- Hunting queries
- Workbooks
- Dashboards
- Analyst workflows
- Automation logic
That makes parser versioning important.
A parser is not just a helper query.
It is shared detection infrastructure.
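Before changing a parser, the blast radius can be estimated by searching stored queries for references to the parser function. The parser name and query inventory below are illustrative:

```python
import re

def queries_using_parser(parser_name: str, queries: dict[str, str]) -> list[str]:
    """Return the names of stored queries that reference a parser function."""
    pattern = re.compile(rf"\b{re.escape(parser_name)}\b")
    return sorted(name for name, kql in queries.items() if pattern.search(kql))

queries = {
    "fw-denied-traffic": "FirewallNormalized | where Action == 'Deny'",
    "fw-top-talkers":    "FirewallNormalized | summarize count() by SrcIp",
    "signin-failures":   "SigninLogs | where ResultType != 0",
}
print(queries_using_parser("FirewallNormalized", queries))
# ['fw-denied-traffic', 'fw-top-talkers']
```

Running an impact report like this in the parser's pull request tells reviewers exactly which detections need re-validation after the change.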
24. Workbooks as Code
Workbooks provide visibility into SOC operations.
They can show:
- Detection coverage
- Rule health
- Connector health
- Incident trends
- MITRE ATT&CK mapping
- False positive patterns
- Rule deployment status
- Analyst workload
- Data ingestion patterns
- Hunting outcomes
If workbooks are manually built and not versioned, dashboards can drift across environments.
Managing workbooks as code helps maintain consistent SOC visibility.
25. Sentinel Solutions and Content Packages
Sentinel solutions and packaged content can accelerate deployment.
However, deployed content should still be reviewed.
A SOC should ask:
- Which rules are enabled?
- Which tables are required?
- Which connectors are needed?
- Which detections overlap with existing rules?
- Which rules are noisy?
- Which rules require tuning?
- Which rules map to priority threats?
- Which automation is attached?
- Which content should be customized?
- Which content should be disabled?
Content deployment is not the same as detection maturity.
The SOC must still engineer, tune, and validate the content.
26. Testing Sentinel Detections
A detection that has not been tested is an assumption.
Testing should validate:
- Required telemetry appears
- KQL matches expected behavior
- Rule schedule works
- Lookback window is correct
- Entity mapping works
- Custom details are useful
- Incident grouping behaves correctly
- Severity is appropriate
- Automation triggers correctly
- False positives are understood
- Analysts can investigate the output
Testing methods may include:
- Lab simulations
- Atomic Red Team
- Purple team activity
- Red team exercises
- Historical log replay
- Controlled cloud activity
- Synthetic events
- Manual KQL validation
The test result should be documented in the repository.
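The matching logic of a detection can also be mirrored as a plain predicate and exercised with synthetic events before the rule reaches any workspace. The sketch below simplifies the KQL from section 10 (exact membership checks instead of `has_any`) and is a complement to, not a substitute for, validating the real rule against live telemetry.

```python
# Illustrative predicate mirroring the privileged-role-assignment detection.
PRIVILEGED_ROLES = {
    "Global Administrator",
    "Privileged Role Administrator",
    "Security Administrator",
}
TRACKED_OPERATIONS = {"Add member to role", "Add eligible member to role"}

def rule_matches(event: dict) -> bool:
    """True when a synthetic audit event should fire the detection."""
    return (
        event.get("OperationName") in TRACKED_OPERATIONS
        and event.get("RoleName") in PRIVILEGED_ROLES
    )

synthetic_hit = {
    "OperationName": "Add member to role",
    "RoleName": "Global Administrator",
    "TargetUser": "analyst@contoso.com",  # hypothetical test identity
}
synthetic_benign = {
    "OperationName": "Add member to role",
    "RoleName": "Directory Readers",
}
print(rule_matches(synthetic_hit), rule_matches(synthetic_benign))  # True False
```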
27. Detection Lifecycle States
Every detection should have a lifecycle state.
| State | Meaning |
|---|---|
| Draft | Initial idea or incomplete logic |
| Lab Testing | KQL is being validated |
| Pilot | Rule is deployed in limited mode |
| Production | Rule creates operational incidents |
| Tuning | Rule is active but being refined |
| Deprecated | Rule is replaced or outdated |
| Retired | Rule is removed from active use |
Lifecycle states help prevent abandoned or untested rules from remaining active forever.
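The states above imply a set of legal transitions that a promotion workflow can enforce. The transition map below is an illustrative policy, not a Sentinel feature; adapt it to your own lifecycle rules.

```python
# Lifecycle states from the table above, with the transitions a
# promotion workflow might allow. Draft cannot jump straight to
# Production; Retired is terminal.
ALLOWED_TRANSITIONS = {
    "Draft":       {"Lab Testing", "Retired"},
    "Lab Testing": {"Pilot", "Draft", "Retired"},
    "Pilot":       {"Production", "Lab Testing", "Retired"},
    "Production":  {"Tuning", "Deprecated"},
    "Tuning":      {"Production", "Deprecated"},
    "Deprecated":  {"Retired"},
    "Retired":     set(),
}

def can_transition(current: str, target: str) -> bool:
    """True when the lifecycle policy allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())

print(can_transition("Pilot", "Production"))  # True: normal promotion
print(can_transition("Draft", "Production"))  # False: skips testing
```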
28. Rollback Strategy
Rollback is one of the biggest benefits of Detection-as-Code.
If a rule causes excessive false positives or breaks due to schema changes, the SOC should be able to revert quickly.
A rollback plan should define:
- Previous working version
- Trigger for rollback
- Approval process
- Deployment method
- Communication channel
- Post-rollback validation
- Incident cleanup process
Rollback should not require guessing what changed.
The repository should show the change history clearly.
29. Detection Quality Metrics
SentinelCraft should be measured.
Useful metrics include:
- Number of rules managed as code
- Percentage of rules with owners
- Percentage of rules with MITRE mapping
- Percentage of rules with entity mapping
- Percentage of rules with custom alert details
- Number of rules validated in last 90 days
- Number of rules deployed through CI/CD
- Number of direct portal changes
- Failed deployments
- Rollbacks
- False positive rate
- Mean time to detect
- Mean time to triage
- Mean time to respond
- Coverage by tactic
- Coverage by data source
- Coverage by business risk
The goal is not to count rules.
The goal is to measure reliable detection capability.
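Several of the percentage metrics above fall out directly from the rule inventory. The inventory shape below is hypothetical; in practice it would be built from the repository's metadata files.

```python
rules = [
    {"name": "[Identity][T1078] Priv Role Assignment",
     "owner": "SOC Detection Engineering", "mitre": "T1078", "as_code": True},
    {"name": "[Cloud][T1098] Consent Grant",
     "owner": None, "mitre": "T1098", "as_code": True},
    {"name": "Legacy portal rule",
     "owner": None, "mitre": None, "as_code": False},
]

def pct(rules: list[dict], predicate) -> float:
    """Percentage of rules satisfying a predicate, to one decimal place."""
    return round(100 * sum(predicate(r) for r in rules) / len(rules), 1)

print(pct(rules, lambda r: r["as_code"]))      # rules managed as code
print(pct(rules, lambda r: bool(r["owner"])))  # rules with owners
print(pct(rules, lambda r: bool(r["mitre"])))  # rules with MITRE mapping
```

Trending these percentages over time shows whether the SOC is actually converging on a fully code-managed, fully owned detection estate.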
30. Common Mistakes in Sentinel Detection-as-Code
SOC teams should avoid these mistakes:
- Treating Detection-as-Code as only template deployment
- Storing rules in Git without review discipline
- Not testing KQL before production deployment
- Ignoring entity mapping
- Ignoring custom alert details
- Deploying rules without owners
- Using inconsistent naming
- Hardcoding workspace-specific values
- Allowing direct portal edits to become normal
- Not versioning parsers
- Not managing playbooks as code
- Not documenting rollback procedures
- Deploying content without tuning
- Treating vendor templates as production-ready by default
Detection-as-Code is not just a repository.
It is an operating model.
31. Practical Implementation Roadmap
A SOC can adopt SentinelCraft in phases.
Phase 1: Inventory
Collect current Sentinel content:
- Analytics rules
- Automation rules
- Hunting queries
- Parsers
- Playbooks
- Workbooks
- Watchlists
Phase 2: Export and Baseline
Export existing content where possible and create a repository baseline.
Phase 3: Standardize
Define standards for:
- Naming
- Folder structure
- Metadata
- ATT&CK mapping
- Entity mapping
- Severity logic
- Pull request review
- Deployment approval
Phase 4: Convert to Code
Convert high-value Sentinel content into Bicep or ARM templates.
Phase 5: Build CI/CD
Create GitHub Actions or Azure DevOps pipelines for validation and deployment.
Phase 6: Add Environments
Introduce dev, test, and production workspace promotion.
Phase 7: Enforce Review
Require pull request review for production detection changes.
Phase 8: Measure and Improve
Track detection quality, deployment reliability, false positives, and coverage improvement.
32. R.A.H.S.I. Framework™ Analysis
From the R.A.H.S.I. Framework™ perspective, SentinelCraft represents a shift in SOC maturity.
A basic SOC asks:
Did we create the rule?
A mature SOC asks:
Can we version it, test it, deploy it, explain it, roll it back, and prove its detection value?
That is the difference between Sentinel administration and Sentinel engineering.
SentinelCraft turns Microsoft Sentinel into a managed SOC control plane where detection logic is not random, manual, or fragile.
It becomes:
- Source-controlled
- Reviewed
- Tested
- Parameterized
- Deployable
- Traceable
- Reversible
- Measurable
- Aligned to adversary behavior
- Connected to response
The future of Sentinel maturity is not more portal configuration.
It is engineered detection delivery.
33. Key Design Principles
1. Treat detections as production code
Detection logic should be versioned, reviewed, tested, and deployed through controlled workflows.
2. Make the repository the source of truth
The Sentinel portal should reflect deployed content, not become the primary place where production logic is edited.
3. Manage full SOC content as code
Analytics rules, automation rules, hunting queries, parsers, playbooks, and workbooks should be governed together.
4. Validate before deployment
KQL, templates, parameters, and metadata should be checked before production deployment.
5. Promote through environments
Development, testing, and production workspaces should support safe release of detection content.
6. Preserve rollback capability
Every production detection change should be reversible.
7. Measure detection value
A rule is valuable when it improves detection, investigation, response, or coverage.
8. Reduce portal drift
Direct portal changes should be controlled, documented, and synchronized back into the repository.
SentinelCraft is the discipline of managing Microsoft Sentinel detections and SOC content as code.
It transforms Sentinel from a portal-managed SIEM into an engineered detection platform.
In this model:
- KQL becomes detection logic.
- Analytics rules become versioned assets.
- Automation rules become governed response logic.
- Hunting queries become reusable research assets.
- Parsers become shared detection infrastructure.
- Playbooks become response code.
- Workbooks become versioned SOC visibility.
- CI/CD becomes the delivery engine.
- The repository becomes the source of truth.
The strongest SOCs will not be the ones with the most manually created rules.
They will be the ones with the most reliable, reviewed, tested, deployable, and measurable detection content.
Detection-as-Code is now a SOC engineering discipline.
aakashrahsi.online