Maya Kaczorowski documented Oblique's $0 security stack for code, email, logs, and devices. This is the companion piece: the $0 stack for cloud infrastructure — intent verification, compound risk detection, and formal safety proofs for AWS configurations, with nine independent reasoning engines.
Inspired by a $0 stack
Maya Kaczorowski recently wrote about Oblique's $0 security stack — world-class security tooling at zero cost. Semgrep for code analysis, TruffleHog for secret scanning, RunReveal for SIEM, Sublime for email, Apple Business Manager for device management. All free or free-tier. All solving real problems. Her point is important: the excuse that security costs too much no longer holds.
Her article covers application security, email security, log aggregation, and device management. This article covers a different domain: cloud infrastructure security — verifying whether your AWS resources are configured safely, not just correctly.
The distinction matters. A configuration can be correct by every checklist and still be unsafe. Three individually-correct settings — an unauthenticated identity pool, a scoped IAM role, and a private PHI-tagged bucket — can compose into a path that lets anonymous users reach patient data. No individual check catches it because the vulnerability exists in the composition, not in any single setting. This stack solves that problem.
The infrastructure security pipeline
Commercial cloud security posture management — Wiz, Orca, Prisma Cloud — starts at five figures annually. The open-source alternative costs nothing, and it does something commercial tools structurally cannot: verify that your configurations don't contradict your own declared intent, with mathematical proofs from engines built to verify flight software.
Input           Evaluation            Reasoning                Downstream
─────           ──────────            ─────────                ──────────
Steampipe   →   Stave             →   9 External Engines   →   Neo4j GDS
(cloud SQL)     (CEL predicates       Z3 / cvc5 / Yices        SIEM
                + compound chains)    Soufflé / Clingo         SARIF
                                      Prolog / PySAT           Evidence bundles
                                      Risk / Game Theory
Each layer is independent. Replace any piece without touching the others.
Steampipe is the input layer. It queries your cloud APIs like a database — AWS, Azure, and GCP resources become SQL tables. Steampipe produces the inventory. It doesn't evaluate it.
Stave is the evaluation layer. It reads observation snapshots and evaluates them against 2,650+ controls across 74 AWS service domains. CEL predicates detect individual misconfigurations. Compound chains compose multiple findings into named attack paths. The evaluation is deterministic — same snapshot, same findings, every time.
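As a rough illustration of that split between per-resource predicates and compound chains, here is a minimal Python sketch. The control IDs, snapshot fields, and chain definition are invented for the example; Stave's real catalog and CEL syntax differ.

```python
# Minimal sketch of predicate evaluation plus compound chaining.
# Control IDs and snapshot fields are illustrative, not Stave's real catalog.
snapshot = {
    "identity_pool": {"allow_unauthenticated": True},
    "mapped_role": {"actions": ["s3:GetObject"], "resource_tag": "phi"},
    "agent": {"guardrail": None},
}

# Each "control" is a pure predicate over the snapshot: same snapshot in,
# same findings out, every time.
controls = {
    "CTL.COGNITO.UNAUTH": lambda s: s["identity_pool"]["allow_unauthenticated"],
    "CTL.IAM.PHI.READ":   lambda s: "s3:GetObject" in s["mapped_role"]["actions"]
                                    and s["mapped_role"]["resource_tag"] == "phi",
    "CTL.AGENT.NO.GUARD": lambda s: s["agent"]["guardrail"] is None,
}
findings = {cid for cid, pred in controls.items() if pred(snapshot)}

# A compound chain fires only when all of its member findings co-occur.
chain = {"CTL.COGNITO.UNAUTH", "CTL.IAM.PHI.READ", "CTL.AGENT.NO.GUARD"}
fired = chain <= findings  # all three present -> named attack path
```

Each predicate alone is an unremarkable finding; the chain firing on their co-occurrence is what turns three clean settings into one named attack path.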
Nine reasoning engines consume Stave's fact export independently. Z3 proves whether forbidden states are mathematically reachable. Soufflé enumerates blast radius and reachability paths. Clingo fires declarative violation rules. Each engine adds a reasoning dimension that CEL predicates structurally cannot express — quantification, graph traversal, satisfiability. The engines are external consumers, not internal components. Stave exports facts as JSONL triples or SMT-LIB assertions. The engines read them.
Downstream systems consume what Stave and the engines produce. Neo4j Community Edition provides graph analysis — centrality, shortest paths, choke points. SARIF output feeds IDEs. JSONL output feeds SIEMs. Signed evidence bundles feed compliance audits.
What scanners check vs. what Stave verifies
A scanner asks: "Is this setting correct?" It checks attributes on individual resources — encryption enabled, public access blocked, MFA enforced. These are necessary checks. They verify that individual nodes meet a baseline.
Stave asks a different question: "Do your configurations contradict what you declared?"
When you tag a bucket data_classification: phi, you're declaring intent: this contains patient records. When a Bedrock knowledge base indexes that bucket and a customer-facing agent serves the results without a guardrail, your infrastructure contradicts your declaration. Three individually-correct configurations compose into a violation of your own stated intent.
A scanner checks perhaps ten attributes per resource. With five layers of AWS security (IAM, SCPs, resource policies, VPC endpoints, trust relationships) a single bucket can exist in over 200 possible effective-access states, and the vast majority stay unexamined. Stave collapses that state space through invariants: define which states are forbidden, prove they're unreachable.
The findings reflect this difference:
A scanner says: "S3 bucket is publicly accessible. Severity: High."
Stave says:
CHAIN: bedrock_rag_phi_exposure (CRITICAL)
CTL.COGNITO.UNAUTH.ACCESS.001
Identity pool allows unauthenticated access
CTL.IAM.ROLE.MAPPED.BROAD.001
Mapped role has s3:GetObject on PHI bucket
CTL.BEDROCK.AGENT.GUARDRAIL.001
Agent has no content-filtering guardrail
Compound: anonymous internet user can reach
patient health records through the RAG pipeline.
Three settings, three clean individual checks,
one CRITICAL attack path.
The scanner found zero issues. Stave found the composition that makes the system unsafe.
What 2,650 controls cover
The catalog spans 74 AWS service domains. The controls that matter most are the families that detect structural patterns no individual check can see.
32 AI agent identity controls cover Bedrock agents, SageMaker pipelines, and Lambda tool chains. Agent execution role overprivilege, missing guardrails, ghost action groups referencing deleted Lambda functions, shadow agents created outside IaC, knowledge base data boundary violations. Five compound chains compose these into attack paths: agent-to-PHI exposure through RAG pipelines, cross-account training data access, shadow agent credential theft.
Shadow admin detection catches IAM roles that accumulated permissions beyond their declared scope. A role named S3-ReadOnly that can retrieve secrets, invoke Lambda functions, and enumerate the full IAM inventory. Five controls check permission drift (unused service ratio via Access Advisor), category mixing (data access combined with IAM write), and intent mismatch (permissions contradict the declared role-type tag). Two compound chains fire when the pattern is systemic.
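A toy version of two of those signals, category mixing and intent mismatch, can be sketched as follows. The action-to-category map, the role record, and the role-type values are invented for the example; Stave's real taxonomy is richer.

```python
# Sketch of shadow-admin signals: category mixing + intent mismatch.
# The category map and role data are illustrative assumptions.
ACTION_CATEGORY = {
    "s3:GetObject": "data_access",
    "secretsmanager:GetSecretValue": "data_access",
    "lambda:InvokeFunction": "compute",
    "iam:PutRolePolicy": "iam_write",
}
role = {
    "name": "S3-ReadOnly",
    "declared_type": "read_only",  # intent, from the role-type tag
    "actions": ["s3:GetObject", "secretsmanager:GetSecretValue",
                "lambda:InvokeFunction", "iam:PutRolePolicy"],
}

categories = {ACTION_CATEGORY[a] for a in role["actions"]}
# Signal 1: data access combined with IAM write is an incompatible mix.
category_mixing = {"data_access", "iam_write"} <= categories
# Signal 2: effective permissions contradict the declared role type.
intent_mismatch = (role["declared_type"] == "read_only"
                   and "iam_write" in categories)

shadow_admin = category_mixing and intent_mismatch
```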
Vendor delegation governance checks whether vendors with access to your S3 buckets have exceeded their declared scope, whether their access review is overdue, whether they can make your bucket public, and whether you can revoke their access. Five controls, one compound chain at threshold 3-of-5 — a single overdue review is a reminder, three concurrent failures is a systemic governance breakdown.
23 ghost reference controls detect policies that reference resources absent from the snapshot — dangling IAM trust policies pointing at deleted accounts, Cognito triggers referencing deleted Lambda functions, S3 policies granting access to buckets that no longer exist. An attacker who recreates the deleted resource under the same name inherits every permission the dangling policy still grants.
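The detection itself is a set difference between what policies reference and what the snapshot actually contains. A sketch with made-up ARNs:

```python
# Sketch of ghost-reference detection: policies naming resources that are
# absent from the snapshot. ARNs and policy names are illustrative.
snapshot_arns = {
    "arn:aws:lambda:us-east-1:111122223333:function:live-handler",
    "arn:aws:s3:::active-bucket",
}
policy_refs = {
    "cognito-trigger": "arn:aws:lambda:us-east-1:111122223333:function:deleted-handler",
    "bucket-grant": "arn:aws:s3:::active-bucket",
}

# Any reference with no matching resource in the snapshot is a ghost:
# recreating the deleted resource under the same name inherits the grant.
ghosts = {name for name, arn in policy_refs.items() if arn not in snapshot_arns}
```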
47 credential TTL controls across IAM rotation, token expiry, certificate lifecycle, Secrets Manager rotation, and KMS key management. The Time-Bound Credential Invariant checks not just that a TTL is declared, but that the TTL hasn't elapsed — the difference between "rotation is configured" and "rotation actually happened."
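The elapsed-TTL distinction can be shown in a few lines. The field names and the 90-day window are assumptions for the example, not Stave's actual schema.

```python
from datetime import datetime, timedelta, timezone

# Sketch of the Time-Bound Credential Invariant: a declared TTL is not
# enough; the check also verifies the TTL has not already elapsed.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
key = {
    "rotation_enabled": True,  # "rotation is configured"
    "last_rotated": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "max_age": timedelta(days=90),
}

configured = key["rotation_enabled"]
# "rotation actually happened": elapsed age must be within the declared TTL.
rotated_in_time = now - key["last_rotated"] <= key["max_age"]
violation = configured and not rotated_in_time  # configured, but overdue
```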
Temporal analysis treats time as a built-in dimension. Drift detection compares snapshots and flags configuration changes. Duration tracking measures how long a misconfiguration has persisted — the same public bucket at 6 hours and 6 months tells different stories. Observation freshness detects stale snapshots — findings based on data that's 90 days old need different urgency than findings based on data from today.
Nine engines, one fact export
Stave evaluates controls with CEL predicates — the primary detection mechanism. But CEL has expressivity limits: it can't quantify over lists, traverse reachability graphs, or prove satisfiability of combined assertions. The nine external engines fill those gaps.
The fact export is the interface. Stave projects observation properties into JSONL triples (subject-predicate-object) and SMT-LIB assertions. 44 scalar predicates and 6 per-element array projectors cover AI agents, VPC peering, EC2 instance profiles, IAM role drift, and S3 delegation. Every engine reads the same facts and produces independent analysis.
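A sketch of what that projection produces, with illustrative field and predicate names (the scalar predicate `unused_service_count` is invented for the example):

```python
import json

# Sketch of projecting observation properties into subject-predicate-object
# triples, one JSON object per line (JSONL).
observation = {
    "arn": "arn:aws:iam::111122223333:role/S3-ReadOnly",
    "unused_services": ["lambda", "secretsmanager"],
}

triples = []
# Scalar predicate: one triple per observation.
triples.append({"s": observation["arn"], "p": "unused_service_count",
                "o": len(observation["unused_services"])})
# Per-element array projector: one triple per list element, so downstream
# engines can name specific services rather than just booleans.
for svc in observation["unused_services"]:
    triples.append({"s": observation["arn"], "p": "has_unused_service", "o": svc})

jsonl = "\n".join(json.dumps(t) for t in triples)
```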
Z3 / cvc5 / Yices (SMT solvers) prove whether forbidden states are reachable. "Can an anonymous user reach PHI data through any combination of identity pool, role mapping, and bucket policy?" The answer is sat (reachable, therefore unsafe) or unsat (mathematically impossible, therefore proven safe). Z3 comes from Microsoft Research; cvc5 and Yices are its academic counterparts. This is the solver family used to verify safety-critical software and CPU designs, and Microsoft has applied the same approach to verifying network firewall rules in Azure. The proof is deterministic and independently reproducible.
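Real queries run through the solvers over SMT-LIB assertions, but the shape of the sat/unsat question can be illustrated with a brute-force toy in pure Python: enumerate assignments, keep the ones consistent with the current configuration, and report sat if any lands in the forbidden state. The variable names and constraints are invented for the example.

```python
from itertools import product

def forbidden(unauth_pool, role_reads_phi, guardrail_present):
    # Forbidden state: anonymous principal reaches PHI with no guardrail.
    return unauth_pool and role_reads_phi and not guardrail_present

def satisfies_config(unauth_pool, role_reads_phi, guardrail_present):
    # Constraints the current configuration pins down (illustrative):
    # unauthenticated access is on, and no guardrail is attached.
    return unauth_pool is True and guardrail_present is False

# sat = some assignment meets the configuration AND is forbidden.
witnesses = [a for a in product([True, False], repeat=3)
             if satisfies_config(*a) and forbidden(*a)]
result = "sat" if witnesses else "unsat"
```

An SMT solver does the same check symbolically instead of by enumeration, which is why it scales past toy state spaces and returns a concrete witness assignment when the answer is sat.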
Soufflé (Datalog) enumerates reachability paths and counts blast radius. "How many resources can this compromised role reach?" "Which vendor principals have excessive delegation reach on this bucket?" Per-element facts — has_unused_service(role, "lambda"), has_delegated_principal(bucket, principal_arn) — enable queries that name specific services and specific principals, not just booleans.
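The Datalog rules behind such queries are essentially transitive closure. A Python stand-in with invented edge facts (the recursive rule is shown in Datalog syntax in the comments):

```python
# Sketch of Soufflé-style reachability as transitive closure.
# Edge facts (who can reach what) are illustrative.
edges = {
    ("role/ops", "lambda/deploy"),
    ("lambda/deploy", "s3/artifacts"),
    ("s3/artifacts", "s3/phi-bucket"),
    ("role/audit", "s3/logs"),
}

# reach(X, Y) :- edge(X, Y).
# reach(X, Z) :- reach(X, Y), edge(Y, Z).
reach = set(edges)
changed = True
while changed:  # iterate to a fixed point
    changed = False
    for (x, y) in list(reach):
        for (y2, z) in edges:
            if y == y2 and (x, z) not in reach:
                reach.add((x, z))
                changed = True

# Blast radius of a compromised role = resources it can transitively reach.
blast_radius = {dst for src, dst in reach if src == "role/ops"}
```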
Clingo (Answer Set Programming) fires declarative violation rules. Shipped rules cover AI agent patterns (broad Lambda + no guardrail), delegation violations (unknown principal, scope exceeded, irrevocable access), shadow admin signals (incompatible categories + unused services), VPC peering exposure, and shadow EC2 lateral movement. Every remediated fixture produces zero violations — verified end-to-end.
Prolog derives proof trees showing step-by-step reasoning: attacker → shadow account → trusted role → production data. PySAT checks boolean satisfiability on multi-control compounds. Risk model computes exploitation probability. Game theory quantifies attacker cost vs. defender remediation ROI. TLA+ checks temporal safety — how many configuration changes separate the current state from an unsafe state.
Each engine finds a different class of issue on the same fact set. That's breadth without tool sprawl — nine reasoning dimensions from one data export.
Where this fits in a security program
A security program has multiple layers. Each addresses a different domain. Each can be $0 independently:
Layer                     Domain                  $0 tools
─────                     ──────                  ────────
Application security      Code + dependencies     Semgrep, TruffleHog
Infrastructure security   Cloud configuration     Steampipe, Stave, Neo4j CE
Detection & response      Logs + runtime events   RunReveal
Email security            Phishing + BEC          Sublime Security
Device management         Endpoints               Apple Business Manager
The infrastructure layer is the one most startups skip because commercial CSPM starts at five figures. The pipeline above is $0 and covers 2,650+ controls with formal verification from engines designed for safety-critical systems.
Setting it up
Step 1: Install Steampipe and the AWS plugin.
brew install turbot/tap/steampipe
steampipe plugin install aws
Steampipe reads your AWS credentials from the standard locations. No additional configuration needed.
Step 2: Verify your inventory.
select name, region, bucket_policy_is_public
from aws_s3_bucket
where bucket_policy_is_public = true;
If this returns results, you have public S3 buckets. Steampipe makes cloud inventory queryable in seconds.
Step 3: Try the demo (30 seconds, no AWS account needed).
git clone https://github.com/sufield/stave.git
cd stave
bash examples/demo-ai-security/run.sh
The demo shows the full pipeline on built-in fixtures: 5 AI agent findings → 3 CRITICAL compound chains → remediation → clean. No cloud credentials required.
Step 4: Run against your own snapshots.
stave apply --observations ./my-snapshots
Stave evaluates every observation against 2,650+ controls. Compound chains fire when multiple findings compose into attack paths on the same resource.
Step 5: Export to reasoning engines.
# JSONL triples for Soufflé / Clingo
stave export-sir --format jsonl --observations ./my-snapshots > facts.jsonl
# SMT-LIB assertions for Z3 / cvc5
stave export-sir --format smt2 --observations ./my-snapshots > facts.smt2
Step 6: Track drift over time.
stave diff --before ./snapshot-march --after ./snapshot-april
Compares snapshots and flags configuration changes with before-and-after state.
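Conceptually the diff is a keyed comparison of two snapshots. A minimal sketch with invented resource records (`stave diff` operates on full observation snapshots):

```python
# Sketch of snapshot diffing for drift detection.
# Resource keys and attributes are illustrative.
before = {"bucket/app-logs": {"public": False, "encrypted": True}}
after  = {"bucket/app-logs": {"public": True,  "encrypted": True},
          "bucket/new-data": {"public": False, "encrypted": False}}

drift = {}
for arn in before.keys() | after.keys():
    old, new = before.get(arn), after.get(arn)
    if old != new:
        # Records new, deleted, and changed resources with both states.
        drift[arn] = {"before": old, "after": new}
```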
Step 7: Export graph for Neo4j.
stave graph export --format graphml --observations ./my-snapshots > graph.graphml
Load into Neo4j Community Edition for centrality analysis, shortest-path computation, and effective permission reasoning.
What you get for $0
The input layer (Steampipe)
- SQL queries over AWS, Azure, and GCP resources
- 150+ plugins covering the major cloud providers and a wide range of SaaS services
The evaluation layer (Stave)
- 2,650+ controls across 74 AWS service domains
- 30+ compound chains composing individual findings into named attack paths
- 32 AI agent identity controls (Bedrock, SageMaker, Lambda tool chains)
- Shadow admin detection (permission drift + category mixing + intent mismatch)
- Vendor delegation governance (scope, lifecycle, revocability, escalation)
- 23 ghost reference controls detecting dangling policies after resource deletion
- 47 credential TTL controls with elapsed-TTL verification
- Temporal analysis: drift detection, duration tracking, observation freshness
- Intent verification: findings fire when configurations contradict declared tags, role-type taxonomies, and vendor registries
The reasoning layer (9 engines)
- Z3 / cvc5 / Yices: mathematical proofs of safety or reachability
- Soufflé: blast radius enumeration and reachability path counting
- Clingo: declarative violation rules across 5 attack pattern families
- Prolog: step-by-step proof trees for attack path derivation
- PySAT / Risk / Game Theory / TLA+: satisfiability, probability, cost, temporal safety
The insights layer (downstream)
- Neo4j GDS: graph centrality, shortest paths, choke point identification
- SARIF: IDE integration for developer-visible findings in pull requests
- JSONL: SIEM ingestion for correlation with runtime events
- Evidence bundles: signed compliance archives
The commercial equivalent of this pipeline costs $25,000–$100,000+ annually. The open-source version costs nothing, includes formal verification from SMT solvers designed for flight software, and verifies intent — not just configuration — across 74 service domains.
Infrastructure security doesn't have to cost anything
The reason most startups skip cloud security posture management is that the tools cost more than their entire infrastructure spend. A startup paying $500/month for AWS can't justify $50,000/year for a CSPM tool. So they don't. Their S3 buckets stay public, their IAM roles accumulate permissions nobody reviews, their deleted resources leave ghost references nobody detects, and their AI agents serve PHI through RAG pipelines nobody verified.
The open-source pipeline removes this barrier. Steampipe + Stave + nine reasoning engines + Neo4j costs nothing and provides capabilities commercial CSPM tools structurally cannot: intent verification against operator declarations, compound risk detection across service boundaries, mathematical safety proofs from SMT solvers, and temporal analysis that tracks how configurations evolve over time.
Maya Kaczorowski showed that application security, email security, log aggregation, and device management can all be $0. This article shows the same is true for cloud infrastructure security — and that the $0 version doesn't just match the commercial tools. On compound risk detection, intent verification, and formal safety proofs, it goes beyond them.
Three tools. Nine engines. Zero dollars. A pipeline, not a product.
Stave is an open-source intent verification engine for cloud infrastructure with 2,650+ controls, compound risk detection, and nine independent reasoning engines. Steampipe is an open-source tool for querying cloud APIs via SQL. Together with Neo4j for graph insights, they form the $0 infrastructure security pipeline.