Z3 is one of the most powerful reasoning engines ever built. Microsoft Research created it to power software verification and program analysis. It can take your cloud configuration, model it as a set of logical assertions, and mathematically prove whether an attack path exists.
When it finds one, Z3 says:
sat
Three letters. No context. No explanation. No link to the configuration property that caused it. No fix. Just "sat" — which means "satisfiable," which means "the unsafe state you asked about is reachable," which means your cloud is vulnerable. Probably. If you encoded the question correctly. Which you can't verify from the output.
This is the gap between a powerful engine and a useful tool. The engine answers the question. The tool explains the answer. This article explains why the explanation layer matters more than the engine, and what it looks like in practice.
What Z3 outputs
Let's say you want to check whether an anonymous internet user can read PHI data from an S3 bucket through a Cognito identity pool. You model the configuration as SMT-LIB assertions, write a query, and run it through Z3.
The input Z3 sees:
(set-logic ALL)
(declare-fun allows_unauthenticated (String String) Bool)
(declare-fun maps_unauth_to (String String) Bool)
(declare-fun has_action (String String) Bool)
(declare-fun has_tag (String String) Bool)
(assert (allows_unauthenticated "pool-abc" "true"))
(assert (maps_unauth_to "pool-abc" "role/AppUnauthRole"))
(assert (has_action "role/AppUnauthRole" "s3:GetObject"))
(assert (has_tag "bucket-prod-phi" "data_classification:phi"))
(declare-const principal String)
(declare-const role String)
(declare-const action String)
(declare-const resource String)
(assert (allows_unauthenticated principal "true"))
(assert (maps_unauth_to principal role))
(assert (has_action role action))
(assert (has_tag resource "data_classification:phi"))
(check-sat)
(get-model)
The output Z3 produces:
sat
(
(define-fun principal () String "pool-abc")
(define-fun role () String "role/AppUnauthRole")
(define-fun action () String "s3:GetObject")
(define-fun resource () String "bucket-prod-phi")
)
In SMT-LIB, sat means the forbidden state is reachable. The model names the specific pool, role, action, and bucket. The proof is mathematically sound.
If you're a security engineer who just wants to know whether your Cognito configuration is safe, this output is useless. You have three questions:
- Is this result correct? Did the input assertions match your configuration, or did the translation introduce a bug?
- What does it mean? Which specific settings in which specific files create the vulnerability?
- What do I do about it? What's the fix, what does it cost, and how do I verify it worked?
Z3 answers none of these. It answered the math question. The security question is still open.
The two translation boundaries
Between your cloud configuration and Z3's answer, there are two translation steps. Each can introduce bugs. Each is invisible if you only look at the solver's output.
YOUR CLOUD CONFIG         THE MATH                  YOUR ANSWER

S3 bucket policy     →    SMT-LIB assertions    →   "sat"
IAM role policy           (was the translation      (was the translation
Cognito settings          correct?)                 back to cloud
                                                    terms correct?)

ENCODING BOUNDARY         Z3 SOLVER                 DECODING BOUNDARY
(can have bugs)           (trusted)                 (can have bugs)
Encoding bugs: Your S3 bucket has PublicAccessBlock.BlockPublicPolicy = true, but the translation emits has_public_read "bucket" "true". The assertion is wrong — the bucket is private, but Z3 thinks it's public. Z3 faithfully returns sat based on the wrong input. The "vulnerability" doesn't exist. You can't tell from sat that the encoding was wrong.
Decoding bugs: Z3 returns sat with a model. You read the model and conclude "the bucket is directly accessible from the internet." But the actual path is through the Cognito identity pool, not direct access. The model told you which variables were assigned, but the interpretation of those variables — what they mean in cloud terms — is your responsibility. Misread the model, misunderstand the vulnerability.
The solver is the most reliable component in the pipeline. The translation layers are where the bugs hide. And they're the layers that get the least attention because everyone focuses on the solver.
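Both boundaries can be made explicit in code. The sketch below is purely illustrative — every function and field name is a hypothetical stand-in, not any real tool's API. It shows an encoder that turns a config dict into assertion tuples and a decoder that maps a solver model back into cloud vocabulary; a bug in either function is exactly the kind of silent translation error described above.

```python
# Illustrative sketch of the two translation boundaries.
# All names here are hypothetical, not a real tool's API.

def encode(config):
    """ENCODING BOUNDARY: cloud config -> logical assertions.
    A bug here (e.g. reading the wrong property) makes Z3
    faithfully prove facts about a configuration you don't have."""
    assertions = []
    if config["identity"]["access"]["allow_unauthenticated"]:
        assertions.append(("allows_unauthenticated", config["pool_id"], "true"))
    return assertions

def decode(model):
    """DECODING BOUNDARY: solver model -> cloud terms.
    A bug here (misreading what a variable means) turns a
    correct proof into a wrong conclusion."""
    return (f"Principal {model['principal']} can perform "
            f"{model['action']} on {model['resource']}")

config = {"pool_id": "pool-abc",
          "identity": {"access": {"allow_unauthenticated": True}}}
print(encode(config))

# A model shaped like the one Z3 returned above:
model = {"principal": "pool-abc", "action": "s3:GetObject",
         "resource": "bucket-prod-phi"}
print(decode(model))
```

The point of writing both sides down is that each becomes testable on its own, independently of the solver between them.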
What an orchestration layer provides
An orchestration layer wraps the solver with five capabilities that make the answer trustworthy, traceable, and actionable:
1. Encoding explanation — "Did the tool understand my configuration?"
Before the solver runs, you see what the tool extracted from your configuration, in your language:
Configuration Summary: 7 assets, 43 facts extracted
Asset: arn:aws:s3:::prod-phi (S3 Bucket)
├── Public read access: ENABLED
│ Source: prod-phi-bucket.obs.json → access.public_read
│ Fact: a3f8c2e91b04
│
├── Encryption: AES256 (SSE-S3, not KMS)
│ Source: prod-phi-bucket.obs.json → encryption.algorithm
│ Fact: b7d1e4f03c89
│
└── Data classification: PHI
Source: prod-phi-bucket.obs.json → tags.data_classification
Fact: c9e2a1f04d77
Asset: arn:aws:cognito-identity:...:identitypool/abc (Identity Pool)
├── Unauthenticated access: ALLOWED
│ Source: cognito-pool.obs.json → identity.access.allow_unauthenticated
│ Fact: d4e5f6a7b8c9
│
└── Maps unauthenticated users to: arn:aws:iam::111122223333:role/AppUnauthRole
Source: cognito-pool.obs.json → identity.cognito.unauth_role_arn
Fact: e5f6a7b8c9d0
No SMT-LIB. No predicate names. "Public read access: ENABLED" — the security engineer compares this to their mental model of the bucket. If the bucket is private, the encoding is wrong and the engineer knows before the solver runs.
Every fact has a unique identifier and a traceable source showing which file and which property path produced it. This is the audit trail. When the solver's answer doesn't match expectations, the engineer traces the identifier back to the source and checks whether the encoding was correct.
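The article doesn't specify how these identifiers are computed, but one plausible scheme — an assumption for illustration, not the actual implementation — is a truncated hash over the fact's source file, property path, and value, which makes the identifier deterministic across runs:

```python
import hashlib

def fact_id(source_file: str, property_path: str, value: str) -> str:
    """Hypothetical deterministic fact identifier: the same source
    file, property path, and value always yield the same 12-hex-char
    id, so identifiers are stable across analysis runs."""
    payload = f"{source_file}|{property_path}|{value}".encode()
    return hashlib.sha256(payload).hexdigest()[:12]

fid = fact_id("cognito-pool.obs.json",
              "identity.access.allow_unauthenticated", "true")
print(fid)  # stable across runs for the same inputs
```

Determinism is what makes the audit trail usable: re-running the analysis on the same snapshot produces the same identifiers, so findings can be diffed between runs.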
2. Human verdict — "What does the result mean?"
After the solver runs, you see the answer in security language:
VERDICT: UNSAFE
An anonymous internet user can read PHI data from the prod-phi
bucket through the Cognito identity pool.
The forbidden state is reachable because:
1. Identity pool allows unauthenticated access
(cognito-pool.obs.json → identity.access.allow_unauthenticated = true)
2. Unauthenticated users receive credentials for AppUnauthRole
(cognito-pool.obs.json → identity.cognito.unauth_role_arn)
3. AppUnauthRole has s3:GetObject permission
(iam-role.obs.json → policies.attached_policies[0].Action)
4. Target bucket contains PHI
(prod-phi-bucket.obs.json → tags.data_classification = phi)
Not sat. Not a model with define-fun expressions. A four-step chain in plain English, each step linked to a specific file and property path. The security engineer reads it and knows exactly which settings create the vulnerability and where they live.
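Rendering such a verdict is plain string assembly over the fact chain. A minimal sketch — the chain structure here is assumed, mirroring the steps above, not a documented schema:

```python
def render_verdict(summary, chain):
    """Render an UNSAFE verdict: a plain-English summary plus a
    numbered causal chain, each step citing its file and property path."""
    lines = ["VERDICT: UNSAFE", "", summary, "",
             "The forbidden state is reachable because:"]
    for i, step in enumerate(chain, 1):
        lines.append(f"  {i}. {step['claim']}")
        lines.append(f"     ({step['source']} -> {step['property']})")
    return "\n".join(lines)

chain = [
    {"claim": "Identity pool allows unauthenticated access",
     "source": "cognito-pool.obs.json",
     "property": "identity.access.allow_unauthenticated = true"},
    {"claim": "AppUnauthRole has s3:GetObject permission",
     "source": "iam-role.obs.json",
     "property": "policies.attached_policies[0].Action"},
]
print(render_verdict(
    "An anonymous internet user can read PHI data from the prod-phi bucket.",
    chain))
```

The hard part is not the formatting — it's making sure each chain step is backed by a traceable fact, which is what the next two capabilities provide.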
3. Fix guidance — "What do I do about it?"
FIX: Disable unauthenticated access on the identity pool.
aws cognito-identity update-identity-pool \
  --identity-pool-id us-east-1:abc123 \
  --identity-pool-name <pool-name> \
  --no-allow-unauthenticated-identities
Cost: $0. Time: 30 seconds.
Effect: Breaks the chain at step 1.
VERIFICATION: After applying the fix, re-run the analysis.
Expected result: SAFE
(Z3 returns UNSAT — no assignment of principals, actions,
and resources can satisfy the attack path conditions.)
The fix is a shell command. The cost is quantified. The effect names the specific chain step that breaks. The verification tells the engineer what to expect and explains the solver's output in cloud terms.
4. Traceability — "Which property caused this?"
Every step in the verdict traces back to a specific property in a specific file through a unique identifier:
# The verdict says step 1 caused the chain.
# Trace identifier d4e5f6a7b8c9:
grep "d4e5f6a7b8c9" facts.jsonl
# Returns:
# fact_id: d4e5f6a7b8c9
# subject: pool-abc
# predicate: allows_unauthenticated = true
# source: cognito-pool.obs.json
# property: identity.access.allow_unauthenticated
# captured: 2026-05-01T00:00:00Z
One identifier. One grep. Full trace from the verdict to the configuration property, including when the snapshot was taken. No manual correlation across output files.
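The same lookup is trivial to script beyond grep. A minimal sketch, stdlib only — the record fields follow the example above, not a documented schema:

```python
import json

def trace(fact_id: str, facts_jsonl: str):
    """Return the first record in a JSON-lines facts file whose
    fact_id matches, or None if the identifier is unknown."""
    for line in facts_jsonl.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("fact_id") == fact_id:
            return record
    return None

facts = json.dumps({
    "fact_id": "d4e5f6a7b8c9",
    "subject": "pool-abc",
    "predicate": "allows_unauthenticated = true",
    "source": "cognito-pool.obs.json",
    "property": "identity.access.allow_unauthenticated",
})
hit = trace("d4e5f6a7b8c9", facts)
print(hit["source"], "->", hit["property"])
```

Because the fact file is line-oriented JSON, the same trace works from grep, jq, or three lines of any scripting language — no bespoke query tool required.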
5. Encoding verification — "Can I trust the translation?"
The orchestration layer verifies its own encoding by comparing each extracted fact against the raw configuration:
Encoding verification: 43/43 facts verified ✓
Every extracted fact matches the corresponding property
in the observation file. The solver's input is consistent
with the configuration snapshot.
Or, when there's a bug:
Encoding verification: 41/43 facts verified
MISMATCH:
Fact d4e5f6a7b8c9:
Extracted: allows_unauthenticated = "true"
Observation: identity.access.allow_unauthenticated = "false"
File: cognito-pool.obs.json
→ ENCODING BUG: fact says unauthenticated is allowed,
but the configuration says it's disabled.
The UNSAFE verdict may be incorrect.
Few verification tools check their own translation layer. The security engineer doesn't need to read SMT-LIB to trust the result — the tool demonstrates that its encoding is consistent with the observations, or reports exactly where it is wrong.
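Encoding verification is conceptually simple: resolve each fact's property path against the raw observation and compare values. A minimal sketch, with hypothetical fact and observation shapes based on the examples above:

```python
def resolve(obj, dotted_path):
    """Walk a dotted property path (e.g. 'identity.access.allow_unauthenticated')
    through nested dicts."""
    for key in dotted_path.split("."):
        obj = obj[key]
    return obj

def verify(facts, observations):
    """Compare each extracted fact against its source observation file.
    Returns (verified_count, mismatches)."""
    mismatches = []
    for fact in facts:
        observed = resolve(observations[fact["source"]], fact["property"])
        if str(observed).lower() != fact["value"].lower():
            mismatches.append((fact["fact_id"], fact["value"], observed))
    return len(facts) - len(mismatches), mismatches

# Reproduce the mismatch case from the report above:
facts = [{"fact_id": "d4e5f6a7b8c9",
          "source": "cognito-pool.obs.json",
          "property": "identity.access.allow_unauthenticated",
          "value": "true"}]
obs = {"cognito-pool.obs.json":
       {"identity": {"access": {"allow_unauthenticated": False}}}}
ok, bad = verify(facts, obs)
print(f"{ok}/{len(facts)} facts verified; mismatches: {bad}")
```

Running this before the solver is what turns "trust the translation" into "check the translation": any fact that doesn't round-trip back to its observation is flagged before it can poison the verdict.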
What you get without orchestration vs. with it
| Capability | Z3 alone | With orchestration |
|---|---|---|
| Prove an attack path exists | sat | UNSAFE: anonymous user can read PHI data |
| Explain which settings cause it | Read the SMT-LIB model | Four-step chain with file names and property paths |
| Verify the encoding is correct | Manually review assertions | Automated: 43/43 facts match observations |
| Trace the finding to a file | Grep through comments | One identifier, one grep, full trace |
| Get the fix | Derive it from the model | Shell command, cost, time, expected result |
| Run multiple solvers | Reformat per solver | Same input, three solvers, consensus |
| Explain to an auditor | Show them SMT-LIB | Show them the chain in English with evidence |
The right column is the product. The left column is a math library. Security engineers don't need a math library — they need trustworthy answers in their language.
Why this matters more for compound detection
The orchestration layer is most valuable when the analysis spans multiple services. A single-service check ("is this bucket public?") is simple enough to verify manually. A cross-service check ("can an anonymous user reach PHI data through a chain of Cognito → IAM → S3?") involves three services, three configuration files, and dozens of properties.
The encoding has more places to be wrong. The verdict has more steps to explain. The traceability has more links to follow.
This is where raw solver output becomes dangerous. If Z3 returns sat on a three-service chain and the encoding has a bug in the IAM layer, the finding is a false positive. The security team triages it as critical, burns two sprints investigating, and discovers it was caused by a property path error in the translation code. The encoding verification catches that error before the solver runs.
If Z3 returns unsat and the encoding has a bug — a property that should be true was encoded as false — the finding is a false negative. The team thinks they're safe. They're not. The encoding verification catches that too.
The more complex the analysis, the more valuable the translation layer. Single-service checks can tolerate raw solver output. Cross-service compound detection cannot.
The bottom line
Don't evaluate formal verification tools by which solver they use. Z3, cvc5, and Yices all produce correct answers to the questions they're asked. The question is whether the question was asked correctly and whether the answer is translated back to your language.
Don't accept sat as an answer. Accept "UNSAFE: an anonymous internet user can read PHI data from the prod-phi bucket through the Cognito identity pool, caused by these four settings in these three files, fixable with this one command for $0 in 30 seconds, verified by three independent solvers, encoding confirmed correct against 43 observation properties."
That's an answer. sat is a data point.
The orchestration layer described in this article is implemented in Stave, an open-source static analysis tool that evaluates cloud configurations via CEL predicates and exports standardized facts for consumption by nine independent reasoning engines. The encoding explanation, verdict translation, traceability, and encoding verification are built on Stave's fact_id provenance chain — every fact carries a deterministic identifier linking the solver's input to the specific observation file and property path that produced it. All analysis runs on air-gapped snapshots with no cloud credentials required.