Sentinel compliance agent

Posted on Jun 20

I spent 6 months building a runtime governance layer for AI agents: here's what survived testing

#agents #ai #security #showdev

Agents are moving from demos to touching money, infrastructure, and customer data.

Sentinel SCA is a runtime admissibility layer. Before an agent action executes, Sentinel evaluates the request and returns one of three verdicts:

ALLOW
REVIEW
DENY

Every decision is cryptographically signed and recorded in a tamper-evident audit ledger.

This post is the validation report: what I built, what I tested, what passed, and what remains unproven.

At the time of writing, Sentinel has been validated across:

1,400+ signed governance decisions
100% signature verification in validation runs
Replay attack testing under concurrency
Decision consistency testing
Reputation adaptation testing
Evidence bundle generation and SIEM export validation

I’m a solo founder. I spent six months building Sentinel and then trying to break it.

This is the part most AI project posts skip: the testing.

If you’re building agents that do anything consequential, I think the problem I ran into is about to become your problem too.

The Problem Isn’t Capability. It’s Admissibility.

Most agent-safety conversations focus on capabilities:

What can an agent do?

Capabilities matter, but they are only part of the story.

Capabilities are static.

Admissibility is contextual.

A capability only says an agent may attempt an action. It says nothing about whether that action should execute right now.

Consider:

Financial Example

Capability:

transfer_funds

Scenario A:

Transfer $40
Known internal destination
Agent with clean history

Probably admissible.

Scenario B:

Transfer $400,000
Unknown destination
Agent failed ten governance checks this morning

Probably not admissible.

Same capability.

Different admissibility.

Infrastructure Example

Capability:

restart_service

Scenario A:

Restart staging service
During maintenance window

Probably admissible.

Scenario B:

Restart production database cluster
During peak customer traffic

Probably not admissible.

Again:

Same capability.

Different admissibility.

The decision cannot live in the permission grant.

It has to be made at execution time, with context.

That is the gap Sentinel sits in.
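A minimal sketch of the distinction, using the two transfer scenarios above. The names `Context`, `is_capable`, and `admissible`, and every threshold, are hypothetical illustrations and not Sentinel's API or policy.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Context:
    amount_usd: float
    destination_known: bool
    recent_failed_checks: int


def is_capable(granted: Set[str], capability: str) -> bool:
    # Static check: was the capability granted at all?
    return capability in granted


def admissible(granted: Set[str], capability: str, ctx: Context) -> str:
    # Contextual check made at execution time. All thresholds are made up.
    if not is_capable(granted, capability):
        return "DENY"
    if ctx.recent_failed_checks >= 10:
        return "DENY"
    if ctx.amount_usd > 100_000 and not ctx.destination_known:
        return "DENY"
    if ctx.amount_usd > 1_000 or not ctx.destination_known:
        return "REVIEW"
    return "ALLOW"


granted = {"transfer_funds"}
scenario_a = Context(amount_usd=40, destination_known=True, recent_failed_checks=0)
scenario_b = Context(amount_usd=400_000, destination_known=False, recent_failed_checks=10)

verdict_a = admissible(granted, "transfer_funds", scenario_a)
verdict_b = admissible(granted, "transfer_funds", scenario_b)
```

Both calls pass the same capability check. Only the context differs, and only the context changes the verdict.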

What Is Sentinel SCA?

Sentinel is a runtime governance layer for autonomous systems.

It sits between an agent and an execution boundary.

For every proposed action:

Agent proposes action

↓

Sentinel evaluates admissibility

↓

ALLOW / REVIEW / DENY

↓

Decision signed

↓

Audit ledger

↓

Evidence export / SIEM

The design principle I held throughout development was simple:

Sentinel decides what may execute.

The customer decides what their agent should be able to attempt.

Customers own:

Agents
Capabilities
Policies
Reviewers

Sentinel governs execution.

It does not own the agent.

The Governance Pipeline

Every action passes through multiple governance layers.

Identity Verification

Every request is signed using Ed25519.

Unknown, suspended, or improperly signed agents receive no execution authority.

Freshness, Replay Defense, and Idempotency

Timestamp windows, nonce validation, replay detection, and idempotency controls prevent old authority from being reused.
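As a rough illustration of how these three controls fit together, here is a single-process sketch. The class name `FreshnessGuard`, the 30-second window, and the return labels are assumptions for the example, not Sentinel's actual values or interface.

```python
import time
from typing import Callable, Dict, Optional


class FreshnessGuard:
    """Single-process sketch: timestamp window, nonce check, idempotency cache."""

    def __init__(self, max_skew_seconds: float = 30.0):  # window size is hypothetical
        self.max_skew_seconds = max_skew_seconds
        self.seen_nonces: Dict[str, float] = {}
        self.results: Dict[str, str] = {}

    def check(self, nonce: str, timestamp: float, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        if abs(now - timestamp) > self.max_skew_seconds:
            return "stale"   # outside the freshness window
        if nonce in self.seen_nonces:
            return "replay"  # this nonce was already accepted once
        self.seen_nonces[nonce] = timestamp
        return "fresh"

    def run_once(self, idempotency_key: str, action: Callable[[], str]) -> str:
        # A retry with the same key gets the stored result, not a second execution.
        if idempotency_key not in self.results:
            self.results[idempotency_key] = action()
        return self.results[idempotency_key]
```

This version is not thread-safe and never expires old nonces. A real deployment would need an atomic check-and-set in shared storage and a retention policy at least as long as the timestamp window.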

Capability Enforcement

The agent must possess the capability being attempted.

Missing capability:

DENY.

Schema Validation

Actions must conform to expected structures.

Malformed requests never reach execution.
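A small sketch of what a structural check of this kind can look like. The schema `TRANSFER_SCHEMA`, its field names, and the `validate` helper are hypothetical, chosen only to illustrate rejecting a request before it reaches any later stage.

```python
from typing import Dict, List

# Hypothetical schema: field name -> accepted Python type(s).
TRANSFER_SCHEMA = {
    "action": str,
    "amount_usd": (int, float),
    "destination": str,
}


def validate(action: Dict, schema: Dict) -> List[str]:
    """Return a list of problems; an empty list means the action is well-formed."""
    errors = []
    for field, expected in schema.items():
        if field not in action:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        # (This sketch has no boolean fields in its schema.)
        elif isinstance(action[field], bool) or not isinstance(action[field], expected):
            errors.append(f"wrong type for {field}")
    for field in action:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


good = {"action": "transfer_funds", "amount_usd": 40, "destination": "acct-internal-1"}
bad = {"action": "transfer_funds", "amount_usd": "40"}
```

Rejecting unexpected fields as well as missing ones keeps unvalidated data from riding along into policy evaluation.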

Operational Controls

Global freezes, kill switches, and rate limits can halt activity regardless of agent status.

Deterministic Policy Evaluation

The same canonical action should always produce the same governance result.
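Determinism depends on first reducing an action to one canonical byte form. The sketch below assumes actions are JSON-like dictionaries and uses sorted keys plus SHA-256. The helper names are hypothetical, and this is one common approach, not necessarily the canonicalization Sentinel uses.

```python
import hashlib
import json
from typing import Dict


def canonical_bytes(action: Dict) -> bytes:
    # Sorted keys and fixed separators remove key-order and whitespace variation.
    text = json.dumps(action, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def action_hash(action: Dict) -> str:
    return hashlib.sha256(canonical_bytes(action)).hexdigest()


a = {"action": "restart_service", "target": "staging", "window": "maintenance"}
b = {"window": "maintenance", "target": "staging", "action": "restart_service"}
same = action_hash(a) == action_hash(b)  # True: key order does not matter

# Edge case: 40 and 40.0 serialize differently, so they hash differently.
differs = action_hash({"amount": 40}) != action_hash({"amount": 40.0})  # True
```

Number formatting and Unicode normalization are the usual places where this simple approach breaks down. RFC 8785 (JSON Canonicalization Scheme) defines stricter rules for exactly those cases.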

Risk Assessment

Risk scoring considers:

Action type
Notional value
Frequency
Context
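To make the four inputs concrete, here is a toy weighted score. Every weight, cap, and action name below is invented for illustration. The post does not publish Sentinel's scoring model, and this sketch does not attempt to reproduce it.

```python
# Hypothetical per-action base risk in [0, 1].
ACTION_WEIGHTS = {"read": 0.0, "trade": 0.3, "transfer_funds": 0.5, "delete": 0.8}


def risk_score(action_type: str, notional_usd: float,
               actions_last_hour: int, off_hours: bool) -> float:
    """Combine the four inputs into a score in [0, 1]. All constants are made up."""
    base = ACTION_WEIGHTS.get(action_type, 1.0)    # unknown actions count as maximum risk
    value = min(notional_usd / 100_000, 1.0)       # saturates at $100k
    frequency = min(actions_last_hour / 100, 1.0)  # saturates at 100 actions per hour
    context = 1.0 if off_hours else 0.0
    return 0.4 * base + 0.3 * value + 0.2 * frequency + 0.1 * context


low = risk_score("read", 0, 1, off_hours=False)
high = risk_score("transfer_funds", 400_000, 50, off_hours=True)
```

The saturating terms keep one extreme input from pushing the score past 1.0, and treating unknown action types as maximum risk keeps the sketch fail-closed.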

History-Based Trust Signal

Agent behavior influences future admissibility decisions.

Good behavior builds trust.

Unsafe behavior reduces it.
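One simple way to model a signal like this is a bounded score with a fixed step per action. The function name, the starting value, and the step sizes are assumptions, including the choice to make trust slower to earn than to lose. Sentinel's actual update rule is not described in this post.

```python
from typing import List


def update_reputation(score: float, safe: bool,
                      gain: float = 0.01, penalty: float = 0.02) -> float:
    """Move the score up for a safe action, down for an unsafe one, clamped to [0, 1]."""
    delta = gain if safe else -penalty
    return max(0.0, min(1.0, score + delta))


# Hypothetical run: 30 safe actions followed by 30 unsafe ones.
score = 0.5
history: List[float] = [score]
for safe in [True] * 30 + [False] * 30:
    score = update_reputation(score, safe)
    history.append(score)
```

In this toy run the score climbs during the first 30 actions and falls during the last 30.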

Verdict

Sentinel returns:

ALLOW
REVIEW
DENY

The decision is then:

Signed
Recorded
Auditable

Validation Results

Anyone can describe a system.

Fewer people publish test results.

Verdicts Scale With Consequence

Governance is graded.

Not binary.

| Action | Verdict |
| --- | --- |
| Read-only health check | ALLOW |
| Small approved trade | ALLOW |
| Large trade | REVIEW |
| Funds transfer | REVIEW |
| Funds transfer above limit | DENY |
| Deploy to staging | REVIEW |
| Missing capability | DENY |

An important clarification:

Missing capability DENY results are enforcement decisions.

They are not false positives.

For actions where the agent held the required capability, the false-positive rate (admissible actions wrongly denied) in this validation run was zero.

Replay Protection Holds Under Concurrency

I fired one signed request thirty times simultaneously.

Result:

Executed: 1
Rejected: 29

Exactly-once behavior held under concurrent load.

For financial or infrastructure actions, this property is non-negotiable.
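The shape of that test can be reproduced in miniature. The sketch below is an in-process stand-in, not the actual harness: `NonceStore` and `fire_concurrently` are hypothetical names, and a lock-protected set stands in for whatever storage a real deployment uses.

```python
import threading
from typing import List, Tuple


class NonceStore:
    """Atomic claim of a nonce: the first caller wins, every later caller loses."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._seen = set()

    def claim(self, nonce: str) -> bool:
        with self._lock:  # check and insert happen as one step
            if nonce in self._seen:
                return False
            self._seen.add(nonce)
            return True


def fire_concurrently(store: NonceStore, nonce: str, n: int) -> Tuple[int, int]:
    results: List[bool] = []
    results_lock = threading.Lock()
    barrier = threading.Barrier(n)  # release all threads at the same moment

    def worker() -> None:
        barrier.wait()
        ok = store.claim(nonce)
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True), results.count(False)


executed, rejected = fire_concurrently(NonceStore(), "req-123", 30)
print(executed, rejected)  # 1 29
```

The property comes from doing the check and the insert under one lock. Across multiple processes or hosts the same guarantee needs an atomic operation in shared storage, such as a unique constraint or a set-if-absent write.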

Identical Actions Produce Identical Decisions

The same canonical action was executed repeatedly.

Results:

Same verdict every time
No decision drift
Valid signatures on every response

Equivalent inputs produced equivalent outputs.

Reputation Moves In The Right Direction

This was one of the tests I most wanted to run honestly, with a real chance of failing.

Fresh agent.

60 actions.

Phase 1

Safe actions:

Reads
Small trades

Result:

Reputation increased steadily.

Phase 2

Risky actions:

Large transfers
Destructive operations

Result:

Reputation decreased steadily.

This is reported as observed validation behavior, not a universal guarantee.

However, the signal behaved correctly:

Good behavior increased trust.

Bad behavior reduced it.

Reviews Resolve Even When Nobody Is Watching

A REVIEW cannot remain unresolved forever.

Otherwise governance becomes a denial-of-service against itself.

Review resolution follows risk bands:

Low Risk

Automatically approved after timeout.

Medium Risk

Escalated to the customer reviewer.

If unanswered:

Fail closed.

High Risk

Automatically denied.

A background worker finalizes unresolved reviews.

No dashboard needs to remain open.

No action remains stuck forever.

No race condition can double-decide an action.
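A sketch of the two ideas in this section: a timeout outcome per risk band, and a first-writer-wins finalize step so a reviewer and the background worker cannot both decide. The names are hypothetical, and the sketch assumes high-risk reviews are denied at timeout, which is one reading of the bands above.

```python
import threading
from typing import Optional

# Outcome when a review times out with no human decision (fail closed above low risk).
TIMEOUT_OUTCOME = {"low": "ALLOW", "medium": "DENY", "high": "DENY"}


class Review:
    def __init__(self, risk_band: str) -> None:
        self.risk_band = risk_band
        self.verdict: Optional[str] = None
        self._lock = threading.Lock()

    def finalize(self, verdict: str) -> bool:
        """Set the verdict only if none exists yet. Returns True if this call decided."""
        with self._lock:
            if self.verdict is not None:
                return False
            self.verdict = verdict
            return True


def expire(review: Review) -> bool:
    # Called by the background worker once the review's timeout has passed.
    return review.finalize(TIMEOUT_OUTCOME[review.risk_band])


reviewed = Review("medium")
reviewed.finalize("ALLOW")   # the customer reviewer answers in time
late = expire(reviewed)      # the worker arrives later and changes nothing

unanswered = Review("medium")
expire(unanswered)           # nobody answered, so the review fails closed
```

In a database-backed system, the same first-writer-wins rule is typically expressed as a conditional update that only succeeds while the verdict column is still empty.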

Exact thresholds are intentionally not published, since knowing them would let an attacker shape actions to land just inside a lower risk band.

Cryptographic Auditability

Every decision contains:

Ed25519 signature
Deterministic action hash
Policy version
Key identifier

Records are hash-linked into an append-only chain.

Validation confirmed:

Signature integrity
Chain continuity
Independent verification
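Hash-linking and the chain-continuity check can be illustrated with a few lines of standard-library code. The record layout, the genesis value, and the function names are assumptions for the example. Ed25519 signatures are left out because they require a third-party library in Python.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # hypothetical placeholder for the hash before the first record


def record_hash(prev_hash: str, payload: Dict) -> str:
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(chain: List[Dict], payload: Dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "payload": payload, "hash": record_hash(prev, payload)})


def verify(chain: List[Dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered record breaks the chain."""
    prev = GENESIS
    for rec in chain:
        if rec["prev"] != prev or rec["hash"] != record_hash(prev, rec["payload"]):
            return False
        prev = rec["hash"]
    return True


chain: List[Dict] = []
append(chain, {"action": "read_health", "verdict": "ALLOW"})
append(chain, {"action": "transfer_funds", "verdict": "REVIEW"})
append(chain, {"action": "delete_volume", "verdict": "DENY"})
intact = verify(chain)

chain[1]["payload"]["verdict"] = "ALLOW"  # tamper with a middle record
tampered = verify(chain)
```

A chain like this only shows that records are internally consistent. Someone who can rewrite the whole ledger can also recompute every hash, which is why a separate signature on each decision matters.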

Evidence bundles and SIEM exports are generated for downstream audit systems.

What Sentinel Does Not Solve

This section matters.

Every security system has boundaries.

Sentinel does not:

Prevent model jailbreaks by itself
Replace application authorization
Eliminate insider abuse
Remove the need for secure key management
Guarantee model correctness

Sentinel is a runtime governor.

It governs execution authority.

It is not a replacement for every other security control.

What I’m Not Claiming

I want this section to be explicit.

Production-Ready Does Not Mean Finished

The core governance boundary is coherent and testable.

The product is still evolving.

Long-Duration Soak Testing

Not yet completed.

Independent Security Review

Not yet completed.

I want:

External penetration testing
Independent cryptographic review

Availability

99.9% is a target.

It is not yet measured historical uptime.

Performance

Latency and throughput numbers are observed validation results.

They are not contractual guarantees.

Enterprise Features

Some enterprise features remain roadmap items:

SSO / SAML
HSM-backed identity
Private VPC deployments
Multi-region deployments

Early Audit Records

A small number of early audit records predate persistence of the exact canonical signing payload.

They remain chain-verifiable but cannot support standalone signature verification.

I’d rather disclose that than hide it.

Why This Matters

The industry is getting very good at making agents capable.

We spend far less effort making their actions:

Bounded
Attributable
Governed
Auditable

I don’t think the answer is to slow agents down.

I think the answer is a thin layer between intent and execution.

A layer that:

Allows most actions
Pauses when judgment is required
Refuses when necessary
Produces evidence afterward

And can prove exactly what happened.

That’s the thesis behind Sentinel.

Challenge The Model

I’d rather hear:

“Your replay test is weak because X.”

than:

“Cool project.”

If you’re:

A security engineer
A compliance specialist
An auditor
An AI platform builder
A red teamer

Tell me where this model breaks.

Questions I’m particularly interested in:

How would you bypass admissibility?
What evidence artifact is missing?
What canonicalization edge cases am I ignoring?
Where does this governance model fail under real-world pressure?

The fastest way to improve governance infrastructure is to expose it to people whose job is to challenge assumptions.

If you see a flaw, point it out.

That’s how Sentinel gets better.

Website: https://sentinelsca.com

Documentation: https://sentinelsca.com/docs

Validation Report: https://sentinelsca.com/docs/validation-report

GitHub: https://github.com/sentinelSCA/sentinel

— Building Sentinel SCA, solo, and testing it in the open.

DEV Community
