Posted on Mar 9

Meet StatlerScore, a Credit Score for your Cloud

#cloud #opensource #aws #security

StatlerScore is a quantitative framework that translates cloud infrastructure security risks into a 300–850 scale, analogous to financial credit scoring. The system bridges the gap between technical security debt and executive visibility by providing a standardized, evidence-based metric that communicates security posture across technical and business stakeholders.

The minimum viable product, delivered as a practicum project, will include a working internal scoring engine with a dashboard, evidence of AWS testing that verifies the scoring tiers, and two blog posts. I will also develop a system that verifies each scoring run, analogous to a credit bureau.

The name "StatlerScore" was inspired by the sharp-tongued critic from the Muppets, and differentiates this project from the more generic "Cloud Security Credit Score" concept.

Motivation

Communication Gap — Everyone in the United States understands the concept of a credit score. Security dashboards with dozens of findings across multiple tools dilute urgency. A single number on a familiar scale communicates risk instantly to executives, board members, and engineers alike.
Benchmarking Absence — Cloud security benchmarks exist, but no unified scoring standard allows for consistent, cross-organization comparison.
Continuous vs. Point-in-Time — Annual SOC 2 audits provide a snapshot, but cloud infrastructure changes daily. StatlerScore attestations carry a 90-day TTL with expiry warnings, ensuring stale scores are never misrepresented as current posture.
Verifiability — StatlerScore's Cloud Credit Bureau produces cryptographically signed attestations with hash-chained logs and Merkle tree proofs, so any third party can independently verify that a score is authentic, unmodified, and anchored to a specific point in time.

Score

Score Range	FICO Credit Score	Cloud Credit Score
800-850	Exceptional	Resilient posture
740-799	Very Good	Strong posture
670-739	Good	Stable posture
580-669	Fair	Accumulating technical risk
300-579	Poor	Critical remediation required

StatlerScores should expose trend lines, not just numbers:

↑ Improving
→ Stable
↓ Degrading

Architecture

src/
├── collectors/          # CloudHarvester — gathers AWS evidence via boto3
│   └── cloudharvester.py
├── scoring/             # Weighted pillar scoring engine with calibration curve
│   ├── score.py
│   └── history.py
├── reporting/           # CLI report renderer and matplotlib visualizations
│   ├── render.py
│   └── visualize.py
└── verification/        # Cloud Credit Bureau — FastAPI attestation service
    ├── api.py           # Bureau API (13 endpoints)
    ├── attestor.py      # HMAC-SHA256 signing and verification
    ├── auth.py          # API key generation and hashing
    ├── merkle.py        # Merkle tree with RFC 6962 domain separation
    ├── store.py         # Persistent JSON-backed log with hash-chain integrity
    └── timestamp.py     # RFC 3161 trusted timestamping via DigiCert TSA

Pillar Weights

Pillar	Weight	Evidence Sources
Security	35%	IAM, S3 exposure, network controls, encryption, detection services
Reliability	25%	CloudTrail, backups, multi-AZ, EKS health, lifecycle policies
Operational Excellence	25%	Logging validation, observability, resource hygiene, version currency
Performance Efficiency	15%	Auto Scaling, CloudFront, Graviton adoption

Cost Optimization and Sustainability, the other two pillars of the AWS Well-Architected Framework, are excluded because they fall outside the scope of security risk quantification.

Evidence Collection

The CloudHarvester collector gathers evidence from 16 AWS services (S3, IAM, CloudTrail, EC2, RDS, EKS, GuardDuty, Security Hub, and others) and normalizes findings into four categories of signals: boolean flags (e.g., "Is root MFA enabled?"), ratios (e.g., "What percentage of IAM users have MFA?"), capped counts (e.g., "How many CloudWatch alarms exist, up to a ceiling of 10?"), and penalties (e.g., "Each open security group deducts 10% from its signal"). A mock evidence mode (USE_MOCK=1) allows testing without live AWS credentials.

Signal Normalization

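Every check produces a signal between 0.0 (worst) and 1.0 (best). The engine uses five normalization functions; ratio-style evidence gets two of them (ratio and invert), depending on whether a higher value is good or bad: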

flag — binary (1.0 if enabled, 0.0 if not)
ratio — direct pass-through of a 0–1 ratio
invert — 1.0 minus the ratio (used when higher values are worse, like public bucket exposure)
capped — count divided by a ceiling, maxing at 1.0
penalty — starts at 1.0 and deducts a fixed amount per occurrence
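
The five functions above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's actual code: the function signatures, the clamping of ratios to the 0–1 range, the floor of 0.0 on penalties, and the default deduction of 0.10 (borrowed from the open security group example) are all assumptions.

```python
def flag(enabled: bool) -> float:
    """Binary signal: 1.0 if the control is enabled, 0.0 otherwise."""
    return 1.0 if enabled else 0.0


def ratio(value: float) -> float:
    """Pass a 0-1 ratio through, clamped in case the evidence is out of range."""
    return min(max(value, 0.0), 1.0)


def invert(value: float) -> float:
    """For ratios where higher is worse (e.g. share of public buckets)."""
    return 1.0 - ratio(value)


def capped(count: int, ceiling: int) -> float:
    """Count divided by a ceiling, maxing out at 1.0."""
    return min(count / ceiling, 1.0)


def penalty(occurrences: int, per_hit: float = 0.10) -> float:
    """Start at 1.0 and deduct a fixed amount per occurrence, floored at 0.0."""
    return max(0.0, 1.0 - occurrences * per_hit)


if __name__ == "__main__":
    print(flag(True), invert(0.25), capped(4, 10), penalty(3))
```

Under these assumptions, four CloudWatch alarms against a ceiling of 10 yield a signal of 0.4, and three open security groups yield roughly 0.7.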

Each pillar averages its applicable signals, skipping any that return None (indicating the service isn't in use). The four pillar scores are then combined using a weighted average, with re-normalization if any pillar is N/A.
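
A minimal sketch of that aggregation, using the pillar weights from the table above. The dictionary keys and the function names `pillar_score` and `overall_raw` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional

# Weights from the Pillar Weights table; the key names are assumptions.
PILLAR_WEIGHTS: Dict[str, float] = {
    "security": 0.35,
    "reliability": 0.25,
    "operational_excellence": 0.25,
    "performance_efficiency": 0.15,
}


def pillar_score(signals: List[Optional[float]]) -> Optional[float]:
    """Average the applicable signals; None means the service is not in use."""
    applicable = [s for s in signals if s is not None]
    if not applicable:
        return None  # the whole pillar is N/A
    return sum(applicable) / len(applicable)


def overall_raw(pillars: Dict[str, Optional[float]]) -> Optional[float]:
    """Weighted average over scored pillars, re-normalizing the weights."""
    scored = {name: v for name, v in pillars.items() if v is not None}
    total_weight = sum(PILLAR_WEIGHTS[name] for name in scored)
    if total_weight == 0:
        return None
    weighted = sum(PILLAR_WEIGHTS[name] * v for name, v in scored.items())
    return weighted / total_weight


if __name__ == "__main__":
    pillars = {
        "security": pillar_score([1.0, 0.6, None]),
        "reliability": pillar_score([0.6]),
        "operational_excellence": pillar_score([0.6]),
        "performance_efficiency": pillar_score([None]),  # N/A pillar
    }
    print(overall_raw(pillars))
```

Dividing by the sum of the remaining weights means an account that does not use, say, Auto Scaling or CloudFront is neither rewarded nor punished for the missing pillar.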

Score Calibration Curve

A plain linear mapping (300 + raw × 550) doesn't reflect how credit scores are actually distributed. StatlerScore uses a piecewise linear calibration curve modeled on real FICO distribution data:

Raw Score	Credit Score	Tier Boundary
0%	300	—
50%	580	Poor → Fair
65%	670	Fair → Good
80%	740	Good → Very Good
93%	800	Very Good → Exceptional
100%	850	Perfect

This means reaching "Exceptional" requires 93%+ raw posture — not the 91% a linear formula would require. The compression at the top reflects reality: truly hardened cloud environments are rare, and the top tier should be hard to earn.
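
The curve can be reproduced with straightforward linear interpolation between the breakpoints in the table. The sketch below assumes the raw score arrives as a 0–1 fraction and that the result is rounded to the nearest integer; the names `CALIBRATION_POINTS` and `calibrate` are illustrative, not taken from the project.

```python
# (raw posture, credit score) breakpoints from the calibration table.
CALIBRATION_POINTS = [
    (0.00, 300),
    (0.50, 580),
    (0.65, 670),
    (0.80, 740),
    (0.93, 800),
    (1.00, 850),
]


def calibrate(raw: float) -> int:
    """Map a 0-1 raw posture score onto the 300-850 scale."""
    raw = min(max(raw, 0.0), 1.0)
    segments = zip(CALIBRATION_POINTS, CALIBRATION_POINTS[1:])
    for (x0, y0), (x1, y1) in segments:
        if raw <= x1:
            # Interpolate linearly inside the segment that contains raw.
            return round(y0 + (raw - x0) / (x1 - x0) * (y1 - y0))
    return CALIBRATION_POINTS[-1][1]


def linear(raw: float) -> int:
    """The plain linear mapping, shown only for comparison."""
    return round(300 + raw * 550)


if __name__ == "__main__":
    for raw in (0.25, 0.50, 0.91, 0.93, 1.00):
        print(f"{raw:.2f}  calibrated={calibrate(raw)}  linear={linear(raw)}")
```

At 91% raw posture the linear formula already reaches 800, while the calibrated curve stays below the "Exceptional" boundary until 93%.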

Key Factors

The engine also identifies which specific checks are helping and hurting the overall score, sorted by impact. Checks scoring at or below 50% appear as "hurting" factors (worst first), and checks at 80% or above appear as "helping" factors (best first). This gives operators an immediate remediation priority list without requiring them to dig through the full pillar breakdown.
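
One way to derive the two lists is shown below. It is a sketch under a simplifying assumption: "impact" is taken to be the signal value alone, whereas the real engine may also factor in pillar weights. The check names are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

HURTING_MAX = 0.50  # checks at or below 50% are "hurting"
HELPING_MIN = 0.80  # checks at or above 80% are "helping"


def key_factors(
    checks: Dict[str, Optional[float]],
) -> Dict[str, List[Tuple[str, float]]]:
    """Split checks into hurting (worst first) and helping (best first)."""
    scored = [(name, v) for name, v in checks.items() if v is not None]
    hurting = sorted(
        (c for c in scored if c[1] <= HURTING_MAX), key=lambda c: c[1]
    )
    helping = sorted(
        (c for c in scored if c[1] >= HELPING_MIN),
        key=lambda c: c[1],
        reverse=True,
    )
    return {"hurting": hurting, "helping": helping}


if __name__ == "__main__":
    factors = key_factors({
        "root_mfa": 0.0,
        "iam_user_mfa": 0.45,
        "cloudtrail_enabled": 1.0,
        "cloudwatch_alarms": 0.8,
        "public_buckets": 0.6,   # neither helping nor hurting
        "eks_health": None,      # service not in use
    })
    print(factors["hurting"])
    print(factors["helping"])
```

Checks between the two thresholds appear in neither list, which keeps the remediation list short.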

Cloud Credit Bureau

The bureau is a FastAPI service that acts as an independent scoring authority — analogous to how Equifax or TransUnion operate for consumer credit. Clients submit evidence; the bureau scores it, signs the attestation, and appends it to a tamper-evident log.

Attestation Lifecycle

An organization registers via POST /organizations and receives a one-time API key.
The CloudHarvester collects evidence from the target AWS account.
The client submits evidence to POST /attest with their Bearer token.
The bureau runs the scoring engine, creates the attestation record, and signs it.
The signed record is appended to a hash-chained log and covered by a Merkle tree.

Cryptographic Integrity

HMAC-SHA256 Signatures — Each attestation is signed over a concatenation of five fields: evidence hash, timestamp, account ID, previous record hash, and validity expiration. Binding valid_until into the signature prevents the bureau from quietly extending or shortening a score's validity after issuance. Verification uses constant-time comparison to prevent timing attacks.
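
The signing scheme can be illustrated with Python's standard `hmac` module. The field order, the `|` delimiter, and the function names are assumptions made for this sketch; the post only states that five fields are concatenated. A delimiter is used here so that adjacent fields cannot be shifted into one another without changing the payload.

```python
import hashlib
import hmac


def signing_payload(evidence_hash: str, timestamp: str, account_id: str,
                    prev_hash: str, valid_until: str) -> bytes:
    """Join the five signed fields; the '|' delimiter is an assumption."""
    fields = [evidence_hash, timestamp, account_id, prev_hash, valid_until]
    return "|".join(fields).encode("utf-8")


def sign(secret: bytes, *fields: str) -> str:
    """HMAC-SHA256 over the payload, hex-encoded."""
    payload = signing_payload(*fields)
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret: bytes, signature: str, *fields: str) -> bool:
    """Recompute the MAC and compare in constant time."""
    expected = sign(secret, *fields)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    secret = b"demo-secret-not-for-production"
    record = ("ab" * 32, "2025-03-09T12:00:00Z", "123456789012",
              "00" * 32, "2025-06-07T12:00:00Z")
    sig = sign(secret, *record)
    print(verify(secret, sig, *record))            # signature matches
    extended = record[:4] + ("2026-06-07T12:00:00Z",)
    print(verify(secret, sig, *extended))          # valid_until was changed
```

Because `valid_until` is part of the payload, changing the expiry date invalidates the signature in the same way as changing the evidence hash would.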

Hash-Chained Log — Every record includes the SHA-256 hash of its predecessor, creating a chain back to a genesis record. Any deletion, insertion, or reordering breaks the chain at the tampered point. The GET /chain/verify endpoint walks the full log and confirms integrity.
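
The chain walk behind an endpoint like this can be sketched as follows. The record layout, the canonical JSON serialization, and the all-zero genesis hash are assumptions for illustration; the actual store may differ.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional

GENESIS_HASH = "0" * 64  # assumed sentinel for the first record's predecessor


def record_hash(record: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON form of everything except the hash itself."""
    body = {k: v for k, v in record.items() if k != "record_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(log: List[Dict[str, Any]], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Append a record that commits to its predecessor's hash."""
    prev = log[-1]["record_hash"] if log else GENESIS_HASH
    record = {"index": len(log), "prev_hash": prev, "payload": payload}
    record["record_hash"] = record_hash(record)
    log.append(record)
    return record


def verify_chain(log: List[Dict[str, Any]]) -> Optional[int]:
    """Return the position of the first broken record, or None if intact."""
    prev = GENESIS_HASH
    for position, record in enumerate(log):
        if (record["index"] != position
                or record["prev_hash"] != prev
                or record["record_hash"] != record_hash(record)):
            return position
        prev = record["record_hash"]
    return None


if __name__ == "__main__":
    log: List[Dict[str, Any]] = []
    for score in (712, 725, 731):
        append(log, {"score": score})
    print(verify_chain(log))        # intact chain
    log[1]["payload"]["score"] = 850
    print(verify_chain(log))        # position of the tampered record
```

Editing a payload breaks that record's own hash, while deleting or reordering records breaks the `prev_hash` link of whichever record follows the gap.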

Score Expiry

Attestations carry a configurable TTL (default: 90 days). The API returns expiry metadata with every score query — including days_remaining, expired status, and an expiring_soon warning when fewer than 14 days remain. This incentivizes continuous monitoring without applying artificial score degradation.
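
The expiry metadata could be computed along these lines. The field names `days_remaining`, `expired`, and `expiring_soon` come from the description above; the function name, its parameters, and the choice to report whole days rounded down are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def expiry_status(issued_at: datetime,
                  now: Optional[datetime] = None,
                  ttl_days: int = 90,
                  warn_days: int = 14) -> Dict[str, Any]:
    """Derive expiry metadata for an attestation issued at issued_at."""
    now = now or datetime.now(timezone.utc)
    valid_until = issued_at + timedelta(days=ttl_days)
    remaining = valid_until - now
    expired = remaining <= timedelta(0)
    return {
        "valid_until": valid_until.isoformat(),
        # Whole days left, never negative once the attestation has expired.
        "days_remaining": max(remaining.days, 0),
        "expired": expired,
        # Warn only while the score is still valid.
        "expiring_soon": (not expired) and remaining < timedelta(days=warn_days),
    }


if __name__ == "__main__":
    issued = datetime(2025, 3, 9, tzinfo=timezone.utc)
    for elapsed in (10, 80, 91):
        status = expiry_status(issued, now=issued + timedelta(days=elapsed))
        print(elapsed, status)
```

The score itself is untouched by this calculation; only the metadata around it changes as the attestation ages.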

Bureau API Endpoints

Endpoint	Auth	Purpose
`POST /organizations`	—	Register an org, receive API key
`GET /organizations/{id}`	—	Look up a registered org
`POST /attest`	Bearer	Score evidence, return signed attestation
`GET /score/{id}/latest`	Bearer	Most recent score with expiry info
`GET /score/{id}/status`	Bearer	Lightweight expiry check
`GET /history/{id}`	Bearer	Full attestation history
`GET /verify/{id}`	—	Verify HMAC signature of one record
`GET /chain/verify`	—	Verify hash-chain integrity across full log
`GET /merkle/root`	—	Current Merkle root + latest anchor
`GET /proof/{id}`	—	Merkle inclusion proof for one attestation
`GET /merkle/anchor/latest`	—	Full RFC 3161 anchor record
`GET /merkle/anchor/verify/{root}`	—	Re-verify stored TSR via openssl

Public endpoints (verification, chain integrity, Merkle proofs) require no authentication — any third party can audit the log.
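
To show what a third-party auditor would do with an inclusion proof, here is a minimal Merkle tree sketch that follows the RFC 6962 hashing rules: leaves are hashed with a `0x00` prefix, interior nodes with a `0x01` prefix, and a list of n leaves is split at the largest power of two smaller than n. The proof format used here, a list of (side, sibling hash) pairs, is an assumption and not necessarily what `GET /proof/{id}` returns.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[str, bytes]]  # ("L" or "R", sibling hash)


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly smaller than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_proof(index: int, leaves: List[bytes]) -> Proof:
    """Sibling hashes from the leaf up to the root."""
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return inclusion_proof(index, leaves[:k]) + [("R", merkle_root(leaves[k:]))]
    return inclusion_proof(index - k, leaves[k:]) + [("L", merkle_root(leaves[:k]))]


def verify_inclusion(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Fold the proof over the leaf hash and compare against the root."""
    h = leaf_hash(leaf)
    for side, sibling in proof:
        h = node_hash(sibling, h) if side == "L" else node_hash(h, sibling)
    return h == root


if __name__ == "__main__":
    records = [f"attestation-{i}".encode() for i in range(5)]
    root = merkle_root(records)
    proof = inclusion_proof(3, records)
    print(verify_inclusion(records[3], proof, root))
    print(verify_inclusion(b"forged", proof, root))
```

The distinct prefixes prevent an interior node from being passed off as a leaf, and the verifier needs only the leaf, the proof, and the published root rather than the full log.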

What This Is Not

Not another CVE database — StatlerScore evaluates infrastructure posture against the Well-Architected Framework, not individual vulnerability records.
Not an external attack surface scanner — StatlerScore evaluates internal configuration with direct AWS API access and produces cryptographically verifiable attestations.
Not a cost optimization tool — The scoring framework is scoped exclusively to security risk quantification.

Future Work

Jira integration for automated security debt ticketing based on "hurting" factors
Multi-cloud support extending the collector framework beyond AWS
Decay factor applying continuous score degradation (e.g., 0.5%/day) as an alternative to binary TTL expiry
Dashboard UI with historical trend visualization beyond the current CLI and matplotlib outputs

DEV Community