StatlerScore is a quantitative framework that translates cloud infrastructure security risks into a 300–850 scale, analogous to financial credit scoring. The system bridges the gap between technical security debt and executive visibility by providing a standardized, evidence-based metric that communicates security posture to both technical and business stakeholders.
The minimum viable product (MVP) deliverable for the practicum will include a working internal scoring engine with a dashboard, evidence of AWS testing to verify the scoring tiers, and two blog posts. I will also develop a system for verifying issued scores, analogous to a credit bureau.
The name "StatlerScore" was inspired by the sharp-tongued critic from the Muppets, and differentiates this project from the more generic "Cloud Security Credit Score" concept.
Motivation
- Communication Gap — Most adults in the United States are familiar with the concept of a credit score. Security dashboards that spread dozens of findings across multiple tools dilute urgency. A single number on a familiar scale communicates risk instantly to executives, board members, and engineers alike.
- Benchmarking Absence — Cloud security benchmarks exist, but no unified scoring standard allows for consistent, cross-organization comparison.
- Continuous vs. Point-in-Time — Annual SOC 2 audits provide a snapshot, but cloud infrastructure changes daily. StatlerScore attestations carry a 90-day TTL with expiry warnings, so stale scores are never misrepresented as current posture.
- Verifiability - StatlerScore's Cloud Credit Bureau produces cryptographically signed attestations with hash-chained logs and Merkle tree proofs, so any third party can independently verify that a score is authentic, unmodified, and anchored to a specific point in time.
Score
| Score Range | FICO Credit Score | Cloud Credit Score |
|---|---|---|
| 800-850 | Exceptional | Resilient posture |
| 740-799 | Very Good | Strong posture |
| 670-739 | Good | Stable posture |
| 580-669 | Fair | Accumulating technical risk |
| 300-579 | Poor | Critical remediation required |
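The tier table above can be turned into a small lookup. The sketch below is illustrative only: `TIER_FLOORS`, `TIER_LABELS`, and `tier_for` are hypothetical names, not taken from the project's `score.py`.

```python
from bisect import bisect_right

# Lower bound of each tier, copied from the table above (lowest tier first).
TIER_FLOORS = [300, 580, 670, 740, 800]
TIER_LABELS = [
    "Critical remediation required",
    "Accumulating technical risk",
    "Stable posture",
    "Strong posture",
    "Resilient posture",
]


def tier_for(score: int) -> str:
    """Return the Cloud Credit Score tier label for a 300-850 score."""
    if not 300 <= score <= 850:
        raise ValueError(f"score {score} is outside the 300-850 scale")
    # bisect_right counts how many tier floors are <= score.
    return TIER_LABELS[bisect_right(TIER_FLOORS, score) - 1]
```

Keeping the floors in one sorted list means a boundary change only touches the data, not the lookup logic.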
StatlerScores should expose trend lines, not just numbers:
↑ Improving
→ Stable
↓ Degrading
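One way to derive the trend indicator is to compare the two most recent scores. This is a sketch under an assumption: the 5-point `tolerance` and the function name `trend` are hypothetical, since the thresholds StatlerScore actually uses are not stated here.

```python
def trend(previous: int, current: int, tolerance: int = 5) -> str:
    """Classify the change between two scores as improving, stable, or degrading.

    `tolerance` is an assumed dead band (in score points) that keeps small
    fluctuations from being reported as a real change.
    """
    delta = current - previous
    if delta > tolerance:
        return "↑ Improving"
    if delta < -tolerance:
        return "↓ Degrading"
    return "→ Stable"
```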
Architecture
src/
├── collectors/ # CloudHarvester — gathers AWS evidence via boto3
│ └── cloudharvester.py
├── scoring/ # Weighted pillar scoring engine with calibration curve
│ ├── score.py
│ └── history.py
├── reporting/ # CLI report renderer and matplotlib visualizations
│ ├── render.py
│ └── visualize.py
└── verification/ # Cloud Credit Bureau — FastAPI attestation service
  ├── api.py # Bureau API (13 endpoints)
  ├── attestor.py # HMAC-SHA256 signing and verification
  ├── auth.py # API key generation and hashing
  ├── merkle.py # Merkle tree with RFC 6962 domain separation
  ├── store.py # Persistent JSON-backed log with hash-chain integrity
  └── timestamp.py # RFC 3161 trusted timestamping via DigiCert TSA
Pillar Weights
| Pillar | Weight | Evidence Sources |
|---|---|---|
| Security | 35% | IAM, S3 exposure, network controls, encryption, detection services |
| Reliability | 25% | CloudTrail, backups, multi-AZ, EKS health, lifecycle policies |
| Operational Excellence | 25% | Logging validation, observability, resource hygiene, version currency |
| Performance Efficiency | 15% | Auto Scaling, CloudFront, Graviton adoption |
The remaining two Well-Architected Framework pillars, Cost Optimization and Sustainability, are excluded because they fall outside the scope of security risk quantification.
Evidence Collection
The CloudHarvester collector gathers evidence from 16 AWS services (S3, IAM, CloudTrail, EC2, RDS, EKS, GuardDuty, Security Hub, and others) and normalizes findings into four categories of signals: boolean flags (e.g., "Is root MFA enabled?"), ratios (e.g., "What percentage of IAM users have MFA?"), capped counts (e.g., "How many CloudWatch alarms exist, up to a ceiling of 10?"), and penalties (e.g., "Each open security group deducts 10% from its signal"). A mock evidence mode (USE_MOCK=1) allows testing without live AWS credentials.
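To illustrate how a collector with a mock mode might be shaped, here is a minimal sketch covering only two IAM signals. The function name `collect_iam_evidence`, the `MOCK_EVIDENCE` values, and the dictionary keys are hypothetical and not taken from `cloudharvester.py`; the boto3 calls themselves (`get_account_summary`, the `list_users` paginator, `list_mfa_devices`) are standard IAM client operations.

```python
import os

# Canned evidence returned in mock mode (hypothetical values).
MOCK_EVIDENCE = {
    "root_mfa_enabled": True,
    "iam_users_total": 4,
    "iam_users_with_mfa": 3,
}


def collect_iam_evidence() -> dict:
    """Return raw IAM evidence, using canned data when USE_MOCK=1."""
    if os.environ.get("USE_MOCK") == "1":
        return dict(MOCK_EVIDENCE)

    # Imported lazily so mock mode needs neither boto3 nor AWS credentials.
    import boto3

    iam = boto3.client("iam")
    summary = iam.get_account_summary()["SummaryMap"]
    users = [
        user["UserName"]
        for page in iam.get_paginator("list_users").paginate()
        for user in page["Users"]
    ]
    with_mfa = sum(
        1 for name in users
        if iam.list_mfa_devices(UserName=name)["MFADevices"]
    )
    return {
        "root_mfa_enabled": summary.get("AccountMFAEnabled", 0) == 1,
        "iam_users_total": len(users),
        "iam_users_with_mfa": with_mfa,
    }
```

The first key would feed a boolean flag signal, and the two counts would feed a ratio signal.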
Signal Normalization
Every check produces a signal between 0.0 (worst) and 1.0 (best). The engine uses five normalization functions:
- flag — binary (1.0 if enabled, 0.0 if not)
- ratio — direct pass-through of a 0–1 ratio
- invert — 1.0 minus the ratio (used when higher values are worse, like public bucket exposure)
- capped — count divided by a ceiling, maxing at 1.0
- penalty — starts at 1.0 and deducts a fixed amount per occurrence
Each pillar averages its applicable signals, skipping any that return None (indicating the service isn't in use). The four pillar scores are then combined using a weighted average, with re-normalization if any pillar is N/A.
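The five normalizers and the weighted roll-up described above could be sketched as follows. Function names mirror the list, but the signatures, the clamping in `ratio`, the floor of 0.0 in `penalty`, and the `WEIGHTS` keys are assumptions made for illustration rather than the engine's actual code.

```python
from typing import Optional


def flag(enabled: bool) -> float:
    return 1.0 if enabled else 0.0


def ratio(value: float) -> float:
    # Pass-through; clamping to [0, 1] is a defensive assumption.
    return min(max(value, 0.0), 1.0)


def invert(value: float) -> float:
    return 1.0 - ratio(value)


def capped(count: int, ceiling: int) -> float:
    return min(count / ceiling, 1.0)


def penalty(occurrences: int, per_item: float = 0.10) -> float:
    # Flooring at 0.0 is an assumption; the text only says "deducts".
    return max(1.0 - occurrences * per_item, 0.0)


def pillar_score(signals: list[Optional[float]]) -> Optional[float]:
    """Average the applicable signals; None means the pillar is N/A."""
    used = [s for s in signals if s is not None]
    return sum(used) / len(used) if used else None


# Pillar weights from the table above.
WEIGHTS = {
    "security": 0.35,
    "reliability": 0.25,
    "operational_excellence": 0.25,
    "performance_efficiency": 0.15,
}


def overall(pillars: dict[str, Optional[float]]) -> float:
    """Weighted average, re-normalized over the pillars that have a score."""
    scored = {name: s for name, s in pillars.items() if s is not None}
    total_weight = sum(WEIGHTS[name] for name in scored)
    if total_weight == 0:
        raise ValueError("no pillar produced a score")
    return sum(WEIGHTS[name] * s for name, s in scored.items()) / total_weight
```

Dividing by the sum of the weights actually used is what keeps an N/A pillar from dragging the result toward zero.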
Score Calibration Curve
A plain linear mapping (300 + raw × 550) doesn't reflect how credit scores are actually distributed. StatlerScore uses a piecewise linear calibration curve modeled on real FICO distribution data:
| Raw Score | Credit Score | Tier Boundary |
|---|---|---|
| 0% | 300 | — |
| 50% | 580 | Poor → Fair |
| 65% | 670 | Fair → Good |
| 80% | 740 | Good → Very Good |
| 93% | 800 | Very Good → Exceptional |
| 100% | 850 | Perfect |
This means reaching "Exceptional" requires 93%+ raw posture — not the 91% a linear formula would require. The compression at the top reflects reality: truly hardened cloud environments are rare, and the top tier should be hard to earn.
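The calibration table amounts to linear interpolation between breakpoints. A minimal sketch, assuming the result is rounded to the nearest integer (the rounding rule and the names `CURVE` and `calibrate` are illustrative, not confirmed by the source):

```python
# (raw posture, credit score) breakpoints from the calibration table.
CURVE = [(0.0, 300), (0.50, 580), (0.65, 670), (0.80, 740), (0.93, 800), (1.0, 850)]


def calibrate(raw: float) -> int:
    """Map a raw 0.0-1.0 posture value onto the 300-850 scale."""
    raw = min(max(raw, 0.0), 1.0)
    # Walk adjacent breakpoint pairs and interpolate within the matching segment.
    for (x0, y0), (x1, y1) in zip(CURVE, CURVE[1:]):
        if raw <= x1:
            return round(y0 + (raw - x0) / (x1 - x0) * (y1 - y0))
    return CURVE[-1][1]
```

Under these assumptions, a raw posture of 0.91 calibrates to about 791, still "Very Good", whereas the linear formula 300 + 0.91 × 550 would already report roughly 800.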
Key Factors
The engine also identifies which specific checks are helping and hurting the overall score, sorted by impact. Checks scoring at or below 50% appear as "hurting" factors (worst first), and checks at 80% or above appear as "helping" factors (best first). This gives operators an immediate remediation priority list without requiring them to dig through the full pillar breakdown.
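The factor selection can be sketched as two filtered sorts. Here "impact" is approximated by the signal value alone, which is an assumption; the real engine may also account for pillar weight. The name `key_factors` and the check names in the usage note are hypothetical.

```python
def key_factors(checks: dict[str, float]) -> tuple[list, list]:
    """Split checks into hurting (<= 0.50, worst first) and helping (>= 0.80, best first)."""
    hurting = sorted(
        ((name, score) for name, score in checks.items() if score <= 0.50),
        key=lambda item: item[1],
    )
    helping = sorted(
        ((name, score) for name, score in checks.items() if score >= 0.80),
        key=lambda item: item[1],
        reverse=True,
    )
    return hurting, helping
```

For example, given `{"root_mfa": 0.0, "public_buckets": 0.4, "backups": 0.6, "alarms": 0.85, "cloudtrail": 1.0}`, the hurting list leads with `root_mfa`, the helping list leads with `cloudtrail`, and `backups` appears in neither.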
Cloud Credit Bureau
The bureau is a FastAPI service that acts as an independent scoring authority — analogous to how Equifax or TransUnion operate for consumer credit. Clients submit evidence; the bureau scores it, signs the attestation, and appends it to a tamper-evident log.
Attestation Lifecycle
- An organization registers via POST /organizations and receives a one-time API key.
- The CloudHarvester collects evidence from the target AWS account.
- The client submits evidence to POST /attest with their Bearer token.
- The bureau runs the scoring engine, creates the attestation record, and signs it.
- The signed record is appended to a hash-chained log and covered by a Merkle tree.
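The registration step implies that the bureau shows the API key once and stores only a hash of it. A sketch of that pattern using the standard library follows; the function names are hypothetical, and the use of SHA-256 for the stored hash is an assumption, since the source only says keys are hashed.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> tuple[str, str]:
    """Return (plaintext key shown once to the client, hash to persist)."""
    key = secrets.token_urlsafe(32)
    return key, hashlib.sha256(key.encode()).hexdigest()


def check_api_key(presented: str, stored_hash: str) -> bool:
    """Compare a presented Bearer token against the stored hash in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Because only the hash is persisted, a leaked database does not directly expose usable keys.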
Cryptographic Integrity
HMAC-SHA256 Signatures — Each attestation is signed over a concatenation of five fields: evidence hash, timestamp, account ID, previous record hash, and validity expiration. Binding valid_until into the signature prevents the bureau from quietly extending or shortening a score's validity after issuance. Verification uses constant-time comparison to prevent timing attacks.
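A minimal sketch of signing over the five fields with Python's `hmac` module. The field names in `FIELDS`, the `|` delimiter, and the function names are assumptions for illustration; the source states only that the five values are concatenated.

```python
import hashlib
import hmac

# Assumed field names for the five signed values.
FIELDS = ("evidence_hash", "timestamp", "account_id", "prev_hash", "valid_until")


def sign(record: dict, key: bytes) -> str:
    """HMAC-SHA256 over the five fields, joined with an assumed '|' delimiter."""
    message = "|".join(str(record[field]) for field in FIELDS).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(record: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(record, key), signature)
```

Changing `valid_until` on a stored record after issuance makes `verify` return `False`, which is the property the paragraph above describes.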
Hash-Chained Log — Every record includes the SHA-256 hash of its predecessor, creating a chain back to a genesis record. Any deletion, insertion, or reordering breaks the chain at the tampered point. The GET /chain/verify endpoint walks the full log and confirms integrity.
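The chain walk could look roughly like this. The all-zero `GENESIS` value, the canonical-JSON hashing, and the record layout are assumptions for the sketch, not details confirmed from `store.py`.

```python
import hashlib
import json

# Assumed placeholder for the hash that the first record points back to.
GENESIS = "0" * 64


def record_hash(record: dict) -> str:
    """SHA-256 over a canonical (sorted-key) JSON encoding of the record."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def append(log: list, payload: dict) -> None:
    """Append a record that embeds the hash of its predecessor."""
    prev = record_hash(log[-1]) if log else GENESIS
    log.append({"prev_hash": prev, **payload})


def verify_chain(log: list) -> bool:
    """Walk the log and confirm each record points at the one before it."""
    expected = GENESIS
    for record in log:
        if record["prev_hash"] != expected:
            return False
        expected = record_hash(record)
    return True
```

A chain walk on its own cannot detect truncation of the newest records or edits to the final record, which is one reason to pair it with per-record signatures and an anchored Merkle root.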
Score Expiry
Attestations carry a configurable TTL (default: 90 days). The API returns expiry metadata with every score query — including days_remaining, expired status, and an expiring_soon warning when fewer than 14 days remain. This incentivizes continuous monitoring without applying artificial score degradation.
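The expiry metadata is simple date arithmetic. In this sketch the field names `days_remaining`, `expired`, and `expiring_soon` come from the description above, while the function name, its parameters, and the choice to report zero days once expired are assumptions.

```python
from datetime import datetime, timedelta, timezone


def expiry_info(issued_at: datetime, now: datetime,
                ttl_days: int = 90, warn_days: int = 14) -> dict:
    """Compute expiry metadata for an attestation issued at `issued_at`."""
    valid_until = issued_at + timedelta(days=ttl_days)
    remaining = (valid_until - now).days  # whole days left; negative once expired
    expired = now >= valid_until
    return {
        "valid_until": valid_until.isoformat(),
        "days_remaining": max(remaining, 0),  # assumed: never report negative days
        "expired": expired,
        "expiring_soon": not expired and remaining < warn_days,
    }
```

An attestation issued 80 days ago would report 10 days remaining with `expiring_soon` set, while the score itself stays unchanged until expiry.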
Bureau API Endpoints
| Endpoint | Auth | Purpose |
|---|---|---|
| POST /organizations | None | Register an org, receive API key |
| GET /organizations/{id} | None | Look up a registered org |
| POST /attest | Bearer | Score evidence, return signed attestation |
| GET /score/{id}/latest | Bearer | Most recent score with expiry info |
| GET /score/{id}/status | Bearer | Lightweight expiry check |
| GET /history/{id} | Bearer | Full attestation history |
| GET /verify/{id} | None | Verify HMAC signature of one record |
| GET /chain/verify | None | Verify hash-chain integrity across full log |
| GET /merkle/root | None | Current Merkle root + latest anchor |
| GET /proof/{id} | None | Merkle inclusion proof for one attestation |
| GET /merkle/anchor/latest | None | Full RFC 3161 anchor record |
| GET /merkle/anchor/verify/{root} | None | Re-verify stored TSR via openssl |
Public endpoints (verification, chain integrity, Merkle proofs) require no authentication — any third party can audit the log.
What This Is Not
- Not another CVE database — StatlerScore evaluates infrastructure posture against the Well-Architected Framework, not individual vulnerability records.
- Not an external attack surface scanner — StatlerScore evaluates internal configuration with direct AWS API access and produces cryptographically verifiable attestations.
- Not a cost optimization tool — The scoring framework is scoped exclusively to security risk quantification.
Future Work
- Jira integration for automated security debt ticketing based on "hurting" factors
- Multi-cloud support extending the collector framework beyond AWS
- Decay factor applying continuous score degradation (e.g., 0.5%/day) as an alternative to binary TTL expiry
- Dashboard UI with historical trend visualization beyond the current CLI and matplotlib outputs