Xianpeng Shen

Posted on • Originally published at devops-maturity.github.io

Measuring DevOps maturity without a consultant: an open-source, automatable baseline

Maturity models usually mean a spreadsheet and a consultant. I wanted the
opposite: something I could pip install, run in 60 seconds, drop into CI,
and share as a badge — covering the whole DevOps lifecycle, not just one
slice of it.

That's DevOps Maturity — an open-source spec plus tooling (CLI, web UI,
GitHub Action) built around a weighted checklist.

The gap it fills

There are great tools that go deep on one axis, but none give a quick,
broad "where do we stand" baseline:

| Tool | Focus | Note |
| --- | --- | --- |
| DORA metrics | delivery outcomes | measures results, not practices |
| OpenSSF Scorecard | OSS security health | security-only, public repos |
| SLSA | supply-chain integrity | deep & narrow (we map to it) |
| DevOps Maturity | practices in place, end-to-end | build, quality, security, supply chain, analysis, reporting |

They're complements, not competitors. DevOps Maturity is the fast first
pass that tells you which deep tool to reach for next.

60-second try

pip install devops-maturity
dm assess

You get an overall score, a level (WIP → PASSING → BRONZE → SILVER → GOLD),
per-category scores, prioritized recommendations, and a badge URL.
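
To make the "weighted checklist" idea concrete, here is a minimal Python sketch of how an overall score, per-category scores, and a level could be derived from yes/no answers. Everything specific in it is an assumption for illustration: the `Criterion` class, the example weights, and the percentage thresholds are hypothetical and are not taken from the DevOps Maturity spec. Only the level names and their order come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One checklist item (hypothetical structure, not the tool's own)."""
    category: str
    weight: float
    met: bool


# Assumed thresholds in percent, highest first. The real cut-offs may differ.
LEVEL_THRESHOLDS = [(90, "GOLD"), (75, "SILVER"), (60, "BRONZE"), (40, "PASSING")]


def weighted_score(criteria):
    """Return the share of total weight that is satisfied, as a percentage."""
    total = sum(c.weight for c in criteria)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c in criteria if c.met)
    return 100.0 * earned / total


def level_for(percent):
    """Map a percentage to a level name; anything below the lowest cut-off is WIP."""
    for threshold, name in LEVEL_THRESHOLDS:
        if percent >= threshold:
            return name
    return "WIP"


def per_category_scores(criteria):
    """Group criteria by category and score each group on its own."""
    groups = {}
    for c in criteria:
        groups.setdefault(c.category, []).append(c)
    return {category: weighted_score(items) for category, items in groups.items()}


# Example answers with made-up weights.
answers = [
    Criterion("build", 3, True),
    Criterion("quality", 2, True),
    Criterion("security", 3, False),
    Criterion("supply chain", 2, True),
]

overall = weighted_score(answers)
print(overall, level_for(overall))
print(per_category_scores(answers))
```

With these made-up numbers, the unmet security item is what holds the score back, which is the kind of signal a prioritized recommendation would be built on.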

Make it repeatable

Keep a devops-maturity.yml in the repo and run it in CI:

dm config --file devops-maturity.yml --format json

…or let the GitHub Action re-assess and keep the badge current on every
change.
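
If you want the CI job to fail when the level drops, one option is a small gate script around the level ordering (WIP → PASSING → BRONZE → SILVER → GOLD). The sketch below is an assumption about how such a gate could look, not part of the tool: `meets_minimum` and `gate` are hypothetical helpers, and reading the level out of the JSON output is deliberately left out because the output layout is not described here.

```python
# Level names in ascending order, as listed in the article.
LEVEL_ORDER = ["WIP", "PASSING", "BRONZE", "SILVER", "GOLD"]


def meets_minimum(level, minimum):
    """True if `level` is at or above `minimum` in LEVEL_ORDER.

    Raises ValueError for an unknown level name, so typos fail loudly.
    """
    return LEVEL_ORDER.index(level.upper()) >= LEVEL_ORDER.index(minimum.upper())


def gate(level, minimum="BRONZE"):
    """Return a process exit code: 0 when the bar is met, 1 otherwise."""
    return 0 if meets_minimum(level, minimum) else 1


# Example: a repo assessed as SILVER passes a BRONZE bar, PASSING does not.
print(gate("SILVER", "BRONZE"))
print(gate("PASSING", "BRONZE"))
```

In a pipeline, you would pass the result of `gate(...)` to `sys.exit` so that a non-zero code fails the job.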

Skip the questionnaire (AI auto-mode)

ANTHROPIC_API_KEY=... devops-maturity assess --auto --ai anthropic

It infers the answers from the repo's README, CI config and file tree.
Works with OpenAI / Anthropic / Gemini, or fully local via Ollama.
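
For a rough idea of what "README, CI config and file tree" amounts to as model input, here is a sketch that gathers those three things from a checkout. It is not the tool's implementation: `collect_repo_context`, the list of CI file patterns, and the size limits are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed locations of common CI configs. The real tool may look elsewhere.
CI_PATTERNS = [
    ".github/workflows/*.yml",
    ".github/workflows/*.yaml",
    ".gitlab-ci.yml",
    "Jenkinsfile",
]


def collect_repo_context(root, max_chars=4000, max_files=200):
    """Gather README text, CI config text, and a file listing from `root`.

    Text is truncated to `max_chars` and the listing to `max_files`
    so the result stays small enough to put in a prompt.
    """
    root = Path(root)

    def read(path):
        return path.read_text(encoding="utf-8", errors="replace")[:max_chars]

    readme = next((p for p in sorted(root.glob("README*")) if p.is_file()), None)
    ci_files = sorted(
        {p for pattern in CI_PATTERNS for p in root.glob(pattern) if p.is_file()}
    )
    tree = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        # Skip the contents of the .git directory itself.
        if p.is_file() and ".git" not in p.relative_to(root).parts
    )
    return {
        "readme": read(readme) if readme else "",
        "ci_config": {p.relative_to(root).as_posix(): read(p) for p in ci_files},
        "file_tree": tree[:max_files],
    }
```

Calling `collect_repo_context(".")` from a repository root returns a plain dictionary that could be serialized into a prompt. How the answers are then inferred from it is up to the chosen model.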

It's early — tell me where it's wrong

Apache-2.0, Python, built mostly solo. I'd love feedback on the criteria
and weights specifically: anything missing, mis-weighted, or just wrong?
