Debasish Mohanty

Posted on Mar 13

Building an Enterprise Release Governance Platform for Kubernetes (DevSecOps + CI/CD)

#kubernetes #security #cloudnative #devops

Introduction

Modern CI/CD pipelines often rely heavily on test results to decide whether a release should proceed.

However, in real production environments, passing tests alone do not guarantee a safe release.

For example, a deployment may pass all tests while:

the Kubernetes cluster is already under pressure
other services are crashlooping
security vulnerabilities exist in its dependencies

To explore this problem, I built a platform called Enterprise Release Governance System (ERGS).

The goal of ERGS is to transform a traditional CI/CD pipeline into a release intelligence system that evaluates multiple signals before allowing a release.

The Problem with Traditional CI/CD

Typical pipelines usually follow a pattern like this:

Run tests
Build artifacts
Deploy

This approach ignores important signals such as:

security vulnerabilities
dependency risks
cluster health
runtime platform stability

In large systems, releasing without considering these signals can introduce serious operational risks.

What is ERGS?

Enterprise Release Governance System (ERGS) is a governance layer on top of CI/CD pipelines.

It integrates multiple validation layers and generates a final release decision:

GO → Safe to release
HOLD → Manual review required
NO-GO → Release blocked

The entire system runs inside GitHub Actions pipelines and produces consolidated reports.

Project repository:

👉 https://github.com/Debasish-87/ReleaseGuard

ERGS High-Level Architecture

                     ┌─────────────────────────┐
                     │     Developer Commit     │
                     │     GitHub Repository    │
                     └─────────────┬───────────┘
                                   │
                                   ▼
                       ┌────────────────────┐
                       │   GitHub Actions    │
                       │  CI/CD Pipeline     │
                       └─────────┬───────────┘
                                 │
                                 ▼

        ┌───────────────────────────────────────────────────┐
        │              RELEASE GOVERNANCE PIPELINE           │
        └───────────────────────────────────────────────────┘

      ┌───────────────┐
      │ Layer 1       │
      │ Application   │
      │ Testing       │
      │ (Allure)      │
      └──────┬────────┘
             │
             ▼

      ┌───────────────┐
      │ Layer 2       │
      │ DevSecOps     │
      │ Security Scan │
      │               │
      │ Semgrep       │
      │ Trivy         │
      │ Gitleaks      │
      └──────┬────────┘
             │
             ▼

      ┌───────────────┐
      │ Layer 3       │
      │ SBOM &        │
      │ Dependency    │
      │ Security      │
      │               │
      │ Syft          │
      │ Grype         │
      └──────┬────────┘
             │
             ▼

      ┌───────────────┐
      │ Layer 4       │
      │ Kubernetes    │
      │ Platform      │
      │ Validation    │
      │ (KPQE)        │
      │               │
      │ Node checks   │
      │ Pod health    │
      │ Crashloops    │
      └──────┬────────┘
             │
             ▼

      ┌───────────────┐
      │ Layer 5       │
      │ Release       │
      │ Dashboard     │
      │               │
      │ Consolidated  │
      │ Reports       │
      └──────┬────────┘
             │
             ▼

      ┌───────────────┐
      │ Layer 6       │
      │ Final         │
      │ Decision      │
      │ Engine        │
      │               │
      │ GO / HOLD     │
      │ / NO-GO       │
      └──────┬────────┘
             │
             ▼

      ┌───────────────────────────┐
      │ GitHub Pages Reports      │
      │                           │
      │ /allure                   │
      │ /security                 │
      │ /sbom                     │
      │ /kpqe                     │
      │ /dashboard                │
      │ /decision                 │
      └───────────────────────────┘

System Architecture

The pipeline evaluates each release in six layers.

Layer 1 — Automated Testing

Application tests are executed and the results are published as Allure reports.

Outputs include:

Allure HTML report
test execution summary
testing intelligence signals
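
To make the idea of a test execution summary concrete, here is a minimal sketch that reduces Allure statistics to a single test signal. It assumes the counts come from the `statistic` object in the `widgets/summary.json` file of a generated Allure report. The function name `summarize_tests` and the output fields are hypothetical and are not taken from ERGS.

```python
from typing import Any


def summarize_tests(summary: dict[str, Any]) -> dict[str, Any]:
    """Reduce Allure-style statistics to a compact test signal.

    `summary` is assumed to look like Allure's widgets/summary.json, i.e.
    {"statistic": {"passed": .., "failed": .., "broken": .., "total": ..}}.
    """
    stats = summary.get("statistic", {})
    total = stats.get("total", 0)
    passed = stats.get("passed", 0)
    # Allure reports assertion failures as "failed" and unexpected errors as "broken".
    not_ok = stats.get("failed", 0) + stats.get("broken", 0)
    return {
        "total": total,
        "passed": passed,
        "failedOrBroken": not_ok,
        "passRate": round(passed / total, 3) if total else 0.0,
        # An empty run is treated as a failure rather than a silent pass.
        "status": "passed" if total > 0 and not_ok == 0 else "failed",
    }


if __name__ == "__main__":
    example = {"statistic": {"passed": 48, "failed": 1, "broken": 1, "skipped": 0, "total": 50}}
    print(summarize_tests(example))
```

Treating an empty test run as a failure is a design choice in this sketch: a pipeline that ran zero tests should not look identical to one that ran all of them successfully.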

Layer 2 — DevSecOps Security Scans

Security analysis is performed using several tools:

Semgrep → static code analysis
Trivy → vulnerability scanning
Gitleaks → secret detection

These tools identify potential security risks before deployment.
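
As an illustration of how scan output could become a release signal, the sketch below counts findings per severity in a Trivy report. It assumes Trivy was run with `--format json`, which produces a top-level `Results` list whose entries may carry a `Vulnerabilities` list. The helper names and the mapping to `clean`, `warnings`, and `critical` are assumptions made for this example; Semgrep and Gitleaks reports would need their own parsers.

```python
from collections import Counter
from typing import Any


def count_trivy_severities(report: dict[str, Any]) -> Counter:
    """Count vulnerabilities per severity in a Trivy JSON report."""
    counts: Counter = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" is absent (or null) for targets with no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def security_status(counts: Counter) -> str:
    """Map severity counts to a coarse signal (thresholds are illustrative)."""
    if counts["CRITICAL"] > 0:
        return "critical"
    if counts["HIGH"] > 0 or counts["MEDIUM"] > 0:
        return "warnings"
    return "clean"


if __name__ == "__main__":
    report = {
        "Results": [
            {"Target": "app/requirements.txt", "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
                {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
            ]},
            {"Target": "Dockerfile"},  # no findings for this target
        ]
    }
    counts = count_trivy_severities(report)
    print(dict(counts), security_status(counts))
```

The CVE identifiers in the usage example are placeholders, not real advisories.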

Layer 3 — SBOM Generation

A Software Bill of Materials (SBOM) is generated using Syft.

Dependencies are then scanned for vulnerabilities using Grype.

Outputs include:

CycloneDX SBOM
vulnerability reports

This ensures visibility into software supply chain risks.
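
One way to turn the vulnerability report into something reviewable is to list only the serious matches per package. The sketch below assumes Grype was run with `-o json`, where each entry in `matches` has a `vulnerability` object (with `id` and `severity`) and an `artifact` object (with `name` and `version`). The `BLOCKING` set and the function name are hypothetical choices for this example.

```python
from typing import Any

# Severities treated as release-relevant in this sketch (an assumption).
BLOCKING = {"critical", "high"}


def blocking_findings(grype_report: dict[str, Any]) -> list[tuple[str, str, str, str]]:
    """Return (package, version, vulnerability id, severity) for serious matches."""
    findings = []
    for match in grype_report.get("matches") or []:
        vuln = match.get("vulnerability", {})
        artifact = match.get("artifact", {})
        severity = str(vuln.get("severity", "Unknown"))
        # Compare case-insensitively so "Critical" and "CRITICAL" are both caught.
        if severity.lower() in BLOCKING:
            findings.append((
                artifact.get("name", "?"),
                artifact.get("version", "?"),
                vuln.get("id", "?"),
                severity,
            ))
    return sorted(findings)


if __name__ == "__main__":
    report = {"matches": [
        {"vulnerability": {"id": "CVE-0000-0003", "severity": "Critical"},
         "artifact": {"name": "examplelib", "version": "1.2.0"}},
        {"vulnerability": {"id": "CVE-0000-0004", "severity": "Low"},
         "artifact": {"name": "otherlib", "version": "0.9.1"}},
    ]}
    for package, version, vuln_id, severity in blocking_findings(report):
        print(f"{package} {version}: {vuln_id} ({severity})")
```

The package names and CVE identifiers above are made up for the example.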

Layer 4 — Kubernetes Platform Validation

Before approving a release, the pipeline validates cluster health.

The platform checks:

node readiness
pod crashloops
restart risk signals
overall cluster health

This stage is implemented using Kubernetes Platform Quality Engineering (KPQE).
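
The post does not show how KPQE implements these checks, so the following is only a sketch of what node readiness and crashloop detection could look like. It assumes the input is the parsed output of `kubectl get nodes -o json` and `kubectl get pods -A -o json`. The function names and the `restart_threshold` default are hypothetical.

```python
from typing import Any


def not_ready_nodes(node_list: dict[str, Any]) -> list[str]:
    """Names of nodes whose Ready condition is not "True"."""
    bad = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            bad.append(node["metadata"]["name"])
    return bad


def risky_pods(pod_list: dict[str, Any], restart_threshold: int = 5) -> dict[str, list[str]]:
    """Split unhealthy pods into crashlooping and high-restart groups."""
    crashlooping, restarting = [], []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        name = f'{meta.get("namespace", "default")}/{meta["name"]}'
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # A container stuck in a crash loop is reported as waiting with this reason.
        waiting = [(s.get("state", {}).get("waiting") or {}).get("reason") for s in statuses]
        if "CrashLoopBackOff" in waiting:
            crashlooping.append(name)
        elif any(s.get("restartCount", 0) >= restart_threshold for s in statuses):
            restarting.append(name)
    return {"crashlooping": crashlooping, "highRestarts": restarting}
```

Keeping the checks as pure functions over parsed JSON means they can be unit tested without a live cluster. In a pipeline, the input could be produced by running `kubectl` in an earlier step and loading the result with `json.load`.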

Layer 5 — Release Intelligence Dashboard

All signals are merged into a consolidated dashboard.

The dashboard provides:

release summary
testing insights
security scan results
SBOM vulnerability status
Kubernetes readiness signals
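
As a rough sketch of the merge step, the function below renders per-layer signals as an aligned plain-text summary. The signal names, the status strings, and the release label `v1.4.0` are placeholders for this example and say nothing about how the actual ERGS dashboard is built.

```python
def build_dashboard_summary(release: str, signals: dict[str, str]) -> str:
    """Render one line per signal, with status values aligned in a column."""
    lines = [f"Release: {release}"]
    # Pad every signal name to the longest one so the statuses line up.
    width = max((len(name) for name in signals), default=0)
    for name, status in signals.items():
        lines.append(f"{name.ljust(width)}  {status}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_dashboard_summary("v1.4.0", {
        "tests": "passed",
        "security": "clean",
        "sbom": "clean",
        "cluster": "healthy",
    }))
```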

Layer 6 — Final Decision Engine

Finally, a decision engine evaluates governance rules.

Example logic:

GO

tests passed
no critical vulnerabilities
cluster health acceptable

HOLD

tests pass but security warnings exist

NO-GO

tests fail
critical vulnerabilities detected
cluster validation fails

The system generates a final output:

final-decision.json

Example Release Governance Decision

A typical decision might look like this:

{
  "releaseDecision": "GO",
  "tests": "passed",
  "security": "clean",
  "clusterStatus": "healthy"
}

This provides a clear automated decision for CI/CD pipelines.
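
The example logic and the output format above can be sketched as a small rule function. The field names follow the example decision; the input values `"warnings"` and `"critical"` for the security signal are assumptions, since the post only shows `"clean"`.

```python
import json


def decide(tests: str, security: str, cluster_status: str) -> dict[str, str]:
    """Apply the GO / HOLD / NO-GO rules to three release signals."""
    # NO-GO: tests fail, critical vulnerabilities detected, or cluster validation fails.
    if tests != "passed" or security == "critical" or cluster_status != "healthy":
        decision = "NO-GO"
    # HOLD: tests pass but security warnings exist.
    elif security == "warnings":
        decision = "HOLD"
    # GO: tests passed, no critical vulnerabilities, cluster health acceptable.
    else:
        decision = "GO"
    return {
        "releaseDecision": decision,
        "tests": tests,
        "security": security,
        "clusterStatus": cluster_status,
    }


if __name__ == "__main__":
    print(json.dumps(decide("passed", "clean", "healthy"), indent=2))
```

NO-GO is checked first so that a blocking condition always wins over a warning. In a pipeline, the result could be written to `final-decision.json`, and the step could exit with a non-zero status on NO-GO so that the workflow stops.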

Why Release Governance Matters

Modern production environments are extremely complex.

A safe release decision should consider:

code quality
security posture
supply chain risks
infrastructure health

By combining these signals, we can significantly improve release reliability.

Demo

Demo video:

https://youtu.be/rC9K4sqsgE0

Project repository:

https://github.com/Debasish-87/ReleaseGuard

Conclusion

CI/CD pipelines are excellent for automation, but they often lack governance intelligence.

A governance layer like ERGS allows teams to make smarter release decisions by combining:

testing signals
security scans
dependency analysis
Kubernetes cluster health

Instead of relying only on tests passing, releases can be evaluated with a holistic risk perspective.

💬 I'm curious how other teams handle release governance in Kubernetes environments.

Do you validate cluster health before deployments, or rely only on CI/CD pipeline checks?
