jindy zhao

Originally published at kv-shepherd.io

Why KubeVirt Needs a Governance Layer — and How We Built One


KubeVirt puts VMs on Kubernetes. But it leaves a different set of questions
open: who can request a VM, who approves it, where are the quotas enforced,
and where is the audit trail?

KubeVirt Shepherd is an open-source
governance platform for KubeVirt, designed from the start around approval
workflows, RBAC, and audit logging as foundational architecture. The project
grew out of internal use in a financial-services Kubernetes environment and is
Apache 2.0 licensed.


The Problem

In most KubeVirt environments, the VM lifecycle looks like this:

Developer wants a VM → kubectl apply → done.

This works for dev clusters. In production, harder questions surface:

  • Who approved this VM? KubeVirt has no built-in approval flow.
  • Which team owns it? No resource-to-team mapping.
  • A VM has been idle for three months — who cleans it up? No lifecycle governance.
  • Security incident — where are the operation records? No platform-level audit trail.
  • Multiple clusters — how do you enforce policy consistently? No unified governance plane.

The existing options each come with trade-offs:

| Approach | Trade-off |
| --- | --- |
| OpenShift Virtualization | Full-featured, but tightly coupled to the OpenShift ecosystem |
| Raw kubectl + K8s RBAC | Too low-level: no approval flow, no self-service UI |
| Build your own portal | Ongoing development and maintenance cost |

What Shepherd Does

| Capability | Description |
| --- | --- |
| Approval workflows | Every VM operation (create, modify, start, stop, delete) goes through a structured request → approve → deliver flow |
| Dual-layer RBAC | Platform-level roles plus System → Service → VM membership inheritance, with environment scoping |
| Audit trail | Every resource change records the actor, timestamp, and payload |
| Multi-cluster | Manage VMs across multiple Kubernetes/KubeVirt clusters from one control plane |
| Console access | Browser-based VNC and serial console with approval-aware entrypoints |
| i18n | Chinese and English UI included |
| Auth provider plugins | SDK for LDAP, OIDC, and custom identity source integrations |

Architecture Choices

Web UI (React 19 · Next.js 16)
  ↓ REST / WebSocket
Go Backend (Gin · Ent ORM · River Queue)
  ↓
PostgreSQL 18 (single data store)
  ↓
Kubernetes / KubeVirt Clusters (client-go · multi-cluster)

A few decisions that shaped the project:

PostgreSQL-only runtime

Shepherd deliberately avoids Redis and external message queues. PostgreSQL
handles business state, audit data, encrypted credentials, and background jobs
(via River).

The practical benefit: async tasks and business data commit in the same
database transaction — either both succeed or both roll back. This avoids the
partial-failure scenarios that are common when a separate message queue is
involved (e.g., a quota is deducted but the VM never gets created).
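
To make the atomicity argument concrete, here is a toy in-memory model of a transaction that stages a quota deduction and a background job together. It does not talk to PostgreSQL or River; the `Store`, `Tx`, and `Job` types are hypothetical stand-ins that only illustrate the both-or-neither behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Job is a hypothetical background task, e.g. "create this VM".
type Job struct{ Kind, VM string }

// Store is a toy stand-in for the single database: it holds business
// state (remaining quota) and the job queue side by side.
type Store struct {
	Quota int
	Jobs  []Job
}

// Tx stages changes and applies them to the Store only on Commit.
type Tx struct {
	store      *Store
	quotaDelta int
	jobs       []Job
}

// Begin starts a new staged transaction against the store.
func (s *Store) Begin() *Tx { return &Tx{store: s} }

// DeductQuota stages a deduction; it fails if quota would go negative.
func (t *Tx) DeductQuota(n int) error {
	if t.store.Quota+t.quotaDelta-n < 0 {
		return errors.New("quota exceeded")
	}
	t.quotaDelta -= n
	return nil
}

// Enqueue stages a job in the same transaction as the quota change.
func (t *Tx) Enqueue(j Job) error {
	if j.VM == "" {
		return errors.New("job needs a VM name")
	}
	t.jobs = append(t.jobs, j)
	return nil
}

// Commit applies the quota change and the staged jobs together.
func (t *Tx) Commit() {
	t.store.Quota += t.quotaDelta
	t.store.Jobs = append(t.store.Jobs, t.jobs...)
}

// RequestVM deducts quota and enqueues the creation job as one unit.
// If either step fails, the Tx is dropped and nothing becomes visible.
func RequestVM(s *Store, vm string, cost int) error {
	tx := s.Begin()
	if err := tx.DeductQuota(cost); err != nil {
		return err
	}
	if err := tx.Enqueue(Job{Kind: "create_vm", VM: vm}); err != nil {
		return err // the staged quota deduction is discarded too
	}
	tx.Commit()
	return nil
}

func main() {
	s := &Store{Quota: 2}

	// Enqueue fails after the deduction was staged: quota stays untouched.
	err := RequestVM(s, "", 1)
	fmt.Println(err, s.Quota, len(s.Jobs))

	// Both steps succeed: quota and queue change together.
	err = RequestVM(s, "vm-a", 1)
	fmt.Println(err, s.Quota, len(s.Jobs))
}
```

In a real deployment the same shape would be a single PostgreSQL transaction. River documents a transactional insert for this purpose, so the job row and the business rows can share one commit.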

The operational benefit: one database to back up, monitor, and scale, with no
Redis or RabbitMQ services to deploy and keep in sync alongside PostgreSQL.

Contract-first API

The OpenAPI spec is the single source of truth for both the Go backend and the
TypeScript frontend. Server types and client types are generated from the spec.
A CI gate blocks merges if the generated code drifts from the spec.

Architecture Decision Records

The project maintains 53 ADRs that document key decisions — ORM selection,
async model, transaction strategy, concurrency patterns, and more. ADRs are
immutable once accepted; changes require a new ADR that supersedes the old one.
CI gates enforce compliance with active ADRs.

This matters because governance decisions tend to erode over time as a codebase
grows. Making them explicit and enforceable keeps the architecture consistent
as the project evolves.


How Shepherd Compares to OpenShift Virtualization

Shepherd and OpenShift Virtualization operate at different levels:

| Dimension | OpenShift Virtualization | Shepherd |
| --- | --- | --- |
| Scope | Full enterprise virtualization platform | Governance layer for KubeVirt VMs |
| Multi-cluster | Requires RHACM | Built in |
| Approval workflows | Available | Core architecture |
| Self-service model | Operator-driven | Request → approve → deliver |
| Vendor dependency | OpenShift ecosystem | Any Kubernetes distribution |
| License | Commercial | Apache 2.0 |

If your team is already on OpenShift and satisfied with its VM governance,
that stack likely covers your needs. If you run vanilla KubeVirt and want a
governance layer without platform lock-in, Shepherd may be worth a look.


Try It

Online demo (no setup)

Open demo.kv-shepherd.io in your browser. The
instance is pre-seeded with sample data. You can walk through the full flow:
log in, browse VMs, submit a request, approve it, and check the audit log.

Self-hosted (Docker Compose)

Run the deploy script from a fresh directory on a VPS or local machine:

mkdir -p shepherd-deploy && cd shepherd-deploy
curl -fsSL https://raw.githubusercontent.com/kv-shepherd/shepherd/main/deploy/prod/deploy-prod.sh | \
  bash -s -- --release-images --with-seed

Helm charts are also available for Kubernetes-native installs:

helm repo add shepherd https://kv-shepherd.github.io/helm-charts
helm repo update
helm upgrade --install shepherd shepherd/shepherd \
  --namespace shepherd --create-namespace

See docs/DEPLOYMENT.md
for external PostgreSQL, domain/TLS configuration, and the security checklist.


Current Status

Shepherd is in Alpha. The core governance paths — approval workflows, RBAC,
audit trails, VM lifecycle management — have been validated through internal
production use. The Alpha label reflects deliberate caution while broader
external feedback is gathered.

What is planned next:

  • Finish live E2E validation across all major paths
  • Harden deployment documentation and upgrade guidance
  • Keep the PostgreSQL-only runtime baseline through V1

Features tracked as RFCs for future versions include VM snapshots, clone
workflows, external approval system adapters, and event archiving. See
ROADMAP.md.


Get Involved

Shepherd is a solo-maintained project at this stage. All forms of participation
are welcome — code, bug reports, documentation, or simply sharing your
experience:

  • Try the demo and share your impressions
  • Report bugs or request features via GitHub Issues
  • Join the conversation on Discord
  • Star the repo if you find it useful — it helps with visibility

The project is Apache 2.0 licensed. Contributions follow the
DCO sign-off model.

