Gregory Griffin

We Rebuilt the Kubernetes Dashboard From Scratch — Here's What We Added



The official Kubernetes Dashboard was archived in January 2026. The Angular codebase was unmaintainable, the toolchain was fragile, and the upstream maintainers called it quits. We needed a dashboard for our home lab security platform — so we built one from scratch.

This is the story of what we built, why we made the decisions we did, and what makes it different from the alternatives.


Why Not Headlamp, Lens, or Rancher?

We evaluated the obvious alternatives:

  • Lens — desktop app, not in-cluster, commercial freemium model. Not suitable for a hardened in-cluster deployment.
  • Headlamp — great project, but its extensibility model relies on dynamically loaded JavaScript plugins. We preferred a dashboard with no plugin runtime and a fully auditable, static codebase.
  • Rancher — full multi-cluster platform, overkill for a single cluster, and it carries its own namespace footprint.

We wanted something that runs in-cluster as a hardened pod, has no external dependencies beyond the Kubernetes API, and is auditable — no plugin runtime, no dynamic code loading.

One more thing the original couldn't do: work with MetalLB. It was designed around kubectl proxy and NodePort and was never intended to sit behind a real in-cluster gateway. Getting it onto a MetalLB IP meant putting Kong in front as the gateway, a setup the original never supported.

So we forked the archived dashboard and rebuilt it.


The Stack

Frontend: React 19 + Material UI v6 + Vite + TypeScript

Backend: Four Go modules — API, Auth, Metrics Scraper, Common

Gateway: Kong 3.6 (DBless) as the in-cluster API gateway

Auth: Bearer token (Kubernetes Service Account) with CSRF protection

Deploy: Raw Kubernetes manifests — no Helm

The original Go backend contracts were preserved. Only the UI layer was replaced.


What We Shipped Beyond "Standard Dashboard"

The usual dashboard shows you workload lists, pod logs, and a shell. We added ten native features that normally require separate tools:

1. Policy Audit (Polaris-native)

14 security checks against live pod specs — no Polaris deployment needed:

  • Privilege escalation allowed?
  • Host network / host PID / host IPC?
  • Resource limits set?
  • Readiness and liveness probes defined?
  • Image tag is latest?
  • seccompProfile configured?

Each workload gets a score from 0 to 100, and each namespace gets an aggregate score. Results are filterable by severity (Danger / Warning / Pass).

The entire thing is a single Go package reading pod specs directly from the Kubernetes API. No admission webhooks, no third-party operators.
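To make the "single Go package reading pod specs" idea concrete, here is a minimal sketch of a few of the checks. The struct fields and the scoring formula are illustrative assumptions; the real dashboard reads corev1.PodSpec from the Kubernetes API rather than these simplified mirrors.

```go
package main

import "fmt"

// Simplified local mirrors of the pod-spec fields the checks read
// (stand-ins for the real corev1 types).
type Container struct {
	Image                    string
	AllowPrivilegeEscalation *bool
	HasLimits                bool
	HasReadinessProbe        bool
}

type PodSpec struct {
	HostNetwork bool
	Containers  []Container
}

type Finding struct {
	Check    string
	Severity string // "danger" | "warning" | "pass"
}

// audit runs a few Polaris-style checks and returns findings plus a 0-100 score.
func audit(spec PodSpec) ([]Finding, int) {
	var fs []Finding
	add := func(check string, ok bool, sev string) {
		if ok {
			fs = append(fs, Finding{check, "pass"})
		} else {
			fs = append(fs, Finding{check, sev})
		}
	}
	add("hostNetwork disabled", !spec.HostNetwork, "danger")
	for _, c := range spec.Containers {
		escBlocked := c.AllowPrivilegeEscalation != nil && !*c.AllowPrivilegeEscalation
		add("privilege escalation blocked", escBlocked, "danger")
		add("resource limits set", c.HasLimits, "warning")
		add("readiness probe defined", c.HasReadinessProbe, "warning")
		add("image tag is not :latest", !hasLatestTag(c.Image), "warning")
	}
	passed := 0
	for _, f := range fs {
		if f.Severity == "pass" {
			passed++
		}
	}
	return fs, passed * 100 / len(fs)
}

// hasLatestTag reports whether the image resolves to :latest
// (an untagged image also defaults to latest).
func hasLatestTag(image string) bool {
	for i := len(image) - 1; i >= 0; i-- {
		if image[i] == ':' {
			return image[i+1:] == "latest"
		}
		if image[i] == '/' {
			break
		}
	}
	return true
}

func main() {
	f := false
	_, score := audit(PodSpec{
		Containers: []Container{{
			Image:                    "nginx:1.27",
			AllowPrivilegeEscalation: &f,
			HasLimits:                true,
			HasReadinessProbe:        true,
		}},
	})
	fmt.Println(score) // all five checks pass: 100
}
```

The per-workload score is just the fraction of passing checks; an aggregate per namespace falls out of averaging these.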

2. Resource Efficiency (Goldilocks-style)

Compares actual CPU/memory usage (from metrics.k8s.io) against container requests and limits:

  • No Limits — the container has no resource limits set (dangerous)
  • Hot — actual usage exceeds requests (risking OOM or CPU throttling)
  • Cold — requests are much larger than actual usage (wasting capacity)
  • OK — requests and limits are reasonable

CSV export for capacity planning. With VictoriaMetrics enabled, trend arrows (↑ ↓ →) show whether usage is growing, shrinking, or stable.
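The bucketing logic above is simple to sketch. The thresholds here (Hot above 100% of requests, Cold below 50%) are illustrative assumptions, not necessarily the dashboard's exact cutoffs:

```go
package main

import "fmt"

// classify buckets a container the way the Resource Efficiency page does,
// comparing actual usage against requests and limits (all in the same unit,
// e.g. millicores or bytes). Thresholds are assumed for illustration.
func classify(usage, request, limit int64) string {
	switch {
	case limit == 0:
		return "No Limits" // dangerous: nothing caps the container
	case request > 0 && usage > request:
		return "Hot" // risking OOM or throttling
	case request > 0 && usage*2 < request:
		return "Cold" // requested far more than it uses
	default:
		return "OK"
	}
}

func main() {
	fmt.Println(classify(300, 250, 500)) // usage exceeds request: Hot
	fmt.Println(classify(50, 250, 500))  // under half the request used: Cold
	fmt.Println(classify(200, 250, 0))   // No Limits
}
```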

3. Certificate Tracker

Parses every kubernetes.io/tls Secret across all namespaces using Go's stdlib crypto/x509. No cert-manager dependency.

Shows: Common Name, SANs, issuer, expiry date, days remaining, and a status badge (Valid / Warning ≤30d / Critical ≤7d / Expired).

4. RBAC Viewer

Joins all ClusterRoles, Roles, ClusterRoleBindings, and RoleBindings into a flat view. Filters by subject, scope, or kind. Wildcard (*) detection flags any binding that grants broad permissions.
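The wildcard detection is the interesting part; a hedged sketch of the rule-level check (using a simplified stand-in for rbacv1.PolicyRule) might look like:

```go
package main

import "fmt"

// Minimal mirror of rbacv1.PolicyRule for the sketch.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// isWildcard flags rules granting "*" on verbs or resources,
// the condition the RBAC viewer highlights as overly broad.
func isWildcard(r PolicyRule) bool {
	for _, v := range r.Verbs {
		if v == "*" {
			return true
		}
	}
	for _, res := range r.Resources {
		if res == "*" {
			return true
		}
	}
	return false
}

func main() {
	admin := PolicyRule{APIGroups: []string{"*"}, Resources: []string{"*"}, Verbs: []string{"*"}}
	viewer := PolicyRule{Resources: []string{"pods"}, Verbs: []string{"get", "list"}}
	fmt.Println(isWildcard(admin), isWildcard(viewer)) // true false
}
```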

5. AI Assistant (Claude Sonnet)

A chat drawer powered by Anthropic's Claude via SSE streaming. When opened from a pod detail page, it automatically injects the pod's spec, container statuses, and recent events as system context.

Ask it: "Why is this pod crash-looping?" and it reads the events, the image, the environment variables, and the resource limits to give you an actual answer — not a generic Kubernetes doc quote.

6. Event Timeline

Live cluster event feed refreshing every 5 seconds. Events are time-bucketed by minute, Warning events highlighted in amber, filterable by namespace and free text. Useful for watching a rolling restart or debugging a flapping pod without staring at kubectl get events -w.

7. Cluster Map

All Deployments, DaemonSets, and StatefulSets rendered as health cards grouped by namespace — zoom from 40% to 150%, filter by Error/Warning, click through to the workload detail. Useful for a quick visual sweep of the cluster when something is off.

8. Application Projects

Per-namespace project cards showing pod health (Running / Pending / CrashLoopBackOff / Failed), workload counts (Deployments, StatefulSets, DaemonSets), and CPU/memory request totals. Clicking a card navigates to the pod list filtered to that namespace. System namespaces hidden by default with a toggle.

9. Kubescape Security (auto-detected)

If Kubescape Operator is running in the cluster, a Security section appears automatically in the nav — no configuration, no feature flags. It reads Kubescape's CRDs (spdx.softwarecomposition.kubescape.io) and displays:

  • Security Overview — SVG donut charts for config scan pass/fail ratio and CVE severity breakdown, plus a lowest-scoring workloads table. One-page health snapshot before you drill in.
  • Config Scan — compliance score per workload (0–100), filterable by namespace
  • Vulnerabilities — CVE findings per workload with severity breakdown (Critical / High / Medium / Low)

If Kubescape is not installed, the Security section simply doesn't appear.

10. Email Notifications (Microsoft Graph API)

Two notification modes:

  • Health Digest — daily email with cluster health score, namespace table, top issues
  • Event Alerts — real-time email on CrashLoop, OOM kill, ImagePullBackOff, NodeNotReady, PVC issues. 1-hour dedup per workload so you don't get flooded.

No SMTP server needed — sends via Microsoft Graph API (works with any Microsoft 365 / Entra ID account).


Security Hardening

We ran an OWASP Top 10 audit and fixed everything we found:

  • No wildcard RBAC — each pod has its own ServiceAccount with least-privilege rules
  • NetworkPolicy default-deny — every pod has an explicit allow list; all other traffic is denied
  • readOnlyRootFilesystem: true on all containers except Kong (which writes Lua cache)
  • All capabilities dropped (drop: [ALL])
  • seccompProfile: RuntimeDefault on all pods
  • CSRF protection on all mutating POST endpoints
  • Bearer tokens never logged, never cached in plain text
  • API response cache keyed by SHA256(token) + URL — users can never see each other's cached data
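The cache-key scheme in the last bullet is easy to show: the raw bearer token is never stored, only its hash, and the URL scopes the entry. A minimal sketch:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a per-user cache key without ever storing the raw token:
// the token is hashed, then combined with the request URL, so two users
// hitting the same endpoint can never read each other's cached responses.
func cacheKey(bearerToken, url string) string {
	h := sha256.Sum256([]byte(bearerToken))
	return hex.EncodeToString(h[:]) + "|" + url
}

func main() {
	a := cacheKey("token-alice", "/api/v1/pods")
	b := cacheKey("token-bob", "/api/v1/pods")
	fmt.Println(a != b) // true: same URL, different users, different entries
}
```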

RBAC-Aware UI

Every action button (Delete, Scale, Restart, Rollback, Pause) checks the user's actual Kubernetes permissions via SelfSubjectAccessReview before rendering. If the user's token doesn't have patch on daemonsets, the Rollback button is greyed out with a tooltip — not shown at all, not erroring after the click.


Rollback With Revision History

Deployments, DaemonSets, and StatefulSets all have a rollback button in the action bar. Click it and you get a revision picker:

  • Deployments — reads ReplicaSets annotated with deployment.kubernetes.io/revision
  • DaemonSets / StatefulSets — reads ControllerRevision objects (the same mechanism kubectl rollout undo uses)

Select a revision, see the container image and age, click Rollback. Same UX as kubectl rollout undo --to-revision=N but without leaving the browser.


Optional: VictoriaMetrics Backend

The dashboard ships with SQLite as the default metrics cache (same as the original). If you deploy VictoriaMetrics alongside it (one StatefulSet, one PVC — the manifest is included), the metrics scraper starts pushing to it via remote write, and you get:

  • Pod CPU/memory sparklines with time range toggles (1h / 6h / 24h / 7d)
  • Trend arrows on the Resource Efficiency page
  • Historical data surviving pod restarts

If you don't deploy VictoriaMetrics, nothing changes. The capabilities endpoint returns {"victoriaMetrics": false} and the UI shows nothing different.


Gateway API Support

If your cluster runs Istio, Envoy Gateway, Contour, or any other Gateway API implementation, the dashboard automatically detects gateway.networking.k8s.io in the API server's discovery endpoint and adds a Gateway API section to the nav — GatewayClasses, Gateways, HTTPRoutes. If the CRDs aren't there, the section doesn't appear.

No configuration. No feature flags.
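The detection itself reduces to checking the API server's discovery group list for the Gateway API group name. A sketch (in the real backend the group list comes from GET /apis via the discovery client):

```go
package main

import "fmt"

// hasAPIGroup checks a discovered API-group list for a group name,
// the same test that decides whether the Gateway API nav section appears.
func hasAPIGroup(groups []string, want string) bool {
	for _, g := range groups {
		if g == want {
			return true
		}
	}
	return false
}

func main() {
	discovered := []string{"apps", "batch", "gateway.networking.k8s.io"}
	fmt.Println(hasAPIGroup(discovered, "gateway.networking.k8s.io")) // true
}
```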


The Architecture Decision We're Most Proud Of

Everything is native Go — no external operators, no CRD controllers, no plugin runtime.

The policy audit doesn't need Polaris deployed. The cert tracker doesn't need cert-manager. The efficiency analysis doesn't need Goldilocks or VPA. The RBAC viewer doesn't need any extra tooling.

This means:

  • One fewer deployment per feature
  • No version coupling between the dashboard and external tools
  • The entire security surface is one set of pods with auditable source code

For Kubescape, we read its CRDs dynamically — if Kubescape is present, the CVE and compliance data appears. If not, nothing changes. Same pattern for Gateway API.


ISMS Core Integration

The dashboard also exposes a GET /api/v1/summary endpoint — a machine-readable cluster health snapshot (node status, pod counts by phase, recent Warning events, policy audit score, cert expiry counts). An ISMS CORE connector polls this endpoint and maps the data to ISO 27001 controls, so cluster health flows into the wider security posture platform automatically.

This follows the same pattern as the Kubescape and Gateway API integrations: the dashboard is a passive provider of structured data, not a controller.


