War Story: How a Leaked Azure Service Principal Compromised a 2026 AKS Cluster



It was 72 hours before Black Friday 2026, and the DevOps team at FinSecure, a mid-sized fintech startup, was racing to scale its Azure Kubernetes Service (AKS) cluster to handle a projected 10x traffic spike on its payment processing platform. No one on the team expected that a single leaked service principal (SP) would bring the entire cluster to its knees.

The Leak: A Simple Mistake with Massive Consequences

The incident started with a junior developer working on a new CI/CD pipeline for a side project: an internal dashboard to track AKS node health. To test the pipeline, they created an Azure service principal with Contributor permissions on the AKS resource group and a client secret valid for 2 years. In a rush to push code before a deadline, they committed the SP’s client ID and secret to a GitHub repo they thought was private. A misconfigured repository setting had made it public days earlier.

Within 4 hours of the commit, an attacker using automated secret-scanning tools (such as TruffleHog and Gitleaks) picked up the credentials. They validated the SP by authenticating to Azure via the CLI with az login --service-principal -u <client-id> -p <client-secret> --tenant <tenant-id>, and found that the SP had more than enough permissions to access FinSecure’s production AKS cluster.
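
To make the detection step concrete, here is a minimal Python sketch of one heuristic such scanners can use: flagging long, high-entropy tokens in source text. It is a deliberately simplified illustration, not how TruffleHog or Gitleaks is actually implemented; the function names, the 24-character minimum, and the 4.0-bit threshold are all assumptions chosen for the example.

```python
import math
import re

# Candidate tokens: long runs of characters typical of generated secrets.
# The 24-character minimum is an illustrative assumption.
TOKEN_RE = re.compile(r"[A-Za-z0-9~_.\-+/=]{24,}")
# Canonical GUID shape, as used for Azure client and tenant IDs.
GUID_RE = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_suspect_lines(source: str, min_entropy: float = 4.0):
    """Yield (line_number, token) for long, high-entropy tokens in source."""
    for lineno, line in enumerate(source.splitlines(), start=1):
        for token in TOKEN_RE.findall(line):
            if GUID_RE.fullmatch(token):
                continue  # a GUID on its own is an identifier, not a secret
            if shannon_entropy(token) >= min_entropy:
                yield lineno, token
```

Run against a file's contents, find_suspect_lines yields the line numbers worth a human look. A heuristic this crude produces false positives (hashes, encoded assets), which is why real tools pair pattern matching with provider-specific detectors.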

The Attack: From SP to Full Cluster Control

The attacker first enumerated all resources accessible to the SP, using az resource list to find the AKS cluster named finsecure-prod-aks-2026. They retrieved cluster credentials with a single command: az aks get-credentials --resource-group finsecure-prod-rg --name finsecure-prod-aks-2026 --overwrite-existing. Within minutes, they had full kubectl access to the cluster.

Worse, FinSecure’s AKS cluster had no Pod Security Standards enabled, no network policies, and most pods ran as root with no resource limits. The attacker first deployed a Monero crypto miner as a DaemonSet to all nodes, then targeted a pod running the payment API that had unencrypted access to an Azure SQL database storing customer PII. They exfiltrated 12 GB of customer data over 48 hours before triggering any alerts.
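
The gaps described above (pods running as root, no resource limits) can be checked mechanically. The following Python sketch audits a single pod spec, represented as a dict in the shape of the spec field returned by kubectl get pods -o json. The function name and the wording of the findings are hypothetical, and the checks cover only a small part of what Pod Security Standards enforce.

```python
def audit_pod_spec(pod_spec: dict) -> list[str]:
    """Return human-readable findings for one Kubernetes pod spec."""
    findings = []
    pod_ctx = pod_spec.get("securityContext") or {}
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}
        # A container-level runAsNonRoot overrides the pod-level value when set.
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        if run_as_non_root is not True:
            findings.append(f"{name}: runAsNonRoot is not enforced")
        if ctx.get("privileged") is True:
            findings.append(f"{name}: container is privileged")
        # Without limits, one workload (such as a miner) can starve a node.
        limits = (container.get("resources") or {}).get("limits") or {}
        for resource in ("cpu", "memory"):
            if resource not in limits:
                findings.append(f"{name}: no {resource} limit")
    return findings
```

A pod spec with no securityContext and no resources block, which is how most of the pods in this story were configured, would produce three findings per container.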

Detection: A Lucky Break from Microsoft Defender for Cloud

The breach was caught only because FinSecure’s SecOps team had recently enabled Microsoft Defender for Cloud (formerly Azure Defender) with the Defender for Containers plan, which covers AKS. A high-severity alert fired when the SP signed in from an IP address in Kyiv, Ukraine, a location from which no FinSecure employee had ever accessed Azure. The team cross-referenced the SP’s sign-in logs with their repo scanning tool, which found the leaked secret in the public GitHub repo within minutes.
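
The underlying idea, flagging service principal sign-ins from unexpected locations, is simple enough to sketch. The Python below is an illustration only and does not reflect how Defender for Cloud detects anomalies. The record fields (location.countryOrRegion, ipAddress, servicePrincipalId) are assumptions modeled on exported sign-in logs, and the IP addresses are documentation-range placeholders.

```python
def flag_unusual_sign_ins(sign_ins, allowed_countries):
    """Return sign-in records whose country is outside allowed_countries."""
    allowed = {code.upper() for code in allowed_countries}
    flagged = []
    for record in sign_ins:
        location = record.get("location") or {}
        country = (location.get("countryOrRegion") or "").upper()
        # A missing country is treated as suspicious rather than silently passed.
        if country not in allowed:
            flagged.append(record)
    return flagged


if __name__ == "__main__":
    # Hypothetical log records; field names are assumptions, not a fixed schema.
    logs = [
        {"servicePrincipalId": "sp-1", "ipAddress": "203.0.113.10",
         "location": {"countryOrRegion": "US"}},
        {"servicePrincipalId": "sp-1", "ipAddress": "198.51.100.7",
         "location": {"countryOrRegion": "UA"}},
    ]
    for hit in flag_unusual_sign_ins(logs, {"US"}):
        print("unexpected sign-in from", hit["ipAddress"])
```

An allowlist like this is most useful for service principals, which normally authenticate from a small, stable set of locations such as CI runners.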

Remediation: Stopping the Bleed

The first step was revoking the compromised SP’s credentials and deleting the SP entirely. The team then rotated all secrets across the AKS cluster, invalidated all active pod identities, and isolated the cluster from the public internet by updating Network Security Group (NSG) rules. They deleted the malicious DaemonSet and all miner pods, then restored the payment API pod from a clean backup.

Post-incident, the team scanned all 47 internal GitHub repos for leaked secrets, finding 3 additional exposed SPs that were immediately revoked. They also enabled Azure RBAC for Kubernetes Authorization, Pod Security Standards (restricted profile), and Calico network policies for the AKS cluster.
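
As a rough illustration of two of these hardening steps, the Python sketch below generates a namespace labeled for the restricted Pod Security Standard (via the built-in Pod Security Admission labels) and a default-deny NetworkPolicy, then serializes both as a JSON List that kubectl apply -f can consume. How FinSecure actually rolled these out is not known; the namespace name payments and the helper names are hypothetical.

```python
import json


def restricted_namespace(name: str) -> dict:
    """Build a Namespace that enforces the 'restricted' Pod Security Standard."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                "pod-security.kubernetes.io/enforce": "restricted",
                "pod-security.kubernetes.io/warn": "restricted",
            },
        },
    }


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        # An empty podSelector selects every pod in the namespace.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def render(manifests: list[dict]) -> str:
    """Serialize manifests as a single JSON List object."""
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2)


if __name__ == "__main__":
    # "payments" is a hypothetical namespace name used for illustration.
    print(render([restricted_namespace("payments"), default_deny_policy("payments")]))
```

A default-deny egress policy also blocks DNS, so in practice it is followed by narrowly scoped allow rules for each workload's real dependencies.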

Lessons Learned: How to Avoid This Fate

Never commit service principal secrets, connection strings, or keys to code repositories. Use Azure Key Vault to store secrets, and managed identities for Azure resources wherever possible to eliminate secret management entirely.
Enforce least privilege for all service principals: The compromised SP only needed permissions to deploy pods to AKS, but had Contributor access to the entire resource group. Restrict SPs to only the permissions they need, and set short secret expiration periods (max 90 days).
Enable security monitoring for AKS: Turn on the Defender for Containers plan in Microsoft Defender for Cloud, and integrate AKS audit logs with your SIEM. Set alerts for suspicious SP sign-ins, unusual pod deployments, and high resource usage.
Follow AKS security best practices: Enable Pod Security Standards and network policies, and prevent containers from running as root. Use image scanning for all container images deployed to the cluster.
Implement secret scanning in CI/CD pipelines: Use tools like GitHub Advanced Security, TruffleHog, or Azure DevOps secret scanning to catch leaked credentials before they’re merged to main branches.
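
The least-privilege and expiration rules above lend themselves to a periodic audit. The Python sketch below flags credentials that live longer than 90 days and role assignments that grant broad built-in roles. It assumes input shaped like the JSON from az ad app credential list (startDateTime, endDateTime, keyId) and az role assignment list (roleDefinitionName, scope); the function names and the set of roles treated as broad are illustrative choices, not an official policy.

```python
from datetime import datetime, timedelta

MAX_LIFETIME = timedelta(days=90)  # the ceiling suggested in the lessons above
# Built-in roles treated as too broad for an automation identity (an assumption).
BROAD_ROLES = {"Owner", "Contributor", "User Access Administrator"}


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # Assumes whole-second timestamps; unusual fractional precision may
    # need extra handling on older Python versions.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def audit_credentials(credentials) -> list[str]:
    """Flag credentials whose validity window exceeds MAX_LIFETIME."""
    findings = []
    for cred in credentials:
        lifetime = parse_ts(cred["endDateTime"]) - parse_ts(cred["startDateTime"])
        if lifetime > MAX_LIFETIME:
            key_id = cred.get("keyId", "<unknown>")
            findings.append(f"credential {key_id}: valid for {lifetime.days} days")
    return findings


def audit_role_assignments(assignments) -> list[str]:
    """Flag role assignments that grant one of the broad built-in roles."""
    findings = []
    for assignment in assignments:
        role = assignment.get("roleDefinitionName")
        if role in BROAD_ROLES:
            findings.append(f"{role} at {assignment.get('scope', '<unknown scope>')}")
    return findings
```

Applied to the SP in this story, both checks would have fired: a 2-year secret and Contributor on the whole resource group.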

For FinSecure, the incident cost $240,000 in remediation, legal fees, and customer churn, but it led to a complete overhaul of their cloud security posture. Don’t let a leaked service principal be the entry point for your next breach.