Static Service Account Keys Are Still Your Biggest GCP Identity Risk
Most GCP environments I audit have the same problem hiding in plain sight. Not misconfigured firewall rules. Not overly permissive IAM roles. Service account keys.
I find them in GitHub repos, in CI/CD environment variables, on developer laptops, and in private repos that "nobody external can access." The teams running these environments aren't careless. They're experienced engineers who set up keys years ago when it was the standard approach, and nobody has had the bandwidth to migrate.
That key sitting in your Jenkins server is a ticking breach. And unlike a compromised password, a compromised GCP key doesn't trigger an account lockout after failed attempts. It just works — silently, indefinitely — until you notice the billing spike or the security incident.
The $450k Weekend
One team I worked with learned this the hard way. A service account key leaked through a public GitHub commit. The commit was reverted within hours, but the key was already harvested by automated scrapers. Over a single weekend, attackers spun up Cloud Run instances across every available region, running crypto mining workloads.
The bill: $450,000.
GCP support eventually provided credits, but the incident consumed weeks of engineering time, triggered their SOC 2 auditor's attention, and forced an emergency security review across their entire infrastructure.
The key had been valid for three years. Nobody remembered creating it.
What Most Teams Get Wrong
The solution to this problem has existed for years: Workload Identity Federation. External identities — GitHub Actions runners, GitLab CI, even AWS workloads — can exchange OIDC tokens for short-lived GCP credentials. No keys required.
For GKE workloads, Workload Identity lets Kubernetes Service Accounts impersonate GCP Service Accounts without any credentials stored in the cluster.
These aren't new features. They're production-ready and well-documented. So why do I still find keys everywhere?
Because teams implement one piece without completing the migration.
I see this pattern constantly:
- WIF configured for GitHub Actions, but old keys left active "just in case the new approach breaks"
- Workload Identity enabled on GKE, but legacy deployments still mounting key files as secrets
- Org policy blocking key creation, but dozens of existing keys still valid and in use
The partial migration is almost worse than no migration. Your audit trail shows both authentication methods being used. Your security team can't tell which is legitimate. Your attackers now have two paths into your systems.
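Closing out a partial migration starts with knowing which keys still exist. Here is a minimal sketch of that inventory, assuming gcloud is installed and authenticated with permission to list service accounts and keys. The function name list_user_managed_keys and the project argument are placeholders of mine, not part of any tool.

```shell
# Print one line per user-managed key in a project:
#   <service-account-email> <key-resource-name> <validAfterTime>
# The function name and its argument are illustrative placeholders.
list_user_managed_keys() (
  project="$1"
  gcloud iam service-accounts list \
    --project "$project" \
    --format "value(email)" |
  while read -r sa; do
    [ -n "$sa" ] || continue
    # --managed-by user skips the system-managed keys Google rotates itself.
    # stdin is redirected so the inner command cannot consume the outer list.
    gcloud iam service-accounts keys list \
      --iam-account "$sa" \
      --project "$project" \
      --managed-by user \
      --format "value(name,validAfterTime)" </dev/null |
    while read -r key created; do
      [ -n "$key" ] || continue
      printf '%s %s %s\n' "$sa" "$key" "$created"
    done
  done
)

# Usage (hypothetical project ID):
#   list_user_managed_keys my-project
```

Every line this prints is a key that still works, whether or not WIF has been configured alongside it. Map each one to a workload or delete it.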
The Identity Pattern That Actually Works
Eliminating keys requires two components working together: Workload Identity Federation and Service Account Impersonation.
WIF handles machine-to-machine authentication. Your GitHub Actions workflow authenticates to GCP without storing any secrets (the job needs the id-token: write permission so the runner can request an OIDC token):
- uses: google-github-actions/auth@v2
  with:
    workload_identity_provider: projects/123456/locations/global/workloadIdentityPools/github-pool/providers/github-provider
    service_account: deploy-sa@project.iam.gserviceaccount.com
No key to rotate. No secret to leak. The token expires automatically.
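The workflow step only works once the pool, the provider, and the service account binding exist on the GCP side. The sketch below shows one way to create them, assuming the pool and provider names from the workflow above. The function name, the repository value, and the choice to restrict the provider to a single repository are my own illustration, so adjust the attribute condition to your own trust boundary.

```shell
# Sketch: create a pool and a GitHub OIDC provider, then let one repository
# impersonate the deploy service account. All argument values are placeholders.
setup_github_wif() (
  project_id="$1"      # e.g. my-project
  project_number="$2"  # e.g. 123456
  repo="$3"            # e.g. my-org/my-repo (hypothetical)
  sa_email="$4"        # e.g. deploy-sa@my-project.iam.gserviceaccount.com
  pool="github-pool"
  provider="github-provider"

  gcloud iam workload-identity-pools create "$pool" \
    --project "$project_id" \
    --location global \
    --display-name "GitHub Actions" || exit 1

  # Map the OIDC claims and reject tokens issued for any other repository.
  gcloud iam workload-identity-pools providers create-oidc "$provider" \
    --project "$project_id" \
    --location global \
    --workload-identity-pool "$pool" \
    --issuer-uri "https://token.actions.githubusercontent.com" \
    --attribute-mapping "google.subject=assertion.sub,attribute.repository=assertion.repository" \
    --attribute-condition "assertion.repository == '$repo'" || exit 1

  # Only identities from that repository may act as the service account.
  gcloud iam service-accounts add-iam-policy-binding "$sa_email" \
    --project "$project_id" \
    --role roles/iam.workloadIdentityUser \
    --member "principalSet://iam.googleapis.com/projects/$project_number/locations/global/workloadIdentityPools/$pool/attribute.repository/$repo"
)

# Usage (hypothetical values):
#   setup_github_wif my-project 123456 my-org/my-repo deploy-sa@my-project.iam.gserviceaccount.com
```

Scoping the principalSet to a repository attribute, rather than to the whole pool, keeps an unrelated repository from borrowing the deploy identity.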
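For GKE, an IAM policy binding lets the Kubernetes Service Account act as a GCP Service Account; the Kubernetes Service Account is then annotated with iam.gke.io/gcp-service-account to point at it: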
gcloud iam service-accounts add-iam-policy-binding deploy-sa@project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:project.svc.id.goog[production/app-ksa]"
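The IAM binding is one half of the link. The other half is the annotation on the Kubernetes Service Account itself. A minimal sketch, assuming kubectl is pointed at the right cluster; the function name is hypothetical and the argument values mirror the binding above.

```shell
# Annotate a Kubernetes Service Account with the GCP Service Account it
# should act as. The function name and arguments are illustrative placeholders.
annotate_ksa() (
  namespace="$1"   # e.g. production
  ksa="$2"         # e.g. app-ksa
  gsa_email="$3"   # e.g. deploy-sa@project.iam.gserviceaccount.com
  kubectl annotate serviceaccount "$ksa" \
    --namespace "$namespace" \
    --overwrite \
    "iam.gke.io/gcp-service-account=$gsa_email"
)

# Usage:
#   annotate_ksa production app-ksa deploy-sa@project.iam.gserviceaccount.com
```

Pods must run under that Kubernetes Service Account for the mapping to apply. A deployment that still mounts a key file will keep using the key.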
Service Account Impersonation handles the human side. Instead of developers holding permanent credentials to a powerful service account, they impersonate a scoped service account on demand:
gcloud config set auth/impersonate_service_account deploy-sa@project.iam.gserviceaccount.com
The developer's identity is still the audit principal. You can see exactly who impersonated which account, when, and what they did. Compare that to five engineers sharing the same downloaded key file — your audit logs just show the service account, with no way to trace the actual human.
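Impersonation only works for developers who hold the Service Account Token Creator role on the target account. Here is a sketch of that grant, with a hypothetical function name and example email addresses as placeholders.

```shell
# Allow one developer to mint short-lived tokens for one service account.
# The function name and both arguments are illustrative placeholders.
grant_impersonation() (
  developer_email="$1"   # e.g. dev@example.com
  sa_email="$2"          # e.g. deploy-sa@project.iam.gserviceaccount.com
  gcloud iam service-accounts add-iam-policy-binding "$sa_email" \
    --member "user:$developer_email" \
    --role roles/iam.serviceAccountTokenCreator
)

# Usage:
#   grant_impersonation dev@example.com deploy-sa@project.iam.gserviceaccount.com
#
# A developer can then impersonate for a single command instead of
# changing their gcloud config:
#   gcloud storage ls --impersonate-service-account=deploy-sa@project.iam.gserviceaccount.com
```

Granting the role on the individual service account, rather than at the project level, keeps each developer limited to the accounts they actually need.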
The Org Policy That Creates Friction
Once you're confident your workloads don't need keys, enforce it:
constraints/iam.disableServiceAccountKeyCreation
This org policy prevents anyone from generating new keys. I've seen it implemented successfully — and I've seen it create chaos.
The chaos happens when you enable the policy before educating your engineering team. Developers who don't know about WIF or gcloud auth application-default login suddenly can't authenticate their local development environments. They file urgent tickets. They complain about "security blocking progress." Some creative ones figure out workarounds that are worse than the original keys.
The migration order matters. Document the new authentication patterns. Train your developers. Set up WIF for CI/CD. Verify that no active workloads depend on keys. Then enable the org policy.
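The last step of that sequence can be sketched as below. This assumes the legacy gcloud resource-manager org-policies command group and an account with the Organization Policy Administrator role. The function name and the organization ID are placeholders.

```shell
# Enforce the "no new service account keys" constraint on an organization,
# then print the effective policy so the change can be confirmed.
# The function name and argument are illustrative placeholders.
enforce_key_creation_ban() (
  org_id="$1"   # numeric organization ID, e.g. 1234567890
  gcloud resource-manager org-policies enable-enforce \
    iam.disableServiceAccountKeyCreation \
    --organization "$org_id" || exit 1

  gcloud resource-manager org-policies describe \
    iam.disableServiceAccountKeyCreation \
    --organization "$org_id" \
    --effective
)

# Usage:
#   enforce_key_creation_ban 1234567890
```

The constraint only blocks new keys. Existing keys stay valid until someone deletes them, which is why the inventory and verification steps come first.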
This sequence aligns with the Security by Design phase of our SCALE framework — identity architecture has to be right before you build automation on top of it.
The Trade-offs Nobody Mentions
WIF and impersonation aren't without friction.
Local development gets more complex. With keys, developers could just set GOOGLE_APPLICATION_CREDENTIALS and move on. Without keys, you need gcloud auth application-default login workflows (optionally with the --impersonate-service-account flag) documented and understood. Some developers will resist this. Your platform team needs to make the secure path the easy path.
Audit configuration has to be correct. Impersonation creates cleaner audit trails, but only if you're capturing the right logs. Data Access audit logs need to be enabled for sts.googleapis.com (WIF token exchanges) and iamcredentials.googleapis.com (impersonated token generation). I've seen teams implement impersonation and then realize months later that they weren't logging the token exchanges.
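A quick way to check whether those logs are actually being captured is to query for them. Impersonated token requests are served by iamcredentials.googleapis.com, so the sketch below filters on that service name. The function name, the default seven-day window, and the chosen output fields are my assumptions. An empty result on a project where people do impersonate suggests Data Access audit logging is not enabled for that service.

```shell
# List recent token requests against the IAM Credentials API: when it
# happened, who asked, and which service account they asked for.
# The function name and the default window are illustrative placeholders.
recent_impersonations() (
  project_id="$1"
  window="${2:-7d}"   # how far back to look, e.g. 24h or 7d
  gcloud logging read \
    'protoPayload.serviceName="iamcredentials.googleapis.com"' \
    --project "$project_id" \
    --freshness "$window" \
    --format "value(timestamp,protoPayload.authenticationInfo.principalEmail,protoPayload.resourceName)"
)

# Usage (hypothetical project ID):
#   recent_impersonations my-project 24h
```

The same query with sts.googleapis.com as the service name covers the WIF token exchanges.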
Cross-project impersonation gets complicated fast. A service account in Project A impersonating a service account in Project B that accesses resources in Project C creates a chain that's hard to audit and easy to misconfigure. Keep impersonation chains to one hop maximum.
What This Means for SOC 2
Every SOC 2 audit I've supported in the last three years has flagged service account keys. The auditors aren't wrong — long-lived credentials with no rotation policy and unclear ownership are a control gap.
The finding usually reads something like: "Service account keys exist without defined rotation schedules or ownership assignment."
You can write a policy that says keys must be rotated every 90 days. You can assign ownership in a spreadsheet. You can build automation to rotate keys. Or you can eliminate keys entirely and remove the finding at its root.
Eliminating keys is not optional for regulated SaaS. The migration path from keys to WIF is well-defined — the blocker is usually organizational, not technical. Someone has to own the project, inventory the existing keys, map them to workloads, and execute the migration without breaking production.
That's the work. It's not glamorous. It doesn't involve new tools or exciting architecture diagrams. But it's the single highest-impact security improvement most GCP environments can make today.
If identity boundaries are wrong, everything built on top of them inherits the risk.
Author: Amit Malhotra, Principal GCP Architect, Buoyant Cloud Inc
Work with a GCP specialist — book a free discovery call → https://buoyantcloudtech.com