DEV Community

Indra Gusti Prasetya

Posted on • Originally published at indragustiprasetya.com

Non-Human Identity Governance: Field Tips for 2026

You locked down your human logins years ago: SSO, MFA, a joiner-mover-leaver process, access reviews every quarter. The machine identities never got that treatment, and they bred. Service accounts, API keys, OAuth tokens, SSH keys, CI jobs, RPA bots, and now AI agents. In cloud-native shops these non-human identities (NHIs) outnumber people 144:1 (Entro Labs, H1 2025); even cautious enterprise-wide counts sit at 45:1. They rarely expire, nobody owns them, and SOC 2, ISO 27001, PCI DSS, and NIST 800-53 mostly leave them in a grey zone. OWASP cared enough to publish a Non-Human Identities Top 10 for 2025, and the headline risks are boring on purpose: improper offboarding, leaked secrets, over-privilege, and long-lived credentials. If someone just handed you "go govern the machine identities," here is what actually moves the needle, in roughly the order I'd do it.

The tips

  1. Build one correlated inventory before you touch a single permission. The thing that kills most NHI programs on day one is partial visibility: secrets in a vault, service accounts in IAM, tokens scattered across SaaS apps, certs in a fourth place. Stop inventorying by storage location and key it by identity instead, joining each credential to an owner, a last-used timestamp, and its permissions. Start with what the cloud APIs hand you for free.
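   # AWS: IAM users acting as service accounts + their access keys and creation dates
   # (tr puts one user per line; with -I, xargs splits input on newlines only)
   aws iam list-users --query 'Users[].UserName' --output text \
    | tr '\t' '\n' \
    | xargs -I{} aws iam list-access-keys --user-name {} \
      --query 'AccessKeyMetadata[].[UserName,AccessKeyId,CreateDate]' --output text
   # When a key last worked is a separate call per key:
   #   aws iam get-access-key-last-used --access-key-id <AccessKeyId>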
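To make "key it by identity" concrete, here is a minimal Python sketch of the join. The record fields (`identity`, `source`, `credential_id`, `owner`, `last_used`, `permissions`) and the sample rows are assumptions for illustration; real rows would come from whatever each system's API exports.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Identity:
    """One non-human identity, with every credential that can act as it."""
    name: str
    owner: Optional[str] = None
    last_used: Optional[datetime] = None
    credentials: list = field(default_factory=list)
    permissions: set = field(default_factory=set)


def correlate(records):
    """Merge per-source records (vault, IAM, SaaS, certs) into one entry per identity."""
    inventory = {}
    for rec in records:
        ident = inventory.setdefault(rec["identity"], Identity(name=rec["identity"]))
        ident.credentials.append((rec["source"], rec["credential_id"]))
        ident.permissions.update(rec.get("permissions", []))
        ident.owner = ident.owner or rec.get("owner")
        seen = rec.get("last_used")
        # Keep the most recent use across all of the identity's credentials.
        if seen and (ident.last_used is None or seen > ident.last_used):
            ident.last_used = seen
    return inventory


# Hypothetical sample records, one per credential per storage location.
records = [
    {"identity": "ci-deploy", "source": "iam", "credential_id": "key-1",
     "owner": "platform-team", "last_used": datetime(2025, 11, 3),
     "permissions": ["s3:PutObject"]},
    {"identity": "ci-deploy", "source": "vault", "credential_id": "secret/ci/deploy",
     "last_used": datetime(2025, 12, 1)},
    {"identity": "legacy-report", "source": "saas", "credential_id": "tok-9"},
]

inventory = correlate(records)
orphans = [i.name for i in inventory.values() if i.owner is None]
print(orphans)  # identities with no accountable owner
```

The point of the shape is that two credentials in two systems collapse into one row, so the ownerless ones fall out of a single filter.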
  2. Replace static cloud keys in CI with OIDC workload identity federation. A long-lived AWS_SECRET_ACCESS_KEY or a GCP JSON key file sitting in CI secrets is the classic NHI breach path, and rotating it is a chore nobody does on schedule. GitHub Actions can trade a short-lived OIDC token for cloud access that expires in about an hour and is scoped to one job, so there's no stored secret to leak in the first place. Of everything on this list, this is the change that buys the most risk reduction for the least effort.
   permissions:
     id-token: write   # lets the job request the OIDC token
     contents: read
   steps:
     - uses: aws-actions/configure-aws-credentials@v4
       with:
         role-to-assume: arn:aws:iam::111122223333:role/ci-deploy
         aws-region: us-east-1   # no access keys anywhere in the repo
  3. Put a hard ceiling on token lifetime (OWASP NHI7). A long-lived secret turns a one-time leak into permanent access, which is why an old key is worth more to an attacker than a fresh one. Audit for credentials with no expiry or absurd TTLs and cap them, then make minutes the default for anything machine-to-machine. The keys that bite you are always the ones created in 2021 that nobody remembers.
   # GCP: service-account keys older than 90 days, rotate or kill them
   gcloud iam service-accounts keys list \
     --iam-account=svc@project.iam.gserviceaccount.com \
     --format="table(name, validAfterTime)" --filter="validAfterTime<-P90D"
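The audit for "no expiry or absurd TTLs" is simple enough to sketch once credentials sit in one inventory. This is a rough Python version; the field names (`id`, `created_at`, `expires_at`) and the one-hour ceiling are assumptions, not anything a particular cloud returns.

```python
from datetime import datetime, timedelta, timezone


def lifetime_violations(creds, max_ttl):
    """Return (id, reason) for each credential that never expires or outlives max_ttl."""
    findings = []
    for cred in creds:
        expires_at = cred.get("expires_at")
        if expires_at is None:
            findings.append((cred["id"], "no expiry"))
        elif expires_at - cred["created_at"] > max_ttl:
            findings.append((cred["id"], "ttl over ceiling"))
    return findings


# Hypothetical inventory rows; field names are illustrative.
created = datetime(2021, 3, 1, tzinfo=timezone.utc)
creds = [
    {"id": "old-key", "created_at": created, "expires_at": None},
    {"id": "yearly-token", "created_at": created,
     "expires_at": created + timedelta(days=365)},
    {"id": "job-token", "created_at": created,
     "expires_at": created + timedelta(minutes=15)},
]

violations = lifetime_violations(creds, max_ttl=timedelta(hours=1))
print(violations)  # old-key and yearly-token are flagged, job-token passes
```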
  4. Scan for leaked secrets everywhere a developer's hands go, not just main (OWASP NHI2). Secret leakage is the #2 NHI risk because credentials don't stay in vaults: they get hard-coded in source, baked into container layers, echoed into CI logs, and pasted into Slack threads. Run scanning in pre-commit so the leak never lands, and run it server-side too, including build logs and image history. The pre-commit hook is the cheap win; the server-side scan is what catches the laptop that skipped the hook.
   # Pre-commit scan of staged changes only, blocks the leak before the push
   gitleaks protect --staged --redact -v
  5. Treat every exposed secret as live until you prove it dead. "We rotated it" is not closure on a leaked key. The GitGuardian State of Secrets Sprawl 2026 report spells out the real sequence: confirm whether the credential still authenticates, find the owner, revoke or rotate it, then comb the logs for abuse across the entire exposure window. A key that was rotated after it was already used is an incident, not a tidy cleanup ticket, and the difference is in the logs.
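
A small Python sketch of that last step, deciding between "incident" and "closed" from the logs. The event shape (`time`, `source`), the `known_sources` allow-list, and the sample addresses are all hypothetical; the liveness check itself is assumed to have happened elsewhere and is passed in as a flag.

```python
from datetime import datetime


def triage_exposure(still_authenticates, exposed_at, revoked_at, auth_events, known_sources):
    """Decide what a leaked credential needs next, based on liveness and log evidence."""
    if still_authenticates:
        return "live", []
    # Any use inside the exposure window from a caller we don't recognise counts as abuse.
    suspicious = [
        e for e in auth_events
        if exposed_at <= e["time"] <= revoked_at and e["source"] not in known_sources
    ]
    return ("incident" if suspicious else "closed"), suspicious


# Hypothetical authentication log entries for the leaked key.
events = [
    {"time": datetime(2026, 1, 10, 9, 0), "source": "10.0.4.12"},      # the real workload
    {"time": datetime(2026, 1, 11, 2, 30), "source": "203.0.113.50"},  # unknown caller
]
status, evidence = triage_exposure(
    still_authenticates=False,
    exposed_at=datetime(2026, 1, 9),
    revoked_at=datetime(2026, 1, 12),
    auth_events=events,
    known_sources={"10.0.4.12"},
)
print(status, len(evidence))
```

Note that a rotated key only reaches "closed" when the exposure window is clean; rotation alone never gets it there.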

  6. Make an owner mandatory at creation and reject anything untagged. Sprawl exists for one reason: no human is accountable for any single machine identity, so nobody rotates, reviews, or retires it. Enforce an owner tag as a creation-time policy rather than a documentation wish, because retroactively assigning owners to a thousand orphans is the worst afternoon of your quarter. Fail the apply if the field is empty.

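   # Terraform: refuse a service account with no declared owner
   variable "owner" {
     type = string   # no default, so the value must be supplied on every apply
     validation {
       # trimspace() also rejects an owner that is only whitespace
       condition     = length(trimspace(var.owner)) > 0
       error_message = "Every service account must declare an owner."
     }
   }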
  7. Right-size privileges from real usage data, not from what felt safe at 2am. Blanket *:* and roles/editor grants are the norm, not the exception, and 70% of AI systems are handed more access than a human in the same role would get. Pull last-used permission data, strip anything untouched for 90 days, then rebuild from a default-deny baseline and add back only what the workload actually called. Generating the policy from CloudTrail beats guessing, and it gives you an artifact to show the auditor.
   # AWS IAM Access Analyzer: build a least-privilege policy from real CloudTrail usage
   aws accessanalyzer start-policy-generation \
     --policy-generation-details '{"principalArn":"arn:aws:iam::111122223333:role/data-job"}'
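If your platform can't generate the policy for you, the strip-and-rebuild logic is small. A sketch in Python, assuming you already have the set of granted actions and a map of when each was last called; the action names and dates are made up, and wildcard grants would need expanding into explicit actions before this works.

```python
from datetime import datetime, timedelta


def right_size(granted, last_used, now, stale_after=timedelta(days=90)):
    """Split granted actions into (keep, drop) based on when each was last called."""
    keep = {
        action for action in granted
        if action in last_used and now - last_used[action] <= stale_after
    }
    return keep, set(granted) - keep


# Hypothetical usage data for one role.
now = datetime(2026, 1, 15)
granted = {"s3:GetObject", "s3:PutObject", "iam:PassRole"}
last_used = {
    "s3:GetObject": now - timedelta(days=3),
    "s3:PutObject": now - timedelta(days=200),
}

keep, drop = right_size(granted, last_used, now)
print(sorted(keep), sorted(drop))
```

Actions that were never called at all land in `drop` alongside the stale ones, which matches rebuilding from deny rather than trimming from allow.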
  8. Offboard NHIs the way you offboard people (OWASP's #1 risk). Improper offboarding tops the 2025 list: the app a credential served gets decommissioned, but the identity keeps its full access and waits. Tie each NHI's lifecycle to the thing it serves so that retiring a repo, app, or pipeline takes its identities down with it. Back that with a monthly "last used more than 90 days ago" sweep to catch whatever slipped through, because something always does.
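
Both halves of that, the lifecycle tie and the idle sweep, fit in one pass over the inventory. A Python sketch under assumed field names (`name`, `serves`, `last_used`); it treats a never-used identity as idle, where a real sweep would probably also check the creation date before flagging it.

```python
from datetime import datetime, timedelta


def identities_to_retire(identities, retired_resources, now, idle_days=90):
    """Map identity name -> reason it should be disabled."""
    cutoff = now - timedelta(days=idle_days)
    findings = {}
    for ident in identities:
        if ident["serves"] in retired_resources:
            findings[ident["name"]] = "resource decommissioned"
        elif ident["last_used"] is None or ident["last_used"] < cutoff:
            findings[ident["name"]] = "idle past threshold"
    return findings


# Hypothetical inventory rows; "serves" links each identity to its repo or app.
now = datetime(2026, 1, 15)
identities = [
    {"name": "billing-deployer", "serves": "repo:billing",
     "last_used": now - timedelta(days=2)},
    {"name": "report-bot", "serves": "app:reports",
     "last_used": now - timedelta(days=140)},
    {"name": "ci-deploy", "serves": "repo:platform",
     "last_used": now - timedelta(days=1)},
]

to_retire = identities_to_retire(identities, {"repo:billing"}, now)
print(to_retire)
```

The billing deployer is flagged even though it was used two days ago, because the thing it serves is gone.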

  9. Use workload identity for service-to-service auth instead of passing secrets around. Minting an API key and shipping it between internal services just creates another thing to steal from a config file or an environment variable. SPIFFE/SPIRE issues short-lived, cryptographically verifiable identities (SVIDs) based on what a workload is rather than a secret it holds, so there's nothing static to exfiltrate. This is heavier to stand up than OIDC in CI, so save it for east-west traffic that genuinely warrants it.

   # Fetch a workload's SVID from the SPIRE agent, no static secret involved
   spire-agent api fetch x509 -socketPath /run/spire/sockets/agent.sock
  10. Govern AI agents as first-class NHIs with just-in-time credentials. Agentic systems do things older NHIs never did: acquire credentials on their own, chain across multiple agents, and escalate permissions at runtime, and only 13% of organizations feel ready for it. Never hand an agent a standing god-token. The working pattern is: identify the workload, issue a narrowly scoped, short-lived credential per task, evaluate each request at runtime, and revoke the moment the task ends.
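
The issue, evaluate, revoke loop can be sketched as a tiny in-memory broker. This is an illustration only: `TaskCredentialBroker`, its methods, the scope names, and the five-minute TTL are all hypothetical, and a real system would sit on your identity provider or secrets manager rather than a dict.

```python
import secrets
from datetime import datetime, timedelta, timezone


class TaskCredentialBroker:
    """Issues per-task, short-lived, scoped tokens and checks them at use time."""

    def __init__(self, policy, ttl=timedelta(minutes=5)):
        self.policy = policy   # agent id -> set of scopes it may ever request
        self.ttl = ttl
        self.active = {}       # token -> grant details

    def issue(self, agent_id, task_id, scopes, now):
        """Evaluate the request when it is made; refuse anything outside policy."""
        if not set(scopes) <= self.policy.get(agent_id, set()):
            raise PermissionError(f"{agent_id} requested scopes outside its policy")
        token = secrets.token_urlsafe(32)
        self.active[token] = {"task": task_id, "scopes": frozenset(scopes),
                              "expires_at": now + self.ttl}
        return token

    def authorize(self, token, scope, now):
        """Runtime check: token must exist, be unexpired, and cover the scope."""
        grant = self.active.get(token)
        if grant is None:
            return False
        return now < grant["expires_at"] and scope in grant["scopes"]

    def revoke_task(self, task_id):
        """Drop every token issued for a task the moment it completes."""
        for token in [t for t, g in self.active.items() if g["task"] == task_id]:
            del self.active[token]


now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
broker = TaskCredentialBroker({"triage-agent": {"tickets:read", "tickets:comment"}})
token = broker.issue("triage-agent", "task-42", {"tickets:read"}, now)
print(broker.authorize(token, "tickets:read", now))     # in scope, not expired
print(broker.authorize(token, "tickets:comment", now))  # allowed by policy, but not issued for this task
broker.revoke_task("task-42")
print(broker.authorize(token, "tickets:read", now))     # revoked on completion
```

The second check is the one that matters: the agent's policy permits commenting, but the token for this task never carried that scope.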

  11. Fold NHIs into the access reviews and posture management you already run. Auditors increasingly expect machine identities inside the same governance you apply to humans, and a SOC 2 review that only covers human users is a finding waiting to be written. Add NHIs to the quarterly access review, then stand up Identity Security Posture Management (ISPM) so stale, orphaned, and over-privileged identities surface continuously instead of once a year when someone remembers to look.

Wrap-up

If you only get budget for one of these, do the first one and do it completely: build the inventory and put a named owner on every machine identity. Rotation, least privilege, offboarding, and agent governance all assume you know the credential exists and who answers for it, and none of them work without that. Sprawl happened because accountability was nobody's job. Governance starts the moment it becomes someone's. Inventory first, owner always, short-lived by default.

