DEV Community

Mahafuzur Rahaman


SSH Key Management at Scale: Generating, Rotating, and Revoking Keys Across Teams

Most teams treat SSH keys like passwords from 2010 — created once, never rotated, and scattered everywhere. Here's how to fix that.


You onboard a new engineer. They generate an SSH key, paste the public key into five servers, and get to work. Six months later they leave the company. You remember to remove their key from two of the five servers. Maybe three.

This is how breaches happen. Not through sophisticated attacks — through forgotten keys on forgotten servers, quietly waiting.

SSH key management sounds boring until it isn't. This article covers everything you need to do it properly: key generation best practices, how to organize keys across teams, rotation strategies that won't break production, and clean revocation when someone leaves.


Why SSH Key Management Breaks Down

SSH keys feel low-maintenance because they mostly work silently. That silence is the problem.

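Unlike passwords, keys don't expire by default. Unlike OAuth tokens, there's no central dashboard showing you who has access to what. Unlike certificates, a plain key has no issuer and no validity period to revoke against: removing access means deleting the key from every authorized_keys file, or maintaining a RevokedKeys list in sshd_config on every server.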

The result is what security teams call key sprawl: hundreds of authorized_keys entries across dozens of servers, with no inventory, no ownership records, and no expiry dates. It is common for large organizations to end up with more SSH keys than employees, sometimes by an order of magnitude.

Key sprawl creates three risks:

  • Orphaned access — keys belonging to former employees, contractors, or decommissioned systems still granting entry
  • Unknown exposure — no one knows which keys can reach which servers
  • Audit failure — you can't prove compliance if you can't show who had access to what, when

The fix isn't a new tool. It's a discipline — applied consistently.


Part 1: Generating Keys the Right Way

Choose the Right Algorithm

Not all SSH key types are equal in 2024. Here's where things stand:

| Algorithm | Key size | Recommendation |
|-----------|----------|----------------|
| ed25519 | 256-bit (fixed) | ✅ Use this. Fast, secure, compact. |
| ecdsa | 256/384/521-bit | ⚠️ Fine, but ed25519 is better. |
| rsa | 2048–4096-bit | ⚠️ Legacy systems only. Generate at 4096 bits. |
| dsa | 1024-bit | ❌ Never. Weak, and disabled by default since OpenSSH 7.0. |

For anything modern, ed25519 is the answer.

ssh-keygen -t ed25519 -a 100 -C "alice@example.com" -f ~/.ssh/id_ed25519

Flags explained:

  • -t ed25519 — algorithm
  • -a 100 — number of KDF rounds for the passphrase (higher = slower to brute-force)
  • -C "alice@example.com" — comment; use email or a descriptive label
  • -f ~/.ssh/id_ed25519 — output file path

For legacy systems that only accept RSA:

ssh-keygen -t rsa -b 4096 -a 100 -C "alice@example.com" -f ~/.ssh/id_rsa_legacy
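When checking existing keys against the table above, the algorithm can be read from the first field of a public key line. The following is a minimal sketch: the function name classify_key and its labels are illustrative and not part of OpenSSH, and it assumes the line carries no leading authorized_keys options. It does not check RSA key length; ssh-keygen -l -f reports that.

```shell
#!/bin/sh
# Classify a public key line by algorithm, following the table above.
# Prints one of: ok, acceptable, legacy, reject, unknown.
# Hypothetical helper; assumes the line starts with the key type.
classify_key() {
    # The first space-separated field of a public key is its type.
    key_type=${1%% *}
    case "$key_type" in
        ssh-ed25519) echo ok ;;
        ecdsa-sha2-nistp256|ecdsa-sha2-nistp384|ecdsa-sha2-nistp521) echo acceptable ;;
        ssh-rsa) echo legacy ;;
        ssh-dss) echo reject ;;
        *) echo unknown ;;
    esac
}
```

For example, classify_key "$(cat ~/.ssh/id_ed25519.pub)" would print ok for a key generated with the command shown earlier.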

Always Use a Passphrase

A passphrase encrypts the private key on disk. Without it, anyone who copies your key file has full access to everything that key unlocks. With a passphrase, they also need to know the secret to decrypt it.

The common objection: "but then I have to type it every time." The answer: ssh-agent.

eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519

ssh-agent holds your decrypted key in memory for the session. You type the passphrase once; the agent handles the rest. On macOS, the passphrase can also be stored in the Keychain (ssh-add --apple-use-keychain, plus UseKeychain yes in ~/.ssh/config), so you are not prompted again after a logout or reboot.

One Key Per Context, Not One Key for Everything

A single key that unlocks every server is a single point of failure. Instead, scope keys to contexts:

~/.ssh/
├── id_ed25519_personal       # Personal projects
├── id_ed25519_work           # Work infrastructure
├── id_ed25519_client_acme    # Client: ACME Corp
├── id_ed25519_deploy         # CI/CD deploy key (no passphrase, scoped permissions)
└── id_ed25519_prod           # Production servers (extra strong passphrase)

Wire these up in ~/.ssh/config so the right key is used automatically:

Host *.acme.internal
    IdentityFile ~/.ssh/id_ed25519_client_acme
    IdentitiesOnly yes

Host bastion.prod.example.com
    IdentityFile ~/.ssh/id_ed25519_prod
    IdentitiesOnly yes

IdentitiesOnly yes prevents SSH from trying other keys in your agent — important when servers have MaxAuthTries set low.


Part 2: Organizing Keys Across a Team

The Baseline: A Git-Managed Key Registry

For small to medium teams (under ~50 engineers), a Git repository containing public keys and server manifests is a practical starting point.

ssh-keys/
├── users/
│   ├── alice.pub
│   ├── bob.pub
│   └── carol.pub
├── servers/
│   ├── web-prod.txt       # Lists which users have access
│   ├── db-prod.txt
│   └── bastion.txt
└── deploy-keys/
    ├── github-actions.pub
    └── jenkins.pub

Rules:

  • Public keys only — never commit private keys
  • Every key has an owner and a date in a comment: ssh-ed25519 AAAA... alice@example.com 2024-01
  • PRs required to add or remove keys — creates an audit trail
  • A simple script syncs authorized_keys on servers from the registry

This isn't enterprise-grade, but it's infinitely better than ad-hoc key distribution with no inventory.
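The sync script mentioned in the rules can be quite small. Below is a rough sketch of its core step, assembling one server's authorized_keys from the registry. The manifest format is an assumption (one username per line in servers/NAME.txt, with blank lines and # comments ignored), and build_authorized_keys is a hypothetical name.

```shell
#!/bin/sh
# Build an authorized_keys file for one server from the registry layout above.
# Assumption: servers/<server>.txt holds one username per line; blank lines
# and lines starting with '#' are ignored.
build_authorized_keys() {
    registry=$1
    server=$2
    manifest="$registry/servers/$server.txt"
    [ -f "$manifest" ] || { echo "no manifest for $server" >&2; return 1; }

    while IFS= read -r user || [ -n "$user" ]; do
        case "$user" in ''|'#'*) continue ;; esac
        pub="$registry/users/$user.pub"
        if [ -f "$pub" ]; then
            cat "$pub"
        else
            # Fail loudly: a manifest naming an unknown user is a registry bug.
            echo "missing key for $user" >&2
            return 1
        fi
    done < "$manifest"
}
```

A typical use would be build_authorized_keys ./ssh-keys web-prod > authorized_keys.new, followed by a review and a copy to the server. Writing to a new file first means a failed build never truncates the live one.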

Structuring authorized_keys With Restrictions

authorized_keys supports per-key restrictions that limit what a key can do, even after it's been granted access. Use them.

# Full access
ssh-ed25519 AAAA... alice@example.com

# Deploy key: forced command, so it can only run one specific script
restrict,command="/usr/local/bin/deploy.sh" ssh-ed25519 AAAA... deploy-key

# Tunnel-only key: can only forward to one host and port, no shell or commands (connect with ssh -N)
restrict,port-forwarding,permitopen="db.internal:5432",command="/bin/false" ssh-ed25519 AAAA... tunnel-key

# IP-restricted key
from="203.0.113.0/24" ssh-ed25519 AAAA... office-access-key

These restrictions are enforced server-side, regardless of what the client attempts.
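To see which entries currently have no restrictions at all, something like the following could help. It is a sketch that relies on one property of the file format: an entry with options begins with the option string, so a line whose first field is a key type has none. The function name list_unrestricted_keys is hypothetical.

```shell
#!/bin/sh
# List authorized_keys entries that carry no options, i.e. unrestricted keys.
# Output: "<line number>: <comment>" for each such entry.
list_unrestricted_keys() {
    awk '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        $1 ~ /^(ssh-ed25519|ssh-rsa|ssh-dss|ecdsa-sha2-nistp(256|384|521))$/ {
            # Everything after the key blob is the comment.
            comment = ""
            for (i = 3; i <= NF; i++) comment = comment (i > 3 ? " " : "") $i
            print NR ": " (comment == "" ? "(no comment)" : comment)
        }
    ' "$1"
}
```

Running list_unrestricted_keys ~/.ssh/authorized_keys on a service account is a quick way to spot deploy or tunnel keys that were added without a forced command.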

Tools for Larger Teams

Once you're managing keys across dozens of servers and dozens of engineers, manual management doesn't scale. Consider:

HashiCorp Vault SSH Secrets Engine
Vault can act as an SSH Certificate Authority, issuing signed, short-lived certificates instead of static keys. Engineers authenticate to Vault, receive a certificate valid for (say) 8 hours, and use it to access servers. No long-lived keys. No key sprawl. Full audit log. This is the gold standard for larger teams.

Teleport
Open-source access plane for SSH, Kubernetes, and databases. Handles key/certificate lifecycle, session recording, and access policies in one tool.

AWS EC2 Instance Connect / GCP OS Login
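Cloud-native solutions that tie SSH access to cloud identity. EC2 Instance Connect pushes a one-time public key to the instance metadata, where it stays valid for 60 seconds; OS Login attaches keys to Google accounts and authorizes them through IAM. Neither relies on hand-maintained authorized_keys files.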

Smallstep
Open-source certificate authority with SSH support. Easier to self-host than Vault if certificates are the only goal.


Part 3: Rotation — The Step Most Teams Skip

Key rotation means replacing existing keys with new ones on a scheduled basis. It limits the exposure window if a key is compromised without your knowledge.

When to Rotate

  • Scheduled: Annually at minimum, quarterly for sensitive systems
  • Triggered: After a security incident, after a team member's access level changes, after a laptop is lost or stolen, after a suspected compromise
  • On offboarding: Always — see Part 4

How to Rotate Without Breaking Things

Rotation fails when it's done carelessly. The safe approach is additive first, then remove.

Step 1: Generate the new key

ssh-keygen -t ed25519 -a 100 -C "alice@example.com-2024-rotation" -f ~/.ssh/id_ed25519_new

Step 2: Add the new key alongside the old one

cat ~/.ssh/id_ed25519_new.pub | ssh user@server "cat >> ~/.ssh/authorized_keys"

Step 3: Verify the new key works

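# IdentitiesOnly=yes stops ssh from silently falling back to the old key in your agent
ssh -i ~/.ssh/id_ed25519_new -o IdentitiesOnly=yes user@server "echo connected"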

Step 4: Remove the old key

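# On the server, delete the old key's line from ~/.ssh/authorized_keys (a .bak copy is kept).
# The pattern is anchored with a leading space and $ so it cannot also match a new comment that merely starts with the old one.
ssh -i ~/.ssh/id_ed25519_new -o IdentitiesOnly=yes user@server "sed -i.bak '/ OLD_KEY_COMMENT$/d' ~/.ssh/authorized_keys"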

Step 5: Update all references: ~/.ssh/config, CI/CD secrets, documentation.
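The "additive first, then remove" rule can also be enforced in code rather than left to memory. The sketch below works on a local copy of an authorized_keys file. The names add_key and remove_key are hypothetical, and the old key is identified by its trailing comment, which assumes comments are unique per key.

```shell
#!/bin/sh
# Additive-first rotation on a single authorized_keys file.
# add_key:    appends the new public key unless that exact line is present.
# remove_key: deletes lines ending in the old comment, but refuses to run
#             unless the new key is already installed.
add_key() {
    file=$1 new_pub=$2
    new_line=$(cat "$new_pub")
    grep -qxF -- "$new_line" "$file" 2>/dev/null || printf '%s\n' "$new_line" >> "$file"
}

remove_key() {
    file=$1 new_pub=$2 old_comment=$3
    new_line=$(cat "$new_pub")
    if ! grep -qxF -- "$new_line" "$file"; then
        echo "new key not installed; refusing to remove old key" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Compare fixed strings instead of a regex so characters such as "." in
    # an email address are not treated as wildcards. The new key line is
    # always kept, even if its comment equals the old one.
    if awk -v c="$old_comment" -v keep="$new_line" '
        { n = length($0) - length(c) }
        $0 != keep && n > 0 && substr($0, n + 1) == c && substr($0, n, 1) == " " { next }
        { print }
    ' "$file" > "$tmp"; then
        cat "$tmp" > "$file"      # overwrite in place to keep owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

Because remove_key checks for the new key first, running the two steps out of order fails safely instead of locking the account out.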

Automating Rotation at Scale

For many servers, do this with Ansible:

# Jinja expressions cannot be nested, so the path is built with the ~ concatenation operator.
- name: Add new SSH key
  ansible.posix.authorized_key:
    user: "{{ item.user }}"
    key: "{{ lookup('file', 'keys/new/' ~ item.user ~ '.pub') }}"
    state: present
  loop: "{{ team_members }}"
  tags: [add]

- name: Remove old SSH key
  ansible.posix.authorized_key:
    user: "{{ item.user }}"
    key: "{{ lookup('file', 'keys/old/' ~ item.user ~ '.pub') }}"
    state: absent
  loop: "{{ team_members }}"
  tags: [remove]

Run the "add" task first, verify access, then run the "remove" task (for example, by tagging the two tasks and selecting one at a time with --tags). Never run both in a single pass without testing in between.


Part 4: Revocation — When Someone Leaves

This is where key management most visibly fails. An engineer leaves; their key stays. Weeks or months later, an audit finds it still granting access to production systems.

The Offboarding Checklist

When anyone loses access (resignation, termination, end of contract):

[ ] Identify all keys belonging to this person
[ ] List all servers and services they had access to
[ ] Remove keys from all authorized_keys files
[ ] Rotate any shared/service account keys they had access to
[ ] Revoke access to key management tools (Vault, etc.)
[ ] Remove from any team-level access groups
[ ] Document the revocation with timestamp

The hardest part is step two: knowing everywhere they had access. This is why the Git-managed key registry matters — it's your inventory.
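With the registry in place, answering "where did this person have access?" becomes a lookup. A minimal sketch, again assuming that each servers/NAME.txt manifest lists one username per line; servers_for_user is a hypothetical helper name.

```shell
#!/bin/sh
# Print the servers whose manifest lists the given user.
# Assumption: servers/<server>.txt holds one username per line.
servers_for_user() {
    registry=$1
    user=$2
    for manifest in "$registry"/servers/*.txt; do
        [ -f "$manifest" ] || continue
        # -x matches the whole line, -F treats the name as a literal string,
        # so "al" does not match "alice".
        if grep -qxF -- "$user" "$manifest"; then
            basename "$manifest" .txt
        fi
    done
}
```

The output of servers_for_user ./ssh-keys alice can be pasted straight into the offboarding ticket as the list of hosts to verify. It only covers access granted through the registry, so keys added by hand still need the audit described in Part 5.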

Doing It Fast With Ansible

# Revoke a specific user's key everywhere
ansible all -m authorized_key -a "user=ubuntu key='{{ lookup('file', 'keys/alice.pub') }}' state=absent"

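This runs against your entire inventory in seconds. Note that it only edits the ubuntu account's authorized_keys; repeat it for every other account the key was installed under.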

Using authorized_keys Comments as Metadata

Make revocation easier by putting searchable metadata in key comments:

ssh-ed25519 AAAA... alice@example.com|team:backend|added:2024-01-15|expires:2025-01-15

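A simple script can scan all authorized_keys files and flag keys past their expiry date, giving you automated rotation reminders and an audit trail. Keep in mind that sshd ignores comments, so this metadata is advisory only; for enforcement, OpenSSH 7.7 and later also accept an expiry-time="YYYYMMDD" option at the start of the key line.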
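One possible shape for that scanning script is sketched below. It assumes the pipe-separated comment format shown above with an expires:YYYY-MM-DD field, and the function name expired_keys is hypothetical. ISO dates sort correctly as plain strings, so no date arithmetic is needed.

```shell
#!/bin/sh
# Print authorized_keys entries whose "expires:YYYY-MM-DD" comment field is
# earlier than the given date (default: today).
# Output: "<expiry date><TAB><key type, blob and owner>".
expired_keys() {
    file=$1
    today=${2:-$(date +%Y-%m-%d)}
    awk -F'|' -v today="$today" '
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^expires:/) {
                    expiry = substr($i, 9)      # strip the "expires:" prefix
                    if (expiry < today) print expiry "\t" $1
                }
            }
        }
    ' "$file"
}
```

Entries without an expires field are skipped silently here; a stricter variant could report them as policy violations instead.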


Part 5: Auditing What You Have

Before you can manage your keys, you need to know what exists.

Scan Your Servers

# Find all authorized_keys files on a server
find /home /root -name "authorized_keys" 2>/dev/null

# List all keys with their fingerprints (ssh-keygen reads multi-key files directly)
ssh-keygen -l -f ~/.ssh/authorized_keys

Inventory Your Local Keys

# List all key fingerprints in your local .ssh directory
for key in ~/.ssh/*.pub; do
    [ -e "$key" ] || continue    # no .pub files: the glob stays literal
    printf '%s: ' "$key"
    ssh-keygen -l -f "$key"
done

Check Key Age

If your keys carry the pipe-separated expiry metadata from Part 4 in their comments, a quick grep lists them sorted by expiry date, so overdue keys appear first:

grep -h "expires:" /home/*/.ssh/authorized_keys | awk -F'|' '{print $4, $0}' | sort

The One Habit That Changes Everything

Audit your SSH keys on a schedule. Put it in the calendar. Once a quarter, run through every server, list every authorized key, verify every key has a known owner, and remove anything that doesn't.

It takes an hour. It's the single highest-value SSH security activity most teams never do.
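The "verify every key has a known owner" step can be partly automated by comparing a server's authorized_keys against the registry. The sketch below matches on the base64 key blob rather than the comment, so renamed comments and option prefixes do not hide a match. The function names are hypothetical, and the registry layout (users/ and deploy-keys/ directories of .pub files) is the one from Part 2.

```shell
#!/bin/sh
# Print authorized_keys entries whose key material is not in the registry.
key_blob() {
    # For each line on stdin, emit the field that follows the key type.
    awk '
        /^[ \t]*#/ { next }
        {
            for (i = 1; i < NF; i++)
                if ($i ~ /^(ssh-ed25519|ssh-rsa|ssh-dss|ecdsa-sha2-nistp(256|384|521))$/) {
                    print $(i + 1); break
                }
        }'
}

unknown_keys() {
    authorized=$1
    registry=$2
    known=$(cat "$registry"/users/*.pub "$registry"/deploy-keys/*.pub 2>/dev/null | key_blob)
    while IFS= read -r line || [ -n "$line" ]; do
        blob=$(printf '%s\n' "$line" | key_blob)
        [ -n "$blob" ] || continue               # comment or blank line
        printf '%s\n' "$known" | grep -qxF -- "$blob" || printf '%s\n' "$line"
    done < "$authorized"
}
```

Anything unknown_keys prints is a candidate for removal, after confirming it is not a legitimate key that was simply never registered.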

The goal isn't a perfect system from day one — it's incremental improvement: better key generation today, an inventory this week, automated revocation next month.

SSH key management isn't exciting. But discovering a former employee's key on a production database server at 2 AM definitely is.


Found this useful? Follow for more practical deep-dives into security and infrastructure fundamentals.
