Here’s a question most CISOs can’t answer: how many SSH keys exist in your infrastructure right now?
Not how many users have SSH access. How many actual key pairs — authorized_keys entries on servers, private keys on laptops, service account keys in CI/CD pipelines, keys embedded in scripts, keys on jump boxes, keys that former employees generated 3 years ago that still grant production access.
The typical enterprise has 5-10x more SSH keys than they estimate. A 2023 Ponemon study found organizations average 23,000 SSH keys, with 90% unmanaged. These aren’t theoretical risks — SSH keys are permanent credentials with no expiry, no MFA, and often no audit trail.
Why SSH Key Management Is Broken
The Root Cause: No Lifecycle
Human identities have HR-driven lifecycle events: hire → provision access → role change → termination → revoke access. SSH keys have none of this:
Day 1: Engineer generates key, deploys to 15 servers
Day 365: Engineer changes teams (key still on all 15 servers)
Day 730: Engineer leaves company (key STILL on all 15 servers)
Day 1095: Key is 3 years old, grants production access, nobody knows it exists
There’s no “SSH key offboarding” in most organizations. HR disables the AD account, revokes SSO, and considers the person offboarded. The SSH keys persist indefinitely.
The Scale Problem
Each engineer typically has:
- 1-3 personal SSH keys (work laptop, home machine, phone)
- Access to 10-50 servers (authorized_keys entries on each)
- 1-5 service account keys (CI/CD, automation, cron jobs)
For a 200-person engineering team: 200 engineers × 30 servers × 2 keys = 12,000 authorized_keys entries to manage. Plus service accounts. Plus keys from people who left.
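The estimate above is a simple product, so it is easy to rerun with your own numbers. A minimal sketch; the function name is made up for illustration, and all three inputs are guesses you supply rather than measured values:

```shell
# Back-of-the-envelope estimate of authorized_keys entries.
# All three inputs are assumptions; nothing is measured here.
estimate_authorized_keys_entries() {
  local engineers=$1 servers_per_engineer=$2 keys_per_engineer=$3
  echo $(( engineers * servers_per_engineer * keys_per_engineer ))
}

# Example from the text: 200 engineers, ~30 servers each, ~2 keys each
estimate_authorized_keys_entries 200 30 2   # prints 12000
```

The real number only comes from discovery; this just shows how quickly the count grows with team size.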
The Visibility Problem
SSH keys are decentralized by design. There’s no central directory of “all SSH keys in the organization.” They live in:
- ~/.ssh/authorized_keys on every server (scattered across hundreds of machines)
- Developer laptops (private keys)
- CI/CD secrets (GitHub Actions, GitLab CI, Jenkins)
- Configuration management (Ansible vault, Puppet, Chef)
- Jump boxes and bastion hosts
- Backup systems
- Documentation (yes, people put private keys in wikis)
The Risk: What Can Go Wrong
Scenario 1: The Departed Engineer
An engineer leaves. Their AD account is disabled. But their SSH key remains in authorized_keys on 30 production servers. Six months later, their laptop is stolen (or they’re disgruntled). The SSH key still works — it bypasses SSO, MFA, and every identity control you’ve built.
Scenario 2: The Shared Service Account Key
A single SSH key is used by a CI/CD pipeline to deploy to 50 servers. The private key is stored as a “secret” in the CI system. But it’s also in:
- The original engineer’s laptop (who generated it)
- A backup of the CI system from 2 years ago
- A Slack message where someone shared it “temporarily”
- The documentation wiki (for “onboarding new team members”)
If any of these locations is compromised, the attacker has deploy access to 50 production servers.
Scenario 3: The Lateral Movement
An attacker compromises one server (via application vulnerability). They find SSH private keys on that server (used for connecting to other servers). They pivot laterally across the infrastructure using those keys — no passwords needed, no MFA challenged, no alerts triggered (SSH key auth looks identical to legitimate access).
The Solution: A Practical SSH Key Management Program
Step 1: Discovery (Find Every Key)
You can’t manage what you can’t see. Scan for SSH keys across all infrastructure:
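# Find all authorized_keys files on this server (run as root)
find / -name "authorized_keys" -type f 2>/dev/null
# Count key entries per file, skipping comments and blank lines
find /home /root -name authorized_keys -type f 2>/dev/null | while IFS= read -r f; do
  echo "$f: $(grep -cvE '^[[:space:]]*(#|$)' "$f") keys"
done
# Extract key fingerprints for deduplication
ssh-keygen -lf /home/user/.ssh/authorized_keys
# Find private key candidates by common file name (potential exposure);
# *.pem also matches certificates, so review the results
find / -path /proc -prune -o \
  \( -name "id_rsa" -o -name "id_ed25519" -o -name "*.pem" \) -type f -print 2>/dev/null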
For enterprise-scale discovery, use configuration management (Ansible) to scan all servers simultaneously:
- name: Collect SSH key inventory
  hosts: all
  become: true  # needed to read other users' home directories
  tasks:
    - name: Find authorized_keys files
      find:
        paths: ["/home", "/root"]
        patterns: "authorized_keys"
        recurse: yes
      register: auth_keys_files
    - name: Read key entries
      command: "ssh-keygen -lf {{ item.path }}"
      loop: "{{ auth_keys_files.files }}"
      register: key_fingerprints
      changed_when: false  # read-only command
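Once fingerprints are collected from every host, the useful question is which keys are trusted in the most places. The sketch below assumes you have flattened the results into one text file with one line per key, prefixed by the host name; that file format and the function name are assumptions for illustration, not something the playbook produces on its own.

```shell
# Input format (an assumption): "<host> <ssh-keygen -lf output>", e.g.
#   web01 256 SHA256:AbC... alice@laptop (ED25519)
keys_by_host_count() {
  # Field 1 is the host, field 3 the fingerprint. Deduplicate
  # host/fingerprint pairs first so a key listed twice on one host
  # counts once, then count hosts per fingerprint.
  awk '{ print $3, $1 }' "$1" | sort -u |
    awk '{ hosts[$1]++ } END { for (fp in hosts) print hosts[fp], fp }' |
    sort -k1,1nr -k2,2
}

# Usage: keys_by_host_count inventory.txt
```

Keys at the top of the output are trusted by the most hosts, so they are the first candidates for ownership review.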
Step 2: Inventory and Ownership
For every key discovered, determine:
- Who owns it? (match key fingerprint to a person or service)
- What does it access? (which servers trust this key)
- Is it still needed? (is the owner still active? Is the service still running?)
- When was it created? (SSH keys don’t have creation dates — check file timestamps)
- Is it protected? (passphrase-encrypted? On an HSM? Or plaintext on disk?)
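The ownership question can be partly automated when key comments follow a user@host convention. A rough sketch under that assumption: active_users.txt is a hypothetical one-username-per-line export from your directory, and the function name is illustrative. Comments are free text and can be edited by anyone, so treat a match as a hint rather than proof.

```shell
# Print authorized_keys entries whose comment does not map to an active user.
# Assumptions: comments look like "user@host" and are the last field;
# the active-users file holds one username per line.
find_unowned_keys() {
  local keys_file=$1 active_file=$2
  awk '
    FILENAME == ARGV[1] { active[$1] = 1; next }  # first file: active users
    /^[ \t]*(#|$)/ { next }                       # skip comments, blank lines
    {
      owner = $NF
      sub(/@.*/, "", owner)                       # "alice@laptop" -> "alice"
      # With no comment at all, the last field is the key blob itself,
      # which will not match any username, so the entry is flagged too.
      if (NF < 3 || !(owner in active)) print
    }
  ' "$active_file" "$keys_file"
}

# Usage: find_unowned_keys /home/deploy/.ssh/authorized_keys active_users.txt
```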
Step 3: Establish Policy
Define and enforce:
SSH Key Policy:
- Algorithm: Ed25519 only (RSA-4096 for legacy systems)
- Passphrase: Required for all interactive keys
- Maximum age: 12 months (rotate annually)
- Service account keys: 6 months maximum
- Shared keys: Prohibited (one key per person/service)
- Key deployment: Via configuration management only (no manual ssh-copy-id)
- Offboarding: All keys removed within 24 hours of termination
- Audit: Quarterly review of all authorized_keys entries
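The algorithm rule is the easiest part of the policy to check mechanically. The sketch below reports entries whose key type is not Ed25519, treating RSA as tolerated legacy per the policy above. It is a heuristic: it inspects the key type only (RSA key length would need ssh-keygen -lf), and the function name is made up for illustration.

```shell
# Flag authorized_keys entries whose key type is outside the policy.
check_key_types() {
  awk '
    /^[ \t]*(#|$)/ { next }
    {
      type = "unknown"
      # Options may precede the key type, so scan for the first field
      # that looks like a key type.
      for (i = 1; i <= NF; i++)
        if ($i ~ /^(ssh-|ecdsa-|sk-)/) { type = $i; break }
      if (type == "ssh-ed25519") next
      verdict = (type == "ssh-rsa") ? "LEGACY" : "VIOLATION"
      print verdict ": line " FNR " (" type ")"
    }
  ' "$1"
}

# Usage: check_key_types /home/alice/.ssh/authorized_keys
```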
Step 4: Automate Rotation
Manual rotation doesn’t happen. Automate it:
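# Ansible playbook: enforce the approved key list (run on a schedule, e.g. quarterly)
- name: SSH Key Rotation
  hosts: all
  become: true
  tasks:
    - name: Deploy current authorized keys (from central source of truth)
      ansible.posix.authorized_key:
        user: "{{ item.user }}"
        # exclusive is not loop-aware: pass ALL of a user's keys in one
        # newline-separated string, with one loop item per user
        key: "{{ item.public_keys | join('\n') }}"
        state: present
        exclusive: yes  # REMOVES any key not in this list
      loop: "{{ approved_ssh_keys }}"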
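The exclusive: yes flag is critical: it removes any key not in your approved list from each account the playbook manages. The option is not loop-aware, so all of a user’s keys must be passed in a single key value; otherwise each loop iteration wipes the keys deployed by the previous one. For shared accounts this handles offboarding automatically (remove the person’s key from the list, and the next run removes it from all servers). An account that is dropped from the list entirely is never touched by the task, so remove or lock departed users’ own accounts in a separate step.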
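Before turning on exclusive mode against production, it helps to preview what would be deleted. A minimal dry-run sketch, assuming you have the approved keys in a file in authorized_keys format (approved.pub is a hypothetical name); it compares keys by their base64 blob, so a changed comment does not count as a different key.

```shell
# List keys in a deployed authorized_keys file that are NOT in the approved
# file, i.e. the entries an exclusive deploy would remove.
keys_to_be_removed() {
  local approved=$1 deployed=$2
  awk '
    /^[ \t]*(#|$)/ { next }
    {
      # Identify a key by its base64 blob: the field after the key type.
      blob = ""
      for (i = 1; i < NF; i++)
        if ($i ~ /^(ssh-|ecdsa-|sk-)/) { blob = $(i + 1); break }
      if (blob == "") next
      if (FILENAME == ARGV[1]) ok[blob] = 1
      else if (!(blob in ok)) print
    }
  ' "$approved" "$deployed"
}

# Usage: keys_to_be_removed approved.pub /home/deploy/.ssh/authorized_keys
```

An empty result means the deployed file already matches the approved list.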
Step 5: Move to SSH Certificates (The End Game)
SSH certificates solve the fundamental problem: keys have no expiry. Certificates do.
# Issue a short-lived SSH certificate (8 hours)
ssh-keygen -s /etc/ssh/ca_key \
-I "alice@example.com" \
-n "ubuntu,deploy" \
-V "+8h" \
~/.ssh/id_ed25519.pub
# Server trusts the CA, not individual keys
# /etc/ssh/sshd_config:
TrustedUserCAKeys /etc/ssh/ca.pub
# After 8 hours: certificate expires, access ends automatically
# No authorized_keys to manage. No keys to rotate. No offboarding to forget.
With SSH certificates:
- No authorized_keys files on any server
- Access expires automatically (8-24 hour certificates)
- Centralized access control (CA decides who gets certificates)
- Complete audit trail (CA logs every certificate issued)
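- Simple revocation (stop issuing certificates and access ends when the current certificate expires; for immediate cutoff, list the certificate in a key revocation list referenced by RevokedKeys in sshd_config)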
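Stopping issuance only ends access once certificates already in circulation expire. To cut one off sooner, OpenSSH supports key revocation lists (KRLs) built with ssh-keygen -k. The sketch below is one possible wrapper: the KRL path, the run helper, and the DRY_RUN switch are assumptions for illustration, and distributing the KRL file to every server is left out.

```shell
# Sketch: revoke a certificate before it expires using an OpenSSH KRL.
# Set DRY_RUN=1 to print the command instead of running it.
KRL_FILE=${KRL_FILE:-/etc/ssh/revoked_keys}

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

revoke_cert() {
  local cert=$1
  if [ -f "$KRL_FILE" ]; then
    run ssh-keygen -k -u -f "$KRL_FILE" "$cert"   # -u: add to existing KRL
  else
    run ssh-keygen -k -f "$KRL_FILE" "$cert"      # create a new KRL
  fi
}

# sshd must be told about the KRL in sshd_config:
#   RevokedKeys /etc/ssh/revoked_keys
# Usage: revoke_cert /path/to/id_ed25519-cert.pub
```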
Tools for SSH Key Management
| Tool | Type | Best For |
|---|---|---|
| Teleport | SSH certificate platform | Full SSH access management with SSO integration |
| Smallstep | SSH CA | Lightweight SSH certificate issuance |
| HashiCorp Vault | SSH secrets engine | Dynamic SSH credentials (OTP or signed keys) |
| Ansible | Configuration management | Deploying/rotating authorized_keys at scale |
| CyberArk | PAM | Privileged access management with SSH key vaulting |
| QCecuring SSH KLM | Lifecycle management | Discovery, inventory, rotation, compliance |
Compliance Requirements
| Framework | SSH Key Requirement |
|---|---|
| SOC 2 | Access credentials must be inventoried and reviewed periodically |
| ISO 27001:2013 | A.9.2.4: Management of secret authentication information of users |
| PCI DSS 4.0 | 8.6: Use of application and system accounts and associated authentication factors is strictly managed |
| NIST 800-53 | IA-5: Authenticator management (includes SSH keys) |
| CIS Benchmarks | 5.2.x: SSH server configuration and key management |
Auditors increasingly ask: “Show me your SSH key inventory. When were keys last rotated? How do you handle offboarding?” If you can’t answer, it’s a finding.
Quick Wins (Do This Week)
- Disable password authentication on all servers (PasswordAuthentication no in sshd_config). Forces key-based auth.
- Scan for authorized_keys across all servers. Count total entries. Compare to active employee count.
- Remove keys for departed employees (check HR termination list against key owners).
- Disable root SSH access (PermitRootLogin no). Force named accounts.
- Enable SSH logging (LogLevel VERBOSE in sshd_config). Know who connects and when.
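The three sshd_config settings in the list above can be spot-checked with a short script. This is a sketch, not a full parser: it reads one file literally, takes the first occurrence of each keyword, and ignores Include files and Match blocks. Running sshd -T as root shows the effective configuration and is the more reliable check. The function name is made up for illustration.

```shell
# Check an sshd_config file for the three hardening settings.
audit_sshd_config() {
  local file=${1:-/etc/ssh/sshd_config}
  awk '
    BEGIN {
      want["passwordauthentication"] = "no"
      want["permitrootlogin"] = "no"
      want["loglevel"] = "verbose"
    }
    /^[ \t]*#/ { next }
    {
      key = tolower($1); val = tolower($2)
      # Keep only the first occurrence of each keyword
      if ((key in want) && !(key in seen)) seen[key] = val
    }
    END {
      for (key in want) {
        found = (key in seen) ? seen[key] : "unset"
        status = (found == want[key]) ? "OK" : "FIX"
        print status, key, "(found: " found ")"
      }
    }
  ' "$file" | sort
}

# Usage: audit_sshd_config /etc/ssh/sshd_config
```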
FAQ
Q: How do I know which keys belong to former employees?
A: SSH public keys often have a comment field (e.g., ssh-ed25519 AAAA... alice@company-laptop). Match comments to your employee directory. Keys without identifiable comments are the dangerous ones — they could belong to anyone.
Q: Should I use SSH keys or SSH certificates?
A: If you’re starting fresh or have <50 servers: SSH certificates (Teleport, Smallstep, Vault). If you have existing infrastructure with hundreds of servers: start with key management (inventory, rotation, policy), then migrate to certificates over 6-12 months.
Q: How do I handle service account SSH keys?
A: Service accounts should use dedicated keys (never shared with humans), stored in a secrets manager (Vault, AWS Secrets Manager), rotated on a 6-month schedule, and scoped to minimum necessary access (forced commands in authorized_keys where possible).
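As an illustration of the forced-command scoping mentioned in that answer, the sketch below builds a restricted authorized_keys entry. The restrict option (OpenSSH 7.2 and later) turns off port, agent, and X11 forwarding and PTY allocation, and command="..." runs one fixed command no matter what the client asks for. The script path, key, and function name are placeholders.

```shell
# Build a restricted authorized_keys entry for a service account.
restricted_entry() {
  local forced_command=$1 public_key=$2
  printf 'restrict,command="%s" %s\n' "$forced_command" "$public_key"
}

# Placeholder path and key, for illustration only
restricted_entry /usr/local/bin/deploy.sh "ssh-ed25519 AAAA... ci-deploy"
```

With an entry like this, a stolen CI key can trigger the deploy script but cannot open a shell or tunnel traffic through the server.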
Q: What about SSH agent forwarding?
A: Avoid it. Agent forwarding exposes your SSH agent to the remote host — if that host is compromised, the attacker can use your agent to access other servers. Use ProxyJump (ssh -J bastion target) instead — it routes through the bastion without exposing your agent.