Ransomware is not just a “security incident.” It is an operations, finance, legal, and reputation event that must be handled with rigor. A good playbook makes roles and decisions predictable under stress, turns technical steps into repeatable procedures, and measures recovery in business terms. Below are the essentials for CISOs, CSOs, CIOs, and CTOs: how to plan through tabletops, how to staff, which remediation steps matter, and how to recover reliably. I also share what we implemented at Pynest and the results we measured.
Planning: Build Muscle Memory With Realistic Tabletops
Plan as if you’ll be hit tomorrow. Keep plans short, specific, and owned by named individuals. Each scenario should include: entry vector, first signals, containment approach, decision gates, roles, notifications, recovery route, and success criteria.
Three core scenarios to cover
- Credentialed compromise that spreads quickly
- Double extortion with data theft
- Backup corruption or tampering
Run tabletops quarterly. Rotate times (including nights/weekends). Invite business owners, not only security and IT. Keep scenarios slightly ambiguous to force evidence-based decisions. End with a “hot wash” (an immediate after-action debrief), extract 5–10 improvements, assign owners and due dates, and track closure.
“Security is not a product, but a process.” — Bruce Schneier
Decide authority in advance. Specify who can isolate identity providers, revoke OAuth apps, shut down SSO or VPN for affected segments, block command-and-control, and trigger clean-room recovery. Name alternates.
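One way to keep this pre-decided authority checkable is to store it as data rather than in a slide. The sketch below is only an illustration: the action names and role names are hypothetical placeholders, not taken from any real tool, and a real version would use named people.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authority:
    """Who may approve one emergency action. All names are illustrative."""
    primary: str
    alternates: tuple


# Hypothetical matrix; replace the actions and owners with your own.
AUTHORITY = {
    "isolate_identity_provider": Authority("identity-owner", ("incident-commander",)),
    "revoke_oauth_apps": Authority("identity-owner", ("secops-lead",)),
    "shutdown_sso_vpn": Authority("incident-commander", ("network-lead",)),
    "block_c2": Authority("network-lead", ("secops-lead",)),
    "trigger_clean_room": Authority("incident-commander", ("backup-dr-owner",)),
}


def can_authorize(person: str, action: str) -> bool:
    """Return True if the person is the primary or a named alternate."""
    entry = AUTHORITY.get(action)
    if entry is None:
        return False  # unknown actions are never pre-authorized
    return person == entry.primary or person in entry.alternates


def actions_without_alternates() -> list:
    """List actions that would stall if the primary is unreachable."""
    return sorted(a for a, e in AUTHORITY.items() if not e.alternates)
```

Running `actions_without_alternates()` before each tabletop is a cheap way to confirm that every action still has a named backup.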
Staffing and Skills: Design for Coverage, Not Heroics
Your roster should cover identity, endpoints, backups, networks, and communications, with clear primary and backup owners.
Minimum functional roles
- Incident Commander: runs the war room, keeps a decision log, reports to executives
- Identity Owner: AD/Entra/Okta, conditional access, token revocation, service principals, vault rotation
- Endpoint/EDR Lead: quarantine at scale, detection tuning, forensically sound evidence collection
- Backup & DR Owner: immutability policy, restore tests, service restore priorities
- Network/SecOps: emergency segmentation, C2 blocking, clean networks for rebuild
- Legal & Communications: notification rules and language for regulators, customers, partners
- External IR Retainer: pre-contracted DFIR with access and NDAs in place
“We cannot allow avoidable cyber disruption to cost human lives.” — Jen Easterly
Shift skills left. Train non-security owners on the exact runbooks they must execute. A backup admin who can run a restore drill on their own is more valuable than any policy binder.
Remediation: Contain Fast, Eradicate Thoroughly, Verify Like a Skeptic
Containment checklist
- Isolate compromised devices and accounts
- Suspend risky trust relationships: SSO for affected segments, high-risk OAuth apps, unused service accounts
- Block known command-and-control routes and lateral movement protocols where feasible
- Preserve evidence: memory, disk, identity audit logs, cloud admin actions
- Communicate on a pre-agreed, safe channel and timestamp decisions
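The "timestamp decisions" item can be as simple as an append-only log that the Incident Commander fills in during the call. Here is a minimal in-memory sketch. The field names (`who`, `decision`, `evidence`) are my assumptions, and a real log would be written to storage that sits outside the affected environment.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DecisionLog:
    """Append-only list of timestamped incident decisions (in memory only)."""
    entries: list = field(default_factory=list)

    def record(self, who: str, decision: str, evidence: str,
               at: Optional[datetime] = None) -> dict:
        # Store UTC so entries from different time zones sort consistently.
        moment = (at or datetime.now(timezone.utc)).astimezone(timezone.utc)
        entry = {
            "at": moment.isoformat(),
            "who": who,
            "decision": decision,
            "evidence": evidence,  # what was known when the call was made
        }
        self.entries.append(entry)
        return entry
```

Recording the evidence next to the decision makes the hot wash easier, because the team can review what was known at the time rather than what was learned later.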
Eradication checklist
- Rebuild from golden images
- Rotate everything: passwords, API keys, OAuth secrets, certificates, service credentials
- Patch initial access vectors and exploited weaknesses
- Hunt for persistence: scheduled tasks, GPO backdoors, rogue identity apps, modified conditional access, startup services
Verification checklist
- Rescan rebuilt systems and risk-rate the findings
- Use canary files and tokens to catch residual encryption behavior
- Run contract tests at service boundaries to confirm critical flows work
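The canary item above can be illustrated with a small file-integrity check: plant decoy files, record their hashes, and later confirm none were changed, renamed, or deleted. This is a sketch under assumptions. The file names and directory are hypothetical, and a production canary would also alert in real time rather than wait for a scheduled check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the full file contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plant_canaries(directory: Path, names: list) -> dict:
    """Write decoy files and return {path: expected SHA-256}."""
    baseline = {}
    for name in names:
        path = directory / name
        path.write_text(f"canary:{name}\n", encoding="utf-8")
        baseline[str(path)] = sha256_of(path)
    return baseline


def check_canaries(baseline: dict) -> list:
    """Return findings; an empty list means no canary was touched."""
    findings = []
    for path_str, expected in baseline.items():
        path = Path(path_str)
        if not path.exists():
            findings.append(f"missing: {path_str}")   # deleted or renamed
        elif sha256_of(path) != expected:
            findings.append(f"modified: {path_str}")  # content changed
    return findings
```

A "missing" finding matters as much as a "modified" one, since many ransomware families rename files when they append their own extension.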
“Quantify the risk. Identify key assets, make sure they are backed up and that the backups are secure.” — Kevin Mandia
This is the right mental model for remediation: measure what matters, protect what pays the bills, and assume attackers targeted your recovery path as well.
Recovery: Engineer for Predictability
Backups that survive the attacker
- Enforce 3-2-1-1-0: three copies, two media, one off-site, one immutable, zero errors in restore tests
- Keep at least one copy that cannot be altered even by admins
- Test restores on a schedule and publish pass/fail to executives
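The 3-2-1-1-0 rule is easy to state and easy to drift away from, so it helps to check it mechanically. The sketch below assumes a hand-maintained inventory of backup copies. The `BackupCopy` fields are illustrative, and whether the production copy counts toward the three is a policy choice you would need to make yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str        # e.g. "disk", "tape", "object-storage"
    offsite: bool
    immutable: bool   # cannot be altered or deleted, even by admins


def check_3_2_1_1_0(copies: list, restore_test_errors: int) -> list:
    """Return the rule violations; an empty list means the policy is met."""
    gaps = []
    if len(copies) < 3:
        gaps.append("fewer than three copies")
    if len({c.media for c in copies}) < 2:
        gaps.append("fewer than two media types")
    if not any(c.offsite for c in copies):
        gaps.append("no off-site copy")
    if not any(c.immutable for c in copies):
        gaps.append("no immutable copy")
    if restore_test_errors != 0:
        gaps.append("restore tests reported errors")
    return gaps
```

The returned list of gaps could feed the pass/fail line in the executive report directly.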
Clean-room rebuild
- Stand up an isolated environment, restore core identity and networking, then a minimal set of business services
- Validate data integrity and configurations before reconnecting
- Reintroduce segments in phases and run contract tests after each step
Measure recovery like operations
- Time to isolate risky accounts
- Time to restore crown-jewel services
- Restore pass rate and integrity checks
- User-visible service levels during staged return
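These measures can be derived from a drill timeline instead of being estimated afterwards. A minimal sketch, assuming each drill records ISO 8601 timestamps under the hypothetical keys `detected`, `accounts_isolated`, and `crown_jewels_restored`, plus one pass/fail result per restore test:

```python
from datetime import datetime, timedelta


def elapsed(start: str, end: str) -> timedelta:
    """Difference between two ISO 8601 timestamps."""
    return datetime.fromisoformat(end) - datetime.fromisoformat(start)


def restore_pass_rate(results: list) -> float:
    """Fraction of restore tests that passed (True means pass)."""
    if not results:
        raise ValueError("no restore tests recorded")
    return sum(results) / len(results)


def drill_metrics(timeline: dict, restore_results: list) -> dict:
    """Summarize one drill in the terms executives see."""
    return {
        "time_to_isolate": elapsed(
            timeline["detected"], timeline["accounts_isolated"]),
        "time_to_restore_crown_jewels": elapsed(
            timeline["detected"], timeline["crown_jewels_restored"]),
        "restore_pass_rate": restore_pass_rate(restore_results),
    }
```

Raising an error when no restore tests exist is deliberate: an empty result set should show up as a gap, not as a perfect score.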
Governance and Communications: Decide Before the Crisis
- Ransom stance: document your position on paying or not paying, who advises, and applicable legal or insurance constraints
- Notification logic: decision trees for regulators, customers, partners, and the board; pre-approved language
- Executive reporting: a short, consistent dashboard (detection time, isolation completeness, restore pass rate, data exposure status, next milestones)
What We Implemented at Pynest
A year ago, we rewrote our ransomware playbook to cut containment time and raise confidence in recovery. We focused on people, drills, and measurable outcomes.
Tabletops that change behavior
- Quarterly runs, including nights/weekends
- Each ends with a hot wash and an improvement backlog with owners and deadlines
- Leadership reviews closure monthly
- Cultural outcome: teams expect ambiguity and ask for evidence first
Identity and endpoint readiness
- Standard runbooks to revoke tokens, disable suspicious identity apps, rotate vault items, and quarantine at scale
- Pre-approved authority to shut down SSO/VPN for affected segments
- Backup and network owners trained to act in parallel with security
Backups built for real incidents
- 3-2-1-1-0 with immutable copies
- Weekly restore tests into a clean room, with results visible to the board
- Crown-jewel catalog mapped to explicit RTO/RPO for business-first recovery order
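To show how a crown-jewel catalog can drive recovery order, here is a small sketch. It is not our production tooling. The service names, RTO values, and the `depends_on` field are invented for illustration. It restores dependencies first (identity and networking before the services that need them) and breaks ties by the tightest RTO.

```python
import heapq


def recovery_order(catalog: dict) -> list:
    """Order services so dependencies come first, then by tightest RTO.

    catalog maps service -> {"rto_hours": number, "depends_on": [services]}.
    """
    remaining = {n: set(spec.get("depends_on", [])) for n, spec in catalog.items()}
    unknown = {d for deps in remaining.values() for d in deps} - set(catalog)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    # Services with nothing left to wait for, keyed by RTO (smallest first).
    ready = [(catalog[n]["rto_hours"], n) for n, deps in remaining.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for other, deps in remaining.items():
            if name in deps:
                deps.remove(name)
                if not deps:  # last dependency restored, so it is now eligible
                    heapq.heappush(ready, (catalog[other]["rto_hours"], other))
    if len(order) != len(catalog):
        raise ValueError("dependency cycle in catalog")
    return order
```

For example, a catalog with `identity` (RTO 2h), `network` (RTO 1h), and `payments` (RTO 4h, depending on both) yields `network`, `identity`, `payments`.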
Runbook-as-code
- Versioned, clickable runbooks with decision gates and owner call trees
- “Dry-run” scripts for identity and network controls to practice safely and often
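For readers who want a feel for the runbook-as-code idea, the sketch below models a runbook as an ordered list of steps with owners, decision gates, and a dry-run mode that never calls the real action. It is a simplified illustration and not our actual implementation. The `Step` fields and step names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    owner: str
    action: Callable[[], None]  # the real change; never called in dry-run
    gate: bool = False          # True means explicit approval is required


def run(steps: list, approved: set, dry_run: bool = True) -> list:
    """Walk the runbook in order and return a transcript of what happened."""
    transcript = []
    for step in steps:
        if step.gate and step.name not in approved:
            transcript.append(
                f"STOP {step.name}: waiting for approval from {step.owner}")
            break  # a closed decision gate halts everything after it
        if dry_run:
            transcript.append(f"DRY-RUN {step.name} (owner: {step.owner})")
        else:
            step.action()
            transcript.append(f"DONE {step.name} (owner: {step.owner})")
    return transcript
```

Defaulting `dry_run` to `True` is the key design choice: practicing is the easy path, and making a real change requires an explicit opt-in.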
What changed
- Time to isolate high-risk accounts in drills dropped from hours to under 20 minutes
- Restore test pass rate climbed from the high seventies to the high nineties (percent)
- Executives get a consistent one-pager during exercises and discuss tradeoffs by business impact
- The board sees trend lines, not anecdotes, and backs investments tied to measurable gains
The Playbook, Section by Section
Preparation and tabletops
- Map business processes to systems, data, and owners
- Define triggers that start the ransomware playbook
- Publish roles with names, alternates, and escalations
- Run quarterly tabletops and track findings to closure
Staffing and skills
- On-call rotations for identity, endpoint, backup/DR, network, and communications
- External DFIR retainer with access and NDAs ready
Detection and containment
- Isolate devices and accounts
- Suspend risky trust for affected segments
- Block command-and-control and protect clean networks
- Collect and store evidence away from rebuild paths
Eradication
- Rebuild from golden images
- Rotate keys, tokens, and certificates
- Patch initial access and exploited weaknesses
- Hunt for persistence
- Verify with rescans and canaries
Recovery
- Prioritize services by business impact
- Use a clean room to restore and validate
- Reconnect in phases with contract tests
Governance and communications
- Decide ransom policy and advisors
- Pre-write regulatory and customer notifications
- Report the same small set of metrics every time
A Short Checklist You Can Adopt Today
- Name your Incident Commander and alternates. Grant authority to isolate identity, revoke trust, and trigger clean-room rebuilds.
- Schedule the next tabletop now. Vary time and scenario. Run a hot wash and track fixes to closure.
- Assume monitoring gaps. Layer controls across identity, endpoints, networks, and data.
- Enforce 3-2-1-1-0 backups. Test restores on a schedule and report results to leadership.
- Measure recovery in business terms. Track time to isolate, time to restore crown jewels, and restore pass rates.
- Decide your ransom stance and notification rules ahead of time. Keep legal and communications in the loop.
Bottom line: a strong ransomware playbook creates predictable behavior under pressure. Clear owners, tested decisions, and verifiable clean rebuilds turn a chaotic breach into a managed recovery. Prepare now so your first real rehearsal is not the day your business is on the line.