DevOps is good for business - this is undeniable. Elite performers consistently outperform their peers in the market - regardless of the market segment.
The DevOps Handbook, Second Edition - How to Create World-Class Agility, Reliability, & Security in Technology Organizations, by Gene Kim, Jez Humble, Patrick Debois, John Willis, and Nicole Forsgren, is a guided tour through the world of DevOps. The book provides principles, proven practices, and case studies for organizations that want to build safe and resilient systems with minimal maintenance cost and high levels of business agility.
DevOps isn't about automation,
just as astronomy isn't about telescopes
Fast flow of planned work to production
How technology work is managed and performed is a leading indicator of market success
Downward Spiral - traditional goals of operations (stability) and development (change) in opposition
Cannot see the big picture when pickled
Creating systems that cause feelings of powerlessness is one of the most damaging things we can do to fellow human beings
Deploy during business hours,
lead times in minutes to hours
Empirically great for business
Proven better business outcomes - Accelerate
Loosely coupled teams scale
Part I - The Three Ways
- Flow (capabilities and practices)
- Feedback and monitoring
- Continuous learning (generative culture)
10+ Deploys Per Day - 2009 John Allspaw and Paul Hammond
Configuration and Infrastructure managed using craftsmanship principles
Improvement Kata - daily practice improves outcomes...
Avoid cargo cult Lean/Agile
Value Stream - sequence of activities necessary to produce value
Deployment Lead Time - time from Dev Complete to Production
Shift Quality Left - build quality in, not tacked on at the end
Process Time (Touch Time) = Lead Time - Wait Time
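As a rough illustration of the formula above, the sketch below computes lead time, process time, wait time, and flow efficiency (process time divided by lead time) for a single work item. The function name `flow_metrics`, its parameters, and the sample timestamps are all hypothetical, chosen only for this example.

```python
from datetime import datetime, timedelta

def flow_metrics(requested_at, deployed_at, touch_intervals):
    """Return lead, process, and wait time plus flow efficiency for one work item."""
    lead_time = deployed_at - requested_at
    # Process (touch) time: total time someone was actively working on the item.
    process_time = sum((end - start for start, end in touch_intervals), timedelta())
    # Rearranged from: Process Time = Lead Time - Wait Time
    wait_time = lead_time - process_time
    # Dividing two timedeltas yields a plain float ratio.
    flow_efficiency = process_time / lead_time
    return lead_time, process_time, wait_time, flow_efficiency

# Hypothetical work item: requested Monday 09:00, in production 48 hours later.
requested = datetime(2024, 1, 1, 9, 0)
deployed = datetime(2024, 1, 3, 9, 0)
touches = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 14, 0)),  # 4 h of development
    (datetime(2024, 1, 2, 13, 0), datetime(2024, 1, 2, 15, 0)),  # 2 h of review and test
]
lead, process, wait, efficiency = flow_metrics(requested, deployed, touches)
print(lead, process, wait, f"{efficiency:.1%}")
```

In this made-up case only 6 of the 48 hours are touch time, so most of the lead time is waiting, which is where improvement effort would likely pay off first.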
The First Way
- increase flow
- make work visible - Kanban
- Theory of Constraints - The Goal
- reduce batch size
- eliminate waste
The Second Way
- fast feedback loops
- establish cause and effect
- enables learning
Continually automate manual tasks - self service
Technical Andon Cord - defect / second opinion - safety culture
Push decision making down to where work is
The Third Way
- continuous learning
Transform local discoveries into global improvements
Blameless Postmortem - learn from failures, do not assign blame
More important than daily work,
Improvement of daily work
Resilience through anti-fragility
Experiment iteratively towards True North goal
Part II - Where to Start
Thought process to guide decisions, actual steps to be taken, and case studies to visualize
Start with enthusiastic early adopters - find early wins - land and expand
Little fish learn to be big fish in little ponds
Focus on wait times and rework
Invest 20% (at least) in the "ilities" (NFRs)
Shared pain reinforces shared goals
Design systems with Conway's Law in mind
Part III - Technical Practices of Flow
Environments like cattle, not pets - no snowflake servers
Production-like environments on demand
Design systems/architecture for testability
Immutable Infrastructure - all artifacts, source, and configuration in version control - removes variance
"Done" includes potentially shippable code deployed in a production-like environment
Automated testing is foundational - manual testing cannot scale
CI+ - Continuous Integration built upon DevOps practices
Slow and periodic feedback kills development effectiveness
... especially at scale
Non-idempotent tests should be rewritten or removed
Optimize for team productivity
- Continuous Integration
- Trunk Based Development
Deploy to production as frequently as possible
Every commit for single piece flow
Boring deployments lead to high agility
Decouple deployment and release
- blue/green (environment)
- feature flags (code)
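One way to picture the feature flag approach is the minimal sketch below: the new code path is deployed but stays dark until the flag's rollout percentage is raised. The `FeatureFlags` class, the `new_checkout` flag, and the `checkout` function are hypothetical names for this example, not a real library's API.

```python
import hashlib

class FeatureFlags:
    """In-memory flag store with percentage rollouts (illustrative only)."""

    def __init__(self, rollout_percentages):
        self._rollout = dict(rollout_percentages)

    def set_rollout(self, flag, percent):
        # Releasing becomes a configuration change, not a deployment.
        self._rollout[flag] = percent

    def is_enabled(self, flag, user_id):
        percent = self._rollout.get(flag, 0)  # unknown flags default to off
        # Hash flag and user together so each user lands in a stable bucket 0-99.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent

flags = FeatureFlags({"new_checkout": 0})  # deployed, but released to nobody

def checkout(user_id):
    if flags.is_enabled("new_checkout", user_id):
        return "new checkout flow"
    return "old checkout flow"
```

Calling `flags.set_rollout("new_checkout", 10)` would expose the new flow to roughly a tenth of users, and setting it back to 0 acts as a rollback without redeploying.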
Continuous Integration <
Continuous Delivery <
Continuous Deployment
Resource Utilization Trap
Flow Efficiency over
Resource Efficiency
Don't Rewrite,
Delegate and Decommission
Strangler Fig Pattern
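The Strangler Fig idea can be sketched as a thin routing facade that sits in front of the legacy system and delegates one route at a time to new code. Everything here (the `StranglerFacade` class, the path prefixes, the handlers) is a hypothetical illustration under simple assumptions, not a prescribed implementation.

```python
class StranglerFacade:
    """Routes requests to migrated handlers, falling back to the legacy system."""

    def __init__(self, legacy_handler):
        self._legacy = legacy_handler
        self._migrated = {}

    def migrate(self, path_prefix, handler):
        # Delegate one slice of functionality to the new implementation.
        self._migrated[path_prefix] = handler

    def handle(self, path):
        # Check longer prefixes first so the most specific route wins.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](path)
        # Anything not yet migrated still goes to the legacy system.
        return self._legacy(path)

facade = StranglerFacade(lambda path: f"legacy:{path}")
before = facade.handle("/orders/1")
facade.migrate("/orders", lambda path: f"new:{path}")
after = facade.handle("/orders/1")
untouched = facade.handle("/billing/2")
```

Once every prefix has been migrated and the fallback no longer receives traffic, the legacy handler can be decommissioned.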
Part IV - The Second Way - Technical practices of feedback
Logging, monitoring, and visualizing telemetry/events as a first principle
Make tracking anything easy
Application Performance Monitoring - if it's important enough to implement, it's important enough to instrument
Allow self service of metrics
Overlay system events - such as deployments, incidents, and/or maintenance periods - on top of business metrics
Use means and standard deviations to detect anomalies in clusters - do not need to know what normal is, just what it is not
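A minimal sketch of the mean and standard deviation approach: compare each node in a cluster against the cluster as a whole and flag whatever sits too far from the mean. The function name `find_outliers`, the three-standard-deviation threshold, and the latency figures are assumptions made for this example. The technique also assumes the metric is roughly normally distributed.

```python
from statistics import mean, stdev

def find_outliers(metric_by_node, threshold=3.0):
    """Return node names whose metric is more than `threshold` std devs from the mean."""
    values = list(metric_by_node.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least two values
    if sigma == 0:
        return []  # every node reports the same value, nothing stands out
    return [
        node
        for node, value in metric_by_node.items()
        if abs(value - mu) > threshold * sigma
    ]

# Hypothetical cluster: 19 nodes near 200 ms latency and one misbehaving node.
latencies = {f"node-{i:02d}": 200 + (i % 5) for i in range(19)}
latencies["node-19"] = 900

outliers = find_outliers(latencies)
print(outliers)
```

Note that a single extreme value inflates both the mean and the standard deviation, so with very small clusters a node may never cross a three-sigma threshold. A lower threshold or a median-based measure may suit those cases better.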
Rotate Pager Duty through the entire team - empathy and visualization - Goldilocks alerting (just right)
Validate assumptions empirically in lowest fidelity possible
Part V - The Third Way - The Technical Practices of Continual Learning and Experimentation
Pair programming is live peer review
- one repository to rule them all
- codify non-functional requirements
- create run books for manual operations
- guidance over governance on architectural decisions
In complex systems, errors are inevitable; use resilience engineering practices to accommodate them - enable chat rooms to trigger events and to report status, results, and alerts
Perform Blameless Postmortems after production incidents (and near misses) - as incidents decrease, lower the tolerance to catch ever-weaker failure signals
Retrospect Early, Retrospect Often
Make results globally available
Transform documented knowledge, processes, and design standards into reusable code
Create regular space for improvement blitzes - explicit time dedicated to learning and/or improvement of daily work - ensure cross pollination of teams/business units
Allow everyone to teach and learn - the most valuable thing an employee can do is teach someone or learn something new
Remove "I do not have time to test" as an excuse - Make It Easy
Make the change easy,
Then make the easy change
Automated testing, and automation in general, is foundational to all other DevOps practices
Part VI - The Technical Practices of Integrating Information Security, Change Management, and Compliance
Integrate information security practices into daily work - shift left - compliance by demonstration
Spread OWASP Top 10 throughout organization - create security learning opportunities
Utilize static/dynamic security analysis inside delivery pipeline - source, dependencies, and sub-dependencies are scanned for vulnerabilities
Create paved road for developers to follow - all necessary information security checks - internal package management
Classify Change Types by risk
- standard
- normal
- urgent
Change Management Traceability - link deployments to commits, and commits to work tickets
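The linking described above could look roughly like the sketch below, which ties a deployment to its commits and each commit to the ticket IDs found in its message. The ticket format (`ABC-123` style), the `trace_deployment` function, and the sample data are assumptions for illustration; a real setup would read commits from version control and tickets from the tracker in use.

```python
import re

# Assumed ticket key format: uppercase project key, hyphen, number (e.g. OPS-42).
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

def trace_deployment(deployment_id, commits):
    """Build a deployment -> commits -> tickets trace from (sha, message) pairs."""
    trace = {"deployment": deployment_id, "commits": [], "untraced": []}
    for sha, message in commits:
        tickets = sorted(set(TICKET_PATTERN.findall(message)))
        if tickets:
            trace["commits"].append({"sha": sha, "tickets": tickets})
        else:
            # Commits without a ticket reference break the audit trail.
            trace["untraced"].append(sha)
    return trace

# Hypothetical deployment containing two commits.
trace = trace_deployment(
    "deploy-2024-01-03",
    [
        ("a1b2c3", "OPS-42 rotate TLS certificates"),
        ("d4e5f6", "fix typo in README"),
    ],
)
```

A pipeline step could fail the build when `trace["untraced"]` is non-empty. The simple pattern would also match strings such as `UTF-8`, so a real check would probably validate keys against the tracker.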
Physically and logically separate components that require special compliance - do not force strict policies where not needed