DEV Community

Filipe Ximenes

There's more to risk management than what engineers typically see

Managing risk in software involves much more than evaluating what can break, and engineers frequently make poor decisions because they lack a fuller understanding of what risk management encompasses.

When we talk about software risk, engineers typically focus on functionality breaking or systems failing catastrophically. Although these situations deserve attention, this limited view of risk can severely impact our ability to evaluate options and lead to decisions that hurt both business and careers.

One critical risk I constantly consider relates to prioritization and cost-effectiveness. While it might seem unusual to frame this as risk, the connection is direct. Commercial software resources are limited - both in money and engineering time - and companies constantly compete to win market share. Delivering the right product at the right time is a competitive advantage that can win customers or prevent losing business to competitors. Sometimes, releasing a partially broken feature is actually less risky than delaying the release to get it right.

This ties directly to the risk of complexity and over-engineering. Our industry has excellent processes and tools for building and maintaining software - I'm constantly amazed by how much these have improved our work. However, this often leads to people reaching for tools far beyond their actual needs. Every day I see one post about teams migrating to microservices, and two others about teams going back to monoliths. The best tool is the one that adequately meets your current objectives within your constraints. More software means more potential points of failure. Reducing code and dependencies is a risk mitigation strategy.

Recency bias presents another sneaky risk. We naturally give disproportionate attention to current events over past ones. Sure, it feels great to optimize that new feature to run under 10ms, but is it really more important than fixing that year-old query that is now taking 500ms? Effective risk management requires comparing and prioritizing - but before you can compare, you need visibility. Invest in tracking known issues, technical debt, and observability so you have the right information to guide how you invest your time.
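To make the "compare before you prioritize" point concrete, here is a minimal sketch of how two pieces of work could be ranked once you have numbers for them. Everything in it is an assumption for illustration: the `Candidate` class, the traffic figures, the target latencies, and the effort estimates are all made up, and "seconds of waiting removed per day of effort" is only one of many possible scoring rules.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One piece of performance work we could pick up (hypothetical model)."""
    name: str
    current_ms: float    # latency observed today
    target_ms: float     # latency we guess we can reach
    calls_per_day: int   # made-up traffic figure; would come from observability
    effort_days: float   # made-up engineering estimate

    def saved_seconds_per_day(self) -> float:
        # Total waiting time removed per day across all calls.
        return (self.current_ms - self.target_ms) * self.calls_per_day / 1000

    def payoff(self) -> float:
        # Waiting time removed per day, per day of engineering effort.
        return self.saved_seconds_per_day() / self.effort_days


def rank(candidates):
    """Return candidates ordered from highest to lowest payoff."""
    return sorted(candidates, key=lambda c: c.payoff(), reverse=True)


# Illustrative numbers only, not measurements.
backlog = [
    Candidate("new feature: 15ms -> 10ms", 15, 10, 20_000, 2),
    Candidate("year-old query: 500ms -> 100ms", 500, 100, 5_000, 3),
]

for c in rank(backlog):
    print(f"{c.name}: {c.saved_seconds_per_day():.0f}s/day saved, "
          f"payoff {c.payoff():.1f}")
```

With these invented inputs the old query ranks first even though it receives a quarter of the traffic. The ranking is only as good as the data behind it, which is why tracking and observability come before prioritization.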

To help engineers develop a more comprehensive approach to risk management, I've dedicated one of the four chapters of my book "Strategic Software Engineering: software engineering beyond the code" to this topic. Dive deeper into risk assessment, self-management - which I consider to be a kind of "non-technical" risk management - and many other essential topics that make great engineers at https://a.co/d/8kLbqtJ

How else can we improve our technical risk management skills? Share your experiences below.
