The VOID makes public software-related incident reports available to everyone, increasing understanding of software-based failures to make the internet a more resilient and safe place. After scrutinizing nearly 10,000 incidents, we found one thing to be crystal clear: resilience saves time. Taking the time to understand how to respond better when something green turns red, by learning from the people, the processes, and the systems, will make your next incident smoother.

Key Findings

Duration Isn't Cut and Dried

Incident duration tells us little about the incidents themselves, in part because it can be very tricky to determine when an incident starts or stops.

It's Time To Retire MTTR

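Mean Time to Resolve (MTTR) isn't a viable metric for the reliability of complex software systems, for many reasons. Chief among them is that averages of duration data are misleading: a few very long incidents can pull the mean far away from what a typical incident looks like.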
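To illustrate why an average can misrepresent duration data, here is a minimal sketch using Python's standard `statistics` module. The durations are made-up numbers chosen for illustration and are not taken from the VOID; the variable names are likewise hypothetical.

```python
import statistics

# Hypothetical incident durations in minutes (illustrative only, not VOID data).
# Most incidents are short; a single day-long outage dominates the total.
durations_min = [12, 18, 25, 30, 41, 47, 55, 63, 90, 1440]

mean_min = statistics.mean(durations_min)      # the "MTTR" for this sample
median_min = statistics.median(durations_min)  # the middle of the sample

# Count how many incidents were actually shorter than the mean.
below_mean = sum(d < mean_min for d in durations_min)

print(f"mean:   {mean_min:.1f} min")
print(f"median: {median_min:.1f} min")
print(f"{below_mean} of {len(durations_min)} incidents were shorter than the mean")
```

In this sample the mean is about 182 minutes while the median is 44 minutes, and 9 of the 10 incidents were shorter than the mean. One long incident is enough to make the average describe almost none of the incidents it summarizes.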

Duration and Severity Aren't Related

We found that duration and severity are not correlated—companies can have long or short incidents that are very minor, existentially critical, and nearly every combination in between.

Root Cause Analysis Is on the Decline

Although we added four times as many incidents in 2022, the number of reports based on Root Cause Analysis (RCA) didn't increase proportionally. We even saw large enterprise organizations move away from RCA as they embrace more in-depth analyses.

