Operations | Monitoring | ITSM | DevOps | Cloud

Going beyond MTTx measuring what "good" incident management looks like

Traditional MTTx metrics have long been the go-to measure for incident management effectiveness, but they often fail to provide a full picture or drive meaningful improvements. We analyzed data from over 100,000 incidents to develop new industry benchmark metrics that better define what "good" incident management looks like.

Opsgenie is shutting down. Here's what that means, and how incident.io can help

Atlassian recently announced they’ll be shutting down Opsgenie, their popular on-call alerting tool. After June 4, 2025, no new Opsgenie accounts will be created, and by April 5, 2027, the service will shut down completely. Users don’t seem happy about it. If you’re currently using Opsgenie, this news is significant. A key part of your incident response process is disappearing, and Atlassian suggests moving to their other products, like Jira Service Management or Compass.

A seven-step framework for running incident debriefs

Ever wrapped up an incident, thought 'Phew, glad that’s over,' only to feel your stomach drop when you see the dreaded "Incident Debrief" on your calendar? We've all been there. Incident debriefs don't need to feel like sitting through your least favorite school subject. They can (and should!) actually be engaging and useful. At incident.io, we've found a simple, repeatable, and blameless framework.

Why engineering teams are moving from PagerDuty to incident.io On-Call

Recently, we hosted a webinar on migrating from PagerDuty, where we explored why so many engineering teams are rethinking their on-call tools. This blog post is based on that conversation, diving into the frustrations teams face with PagerDuty and how incident.io On-Call offers a better way forward.

Automated incident response: Why it matters and where it's headed

Incidents happen. Whether it’s a service outage, degraded performance, or an unexpected spike in errors, things will go wrong. The question isn’t if incidents will occur—it’s how quickly and effectively you can respond when they do. For years, incident response has been a mostly manual process: someone gets paged, scrambles to investigate, loops in the right people, and after some firefighting, hopefully resolves the issue before too many customers notice.

Overhauling PagerDuty's data model: a better way to route alerts

Since its launch in 2009, PagerDuty has been the go-to tool for organizations looking for a reliable paging and on-call management system. It’s been the operational backbone for anyone running an ‘always-on’ service, and it’s done the job well. Ask anyone about the product, and you’re all-but-guaranteed to hear the phrase “it’s incredibly reliable.” I agree. But reliability isn’t everything.