The struggles of actually applying incident theory

Sep 21, 2023 By Incident.io In Incident.io

Chris explains his thoughts on the theory of learning from incidents, and why work needs to be done to close the gap and help folks actually trying to get their job done.

Incident.io

Incident Management

What's wrong with MTTR?

Sep 21, 2023 By Incident.io In Incident.io

Taken from our full debrief on "Learning from incidents is not the goal", Chris walks through MTTR, the justifiable bad rap it has, and his thoughts on it as a measure.

Incident.io

Incident Management

Active and passive learning from incidents

Sep 21, 2023 By Incident.io In Incident.io

In this video, Chris shares his thoughts on the difference between active learning (writing and sharing debriefs, meeting to walk through an incident, etc.) and passive learning (running incidents in the open, dynamic collaboration, reviewing past incidents).

Incident.io

Incident Management

The Debrief: Learning from incidents is not the goal

Sep 21, 2023 By Incident.io In Incident.io

In this video, incident.io co-founder and CPO Chris Evans walks through his blog post "Learning from incidents is not the goal". We cover why he wrote this, his thoughts on the gap between theory and practice, and how people can really learn from incidents.

Incident.io

Incident Management

The balancing act of reliability and availability

Sep 19, 2023 By incident.io In Incident.io

As consumers, we expect the products and software we buy to work 100% of the time. Unfortunately, that’s impossible. Even the most reliable products and services experience some disruption in service. Crashes, bugs, timeouts. There are a ton of contributing factors, so it's impossible to distill disruptions down to a single cause. That said, technology is becoming more and more sophisticated, and so is the infrastructure that supports it.

Incident.io

The connection between incident management and problem management

Sep 15, 2023 By Luis Gonzalez In Incident.io

Sometimes, two concepts overlap so much that it’s hard to view them in isolation. Today, incident management and problem management fit this description to a tee. This wasn’t always the case. For a long time, these two ITIL concepts were seen as distinct—with specialized roles overseeing each. Incident management existed in one corner and problem management in the other. Then came the DevOps movement and the lines suddenly became blurred. So where do they stand today?

Incident.io

Practical guidance for getting started as a site reliability engineer

Sep 8, 2023 By Ben Wheatley In Incident.io

At the beginning of May, I joined incident.io as the first site reliability engineer (SRE), a very exciting but slightly daunting move. With only some high-level knowledge of what the company and its systems looked like prior to this point, it’s fair to say that I didn’t have much certainty in what exactly I’d be working on or how I’d deliver it.

Incident.io

July 2023 newsletter: Changelog-The Deluxe Edition

Aug 10, 2023 By incident.io In Incident.io

🎵 Gotta give the people, give the people what they want! 🎵 You've been asking. And we've been listening. Over the past few weeks, we've been shipping frequently requested features to help you bring your incident management to the next level. It may be the dog days of summer, but let's ignore that, yeah? Just take a look at this recent changelog. Note that this is the biggest one we've ever published.

Incident.io

incident.io: A scalable incident management solution built for enterprises

Aug 4, 2023 By Luis Gonzalez In Incident.io

For enterprise businesses, a lot is riding on the efficiency of their incident response. These organizations have large customer bases, complex products, and many incidents. They also have loads of incident responders across various roles, making it difficult to coordinate internally.

Incident.io

Why you need an internal status page

Aug 1, 2023 By Isaac Seymour In Incident.io

When we launched incident.io Status Pages a few months ago, we stressed the importance of communicating clearly with your customers about ongoing issues. To help with this, we spent a lot of time carefully designing a status page that’s easy to understand for everyone, whether they come from a technical background, work in a different area, or just want to get on with their day.

Incident.io
