
Latest Videos

What is clinical troubleshooting? #incidentmanagement #incidentresponse #sitereliabilityengineering

In this clip, Dan Slimmons explains what the clinical troubleshooting framework entails. It’s no secret that teamwork, when done right, can make a world of difference. So sometimes, when responding to a particularly complicated incident, it can be best to bring a team together to figure out what’s going on and work towards a fix. But it’s not enough to just jam a bunch of folks into a room and hope for the best. You need a framework in place to ensure that everyone stays focused, diagnoses the issue and resolves it as quickly as possible.

Learning is an iterative process #incidentmanagement #incidentresponse #sitereliabilityengineering

In this clip, Viktor Stanchev explains why it's important to remember that learning is an iterative process. Whether you’re a seasoned vet when it comes to incident response or just getting started, it can be easy to fall into the trap of doing too much all at once. And it just makes sense. Incident response doesn’t have a single, perfect formula, so teams can be left doing a little bit of everything in an effort to get it right.

It's better to declare incidents early #incidentmanagement #sitereliabilityengineering

In this clip, Viktor Stanchev explains why it's better to declare incidents early rather than late.

Mistakes happen for many reasons #incidentmanagement

In this clip, Dennis Henry of Okta explains why it's important to remember that mistakes happen for several reasons and don't have a single cause. In last week’s episode of The Debrief, we had on Colette Alexander, Director of Engineering at HashiCorp, to discuss some of the myths around incident response.

Why more low-severity incidents can be a good thing #incidentmanagement

In this clip, Dennis Henry of Okta explains why having more low-severity incidents can be a good thing.

Why "why" is the wrong question to be asking after incidents with Dennis Henry of Okta

In last week’s episode of The Debrief, we had on Colette Alexander, Director of Engineering at HashiCorp, to discuss some of the myths around incident response. One of the myths we spoke about was the idea that asking “why” is better than asking “how.” In reality, asking “how” lets you focus on the contributing factors that led to an incident, whereas “why” tends to single out a person, which can lead to a lot of blame.

Why action items shouldn't be the goal post-incident #incidentmanagement #podcast

In this clip, Colette explains why focusing on coming up with a list of action items post-incident is a big mistake. About the episode: What if we told you that everything you thought you knew about incident response was wrong? Well, at least some of it. Some of the things you’ve been doing for years might not be having the impact you thought they were. Or, even worse, some of the assumptions you’ve been making might be having a negative impact on you, your team and your organization.