Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

[SRE: From Theory to Practice] What's difficult about problem detection?

Feb 7, 2023 By Blameless In Blameless

In this episode of FTTP, Kurt Andersen and Matt Davis are joined by Joanna Mazgaj and Laura Nolan to talk about the implications of and considerations for problem detection. Watch the full episode and hear them share personal stories about the types of challenges you might face. Ultimately, how do we explain and address the socio-technical concepts behind problem detection?

[SRE: From Theory to Practice] What's difficult about incident command?

Feb 7, 2023 By Blameless In Blameless

Welcome back to our miniseries of fireside chats with SRE experts talking about the realities of their day-to-day work. Episode 2 gets intimate: What’s difficult about incident command? We invited Alyson van Hardenberg, Engineering Manager at Honeycomb.io, and Varun Pal, Staff SRE at Procore, to chat with Jake Englund and Matt Davis from the Blameless team. Watch the full conversation, where they cover everything from methodologies and technical expertise to the human and social aspects of reliability engineering.

Manage incidents end-to-end with PagerDuty

Feb 7, 2023 By PagerDuty In PagerDuty

Incidents happen. You can have a better experience managing incidents end-to-end with PagerDuty. Learn how you can leverage automation, work where you want, and integrate with all your tools in under two minutes.

Using Tagging and Routing Rules in Squadcast | Incident Classification | Event Tagging | Squadcast

Feb 7, 2023 By Squadcast In Squadcast

Event Tagging is a rule-based auto-tagging system: you define custom tags based on incident payloads, and those tags are assigned automatically to incidents when they are triggered. This video explains how to create tagging rules for efficient incident classification.
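
As a rough illustration of rule-based tagging in general, the sketch below matches rules against an incident payload and collects the resulting tags. It is not Squadcast's rule syntax or API: the `TaggingRule` class, the `assign_tags` function, and the payload fields `status` and `host` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable

Payload = dict[str, Any]


@dataclass(frozen=True)
class TaggingRule:
    """Hypothetical rule: if `matches(payload)` is true, apply tag_key=tag_value."""
    tag_key: str
    tag_value: str
    matches: Callable[[Payload], bool]


def assign_tags(payload: Payload, rules: list[TaggingRule]) -> dict[str, str]:
    """Evaluate rules in order; the first matching rule for a key wins."""
    tags: dict[str, str] = {}
    for rule in rules:
        if rule.matches(payload):
            tags.setdefault(rule.tag_key, rule.tag_value)
    return tags


# Example rules over assumed payload fields ("status", "host").
RULES = [
    TaggingRule("severity", "critical", lambda p: p.get("status") == "down"),
    TaggingRule("team", "database", lambda p: "db" in str(p.get("host", ""))),
]

incident_tags = assign_tags({"status": "down", "host": "db-01"}, RULES)
print(incident_tags)
```

Keeping each rule as a small predicate plus a tag pair means new classifications can be added without touching the matching loop; a real system would more likely express the conditions declaratively instead of as Python lambdas.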

What is BigPanda?

Feb 7, 2023 By BigPanda In BigPanda

BigPanda transforms IT data into actionable intelligence and automation, enabling incident response teams to increase uptime, efficiency, and velocity.

Maximizing IT Company Success through Effective On-Call Support

Feb 6, 2023 By emily In SIGNL4

Monitoring your systems with a reliable solution is important, but how do you ensure that the right people are informed about issues that arise? Identifying problems is only the first step; they also need to be routed to the appropriate individuals. Keep in mind that employees are not always sitting in front of the dashboard. On-call staff therefore need to be available outside normal working hours, including weeknights, weekends, and holidays, to respond quickly to emergencies and problems.

Adding Incident Watchers in Squadcast | Incident Notifications and Updates | Squadcast

Feb 6, 2023 By Squadcast In Squadcast

This video covers Squadcast's Incident Watchers feature. Any user or stakeholder can subscribe to an incident as a Watcher and observe it without being an active responder. Watchers can receive notifications for every update to the incident, or customize their watch options to be notified only about selected updates.

Enterprise Alert High Availability Installation and Settings

Feb 6, 2023 By Derdack In Derdack

A quick video showcasing the High Availability installation and settings for Enterprise Alert, so that you have the most complete Enterprise Alert setup possible, with built-in disaster recovery.

GitLab and Mattermost Playbooks Demo

Feb 6, 2023 By Mattermost In Mattermost

In this demo, we'll show you some of the latest features in Mattermost and the GitLab plugin that help teams streamline workflows. Watch how pipeline notifications and a release playbook come together with task actions to automatically complete tasks and simplify repeatable processes.

Common Incident Terminology

Feb 5, 2023 By Alain Troitter In Exigence

Operations, customer support, engineering, and most other groups use inconsistent language. This is a serious problem. Imagine NASA doing that with astronauts, or a navy whose ships talk to each other without using the same terms: something very bad will happen. In incident management, we use words like broke, failed, outage, doesn’t work, and dead, all describing the same condition.
