Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.


Is it a ghost or is it Flow Designer?

Maybe it’s the time of year or the change in temperature, but sometimes using xMatters Flow Designer can seem a little… spooky? Maybe it’s the unlimited capability it offers, or maybe it’s that it can make changes for you without you being aware they’re taking place. But every once in a while, we’re not sure whether we’ve just set up workflows too effectively or whether something a touch paranormal is happening with xMatters.


Improve your on-call experience with Datadog mobile dashboard widgets

Life happens—even when you’re on-call. You can’t take your laptop everywhere, but whether you’re on the train, at dinner, or at the gym, you can count on the Datadog mobile app for access to key data about the status and performance of your applications. Now, you can use Datadog mobile widgets to build an on-call mobile dashboard directly on your phone’s home screen, so it’s even easier to track the data you care about from anywhere.


Differences between Site Reliability Engineer Vs. Software Engineer Vs. Cloud Engineer Vs. DevOps Engineer

The evolution of Software Engineering over the last decade has led to the emergence of numerous job roles. So how different are a Software Engineer, DevOps Engineer, Site Reliability Engineer, and Cloud Engineer from one another? In this blog, we drill down and compare these roles and their functions.


Strategies to Reduce Hospital Readmission Rates

The Centers for Medicare & Medicaid Services (CMS) scrutinizes hospital readmission rates across the U.S. each year, and it levies financial penalties on organizations that overshoot acceptable hospital readmission rates. As healthcare systems across the country embark on a journey to introduce patient-centric models to their organizations, they must align their resources with ever-changing regulations for them to thrive.


SRE and Fighting Games

When learning SRE, you might find its principles a bit unintuitive. For example, it might be difficult to learn why aiming for 100% reliability is wasteful, or how reliability isn’t the same as availability, or why failure ought to be celebrated. Believe it or not, there is a method to these ideas. My goal in this article is to shed light on the principles and to leave you a believer, such that you’ll take steps towards starting SRE practices.


Now Available: Private Slack Channels

Ever heard the saying “Too many cooks”? If you’ve responded to incidents, you’ll likely understand the parallels. There are cases when incident command on a public channel isn’t the best option. Whatever your reason, we’ve got you covered. Users can now spin up a private Slack channel for an incident. Read more about how to do this here.

Customer Service Ops & PagerDuty Zendesk Integration v3 Full Case Ownership Use Case

PagerDuty's Zendesk Integration enhances communication between engineering and support teams by giving visibility into high-impact incidents through the PagerDuty Status Dashboard, which is integrated into the Zendesk interface. Automate workflows for a fast-paced support team and provide the right level of information so they can interact knowledgeably with their customers while also reducing time and effort.

PD, Salesforce Service Cloud, Slack: Proactive Case Escalation & Slack-First Intelligent Swarming

See how PagerDuty, Salesforce Service Cloud, and Slack empower collaboration across your organization to accelerate time to resolution. Proactively improve customer satisfaction in real time and break down silos to connect customer service teams with engineering teams to address incidents quickly when seconds matter. Enjoy greater control when resolving issues and anticipating customers' needs through an incident command console that gives customer service agents and stakeholders instant updates on critical, customer-impacting issues.