
The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

A "Single Source of Truth": New Tools for Fast, Efficient Customer Service

Customer-facing teams have their hands full doing whatever they can to address customer issues quickly. At PagerDuty, our goal is to ease the burden of these teams by giving them the tools and access they need to deliver excellent customer experiences. Over the last year, we have deepened our integration with Salesforce Service Cloud, allowing users to work directly within the platform, reducing the need to context switch.

The Future of Incident Response is Automated, Flexible, and Proactive

We know our customers rely on PagerDuty as the backbone of critical real-time operations, so we want to make sure each and every enhancement helps streamline incident response. How can we help our customers spend less time firefighting and more time innovating? One of PagerDuty’s values is Champion the Customer – and we take this very seriously. When building and improving features, we aim to keep a pulse on what’s going on with our customers: what’s keeping them up at night?

Declare early, declare often: why you shouldn't hesitate to raise an incident

My first incident.io-incident happened in my second week here, when I screwed up the process for requesting extra Slack permissions, which made it impossible to install our app for a few minutes. This was a bit embarrassing, but also simple to resolve for someone more familiar with the process, and declaring an incident meant we got there in just a few minutes. Declaring the first incident when you start a new job can be intimidating, but it really shouldn’t be.

What is Automated Diagnostics and Why Should You Care?

A lot of people in technology talk about the cost of an incident solely from the perspective of downtime, or the number of customers and employees impacted. On the surface, that is often a fair angle to take. It makes the headlines, and customer reputation and trust are critical to the success of any business, obviously.

Evaluating xMatters Alternatives

The cost of IT downtime is enormous, as service breakdowns affect both top-line and bottom-line growth. As the digital ecosystem grows more complex and organizations adopt additional tools and systems to scale their businesses, it’s imperative that they are equipped with incident response tools that can accelerate incident response and reduce expensive downtime.

Squadcast + OSNexus QuantaStor Integration: Making Incident Management & Alerting more effective

Storage systems are an integral part of IT infrastructure. Given that modern markets are highly competitive and demanding, businesses strive for 24/7 availability. This in turn sets higher expectations for storage systems to be operational all the time. But just like other IT components, storage systems are prone to incidents. Hence, it is important to have an efficient communication process to manage alerts during system failures or disasters.

Real Talk webinar recap: analytics and reporting maturity

MTTR, or mean time to resolve, is an important key performance indicator for incident response teams to track, but it’s rarely useful for technological stakeholders or customers. To really make use of the data at their disposal, decision-makers must tailor the info they provide—and understand the scope and granularity of the data they have when they deploy an AIOps platform like ours. That’s the gist of our latest Real Talk webinar on analytics and reporting maturity.

How To Build an Escalation Policy for Effective Incident Management

Regardless of your organization’s size, industry, or security measures, you will inevitably face IT incidents. But what do you do if an incident affects a critical system and your on-call responders can’t resolve it? Does your team have a set of clearly outlined next steps they should take to handle the issue? Answering these questions can be complicated, even more so for large organizations that rely on cloud-based services to fuel their IT environment.

StatusCast expands product offering with Incident Management for IT Platform

May 31, 2022 – Columbia, MD – StatusCast today announced the release of its IT Incident Management service, expanding its flagship offering from best-of-breed Status Page services to include the full incident management life-cycle. The new offering goes beyond standard status updates, allowing IT teams to respond faster and with more effectiveness when systems fail or go offline.

What's New: Updates to Incident Response, AIOps, PagerDuty Process Automation, and More!

Summit’s right around the corner (have you registered yet?) but the shipping doesn’t stop! We’re excited to announce a new set of updates and enhancements to PagerDuty’s Digital Operations Platform. Recent updates from the product team span On-Call Management, Incident Response, Process Automation, and Integrations, as well as PagerDuty Community & Advocacy Events. New capabilities enable users and customers to resolve incidents faster, and more.