Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Using Squadcast's SLO Tracker | Error Budget | Setting up SLOs and configuring SLIs | Squadcast

With Squadcast, you can define and monitor Service Level Objectives for your services. SLOs allow you to define and enforce an agreement between two parties regarding the delivery of a given service. A Service Level Objective (SLO) is a reliability target, measured by a Service Level Indicator (SLI), and sometimes serves as a safeguard for a Service Level Agreement (SLA). SLOs represent customer happiness and guide the development team’s velocity.
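As a rough illustration of how an SLI, an SLO target, and an error budget relate, here is a minimal Python sketch. It is not Squadcast's implementation or API; the function names and the request-count model of availability are assumptions made only for this example.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of events that met the reliability criterion."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent.

    The budget is the unreliability the SLO tolerates (1 - slo_target).
    A negative result means the budget is overspent.
    """
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    burned = 1.0 - sli
    return 1.0 - burned / budget


# Hypothetical numbers: 100,000 requests, 50 of them failed, SLO of 99.9%.
sli = availability_sli(good_events=99_950, total_events=100_000)
remaining = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.0%}")
```

With these hypothetical numbers the service has used about half of its budget, which is the kind of signal a team might use to decide between shipping features and investing in reliability.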

Interrupts in software teams: using unplanned work to your advantage

Interrupts are often seen as a problem that eats away at your team’s productivity and gets in the way of shipping important things for your customers. They often stem from tech debt we consciously accept in order to ship features sooner. However, when a team doesn’t have a good strategy for dealing with the consequences of those decisions, the pain is felt much more acutely and much sooner.

PagerDuty Debuts as a Leader in 2022 GigaOm Radar for AIOps Solutions

Every year there is a surprise in a Radar report. While it won’t be a surprise to our thousands of customers who are seeing tremendous benefits with us, PagerDuty is excited to be named a Leader in the 2022 GigaOm Radar for AIOps Solutions. GigaOm uses extensive criteria to evaluate vendors in their Radar.

PagerDuty Incident Response Demo (Extended)

Enjoy this demo that showcases a day in the life of a team handling an incident with PagerDuty's Automated Incident Response solution. PagerDuty enables teams to orchestrate the right response for every incident. It also helps organizations protect revenue and improve customer experiences by resolving critical incidents faster and preventing future occurrences. Now you can bring major incident best practices to your organization with end-to-end response automation and friction-free postmortems.

Arize integration with PagerDuty

Streamline Model Monitoring with Integrated Alerts. Arize is an ML observability platform designed to detect, troubleshoot, and eliminate ML problems faster. Use Arize to monitor your production models and send alerts to PagerDuty when your models deviate from a certain threshold. Arize and PagerDuty help keep your teams in the loop, send more comprehensive metadata through alerts, and debug your models faster than ever before.

RESOLVE '22: How to get multi-cloud done right

Multi-cloud is inevitable. With AIOps, struggling with its complexity doesn’t have to be. Business technology stacks don’t appear out of a vacuum. For the modern cloud-enabled, cloud-dependent company (that is to say, most of them), the view from the inside looks more like an ongoing evolution than a monolithic choice.