The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

CEO Fireside at HumanX: Resilience at the Speed of Change

PagerDuty CEO and Chairperson Jennifer Tejada, in conversation with Honeycomb CEO Christine Yen and journalist Jennifer Strong on April 8, 2026 at HumanX in San Francisco, shows how observability and real-time response help builders spot issues sooner, fix them faster, and learn from every incident.

Best Emergency Mass Notification Solution for Businesses: OnPage (2026 guide)

When a critical incident or emergency strikes, businesses rely on well-defined incident response procedures to accelerate remediation. Incident response teams are on standby, and each responder understands their role in restoring services and minimizing customer impact. However, organizations often overlook an equally critical requirement: real-time communication with all stakeholders during incidents. This is not just an operational gap; it is increasingly a compliance and risk management requirement.

AI Didn't Change the Game, It Just Exposed Your Bottlenecks w/ Ganesh Datta (CTO, Cortex)

Every engineering org says they want to improve reliability — but most can't even agree on what "good" looks like. Ganesh Datta, Co-Founder and CTO of Cortex, has spent the better part of a decade helping companies confront that gap.

From Alerting Tool to Critical Communication Platform

Modern operations don’t break down only because alerts are misconfigured or missed. They break down when systems are difficult to manage, slow to adapt, or lack visibility into what’s actually happening in real time. Across industries, teams are managing an increasing volume of critical events: critical system alerts; after-hours urgent calls from patients, clients, or even emergency lines; voicemails; answering service calls; emergency notifications; and time-sensitive clinical communication.

How to Prevent and Resolve Incidents Using Model Context Protocol (MCP)

The rapid pace of modern software development, fueled by AI-driven coding and accelerated deployment cycles, has resurfaced a challenge that many development teams already struggled with: the speed of incident response must now match the speed of change. Every day, teams ship code faster than ever, which inevitably increases the risk of a new issue making it to production. The traditional approach—where engineers waste time jumping between disconnected tools—is no longer sustainable.

Updated Web Management Console Demo | On-Call Management, Hospital Communication & Call Routing

See the next-generation OnPage Enterprise Web Management Console in action, built to simplify on-call scheduling, incident alerting, critical communication workflows, and post-event reporting. In this demo, we walk through how teams can manage on-call schedules and escalation paths; send and track critical alerts in real time; gain visibility into alert activity, read rates, and response timelines; configure contact groups and communication workflows; and use the new Lines Management module to set up call routing, menus, and rules through a self-service interface.

Best Secure Messaging Apps for Healthcare Workers (2026 Buyer's Guide): OnPage

Secure messaging apps for healthcare workers are platforms designed to enable HIPAA-compliant communication, real-time collaboration and coordination, and urgent alerting across clinical teams for timely response. In modern hospitals, communication is no longer just about sending messages. It’s about ensuring the right person receives the right information and acts on it quickly.

Fear, Identity & Flaky Tests: AI in Reliability w/ Dana Lawson (CTO, Netlify)

The self-healing systems that SREs have dreamed about for a decade aren't a distant promise anymore — they're already being built, and the biggest barrier left is cultural. Dana Lawson, CTO at Netlify, has spent over 25 years in the trenches of developer infrastructure, from sysadmin roots to running the platform that powers 5% of the internet.

Incident Management in 2026: Best Practices, Tools Guide & More

When systems go down, every minute counts. You need more than just quick fixes. You need a solid system to spot problems early, take action fast, and learn from each incident to keep your users happy. That's what incident management is. In this guide, we'll walk through everything you need to know about incident management, from basic concepts to advanced strategies used by top DevOps teams.