The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Why OnPage Outperforms Epic Secure Chat for Critical Communication

Electronic Health Records (EHRs) like Epic are undoubtedly pivotal to modern healthcare. With its intuitive interfaces and deeply integrated clinical decision support systems, Epic has become a cornerstone in patient care. But when it comes to communication, particularly for urgent and critical workflows, Epic Secure Chat often leaves healthcare providers searching for alternatives. At OnPage, we’ve built a platform specifically designed to meet the nuanced demands of healthcare communication.

APAC Rundeck by PagerDuty Meetup - February 2025

Join us for an informal 1-hour virtual event where the open-source Rundeck by PagerDuty community comes together to share automation stories and use cases. Whether you're new to Rundeck or looking to elevate your automation game, this meetup is packed with valuable takeaways for everyone! Automating with Rundeck for Smarter Operations: Jade Chen, Associate DevOps Engineer at MYOB, shares how Rundeck by PagerDuty is a powerful ally for enhancing a team’s efficiency and improving customer service through automation features and remote API calls.

Demo Roundups! Security Incident Management

Cyber attacks can harm business operations, diminish brand reputation, and decrease revenue, making a robust security strategy essential. PagerDuty Operations Cloud leverages the power of AI and automation to respond to, automate, and remediate security incidents, ensuring cyber resiliency. Host: Mandi Walls (DevOps Advocate @ PagerDuty). Guests: PagerDuty’s Casey Clems (Security Engineer) and Sam Ferguson (Principal Product Manager).

IT Service Management (ITSM): A Complete Guide

As digital transformation accelerates, organizations face increasing complexity, tighter budgets, and relentless pressure to provide exceptional service. This creates a constant challenge in balancing cost, stability, and service. IT Service Management (ITSM) strategically designs, delivers, manages, and improves IT services by aligning them with business goals and optimizing service delivery.

Best incident management tools in 2025 [45 analyzed, top 3 picks]

PagerDuty, Splunk, ServiceNow — with dozens of incident management tools on the market, how do you know which one to choose? Here's the reality — downtime costs organizations an average of $9,000 per minute. That's why companies are increasingly investing in incident management tools to reduce disruption and improve their incident response. But with the market evolving rapidly and new players emerging constantly, selecting the right tool has become more challenging than ever.

I Want My Shoes Fast! Observability, SRE Burnout, and OTel with Dynatrace's Adriana Villela

In this episode, we sit down with Adriana Villela, Principal DevRel at Dynatrace and OpenTelemetry contributor, to break down how observability impacts reliability. We dive into what contributes to SRE burnout and how managers can create psychologically safer spaces for responders. Adriana also shares her perspective on AI as an observability buddy to navigate incidents.

ITSM vs. ITOM: What are the key differences?

IT service management (ITSM) and IT operations management (ITOM) both have the mandate to ensure your organization’s IT systems and infrastructure run smoothly and efficiently. These two frameworks are essential for any modern IT environment, but their roles are often confused or misunderstood. Simply put, ITSM focuses on the user-facing side of IT, streamlining services and aligning IT processes with business objectives.

Shorten your MTTR with Checkly Traces

We all know that Checkly is a ‘secret weapon’ for engineering teams who want to shorten their mean time to detection (MTTD). With Checkly, you can know within minutes if your service is unavailable to users or behaving unexpectedly. In this article we’ll talk about how Checkly traces can help you build on the benefits of Checkly, adding insights that will help you diagnose root causes and further reduce your mean time to resolution (MTTR) for outages and other incidents.

Your New Retrospective Experience: More Collaborative, Customizable, and Powerful

Run smarter, more effective retros. Customize retros, collaborate in real time, and surface key insights faster with AI. The new experience empowers you to spend less time documenting and more time working together as a team to uncover the insights that lead to real improvements in your process, roles, and technology.