Why you're (probably) doing service catalogs wrong

Service catalogs promise a lot of things: powerful automations, insights into your technology estate. But over the last few years, many of us have learned that setting up and maintaining a service catalog is really hard. Building out a catalog from a standing start can take months, or even years. Too many people get stuck in a chicken-and-egg situation, where you can’t deliver value because you don’t have the data in your catalog, and you can’t convince anyone to spend time helping you because the catalog doesn’t do anything yet.
Sponsored Post

Revamped Analytics: Unlock Deeper Insights with Squadcast

At Squadcast, we're always striving to make incident management seamless, efficient, and data-driven. That's why we're thrilled to announce the launch of our revamped Analytics feature! This major upgrade transforms the way you access metrics and insights about your incident response, offering a new level of granularity and speed.

Why we vibe coded a marketing campaign for Anthropic

Let’s start with the obvious: we’d like to have Anthropic as a customer. We greatly admire the work they are doing at the intersection of frontier models + safety. We use lots of different AI tooling at incident.io. We’re all-in on AI, both to improve the productivity of our internal team and, more importantly, to provide our customers with superpowers in the form of an AI incident responder.

Accelerate Government IT Innovation

Government IT operations across the public sector face unprecedented challenges this year. As digital demands intensify and legacy systems strain under pressure, agencies must accelerate IT innovation while delivering measurable ROI. The PagerDuty Operations Cloud emerges as the catalyst for government transformation, enabling agencies to revolutionize their digital operations while achieving operational excellence, according to The Government Guide for Agency Innovation ebook.

Bridging the Gap: How xMatters Aligns with ServiceNow's AI-Powered Future

Attending ServiceNow Knowledge 2025 in Las Vegas was an enlightening experience, especially for those of us invested in the xMatters platform. The conference’s emphasis on AI integration and agentic workflows provided valuable insights into how xMatters can evolve to better align with ServiceNow’s advancements. ServiceNow’s emphasis on integrating AI across enterprise workflows underscores the industry’s shift towards more autonomous and intelligent systems.

Unlock a new era of agentic, AI-powered IT operations with a modern data strategy

IT operations have reached a breaking point. Hybrid cloud and modern software architectures have led to unprecedented increases in the scale, complexity, and fragmentation of IT infrastructures. In their attempts to manage this complexity, enterprises invest billions into observability tools, IT Service Management (ITSM) platforms, and outsourced Managed Service Providers (MSPs).

Under the hood: Request coverage feature

The ilert mobile app is primarily used by responders to receive notifications about critical alerts, react to them on the go, and check their current on-call status. It has various capabilities, including critical notifications via push, quick actions for alerts, and critical alert settings. The app enables responders to view their current on-call shifts and escalation policies, take on-call shifts from somebody else, and create coverage requests to ask for on-call shift handover from a colleague.

PagerDuty + Microsoft Build 2025: Transforming critical work with AI and automation

At Microsoft Build 2025, PagerDuty was featured in key announcements showcasing how intelligent agents and real-time automation redefine digital operations. From Microsoft Copilot to the launch of a new Azure SRE Agent, PagerDuty was highlighted as a strategic partner in enabling intelligent, scalable incident response.

Rollbar and ilert: Real-time error monitoring meets smart incident response

We’re excited to share that Rollbar is now part of the ilert integration catalog! This new technical partnership allows software teams to detect application errors in real time with Rollbar and instantly respond using ilert’s powerful alerting and incident management features.

Demo Roundups! What's New in Services

Deep dive into PagerDuty's latest service management capabilities. Discover how to easily move incidents across services with minimal clicks for streamlined triage, faster resolution, and accurate metrics for enhanced operational insights. Plus, get a sneak peek at our upcoming service custom fields. Watch these powerful features in action and learn how they can transform your incident management workflow.

Spend Holidays with Family, Even While On-Call | How SIGNL4 Filters Alert Noise

Being on-call during a holiday doesn’t have to mean missing out on quality time with family. With SIGNL4, you can stay connected to your IT operations while enjoying the moments that matter most. In this video, discover how SIGNL4 helps on-call IT professionals by filtering out non-critical alerts, delivering only high-priority notifications, reducing alert fatigue, ensuring fast response to real emergencies, and supporting mobile alerting and escalation workflows.

What is Notification Software? From Routine Alerts to Emergency Response

Imagine this. You’re walking your dog, sipping a coffee, when your phone buzzes. Each ping comes from a different app, but they all have one thing in common: notification software. It’s woven into our everyday lives – quietly buzzing, blinking, or nudging us with information. But when it comes to critical operations in IT, healthcare, logistics, or utilities, not all notifications are created equal. Some are simply updates. Some are mass notifications. Others require immediate action.

What is AIOps? Use cases, benefits, and getting started

According to a recent survey by Enterprise Management Associates, IT outages can cost large enterprises more than $1.5 million per hour. AIOps offers a solution. With an effective AIOps platform in place, enterprises can decrease the frequency and cost of outages by 30% and reduce their duration to under an hour. AIOps means artificial intelligence for IT operations.

Accelerating AI-Powered Digital Operations: Integrating xMatters with ServiceNow's Latest Innovations

ServiceNow’s Knowledge 25 unveiled transformative innovations designed to embed AI deeply within every enterprise workflow. These advancements—like AI Agents, the AI Agent Fabric, and the AI Control Tower—mark a leap forward in digital operations. But what if you could amplify these capabilities even further, achieving truly proactive, automated, and seamless responses across your entire organization?

Healthcare and Crisis Teams Harness PagerDuty to Stay Ready and Resilient

For organizations that provide vital mental health assistance and safety crisis services, or that deliver critical humanitarian support when disaster strikes, reliable digital infrastructure is essential. Whether connecting individuals to crisis counselors via text or coordinating face-to-face healthcare support, these digital services must operate seamlessly.

OnPage-ConnectWise Manage Integration (Updated)

Discover how the OnPage + ConnectWise bi-directional integration streamlines incident response and communication for IT teams. With this powerful integration, when issues are created in ConnectWise that match specific triggers, OnPage instantly generates a high-priority alert. The alert includes the ticket subject, detailed description, and a direct link back to the original ConnectWise ticket.
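The trigger-to-alert flow described above can be sketched roughly as follows. This is a minimal illustration, not the OnPage or ConnectWise API: the `Ticket` fields, the trigger format (a dict of field names to required values), and the alert payload keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ticket:
    """A simplified ticket; every field name here is hypothetical."""
    ticket_id: int
    subject: str
    description: str
    priority: str
    url: str


def matches_trigger(ticket: Ticket, trigger: dict) -> bool:
    """Return True when every field named in the trigger equals the ticket's value."""
    return all(getattr(ticket, field, None) == expected
               for field, expected in trigger.items())


def build_alert(ticket: Ticket, triggers: List[dict]) -> Optional[dict]:
    """Build a high-priority alert if any trigger matches, otherwise return None."""
    if not any(matches_trigger(ticket, trigger) for trigger in triggers):
        return None
    # The alert carries the subject, the description, and a link back to the ticket.
    return {
        "priority": "high",
        "subject": ticket.subject,
        "body": ticket.description,
        "link": ticket.url,
    }
```

A ticket with `priority="critical"` checked against `[{"priority": "critical"}]` would produce an alert; against `[{"priority": "low"}]` it would produce nothing.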

Navigating the Future of IT Incident Management and Digital Operations

The landscape of IT Incident Management and Digital Operations is changing rapidly. Atlassian’s recent announcement that Opsgenie is being phased out in favor of Jira Service Management (JSM) signals a critical turning point for businesses that rely on robust, efficient incident management solutions. Organizations now face the imperative task of evaluating alternatives that offer the adaptability, automation, and intelligence necessary to maintain operational excellence.

Preparing for Opsgenie End of Life: Why xMatters is the Ultimate Alternative

On April 5, 2027, Opsgenie will officially reach its end of life (EOL). For organizations relying on Opsgenie, this news indicates an impending need to reassess their incident and on-call management solutions. Organizations must weigh their options carefully, with Atlassian positioning Jira Service Management (JSM) and Compass as replacements. Transitioning from Opsgenie to Jira Service Management or Compass may not be enough.

Your Observability Platform Has a Blind Spot: Don't Risk Your Operations on Bolt-on Incident Response Modules

Observability platforms want to do it all, from data collection to incident response. Their pitch is appealing: one platform to eliminate context switching and reduce overhead. But when critical systems fail, and they will fail, add-on incident management modules won’t save you. You need an end-to-end system built specifically for high-stakes incident management.

When Minutes Matter: The Iberian Peninsula Outage and the Future of Digital Resilience

On April 28, 2025, Spain, Portugal, and briefly some parts of France experienced what would become one of Europe’s most significant power outages in recent history. As millions across the Iberian Peninsula found themselves suddenly disconnected, a stark reality emerged: in our interconnected world, the ripple effects of major incidents extend far beyond their immediate impact zone.

SIGNL4 April 2025 Release: Tiered Schedule and Call Routing

Our SIGNL4 April Update is here - packed with powerful new features! In this video, we explain how shift tiers with built-in escalation allow you to manage complex schedules with multiple layers. Automatic tier-to-tier escalation ensures the right person is alerted, delivering a reliable response when every second counts. Discover more features, such as the redesigned Signl-Center: a sleek, modern interface with a bento-style layout, one-tap action buttons, streamlined status tabs, and modular detail views.
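As a rough illustration of tier-to-tier escalation in general (not SIGNL4's implementation), the idea can be sketched as a loop over ordered tiers. The `escalate` function and the `acknowledges` callback are hypothetical names; the callback stands in for "notify this person and wait for a reply".

```python
from typing import Callable, List, Optional


def escalate(tiers: List[List[str]],
             acknowledges: Callable[[str], bool]) -> Optional[str]:
    """Notify each tier in order and return the first responder who acknowledges.

    `tiers` is an ordered list of shift tiers, each a list of responder names.
    Returns None when nobody in any tier acknowledges.
    """
    for tier in tiers:
        for responder in tier:
            if acknowledges(responder):
                return responder
        # Nobody in this tier acknowledged, so fall through to the next tier.
    return None


# Usage: tier 1 does not respond, so the alert reaches tier 2.
tiers = [["alice"], ["bob", "carol"]]
available = {"carol"}
first_responder = escalate(tiers, lambda name: name in available)
```

A real system would add per-tier timeouts and notification channels; the sketch only shows the ordering.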

How Leading Companies Are Reimagining Operational Efficiency

Several factors—including AI adoption, investor expectations, and the rise of a new generation of innovative upstart companies—have driven a renewed focus on efficiency in every industry. But organizations that attempt to improve operational efficiency and drive profits via layoffs and short-term cost-cutting often end up hurting the business in the long run.

SIGNL4 April 2025 Release: Redesign Signl-Center

Our SIGNL4 April Update is here - packed with powerful new features! In this video, we introduce the redesigned Signl-Center featuring a sleek, modern interface with a bento-style layout, one-tap action buttons, streamlined status tabs, and modular detail views. Incident handling is now faster than ever, with enhanced notification tracking and easy CSV export.

SIGNL4 April 2025 Release: AI Shift Planning & Duty Scheduling

Our SIGNL4 April Update is here - packed with powerful new features! In this video, we showcase how our AI-powered shift and duty scheduling works. The SIGNL AI identifies rotation patterns and proactively assigns future shifts based on historical scheduling data. Also in this release is the redesigned Signl-Center: a sleek, modern interface with a bento-style layout, one-tap action buttons, streamlined status tabs, and modular detail views.
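To make the idea of deriving future shifts from a rotation pattern concrete, here is a deliberately simple sketch. It is not the SIGNL AI: it swaps in a plain period-detection heuristic that assumes the history is a strictly repeating rotation, and the function names are hypothetical.

```python
from typing import List


def detect_period(history: List[str]) -> int:
    """Return the smallest p such that the history repeats every p shifts."""
    n = len(history)
    for p in range(1, n + 1):
        if all(history[i] == history[i - p] for i in range(p, n)):
            return p
    return n  # Only reached for an empty history.


def project_shifts(history: List[str], count: int) -> List[str]:
    """Continue the detected rotation for `count` future shifts."""
    if not history:
        return []
    p = detect_period(history)
    # The last p entries form one full cycle, starting at the next person due.
    cycle = history[-p:]
    return [cycle[k % p] for k in range(count)]


# Usage: a three-person rotation observed for five shifts.
past = ["ann", "bob", "cy", "ann", "bob"]
upcoming = project_shifts(past, 4)
```

Real schedules include swaps, holidays, and uneven rotations, which this heuristic would not handle.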

What Is an API Outage? Why It Happens and How to Avoid It

APIs are a big part of how modern applications or services work. They act as bridges, allowing systems to talk to each other and share data. Whether it's logging into an app or making an online payment, an application programming interface helps make that process smooth. But what happens when an API suddenly stops working? Even a short outage can cause a disruption. It can break features, delay operations, and impact users and businesses alike.

Are You Getting the Full Value from Your Automation Strategy? Here's How to Find Out

Take our maturity quiz today and see how your automation maturity stacks up! Let’s call a spade a spade: Automation isn’t just a “nice to have” anymore. Automation is business-critical for speed, scalability, and resilience. It’s a mechanism of survival in today’s hyper-modern state of business. But the reality is that not every team is on the same page when it comes to automation.

An ultimate step-by-step guide on Checkmk Cloud Monitoring

In February 2025, Checkmk launched Checkmk Cloud (SaaS), a fully managed, cloud-based version of its monitoring technology. This solution, designed for ease of use, allows enterprises to start monitoring their IT infrastructure with no installation, maintenance, or manual upgrades required. The SaaS version is compatible with both cloud-based and on-premises systems, bringing them together under a single, straightforward platform.

What is PagerDuty? Key Features & Benefits Explained

PagerDuty. You’ve probably heard it mentioned during outages or seen it in tech forums. Maybe your DevOps team talks about it, or you found it while looking for ways to handle system failures. So, what is PagerDuty exactly? And why do teams rely on it? This post breaks down PagerDuty in simple terms, explores its key features and benefits, and shows you how to get started. We’ll also introduce you to a PagerDuty alternative that might work better for your team’s needs.

How Operational Resilience Can Help Build and Maintain Trust

In today’s business landscape, trust and reputation are the foundation upon which organizations are built. A single service outage or poor customer experience can severely damage both revenue and brand reputation. When customers or businesses encounter obstacles with their preferred vendor, they often turn to competitors – and these temporary shifts frequently become permanent changes in loyalty.

Enhancing Observability and Incident Response with Site24x7 and ilert

By integrating Site24x7 with ilert, companies can automate their incident response workflows, ensure that the right people are notified instantly, and reduce Mean Time to Resolution (MTTR). Site24x7 provides robust monitoring for servers, applications, networks, and cloud infrastructure, including application logs, giving teams visibility into their environments. But when things go wrong, a timely response is just as critical as visibility. This is where ilert comes in.

Get Set Up in 5 Minutes or Less: A Fresh, Seamless Onboarding Experience

When you’re up and running with FireHydrant, there’s no better incident management experience out there. We built it that way: fast, intuitive, reliable when it matters most. Now, the first five minutes are just as streamlined and enjoyable as the rest. We rebuilt our onboarding flow from the ground up and cut setup time by over 90% in the process. With the new onboarding flow, you get guided steps to connect your tools and get the most out of FireHydrant.

Top 5 EHR Systems 2025

Electronic health records (EHRs) are real-time, digital records of patient health information, maintained by providers over time and detailing each patient's healthcare journey at length. By using EHR systems, doctors can locate patient information, including recently administered medications, past medical history, or chronic conditions, from anywhere they can connect to their system.

Accelerating Velocity With AIOps in the Age Of AI-Everything

IT teams are inundated with an ever-expanding array of operational data. While collecting this data is straightforward, extracting meaningful insights that drive business value is anything but. This is where modern AIOps makes all the difference – and where PagerDuty stands apart. PagerDuty AIOps doesn’t contribute to tool sprawl and data overload; it tames them.

The EU AI Act and what it means for managing incidents

If you've been in earshot of tech leadership lately, you've probably heard the words 'EU,' 'AI,' and 'compliance' in conversation. The EU AI Act is officially upon us, and with it comes a whole new set of incident response and reporting requirements that might feel like yet another bureaucratic burden to worry about. But there's a different way to look at this legislation.

3 Ways to Use FinOps Automation for Cloud Cost Optimization

The cloud is the backbone of modern businesses, revolutionizing the trajectory of innovation, technology and business itself. While its promise of instant scalability and flexibility drives unprecedented growth, these same advantages can become a double-edged sword. The ease of spinning up new resources, automating deployments, and expanding services across regions—all of which make the cloud so powerful—can quickly lead to sprawling infrastructure and runaway costs if not carefully managed.

What's New: Gentle High Priority Alerts

A calmer way to respond quickly, without the shock. I’m really excited to share a new feature that’s been close to our hearts (and ears): Gentle High Priority Alerts. This one’s for everyone who’s ever been jolted out of sleep, or even deep focus, by a high-priority notification (a “page”) that felt more like an alarm clock than an alert.

April Wrap-Up: Product Updates Across the PagerDuty Operations Cloud

PagerDuty is committed to redefining what digital operations look like in the era of AI and automation. This vision drives us to continuously innovate and enhance the PagerDuty Operations Cloud, ensuring every update brings our customers closer to achieving operational excellence. Building on the momentum of our Spring product launch at PagerDuty On Tour, we’re excited to showcase what we’ve shipped this quarter.