AWS re:Invent 2025 - From Alert to Action: AWS + PagerDuty Agentic Ops

Hear how AWS and PagerDuty are transforming incident management with agentic and generative AI. Learn how agents within AWS Quick Suite and PagerDuty work together to detect, diagnose, and resolve incidents with less toil and less swivel-chair switching between tools. This session explores how AI collaboration is reshaping resilience across cloud environments.

AWS re:Invent 2025 - Smarter Incident Response with Logz.io and PagerDuty

In this session, Jacky Leybman from PagerDuty and David Lotan Bolotnikoff from Logz.io showcase how PagerDuty and Logz.io combine generative AI with rich historical context to automate root cause analysis and accelerate incident response. By correlating real-time telemetry with prior incidents and runbooks, teams reduce manual toil and MTTR while maintaining human-in-the-loop oversight and transparent reasoning.

AWS re:Invent 2025 - AI-First Incident Management in Slack

Jacky Leybman from PagerDuty and Kaninie Knight from Slack share how their integration streamlines incident response and real-time collaboration. This session highlights practical workflows and measurable gains – such as faster triage and lower MTTR – achieved by connecting on-call operations directly in Slack.

What NVIDIA, Okta, and Warner Bros. Discovery Learned About Scaling AI Operations Beyond the Pilot Phase

One key takeaway from AWS re:Invent 2025 was that a clear gap has emerged between teams still experimenting with AI and those seeing measurable value at scale. In two sessions, PagerDuty customers joined us onstage to explain how they’ve scaled pilots into successful AI operations.

How Forward-Looking Institutions are Benefiting from Agentic AI

Today’s higher education institutions operate complex digital ecosystems that were unimaginable a decade ago. Behind every college portal lies a web of interconnected systems for registration, financial aid, course management, and campus services. The students using those systems are digital natives who can order food in seconds on their phones or have packages delivered the same day they order them.

PagerDuty Becomes Newest AWS Software Partner to Earn Resilience Competency

As enterprise system failures cost businesses an estimated $400 billion annually in lost revenue and productivity, PagerDuty announced it has achieved the Amazon Web Services (AWS) Resilience Services Competency in the software category, becoming one of the first AWS Software Partners to earn the designation. This achievement validates PagerDuty's ability to help enterprises architect, deploy, and maintain mission-critical systems that can withstand failures and recover rapidly with minimal business disruption.

Turning Incidents Into Insight: The Continuous AI Operations Loop Explained

Modern systems generate enormous volumes of operational data. Yet, most incident workflows still treat every outage like a one‑off fire drill: an alert fires, responders scramble, the issue is resolved, the status page goes green—and the organization learns almost nothing from the experience. Meanwhile, the same patterns quietly repeat in code releases, logs, traces, and support tickets until they erupt into the next ‘unexpected’ incident.