The latest news and information on AIOps, alerting in complex systems, and related technologies.

Bridging IT and OT: Lessons from the Factory Floor with Steve Goudreau

Everyone’s rushing to AI, but few have the foundation to make it work. In this episode of Next Gen Network Heroes, Bob sits down with Steve Goudreau, Director of IT at Ice Industries, to explore what it really takes to lead in today’s evolving technology landscape. With over three decades of experience, spanning military service, financial services, and manufacturing, Steve brings a grounded, people-first perspective to an industry often obsessed with tools and trends.

Why Threshold Monitoring Fails in Distributed Systems

For years, infrastructure stability could be approximated through static limits. If CPU utilization exceeded a defined percentage or response time crossed a fixed boundary, risk was assumed to increase in a predictable way. Monitoring systems were designed around that assumption, and for contained environments, it largely held true.
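The static-limit approach described above can be sketched in a few lines. This is only an illustration: the limit values (85% CPU, 500 ms response time) and the names StaticThresholds and breaches are assumptions made for the example, not taken from the article or from any particular monitoring product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StaticThresholds:
    # Hypothetical fixed limits, chosen only for illustration.
    cpu_percent: float = 85.0
    response_time_ms: float = 500.0


def breaches(cpu_percent: float,
             response_time_ms: float,
             limits: StaticThresholds = StaticThresholds()) -> List[str]:
    """Return the names of the metrics that exceed their fixed limits."""
    alerts = []
    if cpu_percent > limits.cpu_percent:
        alerts.append("cpu")
    if response_time_ms > limits.response_time_ms:
        alerts.append("response_time")
    return alerts
```

Each metric is judged in isolation against a constant, which is the assumption the article says held for contained environments: a call such as breaches(90.0, 120.0) flags only the CPU reading, regardless of what the rest of the system is doing.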

Frontline Truths: 100+ Network War Stories on the Path to Autonomous Operations - Eric Chou

The path to intelligent network operations isn’t a straight line. In this session from AI for Network Leaders – Powered by Selector, Eric Chou shares hard-earned lessons from over 100 conversations with network engineers and operators navigating automation, complexity, and the shift toward AI-driven operations. This session is a practical field guide for teams looking to move from reactive firefighting to building an AI-ready network foundation.

You Don't Have an AIOps Problem-You Have a Data Opportunity - Michael Wynston

AI can’t fix bad data. In this session from AI for Network Leaders – Powered by Selector, Michael Wynston breaks down a critical truth: the success of AIOps depends on the quality, consistency, and trustworthiness of your network data. Using real-world lessons from Fiserv’s large-scale network transformation, he explores how teams can build a strong data foundation that enables AI to deliver meaningful, low-noise outcomes.

Inside the AI Agents Transforming Network Operations - Joby Rudolph & James Schnebly | Selector

AI agents are becoming a core part of modern network operations, but what does it actually take to build and deploy them effectively? In this session from AI for Network Leaders – Powered by Selector, Joby Rudolph and James Schnebly break down how AI agents are designed, implemented, and applied in real-world network environments. This session provides a practical look at how AI agents are moving from concept to production and what it takes to make them work at scale.

From Tools to Teammates: A Practical Framework for AI Agents in Network Operations - Du'An Lightfoot

AI agents are quickly moving from experimentation to real-world deployment in network operations, but how do you adopt them without introducing unnecessary risk? In this session from AI for Network Leaders – Powered by Selector, Du’An Lightfoot shares a practical framework for building and deploying AI agents in production network environments. This session cuts through the hype and provides a clear, actionable model for teams looking to move from AI as a tool to AI as a teammate.

Building the AI Stack for Modern Network Operations - Surya Nimmagadda

AI is rapidly transforming network operations, but what does it actually take to build an AI stack that works in production? In this session from AI for Network Leaders – Powered by Selector, Surya Nimmagadda breaks down how modern AI systems for network operations are designed, deployed, and used today. This session is designed for network engineers, architects, and operators looking to move beyond theory and understand how AI is being applied in real production environments.

AI Meeting Bots Were Just the Beginning. Meet the AI Collaborator

Why the next era of enterprise AI isn’t about note-taking — it’s about digital workers who actually show up and do the work. There’s a moment every IT operations leader knows well. A critical incident hits at 2 PM on a Tuesday. Within minutes, a war room meeting spins up — a Google Meet or Teams call crowded with network engineers, SRE leads, cloud architects, and storage admins, all staring at dashboards and talking over each other. Someone is manually pulling syslog data.