
The latest news and information on AIOps, alerting in complex systems, and related technologies.

Episode 1 - Preparing the workforce for AI | The Intelligent Enterprise

In the first episode of our podcast The Intelligent Enterprise, Ricardo Costa, Senior Vice President and Chief Technology Officer at Purolator, shares his views on how to prepare the workforce for AI. Speaking from his role as a technology "translator" who connects business strategies with tech implementations, Ricardo highlights the importance of translating complex tech concepts into simple, understandable stories. He also addresses the leadership challenges of preparing the workforce for AI, including upskilling and ethical considerations.

Five ITOps best practices to stay ahead during major third-party outages

When external providers fail (whether it was the CrowdStrike outage last year, the AWS outage last month, or the Cloudflare DNS outage yesterday), the symptoms inside your environment often look like internal issues: timeouts, login failures, API errors, service degradation, or sudden spikes in dependency-related alerts. It’s natural for teams to start searching through their own infrastructure first, but none of these symptoms clearly points to your systems as the root cause.

Navigating External Outages: How Selector Cuts Through the Cloudflare Noise

Yesterday’s widespread Cloudflare outage reminds us how crucial external dependencies are to the stability of our own applications. When a key edge provider like Cloudflare goes down, the impact can look to your internal monitoring systems like a catastrophic internal failure, triggering a massive storm of alerts and sending engineering teams into frantic, misdirected debugging sessions.

Agents of IT podcast - Ep. 6 - What's real agentic AI and what's just hype?

Sean Heuer and Ari Stowe break down “agent washing,” governance, and what it really means for AI to take action instead of just chatting. In this clip from Agents of IT, they share practical ways to spot the difference between chatbots, scripted automations, and true agentic systems that can plan, reason, and execute autonomously. Watch the full episode to hear their perspective on.

AIOps - Laying a Strong Foundation with Full-Stack Observability

It is fair to say that AIOps is much more than just a catchy tagline; it is now a fundamental part of any enterprise that manages a modern, cloud-native architecture and distributed systems. As AIOps becomes more widely adopted and organizations expand, the volume of logs, metrics, and traces becomes too much for role-based tracking and monitoring tools. This is the point at which full-stack observability tools are needed: they provide the data that AIOps engines rely on for predictive and proactive detection of performance issues.