
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

The High Stakes of Aerospace Reliability

Aerospace systems operate in one of the most unforgiving environments imaginable. Each flight test, orbital maneuver, or satellite transmission subjects avionics, propulsion systems, sensors, and telemetry hardware to extreme conditions. Even a minor failure can cascade into grounded aircraft, interrupted communications, or compromised missions.

APM vs Observability: What comes next?

Remember how I said that blog was going to be my last entry on the topic of "APM vs Observability?" Well, it turns out I had a little more to say. I'd like to spend a few moments talking about the future of APM and Observability. I think it comes down to two major initiatives: AI and OpenTelemetry. (NOTE: in this section, I'm using the word "observability" to refer to the discipline of monitoring and observability as a whole, rather than any specific tool, technique, or vendor-based solution.)

ignio AI Agent for IT Event Management | AI Agent for alert noise reduction

Discover how ignio’s AI-powered agents are transforming IT event and alert management by combining Agentic AI, AI/ML algorithms and automation. In this video, we introduce ignio AI Agent for IT Event Management, a purpose-built, autonomous agent designed to reduce alert noise, group related alerts and predict future events. Whether you’re managing a large-scale enterprise infrastructure, cloud-native environment, or hybrid IT setup, this AI agent empowers your SRE and IT operations (ITOps) teams with real-time observability, automated alert correlation and suppression, and predictive intelligence.

Customer panel: Transforming IT & security

In an era where telemetry data grows at a 28% compound rate while budgets remain flat, traditional IT and Security approaches are facing unprecedented pressure. Join our distinguished customer panel as they share their transformative journeys with Cribl's data engine solutions. Our panelists will discuss how Cribl's vendor-neutral portfolio has enabled them to regain control over their data infrastructure, achieving both immediate operational improvements and strategic long-term advantages.

Unleashing Progress Flowmon 13: Speed, Smarts and Security Redefined

At Progress, we continue to develop and enhance the Progress Flowmon product family. The latest update brings the core Flowmon product to release 13.0, and it includes remarkable performance improvements, strengthened security and expanded protocol support. Full details of what’s new and improved in the latest release are available on the Flowmon product page. In this blog, we’re excited to highlight the newest features and improvements to the Flowmon solution.

Build Your Kubernetes Monitoring Foundation with kube-prometheus-stack

When you run Kubernetes at scale, one of the first challenges is understanding what the cluster is actually doing. Workloads shift around, pods restart for normal reasons, and traffic doesn't always follow the patterns you expect. Having clear signals makes day-to-day operations much easier. That's where kube-prometheus-stack helps. It brings Prometheus, Grafana, Alertmanager, and supporting components together as a single package.

How OpenTelemetry can enhance observability in distributed systems: Practical examples

Observability has become one of the fundamental elements of performance and reliability as modern applications move toward cloud-native architectures, microservices, and multi-cloud. Traditional monitoring techniques often fall short in such dynamic, distributed environments. That’s where OpenTelemetry (OTel), an open-source observability framework, comes into the picture.
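
One reason tracing works across distributed services is context propagation: each service forwards a trace identifier so that spans from different processes can be stitched into one trace. The sketch below is a minimal, standard-library-only illustration of the W3C Trace Context `traceparent` header format, which OpenTelemetry uses as its default propagation format. It does not use the OpenTelemetry SDK; the function names `new_traceparent` and `child_traceparent` are hypothetical and exist only for this example.

```python
import re
import secrets

# W3C Trace Context header layout: version-traceid-parentid-flags
# (2, 32, 16 and 2 lowercase hex characters respectively).
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent() -> str:
    """Start a new trace (hypothetical helper, not an OpenTelemetry API)."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    return f"00-{trace_id}-{span_id}-01"  # flags 01 = sampled


def child_traceparent(incoming: str) -> str:
    """Build the header a service would send downstream.

    The trace id and flags are kept so the spans join the same trace;
    only the span id changes. A malformed header starts a new trace.
    """
    match = TRACEPARENT_RE.match(incoming)
    if match is None:
        return new_traceparent()
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    root = new_traceparent()         # service A receives a request
    child = child_traceparent(root)  # service A calls service B
    print(root)
    print(child)
```

In a real deployment the OpenTelemetry SDK and its propagators handle this step; the sketch only shows what travels between services.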