
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Introducing Seer Agent: The answer is already in Sentry. Now you can ask for it.

This is a story about an engineer’s night that could have been bad, but ended up… not so bad. A few weeks ago, on a Saturday, our AI debugger, Seer, started failing. Note the big scary spike on the right. The errors were generic failures from the LLM calls, nothing that pointed at a root cause. Most of the team wasn’t scheduled to work that weekend, but Indragie, our Head of AI, happened to be online. He started paging engineers.

Certificate Discovery, Monitoring and Reporting | WhatsUp Gold 2026.0

Discover how WhatsUp Gold helps you identify and monitor certificates to reduce security risks, stay compliant, and avoid outages caused by expired or improperly configured certificates, featuring the latest reporting enhancements available in WhatsUp Gold version 2026.0.

Misconfigured Alert Detection: Find the Alerts That Need Tuning

Netdata ships with hundreds of stock alerts. They cover a wide range of infrastructure conditions and they’re designed with sensible defaults. But “sensible defaults” and “correct for your environment” are not the same thing. A CPU threshold that’s perfectly reasonable for a build server might generate constant noise on a machine running batch jobs.

What "AI-Ready Data" actually means for observability teams

Many organizations deploying AI are learning a similar lesson right now: the challenge isn’t the choice of AI model, it’s the data. According to Gartner, organizations will abandon 60% of AI projects that are not supported by AI-ready data. In addition, 63% of organizations either lack the right data management practices for AI or aren’t sure they have them.

The New Kubernetes Monitoring Experience in Splunk Observability Cloud

In this video, I walk through the three main pieces of the new Kubernetes monitoring experience in Splunk Observability Cloud: the Kubernetes overview page for monitoring the status and top issues across your environment, the Kubernetes Entities page for troubleshooting individual instances with correlated metrics, logs, events, and configuration, and the Workload Optimization view for getting actionable recommendations on your CPU and memory resource allocation.

Demo - Selector Platform NOC Operator Workflow

See how Selector transforms NOC operations in real time. This demo walks through a typical workflow - from ingesting massive volumes of network and system data to automatically detecting anomalies, correlating events, and pinpointing true root cause. Instead of chasing alerts across siloed tools, Selector delivers a single, intelligent view - reducing noise, highlighting impact, and accelerating resolution.

Demo - Selector Platform CoPilot Diagnosis

See how Selector’s AI Copilot accelerates issue diagnosis in real time. In this demo, watch how natural language queries and AI-driven insights help teams quickly analyze incidents, surface root cause, and understand impact - without digging through multiple tools. Instead of manual investigation, Selector guides operators to answers faster, reducing noise and speeding up resolution. Built for network and operations teams who need clarity, speed, and smarter troubleshooting.

Demo - Selector Platform Dashboard Validation

See how Selector enables real-time validation and visibility through customizable dashboards. In this demo, watch how teams can quickly monitor network and system performance, validate changes, and track key metrics - all in one unified view. Instead of piecing together data across tools, Selector delivers clear, actionable insights that help teams stay aligned and make faster decisions. Built for network and operations teams who need instant visibility and confidence in their environment.