The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

What's New in InfluxDB 3.3: Managed Plugins, Explorer Updates, and More

InfluxDB 3.3 is now available for both Core and Enterprise. It introduces new managed plugins for the Processing Engine, making it easier to address common time series tasks with just a plugin. On top of that, 3.3 includes a wide range of performance improvements, feature updates, and bug fixes. InfluxDB 3 Core is free and open source, optimized for recent data, and licensed under MIT and Apache 2.

Building an Incident Response Playbook: Templates and Examples

An incident response playbook is your team's emergency manual when things go wrong. It's a documented set of procedures that guides your team through detecting, responding to, and resolving incidents efficiently. Without one, teams often scramble during outages, make inconsistent decisions, and take longer to restore service.

Azure native integration elevates Elastic Cloud Serverless experience

We're thrilled to announce a significant leap forward in making Elastic Cloud Serverless even more accessible and powerful for Azure users. With the general availability (GA) of Elastic Cloud Serverless on Azure, we've just released the Azure native integration for Elastic Cloud Serverless. This builds upon our existing Azure native integration for Elastic Cloud Hosted, allowing users to seamlessly discover and manage Elastic Cloud in a way that feels inherently part of the Azure ecosystem.

Bring high-performance observability to secure Kubernetes environments with Datadog's new CSI driver

In Kubernetes environments, applications often communicate with the Datadog Agent to send telemetry data such as custom metrics via DogStatsD or traces through Datadog APM. How this communication takes place depends on the communication mode set on the Datadog Cluster Agent's Admission Controller. With the sockets option, communication takes place through local inter-process communication via Unix domain sockets (UDS), whereas the service and default hostip options rely on network communication.
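To make the difference between the two transport styles concrete, here is a minimal sketch using only the Python standard library. It assumes the plain-text DogStatsD datagram format (`name:value|type|#tags`); the socket path and host address in the usage comments are illustrative assumptions, not values taken from the article, and real applications would normally use an official Datadog client library instead.

```python
import socket


def format_metric(name, value, metric_type, tags=()):
    """Build a DogStatsD-style datagram: name:value|type|#tag1,tag2"""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload.encode("utf-8")


def send_over_uds(datagram, socket_path):
    """Send via a Unix domain socket (local inter-process communication)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, socket_path)


def send_over_udp(datagram, host, port=8125):
    """Send via UDP over the network (8125 is the usual DogStatsD port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))


if __name__ == "__main__":
    metric = format_metric("page.views", 1, "c", ["env:dev"])
    print(metric)
    # Hypothetical targets; adjust to your own deployment:
    # send_over_uds(metric, "/var/run/datadog/dsd.socket")  # assumed socket path
    # send_over_udp(metric, "10.0.0.12")                     # assumed node IP
```

The payload is identical in both cases; only the transport changes. With UDS the traffic never leaves the node, while the UDP variant depends on the pod being able to reach the Agent over the network.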

Datadog Disaster Recovery mitigates cloud provider outages

A loss of infrastructure and application observability can leave SRE and DevOps teams without insight into the real-time state of their production systems, forcing them to temporarily pause code deployments and limiting their ability to troubleshoot issues or respond to critical alerts. In modern cloud environments, where services are distributed and deeply interconnected, this lack of visibility can escalate quickly.

The Network Impact on Job Completion Time in AI Model Training

In large-scale AI model training, network performance is no longer a supporting actor; it’s center stage. Job Completion Time (JCT), the key metric for measuring training efficiency, is heavily influenced by the network interconnecting thousands of GPUs. In this post, learn why JCT matters, how microbursts and GPU synchronization delays inflate it, and how platforms like Kentik give network engineers the visibility and intelligence they need to keep training jobs on schedule.

From Anomaly to Action: ScienceLogic's Role in Accelerating Zero Trust Response

In today’s threat landscape, cyber incidents unfold in seconds, not days. Federal agencies and critical infrastructure operators no longer have the luxury of slow detection or manual triage. As Zero Trust Architecture (ZTA) becomes the new security standard, one principle stands above all: time is risk. The faster an organization can detect, diagnose, and respond to anomalous activity, the greater its resilience. ScienceLogic plays a critical role in making that speed possible.
Sponsored Post

AIOps for SAP: From Ground to Cloud

Anyone working in the SAP market in 2025 is aware of two big topics: migration to cloud-based ERP and the end of many long-used tools for managing SAP operations, including Focused Run, Landscape Manager, and Solution Manager. Both are impossible to ignore. Cloud-based ERP presents a new era of business software possibilities, and with it the opportunities and complexities of migration, transformation, and leveraging the elastic capacity and scalability of cloud-based designs. But right behind it, the question becomes "how are we going to run and manage this?"