
Spotify outage on December 17, 2025

On December 15, 2025, Spotify experienced a widespread outage that disrupted playback, logins, and app functionality for users around the world. While Spotify’s official status page remained silent throughout the incident, StatusGator detected the problem early using real user signals and issued an Early Warning Signal within minutes.

Reflecting on a year of smarter network monitoring: 2025

This year, the world leaned heavily on words like reimagine, rebuild, renew, reshape, and reinvent, and the same spirit defined our journey. As promised last year, we reimagined key capabilities, reshaped workflows, and restructured critical parts of our network monitoring tool to meet modern demands. At the same time, we reinforced the core foundation you've trusted for more than a decade: delivering reliable, usable features with uncompromising security.

Elevate your MSP operations: Key Site24x7 features you shouldn't miss in 2025

Managing multiple customer accounts as an MSP can be overwhelming. With the constant demands of configuring monitors, generating reports, and maintaining security across numerous customers, efficiency becomes critical. Throughout this year, we've focused on making your life easier with powerful new features that automate repetitive tasks, enhance security, and give you better visibility into customer health.

Notes from the Field: Migrating from VMware to XenServer

The customer was already using Citrix Provisioning Services (PVS) to deliver Virtual Delivery Agent (VDA) machines. Rather than attempting to migrate existing VMware-based VDAs, which often introduces driver conflicts and legacy dependencies, we followed a proven best-practice approach. We provisioned new VDA machines directly on XenServer using the PVS Virtual Desktop Setup Wizard. This ensured clean builds, free from VMware-specific components, and fully optimized for the XenServer platform.

Why OpenTelemetry instrumentation needs both eBPF and SDKs

As a vendor-neutral open standard, OpenTelemetry has become the default choice for application instrumentation. However, it’s important to remember that OpenTelemetry isn’t a single technology — it’s an ecosystem. Under the hood, it provides multiple options for instrumenting your applications. In this blog post, we explore two instrumentation approaches: OpenTelemetry eBPF Instrumentation and runtime-specific OpenTelemetry SDKs, like the OpenTelemetry Java agent.

Capture high-value traces without managing a pipeline: Tail sampling with Adaptive Traces

Tracing is the richest observability signal in common use today. In distributed systems, it reveals how requests flow across multiple services, allowing you to uncover and address performance bottlenecks. Teams often scale back or abandon tracing altogether, however, because most successful requests produce redundant data that’s noisy and expensive to store.
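To make the idea concrete, here is a minimal sketch of a tail-sampling decision, written in Python. It is not how Adaptive Traces is implemented; the `Span` type, the `keep_trace` function, the 500 ms threshold, and the 5% baseline rate are all assumptions chosen for illustration. The point is only that the keep/drop decision is made after the whole trace is available, so errors and slow requests can be kept while most routine successes are dropped.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical, simplified span record (not an OpenTelemetry type)."""
    trace_id: str
    duration_ms: float
    is_error: bool


def keep_trace(spans: List[Span], slow_ms: float = 500.0,
               baseline_rate: float = 0.05) -> bool:
    """Decide whether to keep a completed trace (tail sampling).

    Thresholds are illustrative assumptions, not recommended values.
    """
    if not spans:
        return False
    # Always keep traces that contain at least one failed span.
    if any(s.is_error for s in spans):
        return True
    # Always keep traces that contain a slow span.
    if max(s.duration_ms for s in spans) >= slow_ms:
        return True
    # Keep a small, deterministic fraction of the remaining traces by
    # hashing the trace ID into one of 10,000 buckets.
    digest = hashlib.sha256(spans[0].trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < baseline_rate * 10_000
```

Hashing the trace ID rather than drawing a random number means every component that sees the same trace reaches the same decision for the baseline sample.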

KubeCon Atlanta 2025 & the AI-Native Shift

KubeCon + CloudNativeCon North America 2025 in Atlanta marked a definitive moment for cloud-native infrastructure. Over four days, celebrating the 10th anniversary of both CNCF and Kubernetes, more than 9,000 attendees witnessed the ecosystem’s evolution from container orchestration to AI-native operations. The conference delivered a clear message – AI workloads are no longer experimental.

What's New in InfluxDB 3.8: Linux Service Management, Kubernetes Helm Chart, and Smarter Ask AI

InfluxDB 3.8 is now available for both Core and Enterprise, alongside the 1.6 release of the InfluxDB 3 Explorer UI. This release is focused on operational maturity and making InfluxDB easier to deploy, manage, and run reliably in production. InfluxDB 3 Core remains free and open source under MIT and Apache 2 licenses, optimized for recent data. InfluxDB 3 Enterprise builds on that foundation with long-range querying, clustering, security, and full operational tooling.

AI for IT Operations: How AIOps is Transforming IT Performance & Service Reliability

Artificial Intelligence for IT Operations (AIOps) ingests telemetry from logs, traces, events, resource signals, runtime behavior, and application pathways. AIOps reduces alert noise, correlates events into unified narratives, predicts degradation, and drives remediation logic with pattern-based execution. As telemetry grows, manual triage becomes slow, while inference scales linearly with data.

How To Connect Your Prometheus Server to a Grafana Datasource

Prometheus is one of the most popular open-source monitoring systems in the world. It’s lightweight, easy to deploy, and pairs beautifully with Grafana for dashboards and alerting. If you're running applications or infrastructure on Linux, Prometheus plus one of the many available exporters (Redis, NVIDIA GPU, Nginx, etc.) gives you deep visibility into service performance, quickly and reliably.
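As a rough illustration of what the connection involves, the Python sketch below builds the JSON body that could be sent to Grafana's data source HTTP API (`POST /api/datasources`) to register a Prometheus server. The helper name `prometheus_datasource_payload` is hypothetical, and the URL `http://localhost:9090` assumes Prometheus is running locally on its default port; adjust both to your setup. The sketch only constructs and prints the payload and does not contact Grafana.

```python
import json


def prometheus_datasource_payload(prometheus_url: str,
                                  name: str = "Prometheus") -> dict:
    """Build a request body for registering Prometheus as a data source.

    Hypothetical helper; the field values are illustrative assumptions.
    """
    return {
        "name": name,
        "type": "prometheus",   # Grafana's data source type for Prometheus
        "access": "proxy",      # Grafana's backend queries Prometheus
        "url": prometheus_url,
        "isDefault": True,
    }


# Assumed local Prometheus address; replace with your server's URL.
payload = prometheus_datasource_payload("http://localhost:9090")
print(json.dumps(payload, indent=2))
```

The same fields can also be entered by hand in Grafana's data source settings if you prefer the UI.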