Monthly Archive

Get Real-Time Third-Party Service Outage Alerts in Slack with StatusGator

Oct 31, 2025 By Colin Bartlett In StatusGator

When your team relies on multiple SaaS tools, even a small outage in a third-party service can disrupt workflows, slow down projects, and frustrate customers. You need to know about issues the moment they happen, and even before they’re officially reported. That’s where StatusGator’s Slack integration comes in. With StatusGator, you can receive real-time service status alerts in Slack.

Read Post

Get Third-Party Outage Alerts in Microsoft Teams with StatusGator

Oct 31, 2025 By Colin Bartlett In StatusGator

When your company depends on dozens of SaaS tools, such as AWS, Atlassian, Zoom, or Microsoft 365, any cloud outage can ripple through your entire operation. The faster your team learns about an external service disruption, the faster you can respond. With StatusGator’s Microsoft Teams integration, your team can receive real-time third-party outage alerts in Microsoft Teams. The service also includes Early Warning Signals that detect potential issues before providers officially announce them.

Read Post

Sponsored Post

8 Challenges of Microservices and Serverless Log Management

Oct 31, 2025 By David Bunting In ChaosSearch

As organizations increasingly adopt serverless architectures and embrace the benefits of microservices, managing logs in this dynamic environment presents unique challenges. In this blog, we're taking a closer look at the differences between serverless and traditional log management, as well as 8 challenges associated with log management for serverless microservices.

Read Post

Transforming Data into Business Impact with Splunk AppDynamics Business iQ

Oct 31, 2025 By Splunk In Splunk

Tired of observability tools that miss the important stuff? Splunk AppDynamics Business iQ gives you insight into how IT performance actually impacts your business.

View Video

MSP Dashboards That Deliver: HaloPSA + SquaredUp in Action

Oct 31, 2025 By SquaredUp In Squared Up

In this webinar, HaloPSA and SquaredUp come together to show MSPs how to unlock real-time visibility, streamline reporting, and deliver dashboards that drive client satisfaction and operational efficiency. In this session, you’ll hear directly from both the Halo and SquaredUp teams. Whether you're already using HaloPSA or exploring smarter ways to surface your service desk data, this webinar will give you the insights and tools to take your MSP reporting to the next level.

View Video

Tech Talk - Leveraging Automated Threat Analysis Across the Splunk Ecosystem

Oct 31, 2025 By Splunk In Splunk

Find out how Splunk Attack Analyzer can help you quickly and efficiently investigate potential malware and phishing incidents by automatically tracking each stage of complex attack chains and expediting your response efforts. Hear directly from Product Manager Aditya Raj as he demonstrates how to combine Splunk Attack Analyzer with Splunk Enterprise Security and Splunk SOAR for even greater threat detection and response power.

View Video

4 Common OpenTelemetry Challenges and How Site24x7 Helps Overcome Them

Oct 31, 2025 By Kirubanandan RA In Site24x7

OpenTelemetry (OTel) is transforming observability by standardizing and unifying how telemetry data such as metrics, logs, and traces are collected from distributed systems. However, while it unlocks new opportunities for monitoring and troubleshooting, adopting and operating OpenTelemetry comes with real-world challenges. Here’s what you need to know about these limitations, and how Site24x7 provides a holistic, simplified observability solution for your organization.

Read Post

Store and search logs at petabyte scale in your own infrastructure with Datadog CloudPrem

Oct 31, 2025 By François Massot In Datadog

As AI workloads and cloud-native applications expand, organizations are generating more log data than ever. Each service, container, and model inference produces continuous telemetry that must be stored, secured, and analyzed. As telemetry grows more complex, teams must balance full visibility with new retention and residency needs.

Read Post

Automating your synthetic test infrastructure with Datadog Synthetic Monitoring and Terraform

Oct 31, 2025 By Addie Beach In Datadog

Testing ecosystems contain massive amounts of data, including outlined test scenarios, prerequisite configurations, and the tests themselves. As a result, these ecosystems are prone to data sprawl. This makes it difficult to prevent configuration drift and quickly spin up new tests, especially at the frequency needed to support a fast-growing application. Teams can handle these challenges by treating their tests as part of their application infrastructure.

Read Post

Store and search logs at petabyte scale in your own infrastructure with Datadog BYOC Logs

Oct 31, 2025 By François Massot In Datadog

Read Post

The Hidden Cost of "Modernization": When Upgrades Become Extortion

Oct 31, 2025 By Dallon Robinette In Selector

Across the IT and observability landscape, enterprise leaders are facing a troubling pattern. A trusted vendor announces a “modernization initiative,” often following a major acquisition or a shift in ownership. Overnight, pricing structures change, license models disappear, and long-time customers are pressured into multi-year bundles under the banner of innovation. What’s being framed as progress often feels more like pressure.

Read Post

Network Path Monitoring: How to Monitor Network Paths

Oct 31, 2025 By Andrii Kernitskyi In Obkio

Your users are complaining about slow application performance. Your monitoring dashboard shows all devices are green, routers operational, switches functioning, and bandwidth utilization is normal. Yet something is clearly wrong. The problem isn't your equipment; it's the path between your users and their destinations. This is where network path monitoring comes in.

Read Post

Sumo Logic Dojo AI overview

Oct 30, 2025 By Sumo Logic, Inc. In Sumo Logic

Stop the firefighting and get instant answers with Sumo Logic Dojo AI. When a security incident hits, you risk losing money and time as you wait for investigations and troubleshooting. Discover how Dojo AI agents simplify investigations by surfacing potential threats, providing actionable insights, and guiding you to the root cause faster using natural language.

View Video

Introducing the Splunk Technology Add-on for Ollama: Illuminating Shadow AI Deployments

Oct 30, 2025 By Splunk In Splunk

Without strong visibility and governance, local LLMs risk replicating the fragmented, unsupervised sprawl once seen in shadow IT, complicating security postures and making it difficult for organizations to ensure proper oversight and compliance as these powerful AI tools become embedded in daily workflows. To address this challenge, the Splunk Threat Research Team has released the Splunk Technology Add-on for Ollama, which provides comprehensive monitoring and observability capabilities specifically designed for local LLM deployments.

View Video

Sumo Logic Academy - Training and Certification Overview

Oct 30, 2025 By Sumo Logic, Inc. In Sumo Logic

In 2025, Sumo Logic revamped its education and certification program, introducing industry-aligned assessments, digital badges, and many free training offerings, including industry-leading free instructor-led classrooms and interactive hands-on labs. This video walks through all Sumo Logic Academy program offerings.

View Video

From KubeCon EU to KubeCon NA: Bindplane's OpenTelemetry Contributions and Highlights (Mar-Oct 2025)

Oct 30, 2025 By Adnan Rahic In ObservIQ

Bindplane engineers have stayed deeply involved in the OpenTelemetry community this summer. With KubeCon+CloudNativeCon North America in Atlanta coming up, I wanted to dive into all the work that has been done and give the engineers a well-deserved shoutout. Here’s what we built, fixed, and contributed since KubeCon+CloudNativeCon Europe in London this March.

Read Post

What's New in InfluxDB 3.6: Ask AI, Simple Quick Start, and Smarter Automation

Oct 30, 2025 By Peter Barnett In InfluxData

InfluxDB 3.6 is now available for both Core and Enterprise. This release introduces the 1.4 update to InfluxDB 3 Explorer, featuring the beta launch of Ask AI, along with new capabilities for simple startup and expanded functionality in the Processing Engine. InfluxDB 3 Core is free and open source, optimized for recent data, and licensed under MIT and Apache 2. InfluxDB 3 Enterprise extends Core with long-term data retention, clustering, fine-grained security, and management capabilities.

Read Post

5 Ways IT Can Increase Employee Productivity with AI

Oct 30, 2025 By Nexthink In Nexthink

The way employees work is being transformed by AI, expanding digital environments, and rapid organizational change. These shifts create new opportunities for performance, but they also make productivity fragile – easily disrupted by complexity, friction, and underutilized tools.

Read Post

HaloPSA Plugin Spotlight

Oct 30, 2025 By SquaredUp In Squared Up

Dan Watts, Dev Rel Engineer, gives a brief demonstration of the SquaredUp plugin for HaloPSA.

View Video

SQL expressions in Grafana: Combine and manipulate data from multiple sources

Oct 30, 2025 By Sam Jewell In Grafana

One of Grafana’s greatest strengths is its ability to provide a consistent monitoring experience for all your data sources. But not everyone wants to go through the process of transforming that data and setting up a data warehouse to make that happen, especially for complex analyses.

Read Post

Energy-Efficient Computing: How To Cut Costs and Scale Sustainably in 2026

Oct 30, 2025 By Chrissy Kidd In Splunk

With AI at the center of technology and innovation today, energy-efficient computing is quietly becoming one of the most urgent challenges. In this article, we will discuss what makes energy-efficient computing relevant for your organization, especially when modern resource-intensive AI workloads play an important role in driving your business operations and services.

Read Post

Observability 2025 Decoded: What the DZone Report Means for SLO-Driven Ops

Oct 30, 2025 By Gerardo Dada In Catchpoint

DZone’s 2025 Intelligent Observability Trend Report captures a real inflection point: teams are shifting from “more data” to outcome-driven practices that improve resilience and accountability. The survey was gathered between August 28 and September 25, 2025, from a global pool of developers, architects, and IT professionals.

Read Post

Prometheus native histograms in Grafana Cloud: Get more precision from your Grafana visualizations

Oct 30, 2025 By Gyorgy Krajcsovits In Grafana

In May, we announced the public preview of Prometheus native histograms in Grafana Cloud, unlocking greater precision, ease of use, and compatibility for analyzing latency, duration, and other distributions. Since then, we’ve seen incredible adoption across industries—from financial services companies to e-commerce platforms. Last week, during PromCon EU 2025, the Prometheus developers announced that native histograms are now stable, after three years of intense testing and improvements.

Read Post

Trust at First Prompt: The New Design Challenge of AI Interfaces

Oct 30, 2025 By Ivana Bilic In Honeycomb

A data analyst opens a new AI tool and spends 30 seconds generating a complex visualization of quarterly revenue trends. Chart duly generated, they have to decide: could they present this chart directly to their CEO?

Read Post

This Halloween, the Scariest Monsters Are in Your Network

Oct 30, 2025 By Yann Guernion In Broadcom

In the spirit of Halloween, let's talk about monsters. Not the kind that hide under your bed, but the ones that live inside your network infrastructure. For those responsible for keeping the lights on, these creatures aren't fictional; they are a daily reality. Your environment can feel like an episode from the Real Ghostbusters, teeming with things that snarl, bite, and cause chaos at the worst possible moments. Forget silver bullets; trying to fight them one by one is a losing battle.

Read Post

Automating the First Hour of Troubleshooting with Netdata AI

Oct 30, 2025 By Netdata In netdata

Avoid the most expensive hour of incident response. Learn how Netdata AI uses hybrid AIOps to detect, reason, and summarize incidents.

View Video

How to Monitor Kubernetes With Grafana OSS or Grafana Cloud | Ask the Experts | Grafana Labs

Oct 30, 2025 By Grafana In Grafana

Wondering how to monitor Kubernetes with Grafana? In this “Ask the Experts” episode, Coleman walks through the easiest setup — from Grafana Cloud’s built-in Kubernetes Monitoring plugin to open source Grafana with Mixins and Helm. One command, and your cluster data comes alive.

View Video

APM for Banks and Fintech: Ensuring Stability in High-Transaction Apps

Oct 30, 2025 By Mohana Ayeswariya J In Atatus

The financial services industry is undergoing a major transformation. According to the McKinsey & Company 2025 Global Payments Report, digital payments continue to dominate, generating approximately $2.5 trillion in revenue from around $2.0 quadrillion in value flows across 3.6 trillion transactions worldwide. Another survey, conducted by JP Morgan, found that more than 30 percent of financial professionals reported that faster payments are having a positive impact on their organizations in 2025.

Read Post

Getting started with Site24x7 alert management

Oct 30, 2025 By ManageEngine Site24x7 In Site24x7

Struggling with alert overload or missed notifications? Learn how Site24x7 helps you manage alerts effectively, from setting thresholds and tracking key metrics to routing notifications, automating actions, and leveraging AI-powered Zia thresholds. Follow a real-world DevOps scenario to see how your team can respond faster, smarter, and more efficiently.

View Video

Tech Talk - Observability Unlocked: Kubernetes Monitoring with Splunk Observability Cloud

Oct 30, 2025 By Splunk In Splunk

In this Tech Talk, discover how they’re leveraging Splunk Infrastructure Monitoring (IM) to supercharge their Kubernetes operations, detect issues within minutes, and resolve them 90% faster — all while optimizing and scaling like pros.

View Video

Migrating from Librato to Hosted Graphite on Heroku - Full Tutorial

Oct 30, 2025 By MetricFire In MetricFire

Librato on Heroku is being sunsetted, so what's next? In this tutorial, we walk through: why Hosted Graphite by MetricFire is the best upgrade from Librato on Heroku; a step-by-step migration that moves your Heroku dyno, router, Postgres, Redis, and custom metrics into Hosted Graphite; and a side-by-side comparison of metrics ingestion, dashboards, alerts, and integrations.

View Video

Deploying Loki on Kubernetes via Helm (Loki Community Call - October 2025)

Oct 30, 2025 By Grafana In Grafana

This Loki Community Call is about deploying Loki on Kubernetes via Helm charts. We talk about why you might want to use Helm to deploy on Kubernetes, best practices for deployment, and which Helm chart you should use! We are Jay Clifford and Nicole van der Hoeven, Developer Advocates at Grafana Labs, and we have invited Grafana Champion and Loki Helm Maintainer Jan-Otto Kröpke, Principal Cloud Architect at QualityOperations GmbH, to talk about the state of the Loki Helm Chart.

View Video

How to use the Grafana MCP Server with Cursor #grafana #coding #programming #cursor

Oct 30, 2025 By Grafana In Grafana

Watch the complete video about using the Grafana MCP Server.

View Video

From Maintenance to Monitoring: Digital Tools for Better AC Management

Oct 30, 2025 By OpsMatters In OpsMatters

The traditional approach to AC upkeep relied heavily on reactive maintenance, where technicians addressed issues only after they occurred. The digital age has ushered in an array of tools that make routine maintenance easy and employ advanced technologies for proactive monitoring. By integrating these tools into AC management practices, homeowners and businesses can improve system efficiency, reduce energy costs, and extend the lifespan of their equipment.

Read Post

FinOps for Hybrid IT: Extending Visibility Beyond the Cloud

Oct 29, 2025 By Marie Ashway In Galileo

Controlling IT spend used to mean managing cloud invoices. Today, it’s far more complex. Modern enterprises run workloads across multiple platforms — cloud, virtualized, and on-premises — each with its own cost structures and dependencies. That’s why FinOps for hybrid IT has become essential. Extending FinOps principles beyond cloud services enables organizations to see how every part of the infrastructure contributes to cost, efficiency, and business value.

Read Post

Azure status integration is here!

Oct 29, 2025 By Valeria Kurolapova In StatusGator

We’re thrilled to announce the launch of our Azure status integration, bringing Microsoft Azure’s real-time service health and incident data directly into your StatusGator dashboard and status page. With this new integration, StatusGator automatically imports Azure outages and service status updates from your Azure subscription — giving you a complete, centralized view of your cloud infrastructure alongside every other service you monitor.

Read Post

A third of IT pros haunted by "user error"

Oct 29, 2025 By SolarWinds In SolarWinds

SolarWinds survey reveals the most common 'IT crime scenes' this Halloween.

Read Post

Streamline IT Event Management

Oct 29, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Managing Microsoft SCOM events can often feel overwhelming, with countless alerts and notifications making it difficult to identify what truly matters. Working with our partner Kelverion, we are excited to introduce a Routing & Remediation solution for SCOrch that helps your teams focus on the issues that count, while automating the rest.

Read Post

Playwright Check Suites Are Now GA - But What Does That Mean For You?

Oct 29, 2025 By Stefan Judis In Checkly

There are only a few companies that successfully invest in actively monitoring real user flows in production. I’ve been puzzled by the state of the art for many years, because I’m an anxious developer who always needs to know that production is “all right”. How can it be okay for all of us to wait for error logs, thrown exceptions, or customer complaints to learn about production issues?

Read Post

5 Best Practices for Incorporating AI Into Your Team

Oct 29, 2025 By Rox Williams In Honeycomb

Honeycomb’s Jessica Kerr and Fred Hebert recently hosted a webinar with Courtney Nash of The VOID where they dug into one of the biggest questions in tech right now: How do we build systems (and teams) that actually learn with AI, not just use it? The conversation was surprisingly optimistic about what happens when we stop treating AI as a productivity tool and start seeing it as a teammate. You can watch the full webinar here, or read on below for a quick recap.

Read Post

SolarWinds Day Keynote: The Future of IT Has Arrived

Oct 29, 2025 By solarwindsinc In SolarWinds

Get a front-row seat to the future of IT operations — powered by AI, automation, and full-stack observability. During the SolarWinds Day Keynote, on October 8, we unveiled our latest innovations in AI and automation for predictive insights and end-to-end visibility. Discover how these advancements empower IT teams to optimize performance, streamline ITSM, and simplify database management—all without disrupting existing systems.

View Video

Find and Fix Fastify Slowdowns with AppSignal for Node.js

Oct 29, 2025 By Damilola Olatunji In AppSignal

In part one of this series, we set up basic performance monitoring for our Fastify application using AppSignal and explored key performance indicators. Now that we have our monitoring foundation in place, it's time to leverage these insights to actively improve application performance. You'll learn how to detect performance regressions, find optimization opportunities, and implement custom instrumentation with OpenTelemetry.

Read Post

Sidecar or Agent for OpenTelemetry: How to Decide

Oct 29, 2025 By Anjali Udasi In Last9

Getting telemetry out of a distributed system isn’t the hard part. Getting it out cleanly, without noise, drop-offs, or odd performance side-effects — that’s where things get interesting. Before you worry about processors or storage costs, you need a clear plan for where the OTel Collector should run. Most teams narrow this down to two options: a sidecar that sits next to each service, or a node-level agent that handles data for everything running on the node. Both patterns are solid.

Read Post

APM in 2026: The New Standard for Business Reliability and Growth

Oct 29, 2025 By Pavithra Parthiban In Atatus

Global IT spending is expected to reach a record $6.08 trillion by 2026, with software investments growing by 15.2%. This shows how critical application performance has become for businesses today. For almost 80% of companies, even one hour of downtime can cost more than $300,000. In a world where every digital experience affects your revenue and brand reputation, keeping your applications performing well is no longer optional.

Read Post

Datadog named Leader in 2025 Gartner Magic Quadrant for Digital Experience Monitoring

Oct 29, 2025 By Yanbing Li In Datadog

We are thrilled to announce that, for the second consecutive year, Datadog has been named a Leader in the 2025 Gartner Magic Quadrant for Digital Experience Monitoring. We believe that this recognition reflects our continued focus on helping customers observe, secure, and act on everything that matters across their technology stack.

Read Post

Rechain improves performance visibility and gets 4x faster issue resolution with Scout Monitoring

Oct 29, 2025 By Aspen Clevenger In Scout

Rechain is a SaaS Product Lifecycle Management (PLM) platform for fashion brands, built with Ruby on Rails, that helps modern apparel teams manage design, production, and supply chain workflows from one intuitive, cloud-based solution.

Read Post

From Error to Insight: Our Brand Refresh

Oct 29, 2025 By Rollbar In Rollbar

Software teams do their best work when they can move quickly without losing control. That reality has shaped how our product has evolved, and it needed to shape how our brand shows up too. Our refresh is not a new coat of paint. It is an honest reflection of what Rollbar is today and where we are going: a code-first observability platform that helps builders understand what is happening in their code and why, so every release is better than the last.

Read Post

The Future of Network Configuration Management is Unified, Not Uncertain

Oct 29, 2025 By Mehul Patel In Broadcom

Since VMware joined Broadcom, there’s been some speculation in the industry about the future of Voyence, VMware's trusted network configuration management (NCM) solution. We want to address that speculation directly and share some exciting news.

Read Post

Langchain Agent Monitoring with SigNoz

Oct 29, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Langchain Agent Monitoring with SigNoz.

View Video

Product Update - Turn Off Alerts, Use Microsoft Teams, and Custom Domains

Oct 29, 2025 By Hrishikesh Barua In IncidentHub

Over the last few months, IncidentHub has added several new features to make it easier to fine-tune your alerts. IncidentHub now also integrates with Microsoft Teams and supports custom domains for your public status pages. Let's take a comprehensive look at what's new.

Read Post

Whose Fault Is It When the Cloud Fails? Does It Matter?

Oct 29, 2025 By Jeremy Rossbach In Broadcom

On Monday, October 20th, a significant portion of the digital services we use every day became inaccessible. For hours, banking, communication, and entertainment applications were unavailable. The root cause was later identified as a major outage within Amazon Web Services (AWS), the infrastructure that powers a vast number of online services. The initial response for any business affected by such an event is a frantic effort to diagnose the problem. Is it our application? Is our network down?

Read Post

Your Root Cause Analysis is Flawed by Design

Oct 29, 2025 By Yann Guernion In Broadcom

There’s a nagging feeling of déjà vu that haunts every network operations leader. You invest significant time and resources to resolve a major performance issue. Your best engineers isolate a culprit—a misbehaving load balancer, perhaps—and after a frantic effort, service is restored. You close the ticket, confident the problem is solved. Then, two weeks later, it’s back.

Read Post

Top 4 Inefficiencies For Dev Teams Resolving Issues

Oct 29, 2025 By Jeff Zapotoczny In Lightrun

Every hour developers spend troubleshooting is an hour they’re not building features, innovating, or delivering value to customers. Yet in most organizations, issue management and debugging remain among the biggest drains on productivity and release velocity. That frustration is exactly what led our founders, themselves developers, to create Lightrun.

Read Post

How Generative AI is shaping the future of enterprise applications

Oct 29, 2025 By Elastic In Elastic

The next golden age of artificial intelligence has arrived, but the path forward is far from certain. Technology leaders are presented with a tremendous opportunity to revolutionize their business — that is, if they can find a way to tap into the full potential of their organization's data. In Episode 4 of Elastic's new limited series, Generation AI, Elastic's Sr. Director, Enterprise Applications, Jay Shah, shares how he believes generative AI will shape the future of enterprise applications.

View Video

Sliding Through Log-Time Space

Oct 29, 2025 By The Graylog Team In Graylog

This post kicks off a new series written by the Graylog Development Team. In these updates, we’ll highlight the features and fixes that make daily work in Graylog smoother. We want to show the work we care so much about and present the challenges we faced and overcame. Today, we’re starting with one of those minor but functional enhancements: Graylog time-range stepping.

Read Post

Digitate Unveils Industry's Most Comprehensive AI Agents, Driving Autonomous IT Operations and Business Resilience Across Enterprises

Oct 28, 2025 By Digitate In Digitate

New ignio™ release advances the vision of Ticketless IT and delivers measurable KPIs across the entire value chain for business functions and targeted verticals.

Read Post

Your software deserves to be flawless

Oct 28, 2025 By Raygun In Raygun

Start your free 14-day Raygun trial today. Happy coding from the team at Raygun.

View Video

Raygun

Monitoring

Read more about Your software deserves to be flawless

Redefining Frontend Observability with Datadog RUM

Oct 28, 2025 By Datadog In Datadog

Discover how Datadog is redefining frontend observability with Real User Monitoring (RUM). In this demo, see how RUM helps teams detect, investigate, and resolve frontend issues that directly impact user experience and business outcomes. With RUM Without Limits, you get full visibility into every user session, giving you an accurate and comprehensive view of your users’ experiences. Monitor performance, track errors, and understand how your application behaves in real time.

View Video

Datadog

Read more about Redefining Frontend Observability with Datadog RUM

Scaling Java Web Applications: Choosing Between Microsoft Windows and Linux OS

Oct 28, 2025 By Pandian Ramaiah In eG Innovations

Java is one of the most widely used platforms for supporting web applications. According to RedMonk and TIOBE rankings, Java has consistently remained in the top 4 most popular programming languages worldwide, with millions of developers actively using it. Industry-standard application servers such as WebLogic, WebSphere, Tomcat, and JBoss all run on Java and power a large share of enterprise workloads and Java web applications.

Read Post

eG Innovations

Read more about Scaling Java Web Applications: Choosing Between Microsoft Windows and Linux OS

Why Brand Monitoring is Essential in the Age of Programmatic AI

Oct 28, 2025 By ChangeTower In ChangeTower

Recently, crowdfunding giant GoFundMe made headlines after (among other things) automatically generating some 1.4 million donation pages for U.S. 501(c)(3) nonprofit organisations. These pages were created without prior consent, using publicly available IRS data and partner feeds. According to reports, many nonprofits discovered these pages only when alerted by a donor or curious patron; they had no advance knowledge and had to manually un-publish or “claim” the page.

Read Post

ChangeTower

Read more about Why Brand Monitoring is Essential in the Age of Programmatic AI

Monitor the Performance of Your Ecto for Elixir App with AppSignal

Oct 28, 2025 By Aestimo Kirina In AppSignal

In part one of this series, we learned how to implement batch updates and advanced inserts in Ecto to dramatically improve database performance. But implementing these optimizations is only the first step. Ensuring they continue to work effectively in production requires professional monitoring and observability. This guide will show you how to use AppSignal for Elixir to monitor your Ecto application's performance when dealing with batch data operations.

Read Post

AppSignal

Read more about Monitor the Performance of Your Ecto for Elixir App with AppSignal

Setting up your Private Location using Docker

Oct 28, 2025 By Uptime Website Monitoring In uptime

Learn how to set up your Private Location using Docker. This step-by-step guide will walk you through the process! We offer a 30-day money-back guarantee for all new users.

View Video

uptime

Monitoring

Read more about Setting up your Private Location using Docker

Reality Bytes: The DEX Equation (Productivity Savings + Nexthink's $3bn Milestone)

Oct 28, 2025 By Nexthink In Nexthink

This week, Tim and Tom are joined by RB regular and DEX Hub editor Sean Malvey to unpack Nexthink’s first Workplace Productivity Report, “Cracking the DEX Equation.” Drawing on data from nine million endpoints, the report quantifies the real productivity impact of digital employee experience — revealing where enterprises lose nearly half a million hours a year to poor DEX, and how small score gains deliver measurable ROI.

View Video

Nexthink

Read more about Reality Bytes: The DEX Equation (Productivity Savings + Nexthink's $3bn Milestone)

Redefining NetOps: Agent Systems and Practical AI from the ONUG AI in Networking Summit

Oct 28, 2025 By Phil Gervasi In Kentik

AI in networking isn’t theoretical anymore. It’s here, reshaping how we operate. At the ONUG AI Networking Summit, we saw firsthand how agent systems are moving from hype to hands-on reality, from secure automation to data-driven root cause analysis. The future of NetOps isn’t dashboards and tickets — it’s intelligent agents, observability, and measurable business outcomes.

Read Post

Kentik

Read more about Redefining NetOps: Agent Systems and Practical AI from the ONUG AI in Networking Summit

Transform and Migrate Logs with Datadog Custom Processor

Oct 28, 2025 By Datadog In Datadog

See how Datadog’s new Custom Processor in Observability Pipelines helps you transform and migrate logs from platforms like Splunk and Sumo Logic with precision and control. This demo walks through real examples of using VRL (Vector Remap Language) to enrich log data, rewrite timestamps, apply quotas, and securely process archives.

View Video

Datadog

Read more about Transform and Migrate Logs with Datadog Custom Processor

Faster, more collaborative data exploration: Introducing saved queries in Grafana Cloud

Oct 28, 2025 By Daniel Moosdeen Mui In Grafana

Writing queries is one of Grafana’s most powerful features, but it can also be one of the most time-consuming. Whether you’re exploring logs or building new dashboards, you often find yourself and your team rewriting the same queries over and over again. This is why we rolled out saved queries, a feature that makes it easy for everyone on your team to save, share, and reuse queries, eliminating the need to start from scratch each time.

Read Post

Grafana

Read more about Faster, more collaborative data exploration: Introducing saved queries in Grafana Cloud

LogicMonitor Is FedRAMP Moderate Authorized: How We Support Federal IT

Oct 28, 2025 By Justin Fessler In LogicMonitor

Federal agencies need observability that doesn’t create new compliance problems. Today, that’s possible. LogicMonitor Envision is now FedRAMP Moderate Authorized with a formal Authorization to Operate (ATO). That means unified, AI-powered visibility across your hybrid infrastructure—on-prem, AWS GovCloud, Azure Government, and edge—without starting your security review from scratch.

Read Post

LogicMonitor

Read more about LogicMonitor Is FedRAMP Moderate Authorized: How We Support Federal IT

OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces

Oct 28, 2025 By Anjali Udasi In Last9

You're sampling 1% of traces in production. A payment request fails at 3 AM. Logs show an error in order-service, but the full picture isn't there because different services made different sampling decisions. order-service kept the trace; payment-service didn't. So you end up checking logs and timestamps across a few services to piece things together. This happens because the usual probability sampling approach makes a separate choice at each service boundary.

Read Post

Last9

Read more about OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces
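
To make the sampling problem in the Last9 teaser above concrete, here is a minimal Python sketch, not the OpenTelemetry implementation. It contrasts a per-service coin flip with a decision derived from the trace ID, so that every service reaches the same verdict for the same trace. The function names, the 56-bit randomness width, and the threshold arithmetic are simplifying assumptions made for illustration; the OpenTelemetry specification defines the actual encoding.

```python
import random

# Assumption for this sketch: the low 56 bits of the trace ID serve as
# randomness shared by every service that sees the trace.
RANDOM_BITS = 56


def independent_decision(probability: float, rng: random.Random) -> bool:
    """Per-service coin flip: two services can disagree about the same trace."""
    return rng.random() < probability


def consistent_decision(trace_id_hex: str, probability: float) -> bool:
    """Decision derived only from the trace ID and the sampling probability.

    Any service using the same probability reaches the same verdict, and a
    service with a higher probability keeps a superset of the traces kept
    by a service with a lower one.
    """
    randomness = int(trace_id_hex, 16) & ((1 << RANDOM_BITS) - 1)
    # Reject everything below the threshold; keep the top `probability` share.
    threshold = int((1.0 - probability) * (1 << RANDOM_BITS))
    return randomness >= threshold


if __name__ == "__main__":
    trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"  # hypothetical trace ID
    # order-service and payment-service both evaluate the same trace at 1%.
    print(consistent_decision(trace_id, 0.01), consistent_decision(trace_id, 0.01))
```

Because the verdict depends only on the trace ID and the probability, a trace is either kept by every service sampling at that rate or dropped by all of them, which is what avoids the fragmented traces described in the teaser.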

You Came Looking for Restorepoint. You're in the Right Place.

Oct 28, 2025 By ScienceLogic In ScienceLogic

Restorepoint earned recognition for solving one of the hardest and most important challenges in network operations: keeping every device configuration backed up, verified, and ready to recover. It gave teams confidence that their infrastructure was consistent, compliant, and protected from the smallest misstep. ScienceLogic acquired Restorepoint in 2021 to build on that strength and extend it.

Read Post

ScienceLogic

Read more about You Came Looking for Restorepoint. You're in the Right Place.

Artificial Intelligence as a Service (AIaaS): What is Cloud AI & How Does it Work?

Oct 28, 2025 By Muhammad Raza In Splunk

Today, organizations looking to build AI products and services using large language models (LLMs), agentic AI, and generative AI often start by investing in artificial intelligence as a service (AIaaS), also known as cloud AI. AIaaS provides a scalable, flexible, and cost-effective way for businesses of all sizes to access advanced AI technologies without the need for extensive in-house expertise or infrastructure.

Read Post

Splunk

Read more about Artificial Intelligence as a Service (AIaaS): What is Cloud AI & How Does it Work?

Build and buy: Why a durable enterprise architecture delivers business impact at scale

Oct 28, 2025 By Chris Blaisure In Elastic

As a technology leader at Elastic, my team is responsible for building AI experiences on a centralized architecture to enable our sales and customer support teams to scale their impact and take advantage of AI faster.

Read Post

Elastic

Read more about Build and buy: Why a durable enterprise architecture delivers business impact at scale

k8s-monitoring-helm Chart Office Hours (October 2025)

Oct 28, 2025 By Grafana In Grafana

In the October edition of the Kubernetes Monitoring Helm chart office hours, we discuss the upcoming version 3.6 release as well as the plan for upcoming features.

View Video

Grafana

Read more about k8s-monitoring-helm Chart Office Hours (October 2025)

Webinar Snippet: How to Use Obkio's Visual Traceroutes

Oct 28, 2025 By Obkio In Obkio

Obkio recently launched the brand new Visual Traceroute Tool. In this snippet from the feature release webinar, Solution Architect, Sam, demonstrates how to use Obkio's integrated Visual Traceroutes in Monitoring Sessions. Watch as he walks through the timeline feature, hop-by-hop visualization, and path change detection, all designed to help you identify exactly where and when network issues occur.

View Video

Obkio

Read more about Webinar Snippet: How to Use Obkio's Visual Traceroutes

Webinar Snippet: How to Add Obkio's Visual Traceroutes in A Dashboard

Oct 28, 2025 By Obkio In Obkio

Obkio recently launched the brand-new Visual Traceroute Tool. In this snippet from the feature release webinar, Solution Architect, Sam, shows you how to integrate Visual Traceroutes directly into your custom dashboards for even more powerful network visibility. See how easy it is to add traceroute widgets to your monitoring dashboards, giving your team instant access to hop-by-hop analysis alongside your other key graphs and network metrics.

View Video

Obkio

Read more about Webinar Snippet: How to Add Obkio's Visual Traceroutes in A Dashboard

Webinar Snippet: How to Use Obkio's Network Map (Visual Traceroutes)

Oct 28, 2025 By Obkio In Obkio

Obkio recently launched the brand new Visual Traceroute Tool. In this snippet from our feature release webinar, Solution Architect, Sam, dives into Network Maps and shows you how to read and interpret the visual data to quickly identify network bottlenecks and performance issues. Learn how to leverage the map view to understand your network topology, spot problematic hops, and make faster troubleshooting decisions with visual context.

View Video

Obkio

Read more about Webinar Snippet: How to Use Obkio's Network Map (Visual Traceroutes)

Why Email Servers Get Blacklisted?

Oct 28, 2025 By Simon Rodgers In WebSitePulse

An email server gets blacklisted when it's identified as a potential source of spam, malware, or suspicious activity. Blacklists use automated systems and user reports to flag servers that violate mailing or security standards. Once listed, legitimate messages may bounce, land in spam folders, or never reach recipients at all. Understanding why this happens is essential to prevent future listings and protect the sender's reputation.

Read Post

WebSitePulse

Read more about Why Email Servers Get Blacklisted?

Datadog vs Grafana (2025) - Costs, Use Cases, and Key Differences

Oct 28, 2025 By Ankit Anand In SigNoz

When engineering teams evaluate observability tools, the "Datadog vs. Grafana" debate is one of the most common. The choice is difficult because they represent two fundamentally different philosophies. Datadog is a comprehensive, all-in-one, managed SaaS platform. It offers a "buy" solution where you get a unified experience for metrics, logs, and traces out of the box. Grafana is an open-source, highly flexible visualization layer.

Read Post

SigNoz

Read more about Datadog vs Grafana (2025) - Costs, Use Cases, and Key Differences

WhatsUp Gold System Health and Scaling

Oct 28, 2025 By Progress WhatsUp Gold In WhatsUp Gold

This video describes the importance of your WhatsUp Gold system health and the signs that it may be time to scale your system. You also learn about the actions you can take to scale and optimize your WhatsUp Gold deployment.

View Video

WhatsUp Gold

Read more about WhatsUp Gold System Health and Scaling

Sponsored Post

Avantra + Ansible: Better Together for Enterprise SAP Automation

Oct 27, 2025 By Avantra Team In Avantra

Enterprises trust Ansible for fast, reliable infrastructure automation, including Terraform for automated cloud provisioning. Many organizations using Ansible leverage Ansible SAP playbooks for SAP infrastructure automation. Avantra extends the scope of SAP operations using Ansible, adding observability, ITSM and ALM solution integration, and orchestration across the SAP estate. Avantra and Ansible together provide a closed-loop solution where monitoring, automation and proof of outcome live in one place across on-premises, hyperscaler and private cloud ERP implementations.

Read Post

Avantra

Read more about Avantra + Ansible: Better Together for Enterprise SAP Automation

Observability Masterclass | AI-Driven Observability for Enhanced System Performance

Oct 27, 2025 By solarwindsinc In SolarWinds

Tuesday, October 28, 10:00 - 11:00am CDT. In today’s relentless digital world, achieving peak system performance isn’t just a goal; it’s mission-critical. Join SolarWinds and GigaOm for an electrifying webcast featuring renowned Observability authority Jon Collins, VP of Engagement and Field CTO at GigaOm.

View Video

SolarWinds

Read more about Observability Masterclass | AI-Driven Observability for Enhanced System Performance

Shadow AI on Trial: The Phantom Threat to Compliance

Oct 27, 2025 By Teneo In Teneo

Every law firm I meet can explain its information security policy in minutes. Far fewer can tell me which AI tools their staff actually used last week, and what data those tools touched. That gap is where Shadow AI sits: the unsanctioned, unmonitored use of generative AI that slips in unnoticed. It promises speed, but it quietly creates exposure: confidentiality breaches, weak auditability, and a risk to governance when the regulator (or a client’s GC) asks hard questions.

Read Post

Teneo

Read more about Shadow AI on Trial: The Phantom Threat to Compliance

A Visionary in the 2025 Gartner Magic Quadrant for DEM

Oct 27, 2025 By Uptrends In Uptrends

ITRS has been named a Visionary in the 2025 Gartner Magic Quadrant for Digital Experience Monitoring — for the second year running. We’ve spent another year working alongside you to solve real problems, scale what works, and prepare for what’s next. If you’re delivering resilient, high-performing digital experiences while meeting evolving compliance demands, our direction is shaped by your needs.

Read Post

Uptrends

Read more about A Visionary in the 2025 Gartner Magic Quadrant for DEM

AI Code Review: 30K Bugs Lighter, 50% faster

Oct 27, 2025 By Lindsay Piper In Sentry

Last month, we launched AI Code Review, our developer tool that automatically catches bugs, finds performance issues, and helps you ship PRs faster. 30 days later, here's what’s new.

Read Post

Sentry

Read more about AI Code Review: 30K Bugs Lighter, 50% faster

Query Distinct Tag Values in Under 30ms with the InfluxDB 3 Distinct Value Cache

Oct 27, 2025 By Scott Anderson In InfluxData

The Distinct Value Cache (DVC) available with InfluxDB 3 Core and InfluxDB 3 Enterprise lets you cache distinct values of specific columns and query those values in under 30ms. The DVC is an in-memory cache that stores distinct values of one or more columns in a table. It is typically used to cache distinct tag values, but you can also cache distinct field values.

Read Post

InfluxData

Read more about Query Distinct Tag Values in Under 30ms with the InfluxDB 3 Distinct Value Cache

Integration & Data Ingestion: Strengthening AIOps Observability

Oct 27, 2025 By david.arrowsmith In Interlink

Large enterprises face the challenge of managing high-volume, very diverse data streams that span both legacy and modern, digital systems and applications. To gain timely, accurate insight across this kind of complexity, IT teams need observability platforms that can do more than just monitor - they must also unify, contextualize and enrich data so teams can act effectively to protect the availability of the services their customers rely on.

Read Post

Interlink

Read more about Integration & Data Ingestion: Strengthening AIOps Observability

How to Setup Edge Processor in Splunk Enterprise

Oct 27, 2025 By Splunk In Splunk

Learn how to set up Splunk’s Edge Processor on-prem (Splunk Enterprise) in this two-step demo video. This video walks through how to set up the control plane with a data management application, followed by defining and installing an edge processor instance.

View Video

Splunk

Read more about How to Setup Edge Processor in Splunk Enterprise

Streams: Elastic's New AI That Turns Log Chaos into Clarity

Oct 27, 2025 By Elastic In Elastic

Elastic just made every SRE’s life easier. With the new Elastic Streams, AI automatically organizes, structures, and analyzes billions of logs, helping you find issues, detect anomalies, and fix problems in minutes, not hours. See how Elastic’s deep generative AI core turns chaos into clarity for Site Reliability Engineers and developers worldwide.

View Video

Elastic

Read more about Streams: Elastic's New AI That Turns Log Chaos into Clarity

External Request Monitoring: The Silent Pillar Every APM Needs

Oct 27, 2025 By Mohana Ayeswariya J In Atatus

The global market for application performance monitoring (APM) is growing fast. Market research shows the industry is expected to rise from about USD 7.52 billion in 2023 to nearly USD 19.62 billion by 2030, with a compound annual growth rate (CAGR) of around 15.1%. This rapid expansion reflects how digital transformation, hybrid cloud adoption, and third-party integrations are reshaping performance monitoring needs. It’s no longer enough to track just internal code paths and database queries.

Read Post

Atatus

Read more about External Request Monitoring: The Silent Pillar Every APM Needs

What is the Role of IT Ops? Key Responsibilities Explained

Oct 26, 2025 By Nuno Tomas In isDown

The IT ops role serves as the backbone of modern technology infrastructure, ensuring systems run smoothly, securely, and efficiently. IT operations teams manage everything from server maintenance to incident response, making them essential for business continuity. Understanding what IT operations professionals do helps organizations build stronger technical teams and improve their infrastructure management.

Read Post

isDown

Read more about What is the Role of IT Ops? Key Responsibilities Explained

Don't count integrations, count dashboards and alerts

Oct 26, 2025 By Ofri Grushka In Coralogix

Vendors often compete by saying how many extensions or quick start packs they have. The implicit promise is: more integrations equals better observability. But that misses the point. What really matters is the quality and coverage of dashboards and alerts that you actually use to maintain system health, prevent outages and improve user experience. At Coralogix we believe that what you do with integrations is far more important than how many you have.

Read Post

Coralogix

Read more about Don't count integrations, count dashboards and alerts

AI monitoring is coming to Oh Dear

Oct 26, 2025 By Freek Van der Herten In Oh Dear

Would you know if your checkout form stopped working overnight? Or if a recent deploy broke your login flow? Traditional monitoring can't catch these issues - it only tells you if your site is up, not if it actually works. AI monitoring lets you describe what should work in plain English, and we'll test it like a real user would - clicking buttons, filling forms, checking content. No scripts to maintain, no complex setup.

Read Post

Oh Dear

Read more about AI monitoring is coming to Oh Dear

Now in the API: History, Custom Monitors, and Subscribers

Oct 25, 2025 By Valeria Kurolapova In StatusGator

Last month, we introduced the StatusGator API v3, a complete overhaul of our API designed to give developers more flexibility, an improved data model, and deeper integration options for monitoring the status of hundreds of services. Today, we’re excited to share three major additions to v3: the Board History API, Custom Monitors API, and Status Page Subscribers API.

Read Post

StatusGator

Read more about Now in the API: History, Custom Monitors, and Subscribers

WebGL Application Monitoring: 3D Worlds, Games & Spaces

Oct 25, 2025 By Dotcom-Monitor In Dotcom-Monitor

WebGL has turned the browser into a real-time 3D engine. The same technology behind console-quality games now powers design platforms, architectural walkthroughs, and virtual conference spaces—all without a single plugin. These 3D experiences blur the line between web and desktop, blending high-fidelity rendering with persistent interactivity and complex real-time data streams. But with that complexity comes a new operational challenge: how do you monitor it?

Read Post

Dotcom-Monitor

Read more about WebGL Application Monitoring: 3D Worlds, Games & Spaces

Top tips for smoother IT incident management

Oct 24, 2025 By Nandini Malhotra In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world and share ways to stay ahead. This week, we’re talking about something every IT team knows too well—incidents. Whether it’s a sudden server crash, a network outage, or a system slowdown right before an important client call, incidents always seem to strike at the worst possible time. No matter how strong your IT setup is, issues are bound to happen.

Read Post

ManageEngine

Read more about Top tips for smoother IT incident management

What's New at Catchpoint Ep 3: AI Advisor

Oct 24, 2025 By Catchpoint In Catchpoint

Catchpoint’s new AI Advisor discovers unknown unknowns to improve your performance monitoring by seeing what you might’ve missed.

View Video

Catchpoint

Monitoring

Read more about What's New at Catchpoint Ep 3: AI Advisor

DNS Outages Expose Hidden Risks. Edwin AI Finds Them Faster.

Oct 24, 2025 By Margo Poda In LogicMonitor

The recent AWS outage exposed how fragile the internet remains. Amazon traced the hours-long disruption to a DNS error—a small failure with massive reach. For most organizations, DNS operates quietly in the background. When it fails, every digital service connected to it stops. One of LogicMonitor’s valued customers, IG Group, faced a similar event less than ten hours after enabling Edwin AI.

Read Post

LogicMonitor

Read more about DNS Outages Expose Hidden Risks. Edwin AI Finds Them Faster.

How to Use the Power BI Desktop InfluxDB 3 ODBC Connector

Oct 24, 2025 By Anais Dotis-Georgiou In InfluxData

The challenge of storing, processing, and alerting on your time series data is only part of the battle when it comes to deriving value from time-stamped data. While InfluxDB 3 addresses those hurdles with the database and Python processing engine, data analytics teams still need to be able to visualize their data and build dashboards to complete the time series story.

Read Post

InfluxData

Read more about How to Use the Power BI Desktop InfluxDB 3 ODBC Connector

[Workshop] Fixing Your Frontend: Performance Monitoring Best Practices

Oct 24, 2025 By Sentry In Sentry

The holiday season is here. Is your frontend ready for the traffic spike, or are you preparing for a debugging nightmare? In this live, hands-on workshop, we'll dive into the best practices for modern error and performance monitoring in Sentry. We’ll cover instrumenting Sentry and alert rules to surface and fix critical errors fast, and optimizing site performance using Web Vitals like TTFB and LCP.

View Video

Sentry

Monitoring

Read more about [Workshop] Fixing Your Frontend: Performance Monitoring Best Practices

Why Your APM Needs Observability - Metrics, Logs, and Traces Explained

Oct 24, 2025 By Pavithra Parthiban In Atatus

Modern software applications are increasingly complex. Microservices, cloud infrastructure, and distributed architectures make it challenging for developers, DevOps engineers, and SREs to maintain high performance and a seamless user experience. Traditional Application Performance Monitoring (APM) provides critical insights into how applications perform, but alone, it often leaves blind spots when it comes to diagnosing issues or understanding the full system behavior.

Read Post

Atatus

Read more about Why Your APM Needs Observability - Metrics, Logs, and Traces Explained

Auvik Named a Leader Across G2's Fall 2025 Reports for Network Management

Oct 24, 2025 By Bob Wientzen In Auvik

In G2’s Fall 2025 reports, Auvik earned top recognition as a leader in network management tools across small-business, mid-market, and enterprise categories. IT professionals rated Auvik highly for implementation, usability, results, relationship, and overall Grid® performance, reflecting one thing above all: real-world trust from the IT professionals who use Auvik every day.

Read Post

Auvik

Read more about Auvik Named a Leader Across G2's Fall 2025 Reports for Network Management

Meet Olly - The Coralogix AI Observability Agent (Demo)

Oct 24, 2025 By Coralogix In Coralogix

Olly is Coralogix’s AI-native observability agent that makes observability data fast, accessible, and actionable—for everyone. Traditionally, teams have spent valuable time piecing together dashboards and writing queries to troubleshoot issues. Olly changes that by letting you ask real questions in natural language and delivering instant, intelligent answers from across your logs, metrics, and traces.

View Video

Coralogix

Read more about Meet Olly - The Coralogix AI Observability Agent (Demo)

OpenTelemetry Spans Explained: Deconstructing Distributed Tracing

Oct 24, 2025 By Anjali Udasi In Last9

In a microservices architecture, a single user request can pass through multiple services before completing. When performance drops or an error occurs, tracing that journey is the only way to locate the source. Distributed tracing provides that visibility. At its core are OpenTelemetry Spans — units of work that capture what each service does during a request.

Read Post

Last9

Read more about OpenTelemetry Spans Explained: Deconstructing Distributed Tracing
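
As a rough illustration of the span concept in the Last9 teaser above, the following Python sketch models spans as plain records linked by parent IDs and walks from a failing span back to the root of the request. It does not use the OpenTelemetry SDK; the `Span` class, its fields, and `path_to_first_error` are hypothetical names chosen for this example, and real spans carry more data (timestamps, attributes, status, events).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, stripped-down span: one unit of work inside a trace."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None marks the root span
    name: str
    service: str
    is_error: bool = False


def path_to_first_error(spans: list[Span]) -> list[str]:
    """Return the root-to-failure path for the first error span in list order.

    Returns an empty list when no span is marked as an error.
    """
    by_id = {span.span_id: span for span in spans}
    current: Optional[Span] = next((s for s in spans if s.is_error), None)
    path: list[str] = []
    while current is not None:
        path.append(f"{current.service}:{current.name}")
        parent_id = current.parent_span_id
        current = by_id.get(parent_id) if parent_id else None
    return list(reversed(path))


if __name__ == "__main__":
    trace = [
        Span("t1", "a", None, "GET /checkout", "gateway"),
        Span("t1", "b", "a", "create_order", "order-service"),
        Span("t1", "c", "b", "charge", "payment-service", is_error=True),
    ]
    print(" -> ".join(path_to_first_error(trace)))
```

The parent links are what turn a flat list of spans into the request's journey across services, so a failure can be located in the service that produced it.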

What is IT Ops? All You Need to Know in 2025

Oct 23, 2025 By Nuno Tomas In isDown

IT Operations, commonly known as IT Ops, forms the backbone of every organization's technology infrastructure. IT Ops teams ensure that all systems, networks, and services run smoothly, keeping businesses operational 24/7.

Read Post

isDown

Read more about What is IT Ops? All You Need to Know in 2025

The MCP Server Overview #grafana #ai #mcpserver

Oct 23, 2025 By Grafana In Grafana

Watch the complete video about using the Grafana MCP Server.

View Video

Grafana

Read more about The MCP Server Overview #grafana #ai #mcpserver

Introducing The Next Phase Of Synthetic Monitoring: Playwright Check Suites

Oct 23, 2025 By Hannes Lenke In Checkly

We've been running Playwright in production since the beginning. Today, we're going all in. When we first launched Browser Checks with Playwright support, we proved something critical: the most popular test automation framework since Selenium isn't just for testing—it's the foundation of modern production monitoring. But that was just the beginning. Today, we're announcing Playwright Check Suites—our bet on the future of monitoring and the most significant evolution in Checkly's history.

Read Post

Checkly

Read more about Introducing The Next Phase Of Synthetic Monitoring: Playwright Check Suites

Enhanced Flexibility and Security Monitoring - New in DataStream

Oct 23, 2025 By VirtualMetric In VirtualMetric

This update delivers significant advances in operational flexibility and security monitoring capabilities. It addresses the evolving needs of security teams across diverse deployment environments, from air-gapped networks to those prioritizing automation and simplicity, while expanding integration options and improving visibility into data flows.

Read Post

VirtualMetric

Read more about Enhanced Flexibility and Security Monitoring - New in DataStream

Why do you only use Playwright for pre-release testing and not for production monitoring, too?

Oct 23, 2025 By Checkly In Checkly

We've been running Playwright in production for years. Today, we, at Checkly, are going all in with Playwright Check Suites. Playwright Check Suites is our latest step towards uniting testing and monitoring into a single workflow. It's our biggest advancement yet! Here's why this matters: We're not adapting Playwright anymore. We're running it natively in production with full `playwright.config` support, complete custom dependency control, and support for every tag, spec, or configuration.

View Video

Checkly

Read more about Why do you only use Playwright for pre-release testing and not for production monitoring, too?

How to solve authentication failures when you have an Azure setup

Oct 23, 2025 By Geoffrin Edwin In Site24x7

It is not just your business. Enterprises worldwide face recurring technical issues related to authentication failures and access problems. These errors often pop up in scenarios involving service connection setups, pod start failures, or integration issues. Most of the time, they indicate failed deployments, pods failing to pull images, or intermittent authentication and access errors.

Read Post

Site24x7

Read more about How to solve authentication failures when you have an Azure setup

How to Replace Synthetics with the httpcheck Receiver

Oct 23, 2025 By Mike Terhar In Honeycomb

A 200 OK doesn't always mean everything is okay. You've probably seen it: your health check endpoint returns success, but your users are staring at an error page. Maybe the database connection pool is exhausted, or a critical downstream service is timing out, but your API dutifully returns 200 because technically it responded. This is the reality of monitoring HTTP endpoints in production—status codes alone don't tell the whole story.

Read Post

Honeycomb

Read more about How to Replace Synthetics with the httpcheck Receiver
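
The point made in the Honeycomb teaser above, that a 200 status alone does not prove a service is working, can be sketched in a few lines of Python. This is an illustration only and says nothing about how the httpcheck receiver is configured or behaves. The JSON shape (`status` plus a `dependencies` map) and the `is_healthy` function are assumptions invented for this example.

```python
import json


def is_healthy(status_code: int, body: str) -> bool:
    """Treat a response as healthy only if the status AND the body agree.

    Assumed (hypothetical) body shape:
        {"status": "ok", "dependencies": {"db": "ok", "payments": "ok"}}
    """
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A 200 carrying an HTML error page or other non-JSON body is a failure.
        return False
    if not isinstance(payload, dict) or payload.get("status") != "ok":
        return False
    dependencies = payload.get("dependencies", {})
    if not isinstance(dependencies, dict):
        return False
    # Every reported dependency must be "ok" as well.
    return all(state == "ok" for state in dependencies.values())


if __name__ == "__main__":
    print(is_healthy(200, '{"status": "ok", "dependencies": {"db": "ok"}}'))
    print(is_healthy(200, '{"status": "ok", "dependencies": {"db": "timeout"}}'))
```

A check like this fails the second response even though the endpoint answered with 200, because the body reports a dependency that is timing out.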

Unlocking the Power of Dashboards -- Customer Brown Bag -- October 23rd, 2025

Oct 23, 2025 By Sumo Logic, Inc. In Sumo Logic

Join us as Michael walks through how to build a dynamic dashboard, using a Node.js service as an example.

View Video

Sumo Logic

Read more about Unlocking the Power of Dashboards -- Customer Brown Bag -- October 23rd, 2025

10 Proven APM Best Practices to Reducing Latency and Improving Response Time

Oct 23, 2025 By Mohana Ayeswariya J In Atatus

Speed defines user loyalty. Recent market research indicates that organizations adopting advanced application performance monitoring (APM) tools are achieving measurable gains in user engagement, retention, and revenue. “A 2025 performance study found that businesses tracking latency and response time proactively reduced customer churn by up to 30%.” As applications expand across distributed architectures, microservices, and cloud environments, performance gaps become harder to diagnose.

Read Post

Atatus

Read more about 10 Proven APM Best Practices to Reducing Latency and Improving Response Time

Top 11 Ruby APM Tools for 2025: A Performance-Driven Selection

Oct 23, 2025 By Anjali Udasi In Last9

Observability has become a core part of running Ruby applications at scale. Knowing how your app performs — from request latency to background job execution — helps catch slowdowns early and improve reliability. This blog walks through some of the most useful APM tools for Ruby in 2025. Each section highlights what the tool does well, where it fits best, and what kind of visibility it brings to your application's performance.

Read Post

Last9

Read more about Top 11 Ruby APM Tools for 2025: A Performance-Driven Selection

Unpacking the Elements of Site Uptime (by way of Jeopardy!)

Oct 23, 2025 By AlertBot In AlertBot

Picture this: you’ve achieved your second lifelong dream of being a contestant on Jeopardy! Now it’s time for the fateful “final answer.” The good news? You’ve got a comfortable lead over your fellow contestants, and a correct response means eternal bragging rights. The bad news? Miss this one, and everyone — your family, coworkers, dentist, mechanic — will remind you of it forever. The lights dim. The audience holds its breath.

Read Post

AlertBot

Read more about Unpacking the Elements of Site Uptime (by way of Jeopardy!)

Declarative Configuration in OTel (Grafana OpenTelemetry Community Call #1)

Oct 23, 2025 By Grafana In Grafana

We’re kicking off a brand-new Grafana OpenTelemetry Community Call! Join us as we dive into getting observability into your apps and infrastructure with Grafana, powered by OpenTelemetry. In this session, we’ll dive into Declarative Config — the new way to make OpenTelemetry onboarding simple and powerful. Instead of juggling environment variables or boilerplate in your startup code, declarative config gives you a clean, language-agnostic approach that works across SDKs and unlocks future possibilities like remote configuration. Join us with Marylia Gutierrez (OTel JavaScript approver & core contributor) to explore.

View Video

Grafana

Read more about Declarative Configuration in OTel (Grafana OpenTelemetry Community Call #1)

How Atlassian built a smarter observability system with Grafana and OpenTelemetry

Oct 23, 2025 By Grafana In Grafana

Discover how Atlassian built OpsDeck, an observability platform powered by Grafana, to automate incident detection, improve response time, and reduce troubleshooting from one hour to under a minute. Hear how the Observability Insights team scaled OpenTelemetry, broke silos, and built smarter workflows for both engineers and support.

View Video

Grafana

Read more about How Atlassian built a smarter observability system with Grafana and OpenTelemetry

Demystifying WMI Permissions

Oct 23, 2025 By Greg Collins In WhatsUp Gold

Network administrators are always seeking to gain a deeper understanding of their Windows-based environments. Windows Management Instrumentation (WMI) enables their network monitoring tools to access system information, manage configurations and automate tasks. It plays a vital role in network monitoring by providing a standardized interface for querying and controlling system components. A complex set of permissions governs WMI access.

Read Post

WhatsUp Gold

Read more about Demystifying WMI Permissions

Kubernetes monitoring & observability trends 2026 | Future of Kubernetes observability

Oct 23, 2025 By Grace Nalini In Site24x7

Kubernetes continues to dominate as the container orchestration standard, but the way we monitor and observe clusters is rapidly evolving. As we head into 2026, Kubernetes monitoring is moving toward actionable insights, cost-aware observability, and security-first approaches. This blog dives deep into what engineers, architects, and platform teams should watch for in the year ahead — with real-world examples for context.

Read Post

Site24x7

Read more about Kubernetes monitoring & observability trends 2026 | Future of Kubernetes observability

Clarity in the Dojo: The power of the Summary Agent

Oct 23, 2025 By Christopher Beier In Sumo Logic

In the dojo, not every role is about throwing punches. Some roles are about awareness, the unmistakable voice that tells the fighter when to move, where the strike is coming from, and why the opponent matters. That’s the role of the Summary Agent in Sumo Logic Dojo AI. Unlike a traditional agent, it doesn’t launch queries or carry out actions on its own. Its purpose is to narrate, not act. In doing so, it becomes the foundation for every other decision in the dojo.

Read Post

Sumo Logic

Read more about Clarity in the Dojo: The power of the Summary Agent

Get organized, actionable insights from complex test environments with Datadog Test Suites

Oct 23, 2025 By Lauren Zuniga In Datadog

Modern teams often run hundreds of synthetic tests across multiple services, environments, and user journeys. While these tests provide deep visibility, managing them as a flat list can quickly become overwhelming, especially as organizations scale and teams specialize.

Read Post

Datadog

Read more about Get organized, actionable insights from complex test environments with Datadog Test Suites

What's New at Catchpoint Ep 3: Session Replay

Oct 23, 2025 By Catchpoint In Catchpoint

Get detailed data on how real users are using your applications in real time with Catchpoint’s Session Replay, with a user’s session replayed like a movie.

View Video

Catchpoint

Monitoring

Read more about What's New at Catchpoint Ep 3: Session Replay

The next evolution of WebPageTest has arrived, and it's a game-changer

Oct 23, 2025 By Piril Kavlak In Catchpoint

Now fully integrated into Catchpoint’s Internet Performance Monitoring (IPM) platform, WebPageTest is no longer just a testing tool; it’s your full-stack performance command center. From AI-powered insights to automation and Smartboards, the new WebPageTest gives digital experience teams everything they need to move beyond page speed and master end-to-end performance. Test smarter, detect faster, and optimize every layer of performance with a unified, AI-powered platform built for experts.

Read Post

Catchpoint

Read more about The next evolution of WebPageTest has arrived, and it's a game-changer

Grafana and Grafana Cloud release cycle: An end-of-year update

Oct 23, 2025 By Bruno Abrantes In Grafana

With the end of the year fast approaching, we want to let you know about some important dates for our upcoming release freezes. Our annual release freeze helps ensure stability for everyone during the holiday season, which is a critical time for many of our customers. This pause helps us protect our on-call teams and maintain a smooth experience for you.

Read Post

Grafana

Read more about Grafana and Grafana Cloud release cycle: An end-of-year update

AI Agent for Cloud Cost Optimization: From Blind Spots to Smarter Spend

Oct 23, 2025 By Sayali Pagrut In Digitate

Cloud has become the backbone of digital enterprises, but managing its cost footprint is proving increasingly difficult. With multiple providers, diverse pricing models, and ever-changing workloads, organizations often find themselves facing spend leakage and unanticipated overruns. The stakes are high—not only in terms of IT budgets but also in ensuring cloud resources deliver maximum business value.

Read Post

Digitate

Read more about AI Agent for Cloud Cost Optimization: From Blind Spots to Smarter Spend

How to Escape Legacy Monitoring (and Thrive in the Era of AI Networks)

Oct 22, 2025 By Phil Gervasi In Kentik

The network has transformed dramatically. Hybrid cloud architectures and AI-powered infrastructure seem to expand in scale and complexity every day. But legacy tools to monitor and observe these networks have not kept up.

Read Post

Kentik

Read more about How to Escape Legacy Monitoring (and Thrive in the Era of AI Networks)

See What Your Users Saw: Session Replay Is Here

Oct 22, 2025 By Rollbar In Rollbar

Debugging JavaScript errors just got easier. Today, we're launching Session Replay for Rollbar, giving you visual context for every error with customizable triggers and the only open-source MCP integration that lets AI analyze your sessions directly in your IDE.

Read Post

Rollbar

Read more about See What Your Users Saw: Session Replay Is Here

How to bridge speed and quality in experiments through unified data

Oct 22, 2025 By Addie Beach In Datadog

Metrics are fundamental to experimentation for two reasons: They set the basis for evaluating ideas and interventions, and they can suggest where to look next. As such, many teams collect a wide variety of metrics, from application performance data to revenue trends. However, doing so often means manually knitting together data from multiple sources and formats. Even then, data silos can make it challenging to understand the full impact of experimental changes. In this post, we’ll explore.

Read Post

Datadog

Read more about How to bridge speed and quality in experiments through unified data

The Network Engineers You Can't Hire? They Already Work for You

Oct 22, 2025 By Yann Guernion In Broadcom

In my conversations about managing large, complex networks, one topic is now constant. The issue isn't budgets or new technology; it's about personnel. Specifically, it's the increasing difficulty of finding and retaining skilled professionals. If you are feeling this pressure, you are not alone. The search for technical talent is a universal challenge.

Read Post

Broadcom

Read more about The Network Engineers You Can't Hire? They Already Work for You

What's New in Network Observability for Fall 2025

Oct 22, 2025 By Sean Armstrong In Broadcom

As your partner in network observability, we’ve worked together to help you manage an increasingly complex digital landscape. You’ve built a powerful monitoring foundation, but the pace of change doesn’t slow down. Your network continues to expand across hybrid clouds and multi-vendor SD-WAN, and the demands on your team grow with it.

Read Post

Broadcom

Read more about What's New in Network Observability for Fall 2025

Datadog Cloud Cost Management: Make cost a key metric for engineers

Oct 22, 2025 By Datadog In Datadog

See how Datadog Cloud Cost Management puts cost and efficiency KPIs directly in front of engineers in their daily workflows. Datadog unifies cost, performance, and business metrics in one platform, so FinOps, engineering, and finance teams can make cost-aware decisions together.

View Video

Datadog

Read more about Datadog Cloud Cost Management: Make cost a key metric for engineers

5 Log Management Best Practices for Your Organization

Oct 22, 2025 By Logz.io In logz.io

At Logz.io, we speak with hundreds of companies every month. One thing is consistent across the board: everyone ships logs. But the challenges are equally common: What are the best practices for logging? How do we reduce noise? How should we architect our logs to make them truly useful? The reality is that logs are noisy for everyone. The best time to standardize your logging practices is when you write your first line of code—though that rarely happens. The second-best time is now.

Read Post

logz.io

Read more about 5 Log Management Best Practices for Your Organization

Grafana Tempo 2.9 release: MCP server support, TraceQL metrics sampling, and more

Oct 22, 2025 By Tiffany Jernigan In Grafana

Grafana Tempo 2.9 is now available, delivering MCP server support, TraceQL performance improvements, and more. Watch the video below to see the Tempo MCP server in action and learn how to speed up TraceQL metrics queries, or continue reading to get a quick overview of these and other updates. The Grafana Tempo 2.9 release notes and changelog provide more in-depth details and include all of the changes that came with this release.

Read Post

Grafana

Read more about Grafana Tempo 2.9 release: MCP server support, TraceQL metrics sampling, and more

Two Factors, Double Security?

Oct 22, 2025 By Johannes Rauh In Icinga

“Please enter the code we just sent you.” – most people have seen this message when logging into an online service. Two-Factor Authentication (2FA) is no longer reserved for banks or enterprises. It’s now common in email, social media, and shopping accounts. The idea is simple: in addition to a password, you need a second factor so that attackers can’t break in with just one piece of information. But what methods are actually used – and how secure are they really?

Read Post

Icinga

Read more about Two Factors, Double Security?

Your network isn't infrastructure anymore. It's a product.

Oct 22, 2025 By Yann Guernion In Broadcom

In my last blog, I discussed a common problem: metrics like mean time to resolution (MTTR) mean nothing to business leaders. Celebrating a faster fix for an outage that still cost the company thousands in lost sales is a conversation that goes nowhere. You might as well be speaking a different language.

Read Post

Broadcom

Read more about Your network isn't infrastructure anymore. It's a product.

We've refreshed and expanded the StatusGator Help Center

Oct 22, 2025 By Valeria Kurolapova In StatusGator

We’re excited to share a major update to the StatusGator Help Center — redesigned to make finding answers and learning new features faster and easier than ever. We’ve reorganized our documentation, added new guides, and improved formatting so you can navigate with ease — whether you’re just getting started or managing advanced integrations.

Read Post

StatusGator

Read more about We've refreshed and expanded the StatusGator Help Center

Latency & Leadership with Mehdi Daoudi

Oct 22, 2025 By Catchpoint In Catchpoint

Leadership is about more than telling people what to do. It’s about inspiring belief in your vision for the future. Sometimes there’s a delay between the time you share the vision and when the rest of the team “gets it”. The Latency & Leadership series hopes to shorten that lag time by creating a platform for leaders in the tech space to share their ideas, their passion, and their vision.

View Video

Catchpoint

Monitoring

Read more about Latency & Leadership with Mehdi Daoudi

Elastic recognized as a finalist for Innovation in Customer Portals in 2025 TSIA STAR Awards

Oct 22, 2025 By Elastic In Elastic

We are proud to announce that Elastic has been named a finalist by the Technology & Service Industry Association (TSIA) in the 2025 STAR Awards program for Innovation in Customer Portals that Improve Digital Customer Experience. This award recognizes Elastic’s ability to embrace AI innovations to enhance our digital customer experience.

Read Post

Elastic

Read more about Elastic recognized as a finalist for Innovation in Customer Portals in 2025 TSIA STAR Awards

Splunk report shows observability is a business catalyst for AI adoption, customer experience, and product Innovation

Oct 21, 2025 By Splunk In Splunk

Findings show observability boosts employee productivity for nearly three-quarters of respondents, and for nearly two-thirds, it drives revenue growth and helps shape product roadmaps.

Read Post

Splunk

Read more about Splunk report shows observability is a business catalyst for AI adoption, customer experience, and product Innovation

GenAI significantly drops incident response time for ITSM teams, new SolarWinds report reveals

Oct 21, 2025 By SolarWinds In SolarWinds

Findings show an average of nearly 5 hours of time can be saved per incident using ITSM operations with GenAI.

Read Post

SolarWinds

Read more about GenAI significantly drops incident response time for ITSM teams, new SolarWinds report reveals

Sponsored Post

Hidden Cost of Siloed Monitoring Tools

Oct 21, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

In today's complex IT environments, organizations often rely on a patchwork of specialized monitoring tools. One platform might monitor databases, another cloud workloads, a third enterprise applications, and yet another the infrastructure itself. While each tool addresses a specific need, this fragmented approach introduces hidden costs that can undermine operational efficiency, inflate budgets, and slow response times when critical incidents occur.

Read Post

NiCE IT Mgmt

Read more about Hidden Cost of Siloed Monitoring Tools

The Hidden Risk of DNS - Lessons from the AWS Outage & Why You Need DNS Spy Monitoring NOW

Oct 21, 2025 By DNS Spy In DNS Spy

On October 20, 2025, much of the internet came to a halt. Apps wouldn’t load. Payments failed. Cloud dashboards went dark. From Fortnite to Alexa, Snapchat, and countless business platforms, users across the world were suddenly offline — all because DNS broke inside Amazon Web Services’ (AWS) US-East-1 region.

Read Post

DNS Spy

Read more about The Hidden Risk of DNS - Lessons from the AWS Outage & Why You Need DNS Spy Monitoring NOW

What's New at Catchpoint Ep 3: RUM Analysis

Oct 21, 2025 By Catchpoint In Catchpoint

Introducing Catchpoint’s brand new AI-driven automatic root cause analysis tool! Stack Map Root Cause Analysis AI Insights.

View Video

Catchpoint

Read more about What's New at Catchpoint Ep 3: RUM Analysis

Amazon Isn't Eating Its Own DNS Dog Food

Oct 21, 2025 By Matt Rideout In DNS Check

On October 19-20, 2025, Amazon Web Services (AWS) experienced a significant outage (AWS status) affecting its US-EAST-1 region in northern Virginia. The root cause was DNS resolution failures for DynamoDB’s API endpoints, which cascaded across AWS’s interconnected services, disrupting major platforms including Snapchat, McDonald’s, Disney+, Roblox, Coinbase, Reddit, and Amazon’s own services.

Read Post

DNS Check

Read more about Amazon Isn't Eating Its Own DNS Dog Food

How WWT Proves the Value of Agentic AIOps with LogicMonitor's Edwin AI

Oct 21, 2025 By Margo Poda In LogicMonitor

Agentic AI has entered day-to-day operations. Systems with the ability to act, learn, and adjust are already cutting noise, speeding remediation, and giving engineers time back for work that moves the business. In a recent webinar, Karthik SJ, General Manager, AI at LogicMonitor, and Mike Cervasio, Global Practice Manager, AIOps at World Wide Technology, explored what makes this new phase of AIOps actionable.

Read Post

LogicMonitor

Read more about How WWT Proves the Value of Agentic AIOps with LogicMonitor's Edwin AI

Live in Boston: Data, DEX, and a Few Fist Fights @ Nexthink Experience

Oct 21, 2025 By Nexthink In Nexthink

Tim and Tom host another special live edition of The DEX Show, this time from the Omni Boston Hotel, recorded during last week’s Experience Boston. Joined by Christina Lahr (Bayer), James Krick (Campbell’s), and Ryan Way (Warburg Pincus), the hosts dig into more real-world stories of data-led IT excellence, once again in-person. In between, listeners can learn a few unexpected facts about Tim — has he ever been in a fist fight, starred in a play, or been thrown out of a bar? Listen now to find out...

View Video

Nexthink

Read more about Live in Boston: Data, DEX, and a Few Fist Fights @ Nexthink Experience

What Is an Email Blacklist?

Oct 21, 2025 By Simon Rodgers In WebSitePulse

An email blacklist is a database that lists IP addresses or domains suspected of sending spam or malicious emails. Mail servers use these lists to decide whether to deliver or reject incoming messages. Understanding how blacklists work is essential for keeping your messages deliverable and your domain reputation intact.

Read Post

WebSitePulse

Read more about What Is an Email Blacklist?

Introducing Updog.ai: Real-time provider status from Datadog

Oct 21, 2025 By Brianne Bujnowski In Datadog

When external SaaS providers or cloud services degrade or go down, engineers often find themselves wondering if the issue they're encountering is local or more widespread. The answers they find are usually slow to surface, limited in detail, or entirely dependent on the provider's updates. Vendor-controlled status pages and third-party aggregators don’t provide the timely, independent visibility that's necessary to quickly and accurately identify the root cause of slowdowns.

Read Post

Datadog

Read more about Introducing Updog.ai: Real-time provider status from Datadog

What is Open Telemetry? The Future Is Here

Oct 21, 2025 By solarwindsinc In SolarWinds

Watch SolarWinds tech evangelist, Sascha Giese, dive into OpenTelemetry (OTel) and explain why a vendor-agnostic standard is the future of observability and application performance monitoring (APM). If you’ve ever wondered what OpenTelemetry is, Sascha’s presentation is a great place to start, or to dive back into the topic.

View Video

SolarWinds

Read more about What is Open Telemetry? The Future Is Here

Optimize HPC jobs and cluster utilization with Datadog

Oct 21, 2025 By Michael Cronk In Datadog

High-performance computing (HPC) environments support some of the most critical workloads in the world—from asset pricing models in financial institutions to molecular simulations in drug discovery. These workloads often span hundreds of thousands of cores, depend on specialized infrastructure such as GPUs, and run for extended periods. As a result, performance and efficiency are critical.

Read Post

Datadog

Read more about Optimize HPC jobs and cluster utilization with Datadog

Detect and map third-party outages with Datadog External Provider Status

Oct 21, 2025 By Brianne Bujnowski In Datadog

Modern applications depend on dozens of external cloud platforms, APIs, and SaaS services to function. But when those providers experience issues, engineers often spend valuable time asking a basic question: Is the problem with us or with them? Provider-maintained status pages are often slow to update, leaving teams waiting for confirmation while incidents escalate. This delay wastes valuable time, prolongs investigations, and risks customer trust.

Read Post

Datadog

Read more about Detect and map third-party outages with Datadog External Provider Status

Authentication Model in OpenTelemetry

Oct 21, 2025 By Elizabeth Mathew In SigNoz

In any type of software that involves the movement of data or information, there is a pressing need to make the passage of data secure. One way of achieving this is by authentication. You must have experience authenticating API calls or other data streams. In modern systems, where even a small mishap can wreak havoc and you might wake up to a $$$ bill the next day, we should do whatever is within our capacity to secure our systems.

Read Post

SigNoz

Read more about Authentication Model in OpenTelemetry

Traceroute vs. Ping: When to Use Each

Oct 21, 2025 By Andrii Kernitskyi In Obkio

Let’s talk about the most fundamental network diagnostic tools: ping and traceroute. These command-line utilities have been the backbone of network troubleshooting for decades, yet many IT professionals struggle to use them in the right context. Knowing which tool to use (and when) can mean the difference between a five-minute fix and hours of frustration. While both ping and traceroute help diagnose network connectivity issues, they serve distinctly different purposes.

Read Post

Obkio

Read more about Traceroute vs. Ping: When to Use Each

Network Monitoring for Data Centers

Oct 21, 2025 By Kentik In Kentik

Kentik NMS (Network Monitoring System), part of the Kentik Network Intelligence Platform, brings true visibility and context to network operations. See how device metrics, traffic data, and application insights come together to eliminate blind spots—so your critical workloads, like AI training and inference, run smoothly and reliably.

View Video

Kentik

Read more about Network Monitoring for Data Centers

The Monitoring Blind Spot That Could Cost You Black Friday

Oct 21, 2025 By Denton Chikura In Catchpoint

With Black Friday and the holiday season looming, IT teams everywhere are bracing themselves for what is, year after year, the most daunting stress test of your entire service delivery chain. Under relentless peak demand, every link in your digital experience is scrutinized by customers whose tolerance for friction is at an all-time low. It’s not just about uptime, monitoring dashboards, or technical metrics.

Read Post

Catchpoint

Read more about The Monitoring Blind Spot That Could Cost You Black Friday

AI Agent for Incident Resolution: Combining Intelligence with Autonomous Actions

Oct 21, 2025 By Somdipto Ghosh In Digitate

Incident management is a high-stakes function. IT operations teams and SRE teams may play different roles, but when a priority incident surfaces, it is often all-hands-on-deck to ensure it is resolved in minimal time. That’s because of the high impact of incidents: if not resolved in time, they can cascade and affect other IT systems, leading to downtime, business disruptions, and monetary losses, and affecting brand value and regulatory compliance.

Read Post

Digitate

Read more about AI Agent for Incident Resolution: Combining Intelligence with Autonomous Actions

Fireside Chat: Innovation and Emerging Technologies

Oct 21, 2025 By Datadog In Datadog

Join Olivier Pomel, co-founder and CEO of Datadog, and Arthur Mensch, co-founder and CEO of Mistral AI, for a discussion on emerging technologies and innovation, their impact on businesses today, and the new opportunities they offer.

View Video

Datadog

Read more about Fireside Chat: Innovation and Emerging Technologies

Datadog Cloud Cost Management: Telemetry-driven cost allocation

Oct 21, 2025 By Datadog In Datadog

See why Datadog is a leader in cloud cost allocation. In this demo, learn how Datadog leverages high-resolution observability data to deliver accurate, dynamic cost attribution across clouds and containerized environments. Discover how Datadog combines cost, performance, and business context to make cost reporting both accurate and actionable.

View Video

Datadog

Read more about Datadog Cloud Cost Management: Telemetry-driven cost allocation

The Agentic Enterprise Needs a Nervous System

Oct 21, 2025 By ScienceLogic In ScienceLogic

Over the weekend, when Salesforce introduced the concept of the Agentic Enterprise, it wasn’t defining a new market trend. It was signaling an inflection point. A moment when the conversation about artificial intelligence stopped being about tools and started being about trust. For the first time in decades, enterprise software isn’t simply enabling decisions. It’s making them. Systems are reasoning, choosing, and acting in real time across sprawling digital ecosystems.

Read Post

ScienceLogic

Read more about The Agentic Enterprise Needs a Nervous System

Bridging partners in pursuit of agentic AI - Part 2: How leaders can position themselves for the future

Oct 21, 2025 By Elastic In Elastic

From ecosystem foundations to future advantage. In Part 1: Why partnerships matter for enterprise intelligence, we explored how enterprises are moving from experimentation to scalable impact with agentic AI and how ecosystems make that possible. But naturally, the next question is: Where do we go from here?

Read Post

Elastic

Read more about Bridging partners in pursuit of agentic AI - Part 2: How leaders can position themselves for the future

SquaredUp Live '25

Oct 21, 2025 By Squared Up In Squared Up

Building on the success of previous years, this year’s edition of our premier customer event dove even deeper into exploring how SquaredUp users can leverage the power of their data. Our sessions featured real-life use cases, product demos, dashboarding tips and tricks, and live discussions with the SquaredUp team and fellow users.

Read Post

Squared Up

Read more about SquaredUp Live '25

AI-Powered Translation Tools: A Hidden Asset for Scaling DevOps Globally

Oct 21, 2025 By OpsMatters In OpsMatters

DevOps teams, which combine development (Dev) and IT operations (Ops), are no longer confined to single geographic locations or language groups. With over 80% of organizations now practicing DevOps (a figure projected to reach 94% in the near future), the challenge of scaling operations globally has never been more critical. Yet, one persistent bottleneck continues to slow down even the most sophisticated DevOps workflows: language barriers.

Read Post

OpsMatters

Read more about AI-Powered Translation Tools: A Hidden Asset for Scaling DevOps Globally

The Curse of Bad Digital Experience: When Client Trust Disappears

Oct 20, 2025 By Teneo In Teneo

Every law firm fears the same ghost – the one that silently drains client trust. It doesn’t haunt the courtroom; it lurks in your digital experience.

Read Post

Teneo

Read more about The Curse of Bad Digital Experience: When Client Trust Disappears

Making logs work smarter: Evolving your observability strategy

Oct 20, 2025 By Matt Wimpelberg In Grafana

When you start building an observability stack, it’s natural to reach for logs first. They’re familiar, easy to generate, and often already part of a developer’s workflow. And sending logs to a centralized system feels like a quick win, too. Simply add a log shipper, and voila, your application is observable.

Read Post

Grafana

Read more about Making logs work smarter: Evolving your observability strategy

REST API and Terraform Provider

Oct 20, 2025 By Uptime Website Monitoring In uptime

Learn how to use the Uptime.com REST API and Terraform Provider. This step-by-step guide will walk you through the process! We offer a 30-day money-back guarantee for all new users.

View Video

uptime

Read more about REST API and Terraform Provider

Bridging partners in pursuit of agentic AI - Part 1: Why partnerships matter for enterprise intelligence

Oct 20, 2025 By Sunnie Weber In Elastic

The pace of change in AI development has been dizzying. In just a few years, we’ve moved from experimenting with AI, machine learning (ML), retrieval augmented generation (RAG), and agents to asking how these innovations can solve real business problems. Enterprises are no longer impressed by the novelty and possibilities; instead, they expect outcomes.

Read Post

Elastic

Read more about Bridging partners in pursuit of agentic AI - Part 1: Why partnerships matter for enterprise intelligence

Navigating the Database Ecosystem in 2025

Oct 20, 2025 By Charles Mahler In InfluxData

In 2025, the database ecosystem is more diverse and interconnected than ever before. From AI-assisted natural language queries that analyze your data to open table formats that make it easy to bridge systems, data infrastructure is moving towards openness, intelligence, and composability. Modern databases are no longer isolated systems; they are part of a broader ecosystem where interoperability is as important as performance.

Read Post

InfluxData

Read more about Navigating the Database Ecosystem in 2025

Lessons from the October 2025 AWS DNS Outage

Oct 20, 2025 By Matt Rideout In DNS Check

Early on October 20, 2025, Amazon Web Services (AWS) experienced a significant outage affecting its US-EAST-1 region in northern Virginia. The root cause was DNS resolution failures for DynamoDB’s API endpoints, which cascaded across AWS’s interconnected services.

Read Post

DNS Check

Read more about Lessons from the October 2025 AWS DNS Outage

RED Metrics & Monitoring: Using Rate, Errors, and Duration

Oct 20, 2025 By Stephen Watts In Splunk

The RED method is a streamlined approach for monitoring microservices and other request-driven applications, focusing on three critical metrics: Rate, Errors, and Duration. Originating from the principles established by Google's "Four Golden Signals," the RED monitoring framework offers a pragmatic and user-centric perspective on service assurance and service performance.
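To make the three metrics concrete, here is a minimal Python sketch that computes them over a window of request records. The `Request` type, the choice to count only 5xx responses as errors, and the use of a nearest-rank 95th percentile for Duration are assumptions for this example, not something the post prescribes.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float  # how long the request took
    status: int         # HTTP status code returned


def red_metrics(requests, window_seconds):
    """Compute Rate, Errors, and Duration for one time window."""
    total = len(requests)

    # Rate: requests per second over the window.
    rate = total / window_seconds

    # Errors: fraction of requests that failed (assumed here: 5xx only).
    failed = sum(1 for r in requests if r.status >= 500)
    error_ratio = failed / total if total else 0.0

    # Duration: nearest-rank 95th percentile of request latency.
    durations = sorted(r.duration_ms for r in requests)
    if total:
        rank = (95 * total + 99) // 100  # ceil(0.95 * total) in integers
        p95_ms = durations[rank - 1]
    else:
        p95_ms = 0.0

    return {"rate_per_s": rate, "error_ratio": error_ratio, "p95_ms": p95_ms}
```

In practice these values usually come from a metrics backend rather than raw request lists; the sketch only shows what each of the three numbers measures.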

Read Post

Splunk

Read more about RED Metrics & Monitoring: Using Rate, Errors, and Duration

Get started with Grafana Alerting: Route alerts using dynamic labels

Oct 20, 2025 By Grafana In Grafana

In this tutorial, you will learn how to configure notification policies for dynamic routing based on query values. Don't miss the rest of the "Get started with Grafana Alerting" series! Each part dives into a different feature to help you get the most out of alerting in Grafana.

View Video

Grafana

Read more about Get started with Grafana Alerting: Route alerts using dynamic labels

Application Performance Monitoring (APM) Guide: Monitor and Optimize Application Performance

Oct 20, 2025 By Vaishnavi In Atatus

Every millisecond your application takes to respond can decide whether a user stays or leaves. But here’s the catch: you can’t improve what you can’t see. Behind every slow page load, failed API call, or random spike in latency lies a story your application is trying to tell. Application Performance Monitoring (APM) is how you listen to that story.

Read Post

Atatus

Read more about Application Performance Monitoring (APM) Guide: Monitor and Optimize Application Performance

Alert RCA with SigNoz MCP Server & Claude

Oct 20, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Alert RCA with SigNoz MCP Server & Claude.

View Video

SigNoz

Read more about Alert RCA with SigNoz MCP Server & Claude

Demo of Raygun's remote MCP

Oct 20, 2025 By Raygun In Raygun

This Raygun remote MCP demo highlights the new depth of context available. The agent isn’t just fetching error lists; it’s reasoning through stack traces to find the issues. Combine this with the ability to now view associated deployment versions, browser information, breadcrumbs, customer data and more, and the agent becomes far more capable at solving errors. We’ve even heard of some of the early testers going from having errors in production to having them solved within minutes.

View Video

Raygun

Read more about Demo of Raygun's remote MCP

AWS Outage: How do you prepare for the failure of your own safety net?

Oct 20, 2025 By Denton Chikura In Catchpoint

When AWS’s massive outage struck, it didn’t just take down cloud services, apps, and enterprise platforms. It also knocked out many of the monitoring systems organizations depend on for real-time answers. Observability companies, including Datadog, New Relic, Checkly, Dynatrace, SpeedCurve, and Splunk Observability, lost visibility or functionality precisely when organizations needed them most.

Read Post

Catchpoint

Read more about AWS Outage: How do you prepare for the failure of your own safety net?

Unreal Engine crash reporting now available on gaming consoles with trace-connected logs

Oct 20, 2025 By Ivan Tustanivskyi In Sentry

With the first major release of the Sentry Unreal SDK (now on v1.2.0, and you can also explore in our interactive sandbox), we’ve made some important improvements to support cross-platform Unreal developers when it comes to platform coverage, debugging with user feedback, and performance monitoring. Here’s what’s new.

Read Post

Sentry

Read more about Unreal Engine crash reporting now available on gaming consoles with trace-connected logs

10 Best Log Monitoring Tools

Oct 20, 2025 By OpsMatters In OpsMatters

Log monitoring stands as the backbone of resilient, secure, and high-performing digital operations. Every digital service, application, cloud platform, and network device leaves behind a trail of log files, containing raw, unstructured data that chronicles system events, user actions, errors, security activities, and business transactions. For organizations striving to achieve operational excellence, these logs are more than archives; they're the heartbeat of every mission-critical system.

Read Post

OpsMatters

Read more about 10 Best Log Monitoring Tools

Microsoft Teams Troubleshooting for Teams Performance and Connection Issues

Oct 19, 2025 By Alyssa Lamberti In Obkio

How many times has this happened? You're on a Microsoft Teams call, and your call disconnects, lags, or freezes, so you go to Google to look up how to solve the problem. Well, look no further! If you're using Microsoft Teams, there are proven ways to troubleshoot those pesky performance and connection issues that are putting a damper on your team's collaboration.

Read Post

Obkio

Read more about Microsoft Teams Troubleshooting for Teams Performance and Connection Issues

Self-Built vs Third-Party Management Packs for Microsoft SCOM

Oct 17, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Effective monitoring is crucial in today’s hybrid IT environments. SCOM 2025 enhances Windows Server 2025 and Linux agent support while enabling unified monitoring across on-premises and cloud workloads via Azure Arc.

Read Post

NiCE IT Mgmt

Read more about Self-Built vs Third-Party Management Packs for Microsoft SCOM

Show me the (meeting) money: How to monitor the real-time costs of a meeting in Grafana

Oct 17, 2025 By David Allen In Grafana

This meeting could’ve been an email. It’s a phrase most of us have said (or at least thought) at some point in our careers. For me, that realization hit years ago while working for a government organization. I’d frequently sit through long, agendaless meetings that seemingly went nowhere. I wasn’t sure why I was there. And because I’m an engineer at heart, I started to wonder: what were these meetings actually costing the organization?

Read Post

Grafana

Read more about Show me the (meeting) money: How to monitor the real-time costs of a meeting in Grafana

Introduction to SquaredUp

Oct 17, 2025 By SquaredUp In Squared Up

Adam Kinniburgh, VP Innovation at SquaredUp, provides a quick overview and look at SquaredUp's smart dashboards.

View Video

Squared Up

Read more about Introduction to SquaredUp

A deep dive into Java garbage collectors

Oct 17, 2025 By Jean-Philippe Bempel In Datadog

Historically, developers have relied on languages like C and C++ for explicit control over memory allocation and deallocation. This approach can yield very low overhead and tight control over performance, but it also increases complexity and risk (e.g., memory leaks, dangling pointers, and double frees). This often results in runtime issues that are difficult to diagnose, which can become a drag on team velocity.

Read Post

Datadog

Read more about A deep dive into Java garbage collectors

Ingest OTLP metrics directly into Datadog with the new OTLP Metrics API

Oct 17, 2025 By Connor Ward In Datadog

Many organizations rely on OpenTelemetry (OTel) to standardize observability across distributed systems. These organizations are at varying stages of adoption and are implementing OTel in complex environments with diverse configurations. To support this range of use cases, Datadog offers many ways to use OpenTelemetry with Datadog.

Read Post

Datadog

Read more about Ingest OTLP metrics directly into Datadog with the new OTLP Metrics API

Track, debug, and roll back changes with Version History for Synthetic Monitoring tests

Oct 17, 2025 By Lauren Zuniga In Datadog

A synthetic test is only useful if you can trust what it’s telling you. When one fails, the reason may not be obvious. Was the application updated? Did the test change? Or both? As more people contribute and refine the same test, it becomes harder to understand what changed or restore a working version. Without clear visibility into those updates, teams can spend more time tracking down the cause of a failure than resolving it.

Read Post

Datadog

Read more about Track, debug, and roll back changes with Version History for Synthetic Monitoring tests

From court to code: Build an agentic RAG assistant with Elasticsearch

Oct 17, 2025 By Elastic In Elastic

Want to see what it really takes to build a smart AI assistant? How about one that can help you make the right fantasy basketball picks? In this live session, we’ll demonstrate how to instantly activate and ground a high-performance AI agent using the Elastic Agent Builder, and we’ll show how it powers real-world use cases like smarter player picks. Join JD Armada, developer advocate, for a 20-minute live coding session to learn about.

View Video

Elastic

Read more about From court to code: Build an agentic RAG assistant with Elasticsearch

How Leading Businesses Achieved Greater Uptime with Atatus Monitoring

Oct 17, 2025 By Pavithra Parthiban In Atatus

When every second of downtime can mean lost revenue and frustrated customers, leading businesses can’t afford to leave performance to chance. That’s why they are turning to Application Performance Monitoring (APM) tools like Atatus, a Datadog alternative, to keep their applications healthy, detect issues before customers do, and achieve higher uptime than ever. But how exactly are they doing it?

Read Post

Atatus

Read more about How Leading Businesses Achieved Greater Uptime with Atatus Monitoring

Powering Mexico's Digital Future: Expanded Internet Observability with Catchpoint

Oct 17, 2025 By Gael Hernandez In Catchpoint

As of 2025, more than 110 million Mexicans are online, putting digital‐access penetration at roughly 83% of the population. Mexico is already one of Latin America’s anchor markets, leading the region in startup momentum, cloud adoption, and cross-border digital trade. A few days ago, CloudHQ announced a $4.6B investment in Mexico to open multiple datacenters. Yet even with this scale, service quality still varies dramatically across cities, states, and ISPs.

Read Post

Catchpoint

Read more about Powering Mexico's Digital Future: Expanded Internet Observability with Catchpoint

SharePoint Server Monitoring: Uptime, Performance & SLAs

Oct 17, 2025 By Dotcom-Monitor In Dotcom-Monitor

SharePoint is the backbone of internal collaboration for countless organizations. It hosts documents, drives workflows, powers intranets, and underpins team communication across departments. But when it slows down—or worse, goes dark—productivity grinds to a halt. The problem is that most monitoring approaches treat SharePoint like a static website. They check availability, not experience.

Read Post

Dotcom-Monitor

Read more about SharePoint Server Monitoring: Uptime, Performance & SLAs

Onboarding Microsoft Sentinel data lake with DataStream

Oct 17, 2025 By VirtualMetric In VirtualMetric

Modern security operations teams face an overwhelming challenge: a rapidly growing volume of logs, alerts, and telemetry from cloud services, on-premises infrastructure, and third-party security tools. Traditional SIEM platforms often struggle to scale cost-effectively and provide the agility needed for advanced analytics and threat hunting.

Read Post

VirtualMetric

Read more about Onboarding Microsoft Sentinel data lake with DataStream

The Hidden Barrier to Network Automation Isn't Your AI - It's Your Data

Oct 17, 2025 By Dallon Robinette In Selector

For years, the promise of AI-driven network automation has loomed large. Vendors and analysts alike have painted a future where autonomous operations handle outages before they happen, root causes are explained instantly, and teams finally escape the endless cycle of alerts, tickets, and manual troubleshooting. But in practice, most automation initiatives stall long before they reach that vision.

Read Post

Selector

Read more about The Hidden Barrier to Network Automation Isn't Your AI - It's Your Data

Tech Talk #10 Building a VictoriaMetrics PaaS: Setting up Metrics, Logs, and Traces

Oct 17, 2025 By VictoriaMetrics In VictoriaMetrics

From Blueprint to Reality: This episode is designed to be a practical, step-by-step guide. We will show you how to leverage the VictoriaMetrics Kubernetes Stack, our "easier button", to simplify the deployment process and get your components running quickly.

View Video

VictoriaMetrics

Read more about Tech Talk #10 Building a VictoriaMetrics PaaS: Setting up Metrics, Logs, and Traces

Frictionless Monitoring: Why IT Teams Are Choosing Progress WhatsUp Gold Over SolarWinds

Oct 17, 2025 By Progress WhatsUp Gold In WhatsUp Gold

In this webinar, you’ll learn how WhatsUp Gold helps IT teams.

View Video

WhatsUp Gold

Read more about Frictionless Monitoring: Why IT Teams Are Choosing Progress WhatsUp Gold Over SolarWinds

From Datadog to Checkly in minutes

Oct 17, 2025 By Checkly In Checkly

Looking to cut your Datadog bill and modernize your monitoring workflow? In this session, Dan Giordano and Giovanni Rago show how to migrate your Datadog synthetic monitors to Checkly in minutes, unlocking Playwright, Monitoring-as-Code, and AI-powered automation. In the intro, Dan explains why to migrate from Datadog, what will be covered, and who the session is for.

View Video

Checkly

Read more about From Datadog to Checkly in minutes

Introducing Obkio's Visual Traceroute Tool

Oct 17, 2025 By Obkio In Obkio

Introducing Obkio's New Visual Traceroute: See Your Network Issues, Not Just Hops. After years of evolution, we're launching the most advanced Visual Traceroute we've ever built, now fully integrated into the Obkio app. What's new: fully integrated visual network mapping; a historical timeline that actually remembers; correlation with Network Performance and SNMP data; no extra setup required; and the complete story of network issues, not just individual hops.

View Video

Obkio

Read more about Introducing Obkio's Visual Traceroute Tool

What is an Anycast network and how does it help handle high volume network traffic and effective query resolution?

Oct 16, 2025 By adithya.lk@zohocorp.com In ManageEngine

When an organization's authoritative DNS server faces a high volume of network traffic from multiple client devices, it needs more than one DNS server to handle the load. But manually routing queries to each DNS server in the network would be a tedious job for network admins. In turn, this would slow down network service responses, leading to delays and disruptions.

Read Post

ManageEngine

Read more about What is an Anycast network and how does it help handle high volume network traffic and effective query resolution?

Network Diagnostic Tools: What They Are, What They Do, and Why Network Pros Need Them

Oct 16, 2025 By Andrii Kernitskyi In Obkio

If you’ve ever been the “network person” in the room, you know how it goes: the moment something slows down or disconnects, everyone looks at you. The pressure’s on, and you need answers fast. Is it the Wi-Fi? The ISP? A misconfigured switch? Or maybe that new cloud app is hogging bandwidth? That’s where network diagnostic tools come in.

Read Post

Obkio

Read more about Network Diagnostic Tools: What They Are, What They Do, and Why Network Pros Need Them

25 Sumo Logic updates to better monitor and secure your Azure environments

Oct 16, 2025 By Margaret Selid In Sumo Logic

If you manage workloads across multiple clouds, you know how easy it is for critical alerts or performance issues to get lost in the noise. Switching between consoles, correlating logs, and tracking metrics across platforms can slow down troubleshooting, delaying incident resolution and increasing the risk of missing critical alerts.

Read Post

Sumo Logic

Read more about 25 Sumo Logic updates to better monitor and secure your Azure environments

How Legal IT Can Escape the Graveyard of Recurring Tickets

Oct 16, 2025 By Teneo In Teneo

It’s 3:30 p.m. A partner’s laptop refuses to authenticate to the VDI. The urgent filing is in two hours. The ticket title reads like a headstone you’ve seen a hundred times: “Can’t connect, tried rebooting, please help.” Another “undead” incident claws its way out of the queue. By home time, the backlog becomes a graveyard of recurring tickets, and your team, although brilliant and capable, is exhausted and applying the same fixes again and again.

Read Post

Teneo

Read more about How Legal IT Can Escape the Graveyard of Recurring Tickets

ISP Monitoring Explained: How to Measure, Manage, and Improve Internet Performance

Oct 16, 2025 By Joseph Nduhiu In Splunk

Reliable internet connectivity isn’t a convenience. It’s mission-critical infrastructure for modern organizations. Every organization today depends on high-speed, reliable internet access for daily operations—from cloud collaboration and data transfer to streaming, remote work, and customer engagement. As digital transformation accelerates, the rise of AI, large language models (LLMs), IoT, and device sprawl has massively increased bandwidth demand and network complexity.

Read Post

Splunk

Read more about ISP Monitoring Explained: How to Measure, Manage, and Improve Internet Performance

AWS and InfluxData Expand Strategic Offering with InfluxDB 3 on Amazon Timestream for InfluxDB

Oct 16, 2025 By Company In InfluxData

InfluxDB 3 Core and Enterprise bring real-time performance, unlimited cardinality, and low-cost storage to AWS developers running time series workloads at scale.

Read Post

InfluxData

Read more about AWS and InfluxData Expand Strategic Offering with InfluxDB 3 on Amazon Timestream for InfluxDB

InfluxDB 3 on Amazon Timestream for InfluxDB: Real-Time Performance, Now Fully Managed on AWS

Oct 16, 2025 By Evan Kaplan In InfluxData

Today, we’re announcing a major milestone for developers building the next generation of intelligent, real-time systems: InfluxDB 3 is available on Amazon Timestream for InfluxDB, now the default time series database offered directly in the AWS Management Console. This brings InfluxDB 3, our next-generation time series database, directly into the AWS ecosystem for the first time.

Read Post

InfluxData

Read more about InfluxDB 3 on Amazon Timestream for InfluxDB: Real-Time Performance, Now Fully Managed on AWS

Network Intelligence in the AI Era #network #networktraffic

Oct 16, 2025 By Kentik In Kentik

Transform your network strategy from guesswork to data-driven. In this session, you'll learn how to: Build a peering & transit strategy that cuts costs. Model real connectivity costs per customer. Use dashboards to improve margins and renewals. Ask natural-language questions with Kentik AI. Join experts from Kentik, NetMavens, and Seaborn Networks, hosted by Capacity Media, to align your network reality with your commercial goals.

View Video

Kentik

Read more about Network Intelligence in the AI Era #network #networktraffic

SOC 2 Type 2: Netdata's Security Controls Validated Over Time

Oct 16, 2025 By Shyam Sreevalsan In netdata

We’re excited to share that Netdata has successfully achieved SOC 2 Type 2 attestation. Following a five-month audit conducted by Sensiba LLP, we can now confirm that our security controls work consistently in practice. The audit covered the period from April 1 to August 31, 2025, and tested whether our controls operated effectively throughout that entire timeframe.

Read Post

netdata

Read more about SOC 2 Type 2: Netdata's Security Controls Validated Over Time

From pillars to rings: How interconnected observability in Grafana Cloud optimizes performance and reduces telemetry waste

Oct 16, 2025 By Vasil Kaftandzhiev In Grafana

In observability, we’ve traditionally been taught to think in terms of pillars, namely logs, metrics, and traces (and more recently, profiles). But pillars are rigid and disconnected. They don’t reflect how modern systems actually work or how we troubleshoot in real time. So let’s change that.

Read Post

Grafana

Read more about From pillars to rings: How interconnected observability in Grafana Cloud optimizes performance and reduces telemetry waste

Top 9 APM Tools for Node.js Performance Monitoring

Oct 16, 2025 By Anjali Udasi In Last9

When a Node.js app slows down, you don’t get a clear picture right away. One service stalls, another spikes in CPU, and somewhere in between, requests start piling up. You can’t fix what you can’t see. Application Performance Monitoring (APM) tools close that gap. They capture request traces, latency, and errors across your stack — showing you what’s running slow and why.

Read Post

Last9

Read more about Top 9 APM Tools for Node.js Performance Monitoring

Obkio's Visual Traceroute Tool: Feature Release

Oct 15, 2025 By Alyssa Lamberti In Obkio

Today, Obkio’s Network Performance Monitoring solution is announcing the release of our all-new Visual Traceroute Tool integrated into Obkio’s application. This feature is a re-invention of Obkio’s standalone Visual Traceroute Tool (Obkio Vision), and has been transformed to help users better understand network path performance and the source of network issues.

Read Post

Obkio

Read more about Obkio's Visual Traceroute Tool: Feature Release

Implement Distributed Tracing with Spring Boot 3

Oct 15, 2025 By Anjali Udasi In Last9

A slow checkout request. A background job stuck waiting on another service. A log message that looks fine until performance drops. In a Spring Boot microservices setup, these are the moments that test your observability. You know something's wrong, but tracing the request across dozens of services feels impossible. Distributed tracing changes that. It connects every span in the request's journey, showing exactly where time is spent and where things start to break down.

Read Post

Last9

Read more about Implement Distributed Tracing with Spring Boot 3

Reality Bytes: Jon Leighton Returns! How Community Continues to Shape DEX

Oct 15, 2025 By Nexthink In Nexthink

Head of Nexthink's Digital Community and User Groups Jon Leighton rejoins Reality Bytes with Tom, Sean, and Dina to explore how community remains the beating heart of Digital Employee Experience (DEX). Fresh from Experience London and heading into Experience Boston, Jon shares how Nexthink’s Ambassador Program, user groups, and learning initiatives empower practitioners to grow, collaborate, and lead change. From storytelling and communication to real-world impact and career development, this episode celebrates the people and connections driving DEX forward.

View Video

Nexthink

Read more about Reality Bytes: Jon Leighton Returns! How Community Continues to Shape DEX

The 2025 Guide to Open Source Status Page Software

Oct 15, 2025 By Hrishikesh Barua In IncidentHub

This is an updated version of the 2024 article. Maintaining transparent communication about service availability is crucial for businesses of all sizes. Status pages are an important part of your communication strategy during times of outages and maintenance events. You can choose to go with a fully managed status page provider or host an open-source one yourself.

Read Post

IncidentHub

Read more about The 2025 Guide to Open Source Status Page Software

Optimize OpenAI Costs with Cloud Cost Management

Oct 15, 2025 By Datadog In Datadog

Datadog now surfaces real-dollar OpenAI usage—down to individual prompts and token consumption—directly in Cloud Cost Management. Empower your team to catch cost drift early, assign spend to the right owner, and optimize with confidence.

View Video

Datadog

Read more about Optimize OpenAI Costs with Cloud Cost Management

Best APM Tool for Modern Teams | Site24x7's Application Performance Monitoring

Oct 15, 2025 By ManageEngine Site24x7 In Site24x7

Your apps are the heartbeat of your business. You risk user satisfaction when app performance drops. ManageEngine Site24x7's Application Performance Monitoring (APM) is here to give you the visibility you need into your application environment. The features range widely: code-level insights, distributed tracing, centralized log management, and much more.

View Video

Site24x7

Monitoring

Read more about Best APM Tool for Modern Teams | Site24x7's Application Performance Monitoring

Last9 Named a Gartner Cool Vendor in AI for SRE and Observability

Oct 15, 2025 By Nishant Modak In Last9

Gartner recognizes Last9 in their latest Cool Vendor report for unified telemetry and agentic SDK—moving teams from reactive monitoring to proactive ops. Founder at Last9. Loves building dev tools and listening to The Beatles.

Read Post

Last9

Read more about Last9 Named a Gartner Cool Vendor in AI for SRE and Observability

CriblCon 25 Keynote Livestream

Oct 15, 2025 By Cribl In Cribl

IT and security data professionals stand at a crossroads. The practices and technologies that have served you for the last ten years are at their breaking point, facing an onslaught of data growth and complexity that will only accelerate as AI goes mainstream. You have a choice. Stay earthbound or take your telemetry to the stratosphere and beyond.

View Video

Cribl

Read more about CriblCon 25 Keynote Livestream

Monitor logs from Amazon EKS on Fargate with Datadog

Oct 15, 2025 By Justin Lesko In Datadog

Amazon EKS on Fargate is a managed service that reduces the operational overhead of maintaining a Kubernetes cluster by abstracting away the underlying infrastructure. In a serverless Fargate environment, each pod is assigned its own isolated compute resources; there is no direct host-level access.

Read Post

Datadog

Read more about Monitor logs from Amazon EKS on Fargate with Datadog

CIDR blocks vs. IP ranges: Aligning network discovery with business value

Oct 15, 2025 By Rama Venkatesan In Site24x7

At every turn, IT leaders are required to prove the value of every technology investment. Technology business management (TBM) practices encourage connecting tech spend directly to business outcomes, demanding accurate data about what’s in your network and how it supports the organization.

Read Post

Site24x7

Read more about CIDR blocks vs. IP ranges: Aligning network discovery with business value

Track Claude Costs in Datadog Cloud Cost Management

Oct 15, 2025 By Datadog In Datadog

Managing the cost of foundation models like Claude Opus can be complex and unpredictable. With Datadog Cloud Cost Management, you can now ingest Claude usage and cost data directly through the Anthropic Usage and Cost Admin API and visualize it alongside your cloud and SaaS spend.

View Video

Datadog

Read more about Track Claude Costs in Datadog Cloud Cost Management

How to Generate a WUG MD Report | WhatsUp Gold

Oct 15, 2025 By Progress WhatsUp Gold In WhatsUp Gold

Watch this video to learn how to generate a WUG MD file for WhatsUp Gold support and troubleshooting purposes.

View Video

WhatsUp Gold

Read more about How to Generate a WUG MD Report | WhatsUp Gold

AI Agent for IT Event Management: From Noise to Actionable Signals

Oct 15, 2025 By Somdipto Ghosh In Digitate

The enterprise IT landscape today is complex, hybrid, and dynamic, and that complexity is growing rapidly due to increased containerization, microservice-based apps, and the overall scale of digital operations.

Read Post

Digitate

Read more about AI Agent for IT Event Management: From Noise to Actionable Signals

Baking in site reliability with observability and AI: How SpotOn uses Grafana Assistant to keep restaurants running

Oct 15, 2025 By Trevor Jones In Grafana

When you operate a restaurant, the last thing you want to do is shut your doors and turn away guests and staff because of some technology failure. And if you’re the one providing that tech, it’s your job to make sure that doesn’t happen. “For us, observability is about a lot more than just dashboards and alerts.

Read Post

Grafana

Read more about Baking in site reliability with observability and AI: How SpotOn uses Grafana Assistant to keep restaurants running

Kentik in Motion: Unlocking the Power of Data Explorer

Oct 15, 2025 By Kentik In Kentik

Kentik Data Explorer is the heart of Kentik, where raw network telemetry is transformed into actionable insights. Yet many users don’t realize just how much they can do with it, or how Data Explorer connects to other parts of the Kentik platform. In this session, we walk through the fundamentals of using Data Explorer effectively, provide real-world examples, and highlight how it ties into workflows such as alerting, dashboards, and troubleshooting.

View Video

Kentik

Read more about Kentik in Motion: Unlocking the Power of Data Explorer

Getting Started with Kubernetes Monitoring

Oct 15, 2025 By Clement Cavanier In Bleemeo

Kubernetes has become the de facto standard for container orchestration, but monitoring a Kubernetes cluster can be challenging. In this guide, we’ll walk through the essential steps to set up effective monitoring for your Kubernetes infrastructure.

Read Post

Bleemeo

Read more about Getting Started with Kubernetes Monitoring

Teams issues are inevitable - but your users don't need to know that

Oct 14, 2025 By Sara Purdon In Martello Technologies

Our previous blog gave a quick overview of an all-too-real scenario involving poor Microsoft Teams performance and frustrated VIP users. The situation, picking up on our recent Power Moves webinar, centered on a big board meeting held over Teams that suffered from multiple call quality issues, spurring the CEO to pay a stormy visit to IT. In that case, the issue had already happened, and our point was that with native Microsoft tools, it can be hard to get to a precise root cause quickly.

Read Post

Martello Technologies

Read more about Teams issues are inevitable - but your users don't need to know that

APM vs Observability: Both-and, not either-or

Oct 14, 2025 By Leon Adato In Catchpoint

I'll start this, the third and final entry in my series on APM and Observability, which was originally inspired by my contribution to an APMdigest article, by once again pointing out that APM tools can be built with observability in mind. Many are, in fact. And the ones that aren’t don’t turn into a different type of tool. In my experience, it's more that there's a difference of mindset.

Read Post

Catchpoint

Read more about APM vs Observability: Both-and, not either-or

Rolling Out AI Application with Confidence: How Nexthink's AI Drive + Adopt Makes AI Compliant, Insightful, and Effective

Oct 14, 2025 By Shawn Lazarus In Nexthink

From Microsoft Copilot to ChatGPT, AI applications are quickly becoming everyday workplace tools. But for many organizations, turning on these capabilities isn’t as simple as flipping a switch. Enterprise licenses for AI tools can cost millions, yet few companies can confidently say employees are using them effectively, or safely. The reality is that most AI rollouts start strong but stall fast.

Read Post

Nexthink

Read more about Rolling Out AI Application with Confidence: How Nexthink's AI Drive + Adopt Makes AI Compliant, Insightful, and Effective

Distributed Historian Architecture with InfluxDB 3

Oct 14, 2025 By Allyson Boate In InfluxData

From pipelines to warehouses, modern operations generate more distributed data than ever, with equipment and connected devices spread across factories, grids, and remote sites. A single, centralized historian can no longer handle this volume or distribution. Without change, organizations risk fragmented visibility, higher costs, and slower responses.

Read Post

InfluxData

Read more about Distributed Historian Architecture with InfluxDB 3

Choosing the Right APM for Go: 11 Tools Worth Your Time

Oct 14, 2025 By Faiz Shaikh In Last9

If you’re building high-performance systems, Golang has probably earned a spot in your stack. Its speed, lightweight concurrency, and quick compile times make it ideal for scalable APIs, microservices, and distributed systems. But those same qualities that make Go powerful can make performance monitoring tricky. Goroutines run fast and in parallel, which means a simple CPU or memory graph doesn’t always tell you what’s slowing things down.

Read Post

Last9

Read more about Choosing the Right APM for Go: 11 Tools Worth Your Time

Launching an agentic SRE for root cause analysis

Oct 14, 2025 By Mezmo In Mezmo

Today, we’re excited to announce the launch of Mezmo’s AI-powered Site Reliability Engineering (SRE) agent for root cause analysis (RCA), a transformative leap forward for engineering and operations teams facing the relentless complexity of modern cloud-native systems.

Read Post

Mezmo

Read more about Launching an agentic SRE for root cause analysis

AI-First: Agentic AI needs a new architecture

Oct 14, 2025 By Clint Sharp In Cribl

At Cribl, we’ve talked a lot about epochs. A moment in time when there was a before and after. AI, and specifically agentic AI, is an epoch. The way we work is going to forever change. There have been many such events in our lifetimes: the PC, the Internet, and the smartphone. AI will change how we work forever. Prior to the PC, there were people whose jobs were literally titled “computer”.

Read Post

Cribl

Read more about AI-First: Agentic AI needs a new architecture

Introducing Cribl Notebooks: One Tab For Your Entire Investigation

Oct 14, 2025 By Nicholas Filippi and In Cribl

Investigations move fast. Data is messy. And today’s analysts are expected to connect the dots across massive datasets and various tools—while documenting every step and sharing results with stakeholders. What does that look like? A security investigation may involve 10 or more queries—each one filtering, transforming, and analyzing data from a different angle—duplicated across multiple browser tabs so nothing gets lost.

Read Post

Cribl

Read more about Introducing Cribl Notebooks: One Tab For Your Entire Investigation

Introducing Cribl Insights: A central hub for monitoring and alerts

Oct 14, 2025 By Felicia Dorng and In Cribl

What happens when your data pipelines slow down, drop volume, or quietly change shape? Most monitoring tools won’t catch those shifts until it’s too late—when downstream systems are already impacted, dashboards are broken, or critical information is missing. That’s why we’re excited to introduce Cribl Insights, to give you real-time visibility into every part of your Cribl environment: data flows, operations, processing, user activity, configuration changes, and more.

Read Post

Cribl

Read more about Introducing Cribl Insights: A central hub for monitoring and alerts

Managing observability costs at scale: A look at the latest cost management features in Grafana Cloud

Oct 14, 2025 By Kristin Knapp In Grafana

The benefits of observability are clear: deep visibility into system health, faster troubleshooting, and improved reliability (to name a few). But what’s equally clear is that, as organizations scale and evolve their observability strategies, they need a way to tap into these benefits without runaway costs. According to Grafana Labs’ 2025 Observability Survey, 74% of respondents say cost is a top priority for selecting tools.

Read Post

Grafana

Read more about Managing observability costs at scale: A look at the latest cost management features in Grafana Cloud

Optimize Cloud Costs with Datadog Cloud Cost Management

Oct 14, 2025 By Datadog In Datadog

Datadog Cloud Cost Management unifies observability and cost data so engineering and FinOps teams can drive efficiency together. In this demo, see how you can: allocate cloud costs across AWS, Azure, Google Cloud, OCI, and SaaS providers with precision; empower engineers by surfacing costs in their daily workflows; automate recommendations to accelerate optimization; and monitor your daily Datadog costs - at no additional charge.

View Video

Datadog

Read more about Optimize Cloud Costs with Datadog Cloud Cost Management

Break production less with AI code review

Oct 14, 2025 By Sentry In Sentry

Prod is down, the errors feed is on fire, and your code is to blame. You’ve got the info you need to debug, but it would’ve been nice to have before you shipped this mess. In this workshop, we’ll do a complete walkthrough of Sentry’s new AI code review features. This workshop will cover: how Sentry predicts errors to save you from shipping high-impact bugs; using AI-powered PR review instead of making your teammates search for every typo; and getting AI-generated unit tests that cover your changes and catch potential issues.

View Video

Sentry

Read more about Break production less with AI code review

Introducing Cribl Insights: A Central View for Real-time Monitoring and Alerts

Oct 14, 2025 By Cribl In Cribl

The easiest way to monitor, alert, and understand what’s happening across your Cribl environment. Reduce downtime, stay ahead of issues, and keep data flowing. Healthy pipelines mean happy data :)

View Video

Cribl

Read more about Introducing Cribl Insights: A Central View for Real-time Monitoring and Alerts

Introducing Cribl Notebooks: Investigate, Visualize, and Share - All in One Tab

Oct 14, 2025 By Cribl In Cribl

Run every part of an investigation in one workspace with Cribl Search’s new Notebooks feature. Bring queries, visualizations, and annotations together to make sharing and collaboration easier. Speed up investigations and turn complex workflows into narratives anyone can follow.

View Video

Cribl

Read more about Introducing Cribl Notebooks: Investigate, Visualize, and Share - All in One Tab

What Is SolarWinds, And Should You Use It?

Oct 14, 2025 By Rachel Whitener In CloudZero

Downtime is brutally expensive and damaging. Enterprises can lose about $9,000 every minute systems are down, while smaller businesses lose hundreds of dollars per minute. A single outage can often cost over $100,000, and nearly a third of companies lose customers due to downtime. That’s why many organizations turn to platforms like SolarWinds to maintain reliable systems and minimize the risk of costly disruptions.

Read Post

CloudZero

Read more about What Is SolarWinds, And Should You Use It?

Observability in Fraud Detection: How Transaction Monitoring Tools Can Help Spot Money Laundering

Oct 14, 2025 By OpsMatters In OpsMatters

In today's increasingly digital financial landscape, transaction monitoring has become a critical component of global fraud detection strategies. As financial crimes evolve in complexity, institutions must strengthen their ability to detect anomalies and uncover suspicious activity before it causes damage. Observability, a concept long used in IT and data operations, is now emerging as a powerful approach for improving visibility into complex financial transactions.

Read Post

OpsMatters

Read more about Observability in Fraud Detection: How Transaction Monitoring Tools Can Help Spot Money Laundering

Real-Time Outage Alerts in Slack and 4 Ways To Set It Up

Oct 13, 2025 By Colin Bartlett In StatusGator

When a third-party service you depend on goes down, every minute counts. The sooner your team knows about the outage, the faster you can respond and reduce downtime. Since most IT and operations teams live in Slack, it makes sense to receive real-time outage notifications directly in Slack channels where you already collaborate. There are several ways to do this, from integrating an all-in-one status page aggregator like StatusGator, to setting up RSS feeds or building your own Slack app.

Read Post

StatusGator

Read more about Real-Time Outage Alerts in Slack and 4 Ways To Set It Up

From Idea to Deployment: How To Build a Practical AI Roadmap

Oct 13, 2025 By Shaun Quarton In Splunk

AI is being adopted at a faster rate than ever across the business world. According to Stanford, 78% of organizations had implemented AI in some form by 2024. And if that’s not convincing enough, 92% of companies plan to expand their AI investment over the next three years. Practically everyone, including your competitors, is already using AI to gain a competitive edge. If you don’t act soon, there's a real risk of falling behind.

Read Post

Splunk

Read more about From Idea to Deployment: How To Build a Practical AI Roadmap

The Next Chapter of WebPageTest: Your New Experience Starts Soon

Oct 13, 2025 By Piril Kavlak In Catchpoint

For years, WebPageTest has been the gold standard in web performance testing, trusted by developers, SEOs, and performance engineers worldwide to make the web faster and better. Now, we’re taking that mission even further.

Read Post

Catchpoint

Read more about The Next Chapter of WebPageTest: Your New Experience Starts Soon

9 Essential Network Administration Tools

Oct 13, 2025 By Jason Alberino In WhatsUp Gold

Network administration has become more complex than ever. IT professionals are tasked with managing sprawling infrastructures, maintaining uptime, optimizing performance and defending against increasingly sophisticated security threats. With hybrid environments, cloud integrations and remote workforces, the pressure to maintain seamless connectivity and security is relentless.

Read Post

WhatsUp Gold

Read more about 9 Essential Network Administration Tools

The Role of IT Monitoring in Certifications Like SOC 2 and ISO 27001

Oct 13, 2025 By Karthik G In eG Innovations

Organizations are increasingly looking to build quality and security into their systems and services by design, often by the adoption of frameworks, standards and certifications such as SOC 2 Type 2 audits and ISO/IEC 27001.

Read Post

eG Innovations

Read more about The Role of IT Monitoring in Certifications Like SOC 2 and ISO 27001

Simplify server issue diagnosis with service monitoring

Oct 13, 2025 By Geoffrin Edwin In Site24x7

It's well known that an alert that just states “the server is down” is not particularly helpful for your already overworked SysAdmins and SRE teams. Diagnosing why the server went down is their challenge. The problem is that memory spikes, CPU overload, failing services, or blocked ports can all look the same from a distance. Too often, these issues are responsible for delayed fixes, alert fatigue, and hours wasted switching between tools for data correlation.

Read Post

Site24x7

Read more about Simplify server issue diagnosis with service monitoring

Strengthen the server back end with server URL checks

Oct 13, 2025 By Geoffrin Edwin In Site24x7

In distributed architectures, the back-end service reliability of microservice endpoints and internal APIs relies on the health of local URLs. These local URLs are not exposed to the public internet and are essential for your IT infrastructure health and automation suites. Site24x7’s server URL check is engineered for operations teams that require immediate visibility into these server-level endpoints. These granular endpoints are often overlooked by traditional external monitoring tools.

Read Post

Site24x7

Read more about Strengthen the server back end with server URL checks

How We Saved 70% of CPU and 60% of Memory in Refinery's Go Code, No Rust Required

Oct 13, 2025 By Ian Wilkes In Honeycomb

We've just released Refinery 3.0, a performance-focused update which significantly improves Refinery's CPU and memory efficiency. Refinery has a big job: it performs dynamic, consistent tail-based sampling that maintains proportions across key fields, adjusts to changes in throughput, and reports accurate sampling rates.

Read Post

Honeycomb

Read more about How We Saved 70% of CPU and 60% of Memory in Refinery's Go Code, No Rust Required

Application Observability Done Right: Best Practices & Tips

Oct 12, 2025 By Asaf Yigal In logz.io

Companies invest millions of dollars in observability platforms, yet they often still struggle to get application monitoring right. This is because most organizations focus on the technology, while neglecting the business. In this article, we’ll show you how to combine business requirements with technological needs. These recommendations are based on my experience, as the CTO of Logz.io, working with global companies on their application observability needs.

Read Post

logz.io

Read more about Application Observability Done Right: Best Practices & Tips

Big Week at Logz.io: Major Product Announcements Signal New Era of AI-First Observability

Oct 12, 2025 By Tomer Levy In logz.io

Four months ago, we announced our vision of AI-first observability. Today, we’re not just talking about the future, we’re shipping it. This week marks a significant milestone with several major product announcements that demonstrate our continued momentum as the industry’s leading AI-first observability platform.

Read Post

logz.io

Read more about Big Week at Logz.io: Major Product Announcements Signal New Era of AI-First Observability

How to Monitor Microsoft Teams Issues & Fix Microsoft Teams "We're sorry - we've run into an issue"

Oct 12, 2025 By Alyssa Lamberti In Obkio

Welcome to the world of Microsoft Teams! When it comes to video conferencing and messaging, Microsoft Teams is one of the most popular players in the game. When we get error messages like Microsoft Teams “We're sorry—we've run into an issue,” or “something went wrong,” it’s important to have a tool to help monitor and troubleshoot Microsoft Teams performance issues and connection issues.

Read Post

Obkio

Read more about How to Monitor Microsoft Teams Issues & Fix Microsoft Teams "We're sorry - we've run into an issue"

From Data to Dashboards: Building Streamlit Applications with InfluxDB 3

Oct 10, 2025 By Suyash Joshi In InfluxData

Python developers often reach for Streamlit when they need to construct compelling web applications quickly. It provides a fast way to transform Python scripts into interactive applications without complex web frameworks. When paired with InfluxDB 3 Core, the leading time series database, engineers can build powerful real-time analytics dashboards entirely in Python.

Read Post

InfluxData

Read more about From Data to Dashboards: Building Streamlit Applications with InfluxDB 3

Keynote: Clarity from chaos: turning data sprawl into operational intelligence

Oct 10, 2025 By SquaredUp In Squared Up

Join us as we explore how to cut through the chaos and transform fragmented data into a single source of truth. Discover how SquaredUp helps you visualize the bigger picture, connect the dots, and unlock operational intelligence that drives smarter, faster decisions.

View Video

Squared Up

Read more about Keynote: Clarity from chaos: turning data sprawl into operational intelligence

Clarity: Explore Out-of-the-Box Data for Smarter Reporting and Insights

Oct 10, 2025 By ValueOps by Broadcom In Broadcom

Good reporting starts with the right data — and with Clarity’s Out-of-the-Box Data, the heavy lifting is already done. This hands-on simulation gives you an inside look at Clarity’s built-in data features within the Reporting Workspace. Learn how to use preconfigured data to accelerate reporting, ensure governance, and drive faster insights. Whether you’re new to Clarity or looking to improve reporting efficiency, this video will show you how to build smarter, more reliable reports — without starting from scratch.

View Video

Broadcom

Read more about Clarity: Explore Out-of-the-Box Data for Smarter Reporting and Insights

15 PHP APM Tools Worth Using in 2025

Oct 10, 2025 By Faiz Shaikh In Last9

PHP powers a large swath of the web, from blogs to storefronts to APIs. But with microservices, third-party dependencies, and scaling complexity, performance can slip in subtle ways. Your app might mostly work, but small delays, occasional spikes, or hidden bottlenecks build up. An APM tool helps you see inside the black box: which functions are slow, which DB queries are hogging time, which external calls are failing or stalling.

Read Post

Last9

Read more about 15 PHP APM Tools Worth Using in 2025

What Is Cloud Monitoring? Everything You Need To Know

Oct 10, 2025 By CloudZero In CloudZero

Cloud computing offers several undeniable benefits to businesses. Some of the biggest ones are agility, cost savings, data recovery, and developing new apps and services to meet changing customer needs. Despite these benefits, the cloud can be complex, demand specialized skills, and require companies to follow up-to-date cloud security best practices. Why?

Read Post

CloudZero

Read more about What Is Cloud Monitoring? Everything You Need To Know

Optimize your end user computing with M365 reports

Oct 10, 2025 By SquaredUp In Squared Up

Unlock the full potential of your end user computing with SquaredUp’s unified M365 dashboards, designed to empower you to make smarter, faster decisions. This session highlights the challenges of fragmented reporting across the M365 suite and shows how SquaredUp’s native M365 plugin leverages the power of MS Graph API to deliver unified dashboards.

View Video

Squared Up

Read more about Optimize your end user computing with M365 reports

Monitoring Kubernetes with Prometheus and SquaredUp

Oct 10, 2025 By SquaredUp In Squared Up

In this session we will show how you can combine the power of Prometheus and SquaredUp to gain deep and comprehensive insights into your Kubernetes environments. See how you can monitor Kubernetes clusters, nodes and pods without having to write a single line of PromQL!

View Video

Squared Up

Read more about Monitoring Kubernetes with Prometheus and SquaredUp

Micro Lesson: Sumo Logic Dojo AI Summary Agent

Oct 10, 2025 By Sumo Logic, Inc. In Sumo Logic

In this video, we'll introduce the new AI powered Summary Agent to help security teams using Cloud SIEM understand and prioritize cybersecurity insights in a faster and more efficient manner. The summary agent provides AI generated summaries of the component signals within an insight, giving analysts a clear view of the underlying evidence without having to spend time reviewing raw logs or multiple events individually. The summary agent is part of Sumo Logic's new Dojo AI platform, featuring a number of useful AI agents across all Sumo Logic products and services.

View Video

Sumo Logic

Read more about Micro Lesson: Sumo Logic Dojo AI Summary Agent

Best Practices for Public Status Pages

Oct 10, 2025 By Super Monitoring In Super Monitoring

When things go wrong, your public status page is the most important way to talk to people. Your users all want to know what’s going on and when they can get back to the site. A well-made public status page builds trust, transparency, and confidence in your brand. In this blog post, you’ll learn what a public status page is and how to make the best ones.

Read Post

Super Monitoring

Read more about Best Practices for Public Status Pages

NiCE VMware vSphere Management Pack 6 1 Walkthrough 2025Q4

Oct 10, 2025 By NiCE IT Management Solutions In NiCE IT Mgmt

The next generation of VMware monitoring with Microsoft SCOM is here! Watch the NiCE VMware vSphere Management Pack 6.1 walkthrough and see how we’ve re-architected VMware monitoring to be smarter, faster, and more secure: native SCOM architecture – no more WMI or external services; high availability & load balancing via SCOM Resource Pools; near real-time discovery of VMware changes for maximum accuracy; event-driven monitoring for faster, more reliable alerts; and compliance & security monitoring built-in.

View Video

NiCE IT Mgmt

Read more about NiCE VMware vSphere Management Pack 6 1 Walkthrough 2025Q4

From SNMP to Modern Telemetry: A network Monitoring Journey

Oct 10, 2025 By VictoriaMetrics In VictoriaMetrics

Simple Network Management Protocol (SNMP) has been a backbone for network management since the 1980s. While it’s still useful for remotely managing devices, it’s showing its age. This Short takes you through the evolution of SNMP and shows how newer, more efficient methods for collecting network data are changing the game. Learn how to move beyond older protocols and better monitor and manage your modern network environment.

View Video

VictoriaMetrics

Read more about From SNMP to Modern Telemetry: A network Monitoring Journey

What Is Email Blacklist Monitoring?

Oct 10, 2025 By Simon Rodgers In WebSitePulse

When legitimate emails start bouncing or disappearing into spam folders, the cause is often a hidden one: your domain or mail server has been blacklisted. Email blacklist monitoring is the process of continuously checking your domain and IP address against major spam-tracking databases. Its purpose is to detect blacklisting early, so you can act before it damages your communication, reputation, or revenue.

Read Post

WebSitePulse

Read more about What Is Email Blacklist Monitoring?

A serverless approach to CI/CD observability with GitLab and Grafana

Oct 10, 2025 By Daniel Fitzgerald In Grafana

In today’s fast-paced development environment, it’s critical that you understand what’s happening in your CI/CD pipeline. And yet, many teams struggle with fragmented tooling that makes it difficult to get a holistic view of their dev lifecycle. For example, if you’re using GitLab for CI/CD and Grafana for observability, you’ve probably faced this challenge: how do you bring your GitLab events into your existing observability and alerting infrastructure?

Read Post

Grafana

Read more about A serverless approach to CI/CD observability with GitLab and Grafana

How OpenTelemetry Auto-Instrumentation Works

Oct 10, 2025 By Anjali Udasi In Last9

Most developers use auto-instrumentation as it’s meant to be used — run the Java agent, add NODE_OPTIONS, and telemetry starts flowing. When it stops, though, figuring out why can be tricky. Maybe the agent didn’t load, maybe there’s a framework version mismatch, or something else entirely. Understanding how auto-instrumentation works makes it easier to spot and fix these issues.

Read Post

Last9

Read more about How OpenTelemetry Auto-Instrumentation Works

Troubleshooting Common Issues with Okta SSO

Oct 10, 2025 By Uptime Website Monitoring In uptime

Learn how to troubleshoot common Single Sign-On (SSO) issues with the Okta Identity Provider. This step-by-step guide will walk you through the process!

View Video

uptime

Monitoring

Read more about Troubleshooting Common Issues with Okta SSO

Gaming Latency Monitoring: How to Detect & Reduce Lag

Oct 10, 2025 By Dotcom-Monitor In Dotcom-Monitor

Latency isn’t just a technical metric in gaming—it’s an emotion. Players don’t measure milliseconds, they feel them. A button press that lands a fraction late, a flick shot that fires just off target, a character that rubber-bands at the worst possible time—all of it translates to frustration. In fast-paced multiplayer environments, a 50ms delay can decide outcomes, erode trust, and send players to competitors who seem “smoother.”

Read Post

Dotcom-Monitor

Read more about Gaming Latency Monitoring: How to Detect & Reduce Lag

NiCE VMware vSphere Management Pack 6.1

Oct 9, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

In times of rapid transformation within the VMware ecosystem, IT teams are reassessing how best to keep virtual environments as stable, secure, and efficient as possible. With numerous monitoring options available on the market, the question arises: Why stick with Microsoft System Center Operations Manager (SCOM)?

Read Post

NiCE IT Mgmt

Read more about NiCE VMware vSphere Management Pack 6.1

You're in Good Microsoft SCOMpany

Oct 9, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

When it comes to enterprise monitoring, consistency and reliability matter more than ever. That’s why organizations across industries, from financial services to healthcare, manufacturing, and government, turn to NiCE IT Management Solutions to extend and optimize their Microsoft SCOM environments. And the results speak volumes.

Read Post

NiCE IT Mgmt

Read more about You're in Good Microsoft SCOMpany

5 simple habits to beat digital fatigue

Oct 9, 2025 By Harsitha P In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’re tackling a common struggle for anyone living the digital life: how to beat digital fatigue and bring back real focus in a screen-heavy day. Ever hit that 3 p.m. wall where your eyes sting from staring at the screen, your shoulders feel like bricks, and even the third cup of coffee isn't helping?

Read Post

ManageEngine

Read more about 5 simple habits to beat digital fatigue

Mobile session replay - now live in Coralogix

Oct 9, 2025 By Ofri Grushka In Coralogix

Coralogix Real User Monitoring (RUM) already gives teams a complete view of how users experience their websites. Now, that same visibility comes to mobile. With Session Replay for iOS and Android, you can watch real sessions unfold and understand exactly what users saw and did, without relying on vague support tickets or incomplete crash logs. Session replay captures exactly how users interact with your mobile app: taps, swipes, scrolls, and screen transitions.

Read Post

Coralogix

Read more about Mobile session replay - now live in Coralogix

Manage and optimize your OCI costs with Datadog Cloud Cost Management

Oct 9, 2025 By Patrick Krieger In Datadog

Engineering teams need to deliver reliable, secure, and high-performing applications, all while keeping costs under control. But engineers often lack visibility into cloud cost data, relying on finance-driven reports that they receive only after the billing cycle closes. Without daily cost insights alongside observability data, they don’t know until it’s too late that an infrastructure change caused a significant cost increase.

Read Post

Datadog

Read more about Manage and optimize your OCI costs with Datadog Cloud Cost Management

Stop decision overload: How discovery filters optimize device onboarding for efficient network monitoring

Oct 9, 2025 By Rama Venkatesan In Site24x7

Every network administrator encounters the same question during discovery scans: Should this device be monitored or ignored? Routers are critical, but what about test servers, lab switches, or that aging and unused printer still on the network? Manually deciding for each device creates decision overload and risks overlooking what really matters.

Read Post

Site24x7

Read more about Stop decision overload: How discovery filters optimize device onboarding for efficient network monitoring

How to Scale Prometheus APM for Modern Applications

Oct 9, 2025 By Anjali Udasi In Last9

When developers monitor application performance, they pick one of two paths: traditional APM tools with distributed tracing and code profilers, or metrics-driven monitoring with Prometheus. The second approach — Prometheus APM — tracks the signals that matter most: request rates, error rates, latency, and resource utilization. No agents to install, no per-host pricing, just exporters and PromQL. For most teams, Prometheus APM is where monitoring starts.

Read Post

Last9

Read more about How to Scale Prometheus APM for Modern Applications

Improving browser tracing step by step

Oct 9, 2025 By Lukas Stracke In Sentry

Browser tracing has always been one of those things that feels invisible until it isn’t. When it works well, you get clear, actionable insights into how your app is performing in the wild. When it doesn’t, you’re left staring at noisy data, gaps in traces, and spans that don’t quite tell the story. Over the last few months, we’ve been chipping away at that problem.

Read Post

Sentry

Read more about Improving browser tracing step by step

What's New at Catchpoint episode 3

Oct 9, 2025 By Catchpoint In Catchpoint

This month, we’re all about those features, baby! From AI-driven automatic root cause analysis, to playing back user RUM sessions like a movie, to discovering unknown unknowns with AI-driven advisors, Catchpoint has what you need to improve your IPM.

View Video

Catchpoint

Monitoring

Read more about What's New at Catchpoint episode 3

B2B Flow Intelligence in Transactions: Closing the Gaps That Cost Enterprises Millions

Oct 9, 2025 By Andrew Mallaband In meshIQ

Every enterprise depends on B2B transactions to keep business moving. From purchase orders to payments, they form the unseen network powering global commerce.

Read Post

meshIQ

Read more about B2B Flow Intelligence in Transactions: Closing the Gaps That Cost Enterprises Millions

Dashboarding OCI costs: A guide to building a usage API with OCI functions and SquaredUp

Oct 9, 2025 By Squared Up In Squared Up

Oracle Cloud Infrastructure (OCI) provides powerful tools for managing your cloud resources, but getting a clear, real-time view of your usage and costs can sometimes feel locked away behind complex reports. What if you could build beautiful, shareable dashboards that show you exactly what you're spending, where you're spending it, and how it trends over time? In this guide, we'll walk you through deploying a simple, secure OCI Function that acts as a proxy to OCI's Usage API.

Read Post

Squared Up

Read more about Dashboarding OCI costs: A guide to building a usage API with OCI functions and SquaredUp

How image generation models are creating new infrastructure demands for DevOps teams

Oct 9, 2025 By OpsMatters In OpsMatters

The rapid adoption of generative AI has moved far beyond research labs and creative studios. Image generation models, in particular, have become critical components in content production pipelines, marketing platforms, design workflows, and enterprise applications. What began as a novel way to create digital art has evolved into a class of workloads that behave very differently from traditional web services.

Read Post

OpsMatters

Read more about How image generation models are creating new infrastructure demands for DevOps teams

SolarWinds launches AI Agent alongside new capabilities to advance autonomous operational resilience

Oct 8, 2025 By SolarWinds In SolarWinds

These new offerings will help IT teams cut through complexity and respond faster, shifting teams from firefighting to innovation.

Read Post

SolarWinds

Read more about SolarWinds launches AI Agent alongside new capabilities to advance autonomous operational resilience

Maximize data value and cut costs: Adaptive Telemetry for metrics, logs, traces, and profiles in Grafana Cloud

Oct 8, 2025 By Steven Dungan In Grafana

When it comes to observability, more data doesn’t always mean more clarity. In fact, as telemetry volumes grow, it only becomes more difficult to discern the signals from the noise and to keep overall costs in check. This is exactly why we built Adaptive Telemetry, a suite of features in Grafana Cloud that analyzes how your telemetry is used and then automatically recommends actions like aggregating, sampling, dropping, or reducing low-value data.

Read Post

Grafana

Read more about Maximize data value and cut costs: Adaptive Telemetry for metrics, logs, traces, and profiles in Grafana Cloud

How To Perform A DNS Check | Grafana Synthetic Monitoring

Oct 8, 2025 By Grafana In Grafana

Learn how to set up HTTP checks using Grafana Cloud Synthetic Monitoring. In this video, we walk through how to create an HTTP check and analyze test results.

View Video

Grafana

Read more about How To Perform A DNS Check | Grafana Synthetic Monitoring

Enhanced Icinga 2 Container Images

Oct 8, 2025 By Yonas Habteab In Icinga

As some of you might have already noticed, we recently gave our official Icinga 2 container image builds a complete overhaul. These new images are currently available only as snapshot builds but will replace the existing stable images with the next Icinga 2 v2.16.0 release. In this blog post, we’ll walk you through the key changes and improvements that come with the new images, as well as the reasons behind these changes.

Read Post

Icinga

Read more about Enhanced Icinga 2 Container Images

Nobody Cares About Your MTTR

Oct 8, 2025 By Yann Guernion In Broadcom

I’ve been in those late-night "war room" calls where, after hours of painstaking work, the team finally resolves a critical outage. The dashboards all turn green, a collective sigh of relief is shared, and the next day’s report highlights a victory: Mean time to resolution (MTTR) was reduced by 15% compared to the last major incident. It feels like a win.

Read Post

Broadcom

Read more about Nobody Cares About Your MTTR

Tag(ging)-You're It: How to Leverage AppNeta Monitoring Data for Maximum Insights

Oct 8, 2025 By Alec Pinkham In Broadcom

Today’s enterprise networks are a far cry from the centralized, predictable infrastructures of the past. Instead, they are sprawling, dynamic ecosystems that stretch across cloud services, SaaS applications, on-premises data centers, distributed branches, and thousands of end users connecting from every imaginable location. This complexity creates a huge challenge for IT and network operations teams: How do you get a clear, real-time view of what’s really happening?

Read Post

Broadcom

Read more about Tag(ging)-You're It: How to Leverage AppNeta Monitoring Data for Maximum Insights

ObservabilityCON 2025 Keynote: Grafana Assistant GA and Full-Stack Observability in Grafana Cloud

Oct 8, 2025 By Grafana In Grafana

Join Grafana Labs CEO Raj Dutt, CTO Tom Wilkie, and engineering leaders to kick off ObservabilityCON 2025 with the latest in AI-powered observability in Grafana Cloud. See how Grafana is making observability smarter, simpler, and more scalable. This ObservabilityCON 2025 keynote unveils: AI-powered observability → Grafana Assistant (GA) and Assistant Investigations (Public Preview). Observability at scale → The Adaptive Telemetry suite is now complete (Traces GA, Adaptive Profiles in Private Preview) plus BYOC for flexible, cost-efficient cloud deployment.

View Video

Grafana

Read more about ObservabilityCON 2025 Keynote: Grafana Assistant GA and Full-Stack Observability in Grafana Cloud

AI-powered observability: Resolve incidents faster, reduce alert fatigue, and expand access

Oct 8, 2025 By Ben Sully In Grafana

When an incident lands in your lap, you’ll often start with a lot of questions: Why is latency so high? What’s causing this outage? How much money are we losing at this very moment? The uncertainty, and the pressure to quickly find answers, has always been one of the more nerve-racking parts of being an on-call engineer, but it doesn’t have to be that way anymore.

Read Post

Grafana

Read more about AI-powered observability: Resolve incidents faster, reduce alert fatigue, and expand access

Top 9 LLM Observability Tools in 2025

Oct 8, 2025 By Logz.io In logz.io

Organizations are adding GenAI to their current and future architectures and product roadmaps, requiring Ops teams to ensure LLMs are accurate, fast, secure and cost-efficient. LLM observability tools directly address these needs, helping identify and prevent common LLM errors and issues. LLM observability provides the telemetry data for this analysis. LLM observability tools trace requests end-to-end, evaluate outputs, and correlate quality with latency, cost, prompts, tools, and data sources.

Read Post

logz.io

Read more about Top 9 LLM Observability Tools in 2025

Vibe Coding: Closing The Feedback Loop With Traceability

Oct 8, 2025 By Kyle Tryon In Sentry

I have begun to truly embrace vibe coding over the last few months, using Cursor as my main code editor and Claude Sonnet 4 for my agent's LLM. It's an exciting time to be a developer: we get to experiment with something that promises to 100x our productivity while pioneering the new workflows and strategies for implementing these tools. But, as most people who have done any extensive development with LLMs in a sufficiently sized code base know, it's a bit like trying to herd scared cats.

Read Post

Sentry

Read more about Vibe Coding: Closing The Feedback Loop With Traceability

ObservabilityCON 2025: A guide to all the announcements from Grafana Labs

Oct 8, 2025 By Grafana Labs Team In Grafana

Today at ObservabilityCON 2025 in London, we unveiled a number of exciting announcements and updates to Grafana Cloud that reimagine SaaS economics, simplify the complexity of running your observability stack at scale, and provide AI tooling that’s actually useful. (Root cause analysis via chatbot? Yes, please!) Check out the keynote to learn more about how we’re helping you do more with the open observability cloud, and read on for a quick recap of all the news from ObservabilityCON 2025.

Read Post

Grafana

Read more about ObservabilityCON 2025: A guide to all the announcements from Grafana Labs

Visualizations with InfluxDB 3: More Options Than Ever Before

Oct 8, 2025 By Gary Fowler In InfluxData

We’ve been working on a number of items here at InfluxData to give you even more options for creating visualizations and dashboards for your time series data in InfluxDB 3.

Read Post

InfluxData

Read more about Visualizations with InfluxDB 3: More Options Than Ever Before

The Best Tools for Synthetic & Infrastructure Monitoring-A Comparative Guide

Oct 8, 2025 By Dotcom-Monitor In Dotcom-Monitor

Both user-side and server-side monitoring are important to make your apps better. Tools that offer monitoring of just one side leave gaps in your diagnosis, causing negative experiences and reliability issues. Here are the top 10 tools you should consider based on their benefits and coverage.

Read Post

Dotcom-Monitor

Read more about The Best Tools for Synthetic & Infrastructure Monitoring-A Comparative Guide

Agentic AI Explained: How Autonomous Systems Are Changing Cybersecurity

Oct 8, 2025 By Elastic In Elastic

Discover how agentic AI enhances cybersecurity by augmenting security teams’ existing security tools and workflows. See how Retrieval-Augmented Generation (RAG) enables faster threat detection, streamlined investigations, and smarter incident response — empowering SOC teams to work more effectively. Join cybersecurity experts Lisa Jones-Huff and Mohammed Anas Khatri to discover how agentic AI can help your security team multiply its impact.

View Video

Elastic

Read more about Agentic AI Explained: How Autonomous Systems Are Changing Cybersecurity

Complete guide to OpenTelemetry Tracing (with code examples)

Oct 8, 2025 By Ankit Anand In SigNoz

Distributed tracing is an essential technique for monitoring modern, cloud-native applications. It provides a holistic view of a request's entire journey as it propagates through a multi-service architecture, making it invaluable for performance optimization and root cause analysis. But how do you generate and collect this trace data in a standardized, vendor-agnostic way? That's where OpenTelemetry comes in.

Read Post

SigNoz

Read more about Complete guide to OpenTelemetry Tracing (with code examples)

Optimizing Your Cart with Signals: Smarter State, Better Debugging

Oct 8, 2025 By Sonu Kapoor In AppSignal

In the first two parts of this series, we introduced Angular Signals and built a reactive shopping cart. Our CartService already supports core operations like adding, removing, and clearing items, as well as computing total price and item count using computed(). All of this was done without touching RxJS, subscriptions, or change detection hacks. But a real-world cart does more than tally up numbers.

Read Post

AppSignal

Read more about Optimizing Your Cart with Signals: Smarter State, Better Debugging

OpenTelemetry + ignio: The Foundation for Intelligent, Unified Observability

Oct 8, 2025 By Amit Shastri In Digitate

In the previous post, What is OpenTelemetry?, we went over the What, Why, and the How of OpenTelemetry. We also went over the telemetry data lifecycle (data generation → collection → storage → usage) and how telemetry data (MELT) could be put to use to troubleshoot a representative web application scenario.

Read Post

Digitate

Read more about OpenTelemetry + ignio: The Foundation for Intelligent, Unified Observability

Closing Visibility Gaps in the Modern Data Center

Oct 8, 2025 By Phil Gervasi In Kentik

In today’s high-performance data centers, “all green” dashboards can mask catastrophic issues hiding just beneath the surface. If you’re missing the microbursts, hidden oversubscription, and routing imbalances that are devastating application performance, you’re flying blind. Learn how to close these visibility gaps and shift from reactive firefighting to proactive network intelligence.

Read Post

Kentik

Read more about Closing Visibility Gaps in the Modern Data Center

Python performance monitoring for Django, Flask, Celery, and more

Oct 8, 2025 By Joshua Wood In Honeybadger

Here's some excellent news for the Pythonistas in the room: You can now monitor the performance of your Python applications with Honeybadger. Last year, we launched Honeybadger Insights, a new logging and observability tool bundled with Honeybadger. Insights enables you to query your application logs and events to answer performance questions, perform root-cause analyses, and create charts and dashboards to see what's happening in real time.

Read Post

Honeybadger

Read more about Python performance monitoring for Django, Flask, Celery, and more

Telemetry Now Teaser: "Tracking the Red Sea Cable Cuts with Kentik's Cloud Latency Map"

Oct 8, 2025 By Kentik In Kentik

Go behind the scenes of a major internet analysis. When the recent Red Sea cable cuts disrupted global connectivity, Kentik's Director of Internet Analysis, Doug Madory, turned to the Cloud Latency Map to track the fallout in real-time. In this clip from the latest Telemetry Now podcast, Doug walks through how he identified the latency spikes and rerouting caused by the damage.

View Video

Kentik

Read more about Telemetry Now Teaser: "Tracking the Red Sea Cable Cuts with Kentik's Cloud Latency Map"

The ScienceLogic AI Platform

Oct 8, 2025 By ScienceLogic In ScienceLogic

The ScienceLogic AI Platform brings together observability, automation, compliance, and intelligence in one connected experience. Powered by our Skylar offerings, this unified platform helps IT teams see across their environments, automate with confidence, and make smarter decisions faster.

View Video

ScienceLogic

Read more about The ScienceLogic AI Platform

3 real-world generative AI strategies for executives

Oct 8, 2025 By Jay Shah In Elastic

Everyone is excited about AI, but few companies have successfully implemented it. While enthusiasm for generative AI (GenAI) has helped accelerate AI adoption across enterprises, the promises of artificial intelligence have yet to translate into measurable impact on most organizations’ bottom lines. The trouble isn’t the tech — it’s a lack of executive ownership.

Read Post

Elastic

Read more about 3 real-world generative AI strategies for executives

Real Estate App Development for Ops & Product Teams: From MVP to Scale

Oct 8, 2025 By OpsMatters In OpsMatters

In the competitive world of real estate technology, developing an app that can scale from a Minimum Viable Product (MVP) to a fully-fledged solution is crucial. For operations and product teams, this journey involves strategic planning and execution to ensure the app meets evolving market demands and user expectations.

Read Post

OpsMatters

Read more about Real Estate App Development for Ops & Product Teams: From MVP to Scale

Live in London: Adoption & AI Confessions @ Nexthink Experience

Oct 7, 2025 By Nexthink In Nexthink

Tim and Tom are back with another special DEX Show Live!—recorded last week at Nexthink Experience London at the Intercontinental by the iconic O2. Day 1 of the world's biggest DEX event saw over a thousand IT pros gather for two days of innovation, insight, and energy. In this lively episode, the hosts are joined by Guillaume Charles, Senior Director of Product Management (Diagnostics) at Nexthink, and Gabriela Moraes from Electrolux to explore the state of digital and AI adoption in the enterprise.

View Video

Nexthink

Read more about Live in London: Adoption & AI Confessions @ Nexthink Experience

Downtime on the Docket: The Death Sentence for Productivity in Legal Firms

Oct 7, 2025 By Teneo In Teneo

When minutes matter, IT leaders need more than quick fixes; they need foresight. That’s where Teneo’s Managed DEX (Digital Experience Monitoring) comes in. Managed DEX is designed to detect what legal teams can’t afford to miss. It monitors for “ghost traffic” (those eerie, unexplained signals of abnormal network activity that often signal compromise or instability) and other anomalous device behaviors that can precede full-blown outages or cyber incidents.

Read Post

Teneo

Read more about Downtime on the Docket: The Death Sentence for Productivity in Legal Firms

Beyond Crashes: Improve React Native Performance using Tracing and Logs

Oct 7, 2025 By Sentry In Sentry

In this hands-on workshop, we’ll show you how to connect the dots between slowdowns, crashes, and the user experience in your React Native app.

View Video

Sentry

Read more about Beyond Crashes: Improve React Native Performance using Tracing and Logs

Autodiscovering new devices added to your network

Oct 7, 2025 By Rama Venkatesan In Site24x7

For large and rapidly evolving environments, minor oversights can snowball into bigger performance and security issues—making logging devices manually unrealistic.

Read Post

Site24x7

Read more about Autodiscovering new devices added to your network

Announcing Honeycomb for Frontend Observability React Native Beta

Oct 7, 2025 By Elsie Phillips In Honeycomb

React Native apps straddle two worlds: JavaScript powering your UI and native modules running underneath. Add in backend services, and when something goes wrong, there are many possible culprits. Was it JS logic, the native bridge, the native API call, or a downstream API call? Most tools give you parts of the picture. A crash tool can tell you where the app failed but not what else happened in a session.

Read Post

Honeycomb

Read more about Announcing Honeycomb for Frontend Observability React Native Beta

SRE Report Retrospectives - Have AIOps Predictions Held Up?

Oct 7, 2025 By Leo Vasiliou In Catchpoint

Welcome to a new blog series where we take a candid look at the predictions, insights, and bold claims we've made in previous SRE Reports and ask the uncomfortable question: How did we do? For the uninitiated, Catchpoint's SRE Report is our annual, practitioner-driven effort to capture the pulse of the global reliability community.

Read Post

Catchpoint

Read more about SRE Report Retrospectives - Have AIOps Predictions Held Up?

Redis Performance Monitoring: Combine Logs and Metrics for Complete Visibility

Oct 7, 2025 By Benjamin Pitts In MetricFire

Redis earns its place in modern stacks because it’s an in-memory data store with microsecond latency and rich data structures, making it perfect for things like caching, sessions, and rate limiting. Since it often sits on the request path, small issues (connection churn, blocked commands, memory pressure) can quickly ripple into user-visible incidents.

Read Post

MetricFire

Read more about Redis Performance Monitoring: Combine Logs and Metrics for Complete Visibility

Using Sigma rules in EventSentry

Oct 7, 2025 By NETIKUS.NET LTD In EventSentry

Shows how to create EventSentry event log filters based on Sigma rules, along with a short overview of Sigma rules in general.

View Video

EventSentry

Read more about Using Sigma rules in EventSentry

Monitoring Encrypted Network Traffic

Oct 7, 2025 By VictoriaMetrics In VictoriaMetrics

How do you spy on a secret message? That's the challenge for network monitoring tools like Suricata today. Encryption is essential for privacy, but it creates massive blind spots for security. Dive into the modern-day cat-and-mouse game of monitoring encrypted traffic. How do you deal with security blind spots in your network?

View Video

VictoriaMetrics

Read more about Monitoring Encrypted Network Traffic

Ep 13: Everyone is winging it: Hope for an AI future

Oct 7, 2025 By Sumo Logic, Inc. In Sumo Logic

In this episode, we welcome Naomi Buckwalter, Sr. Director of Product Security at Contrast Security, to chat about the evolving landscape of security threats and the dual role of AI in both facilitating and combating these challenges. We explore the increasing sophistication of modern phishing attacks and discuss how security teams must rapidly adapt to stay ahead of emerging threats. We debate the transformative impact of AI on the future job market, where personal qualities and soft skills may increasingly take precedence over traditional technical competencies.

View Video

Sumo Logic

Read more about Ep 13: Everyone is winging it: Hope for an AI future

Happiest Minds boosts IT efficiency and service delivery with Site24x7

Oct 7, 2025 By ManageEngine Site24x7 In Site24x7

As a born-digital, born-agile IT services company, Happiest Minds delivers 24/7 strategic, transformation, and managed services across product digital engineering services, infrastructure management and security services, and generative AI business services. As its customer base and complexity grew, the company needed unified observability, multi-tenant monitoring, and real-time root cause analysis—without the burden of manual effort or siloed tools.

View Video

Site24x7

Monitoring

Read more about Happiest Minds boosts IT efficiency and service delivery with Site24x7

How we use Datadog to get comprehensive, fine-grained visibility into our email delivery system

Oct 7, 2025 By Alexa Liaskovski In Datadog

Visibility into email performance is indispensable to any organization that counts on its ability to reach people through their inboxes, including Datadog. SREs, FinOps, and many other teams rely on email as a critical channel for communications from our platform, including monitor alerts, usage reports, and service account notifications. At Datadog, we depend on the visibility provided by our integrations for Mailgun, SendGrid, and Amazon SES to optimize our email performance and ensure deliverability.

Read Post

Datadog

Read more about How we use Datadog to get comprehensive, fine-grained visibility into our email delivery system

Instantly respond to changes in your data with Datadog automation rules

Oct 7, 2025 By Barak Shoushan In Datadog

Datadog Workflow Automation can automate processes and reduce the amount of time spent on time-consuming, repetitive tasks. You can trigger these workflows in real time by tying them to alerts, dashboards, Slack messages, and other signals.

Read Post

Datadog

Read more about Instantly respond to changes in your data with Datadog automation rules

What's New in VictoriaMetrics Cloud Q3 2025? From new region in Asia to proactive alerts

Oct 7, 2025 By Jose Gomez-Selles In VictoriaMetrics

The third quarter of 2025 has been a busy one for VictoriaMetrics Cloud! We expanded globally, polished the user experience, introduced new enterprise debugging tools, and delivered smarter alerts to help users make the most of their observability data. If you missed our Quarterly Live Update, don’t worry! A full recording is available to watch. Let’s recap what’s new in VictoriaMetrics Cloud this quarter.

Read Post

VictoriaMetrics

Read more about What's New in VictoriaMetrics Cloud Q3 2025? From new region in Asia to proactive alerts

Sponsored Post

Stop Playing Defense: How SecOps Automation Neutralizes Cisco Zero-Days vulnerability at Machine Speed

Oct 6, 2025 By Shailesh Manjrekar In Fabrix

As up to 2 million Cisco devices face active exploitation, automated SecOps response becomes essential.

Read Post

Fabrix

Read more about Stop Playing Defense: How SecOps Automation Neutralizes Cisco Zero-Days vulnerability at Machine Speed

Get Third-Party Outage Alerts in Discord with StatusGator

Oct 6, 2025 By Colin Bartlett In StatusGator

When SaaS tools go down, teams need fast, reliable alerts right where they communicate. Now, with the StatusGator integration for Discord, you can receive real-time third-party outage alerts directly in your server. Whether you’re monitoring the status of AWS, Slack, GitHub, or Google Workspace, StatusGator keeps your team informed instantly when disruptions happen.

Read Post

StatusGator

Read more about Get Third-Party Outage Alerts in Discord with StatusGator

Zoom Troubleshooting Performance and Connection Issues: The Complete Guide

Oct 6, 2025 By Alyssa Lamberti In Obkio

In an era of remote work and virtual meetings, Zoom has emerged as a lifeline, connecting people across distances and facilitating seamless collaboration. However, like any technological tool, it's not without its fair share of challenges. From occasional performance hiccups to frustrating connection issues, navigating the world of Zoom can sometimes be a daunting task.

Read Post

Obkio

Read more about Zoom Troubleshooting Performance and Connection Issues: The Complete Guide

Observability-as-Code: Bring synthetic monitoring into your pipeline

Oct 6, 2025 By Uptrends In Uptrends

Your team just deployed to production. The infrastructure spun up in 90 seconds, but recreating your monitoring? That’ll take hours. It’s added late in the process, managed through dashboards, and prone to inconsistency. Short-term, this slows delivery and creates visibility gaps that surface only during incidents. Long-term, it leaves a business-critical capability out of your observability pipeline.

Read Post

Uptrends

Read more about Observability-as-Code: Bring synthetic monitoring into your pipeline

Datadog vs Splunk: A Side-by-Side Comparison [2025]

Oct 6, 2025 By Pavithra Parthiban In Atatus

Datadog and Splunk are both leading tools for monitoring and observability. Each offers a range of features designed to help you understand and manage your data. Datadog provides tools for tracking application performance and analyzing logs in real-time. Splunk, meanwhile, is known for its powerful log analysis and search capabilities. In this post, we will compare Datadog and Splunk on important aspects like APM, log management, search capabilities, and more.

Read Post

Atatus

Read more about Datadog vs Splunk: A Side-by-Side Comparison [2025]

SQL performance improvements: analysing & fixing the slow queries (part 2)

Oct 6, 2025 By Mattias Geniar In Oh Dear

This is part 2 of a 3-part series on SQL performance improvements. A few weeks ago, we massively improved the performance of the dashboard & website by optimizing some of our SQL queries. In this post, we'll dive deeper into the optimisations of queries with indexes.

Read Post

Oh Dear

Read more about SQL performance improvements: analysing & fixing the slow queries (part 2)

Scaling Datadog observability: 1,000 integrations and counting

Oct 6, 2025 By Alex Guo In Datadog

Integrations have always been central to the Datadog platform, enabling customers to collect the data they need directly from the technologies they use every day. By unifying signals from infrastructure and applications to security and SaaS applications, teams gain both high-level visibility and the ability to drill into the details that matter the most. With more than 1,000 integrations now available, the Datadog ecosystem continues to expand alongside the platforms our customers rely on.

Read Post

Datadog

Read more about Scaling Datadog observability: 1,000 integrations and counting

Pastries with SREs: Leveling up observability and donut dunkability

Oct 6, 2025 By Elastic In Elastic

In this episode of Pastries with SREs, we explore what it really means to shift left with observability, moving from reactive firefighting to proactive performance. And yes, it starts with donuts. We unpack how SREs and IT Ops teams are often stuck reacting to incidents, battling alert fatigue and swivel-chair triaging. But what if you could pull in developers earlier, and give everyone a unified view of observability data?

View Video

Elastic

Read more about Pastries with SREs: Leveling up observability and donut dunkability

How to Perform Ping Tests: Different Tools and Techniques

Oct 6, 2025 By Andrii Kernitskyi In Obkio

If you’re a remote worker struggling with video calls, or a gamer noticing lag, a quick Internet ping test using an online ping tester can give you a simple yes/no answer: Is my connection alive, and how fast does it respond? But if you’re a network admin or IT professional, that’s just scratching the surface. Business networks are more complex beasts.

Read Post

Obkio

Read more about How to Perform Ping Tests: Different Tools and Techniques

How to automate sending SquaredUp dashboards to Slack with the Notification API

Oct 6, 2025 By Squared Up In Squared Up

SquaredUp's existing notifications fire when monitors change state. With Notification API, you control the trigger. Send dashboards on a schedule, before meetings, or on-demand through chat commands. In this step-by-step guide, you’ll learn how to automate sending SquaredUp dashboards to Slack. I’ll use Power Automate as the example, but the same approach works with other automation tools such as Zapier, Make, n8n, or even a custom script, as long as it can send an HTTP request.

Read Post

Squared Up

Read more about How to automate sending SquaredUp dashboards to Slack with the Notification API

LLM Observability Explained: Prevent Hallucinations, Manage Drift, Control Costs

Oct 6, 2025 By Chrissy Kidd In Splunk

Large Language Models (LLMs) are transforming how businesses interact with users, automate workflows, and deliver insights in real time. But as powerful as these models are, running them at scale comes with unique challenges, from hallucinations and latency spikes to cost overruns and user trust issues.

Read Post

Splunk

Read more about LLM Observability Explained: Prevent Hallucinations, Manage Drift, Control Costs

The observability maturity curve: How IT leaders are shifting from tools to outcomes

Oct 6, 2025 By Dave Russell In Grafana

Observability has come a long way from its origins in monitoring logs and metrics. Today, it sits on a maturity curve: Organizations move from fragmented tool stacks to unified platforms to proactive engineering practices that tie reliability to business outcomes. To better understand where IT leaders are on this curve, Grafana Labs surveyed 150 decision-makers across industries in advance of ObservabilityCON 2025.

Read Post

Grafana

Read more about The observability maturity curve: How IT leaders are shifting from tools to outcomes

Why DEX Scores Must Be Part of Every Total Cost of Ownership Study

Oct 6, 2025 By Jason Pascual In Nexthink

Price is not the same as cost. When organizations evaluate new end-user technology investments, whether that’s laptops, operating systems, or management tools, the conversation inevitably turns to Total Cost of Ownership (TCO). TCO studies traditionally focus on direct, measurable costs: hardware procurement, software licensing, support contracts, and lifecycle services. But there’s a growing blind spot in these calculations: the employee experience.

Read Post

Nexthink

Read more about Why DEX Scores Must Be Part of Every Total Cost of Ownership Study

Keep stakeholders informed with Datadog Status Pages

Oct 6, 2025 By Curtis Maher In Datadog

When incidents occur, clear communication can be just as important as fast remediation. Your internal teams need timely updates to stay aligned, and your users want to know what is happening and when they can expect a fix. Without a reliable way to proactively share updates, support teams can get flooded with tickets and customer trust can erode. Datadog Status Pages, now generally available, makes it easy to keep everyone informed through a public or internal web page during outages.

Read Post

Datadog

Read more about Keep stakeholders informed with Datadog Status Pages

How DreamHost Slashed Memory Usage by 80% and Scaled to 76 Million Time Series

Oct 5, 2025 By Marc Sherwood In VictoriaMetrics

For any growing business, there comes a point where the tools that once worked perfectly begin to show their limits. This is especially true for monitoring infrastructure. As your user base, services, and data volumes expand, the pressure on your monitoring stack intensifies. For web hosting leader DreamHost, with over 1.5 million websites to manage, their existing open-source solutions simply couldn’t keep up.

Read Post

VictoriaMetrics

Read more about How DreamHost Slashed Memory Usage by 80% and Scaled to 76 Million Time Series

How Technology Improves Commercial HVAC Efficiency

Oct 5, 2025 By OpsMatters In OpsMatters

Efficient heating, ventilation, and air conditioning (HVAC) systems are important for maintaining comfortable, healthy, and cost-effective commercial spaces. As energy costs rise and environmental concerns grow, businesses are increasingly looking for innovative ways to optimize their HVAC operations. Technological advancements are transforming how systems are monitored, controlled, and maintained, resulting in improved performance and lower operating costs.

Read Post

OpsMatters

Read more about How Technology Improves Commercial HVAC Efficiency

How to Monitor Zoom Performance & Fix "Zoom Your Internet Connection is Unstable"

Oct 4, 2025 By Alyssa Lamberti In Obkio

Zoom calls have become a staple of modern life, connecting us with friends, family, and colleagues from all over the world. But have you ever experienced the frustration of a laggy, glitchy Zoom call that leaves you feeling like you're in a bad sci-fi movie? Laggy video, packet loss, and jitter make it difficult to have a clear and coherent conversation over Zoom, which is why it’s important to identify these Zoom issues before your next call.

Read Post

Obkio

Read more about How to Monitor Zoom Performance & Fix "Zoom Your Internet Connection is Unstable"

Sponsored Post

3 secure ways to handle user data in Raygun

Oct 3, 2025 By Zheng Li In Raygun

You know the feeling: You're right in the middle of cracking a really convoluted coding problem, when an urgent support ticket pops up. It's not just any ticket; it's from a VIP customer with a high-severity issue demanding resolution within an hour. You have to drop what you're doing and scramble, completely context-switching and losing all your momentum.

Read Post

Raygun

Read more about 3 secure ways to handle user data in Raygun

Sponsored Post

Top 10 Reasons Why You Need a Status Page Aggregator

Oct 3, 2025 By Nuno Tomas In isDown

Managing dependencies on multiple third-party services has become a critical challenge for modern engineering teams. A status page aggregator solves this by centralizing monitoring across all your vendors' status pages into a single dashboard, giving you real-time visibility into potential issues before they impact your users. Whether you're managing a complex microservices architecture or simply relying on various SaaS tools, understanding when and why your dependencies fail is crucial for maintaining service reliability.

Read Post

isDown

Read more about Top 10 Reasons Why You Need a Status Page Aggregator

Top tips: Mastering browser extensions without overwhelming yourself

Oct 3, 2025 By Shawn King Jason In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’re looking at how browser extensions can boost productivity when used wisely—and how to avoid being overwhelmed by them. Extensions are like candy for your browser. One promises to save time, another blocks ads, a third manages your tabs, and before you know it, your browser looks like a Swiss army knife.

Read Post

ManageEngine

Read more about Top tips: Mastering browser extensions without overwhelming yourself

Mimir October 2025 Community Call

Oct 3, 2025 By Grafana In Grafana

Let's talk about Mimir! Specifically, we'll discuss Mimir 3.0, which is coming soon.

View Video

Grafana

Read more about Mimir October 2025 Community Call

How to Use Synthetic Monitoring in CI/CD Pipelines

Oct 3, 2025 By Dotcom-Monitor In Dotcom-Monitor

CI/CD pipelines are the heartbeat of modern software delivery. They automate builds, run unit tests, package applications, and deploy them to production with a speed that traditional release cycles could never match. For engineering teams under pressure to move fast, pipelines are the mechanism that makes agility possible.

Read Post

Dotcom-Monitor

Read more about How to Use Synthetic Monitoring in CI/CD Pipelines

Your big VIP Teams call just went south. Do you have the tools to troubleshoot - fast?

Oct 3, 2025 By Sara Purdon In Martello Technologies

Imagine you’re the IT lead responsible for your organization’s Microsoft Teams experience. A big call with the board comes up, loaded with company VIPs, and it’s chock full of issues. Lag, choppy audio, bad connections. After the call, there’s a knock at your door. Not a happy knock. You answer, and standing there is your CEO, stamping her foot and demanding to know what went wrong.

Read Post

Martello Technologies

Read more about Your big VIP Teams call just went south. Do you have the tools to troubleshoot - fast?

How to Identify Network Bottlenecks: From Snail Mail to Warp Speed

Oct 3, 2025 By Alyssa Lamberti In Obkio

Welcome, network admins and IT pros, to a world where network bottlenecks become nothing more than a distant memory. In an era where the need for speed is paramount, identifying and eliminating network bottlenecks is the key to achieving warp-speed connectivity. Your network is like a bustling metropolis, with data zipping through its veins like cars on a busy highway. But suddenly, the flow slows down to a snail's pace, causing frustration and hindering productivity.

Read Post

Obkio

Read more about How to Identify Network Bottlenecks: From Snail Mail to Warp Speed

New Dashboards and Reports for Kubernetes Monitoring

Oct 3, 2025 By Rachel Berry In eG Innovations

This is just a quick blog to draw attention to some new and enhanced monitoring dashboards and reports we have added to eG Enterprise in our latest release (v7.5) to provide quick and powerful overviews when monitoring a range of Kubernetes technologies. As with all our dashboards, color-coded overlays provide guided drilldown for help desk operators and administrators.

Read Post

eG Innovations

Read more about New Dashboards and Reports for Kubernetes Monitoring

Elastic named a Leader in The Forrester Wave: Cognitive Search Platforms, Q4 2025

Oct 3, 2025 By Natalie Blake In Elastic

Today, we’re excited to share that Elastic has been named a Leader in The Forrester Wave: Cognitive Search Platforms, Q4 2025. We believe this recognizes our continued innovation in AI-powered search and the momentum of the Elasticsearch Platform.

Read Post

Elastic

Read more about Elastic named a Leader in The Forrester Wave: Cognitive Search Platforms, Q4 2025

Observability vs. Visibility: What's the Difference?

Oct 3, 2025 By Faiz Shaikh In Last9

In modern IT systems—distributed services, cloud-native platforms, and dynamic networks—just knowing that something is “up” isn’t enough. Green checkmarks on dashboards don’t tell you why performance shifted, why latency crept in, or why a perfectly healthy-looking service suddenly failed. This is where the conversation around visibility and observability begins. They sound similar, but they solve very different problems.

Read Post

Last9

Read more about Observability vs. Visibility: What's the Difference?

Scheduling discovery jobs for dynamic enterprise networks

Oct 3, 2025 By Rama Venkatesan In Site24x7

Networks have evolved far beyond simple data conduits. They're now the backbone of decentralized digital enterprises, serving as critical channels for information exchange. Modern networks connect dispersed locations and devices, driving performance, security, and cost efficiency. However, decentralization also scatters assets, creates blind spots, and increases operational complexity.

Read Post

Site24x7

Read more about Scheduling discovery jobs for dynamic enterprise networks

Understanding NetFlow: The Key to Network Insights

Oct 3, 2025 By VictoriaMetrics In VictoriaMetrics

Is your network data CRASHING your database? NetFlow offers incredible insights, but there's a hidden catch: cardinality explosion. Collecting every IP address can overload time-series databases (even VictoriaMetrics!), killing performance. Watch to learn how to tame the data beast! What's the worst 'cardinality explosion' you've ever witnessed?

View Video

VictoriaMetrics

Read more about Understanding NetFlow: The Key to Network Insights

September product updates

Oct 2, 2025 By Valeria Kurolapova In StatusGator

September was a busy month at StatusGator! We rolled out several major updates designed to give you more visibility, better integrations, and deeper control of your monitoring workflows. From new Early Warning Signal integrations to AWS Health support — plus our biggest API release yet — here’s a quick recap of everything we shipped last month.

Read Post

StatusGator

Read more about September product updates

Why Citrix VAD/DaaS Customers Using VMware Should Consider Migrating to XenServer

Oct 2, 2025 By GripMatix In GripMatix

For years, VMware vSphere was the undisputed leader in enterprise virtualization. Its reliability, feature set, and ecosystem made it the go-to hypervisor for organizations. For organizations running Citrix Virtual Apps and Desktops (VAD) or Citrix DaaS, too, VMware was synonymous with virtualization excellence. But the landscape has changed dramatically. If you're still running your Citrix workloads on VMware, it's time to take a serious look at XenServer, and here's why.

Read Post

GripMatix

Read more about Why Citrix VAD/DaaS Customers Using VMware Should Consider Migrating to XenServer

Announcing Scout's MCP Server for AI-Native Monitoring!

Oct 2, 2025 By Sarah Morgan In Scout

We’re excited to introduce the Scout Monitoring MCP Server — a new way to bring AI-native monitoring directly into your coding assistant. Instead of flipping between dashboards and logs, the MCP (Model Context Protocol) server surfaces performance data, errors, and slow endpoints right where you work. Ask plain-language questions like “show me the latest five errors” and get answers grounded in live telemetry. You can even let your coding assistant propose and push fixes!

Read Post

Scout

Read more about Announcing Scout's MCP Server for AI-Native Monitoring!

AI for Network Leaders by Selector - Strategic Imperatives in an AI World by William Collins

Oct 2, 2025 By Selector In Selector

Strategic Imperatives for Infrastructure Leaders in an AI-Enabled World: William Collins, Director of Technical Evangelism at Itential, explores the strategic imperatives facing infrastructure leaders in today’s AI-enabled world. He unpacks the Gartner Hype Cycle, the true monetary costs of network downtime, and shows how Itential + Selector can close the loop on AIOps with autonomous agents and MCPs.

View Video

Selector

Read more about AI for Network Leaders by Selector - Strategic Imperatives in an AI World by William Collins

AI for Network Leaders by Selector - Building Your First RAG App by John Capobianco

Oct 2, 2025 By Selector In Selector

Building Your First GenAI RAG Application: John Capobianco, Head of Developer Relations at Selector, walks through a 6-step process for building your first GenAI RAG application. From foundational building blocks to the path toward full AI agents, RAG remains a powerful tool with huge ROI. Even in a world of autonomous agents and MCPs, RAG is still one of the best ways for network engineers and IT leaders to query dozens of sources and unlock real value.

View Video

Selector

Read more about AI for Network Leaders by Selector - Building Your First RAG App by John Capobianco

AI for Network Leaders by Selector - AI Agents and MCP by John Capobianco

Oct 2, 2025 By Selector In Selector

AI Agents & Model Context Protocol: John Capobianco, Head of Developer Relations at Selector, dives deep into AI Agents and the Model Context Protocol (MCP). In this session, John demonstrates Selector MCP in action: running as a client-server, connecting multimodal inputs, and even talking to Selector using microphone + TTS audio via Gemini CLI. He also showcases Sebastian Maniak’s Claude Desktop integration, where Selector MCP powers a ChatGPT-like UI for network engineers. A practical look at how MCP is transforming AI into a true digital co-worker.

View Video

Selector

Read more about AI for Network Leaders by Selector - AI Agents and MCP by John Capobianco

AI for Network Leaders by Selector - Round Table discussion

Oct 2, 2025 By Selector In Selector

AI Leaders in Networking Roundtable | Greg Freeman, Du’An Lightfoot, Jeremy Shulman, William Collins, Scott Robohn & John Capobianco (Moderator)

View Video

Selector

Read more about AI for Network Leaders by Selector - Round Table discussion

Cloud Microservices Monitoring on AWS and Azure with OpenTelemetry

Oct 2, 2025 By Alexandr Bandurchin In Uptrace

Your checkout flow starts in an AWS Lambda function, calls a payment service running on EKS, then triggers notifications through Azure Functions. Three different compute platforms, two cloud providers, one distributed trace that you can't see. Cloud providers want you to use their native monitoring tools. AWS pushes X-Ray and CloudWatch. Azure promotes Application Insights and Azure Monitor. These tools work well within their ecosystems but lock you into vendor-specific implementations.

Read Post

Uptrace

Read more about Cloud Microservices Monitoring on AWS and Azure with OpenTelemetry

Observability - Not Just Dashboards and Alerts | Why Teams Like Uber & Salesforce Use Grafana Cloud

Oct 2, 2025 By Grafana In Grafana

Grafana Cloud is a fully managed observability platform built on open source and open standards. From Fitbits to power grids, it helps teams monitor systems, cut through noise, and act faster. With 150+ integrations, Grafana Cloud unifies logs, metrics, and traces, giving visibility from backend to frontend. AI-powered guidance accelerates root cause analysis and simplifies on-call, while customers like Citigroup, Salesforce, Uber, and ASOS scale with confidence.

View Video

Grafana

Read more about Observability - Not Just Dashboards and Alerts | Why Teams Like Uber & Salesforce Use Grafana Cloud

Honeycomb Observability Day SF - Kesha Mykhailov, Fin.ai: Human-Centric Observability in AI Systems

Oct 2, 2025 By Honeycomb In Honeycomb

Empathy is one of the superpowers of modern teams, especially when building tools that interact with humans. This talk by Kesha Mykhailov tells the story of Fin, Intercom's Customer Support agent, and how they transformed their approach to Fin's.

View Video

Honeycomb

Read more about Honeycomb Observability Day SF - Kesha Mykhailov, Fin.ai: Human-Centric Observability in AI Systems

Inside the InfluxDB 3 Plugin Ecosystem

Oct 2, 2025 By Allyson Boate In InfluxData

Companies today face growing pressure to manage and analyze massive flows of time series data, from IoT sensors to cloud-native infrastructure. Storing this information is relatively straightforward. The greater obstacle is keeping it useful and consistent while balancing a wide range of tools and modern technology platforms that continue to evolve.

Read Post

InfluxData

Read more about Inside the InfluxDB 3 Plugin Ecosystem

A closer look at Grafana k6 browser: alignment with Playwright, modern features for frontend testing, and what's next

Oct 2, 2025 By Ankur Agarwal In Grafana

Over the years, we’ve seen our community embrace Grafana k6 browser as a key component of their frontend testing strategies. By helping collect frontend web vitals, capture custom metrics, and simulate user actions like clicking buttons or completing forms, the module offers teams a deeper understanding of performance and availability from their end users’ point of view.

Read Post

Grafana

Read more about A closer look at Grafana k6 browser: alignment with Playwright, modern features for frontend testing, and what's next

Sending beers all across Belgium, a throwback to how we named Oh Dear

Oct 2, 2025 By Mattias Geniar In Oh Dear

We're obviously a little biased, but we believe we have one of the best website monitoring tools on the market today, leading in features compared to our competitors. We've already tried a variety of marketing techniques to promote our service, but none really had the impact we were looking for. Maybe we're better at actually building good software than we are at marketing it? Or are we trying what everyone else is also doing, thus making it all harder?

Read Post

Oh Dear

Read more about Sending beers all across Belgium, a throwback to how we named Oh Dear

What the 2025 DORA Report Teaches Us About Observability and Platform Quality

Oct 2, 2025 By Shabih Syed In Honeycomb

The 2025 DORA State of AI-Assisted Software Development Report delivers a critical insight for technology leaders: AI is fundamentally an amplifier, not a solution. It magnifies the strengths of high-performing organizations with robust observability while exposing the dysfunctions of struggling ones. For organizations that have rushed to adopt AI coding assistants all while expecting immediate productivity gains, this finding demands a strategic pivot.

Read Post

Honeycomb

Read more about What the 2025 DORA Report Teaches Us About Observability and Platform Quality

Debugging Microservices in Production with Distributed Tracing

Oct 2, 2025 By Alexandr Bandurchin In Uptrace

Your production checkout flow just started returning 500 errors. Six microservices handle checkout. Logs show errors in three of them. Which service broke? Which error happened first? What caused the cascade? Traditional debugging doesn't work. You can't attach a debugger to production. Searching logs across six services gives thousands of lines with no obvious connection. By the time you correlate timestamps and trace IDs manually, customers have abandoned their carts.

Read Post

Uptrace

Read more about Debugging Microservices in Production with Distributed Tracing

When BGP becomes UX: The inside story of a SaaS routing decision gone wrong (or right)

Oct 2, 2025 By Wasil Banday In Catchpoint

Most operations teams trust their green dashboards. If the internal monitoring says everything is healthy, the app must be fine, right? But as the Internet keeps proving, what’s green inside the firewall can look red for customers outside of it. Sometimes, a single change in how web traffic moves can suddenly slow logins, disrupt websites, or hurt business results, even if everything looks fine inside.

Read Post

Catchpoint

Read more about When BGP becomes UX: The inside story of a SaaS routing decision gone wrong (or right)

Agentic AIOps in Action: LogicMonitor, IBM, and Red Hat Deliver Self-Healing IT

Oct 2, 2025 By Luca Gianaschi In LogicMonitor

Your most skilled engineers shouldn’t be spending nights and weekends piecing together root causes of outages. Yet many organizations still rely on manual incident response across sprawling hybrid and multi-cloud environments. The result: slower resolution times, frustrated customers and lost revenue that can reach up to $1 million per hour according to IDC. At LogicMonitor, we believe the answer isn’t just better monitoring. It is systems that can heal themselves.

Read Post

LogicMonitor

Read more about Agentic AIOps in Action: LogicMonitor, IBM, and Red Hat Deliver Self-Healing IT

VictoriaMetrics Virtual Meet Up - October 2025

Oct 2, 2025 By VictoriaMetrics In VictoriaMetrics

Agenda highlights: VictoriaMetrics Roadmap Update. Ryan Jacobs will talk about his experience of switching to and using VictoriaMetrics and VictoriaLogs in the world of. Anomaly Detection & Roadmap. Community + AMA.

View Video

VictoriaMetrics

Monitoring

Read more about VictoriaMetrics Virtual Meet Up - October 2025

Seer can fix Web Vitals

Oct 2, 2025 By Sentry In Sentry

Join Serge and Ben to learn more about Web Vitals, lab data vs real user metrics and how Seer can help fix your Web Vitals.

View Video

Sentry

Monitoring

Read more about Seer can fix Web Vitals

AIOps 2.0: The future of IT operations is here

Oct 2, 2025 By ManageEngine

Learn to fix IT incidents in minutes with next-gen AIOps that combines AI, automation, and observability to build resilient, high-performance IT ecosystems at scale.

Get White Paper

ManageEngine

Read more about AIOps 2.0: The future of IT operations is here

September 2025 - Early Warning Signals

Oct 1, 2025 By Colin Bartlett In StatusGator

In September 2025, StatusGator Early Warning Signals identified dozens of outages across cloud, fintech, and education platforms. Many of these incidents were detected before providers acknowledged them — and in some cases, without any acknowledgment at all. We’ve highlighted several of the most significant outages as featured incidents, followed by a list of additional disruptions reported throughout the month.

Read Post

StatusGator

Read more about September 2025 - Early Warning Signals

Announcing the StatusGator API v3

Oct 1, 2025 By Colin Bartlett In StatusGator

We’re excited to introduce StatusGator API v3, the biggest update yet to our developer platform. This release is built on modern standards, offers a richer data model, and unlocks powerful new endpoints designed to help you integrate StatusGator more deeply into your workflows.

Read Post

StatusGator

Read more about Announcing the StatusGator API v3

Monitor Slurm with Datadog

Oct 1, 2025 By Bowen Chen In Datadog

Slurm (Simple Linux Utility for Resource Management) is an open source workload management system used to schedule jobs and manage resources for high-performance computing (HPC) Linux clusters. It ensures that jobs and resources are scheduled fairly and efficiently and is scalable across large clusters, an issue that native Linux process management tools struggle with.

Read Post

Datadog

Read more about Monitor Slurm with Datadog

Monitor Website Performance Globally with Site24x7

Oct 1, 2025 By ManageEngine Site24x7 In Site24x7

Regional slowdowns can go unnoticed until customers start complaining. Site24x7 helps you spot website performance issues worldwide with real-time monitoring from 130+ locations.

View Video

Site24x7

Monitoring

Read more about Monitor Website Performance Globally with Site24x7

How to know your data with Cribl's Ed Bailey and VisiCore Technology's Paul Stout.

Oct 1, 2025 By Cribl In Cribl

Classifying and tagging data is the key to automating pipelines and improving visibility across the enterprise. We’ll share both the technical and business impact of truly knowing your data, and why Cribl makes it possible. Plus, we’ll talk CriblCon and why we’re excited to see you there.

View Video

Cribl

Read more about How to know your data with Cribl's Ed Bailey and VisiCore Technology's Paul Stout.

Reality Bytes: Our Everyday AI Use (Personal & Professional)

Oct 1, 2025 By Nexthink In Nexthink

The Reality Bytes team is back together again! Tim, Tom, Megan, Dina and Sean swap stories of how AI has reshaped their personal and professional lives and habits over the past year—from eerie chatbot encounters and creative breakthroughs to frustrations with hallucinations and the hunt for the true “human fingerprint.”

View Video

Nexthink

Read more about Reality Bytes: Our Everyday AI Use (Personal & Professional)

OTel Naming Best Practices for Spans, Attributes, and Metrics

Oct 1, 2025 By Anjali Udasi In Last9

An incident’s in progress. Services are slow, customers are frustrated, and your dashboards… look fine. At least, until you search for payment metrics and get 47 different names for the same signal. Suddenly, the real issue isn’t latency — it’s inconsistency. The OpenTelemetry project recently published a three-part series on naming conventions to solve exactly this problem.

Read Post

Last9

Read more about OTel Naming Best Practices for Spans, Attributes, and Metrics

How to check CPU usage on Linux

Oct 1, 2025 By Blerim Sheqa In Icinga

When your Linux system feels sluggish, one of the first things to investigate is the CPU usage. The CPU (Central Processing Unit) is the brain of your machine, and if it’s overloaded, everything else slows down. In this guide, you’ll learn different ways to check CPU usage on Linux with command-line tools, how to interpret the metrics, and why automatic monitoring with Icinga ensures long-term system stability.

Read Post

Icinga

Read more about How to check CPU usage on Linux

Easiest Way to Ship Docker & Nginx Logs to Loki with Promtail

Oct 1, 2025 By Benjamin Pitts In MetricFire

Effective monitoring catches problems before users do, and with Promtail, Loki, and LogQL, it’s a lightweight, approachable option for any DevOps team. This guide shows how to monitor Docker itself (pull failures, restarts, health flaps) so you’ve got a baseline on container runtime health.

Read Post

MetricFire

Read more about Easiest Way to Ship Docker & Nginx Logs to Loki with Promtail

A Network Crumb Back Story: A Baker's Dozen Retrospective

Oct 1, 2025 By Doug Madory In Kentik

Loaf and behold, this retrospective covers nearly 20 years of the Baker’s Dozen style annual ranking of the biggest ASes of the internet. In it we discuss the DFZ, de-peerings, and partitions. It was the yeast we could do.

Read Post

Kentik

Read more about A Network Crumb Back Story: A Baker's Dozen Retrospective

Explore the New InfluxDB 3 UI

Oct 1, 2025 By InfluxData In InfluxData

Explorer is the new UI for InfluxDB 3 Core (open source) and Enterprise. It brings everything into one place: ingesting data, querying, visualizing, and managing your database. It’s designed to remove friction: fewer tools, less context switching, faster feedback.

View Video

InfluxData

Read more about Explore the New InfluxDB 3 UI

Why 1% Packet Loss Is the New 100% Outage

Oct 1, 2025 By Yann Guernion In Broadcom

For years, you had an unspoken agreement. Your networks were built to be resilient, and your applications were, for the most part, forgiving. You sent emails, transferred files, and backed up data. If a few packets went missing along the way, the protocols would quietly clean up the mess. A little bit of packet loss was just background noise, an expected imperfection in a system that was, by and large, incredibly robust. You could tolerate it.

Read Post

Broadcom

Read more about Why 1% Packet Loss Is the New 100% Outage

The Importance of Community Knowledge in Tech

Oct 1, 2025 By VictoriaMetrics In VictoriaMetrics

Tools alone aren’t enough. How you use them and the expertise you tap into make all the difference. In this Short, we explore why even the best tools need the proper guidance to unlock their full potential. Open-source communities are goldmines of knowledge and support. Connecting with experts can save you serious time and headaches. While enterprise support is valuable, the community often has your back. Get practical tips to get the most out of your tools, and remember: it’s not just what you use; it’s how you connect, learn, and grow along the way.

View Video

VictoriaMetrics

Monitoring

Read more about The Importance of Community Knowledge in Tech

Operations | Monitoring | ITSM | DevOps | Cloud