Monthly Archive

Top 5 EdTech outages detected by StatusGator in May 2025

May 31, 2025 By Colin Bartlett In StatusGator

In May 2025, several EdTech platforms experienced service disruptions, impacting students, educators, and administrators. StatusGator’s Early Warning Signals feature once again provided timely alerts — often before the affected providers posted updates. Here are the top five EdTech outages detected by StatusGator in May.

Top 5 outages detected by StatusGator in May 2025

May 31, 2025 By Colin Bartlett In StatusGator

In May 2025, several widely used platforms experienced outages affecting both enterprise and consumer services. With StatusGator’s Early Warning Signals, users were alerted to disruptions ahead of official announcements, helping teams respond faster and reduce downtime impact. Here are five major outages StatusGator detected in May.

Traceparent: How OpenTelemetry Connects Your Microservices

May 30, 2025 By Preeti Dewani In Last9

In a microservices setup, tracking a single request across services quickly gets complex. One service calls another, then a third, and your logs don’t line up. The traceparent header carries context between services, so all parts of a request connect back to the start. For example, when a frontend sends a request to an API, which then calls a database service, traceparent links those calls in the trace. Without it, you’re left guessing how requests flow.
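As a rough illustration of how that context travels, here is a minimal Python sketch that assumes the W3C Trace Context version 00 layout (`version-traceid-parentid-flags`, all lowercase hex). The helper names `parse_traceparent` and `child_traceparent` are hypothetical and are not taken from the linked post or from any OpenTelemetry SDK; in practice an SDK would normally handle this propagation for you.

```python
import re
import secrets

# Version 00 layout: 2 hex chars, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # All-zero ids and version "ff" are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(incoming=None):
    """Build the header for an outgoing call.

    The trace-id is reused from the incoming header so every hop joins the
    same trace; only the parent-id (this service's span id) is new.
    """
    ctx = parse_traceparent(incoming) if incoming else None
    trace_id = ctx["trace_id"] if ctx else secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    flags = "01" if ctx is None or ctx["sampled"] else "00"
    return f"00-{trace_id}-{span_id}-{flags}"
```

An API service would call `child_traceparent(request_headers.get("traceparent"))` before calling the database service, so both hops share one trace-id.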

How to Transform IT Operations with PowerFlow for ServiceNow

May 30, 2025 By ScienceLogic In ScienceLogic

IT teams today face a flood of tickets, disconnected tools, and complex hybrid infrastructure. ScienceLogic PowerFlow for ServiceNow simplifies it all—automating workflows, enriching data, and accelerating resolution at scale. In this video, see how PowerFlow brings intelligent automation to transform your IT operations, with real-time ticket enrichment and seamless ServiceNow integration. It’s scalable, efficient, and built for today’s hybrid environments.

VictoriaMetrics Features & Community Call - May 2025

May 30, 2025 By VictoriaMetrics In VictoriaMetrics

Join us for this new monthly call, where we'll discuss cool features that are either new or that we'd like to highlight to the user community. We'll also look at some of the questions that were asked by users that could be of interest to others. We'll talk about how to optimise data collected by default in-stack helm chart, including topics such as the cardinality explorer, stream aggregation, unused metrics, and more. We look forward to seeing you there!

A Leader, Once Again - Thanks to You

May 30, 2025 By Pedro Bados In Nexthink

The demonstration of sustained excellence is ultimately more impressive and important than any individual breakthrough, though the former often goes less celebrated. This week, however, Nexthink is extremely proud to celebrate that we’ve once again been named a Leader in the Gartner Magic Quadrant for Digital Employee Experience (DEX) Tools. It’s a distinction we don’t take for granted, and one we’re honored to have maintained.

How to Pinpoint Root Cause in Real Time

May 30, 2025 By ScienceLogic In ScienceLogic

When systems fail, it’s not just about knowing that something went wrong—it’s about understanding why it happened and pinpointing the root cause fast. ScienceLogic Skylar AI automatically analyzes massive volumes of data, detects patterns, and delivers clear, human-readable insights. The result? Your team knows exactly where to start, acts faster, and keeps issues from escalating.

Our latest Pingdom data improvements - to get more from your monitoring

May 30, 2025 By Noorul Huda N In Squared Up

At SquaredUp, we’re obsessed with making monitoring not just powerful, but a genuinely delightful experience for engineers and teams. When we first built our Pingdom plugin, the goal was simple: make website uptime and performance data easy to visualize alongside everything else you care about. But as our users pushed the boundaries—connecting more endpoints, demanding richer insights, and needing faster troubleshooting—we realized our plugin needed to keep up.

Quick Guide to Monitoring Your Podman VM Using OpenTelemetry

May 30, 2025 By Benjamin Pitts In MetricFire

Podman is a modern, open-source container engine developed by Red Hat that serves as a powerful alternative to Docker. It's designed with security, composability, and simplicity in mind. Here’s why it stands out.

How to Detect Cloud Configuration Changes Before They Cost You

May 30, 2025 By ScienceLogic In ScienceLogic

In this video, see how ScienceLogic helps IT teams take back control. By delivering real-time insight, intelligent policy recommendations, and automated enforcement, ScienceLogic keeps your cloud environment compliant, cost-efficient, and secure. From real-time change detection to automated remediation, you’ll get the visibility and control needed to move fast and stay ahead of disruption.

How to Block Chat Widgets During Playwright Tests (Drift, Intercom & More)

May 30, 2025 By Leo Baecker In Hyperping

Chat widgets are great for customer support, but they can wreak havoc on your automated tests. These floating elements often interfere with Playwright tests by covering clickable buttons, triggering unexpected popups, or causing element selection issues. If you've ever had a test fail because a chat widget appeared at the wrong moment, you're not alone. This guide shows you exactly how to block popular chat widgets like Drift, Intercom, Zendesk, and others during your Playwright test runs.
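One common way to do this is to abort the widget's network requests before they load. The sketch below uses Playwright's Python `page.route` hook; the host list is an assumption based on domains these vendors are commonly served from, and the helper names are hypothetical, so verify both against your own network tab and the linked guide.

```python
from urllib.parse import urlparse

# Assumed widget hosts (not taken from the linked guide); adjust for your site.
BLOCKED_HOSTS = ("driftt.com", "intercom.io", "zdassets.com")


def should_block(url):
    """True when the URL's host is a blocked domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == h or host.endswith("." + h) for h in BLOCKED_HOSTS)


def install_widget_blocker(page):
    """Register a catch-all route on a Playwright page that aborts widget requests."""

    def handle(route):
        if should_block(route.request.url):
            route.abort()  # the widget script never loads
        else:
            route.continue_()  # everything else passes through untouched

    page.route("**/*", handle)
```

Calling `install_widget_blocker(page)` before `page.goto(...)` keeps the widget from ever rendering, which avoids having to hide it with CSS after the fact.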

E-Commerce Micro-Friction: The Conversion Killer You're Not Measuring (...But Should Be)

May 30, 2025 By Germain UX Team In Germain UX

Friction density (the number of stacked micro-frictions in a single session) is the predictor of abandonment we don’t talk about enough. And guess what? It beats any single UX metric by 2× when it comes to predicting lost sales.

The 12 Best Nagios Alternatives in 2025

May 30, 2025 By Pavithra Parthiban In Atatus

Are you looking for a Nagios alternative? Then you have come to the right place. In this blog, we will go through the top Nagios alternatives available today. But before that, let’s briefly look at what Nagios offers and why some teams might consider switching to another monitoring solution.

Dashboards, or Launchpads?

May 30, 2025 By Martin Thwaites In Honeycomb

I have a personal vendetta against “dashboards.” Not because they’re not useful—I actually think they’re extremely useful—but rather because they’re generally built with the wrong user in mind, then used by a completely different user, and for a different use case.

How to Automatically Detect Linux Configuration Drifts

May 30, 2025 By ScienceLogic In ScienceLogic

When a quick Linux fix during an outage slips through unnoticed, it can silently break compliance and put your infrastructure at risk. In this video, we dive into how ScienceLogic helps IT and security teams detect and resolve these hidden issues automatically—before they become bigger problems. We demonstrate how ScienceLogic identifies unauthorized Linux configuration changes, flags policy violations, and restores compliance through intelligent automation. From configuration drift detection to enriched ServiceNow tickets and automated remediation, it’s all about eliminating the guesswork and staying ahead of risk.

Early Warning Signals: Now visible everywhere!

May 29, 2025 By Colin Bartlett In StatusGator

We’ve just rolled out an exciting enhancement to our Early Warning Signals system: Possible outages that we detect are now visible directly in your Admin board and on your public Status Page. Until now, we’ve notified StatusGator users about possible outages via email — and more recently, Slack — when we detected signs of an outage before any official acknowledgment. Now, these early warnings appear right where they’re most impactful.

Windows Error Logs: Your Guide to Simplified Debugging

May 29, 2025 By Faiz Shaikh In Last9

When an application functions flawlessly in your environment but crashes unpredictably on a client’s Windows server, the root cause is often buried in system logs—logs many developers overlook. Windows maintains comprehensive error records that document crashes, failures, and system events with precise detail. These Windows error logs serve as an invaluable resource for diagnosing issues in production environments.

What's new in Grafana Metrics Drilldown: advanced filtering options, UI enhancements, and more

May 29, 2025 By Zhehao Zhou In Grafana

Grafana Metrics Drilldown offers a queryless experience for browsing Prometheus-compatible metrics. With Metrics Drilldown — which is part of our suite of Grafana Drilldown apps — you can quickly find related metrics with just a few simple clicks, no PromQL queries required.

Achieving FedRAMP Authorization: Driving Federal IT Efficiency and Security with ScienceLogic Government Cloud

May 29, 2025 By ScienceLogic In ScienceLogic

We are thrilled to announce that ScienceLogic has achieved Federal Risk and Authorization Management Program (FedRAMP) Moderate authorization for the ScienceLogic Government Cloud. This milestone represents the culmination of our commitment to delivering secure, reliable, and efficient IT operations management solutions for government agencies.

How Auditd Logs Help Secure Linux Environments

May 29, 2025 By Anjali Udasi In Last9

If you manage a Linux server and notice something unusual, auditd logs can help you track exactly what’s happening. This built-in audit system records who accessed the system and what actions they performed. In this guide, we’ll cover setting up auditd, reading the logs, and using them to detect potential security issues early.

Five Creative Uses of Content Monitoring Software

May 29, 2025 By ChangeTower In ChangeTower

Content monitoring software is designed to track changes on web pages over time—capturing additions, deletions, and modifications across everything from blog posts to landing pages and legal disclaimers. Traditionally, it’s used by businesses and organizations that need to keep tabs on important, frequently updated content—either for compliance, competitive intelligence, or performance reasons. Here are some of the most common use cases.

Leading analyst study reveals how resilience unlocks eCommerce growth

May 29, 2025 By Catchpoint In Catchpoint

Customer expectations for seamless digital experiences are higher than ever, and any disruption in availability or performance can lead to abandoned purchases and millions in lost revenue. A new commissioned study conducted by Forrester Consulting on behalf of Catchpoint shows retail & eCommerce companies are struggling. To ensure seamless customer experiences, retail & eCommerce companies must employ comprehensive Internet Performance Monitoring or assume the risk of letting millions in revenue slip away.

Highlights from Google Cloud Next 2025

May 29, 2025 By Trammell Saltzgaber In Datadog

Google Cloud Next is the biggest event of the year for the Google Cloud community, showcasing the latest and greatest offerings from Google Cloud and hundreds of its partners. As a long-time Google Cloud partner and recipient of three Google Cloud Partner of the Year awards in 2025, Datadog was there in full force, delivering several speaking sessions and running a booth on the expo floor where we met with thousands of attendees. In case you missed it, don’t worry.

Syslog Implementation: Servers, Integration and Best Practices

May 29, 2025 By Alexandr Bandurchin In Uptrace

Syslog is a fundamental protocol for collecting messages and event data from various devices and applications across a network. Think of it as a universal language that allows your servers, routers, firewalls, and software to send their operational insights to a central logging point. Born from Unix systems, Syslog has evolved to become the industry standard, forming the backbone of effective log management and providing a unified view of your infrastructure's activity.
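To make the "universal language" concrete: every syslog message starts with a priority value in angle brackets, which encodes the sender's facility and the severity as `facility * 8 + severity` (per RFC 3164 and RFC 5424). The following Python sketch decodes it; `parse_pri` is a hypothetical helper for illustration, not something from the linked article.

```python
import re

# Severity names indexed by their numeric code (0 = most severe).
SEVERITIES = ("emerg", "alert", "crit", "err", "warning", "notice", "info", "debug")

_PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(message):
    """Return (facility_code, severity_name) from a raw syslog message."""
    match = _PRI_RE.match(message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    pri = int(match.group(1))
    if pri > 191:  # 23 facilities * 8 + 7 is the largest valid value
        raise ValueError("priority value out of range")
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]
```

For example, a message beginning with `<34>` decodes to facility 4 (security/authorization) with severity `crit`, which is the kind of field a central log server typically uses for routing and alerting.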

Debugging Errors in Background Jobs

May 29, 2025 By Sentry In Sentry

Debugging background jobs is one of those tasks that always sounds easier than it is—until you’re knee-deep in stack traces that offer no real clues. Background jobs love to run in isolated environments, cutting themselves off from all the helpful context you’d normally have. @nikolovlazar shows us how to debug these errors anyway—piecing together the missing context across systems so you can actually fix the problem instead of just guessing.

What's Inside InfluxDB 3.1: New Features for Security, Performance, and Visibility

May 29, 2025 By Peter Barnett In InfluxData

InfluxDB 3.1 is now available for both Core and Enterprise editions, bringing significant improvements that make managing high-volume, high-velocity time series data even easier, faster, and more secure. InfluxDB 3 Core is the free, open source edition of InfluxDB 3—a high-speed, recent-data engine licensed under MIT and Apache 2. InfluxDB 3 Enterprise is the commercial version of Core, adding support for longer-term historical queries, high availability, enhanced security, and more.

Navigating the SSE Landscape: The 2025 Gartner Magic Quadrant

May 29, 2025 By Teneo In Teneo

Having reviewed the 2025 Gartner Magic Quadrant for Security Service Edge (SSE), it is fair to say that it reflects a comprehensive evaluation of vendors delivering integrated, cloud-based security solutions. However, while such assessments provide valuable insights for those looking for full-stack adoption, real-world adoption may require deeper analysis and strategic planning.

Detect hallucinations in your RAG LLM applications with Datadog LLM Observability

May 28, 2025 By Barry Eom In Datadog

Hallucinations occur when a large language model (LLM) confidently generates information that is false or unsupported. These responses can spread misinformation that jeopardizes safety, causes reputational damage, and erodes user trust. Augmented generation techniques, such as retrieval-augmented generation (RAG), aim to reduce hallucinations by providing LLMs with relevant context from verified sources and prompting the LLMs to cite these sources in their responses.

Kubernetes Logs: How to Collect and Use Them

May 28, 2025 By Anjali Udasi In Last9

If you’ve worked with Kubernetes, you know logs are essential for understanding what’s happening inside your clusters. However, unlike traditional servers, Kubernetes logs present their own unique challenges. Pods frequently start and stop, containers restart regularly, and logs stored locally can be lost quickly. Because of this, managing logs in Kubernetes requires a different approach.

Docker Container Lifecycle: Key States and Best Practices

May 28, 2025 By Faiz Shaikh In Last9

You’ve probably run a lot of Docker containers, but do you know what happens behind the scenes? The Docker container lifecycle is the path a container follows from being created to running, stopping, and finally getting removed. Understanding these steps helps you figure out why a container might not start or when to restart it instead of creating a new one.

Introducing Session Health in Sentry (Now In Open Beta)

May 28, 2025 By Steve Zegalia In Sentry

You push a release that touches the checkout flow. Now you’re glued to dashboards and checking Slack, hoping you didn’t introduce a regression that breaks the payment path. You can’t tell if you’ve just shipped a blocker that’s stalling every cart—or some edge case quietly making users bail.

Flying Your Network Blind? | Obkio

May 28, 2025 By Obkio In Obkio

We created this video for every IT team still relying on guesswork to manage network performance. Here’s the reality:

– No monitoring = flying blind
– No alerts = no prevention
– No visibility = slow troubleshooting, false assumptions, and frustrated users

Even the best IT pros need the right tools, just like pilots need instruments. Have you ever thought about:

– Why “no complaints” isn’t the same as “no issues”
– The hidden cost of poor visibility
– How skill only takes you so far without data to back it up

Lowering the barrier to user feedback #coding #programming #webdesign #debugging

May 28, 2025 By TrackJS In TrackJS

Jordan discusses lowering the barrier to getting user feedback during the redesign of @Trackjs.

A Message from our CEO: Virtana Acquires Zenoss!

May 28, 2025 By Virtana In Virtana

Breaking Silos: Pairing InfluxDB 3 with Your Historian for Better Insights

May 28, 2025 By Allyson Boate In InfluxData

Industrial systems constantly generate time series data—streams of time-stamped values like temperature, flow rate, vibration, or power load. This data powers real-time monitoring, performance tracking, and long-term forecasting across critical infrastructure, energy systems, and manufacturing environments.

ManageEngine Site24x7 monitoring actions are now available within ServiceDesk Plus On Demand

May 28, 2025 By Ramkumar Ramaswamy In Site24x7

At ManageEngine, we're committed to empowering IT teams with tools that simplify operations and deliver effortless observability for all stakeholders. We're excited to announce the Site24x7 extension for ManageEngine ServiceDesk Plus On Demand now available on the ManageEngine Marketplace. This extension transforms ServiceDesk Plus On Demand from a passive ticketing tool into an active hub for IT infrastructure management.

User experience depends on more than just your website speed.

May 28, 2025 By Catchpoint In Catchpoint

Even if your Core Web Vitals are flawless, user experience depends on so much more—CDNs, DNS, BGP, APIs, third-party services, and local ISPs all play a role in how users experience your page. If any part of that chain breaks, so does the user experience. And worse? Your teams are left in the dark, scrambling to find the root cause.

Honeycomb MCP: Code to Production Impact

May 28, 2025 By Honeycomb In Honeycomb

Right from your IDE, find out how long your code took to execute in production. If your AI agent speaks Model Context Protocol, it can query Honeycomb for you!

Kubernetes observability: How to enrich logs with GeoIP using the Kubernetes Monitoring Helm Chart

May 28, 2025 By Mattias Segerdahl In Grafana

When your Kubernetes app suddenly has traffic spikes in a distant country, it can be difficult to determine why. Let’s say, for example, we have an e-commerce app that started to receive an unusual surge of visitors from Australia — something we never anticipated. We search for answers in our logs, but without geographic context, we don’t have the full insights we need.

Creating a Service Level Objective

May 28, 2025 By Honeycomb In Honeycomb

Last Thursday, our /api/cart endpoint got a lot slower. But we didn't notice, because we hadn't set up a Service Level Objective to guard its reliability and responsiveness. This video shows how to identify and create that SLO in Honeycomb.io.

Honeycomb MCP: Performance Problem to Code Fix

May 28, 2025 By Honeycomb In Honeycomb

Integrate your AI agent with Honeycomb, and it can run queries to find problems, then get right into the code to fix them.

Build Vega-Lite visualizations natively in Datadog with the Wildcard widget

May 28, 2025 By Candace Shamieh In Datadog

Datadog dashboards provide a unified view of your applications, infrastructure, logs, and other observability data—making it easy to monitor health, investigate issues, and share insights across teams. While native Datadog widgets support a broad range of visualization types, some use cases call for more customized representations, particularly when you’re working with unconventional data formats, external sources, or specific transformations.

Elastic and AWS collaborate to bring GenAI to DevOps, security, and search

May 28, 2025 By Eddie Xu In Elastic

Today, we are happy to celebrate Elastic and AWS committing to a five-year strategic collaboration agreement (SCA). Our collaboration underscores the efforts of Elastic and AWS to provide you with increased speed and greater flexibility as you adopt generative AI technology.

SQL Server Security: Protecting Your Data From Threats

May 28, 2025 By DNSstuff tech team In SolarWinds

If your organization isn’t focused on data security, it’s time to make some changes, particularly if you rely on SQL Server to manage and store valuable information. Cyber threats, data breaches, and malicious attacks are on the rise—and they are constantly evolving. That’s why it’s essential to have robust security measures in place. SQL Server has several built-in security features, but you must take a proactive approach to protect your data.

AIOps benefits: 5 core ways agentic AI transforms IT

May 28, 2025 By LogicMonitor In LogicMonitor

Your systems are getting faster. More complex. More distributed. But your tools are still waiting for something to go wrong before they do anything about it. That’s the real limitation of most AIOps platforms. They highlight issues. They suggest next steps. But they stop short of action—leaving your team to connect the dots, chase down context, and manually fix what broke. Agentic AIOps doesn’t wait. It acts.

Bring a Business Service Perspective to Your Network Monitoring

May 28, 2025 By Robert Kettles In Broadcom

In recent years, network performance and business performance have become increasingly intertwined. Now, virtually every critical employee and customer service is in some way reliant upon network connectivity. When connectivity falters, those critical processes can be impaired or stopped completely. However, for too many teams, it can be difficult to knowledgeably determine how specific outages or issues actually affect a business service. For example, say an operator discovers a device is down.

The End of the Network Engineer as We Know It?

May 28, 2025 By Yann Guernion In Broadcom

For decades, the enterprise network was a well-defined fortress and network engineers were its meticulous guardians. However, their visibility and control were largely confined within the parameters of their organization's infrastructure. The cloud revolution and the ubiquity of SaaS applications have shattered these traditional boundaries. Today, for virtually every organization, the internet is the new enterprise network.

The Architecture Loop: MVC and the Hidden Costs of Microservices

May 28, 2025 By Sarah Morgan In Scout

In 2025, many companies are reckoning with the true cost of microservices, especially as cloud bills grow and engineering teams face coordination fatigue. The move back to monoliths is gaining traction, particularly for startups and mid-sized businesses who need:

Introducing Logz.io Dashboards (Beta): Shaping the future of unified Observability with Open 360

May 28, 2025 By Jade Lassery In logz.io

We’re thrilled to announce the Beta launch of Logz.io Dashboards – a major step forward in how engineers and DevOps teams visualize and analyze their telemetry data. For the first time, Logz.io users can now create dashboards that bring together logs, metrics, and traces in a single unified view — making it easier than ever to monitor performance, detect issues, and troubleshoot incidents without switching tools or losing context. This launch is more than just a product update.

What's New in 6.2 Webinar

May 28, 2025 By Graylog In Graylog

SIEM & Log Management — Without Compromise: Watch an exclusive dive into Graylog 6.2 Spring ’25 Release, purpose-built to eliminate the trade-offs traditional Log Management and SIEMs force on your IT, Security, DevOps and Compliance teams. You get smarter data retention, plus easier detection and investigations.

What's New in Progress Flowmon ADS 12.5

May 28, 2025 By Filip Cerny In Flowmon

Progress is pleased to announce that we have updated our industry-leading Flowmon Anomaly Detection System (ADS) to version 12.5. The latest update has these additions: Let’s take a look.

How to Add Performance Data Graphs into Your Icinga Instance

May 28, 2025 By Guest Author In Icinga

This is a guest blogpost by Markus Opolka from the Icinga Enterprise Partner NETWAYS. After forking the Grafana Module for Icinga Web last year, we started thinking about alternative ways to display Icinga performance data graphically in the web interface. Running a separate Grafana instance just to render graphs is a lot of overhead and adds operational complexity — no matter how much you like Grafana. Plus, installing the grafana-image-renderer isn’t always straightforward.

Introducing Netdata Insights

May 27, 2025 By Shyam Sreevalsan In netdata

We’ve been thinking a lot about synthesis lately. Netdata already samples every metric every second at the edge. Engineers told us the remaining pain point was synthesis, the ability to pull hours or days or months of high‑resolution time‑series into a concise explanation they could hand to a teammate (or use themselves to debug faster).

SigNoz Community Edition now available with SSO (Google OAuth) and API Keys

May 27, 2025 By Ankit Anand In SigNoz

One of the biggest asks from our open-source community has been to open-source our SSO support, which was part of our enterprise offering. Today, we’re thrilled to announce that support for SSO with Google OAuth is now part of our latest release. Latest version: v0.85.0. Not only that, we've also shipped another highly anticipated feature for our Community Edition: API Keys for comprehensive programmatic access to SigNoz.

Discover powerful insights with nested metric queries

May 27, 2025 By Kathy Lin In Datadog

To gain adequate visibility into your distributed applications, you need to observe those applications at different levels of granularity. This means that you need to be able to query collected telemetry data both at the level of the whole application and at the level of selected components. Thanks to the power of Datadog tagging, you can already do this by aggregating your metrics within any scope of your choosing.

Why didn't my Playwright test capture video? Troubleshooting guide.

May 27, 2025 By Checkly In Checkly

How to diagnose failures with a Playwright trace: https://www.checklyhq.com/guides/reading-traces/

Playwright fixtures: https://youtu.be/2O7dyz6XO2s

A Fresh Look Without Moving the Cheese

May 27, 2025 By Todd H. Gardner In TrackJS

After 12 years of faithful service, the TrackJS interface was starting to show its age. Not that it wasn’t working—it was still doing exactly what our customers needed it to do. But when you’re staring at Bootstrap styles from 2012 and a version of LESS that might be officially defunct, it’s probably time for a refresh.

Read Post

TrackJS

Server Performance Metrics Explained

May 27, 2025 By Faiz Shaikh In Last9

Server performance metrics help you figure out what’s going wrong, where your bottlenecks are, and how your system handles load. They give you the data to plan capacity, fix issues before they escalate, and build more reliable infrastructure. In this guide, we’ll go over the core metrics that matter, how to monitor them effectively, and the tools that can help along the way.

Read Post

Last9

3 AIOps Trends for 2025 #aiops

May 27, 2025 By ScienceLogic In ScienceLogic

As IT environments grow more complex, teams need smarter, faster ways to stay in control. In 2025, three trends are redefining how modern IT operations teams drive efficiency and resilience. Automation Everywhere: offload routine tasks with intelligent workflows. Predictive Everything: spot and resolve issues before they impact users. AI + Human Collaboration: empower teams with real-time, AI-driven insights.

View Video

ScienceLogic

Brand email with your logo

May 27, 2025 By Colin Bartlett In StatusGator

StatusGator supports custom email branding on our Enterprise plan and as an add-on to other plans, allowing your customers or end-users to get an email that has your organization logo and sends from your organization’s email address. Previously, this email logo used the same image as your status page. Now, you can upload a custom logo to be used just for your emails. Enjoy improved branding by uploading a logo that fits the email perfectly.

Read Post

StatusGator

Span Metrics

May 27, 2025 By Sentry In Sentry

Try Sentry for free: https://sentry.io
Docs: https://docs.sentry.io

View Video

Sentry

Simplifying Observability: Streamlining Telemetry with a Centralized Pipeline

May 27, 2025 By Norman Hsieh In Checkly

Modern applications generate a deluge of telemetry data—logs, metrics, and traces—that hold the key to understanding system performance and reliability. However, managing this data effectively is a growing challenge for DevOps teams. Raw telemetry can overwhelm teams with complexity and noise even when collected via robust standards like OpenTelemetry.

Read Post

Checkly

Telemetry for Modern Apps: Reducing MTTR with Smarter Signals

May 27, 2025 By Mezmo In Mezmo

By Sara Miteva, Sr. Product Marketing Manager, Checkly. Modern applications are complex. Microservices, third-party dependencies, and continuous deployments all contribute to a flood of telemetry data (logs, metrics, traces) flying in from every direction.

Read Post

Mezmo

Raids or Work Projects? The Surprising Similarities!

May 27, 2025 By solarwindsinc In SolarWinds

Coordinating 40 MMO players online isn’t so different from leading a team at work. From problem-solving to collaboration, the skills gamers practice are shockingly transferable to the workplace.

View Video

SolarWinds

How to import Prometheus-style alerts and recording rules to Grafana-managed alerts and recording rules

May 27, 2025 By Sonia Aguilar In Grafana

Grafana Alerting has evolved dramatically since the legacy dashboard-alert days. Today, Grafana-managed alerts power enterprise-scale monitoring in Grafana Cloud and on-prem installations. And over the last two years, we’ve added RBAC, state history, versioning, and much more. At the same time, our own monitoring at Grafana Labs relies heavily on Prometheus-style alerts—a situation that’s not uncommon for our users, too.

Read Post

Grafana

From Alert to Fix in 10 minutes: How a Slow Query Took Down Placid.app

May 27, 2025 By Armin Ulrich In Sentry

This is a guest post from Armin Ulrich, a fullstack developer and founder of placid.app. He also created the MadeWith* network where he shares his projects and allows other developers to share theirs. There are many things I would rather do at 9pm than track down a mission-critical bug, but sometimes you don’t have a choice. Let me tell you the story about a slow query that led to a cascading failure, and how it could have been worse.

Read Post

Sentry

Extreme automation and the SAP Cloud ERP journey

May 26, 2025 By Brenton O'Callaghan In Avantra

Cloud ERP arrives as the new holy grail of ERP architecture: a composable, flexible and scalable collection of core business services working together to meet enterprise ERP needs. Of course, getting there for a large enterprise with significant existing complexity across legacy SAP implementations isn't a trivial task. Much has been written about S/4HANA migration, but less explored is how automation solutions built for the regular operations of SAP can benefit migration projects. These solutions offer a number of accelerators and benefits to migration projects and SAP teams, so they are worth exploring.

Read Post

Avantra

How to Reduce Downtime: Keep Your Business Running Smoothly

May 26, 2025 By Nuno Tomas In isDown

Downtime refers to any period when your business operations are interrupted or unavailable due to technical issues. Whether it's caused by unscheduled downtime, like sudden system failures, or planned downtime for regular maintenance, it can significantly impact your business continuity. The effects of downtime can be severe, leading to financial losses, decreased productivity, and a damaged reputation.

Read Post

isDown

The ROI of monitoring your Azure environment: Prevent surprises, control costs, boost uptime

May 26, 2025 By Mahalashmi Narayanan In Site24x7

Like many cloud providers, Azure offers services that scale with usage. However, unanticipated overutilization of Service Bus, Azure Functions, and SQL databases can incur additional costs. Managing these resources effectively is crucial for keeping billing predictable.

Read Post

Site24x7

Cloud Cost Management & Trends in 2025: Strategies to Optimize Your Cloud Spend

May 26, 2025 By Kayly Lange In Splunk

Cloud computing has become the backbone of modern business operations, powering everything from day-to-day collaboration to large-scale digital transformation initiatives. As organizations deepen their reliance on cloud services, the financial stakes continue to grow. According to Gartner, global spending on public cloud services is projected to reach over $720 billion in 2025, a significant increase from nearly $600 billion in 2024.

Read Post

Splunk

Shedding Light on Kafka's Black Box Problem (with OpenTelemetry)

May 26, 2025 By Elizabeth Mathew In SigNoz

"All language is but a poor translation." — Franz Kafka This quote by Franz Kafka reminds me of the time when I used to look at metrics from “Apache Kafka” topics trying to figure out what was causing the huge lags and manually deleting the messages in certain partitions to get rid of polluted messages. Yep, pretty lost in translation. I wasn’t aware of the power of observability for a Kafka producer-topic-consumer system.

Read Post

SigNoz

Graylog vs Loki: Key Differences and Use Cases

May 26, 2025 By Anjali Udasi In Last9

Logs are a key part of building and running software, but managing them can get complicated fast. As your apps grow and generate logs from many sources, choosing the right tool to store, search, and analyze those logs becomes important. Graylog and Loki are two popular options, each with a different way of handling logs. In this blog, we’ll break down the main differences between Graylog and Loki, how they work, and which types of projects they suit best.

Read Post

Last9

An Easy and Practical Guide to CDN Monitoring

May 26, 2025 By Preeti Dewani In Last9

A CDN delivers your content around the world, making sure users get it quickly and reliably. When it slows down or goes offline, users notice right away. Good CDN monitoring gives your team the information needed to fix issues before they affect users. This guide explains the basics of CDN monitoring and shows practical ways to set it up.

Read Post

Last9

Mission Impossible: Find out the Reasons Why Your Network Is Down (and How to Proactively Prevent Network Downtime)

May 26, 2025 By Andrii Kernitskyi In Obkio

Your mission, should you choose to accept it, is to prevent network downtime before it takes your business offline. The threat is real. One moment, your network is up. The next, calls drop, websites freeze, apps stall, and customers vanish. You hear the dreaded question echoing across departments: “Is the network down?” You’re not alone.

Read Post

Obkio

How to Choose an APM Solution: 5 Critical Questions for 2025

May 25, 2025 By Alexandr Bandurchin In Uptrace

An APM solution, or Application Performance Monitoring tool, is a software application that helps businesses monitor and manage the performance and availability of software applications. APM tools gather data from systems, servers, databases, APIs, and end-user devices to provide deep insights into the root causes of performance issues. APM solutions have evolved far beyond basic monitoring.

Read Post

Uptrace

Grafana Campfire - Hiring with AI and more about Grafana MCP (Grafana Community Call - May 2025)

May 24, 2025 By Grafana In Grafana

In this Campfire community call, we will talk about what's new and what's next for AI in the observability space, and discuss the Grafana MCP server, which provides access to your Grafana instance and the surrounding ecosystem. Join me (Usman), Matt Ryer, Carl Bergquist, and David Kaltschmidt for this exciting session. Special guests: Sarah Zinger, Cyril Tovena, and Ben Sully.

View Video

Grafana

NiCE Expands Microsoft SCOM Services with New Expert Training Options

May 23, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

NiCE IT Management Solutions is excited to announce expanded service offerings and professional training options for Microsoft System Center Operations Manager. In addition to our well-established consulting and monitoring solutions, we now offer custom and standard SCOM training programs tailored to varying skill levels and organizational needs. Our goal: empower IT teams to maximize performance, ensure stability, and deepen expertise in managing modern infrastructure.

Read Post

NiCE IT Mgmt

Easy Way to Convert Wavefront Metrics Using OpenTelemetry

May 23, 2025 By Benjamin Pitts In MetricFire

Once upon a time in the world of metrics, Wavefront was a pioneer. Before Prometheus took over and tools like OpenTelemetry unified tracing and metrics, Wavefront brought something novel to the table: human-readable metrics with real-time querying and tag-based dimensionality. In enterprise environments running VMware or early microservices, it offered a scalable way to understand a system's behavior. But as the telemetry landscape evolved, many systems that spoke Wavefront were left behind.

Read Post

MetricFire

Ownership change of the ansible-collection-icinga to NETWAYS

May 23, 2025 By Feu Mourek In Icinga

NETWAYS has long played a leading role in maintaining the Ansible Collection Icinga, contributing features and bug fixes. Now it’s official: the Ansible Collection Icinga is moving into the NETWAYS namespace (on GitHub and Ansible Galaxy). The people involved in the repository will remain largely the same.

Read Post

Icinga

What are Microservices? A Path to Scalability and Agility

May 23, 2025 By Wendy Howard In eG Innovations

If developing scalable, agile applications is a priority for your business, microservices may provide a compelling solution. But what are microservices exactly? The proper microservices definition refers to a modern architectural approach where an application is built as a collection of loosely coupled services. Each service is independent, self-contained, and designed around a specific business capability.

Read Post

eG Innovations

Now in Beta: Move All Your Monitors to Code with Checkly CLI Import

May 23, 2025 By Checkly In Checkly

Checkly CLI Import is now in Beta. If you started adding your monitors through the Checkly UI and now want to scale and move to Monitoring as Code, you can do this effortlessly - by using our new Checkly Import command.

View Video

Checkly

Why didn't my Playwright test capture video?

May 23, 2025 By Nočnica Mellifera In Checkly

If you use Checkly, eventually you'll be looking at alerts about something failing, and wonder how to debug a failed check. For most of us, the first thing we want to see is the video of a failed check run. Sometimes, though, our check doesn’t capture video. This guide will cover three common reasons a video doesn’t show up on a check run. This advice is general for Playwright as well as those running Playwright tests on Checkly.

Read Post

Checkly

Understanding Core Web Vitals | Grafana Frontend Observability

May 23, 2025 By Grafana In Grafana

In this video, we break down Core Web Vitals in Grafana Cloud’s Frontend Observability and Grafana Faro.

View Video

Grafana

Inside the Observability Journey: Lessons from CarGurus, Nearform & More

May 23, 2025 By Grafana In Grafana

Join us for a dynamic panel from Observability Sessions Boston where leaders from CarGurus, Nearform, and Grafana Labs share their real-world experiences with observability. In this candid discussion, David Frankel (CarGurus) and Joe Szodfridt (Nearform) delve into the challenges of implementing scalable observability practices, moving from centralized models to federated teams, and navigating cloud migration with a focus on performance and cost.

View Video

Grafana

Take enhanced control of your log data with Datadog Log Workspaces

May 23, 2025 By Aaron Kaplan In Datadog

Security, operations, and development teams rely more and more on the ability to efficiently query logs. As these teams monitor the health, performance, and usage of their systems and investigate incidents, delving into log data can often be a matter of urgency.

Read Post

Datadog

Surprised By Your AWS ELB Bill? Here's What Happened

May 23, 2025 By LogicMonitor In LogicMonitor

On May 1st, AWS corrected a long-standing billing bug tied to Elastic Load Balancer (ELB) data transfers between Availability Zones (AZs) and regions. That fix triggered a noticeable increase in charges for many users, especially for those with high traffic volumes or distributed architectures. The problem wasn’t new usage; it was a silent correction to an old error.

Read Post

LogicMonitor

VPC Log Format: Custom and Advanced Configurations

May 23, 2025 By Anjali Udasi In Last9

VPC Flow Logs come with a default format that gives you basic network traffic details. But you can tweak the format to capture exactly what you need. This can lower costs, speed up processing, and make your logs fit better with what you’re trying to monitor. If you want to improve security, keep an eye on performance, or save money, adjusting your VPC logs can make a big difference. Let’s take a look at some practical ways to customize your logs beyond the default settings.

Read Post

Last9

A Simple Guide to Monitoring and Optimizing Prometheus CPU Usage

May 23, 2025 By Faiz Shaikh In Last9

Prometheus is supposed to help you monitor your stack, not become the thing you need to monitor. But if you’ve ever seen it spike in CPU and slow everything down, you know that’s not always the case. High Prometheus CPU usage usually shows up when you're scraping too many metrics, using expensive queries, or running with default configs that don’t fit your workload. This guide covers how to track Prometheus CPU usage, what typically causes it, and how to fix it.

Read Post

Last9

SAML authentication in Grafana Cloud: a guide for easy configuration

May 23, 2025 By Ryan Kelly In Grafana

In my role as Senior Observability Architect here at Grafana Labs, one of the things I focus on is making sure customers are getting the most out of our products. Recently, I noticed a trend where customers were struggling to get SAML authentication configured properly. They were getting stuck on some of the steps needed to configure the users key pair values, which allows users to log in with the correct roles assigned in Grafana.

Read Post

Grafana

Harnessing Network Observability to Enhance Grid Resilience

May 23, 2025 By Alec Pinkham In Broadcom

Within the utility sector, a lot is changing. Utilities continue to pursue digital transformation, altering the way services are delivered and operations are managed. What hasn’t changed is the criticality of the services provided. These organizations deliver essential resources like natural gas, electricity, and water—services that we as consumers rely upon constantly for our comfort, sustenance, communications, and more.

Read Post

Broadcom

Preparing for the Autonomous Future

May 23, 2025 By Dallon Robinette In Selector

Throughout this blog series, we’ve followed how AI reshapes network operations – from foundational data harmonization to real-time correlation, from contextual insights to agent-driven automation, and most recently, to conversational access through natural language interfaces. But we haven’t reached the final destination.

Read Post

Selector

NiCE MP Catalog Offline Toolkit Free of Charge!

May 22, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

A free-of-charge utility for System Center Operations Manager administrators working in secure, offline, or air-gapped environments.

Read Post

NiCE IT Mgmt

How to implement business observability

May 22, 2025 By Elastic Observability Team In Elastic

It sounds simple: You define metrics for success, you track them, and if they fail, you fix them. For decades, this was how businesses monitored their systems. However, a reactive monitoring approach, which alerts businesses about failures only after the issue has already impacted operations, became insufficient as digital architectures grew more complex.

Read Post

Elastic

Welcome to the SolarWinds Community | Overview & Resources Guide

May 22, 2025 By solarwindsinc In SolarWinds

Welcome to the official SolarWinds Community! Whether you're just getting started or are a long-time user, this video offers a complete overview of the valuable tools, training, and support available to help you get the most from your SolarWinds experience.

View Video

SolarWinds

Create metricization rules for your data (with Voiceover)

May 22, 2025 By Splunk In Splunk

Create an SPL2 statement that generates metrics from your log data based on extracted fields - now enhanced with voiceover narration to provide clearer instruction and improved accessibility.

View Video

Splunk

Create a Splunk Observability Cloud destination (with Voiceover)

May 22, 2025 By Splunk In Splunk

Create a connection to your Splunk Observability Cloud deployment from the system connections page in Splunk Cloud Platform - now enhanced with voiceover narration to provide clearer instruction and improved accessibility.

View Video

Splunk

Motadata AIOps - AI-Driven Network Monitoring Software

May 22, 2025 By Motadata In Motadata

What positions Motadata AIOps as a standout among the premier network monitoring tools available in the market? In a crowded market of network monitoring tools, Motadata AIOps distinguishes itself through its intelligent and future-proof approach. The Network Observability tool leverages the power of AI to monitor your network and predict and prevent problems before they occur. This helps you achieve unmatched scalability for your growing network needs, while its open architecture and integration capabilities ensure a unified view of your entire IT environment.

View Video

Motadata

Announcing Icinga for Kubernetes v0.3.0

May 22, 2025 By Eric Lippmann In Icinga

We’re excited to share that Icinga for Kubernetes v0.3.0 is here! This release is packed with features designed to make monitoring your Kubernetes environments smoother, smarter, and more efficient. Let’s take a closer look at what’s new.

Read Post

Icinga

Motadata AIOps | Monitoring Infrastructure Using Monitors & Monitor Settings

May 22, 2025 By Motadata In Motadata

In the world of IT infrastructure management, having a real-time understanding of the health and performance of your systems is essential. Motadata AIOps introduces Monitors, a way to provide comprehensive insights into your IT environment, empowering you to proactively manage and optimize your infrastructure.

View Video

Motadata

Securing SaaS: Auvik Helped This MSP Uncover What's Really Happening on Their Network

May 22, 2025 By Auvik In Auvik

RJ2 Technologies didn't know they needed a SaaS management solution... until Auvik helped them tighten up things like shared passwords and unnecessary license costs - reducing their security risk AND saving $$$!

View Video

Auvik

Cleaning up your old code #coding #programming #webdesign #debugging

May 22, 2025 By TrackJS In TrackJS

Eric talks about the redesign of @Trackjs and what we gained from cleaning up old stuff.

View Video

TrackJS

Real-time detection of BGP blackholing and prefix hijacks

May 22, 2025 By Sheldon Pereira In Catchpoint

Border Gateway Protocol (BGP) remains the backbone of inter-domain routing on the Internet, but its fundamental trust model leaves it vulnerable to misconfigurations, hijacks, and blackholing. When these issues occur, they often go undetected by the impacted networks—until users report degraded performance or service outages. This post walks through a real-world incident in which a legitimate traffic spike led to an upstream provider mistakenly blackholing a critical IP address.

Read Post

Catchpoint

Play the Long Game: Scaling Without Breaking the Bank (or Your Team)

May 22, 2025 By ScienceLogic In ScienceLogic

Contributed by Abel Levya, Sales Engineer at ScienceLogic.

Read Post

ScienceLogic

Understand and manage your Datadog spend with Datadog cost data in Cloud Cost Management

May 22, 2025 By Katherine Broner In Datadog

As your organization scales its Datadog footprint, you want to understand what’s driving cost changes and promote cost awareness. But to take meaningful action, you need more than a monthly bill—you need real-time, contextualized cost data tied to services and teams. Without this visibility, it’s hard to assign ownership, prevent cost overruns, or identify which changes are affecting spend.

Read Post

Datadog

OpenTelemetry vs Micrometer: Here's How to Decide

May 22, 2025 By Anjali Udasi In Last9

In a distributed system, things break in unexpected ways. That’s why observability isn’t optional—it’s how you understand what’s going on under the hood. If you’re comparing tools to instrument your services, OpenTelemetry and Micrometer are two names you’ll run into. Both are used to collect metrics, but they take very different approaches—especially when it comes to flexibility, vendor support, and what you can do with the data.

Read Post

Last9

Track the Right Elasticsearch Metrics Without the Noise

May 22, 2025 By Faiz Shaikh In Last9

Elasticsearch does a lot right—it's fast, scalable, and makes searches feel simple. But when things slow down or break, figuring out what’s going on can be frustrating. Especially if you’re not keeping an eye on the right metrics. This guide covers Elasticsearch metrics that are worth tracking and how they help you keep your cluster healthy without data overload.

Read Post

Last9

Common Issues with Grafana Login and How to Fix Them

May 22, 2025 By Anjali Udasi In Last9

Grafana is a popular choice for monitoring and visualizing metrics, but login issues can quickly block your access and slow you down. Forgot your password? Can’t get into the admin account? Problems after changing authentication settings? These are some of the most common hiccups—and they’re usually easy to fix. This guide covers the frequent login problems you might face and walks you through practical ways to resolve them.

Read Post

Last9

Grafana Assistant Creates a Dashboard "Out of Thin Air" | AI-Powered Observability | Grafana Labs

May 22, 2025 By Grafana In Grafana

AI-powered observability means you can *poof* create dashboards out of thin air.

View Video

Grafana

How to Troubleshoot Faster with LM Logs

May 22, 2025 By LogicMonitor In LogicMonitor

When an alert fires, your goal is clear: fix the problem, fast. But traditional troubleshooting rarely makes that easy. You’re immediately thrown into decision mode. All the while, the clock is ticking. The longer you’re stuck guessing what to do next, the longer your downtime drags on, and the more non-value-added engineering time you burn.

Read Post

LogicMonitor

What is Digital Adoption? Strategies for 2025

May 22, 2025 By Shawn Lazarus In Nexthink

In today’s digital-first workplace, it’s not enough to deploy new software. You need your teams to actually use it. That’s where digital adoption comes in. Digital adoption is the process by which individuals not only learn how to use digital tools but also integrate them into their day-to-day tasks in a way that enhances performance. True digital adoption means employees are using the right features, in the right context, to complete work with minimal friction and maximum confidence.

Read Post

Nexthink

Get Better Visibility Into App Hangs On Apple Devices

May 22, 2025 By Philipp Hofmann In Sentry

App hangs are the worst kind of bug: they don’t crash, they don’t log, and unless you're actively profiling, good luck catching them in the debugger. Maybe the main thread is blocked because it’s decoding a massive image with UIImage(data:). Maybe a background task is holding a lock or waiting on a DispatchGroup that never finishes. Maybe an async flow is stuck waiting on a continuation that never resumes.

Read Post

Sentry

Using the OpenTelemetry Operator to boost your observability

May 22, 2025 By Israel Blancas In Coralogix

If you’ve ever wrangled sidecars or sprinkled instrumentation code just to get basic trace data, you know the setup overhead isn’t always worth the payoff. But what if it was… just easier? That’s where the OpenTelemetry Operator for Kubernetes steps in… and it plays great with Coralogix out of the box!

Read Post

Coralogix

Filter, Mask, and Route Cisco ASA Data in Edge Processor

May 22, 2025 By Splunk In Splunk

This video introduces you to Splunk's new data management capabilities: filter, mask and route Cisco ASA Data in Splunk Edge Processor, empowering businesses to run operations efficiently while meeting compliance requirements, making it a powerful tool for addressing modern business challenges.

View Video

Splunk

Observability 2.0 in the Real World: Lessons from SimpliSafe's Engineering Journey

May 22, 2025 By Grafana In Grafana

In this candid and insightful talk from Observability Sessions Boston, Laban Eilers, a platform engineer at SimpliSafe, takes us on a practical deep dive into the evolution of observability—from the traditional “three pillars” model to the emerging promise of Observability 2.0.

View Video

Grafana

Connect to Splunk Observability Cloud (with Voiceover)

May 22, 2025 By Splunk In Splunk

Build an Ingest Processor pipeline that ingests data matching partition conditions and sends it to your chosen destination - now enhanced with voiceover narration to provide clearer instruction and improved accessibility.

View Video

Splunk

Deploy and monitor your Ingest Processor pipeline activity (with Voiceover)

May 22, 2025 By Splunk In Splunk

Save and apply your complete pipeline to begin ingesting and processing data, then verify its activity on the Ingest Processor page in Splunk Cloud Platform.

View Video

Splunk

Read more about Deploy and monitor your Ingest Processor pipeline activity (with Voiceover)

Set Up Tracing for a Ruby on Rails Application in AppSignal

May 21, 2025 By Daniel Amah In AppSignal

In this guide, we'll harness AppSignal to detect, diagnose, and remove performance bottlenecks and employ proper tracing in a Ruby on Rails application. From setting up tracing to capturing errors and logging, we’ve got you covered. We'll ensure our application runs smoother than ever, even under the heaviest loads! But first, let's quickly touch on how to define tracing and its benefits.

Read Post

AppSignal

Read more about Set Up Tracing for a Ruby on Rails Application in AppSignal

Enhancing workflow efficiency with Elasticsearch and Red Hat OpenShift AI

May 21, 2025 By Amy Ghate In Elastic

Elastic collaborates with Red Hat on the validated pattern to enhance financial analyst workflows with RAG-powered search. We’re excited to share that Elastic and Red Hat have partnered to create validated patterns that integrate Elasticsearch’s generative AI (GenAI) and vector search capabilities with Red Hat OpenShift AI. This integration can run on accelerated hardware on-prem or in IBM Cloud to power retrieval augmented generation (RAG) solutions.

Read Post

Elastic

Read more about Enhancing workflow efficiency with Elasticsearch and Red Hat OpenShift AI

Sneak Peek: MetricFire's New Logging Tool for Scalable, Open-Source Observability

May 21, 2025 By MetricFire In MetricFire

Take a first look at MetricFire’s brand-new logging tool — designed to simplify log ingestion, storage, and visualization using open-source components like Loki, Python, Telegraf and Grok. Collect logs, search across services, and correlate them with your metrics — all inside your existing Hosted Graphite environment. Whether you're an SRE, DevOps engineer, or running logs on a budget, this sneak peek reveals how MetricFire is evolving toward full observability.

View Video

MetricFire

Read more about Sneak Peek: MetricFire's New Logging Tool for Scalable, Open-Source Observability

What is Amazon Inspector? Monitoring and Alerting with Amazon Inspector

May 21, 2025 By Babu Sundaram In eG Innovations

Amazon Inspector is an automated security assessment service that scans AWS workloads for vulnerabilities, misconfigurations, unintended network exposure and compliance risks, helping organizations enhance cloud security, detect threats, and meet regulatory requirements (such as ISO/IEC 27001, HIPAA, NIS 2 and SOC 2 Type 2) in real time. Amazon Inspector discovers and scans Amazon EC2 instances, container images in Amazon ECR (Elastic Container Registry), and Lambda functions.

Read Post

eG Innovations

Read more about What is Amazon Inspector? Monitoring and Alerting with Amazon Inspector

Is There an Existential Crisis in Network Observability?

May 21, 2025 By Yann Guernion In Broadcom

We've all been there. Users report that applications are slow, calls are dropping, or that "the internet is broken." Yet, a glance at the network dashboards shows a sea of green—latency looks acceptable, packet loss is minimal, and bandwidth seems fine. This common scenario highlights a fundamental challenge in network observability: the perceived disconnect between the technical measurements we gather and the actual experience of the people using our digital services.

Read Post

Broadcom

Read more about Is There an Existential Crisis in Network Observability?

Early Warning Signals now available in Slack

May 21, 2025 By Colin Bartlett In StatusGator

We’re excited to announce that Early Warning Signals are now available in Slack! Early Warning Signals help you detect service disruptions before they’re officially reported. Now, these critical notifications will show up directly in your Slack workspace, keeping your team in the loop without having to check your email.

Read Post

StatusGator

Read more about Early Warning Signals now available in Slack

Top 13 Fluentd Alternatives 2025

May 21, 2025 By Pavithra Parthiban In Atatus

Fluentd is popular for its flexibility and extensive plugin support, making it easy to collect, process, and forward logs from many different sources. However, as environments scale and observability needs evolve, teams often seek alternatives that offer lower resource usage, easier configuration, broader telemetry support, or tighter integration with their existing toolchains.

Read Post

Atatus

Read more about Top 13 Fluentd Alternatives 2025

Grafana Pyroscope: Pyroscope GitHub Integration (Community Call May 2025)

May 21, 2025 By Grafana In Grafana

Are you a Go programmer? Come join Christian Simon from the Grafana Pyroscope team to learn how you can use the Grafana Pyroscope GitHub integration to understand line-by-line resource usage in Go programs.

View Video

Grafana

Read more about Grafana Pyroscope: Pyroscope GitHub Integration (Community Call May 2025)

Logz.io AI Agents: Transforming Observability Through Intelligent Automation

May 21, 2025 By Jade Lassery In logz.io

Let’s be honest. AI features can sound cool on paper, but too many tools overpromise and underdeliver. At Logz.io, we didn’t want to build “yet another AI chatbot.” We wanted to create something our engineers and yours would actually use when incidents hit, logs explode, or someone asks, “What just happened to production?” Here’s how our AI Agent evolved from a basic chat interface to an incident-resolving, log-analyzing, doc-digging, context-aware assistant.

Read Post

logz.io

Read more about Logz.io AI Agents: Transforming Observability Through Intelligent Automation

A New Era of Efficiency: Leveraging AI, Data, and Modernization to Improve Public Services

May 21, 2025 By Datadog In Datadog

Greg Reeder from Datadog talks with Martha Dorris, a leader in government customer experience, about how agencies can drive efficiency using AI, real-time data, and observability. They highlight CX wins at the State Department, IRS, and CBP—showing how smarter monitoring and design improve services, reduce costs, and strengthen citizen trust.

View Video

Datadog

Read more about A New Era of Efficiency: Leveraging AI, Data, and Modernization to Improve Public Services

Observability vs Monitoring: Enhancing, Not Replacing

May 21, 2025 By Jan Schuppik In Icinga

In the dynamic world of IT operations, a common misconception has emerged: Observability vs Monitoring is often framed as a battle where one replaces the other. At Icinga, where open-source monitoring is our expertise, we aim to clarify this misunderstanding. Observability doesn’t supplant monitoring—it complements and enhances it. The term “Observability” has become a buzzword in the tech industry, often touted as the modern solution to outdated, static monitoring practices.

Read Post

Icinga

Read more about Observability vs Monitoring: Enhancing, Not Replacing

Digital Noise Cancellation: What Gigamon Can Teach Us About Listening to the Right Signals

May 21, 2025 By Teneo In Teneo

When I’m on the train to work in the morning, I always reach for my noise-cancelling headphones. Not because the world is too loud, but because I want to hear what matters. It’s a small act of filtering signal from noise. And this got me thinking that, increasingly, that same mindset is becoming essential in how we design and manage digital infrastructure. There’s no shortage of data. In fact, there’s too much of it.

Read Post

Teneo

Read more about Digital Noise Cancellation: What Gigamon Can Teach Us About Listening to the Right Signals

5 Critical Steps in the Effective Change Management Process. Guide + Best Practices

May 21, 2025 By Staff Contributor In SolarWinds

Change is constant, but without a structured approach, it can lead to confusion, resistance, and costly disruptions. A well-planned change management process ensures transitions happen smoothly, minimizing risks while keeping teams aligned and operations running efficiently. Whether adopting new technology, restructuring teams, or refining business strategies, organizations that manage change effectively turn challenges into opportunities for growth.

Read Post

SolarWinds

Read more about 5 Critical Steps in the Effective Change Management Process. Guide + Best Practices

Celebrating 14K Stars on GitHub: Spring Update

May 21, 2025 By Denys Holius In VictoriaMetrics

Seeing that VictoriaMetrics products are this popular with engineers worldwide is fantastic: just a little over a year ago, we hit 10K stars, and with the adoption of VictoriaLogs, the star count has now gone beyond 14K. We don’t take these GitHub Stars milestones for granted: it’s amazing to see these stats grow organically thanks to the community of users out there who use our products. Thank you so much!

Read Post

VictoriaMetrics

Read more about Celebrating 14K Stars on GitHub: Spring Update

Grafana Cloud updates: New observability as code tools, Grafana Drilldown enhancements, and more

May 21, 2025 By Kristin Knapp In Grafana

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack: Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics. With GrafanaCON 2025 — and the release of Grafana 12 — earlier this month, there are a ton of Grafana Cloud updates to share.

Read Post

Grafana

Read more about Grafana Cloud updates: New observability as code tools, Grafana Drilldown enhancements, and more

Supercharge Telemetry Pipelines: Introducing Sources and Destinations in Cribl Packs

May 21, 2025 By Giovanni Mola In Cribl

Cribl Packs have always provided a powerful way to package and share configurations across Cribl Stream environments. From pipelines to lookups, knowledge objects to functions—Packs make telemetry pipelines simple and portable. Now, we’re excited to announce a game-changing expansion: Sources and Destinations can now be included in Cribl Packs!

Read Post

Cribl

Read more about Supercharge Telemetry Pipelines: Introducing Sources and Destinations in Cribl Packs

Application Performance Monitoring Guide: Strategies, Best Practices, and Tools

May 21, 2025 By Staff Contributor In SolarWinds

With the introduction of cloud services and microservices, applications have become more complex, with more layers and a distributed architecture. While microservices clearly offer speed, they also make things harder for the developers and operations teams. These teams need to plan for the reliable and efficient performance of such applications. To combat these challenges, application performance monitoring (APM) has surfaced as an indispensable discipline.

Read Post

SolarWinds

Read more about Application Performance Monitoring Guide: Strategies, Best Practices, and Tools

NEW Cribl Packs Are Here: Portable Telemetry Pipelines Made Easy

May 21, 2025 By Cribl In Cribl

Now with support for sources and destinations, Cribl Packs make it simple to build and share containerized telemetry pipelines across your environments.

View Video

Cribl

Read more about NEW Cribl Packs Are Here: Portable Telemetry Pipelines Made Easy

Beyond Testing: Ensuring Uptime with Playwright

May 21, 2025 By Checkly In Checkly

Join Nočnica and Filip for best practices on using your Playwright tests across your SDLC.

View Video

Checkly

Read more about Beyond Testing: Ensuring Uptime with Playwright

LLM Observability with Honeycomb.io

May 21, 2025 By Honeycomb In Honeycomb

When your software integrates with Generative AI, you need great observability. You need to see everything about the interaction with the LLM. You also need to see everything around it! That's application observability, with distributed tracing.

View Video

Honeycomb

Read more about LLM Observability with Honeycomb.io

Stockholm Observability journeys panel with PostNord, AWS, and Kanari Sweden

May 21, 2025 By Grafana In Grafana

A dynamic discussion with Grafana experts and local observability leaders sharing successes, setbacks, and next steps in their observability journeys.

View Video

Grafana

Read more about Stockholm Observability journeys panel with PostNord, AWS, and Kanari Sweden

Mastering Heroku Monitoring in 2025: Best Practices for Optimal Application Performance

May 21, 2025 By Elliot Langston In MetricFire

In today's fast-paced digital landscape, ensuring the reliability and performance of your applications is paramount. Heroku, a cloud-based Platform-as-a-Service (PaaS), simplifies application deployment and scaling. However, to fully leverage Heroku's capabilities, effective monitoring is essential. This guide delves into best practices for monitoring Heroku applications, providing context, practical steps, and unique insights to enhance your observability strategy.

Read Post

MetricFire

Read more about Mastering Heroku Monitoring in 2025: Best Practices for Optimal Application Performance

Getting Started with Loki for Log Management

May 21, 2025 By Anjali Udasi In Last9

Logs are essential, but managing them can be tedious. They quickly consume storage, slow down your searches, and make troubleshooting feel like an endless chore. Loki monitoring helps simplify this process, offering a more efficient approach to logging that developers can appreciate.

Read Post

Last9

Read more about Getting Started with Loki for Log Management

.NET Logging with Serilog and OpenTelemetry

May 21, 2025 By Faiz Shaikh In Last9

Debugging modern .NET apps isn’t as simple as scanning logs anymore. With services spread out and systems growing more complex, it's easy to miss the bigger picture. Serilog gives you clean, structured logs. OpenTelemetry brings in traces and metrics to connect the dots. This guide covers how to wire up Serilog with OpenTelemetry, send logs to traces, and build an observability setup that helps you troubleshoot, without digging through disconnected logs for hours.

Read Post

Last9

Read more about .NET Logging with Serilog and OpenTelemetry

Hidden Risks in Linux Power Monitoring - And How to Fix Them

May 20, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

In today's enterprise IT landscape, Linux on IBM Power Systems plays a crucial role in powering mission-critical workloads. Industries such as finance, healthcare, telecommunications, and manufacturing rely on IBM Power's scalability, performance, and security to handle large-scale data processing, AI-driven analytics, and high-performance computing. As these environments continue to evolve, ensuring peak system performance and reliability is more important than ever.

Read Post

NiCE IT Mgmt

Read more about Hidden Risks in Linux Power Monitoring - And How to Fix Them

How we use RUM to make design decisions that enhance user experience

May 20, 2025 By Candace Shamieh In Datadog

Before we started using Datadog Real User Monitoring (RUM), we relied on frontend logging to gather data about the user experience. Logs gave us some helpful information about exceptions and errors but didn't provide any insight into issues directly related to the user’s perspective.

Read Post

Datadog

Read more about How we use RUM to make design decisions that enhance user experience

10 Best Compliance Monitoring Tools for 2025

May 20, 2025 By ChangeTower In ChangeTower

In 2025, the role of compliance officers and risk managers has never been more complex—or more critical. New regulatory requirements, AI-generated content, and increasingly sophisticated cyber threats have dramatically raised the stakes. Here is just a small set of the pressing challenges facing compliance monitoring officials this year.

Read Post

ChangeTower

Read more about 10 Best Compliance Monitoring Tools for 2025

Monitoring Critical Experiences

May 20, 2025 By Sentry In Sentry

Try Sentry for free: https://sentry.io
Docs: https://docs.sentry.io

View Video

Sentry

Monitoring

Read more about Monitoring Critical Experiences

Finding the right Cisco Prime replacement: A guide to seamless network configuration management transition

May 20, 2025 By akash.mj@zohocorp.com In ManageEngine

With Cisco Prime Infrastructure approaching its EOL and EOS in September 2025, network administrators are at a crossroads. The transition away from this long-standing network configuration management tool necessitates a strategic evaluation of alternatives that align with organizational needs and budgets.

Read Post

ManageEngine

Read more about Finding the right Cisco Prime replacement: A guide to seamless network configuration management transition

OpenTelemetry with Prometheus: better integration through resource attribute promotion

May 20, 2025 By Cyrille Le Clerc In Grafana

With the 3.0 release, Prometheus firmly established itself as the leading metrics database for OpenTelemetry. A lot of work has gone into integrating the two open source projects, including a major Prometheus enhancement we’re really excited about: resource attribute promotion.

Read Post

Grafana

Read more about OpenTelemetry with Prometheus: better integration through resource attribute promotion

Customize your incident response with new features in Grafana Cloud IRM

May 20, 2025 By Joey Orlando In Grafana

No matter where or how you work, we all have the same goal when an incident occurs: to get it resolved effectively and efficiently—and as quickly as possible. However, the way we achieve that goal isn’t always the same. We understand that different organizations operate differently, so you need flexibility from your IRM tooling.

Read Post

Grafana

Read more about Customize your incident response with new features in Grafana Cloud IRM

Using Website Change Monitoring Software in the Age of AI Content

May 20, 2025 By ChangeTower In ChangeTower

The rise of artificial intelligence has revolutionized the way businesses create and manage digital content. As of 2024, over 45% of marketing teams are actively using generative AI tools like ChatGPT, Claude, and Jasper to create website copy, blog posts, product descriptions, and more (Salesforce). Meanwhile, Gartner predicts that by 2026, 80% of content on the internet will be AI-generated.

Read Post

ChangeTower

Read more about Using Website Change Monitoring Software in the Age of AI Content

How to Monitor Website Performance Smarter and Faster

May 20, 2025 By ScienceLogic In ScienceLogic

Is your website really performing the way your users expect? In today’s digital world, even small slowdowns can mean lost revenue and damaged brand trust. That’s where ScienceLogic comes in. This video shows how ScienceLogic’s website monitoring gives you real-time and historical visibility across regions and infrastructure. From synthetic transactions to full-stack observability, you’ll see how to spot performance issues early, validate autoscaling, and ensure fast, reliable digital experiences.

View Video

ScienceLogic

Read more about How to Monitor Website Performance Smarter and Faster

Breaking the Cycle: How Intelligent Automation Frees IT to Drive Innovation

May 20, 2025 By ScienceLogic In ScienceLogic

For decades, enterprise IT teams have operated in a state of controlled chaos. Pressured to keep digital lights on, these teams have spent far too much time buried in logs, swatting away alerts, and fighting fires one incident at a time. The familiar mantra—“do more with less”—has translated into a culture of reactive operations, where innovation takes a backseat to survival.

Read Post

ScienceLogic

Read more about Breaking the Cycle: How Intelligent Automation Frees IT to Drive Innovation

Optimize cross-platform mobile apps with Datadog RUM and Kotlin Multiplatform support

May 20, 2025 By Jessica Manheimer In Datadog

Mobile developers are increasingly adopting Kotlin Multiplatform to share business logic across iOS and Android. While Kotlin Multiplatform reduces duplication of code-writing efforts, it also introduces blind spots. Developers often lack real-time visibility into how shared code performs across platforms, making it harder to troubleshoot issues and monitor user experience.

Read Post

Datadog

Read more about Optimize cross-platform mobile apps with Datadog RUM and Kotlin Multiplatform support

Introducing the Datadog Developer Hub

May 20, 2025 By Arielle Mella In Datadog

Finding the right integrations, libraries, and open source tooling to extend a product has long been a challenge for developers. While Datadog has a vast offering of monitoring and observability solutions, many teams need to customize their setup in some way—whether by extending the Datadog Agent, integrating with third-party services, or using SDKs to interact with the Datadog API.

Read Post

Datadog

Read more about Introducing the Datadog Developer Hub

Monitoring AI Proxies to optimize performance and costs

May 20, 2025 By Barry Eom In Datadog

Businesses deploying LLM workloads increasingly rely on LLM proxies (also known as LLM gateways) to simplify model integration and governance. Proxies provide a centralized interface across LLM providers, govern model access and usage, and apply compliance safeguards for smoother operations and reduced complexity—making LLM usage more consistent and scalable.

Read Post

Datadog

Read more about Monitoring AI Proxies to optimize performance and costs

Turning Network Telemetry into Network Intelligence

May 20, 2025 By Phil Gervasi In Kentik

By applying data engineering and machine learning to raw network telemetry, it’s possible to surface insights that would otherwise go unnoticed. Learn how this approach helps teams detect anomalies in real time, forecast capacity needs, and automate responses across complex, multi-domain environments.

Read Post

Kentik

Read more about Turning Network Telemetry into Network Intelligence

Hands-On With Sentry Structured Logs

May 20, 2025 By Sentry In Sentry

Sentry is launching structured logs! You can use structured logs to capture those random events that happen across your application that don't always translate down to errors or crashes.

View Video

Sentry

Read more about Hands-On With Sentry Structured Logs

Hybrid Cloud Monitoring: A Comprehensive Guide to Strategies, Best Practices, and Tools

May 20, 2025 By Staff Contributor In SolarWinds

Modern infrastructures are no longer confined to on-premises servers alone. Instead, they span cloud environments, containers, microservices, and globally distributed systems. This landscape, known as a hybrid cloud environment, has become the new norm for organizations, primarily because it offers the scalability of the cloud and ownership over specific elements afforded by an on-premises setup.

Read Post

SolarWinds

Read more about Hybrid Cloud Monitoring: A Comprehensive Guide to Strategies, Best Practices, and Tools

What is AIOps? A clear, practical guide for 2025

May 20, 2025 By LogicMonitor In LogicMonitor

IT operations aren’t broken; they’re overloaded. Every alert, every outage, every ticket is a symptom of scale. Cloud sprawl, hybrid infrastructure, and tool overload have turned ops into a constant scramble. AIOps has been in the conversation for years, mostly as potential.

Read Post

LogicMonitor

Read more about What is AIOps? A clear, practical guide for 2025

Want AI to be better at debugging? It's all about context

May 20, 2025 By Milin Desai In Sentry

More code is being shipped today than ever before, accelerated by AI-powered code gen tools. We’re in a golden age for builders. But here’s the thing: software still breaks in production. According to a recent study by Microsoft, AI models struggle to debug software. It’s because most of these code gen tools lack the one thing every good developer relies on: context. To debug anything, you need context. Having AI tools doesn't change that.

Read Post

Sentry

Read more about Want AI to be better at debugging? It's all about context

Rollbar and ilert: Real-time error monitoring meets smart incident response

May 20, 2025 By Rollbar In Rollbar

We’re excited to share that Rollbar is now part of the ilert integration catalog! This new technical partnership allows software teams to detect application errors in real time with Rollbar and instantly respond using ilert’s powerful alerting and incident management features. What is Rollbar? Rollbar is a comprehensive, real-time error monitoring and debugging platform designed to help development teams detect, diagnose, and resolve issues faster—before they impact users.

Read Post

Rollbar

Read more about Rollbar and ilert: Real-time error monitoring meets smart incident response

Forecasting with InfluxDB 3 and HuggingFace

May 20, 2025 By Anais Dotis-Georgiou In InfluxData

Machine learning models must do more than make accurate predictions; they also need to adapt as the world around them changes. In real-world systems, data distributions shift due to seasonality, equipment wear, user behavior changes, or other external forces. If your models can’t keep up, the result is poor predictions. This can lead to outages, inefficiencies, or missed opportunities. That’s why forecasting systems need to be monitored and resilient, not just accurate.

Read Post

InfluxData

Read more about Forecasting with InfluxDB 3 and HuggingFace

Logs in Sentry: Now in Open Beta

May 20, 2025 By Dhrumil Parekh In Sentry

You’re looking at an error in Sentry—a failed payment in your Flask backend or an unexpected null in your Node API. You’ve got the stack trace. The request details. Even the full trace. What you don’t have: the logs your app emitted right before everything went sideways. With Sentry Logs (now in open beta), you can send application logs straight to Sentry and see them automatically connected to the errors and traces you already use.

Read Post

Sentry

Read more about Logs in Sentry: Now in Open Beta

Bringing Custom Crash Responses to Unreal Engine

May 20, 2025 By Bobby Galli In BugSplat

Show a customized, crash-specific message when your game crashes. Locked in an intense battle, hanging on for dear life, on the verge of nigh-impossible victory and then… boom! Positively, absolutely, unquestionably, no one wants a crash to interrupt their favorite game. Crashes are a frustrating yet inevitable part of gaming, and the only thing worse than being on the receiving end of a crash is being on the receiving end of the same crash repeatedly.

Read Post

BugSplat

Read more about Bringing Custom Crash Responses to Unreal Engine

From Cost Centre to Compounding Advantage

May 20, 2025 By Germain UX Team In Germain UX

Most teams still treat bugs like little fires to put out. A ticket gets logged. Someone investigates. A fix gets pushed. Then it’s onto the next one. But here’s the thing nobody tells you: Every bug is a chance to get smarter. And in 2025, the best teams aren’t the ones logging the fewest bugs. They’re the ones learning the most from every bug they fix.

Read Post

Germain UX

Read more about From Cost Centre to Compounding Advantage

Top 11 Application Logging Tools for DevOps Engineers in 2025

May 20, 2025 By Faiz Shaikh In Last9

When something breaks in production, logs are usually where you start. They help you figure out what happened, where, and why. But with microservices architecture, logging isn't simple anymore. In a traditional monolithic application, logs live in one place. With microservices, they're scattered across multiple services, containers, and sometimes even data centers. What used to be a simple grep command now feels like solving a mystery without most of the clues.

Read Post

Last9

Read more about Top 11 Application Logging Tools for DevOps Engineers in 2025

Grafana Tempo vs Jaeger: Key Features, Differences, and When to Use Each

May 20, 2025 By Anjali Udasi In Last9

Both Grafana Tempo and Jaeger are distributed tracing tools designed for modern microservice architectures. Jaeger, released as an open-source project by Uber in 2015, has matured into a graduated CNCF project. Tempo, announced by Grafana Labs in October 2020, is a newer entrant focused on high-volume tracing with a unique storage architecture. Before comparing these tools in detail, let's quickly review what distributed tracing is and why it matters.

Read Post

Last9

Read more about Grafana Tempo vs Jaeger: Key Features, Differences, and When to Use Each

Is SCOM dead? Not even close - It has just evolved

May 20, 2025 By Jonas Lenntun In OpsLogix

System Center Operations Manager (SCOM) is far from dead. While a growing number of monitoring alternatives have emerged in recent years, SCOM in 2025 remains a critical tool, especially for organizations running hybrid environments, thanks to its stateful, object-oriented monitoring model and a rapidly evolving ecosystem of modern Management Packs (MPs).

Read Post

OpsLogix

Read more about Is SCOM dead? Not even close - It has just evolved

IT Performance Challenges: Why They Persist-and How to Solve Them for Good

May 19, 2025 By Kristy Slimmer In Galileo

IT Ops Problem Solver Series – Part 2: This article is a summary of a full report in our IT Ops Problem Solver Series. In this series, we’ll tackle the biggest problems facing IT Ops leaders and explore how some of Galileo’s clients are addressing them. In this part of the series, we delve into IT performance challenges and how to address them effectively.

Read Post

Galileo

Read more about IT Performance Challenges: Why They Persist-and How to Solve Them for Good

Introducing Native Mobile Support in Honeycomb for Frontend Observability

May 19, 2025 By Elsie Phillips In Honeycomb

You shipped your latest release. You tested it on emulators, QA devices, and the latest OS versions. But now it’s live and running on thousands or millions of mobile devices, across a jungle of screen sizes, hardware specs, OS versions, and network conditions. A user reports a crash on an old Samsung device over 3G. Someone else complains the app feels “sluggish” after updating. You dig through logs. Rebuild test cases. Ping the backend team. Try to reproduce. Yet, still no answers.

Read Post

Honeycomb

Read more about Introducing Native Mobile Support in Honeycomb for Frontend Observability

Today, users expect lightning-fast websites, and they're not waiting around.

May 19, 2025 By Catchpoint In Catchpoint

Nearly half of users will leave if your page takes more than 3 seconds to load. Every extra second will cost you potential customers and revenue. Howard Beader goes into the stats and explains why web performance optimization is more critical than ever.

View Video

Catchpoint

Monitoring

Read more about Today, users expect lightning-fast websites, and they're not waiting around.

Learning from LFX Mentorship @ CNCF - Jaeger

May 19, 2025 By Hariom Gupta In JaegerTracing

Starting this journey was both exciting and fulfilling — and now, here I am at the finish line, having successfully completed the LFX Mentorship Program and reflecting on the experience through this blog. The past three months have been incredible — surpassing my expectations in so many ways.

Read Post

JaegerTracing

Read more about Learning from LFX Mentorship @ CNCF - Jaeger

SigNoz Launch Week 4.0 - OpenTelemetry Powered Innovations That Redefine Observability

May 19, 2025 By Anushka Karmakar In SigNoz

OpenTelemetry is rapidly becoming the backbone of modern observability, but true innovation happens when you build directly on its latest capabilities. For Launch Week 4.0, we’re excited to showcase five powerful features, each crafted to help you get more value from your telemetry, make debugging faster, and deliver a unified observability experience. Here’s a quick look at what’s new, why it matters, and how SigNoz is pushing the boundaries of what’s possible with OTel.

Read Post

SigNoz

Read more about SigNoz Launch Week 4.0 - OpenTelemetry Powered Innovations That Redefine Observability

How to Handle Logging in Microservices Architectures

May 19, 2025 By Anjali Udasi In Last9

Effective microservices logging requires standardized formats, centralized storage, correlation IDs, and strong security measures. This guide walks you through setup strategies, best practices, and tool recommendations that won't break your budget.
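
As a rough illustration of two of the practices named above (a standardized format and correlation IDs), here is a minimal sketch using only the Python standard library. The field names, the `build_logger` and `handle_request` helpers, and the service name are assumptions made for this example; they are not taken from the linked guide.

```python
import contextvars
import json
import logging
import uuid
from typing import Optional

# Holds the correlation ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object with a fixed set of keys."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "service": record.name,
            "correlation_id": correlation_id.get(),
            "message": record.getMessage(),
        })


def build_logger(service: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(service)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, incoming_id: Optional[str] = None) -> str:
    """Reuse the caller's correlation ID if present, otherwise mint one."""
    cid = incoming_id or str(uuid.uuid4())
    token = correlation_id.set(cid)
    try:
        logger.info("order received")
        logger.info("order stored")
    finally:
        # Restore the previous value so IDs never leak across requests.
        correlation_id.reset(token)
    return cid
```

Because every service emits the same keys, a central log store can filter on `correlation_id` to reassemble one request's path across services.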

Read Post

Last9

Read more about How to Handle Logging in Microservices Architectures

Evaluating Synthetic Monitoring Platforms: What to Look for in 2025

May 19, 2025 By Alexandr Bandurchin In Uptrace

Synthetic monitoring simulates user interactions with applications to proactively identify performance issues before they impact real users. Modern distributed systems require sophisticated monitoring capabilities to effectively test microservices, APIs, and complex user journeys across diverse environments. This article provides a framework to evaluate synthetic monitoring platforms in 2025.

Read Post

Uptrace

Read more about Evaluating Synthetic Monitoring Platforms: What to Look for in 2025

Azure Monitor offers Grafana dashboards natively for immediate real time operational monitoring

May 19, 2025 By Ash Mazhari In Grafana

The Grafanaverse just got a little bit bigger. Today at its annual Build conference, Microsoft introduced Azure Monitor dashboards with Grafana, a new service that provides Azure users with Grafana dashboards natively integrated in the Azure Portal at no additional cost and with little administrative overhead required.

Read Post

Grafana

Read more about Azure Monitor offers Grafana dashboards natively for immediate real time operational monitoring

Monitoring Node.js: Key Metrics You Should Track

May 19, 2025 By Faiz Shaikh In Last9

Effective Node.js monitoring requires tracking runtime metrics (memory, CPU), application metrics (request rates, response times), and business metrics (user actions, conversion rates). This guide covers what to track, how to collect it, and how to set up meaningful alerts.

Read Post

Last9

Read more about Monitoring Node.js: Key Metrics You Should Track

Guide to Monitoring Apache Flink Using OpenTelemetry and MetricFire

May 19, 2025 By Benjamin Pitts In MetricFire

Apache Flink is an open-source, distributed stream processing engine built for real-time, high-throughput data pipelines. It excels at processing continuous data streams with low latency, making it a great fit for use cases like fraud detection, log analytics, real-time dashboards, personalized recommendations, and IoT telemetry.

Read Post

MetricFire

Read more about Guide to Monitoring Apache Flink Using OpenTelemetry and MetricFire

AI's Unrealized Potential: Honeycomb and DORA on Smarter, More Reliable Development with LLMs

May 19, 2025 By Honeycomb In Honeycomb

Charity Majors, CTO and Co-founder at Honeycomb, and Phillip Carter, Principal Product Manager at Honeycomb, recently hosted a webinar with DORA's Nathen Harvey on AI's unrealized potential. As part of this, we created a 3-minute highlight reel of the webinar that you can watch.

View Video

Honeycomb

Read more about AI's Unrealized Potential: Honeycomb and DORA on Smarter, More Reliable Development with LLMs

Why a No-Index Observability Architecture is Essential

May 19, 2025 By Andre Scott In Coralogix

When was the last time you asked about the architecture behind your observability provider? For most IT professionals, whether in development, operations, or security, it’s not a question that naturally comes up. Yet this architectural detail could be the difference between insight at scale and runaway costs. People are drawn to the features, the shiny things. They promise to unlock insight, drive faster response times, and tighten security.

Read Post

Coralogix

Read more about Why a No-Index Observability Architecture is Essential

Getting Started with SolarWinds Orion Dashboards

May 19, 2025 By Sameer Mhaisekar In Squared Up

SolarWinds is a popular IT infrastructure monitoring tool deployed on-prem, best known for its network and server monitoring capabilities. While it offers rich telemetry, it’s easy to miss the bigger picture. SquaredUp turns this complex monitoring data into clear, shareable dashboards that make it easier to spot trends, catch issues early, and keep everyone on the same page.

Read Post

Squared Up

Read more about Getting Started with SolarWinds Orion Dashboards

Create a Production Status Page in a Few Minutes

May 17, 2025 By Checkly In Checkly

Create a status page with Checkly: https://www.checklyhq.com/docs/status-pages/

View Video

Checkly

Read more about Create a Production Status Page in a Few Minutes

Tracing Funnels - Define funnels between spans | SigNoz Launch Week 4.0 Day 5

May 17, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Build funnels directly on your traces and get instant answers to questions like: What fraction of spans made it from event A to event B? Between which spans are most requests failing? What is the latency between key spans? Traditional observability tools let you inspect traces and spans, but they can’t aggregate or analyze how requests flow across multiple services or stages in your system. In asynchronous, distributed architectures, the root span rarely tells the full story, and there’s no way to measure conversion, drop-off, or latency between arbitrary steps across all traces.
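
To make the funnel questions above concrete, the sketch below computes a two-step conversion rate and the median gap between steps over a flat list of spans. It is an illustration only, not how SigNoz implements the feature; the span keys (`trace_id`, `name`, `start_ms`) and the per-trace counting rule are assumptions.

```python
from collections import defaultdict
from statistics import median


def funnel(spans, step_a, step_b):
    """Return (conversion, median_gap_ms) from step_a to step_b.

    spans: iterable of dicts with the hypothetical keys
    "trace_id", "name", and "start_ms".
    """
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span["trace_id"]].append(span)

    entered = 0  # traces that contain step_a at all
    gaps = []    # time from first step_a to the next step_b, per trace
    for trace_spans in by_trace.values():
        a_starts = [s["start_ms"] for s in trace_spans if s["name"] == step_a]
        if not a_starts:
            continue
        entered += 1
        first_a = min(a_starts)
        b_starts = [s["start_ms"] for s in trace_spans
                    if s["name"] == step_b and s["start_ms"] >= first_a]
        if b_starts:
            gaps.append(min(b_starts) - first_a)

    conversion = len(gaps) / entered if entered else 0.0
    return conversion, (median(gaps) if gaps else None)
```

Traces that never reach `step_a` are ignored, and traces that reach `step_a` but not `step_b` count as drop-off.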

View Video

SigNoz

Read more about Tracing Funnels - Define funnels between spans | SigNoz Launch Week 4.0 Day 5

Future-Proof Your MariaDB-Based Services

May 16, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

We’re excited to announce the release of the NiCE MariaDB on Linux Management Pack, designed to deliver advanced monitoring and performance insights for organizations running MariaDB on Linux infrastructure. As MariaDB continues to power business-critical applications across industries, visibility into its performance, availability, and health becomes essential.

Read Post

NiCE IT Mgmt

Read more about Future-Proof Your MariaDB-Based Services

SOC 2 Type 1 Compliance: Netdata is committed to Security and Trust

May 16, 2025 By Shyam Sreevalsan In netdata

We are pleased to announce that Netdata has successfully achieved SOC 2 Type 1 attestation! Following an independent examination performed by AssuranceLab CPAs LLC, the report confirms that—as of April 25, 2025—the design of Netdata’s controls meets the Security, Availability, and Confidentiality Trust Services Criteria defined by the AICPA. At Netdata, the security and integrity of the monitoring data our users entrust to us are paramount.

Read Post

netdata

Read more about SOC 2 Type 1 Compliance: Netdata is committed to Security and Trust

Synthetic Testing Examples: User Flow Testing, APIs Validation, Custom Metrics, Log Ingestion, and More

May 16, 2025 By Jeremy Hicks In Splunk

Starting from scratch with synthetic testing of your web properties and APIs can be difficult. Questions like “what should we be testing?” will very quickly become exercises in figuring out “how can we actually do that?” which may involve sifting through various elements of the DOM or JSON responses. But there are shortcuts to synthetic testing mastery!

Read Post

Splunk

Read more about Synthetic Testing Examples: User Flow Testing, APIs Validation, Custom Metrics, Log Ingestion, and More

Create a status page for your production service in 5 minutes

May 16, 2025 By Nočnica Mellifera In Checkly

“When are we going to tell users about this?” By the time your incident response team asks this question, it’s already too late. During an outage, communicating about downtime with your user base has three main drawbacks. Instead, it’s better to create a status page that automatically shares the status of all your services in a format that users can easily understand. You’ll build trust with your users as you proactively share service status, lessening the perceived impact of incidents.

Read Post

Checkly

Read more about Create a status page for your production service in 5 minutes

Overview of Check Types

May 16, 2025 By Uptime Website Monitoring In uptime

This video is an overview of all the check types we have on offer at Uptime.com.

View Video

uptime

Monitoring

Read more about Overview of Check Types

Monitoring Oracle Cloud Load Balancer: Unlock peak performance with Applications Manager

May 16, 2025 By shruthi.rm@zohocorp.com In ManageEngine

Imagine you’re running a popular online learning platform that experiences a surge in traffic during peak hours, right before exams. Students worldwide are logging in simultaneously, watching videos, submitting assignments, and taking tests. If your Oracle Cloud Load Balancer isn’t distributing traffic efficiently or back-end servers are struggling to keep up, students could face slow loading times or service outages.

Read Post

ManageEngine

Read more about Monitoring Oracle Cloud Load Balancer: Unlock peak performance with Applications Manager

A Mindset Shift: Making Observability Integral to DevOps Practices: Datev & OpenTelemetry | Grafana

May 16, 2025 By Grafana In Grafana

In the evolving landscape of DevOps, observability is no longer optional—it’s a fundamental pillar of success. During this session, Gunter from Datev explores the critical mindset shift required to make observability an integral part of DevOps practices.

View Video

Grafana

Read more about A Mindset Shift: Making Observability Integral to DevOps Practices: Datev & OpenTelemetry | Grafana

The Control Plane Highway: Networking's Hidden Infrastructure

May 16, 2025 By Sunanda Kommula In Selector

When we discuss networks, we typically envision data packets racing along physical wires like vehicles on a highway. But beneath this visible traffic flows another critical pathway that few recognize: the control plane highway. This unseen infrastructure, where routing information flows between devices, makes the data highway possible. Before user data can flow, millions of paths must be established, creating a parallel network of equally vital importance.

Read Post

Selector

Read more about The Control Plane Highway: Networking's Hidden Infrastructure

Enhance Security with SAML: Pandora FMS Now Supports Azure Entra ID

May 16, 2025 By Rocío Cerón In Pandora FMS

In modern enterprise environments, access management is key to ensuring security and regulatory compliance (ENS, ISO 27001, NIS2, etc.). That’s why Pandora FMS has added support for Azure Entra ID, enabling authentication through SAML (Security Assertion Markup Language). With this integration, we provide simplified and secure access to our platform using Single Sign-On (SSO).

Read Post

Pandora FMS

Read more about Enhance Security with SAML: Pandora FMS Now Supports Azure Entra ID

Auto Scaling of Kubernetes Workloads Using Custom Application Metrics

May 16, 2025 By Ofer Yaniv In Broadcom

Orchestration platforms such as Kubernetes and OpenShift help customers reduce costs by enabling on-demand, scalable compute resources. Customers can manually scale out and scale in their Kubernetes compute resources as needed. Autoscaling is the process of automatically adjusting compute resources to meet a system's performance requirements. As workloads grow, systems require additional resources to sustain performance and handle increasing demand.
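
The scaling decision described above can be sketched with the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric ÷ target metric), skipped when the ratio is within a tolerance of 1.0. The function below is a simplified illustration of that formula, not the linked article's implementation; the parameter names and the min/max defaults are assumptions.

```python
import math


def desired_replicas(current, metric_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Apply the HPA-style ratio formula to one custom metric.

    metric_value and target_value are per-pod averages of a custom
    application metric (for example, queued requests per pod).
    """
    ratio = metric_value / target_value
    # Small deviations from the target are ignored to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    # Never scale outside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, four pods averaging 150 queued requests against a target of 100 would scale out to six pods, while an average of 105 stays within tolerance and leaves the count unchanged.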

Read Post

Broadcom

Read more about Auto Scaling of Kubernetes Workloads Using Custom Application Metrics

Level Up Your Network Visibility: DX NetOps Topology is Now Generally Available

May 16, 2025 By Sandeep Tiwary In Broadcom

The wait is finally over! We are thrilled to announce that DX NetOps 24.3.9 marks the official general availability (GA) of DX NetOps Topology, a key milestone in our network observability journey. After a successful early access program with many customer deployments, we are excited to bring this highly anticipated solution to the broader community. DX NetOps Topology is designed to provide you with the insights and operational efficiency needed to manage both traditional and software-defined networks.

Read Post

Broadcom

Read more about Level Up Your Network Visibility: DX NetOps Topology is Now Generally Available

JVM Metrics: A Complete Guide for Performance Monitoring

May 16, 2025 By Faiz Shaikh In Last9

Your Java app slows down during peak load. A microservice crashes, but logs aren’t helpful. These aren’t rare events—they’re common signs something’s off inside the JVM. For Java developers and DevOps teams, JVM metrics offer clues to what’s going on. This blog covers the key metrics to track, what they tell you, and how to use them to troubleshoot performance issues in a practical, no-nonsense way.

Read Post

Last9

Read more about JVM Metrics: A Complete Guide for Performance Monitoring

Best Practices to Ensure Effective Downtime Communication

May 16, 2025 By Nuno Tomas In isDown

When systems go down, users don't just lose access, they lose trust if they're left in the dark. That's why having a clear plan for downtime communication matters just as much as restoring service. Whether you're managing a cloud platform, SaaS tool, or any digital service, how you respond during a disruption can shape your reputation long after the issue is resolved. While downtime is inevitable, confusion and frustration don't have to be.

Read Post

isDown

Read more about Best Practices to Ensure Effective Downtime Communication

Database monitoring in Financial Services: why this high-stakes sector requires a scalable, more comprehensive solution

May 16, 2025 By Jess Folkerts In Redgate

IT and data teams in Financial Services must meet the more exacting demands for data integrity, compliance, performance, high availability and security that are expected in the sector. These demands require a dedicated, comprehensive, and scalable monitoring solution to help teams succeed in this high-stakes environment.

Read Post

Redgate

Read more about Database monitoring in Financial Services: why this high-stakes sector requires a scalable, more comprehensive solution

Transforming Observability: Simpler, Smarter, and More Affordable Data Control

May 16, 2025 By Mezmo In Mezmo

At Mezmo, we’ve always believed that observability should empower innovation, not hold it back with complexity and unpredictable costs. However, as organizations scale and data volumes continue to explode, the old ways of managing telemetry data aren’t sustainable.

Read Post

Mezmo

Read more about Transforming Observability: Simpler, Smarter, and More Affordable Data Control

Tracing Funnels - Define funnels b/w spans in your distributed systems

May 16, 2025 By Anushka Karmakar In SigNoz

Distributed tracing has long been the go-to for understanding the performance of microservices and asynchronous systems. But as systems grow in complexity, simply viewing individual traces and spans isn’t enough. Teams need to answer questions that individual traces can’t. SigNoz Tracing Funnels is here to change that, bringing the clarity of product analytics-style funnel analysis to backend traces, and doing so in a way that’s never been available before.

Read Post

SigNoz

Read more about Tracing Funnels - Define funnels b/w spans in your distributed systems

Coroot 1.11: What's New

May 16, 2025 By Nikolay Sivko In Coroot

We’re excited to announce the release of Coroot 1.11! This version comes with powerful new features designed to give you more control over your observability data and make your troubleshooting process faster and more intuitive. Let’s dive into what’s new.

Read Post

Coroot

Read more about Coroot 1.11: What's New

Linux Security Logs: Complete Guide for DevOps and SysAdmins

May 15, 2025 By Anjali Udasi In Last9

Security logs are the quiet sentinels of your Linux systems, recording critical information that can mean the difference between detecting an intrusion and discovering a breach months too late. For most DevOps professionals and system administrators, these logs contain valuable insights that often go untapped. While they're essential for compliance, their real value lies in providing visibility into your system's security posture and operational health.

Read Post

Last9

Read more about Linux Security Logs: Complete Guide for DevOps and SysAdmins

Prometheus vs Zabbix: A Hands-On Technical Comparison and a Modern Alternative

May 15, 2025 By Pavithra Parthiban In Atatus

When choosing a monitoring tool, two popular names often come up, Prometheus and Zabbix. Both are powerful and widely adopted but come with different approaches and learning curves. Prometheus is favored in cloud-native environments for its time-series data model and flexibility, while Zabbix has long served traditional IT infrastructures with its rich agent-based monitoring. But what if you are looking for a simpler, more unified solution?

Read Post

Atatus

Read more about Prometheus vs Zabbix: A Hands-On Technical Comparison and a Modern Alternative

You, Me, and BugSplat's MCP

May 15, 2025 By Bobby Galli In BugSplat

Let's face it - from an experienced developer's perspective, most software trends are, put lightly, incredibly annoying. The last thing a grizzled, old, technical wizard wants to hear is some half-brained junior developer telling them to switch their SQL server to MongoDB, replace the PHP EC2 with serverless Python, or rewrite their entire front-end with HTMX. The hype-train is so intense that even watching TV feels risky, as you might see something as absurd as an ad for AI toothpaste.

Read Post

BugSplat

Read more about You, Me, and BugSplat's MCP

Tracealyzer Was Just the Beginning

May 15, 2025 By Percepio In Percepio

If you’ve been building embedded systems for a while, chances are you know Percepio for Tracealyzer. And we’re proud of that. For over a decade, Tracealyzer has been helping engineers visualize and solve complex RTOS issues faster, with over 30 ways to slice and understand system behavior. But in 2025, embedded systems demand more. They’re always on. Always connected. And increasingly, always business-critical.

Read Post

Percepio

Read more about Tracealyzer Was Just the Beginning

CI/CD Observability Powered by OpenTelemetry | SigNoz Launch Week 4.0 Day 4

May 15, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Tired of guessing why your releases stall, which PRs are stuck, or where flaky tests are wasting your team’s time? Most teams obsess over production monitoring, but what about the bottlenecks that often hide in the CI/CD pipeline, slowing delivery, draining productivity, and introducing risk before code ever ships? With CI/CD Observability, you can stop flying blind in your delivery process and make every release faster, more reliable, and fully transparent!

View Video

SigNoz

Read more about CI/CD Observability Powered by OpenTelemetry | SigNoz Launch Week 4.0 Day 4

State of the Observability Databases with Dee Kitchen (Grafana Office Hours #30)

May 15, 2025 By Grafana In Grafana

In this Grafana Office Hours, we talk about the state of observability databases (Grafana Loki, Mimir, Tempo, and Pyroscope) and where they're going. We talk about current and upcoming architectural changes in all four, how we're making them more performant, how compatible they are with OpenTelemetry, and what we're working on next for each database. In this conversation are Dee Kitchen (VP of Engineering for Databases) and Senior Developer Advocates Jay Clifford and Nicole van der Hoeven.

View Video

Grafana

Read more about State of the Observability Databases with Dee Kitchen (Grafana Office Hours #30)

Monitoring your MCP Server in Production (with Sentry)

May 15, 2025 By David Cramer In Sentry

So you're building an MCP server for your project or service, to allow AI chatbots and agents to interact with it? Great! You've decided to build it using Cloudflare Workers, have written the code, shipped it, and the first users are getting onboard: you're officially running it in production. That's when problems start. I'm not here to dissuade you from shooting your shot, but let's make sure you've got your bases covered in production when something inevitably goes wrong.

Read Post

Sentry

Read more about Monitoring your MCP Server in Production (with Sentry)

CI/CD Observability Powered by OpenTelemetry

May 15, 2025 By Vibhu Pandey In SigNoz

Modern engineering teams spend a lot of time and resources on setting up monitoring of their production systems: tracking uptime, catching errors, and responding to incidents before customers ever notice. But what about the journey before code reaches production? For most teams, observing the CI/CD pipeline is either an afterthought or completely overlooked. While we recognize its importance, do we truly understand how well our CI/CD process is functioning?

Read Post

SigNoz

Read more about CI/CD Observability Powered by OpenTelemetry

Understanding Your App's Health With Core Mobile Vitals

May 15, 2025 By Bee Klimt In Honeycomb

Mobile apps are a little different from services run on servers. You build your mobile app, you ship it off to the world, and then it gets run by the end user on their own machine. If your app is running poorly on some percentage of users’ devices, you may never know. That’s where observability comes in. There are certain important metrics that every mobile app has in common.

Read Post

Honeycomb

Read more about Understanding Your App's Health With Core Mobile Vitals

Top 5 Benefits of a Status Page Aggregator

May 15, 2025 By Colin Bartlett In StatusGator

According to the 2024 State of SaaSOps report, organizations now use an average of 112 SaaS applications. That’s 112 potential points of failure. Manually checking or subscribing to each of those status pages is not scalable. Even small teams often rely on 30+ services spanning infrastructure, communication, payments, and security. A status page aggregator like StatusGator consolidates service statuses from hundreds or even thousands of providers into a single, unified view.

Read Post

StatusGator

Read more about Top 5 Benefits of a Status Page Aggregator

AWS Lambda's INIT billing update: What's changing and why it matters for your cloud costs

May 15, 2025 By ramji.ry@zohocorp.com In ManageEngine

Starting on Aug. 1, 2025, AWS will bill for the initialization (INIT) phase of Lambda functions, bringing a key change to how you are charged for serverless workloads. This billing update will impact functions using managed runtimes with ZIP archive packaging, which previously excluded the INIT phase from the billed duration. For teams that rely heavily on AWS Lambda, this is a small but significant change. The INIT phase, while short, could introduce costs that were previously invisible.
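
To get a feel for how much the INIT phase might add, the sketch below estimates billed GB-seconds with and without initialization time. It is back-of-the-envelope arithmetic under stated assumptions: the function name, its parameters, and the sample figures are hypothetical, and the result is in GB-seconds rather than currency because the per-GB-second rate depends on region and architecture.

```python
def billed_gb_seconds(invocations, avg_duration_ms, memory_mb,
                      cold_starts=0, avg_init_ms=0):
    """Estimate billed GB-seconds for one function over a period.

    cold_starts is the number of invocations that ran an INIT phase;
    leave it at 0 to model billing that excludes INIT.
    """
    invoke_ms = invocations * avg_duration_ms
    init_ms = cold_starts * avg_init_ms
    seconds = (invoke_ms + init_ms) / 1000
    return seconds * (memory_mb / 1024)


# Hypothetical workload: 1M invocations, 100 ms each, 512 MB,
# with 10,000 cold starts that each spend 400 ms in INIT.
before = billed_gb_seconds(1_000_000, 100, 512)
after = billed_gb_seconds(1_000_000, 100, 512, cold_starts=10_000, avg_init_ms=400)
increase_pct = (after - before) / before * 100
```

Multiplying the difference between `after` and `before` by your account's per-GB-second rate gives a rough monthly cost delta. Functions with heavy initialization and frequent cold starts will see the largest change.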

Read Post

ManageEngine

Read more about AWS Lambda's INIT billing update: What's changing and why it matters for your cloud costs

The Datadog Agent: Why it's essential for monitoring your infrastructure and applications with Datadog

May 15, 2025 By Bowen Chen In Datadog

If you’re a Datadog customer, you’re likely using our platform to gain visibility into your infrastructure and applications and to troubleshoot using logs, metrics, and traces when issues arise. To support these efforts, you’ll want access to the most granular telemetry signals and intuitive workflows that streamline your investigation.

Read Post

Datadog

Read more about The Datadog Agent: Why it's essential for monitoring your infrastructure and applications with Datadog

3 ways to drive software delivery success with Datadog DORA Metrics

May 15, 2025 By Teddy Gesbert In Datadog

Delivering software quickly and reliably is the main focus of modern DevOps. But to improve your delivery performance, you need to understand it, and that starts with measurement. Teams primarily measure performance in this area by using DORA metrics: deployment frequency, change lead time, change failure rate, and time to restore service. These metrics help teams understand trends in their software delivery practices in quantifiable terms that they can track and improve over time.
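
As an illustration of how the four DORA metrics named above can be derived from deployment records, here is a minimal sketch. The `Deployment` record shape and the use of medians are assumptions for this example; Datadog's own calculation may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime               # when the change was committed
    deployed_at: datetime                # when it reached production
    failed: bool = False                 # did it cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored


def dora_metrics(deployments, window_days):
    """Compute the four DORA metrics over one reporting window."""
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restores = [d.restored_at - d.deployed_at
                for d in failures if d.restored_at is not None]
    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "median_change_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate":
            len(failures) / len(deployments) if deployments else 0.0,
        "median_time_to_restore": median(restores) if restores else None,
    }
```

Lead time and time to restore come back as `timedelta` values, so they can be formatted or compared against targets directly.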

Read Post

Datadog

Read more about 3 ways to drive software delivery success with Datadog DORA Metrics

Unify your FinOps and engineering workflows in Datadog Cloud Cost Management

May 15, 2025 By Natasha Goel In Datadog

As your applications scale across cloud and SaaS providers, allocating costs and optimizing workloads become increasingly important—and challenging. Without access to cost data in their daily workflows, engineering teams can’t easily understand the cost of their resources and identify where they can reduce their spend. And while FinOps teams have access to cost data, they often review this information in silos.

Read Post

Datadog

Read more about Unify your FinOps and engineering workflows in Datadog Cloud Cost Management

Improve user access and admin controls with the latest platform updates from Sumo Logic

May 15, 2025 By Margaret Selid In Sumo Logic

By centralizing your mission-critical logs, metrics, traces, and events from all of your systems into one platform, Sumo Logic enables teams across development, security, and operations to operate from a single source of truth. While this unified approach is crucial for fast issue identification and minimizing downtime from infrastructure failures or security breaches, not everyone on your team needs access to every bit of data.

Read Post

Sumo Logic

Read more about Improve user access and admin controls with the latest platform updates from Sumo Logic

7 Best Network Configuration Management Tools

May 15, 2025 By Staff Contributor In SolarWinds

If you want a secure, efficient, and compliant network, network configuration management is a must. Whether managing a small network or being responsible for a large enterprise system, having the right solution can make all the difference. Network configuration management tools provide valuable insights into devices on your network, and they can help quickly restore previous configurations in the event of a failure, misconfiguration, or security incident.

Read Post

SolarWinds

Read more about 7 Best Network Configuration Management Tools

Deploying the #Bindplane #OpenTelemetry #Collector in #Docker Compose with OpAMP support

May 15, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in May.

View Video

ObservIQ

Read more about Deploying the #Bindplane #OpenTelemetry #Collector in #Docker Compose with OpAMP support

Control Observability Costs With Filtering

May 15, 2025 By Honeycomb In Honeycomb

Not all telemetry is created equal. Curate the data you save in Honeycomb using Honeycomb Telemetry Pipeline Manager. Then restore it later if you change your mind!

View Video

Honeycomb

Read more about Control Observability Costs With Filtering

Debugging Microservices

May 15, 2025 By Sentry In Sentry

Debugging microservices is tough, especially when you're juggling multiple services and relying only on logs. This video cuts through the complexity by showing you how to implement distributed tracing using Sentry. You'll see a practical demonstration in a food ordering app (built with React and Go) of how tracing can give you a clear view of your entire request flow, from the initial button click to the final operation across all your services.

View Video

Sentry

Read more about Debugging Microservices

Bindplane's Bring Your Own Collector Fixing Incompatible Collectors #opentelemetry #collector

May 15, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in May.

View Video

ObservIQ

Read more about Bindplane's Bring Your Own Collector Fixing Incompatible Collectors #opentelemetry #collector

Easy setup for Bindplane's #OpenTelemetry #Collector running in Docker with the ephemeral flag

May 15, 2025 By Bindplane In ObservIQ

Check out the full @bindplane community call in May.

View Video

ObservIQ

Read more about Easy setup for Bindplane's #OpenTelemetry #Collector running in Docker with the ephemeral flag

vmalert - Maximize Your Monitoring - Tech Talk #5

May 14, 2025 By VictoriaMetrics In VictoriaMetrics

This time, we're diving into a critical component for operational excellence: vmalert. Effective alerting is the backbone of proactive monitoring, enabling teams to detect and respond to issues swiftly before they impact users. But setting up truly effective alerting – alerts that are reliable, actionable, and low-noise – requires understanding the tools and best practices.

View Video

VictoriaMetrics

Read more about vmalert - Maximize Your Monitoring - Tech Talk #5

Getting started with ServiceNow dashboards

May 14, 2025 By Sameer Mhaisekar In Squared Up

ServiceNow is a cloud-based platform that streamlines IT service management, operations, and various business workflows across organizations. Dashboards in ServiceNow can play a valuable role by offering a clear view of key metrics, trends, and performance indicators. While the ServiceNow portal has built-in dashboards, they often fail to provide a fuller picture of the impact of incidents in context with other key metrics from external tools.

Read Post

Squared Up

Read more about Getting started with ServiceNow dashboards

CI/CD Observability Powered by OpenTelemetry and SigNoz

May 14, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Most teams have strong monitoring for production, but what about the journey before your code gets deployed? The CI/CD pipeline is where bottlenecks, flaky tests, and process gaps silently waste your team’s time. Until now, this part of the workflow has mostly been a black box. We’re excited to announce CI/CD Observability in SigNoz - a new way to track, analyze, and improve your software delivery process, powered by OpenTelemetry.

View Video

SigNoz

Read more about CI/CD Observability Powered by OpenTelemetry and SigNoz

8 Network Statistics IT Pros Should Know to Understand and Optimize Network Performance

May 14, 2025 By Andrii Kernitskyi In Obkio

Do slow Zoom calls, dropped VPN connections, and lagging applications sound familiar? These common network frustrations often stem from underlying performance issues that could be diagnosed and resolved with the right data. For IT professionals, raw network metrics alone aren’t enough. To truly optimize performance, you need network statistics: aggregated, analyzed, and interpreted insights that turn numbers into actionable decisions.

Read Post

Obkio

Read more about 8 Network Statistics IT Pros Should Know to Understand and Optimize Network Performance

Control Observability Costs With Sampling

May 14, 2025 By Honeycomb In Honeycomb

How can we send less data without losing information? Sampling removes duplication among distributed traces, while keeping all the interesting ones. See how this works with Honeycomb and Refinery.
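
The idea of dropping repetitive traces while keeping every interesting one can be sketched as follows. This is a simplified illustration, not Refinery's actual algorithm: the trace keys, the "interesting" rule (errors or slow requests), and the 1-in-10 rate are assumptions. Each kept trace records its sample rate so that counts can be re-weighted later.

```python
import hashlib


def sample_rate_for(trace, slow_ms=1000, boring_rate=10):
    """Interesting traces are always kept (rate 1); others are 1-in-N."""
    if trace["error"] or trace["duration_ms"] >= slow_ms:
        return 1
    return boring_rate


def keep(trace_id, rate):
    """Deterministic decision: hash the trace ID and keep 1 in `rate`."""
    if rate <= 1:
        return True
    digest = hashlib.sha256(trace_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % rate == 0


def sample(traces, slow_ms=1000, boring_rate=10):
    """Return the kept traces, each annotated with its sample rate."""
    kept = []
    for trace in traces:
        rate = sample_rate_for(trace, slow_ms, boring_rate)
        if keep(trace["trace_id"], rate):
            kept.append({**trace, "sample_rate": rate})
    return kept
```

Hashing the trace ID rather than rolling a random number means every service makes the same decision for the same trace, and summing `sample_rate` over the kept traces gives an estimate of the original volume.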

View Video

Honeycomb

Read more about Control Observability Costs With Sampling

Visualize Amazon Aurora, Zendesk, and more: What's new in Grafana data sources

May 14, 2025 By Kristin Knapp In Grafana

One of our biggest goals at Grafana Labs is to help you unify and derive value from your data, regardless of where that data lives. As a result, we’re fully committed to making Grafana an open, composable, and extensible observability platform. Last week at GrafanaCON 2025, where we celebrated the launch of Grafana 12, we highlighted one of the key ways we deliver on this promise of openness and extensibility: our broad ecosystem of Grafana data sources.

Read Post

Grafana

Read more about Visualize Amazon Aurora, Zendesk, and more: What's new in Grafana data sources

Introducing SCIM provisioning in Grafana: Enterprise-grade user management made simple

May 14, 2025 By Vardan Torosyan In Grafana

We’re excited to share that SCIM provisioning is available in public preview for Grafana Enterprise and Grafana Cloud Advanced! This powerful feature, introduced last week at GrafanaCON 2025 as part of the Grafana 12 release, transforms how organizations manage users and teams in Grafana, bringing automated user lifecycle management and enhanced security to your observability platform.

Read Post

Grafana

Read more about Introducing SCIM provisioning in Grafana: Enterprise-grade user management made simple

Third party API Monitoring powered by OpenTelemetry semantics

May 14, 2025 By Anushka Karmakar In SigNoz

In today’s cloud-native world, third-party APIs are everywhere. Payments, notifications, search, AI, analytics: modern applications are built on a web of external services. But what happens when one of those APIs slows down, starts throwing errors, or gets rate-limited? Suddenly, your users are facing issues, and you’re stuck asking.

Read Post

SigNoz

Read more about Third party API Monitoring powered by OpenTelemetry semantics

Inside the Sidebar Redesign: A Designer's Perspective

May 14, 2025 By Jenya Malynovska In Checkly

We just released a fresh new view of the sidebar in the Checkly app. This decision was made after some extensive user research—and this is the story of how we got to the final solution.

Read Post

Checkly

Read more about Inside the Sidebar Redesign: A Designer's Perspective

AI at the Edge: Why Smart Data Placement is the Key to Unlocking Its Power

May 14, 2025 By Teneo In Teneo

As organizations increasingly deploy AI solutions, I am seeing more and more that the strategic placement of data—particularly at the edge—is becoming paramount to unlocking AI’s full potential. This is a view shared by our partners at Riverbed, as highlighted in a recent white paper, Accelerating AI and Data Movement at the Edge. Edge computing enables businesses to perform complex operations at production sites by positioning compute resources nearer to users and operations.

Read Post

Teneo

Read more about AI at the Edge: Why Smart Data Placement is the Key to Unlocking Its Power

Third party API Monitoring Powered by OTel Semantic Conventions | SigNoz Launch Week 4.0 Day 3

May 14, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Is it the third-party API or my code? Your service suddenly slows down, or errors spike, and you’re stuck guessing if it’s your own logic or an external API you don’t control. We’ve seen this pain across teams: dashboards don’t tell you which vendor or endpoint is the culprit, and debugging turns into a maze of guesswork. Rate limiting, vendor errors, or integration issues often slip through until users complain.

View Video

SigNoz

Read more about Third party API Monitoring Powered by OTel Semantic Conventions | SigNoz Launch Week 4.0 Day 3

Introduction To Browser Checks | Grafana Cloud Synthetic Monitoring

May 14, 2025 By Grafana In Grafana

Learn how to set up browser checks using Grafana Cloud Synthetic Monitoring. In this video, we walk through how to create a browser check and analyze test results. Browser checks simulate real user interactions to track critical workflows and catch issues early.

View Video

Grafana

Read more about Introduction To Browser Checks | Grafana Cloud Synthetic Monitoring

Tracing Funnels - Define funnels b/w spans in your distributed system

May 14, 2025 By SigNoz - Open Source Observability Platform In SigNoz

View Video

SigNoz

Read more about Tracing Funnels - Define funnels b/w spans in your distributed system

IIS server: Uses, benefits, and challenges

May 14, 2025 By Geoffrin Edwin In Site24x7

Internet Information Services, commonly referred to as IIS, is Microsoft's web server software. It is built to host websites, applications, and services for Windows systems. If you are considering IIS for hosting your website or applications, let us take you through the basics of IIS, its benefits over the other options available, and the common pitfalls organizations face when they opt for IIS. IIS has evolved a lot since its GA release in 1995.

Read Post

Site24x7

Read more about IIS server: Uses, benefits, and challenges

Introducing Metrics Explorer | SigNoz Launch Week 4.0 Day 2

May 14, 2025 By SigNoz - Open Source Observability Platform In SigNoz

Ever tried to build a metrics dashboard and thought, “Wait, what metrics am I actually sending?” We heard this from users again and again, so we built Metrics Explorer. For the first time, you get a real-time, interactive view of every metric coming into your system. Whether you’re onboarding a new integration, debugging an alert, or just exploring your data, Metrics Explorer makes it easy to understand and work with your metrics: no more guesswork, just clarity.

View Video

SigNoz

Read more about Introducing Metrics Explorer | SigNoz Launch Week 4.0 Day 2

Level Up Your Confidence & Problem-Solving Skills! | How Gaming Boosts Workplace Success

May 14, 2025 By solarwindsinc In SolarWinds

Gamers learn more than just strategies—they learn self-confidence and resilience. Success in-game translates to success at work, with a new mindset: there’s always a way to solve the problem, even if it means leveling up your skills!

View Video

SolarWinds

Read more about Level Up Your Confidence & Problem-Solving Skills! | How Gaming Boosts Workplace Success

Investment Trends in Infrastructure Monitoring Market: What Users Should Know

May 14, 2025 By Angelika Bang In Icinga

In recent months, the IT monitoring landscape has seen notable investment activity. These developments are part of a larger trend, with investment firms showing growing interest in the IT monitoring sector. Although such activity isn’t unusual in the tech industry, the current intensity and frequency indicate that IT monitoring is emerging as a key area for growth-oriented strategies.

Read Post

Icinga

Read more about Investment Trends in Infrastructure Monitoring Market: What Users Should Know

Contextual Observability: Using Tagging and Metadata To Unlock Actionable Insights

May 14, 2025 By Mike Simon In Splunk

Observability isn’t about collecting more telemetry — it’s about making that telemetry data meaningful. Contextual observability transforms raw telemetry into actionable insights by enriching it with consistent tagging and metadata. Without context, telemetry data remains fragmented, troubleshooting slows, and aligning with business priorities is nearly impossible.

Read Post

Splunk

Read more about Contextual Observability: Using Tagging and Metadata To Unlock Actionable Insights

Baseline configuration management: Why it's critical for network stability

May 14, 2025 By akash.mj@zohocorp.com In ManageEngine

Imagine this: You've onboarded 30 new switches, 15 firewalls, and 20 routers into your network. You assume they all follow company policy. But months later, half of them are misconfigured, a few are running vulnerable firmware, and one rogue device is exposing ports it shouldn't. That’s not poor luck—that’s poor baseline configuration.

Read Post

ManageEngine

Read more about Baseline configuration management: Why it's critical for network stability

Leading analyst firm reveals the real cost of internet disruptions

May 14, 2025 By Howard Beader In Catchpoint

‘Without the internet, Digital Experiences do not exist,’ begins Increase Revenue and Improve Customer Experience with Internet Performance Monitoring, a study commissioned by Catchpoint to quantify the financial damage from internet outages. At first glance, that might seem painfully obvious—like pointing out that water is wet. But pause for a moment and consider: the digital experience today isn't just an aspect of business; it is the business. Suddenly, the stakes feel very different.

Read Post

Catchpoint

Read more about Leading analyst firm reveals the real cost of internet disruptions

Comprehensive Guide to Developing and Deploying a Python API with Docker and Kubernetes (Part I)

May 14, 2025 By Aymen Eralabs In MetricFire

In the evolving landscape of software development, containerization and orchestration have become pivotal. Docker and Kubernetes stand at the forefront of this transformation, offering scalable and efficient solutions for application deployment. This guide provides a detailed walkthrough on developing a Python API, containerizing it with Docker, and deploying it using Kubernetes, ensuring a robust and production-ready application.

Read Post

MetricFire

Read more about Comprehensive Guide to Developing and Deploying a Python API with Docker and Kubernetes (Part I)

Workshop: Mobile App Monitoring Platforms Don't Have To Be Noisy

May 13, 2025 By Sentry In Sentry

Debugging mobile apps shouldn’t mean drowning in alerts or spelunking through logs just to figure out why your app stuttered or froze. Most tools flood you with noise and leave you guessing. In this workshop, we’ll show you how to use Sentry to cut through the noise and zero in on what actually matters—whether it’s jank from blocked main threads, ANRs in production, dropped frames during scroll, or regressions that somehow made it to production.

View Video

Sentry

Read more about Workshop: Mobile App Monitoring Platforms Don't Have To Be Noisy

Your incident response plan is obsolete-unless it includes agentic AIOps

May 13, 2025 By LogicMonitor In LogicMonitor

Why are we still handling IT incident response like it’s 2014? Every day, ITOps teams are flooded with alerts, spread thin across hybrid systems, and stuck trying to stitch together visibility from solutions that don’t talk to each other. The incidents keep coming, but the tools aren’t getting smarter—and the humans are burned out. Even with best practices in place, response is often slow, inconsistent, and reactive. You chase symptoms instead of solving problems.

Read Post

LogicMonitor

Read more about Your incident response plan is obsolete-unless it includes agentic AIOps

How to easily connect Prometheus to Grafana Cloud

May 13, 2025 By Shawn Pitts In Grafana

Prometheus is one of the most popular open source monitoring tools due to its powerful flexibility for collecting time series metrics. But raw metrics aren’t always helpful on their own. That’s where Grafana Cloud comes in. By connecting Prometheus to Grafana Cloud, you get rich visualizations, alerts, and dashboards that make your data actionable without having to manage any additional infrastructure.

Read Post

Grafana

Read more about How to easily connect Prometheus to Grafana Cloud

Create an Ingest Processor pipeline

May 13, 2025 By Splunk In Splunk

Build an Ingest Processor pipeline that ingests data matching partition conditions and sends it to your chosen destination.

View Video

Splunk

Read more about Create an Ingest Processor pipeline

Reality Bites: 7 Key Disadvantages of Real User Monitoring

May 13, 2025 By AlertBot In AlertBot

Real estate professionals have said for years that the three most important factors about a property are location, location, and location. Well, for organizations with a web presence — which these days is the vast majority, and 100% of e-commerce companies — the three most important factors about their site are visitor experience, visitor experience, and (let’s all say it together!) visitor experience.

Read Post

AlertBot

Read more about Reality Bites: 7 Key Disadvantages of Real User Monitoring

Tracing Just Got a Whole Lot More Useful: Search, Visualize, and Alert with Sentry's new Query Engine

May 13, 2025 By Will McMullen In Sentry

For a while, tracing in Sentry was... fine. You could open up a slow transaction, poke around, find the N+1, and feel like a hero. But if you wanted to answer more complex questions - like why your payment API was getting slower in Europe, or which CDN was silently tanking your image loads - things got harder. We didn't really build it to help with answering broad questions.

Read Post

Sentry

Read more about Tracing Just Got a Whole Lot More Useful: Search, Visualize, and Alert with Sentry's new Query Engine

5 Must-Have Python Plugins for InfluxDB 3 Core & Enterprise

May 13, 2025 By Suyash Joshi In InfluxData

InfluxDB 3 is our latest time series database built for real-time analytics and high-volume data. Its Python Processing Engine lets developers run custom scripts known as plugins to process data, trigger alerts, or integrate with external systems via HTTP web requests. To demonstrate what’s possible, we’ve developed several plugins, all of which are available in the influxdb3_plugins GitHub repository. This public repo is open for anyone to use, modify, and contribute to.

Read Post

InfluxData

Read more about 5 Must-Have Python Plugins for InfluxDB 3 Core & Enterprise

An Introduction to Ecto for Elixir Monitoring with AppSignal

May 13, 2025 By Aestimo Kirina In AppSignal

Database performance can make or break your Elixir application. While Ecto provides a powerful toolkit for database interactions, understanding how these operations perform in production is critical. Whether you're dealing with slow queries, connection pool issues, or mysterious N+1 problems, the ability to effectively monitor and optimize your database operations can be the difference between a sluggish application and one that delights your users.

Read Post

AppSignal

Read more about An Introduction to Ecto for Elixir Monitoring with AppSignal

How to Monitor PowerShell Activity and Detect PowerShell Exploitation Vulnerabilities

May 13, 2025 By Babu Sundaram In eG Innovations

Why should you monitor PowerShell? PowerShell is a powerful automation tool; however, its capabilities also make it a prime target for exploitation by cyber attackers. Implementing a robust, automated PowerShell monitoring solution is now essential to detect and prevent exploitation attacks before they compromise your systems. PowerShell is a powerful scripting tool that can automate tasks and manage systems, but its flexibility also makes it a target for abuse.

Read Post

eG Innovations

Read more about How to Monitor PowerShell Activity and Detect PowerShell Exploitation Vulnerabilities

Detect and resolve Amazon RDS disk I/O bottlenecks

May 13, 2025 By Grace Nalini In Site24x7

Amazon RDS simplifies database management; however, disk I/O bottlenecks can affect performance, especially as workloads grow. Let's learn how to identify and fix I/O issues using AWS best practices, and look at Site24x7's monitoring tools for MySQL and PostgreSQL.

Read Post

Site24x7

Read more about Detect and resolve Amazon RDS disk I/O bottlenecks

Metrics Explorer - Search, Query, and Analyze all your Metrics at one place

May 13, 2025 By Anushka Karmakar In SigNoz

If you’ve ever found yourself staring at a dashboard dropdown, wondering, “What metrics am I even sending to my observability tool?”, you’re not alone. For most engineering teams, answering even the most basic telemetry questions is about as hard as catching a Mewtwo: frustratingly elusive and way more complicated than it should be. We built Metrics Explorer to finally answer all of these questions instantly, and in one place.

Read Post

SigNoz

Read more about Metrics Explorer - Search, Query, and Analyze all your Metrics at one place

Community and Collaboration: Lessons from Gaming - SolarWinds TechPod 098

May 13, 2025 By solarwindsinc In SolarWinds

This episode explores the intersection of gaming, particularly MMOs, and its impact on workplace dynamics. Dr. Melika Shirmohammadi and Mostafa Ayoobzadeh join hosts Chrystal Taylor and Sean Sebring to discuss their motivations for studying the positive aspects of gaming, the skills that can be transferred to professional environments, and the social stigma surrounding gaming as a hobby.

View Video

SolarWinds

Read more about Community and Collaboration: Lessons from Gaming - SolarWinds TechPod 098

How to monitor Airflow metrics, logs, and lineage

May 13, 2025 By Thomas Sobolik In Datadog

In Part 1 of this series, we looked at key Airflow metrics to monitor. Now, we’ll explore how you can collect those metrics, along with logs and traces, using Airflow’s native tooling. We’ll also look at a few key ways you can monitor this data from within the Airflow webserver interface.

Read Post

Datadog

Read more about How to monitor Airflow metrics, logs, and lineage

Monitor Airflow with Datadog

May 13, 2025 By Thomas Sobolik In Datadog

In Part 1 of this series, we discussed key metrics for monitoring Airflow. In Part 2, we discussed strategies for collecting Airflow metrics, logs, and lineage. Finally, we will look at how you can use Datadog to monitor all this telemetry in a single consolidated view alongside telemetry from the rest of your infrastructure and services.

Read Post

Datadog

Read more about Monitor Airflow with Datadog

Debug Logs and Analyze Trends with Log Data Rehydration

May 13, 2025 By Mezmo In Mezmo

Everyone in your organization needs logs to perform the critical functions of their job. Developers need them to debug their applications, security engineers need them to respond to incidents, and support engineers need them to help customers troubleshoot issues. These various use cases create general requirements for enriched log data, often including accessing insights from outside typical retention windows.

Read Post

Mezmo

Read more about Debug Logs and Analyze Trends with Log Data Rehydration

Users are complaining, but your internal monitoring is showing green across the board?

May 13, 2025 By Catchpoint In Catchpoint

Chances are the issue is somewhere between you and your users. To deliver seamless digital experiences, you need to monitor the entire Internet Stack. From DNS and BGP to CDNs and third-party services, Internet Performance Monitoring (IPM) helps you find and fix what traditional tools can’t see.

View Video

Catchpoint

Monitoring

Read more about Users are complaining, but your internal monitoring is showing green across the board?

Key metrics for monitoring Airflow

May 13, 2025 By Thomas Sobolik In Datadog

Airflow is a popular open source platform that enables users to author, schedule, and monitor workflows programmatically. Airflow helps teams run complex pipelines that require task orchestration, dependency management, and efficient scheduling across many different tools. It’s particularly useful for creating data processing pipelines, orchestrating task-based workflows such as machine learning (ML) training, and running cloud services.

Read Post

Datadog

Read more about Key metrics for monitoring Airflow

Microsoft Outlook rolls out stricter email authentication requirements for high-volume senders to enhance security

May 13, 2025 By Bela Susan Thomas In Site24x7

Microsoft Outlook.com (which includes hotmail.com, live.com, and outlook.com) is implementing new email authentication procedures in an attempt to improve email security and preserve customer confidence. These modifications, which came into effect on May 5, 2025, are intended especially for high-volume senders, or those who send more than 5,000 emails every day.

Read Post

Site24x7

Read more about Microsoft Outlook rolls out stricter email authentication requirements for high-volume senders to enhance security

Unifying OpenTelemetry & Datadog | #Observability #OpenTelemetry #datadog

May 13, 2025 By Datadog In Datadog

Previously, teams had to choose between adopting the OpenTelemetry Collector’s capabilities and fully leveraging our advanced features. On This Month in Datadog, we’re spotlighting our OTel Collector distribution, which unifies OTel and Datadog. Check out the link in our bio to watch the new episode.

View Video

Datadog

Read more about Unifying OpenTelemetry & Datadog | #Observability #OpenTelemetry #datadog

Track GitHub Copilot Usage with Datadog #GitHubCopilot #Datadog #DevTools

May 13, 2025 By Datadog In Datadog

Easily track GitHub Copilot usage across your organization with our new integration. On This Month in Datadog, we’re covering this integration, Datadog CoTerm, and the new Optimization page in Datadog Real User Monitoring. Check out the link in our bio to watch the new episode.

View Video

Datadog

Read more about Track GitHub Copilot Usage with Datadog #GitHubCopilot #Datadog #DevTools

Deep Temporal Observability | SigNoz Launch Week 4.0 Day 1

May 13, 2025 By SigNoz - Open Source Observability Platform In SigNoz

If Temporal powers your business-critical workflows, you know how tough it is to get real visibility into what’s happening under the hood. Most tools only show basic Prometheus metrics, leaving you guessing about bottlenecks, failures, and performance issues. Join us for a live demo of SigNoz’s industry-first Temporal integration. Whether you’re running Temporal in production or just exploring workflow orchestration, this session will show you how to move from “just metrics” to true, unified observability.

View Video

SigNoz

Read more about Deep Temporal Observability | SigNoz Launch Week 4.0 Day 1

Angular OpenTelemetry Setup and Troubleshooting

May 12, 2025 By Prathamesh Sonpatki In Last9

Implementing observability in Angular applications presents unique challenges. Understanding how users experience your application and identifying performance bottlenecks requires specialized tools and approaches. This guide covers implementing OpenTelemetry in Angular applications, with practical code examples for instrumentation, data collection, and integration with observability backends.

Read Post

Last9

Read more about Angular OpenTelemetry Setup and Troubleshooting

Ubuntu Cron Logs: A Complete Guide for Engineers

May 12, 2025 By Faiz Shaikh In Last9

Troubleshooting failed cron jobs without proper logging can be frustrating. Ubuntu cron logs record the execution of scheduled tasks, helping you identify what's working and what isn't. This guide covers what engineers need to know about Ubuntu cron logs – from finding them to analyzing their contents and setting up effective monitoring solutions.

Read Post

Last9

Read more about Ubuntu Cron Logs: A Complete Guide for Engineers

OpenAI's 'AI in the Enterprise' Report: A Must-Read - But One Crucial Piece Is Missing

May 12, 2025 By Teneo In Teneo

We are standing at the threshold of one of the most transformative technological shifts in modern enterprise history. AI is no longer on the horizon – it’s here, it’s powerful, and it’s already reshaping the way businesses think about productivity, creativity, and competitive advantage. OpenAI’s recent report, ‘AI in the Enterprise’, offers a concise and thoughtful roadmap for leaders seeking to implement AI within their organizations.

Read Post

Teneo

Read more about OpenAI's 'AI in the Enterprise' Report: A Must-Read - But One Crucial Piece Is Missing

IBM's AI Just Replaced 94% of HR functions - What's Stopping You?

May 12, 2025 By Teneo In Teneo

At IBM’s Think conference this week, the company made a bold announcement: 94% of its HR functions are now handled by AI, a shift they claim will generate $3.5 billion in savings over the next two years. These are staggering numbers. And while the cynic in me can’t ignore that this announcement was made at what is, effectively, a sales conference – especially one that coincided with the launch of IBM’s AI Agent Store – the scale of those numbers deserves attention.

Read Post

Teneo

Read more about IBM's AI Just Replaced 94% of HR functions - What's Stopping You?

Business Process Automation, Explained

May 12, 2025 By Shanika Wickramasinghe In Splunk

Business process automation no longer sits on the sidelines. What was once an emerging technology is now the engine behind modern business operations. In fact, around 60% of companies already use automation tools in their workflows, according to Duke University. This is not just companies — developers are also contributing to this shift by adopting low-code, no-code, and digital process automation platforms. These new tools remove barriers that once slowed innovation.

Read Post

Splunk

Read more about Business Process Automation, Explained

Metrics Explorer - Search, Query, and Analyze all your Metrics at one place

May 12, 2025 By SigNoz - Open Source Observability Platform In SigNoz

View Video

SigNoz

Read more about Metrics Explorer - Search, Query, and Analyze all your Metrics at one place

Deep Temporal Observability - Correlate Metrics with Logs & Traces

May 12, 2025 By Anushka Karmakar In SigNoz

Temporal lets you orchestrate complex, reliable workflows, but when something breaks or slows down, the built-in dashboards only give you a list of events and some basic filters. You can see what happened and filter by attributes like workflow type or namespace, but you can't drill deeper. There's no way to jump straight from a metric spike to the exact trace or log line you care about.

Read Post

SigNoz

Read more about Deep Temporal Observability - Correlate Metrics with Logs & Traces

Gotta Go Slow

May 12, 2025 By Fred Hebert In Honeycomb

The last few months have been wild. Some of the busiest of my life, actually. For context: I’m Canadian, and all of this happened during the continued threats of annexation. All this to say, it’s been rough. I anticipated this would be a challenging time and that I would be exhausted. So, the plan became: do all the demanding things, take my sabbatical in May, and use April as an ‘in-between’ period with a bit less pressure.

Read Post

Honeycomb

Read more about Gotta Go Slow

We Saw That IT Outage Coming-And Stopped It: Why AIOps Deployment is a Game-Changer

May 12, 2025 By ScienceLogic In ScienceLogic

It’s 3:12 AM. Somewhere in a company’s global cloud infrastructure, a server cluster begins to show unusual read/write patterns. Traditionally, IT teams wouldn’t notice until dashboards light up with red alerts—often too late to avoid an outage that costs thousands, even millions, in lost revenue and trust. But this time, it’s different.

Read Post

ScienceLogic

Read more about We Saw That IT Outage Coming-And Stopped It: Why AIOps Deployment is a Game-Changer

Managing monthly reports with the API

May 12, 2025 By Sean White In Oh Dear

On the first of every month we generate an extensive PDF report for every site. This report contains a summary of all check results for the month and is a snapshot available to you and your team via email and the Oh Dear dashboard. We keep the report history so each month can be viewed in a browser or downloaded as a PDF. This report can also be emailed to any email address - not just team members - perfect for keeping your customers informed.

Read Post

Oh Dear

Read more about Managing monthly reports with the API

Building a Culture of Observability Through Ownership

May 12, 2025 By Martin McLarnon In Coralogix

There’s a problem in engineering culture that we don’t talk about enough: observability is an afterthought. It’s treated as tooling, not thinking. As a checkbox, not a habit. And that mindset gap creates real consequences: longer outages, frustrated teams and massive business costs. Atlassian’s Incident Management for High-Velocity Teams overview cites a 2014 Gartner study that put the average cost of IT downtime at $5,600 per minute.

Read Post

Coralogix

Read more about Building a Culture of Observability Through Ownership

Securing IoT Devices with Firewall Monitoring: A Comprehensive Guide

May 12, 2025 By MetricFire Blogger In MetricFire

The proliferation of Internet of Things (IoT) devices has transformed various sectors, offering enhanced efficiency and connectivity. However, this expansion also introduces significant security challenges. Implementing robust firewall monitoring is essential to protect these devices and the networks they inhabit.

Read Post

MetricFire

Read more about Securing IoT Devices with Firewall Monitoring: A Comprehensive Guide

Splunk Observability Cloud's AI Assistant in Action | Practical Examples | Part 2

May 12, 2025 By Splunk In Splunk

In this video, we'll explore practical ways to utilize the AI Assistant in Splunk Observability Cloud. Through real-world scenarios, learn how the AI Assistant can help you interpret metrics, contextualize data, onboard new team members to your organization, and automate tasks via the Splunk Observability Cloud API. AI Assistant in Splunk Observability Cloud enhances observability by providing actionable insights and streamlining workflows.

View Video

Splunk

Read more about Splunk Observability Cloud's AI Assistant in Action | Practical Examples | Part 2

Real-Time Monitoring Solutions for Modern Web Applications

May 11, 2025 By OpsMatters In OpsMatters

Web applications have evolved from simple static sites into complex distributed systems spanning multiple servers, services, and geographical locations. This evolution has created new challenges for monitoring these applications effectively. Today's web stacks require comprehensive visibility across all layers to ensure optimal performance and reliability.

Read Post

OpsMatters

Read more about Real-Time Monitoring Solutions for Modern Web Applications

What Is an API Outage? Why It Happens and How to Avoid It

May 10, 2025 By Nuno Tomas In isDown

APIs are a big part of how modern applications or services work. They act as bridges, allowing systems to talk to each other and share data. Whether it's logging into an app or making an online payment, an application programming interface helps make that process smooth. But what happens when an API suddenly stops working? Even a short outage can cause a disruption. It can break features, delay operations, and impact users and businesses alike.

Read Post

isDown

Read more about What Is an API Outage? Why It Happens and How to Avoid It

SQL analytics - unified querying across any API

May 10, 2025 By John Hayes In Squared Up

SQL is just for querying relational data, right? Well, not necessarily! With our SQL Analytics feature, you can run SQL queries over all types of data from all kinds of backend stores. This gives you incredible flexibility and power – you can even combine different types of entity (e.g. a pull request and a pipeline run) in a single query. Equally, I could have datasets with job tickets from Jira, ServiceNow and Zendesk and combine them in a single query.

Read Post

Squared Up

Read more about SQL analytics - unified querying across any API

From Logs to Metrics Part 2: Building an Open-Source Logs-to-Graphite Pipeline

May 9, 2025 By Benjamin Pitts In MetricFire

Monitoring doesn't always need to be complex. In this guide, we'll show you how to transform some raw logs into usable metrics using a lightweight, open-source setup. We'll also use the Telegraf agent to convert logs into Graphite metrics that you can easily visualize and alert on. This is ideal for system admins, DevOps beginners, or anyone interested in building more innovative monitoring pipelines from scratch.

Read Post

MetricFire

Read more about From Logs to Metrics Part 2: Building an Open-Source Logs-to-Graphite Pipeline

Deep Temporal Observability - Correlate Metrics with logs & traces

May 9, 2025 By SigNoz - Open Source Observability Platform In SigNoz

We built deep Temporal observability in SigNoz.

View Video

SigNoz

Read more about Deep Temporal Observability - Correlate Metrics with logs & traces

Third party API Monitoring Powered by OpenTelemetry Semantics

May 9, 2025 By SigNoz - Open Source Observability Platform In SigNoz

View Video

SigNoz

Read more about Third party API Monitoring Powered by OpenTelemetry Semantics

VictoriaMetrics Components: Getting Started

May 9, 2025 By Phuong Le In VictoriaMetrics

This article introduces the key components of VictoriaMetrics and explains how they work together as part of a complete monitoring system. VictoriaMetrics is a top-tier monitoring solution known for its speed and low-resource consumption. It includes components for monitoring, alerting, data visualization, querying, scraping, incremental backups, and more.

Read Post

VictoriaMetrics

Read more about VictoriaMetrics Components: Getting Started

How to benchmark Elasticsearch performance with ingest pipelines and your own logs

May 9, 2025 By Philipp Kahr In Elastic

When setting up an Elasticsearch cluster, one of the most common use cases is to ingest and search through logs. This blog post focuses on getting a benchmark that will tell you how well your cluster will handle your workload. It allows you to create a reproducible environment for testing things out. Do you want to change the mapping of something, drop some fields, alter the ingest pipeline?

Read Post

Elastic

Read more about How to benchmark Elasticsearch performance with ingest pipelines and your own logs

CloudWatch vs OpenTelemetry: Choosing What Fits Your Stack

May 9, 2025 By Anjali Udasi In Last9

Choosing the right observability setup isn’t just a checkbox—it affects how quickly you can detect issues, debug them, and keep your systems reliable. CloudWatch and OpenTelemetry take different paths to that goal: one is a managed service tightly coupled with AWS, the other a flexible, open-source framework that's becoming a go-to in modern monitoring stacks.

Read Post

Last9

Read more about CloudWatch vs OpenTelemetry: Choosing What Fits Your Stack

Windows Monitoring with Sysmon: Practical Guide and Configuration

May 9, 2025 By Isaac García In Pandora FMS

One might think that, considering how effective some companies are at logging everything we do to serve us ads, they’d at least apply that to help us understand what’s happening on our systems and monitor their performance and security. But in the case of Windows, traditional logs fall short — and that’s where the importance of Sysmon comes in. Sysmon is a Windows service that logs operating system activity into the event log.

Read Post

Pandora FMS

Read more about Windows Monitoring with Sysmon: Practical Guide and Configuration

OnlineOrNot updates from April 2025

May 9, 2025 By Max Rozen In OnlineOrNot

In April I worked on a behind the scenes refactor, webhooks for status pages, a new endpoint, and more.

Read Post

OnlineOrNot

Read more about OnlineOrNot updates from April 2025

Establishing SD-WAN Observability to Fuel SASE Success

May 9, 2025 By Seth Differ In Broadcom

For today’s enterprises, ensuring optimized network connectivity and robust network security represent key imperatives. Given that, it makes sense that there’s rapidly growing use of solutions like secure access service edge (SASE). In fact, the SASE market is expected to grow to $5.9 billion by 2028. SASE delivers converged network and security capabilities. SASE is a cloud-based offering that is primarily delivered on an as-a-service basis.

Read Post

Broadcom

Read more about Establishing SD-WAN Observability to Fuel SASE Success

Process Monitoring - Huge Value from a Quick Task

May 9, 2025 By Steve Danseglio In Broadcom

DX Unified Infrastructure Management (DX UIM) from Broadcom is a comprehensive solution for monitoring an organization’s entire IT infrastructure. The product provides IT administrators and operations teams with a centralized view of their infrastructure to ensure availability and performance of servers, network devices, storage systems, virtualization environments, applications, and cloud services.

Read Post

Broadcom

Read more about Process Monitoring - Huge Value from a Quick Task

Making Network Intelligence Accessible to Everyone

May 9, 2025 By Dallon Robinette In Selector

For years, network operations have relied on complex query languages that demand specialized knowledge. Extracting insights from network data often meant writing intricate commands in formats like SQL, a skill reserved for seasoned IT professionals. But what if anyone, regardless of expertise, could ask a simple question and get immediate, accurate answers from their network?

Read Post

Selector

Read more about Making Network Intelligence Accessible to Everyone

This Month in Datadog: OpenTelemetry Collector distribution, GitHub Copilot integration, and more

May 9, 2025 By Datadog In Datadog

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit Cloud Monitoring as a Service | Datadog. This month, we put the Spotlight on the Datadog Distribution of the OpenTelemetry Collector.

View Video

Datadog

Read more about This Month in Datadog: OpenTelemetry Collector distribution, GitHub Copilot integration, and more

Solr Key Metrics: The Essential Guide for DevOps & SREs

May 9, 2025 By Faiz Shaikh In Last9

When Solr performance degrades, everyone notices - except maybe your monitoring system. The right metrics can alert you to problems before your users start complaining. This guide walks through the essential Solr metrics that matter for production deployments.

Read Post

Last9

Read more about Solr Key Metrics: The Essential Guide for DevOps & SREs

An ultimate step-by-step guide on Checkmk Cloud Monitoring

May 9, 2025 By Tim Nguyen Van In iLert

Checkmk launched Checkmk Cloud (SaaS) in February 2025, which is a fully managed, cloud-based version of their monitoring technology. This solution, designed for ease of use, allows enterprises to start monitoring their IT infrastructure with no installation, maintenance, or manual upgrades required. The SaaS version is compatible with both cloud-based and on-premises systems, bringing them together under a single, straightforward platform.

Read Post

iLert

Read more about An ultimate step-by-step guide on Checkmk Cloud Monitoring

Internet Latency: What Is It, How to Measure It, and How to Improve It

May 9, 2025 By Andrii Kernitskyi In Obkio

Internet latency, the often-overlooked delay between sending and receiving data, can mean the difference between a flawless video conference and a frustrating, glitchy mess. Measured in milliseconds (ms), these microscopic delays accumulate, creating tangible performance issues across all online activities.

Read Post

Obkio

Read more about Internet Latency: What Is It, How to Measure It, and How to Improve It

OpenTelemetry PHP: A Detailed Implementation Guide

May 9, 2025 By Preeti Dewani In Last9

Monitoring complex PHP applications can be challenging. When systems span multiple services and environments, traditional logging approaches often fall short. OpenTelemetry offers a solution - an open-source, vendor-neutral framework that standardizes how we collect and export telemetry data. This guide covers practical implementation steps for DevOps engineers working with PHP applications.

Read Post

Last9

Read more about OpenTelemetry PHP: A Detailed Implementation Guide

Graylog Enterprise Tour!

May 9, 2025 By Graylog In Graylog

Check out the latest tour of Graylog Enterprise. Check out Graylog Data Lake, Content Hub, Alerting, Role-Based Access Control, and more!

View Video

Graylog

Read more about Graylog Enterprise Tour!

The Best Open-Source Dashboard Tools for 2025: Expert Guide to Choosing the Right One

May 8, 2025 By Vipul Gupta In MetricFire

In today’s digital operations, dashboards aren’t just nice-to-haves; they’re essential. Teams across engineering, product, operations, and business intelligence rely on real-time data visibility to monitor systems, analyze trends, and catch anomalies before they escalate. For many organizations, open-source dashboard tools offer the best combination of flexibility, transparency, and cost-efficiency.

Read Post

MetricFire

Read more about The Best Open-Source Dashboard Tools for 2025: Expert Guide to Choosing the Right One

Reducing MTTR with Cloud Pathfinder

May 8, 2025 By Kentik In Kentik

Learn how to quickly identify and resolve cloud connectivity issues with Kentik’s Cloud Pathfinder. We demonstrate how Cloud Pathfinder simplifies troubleshooting by automatically mapping cloud network paths, pinpointing misconfigured security rules or incorrect routes, and providing actionable insights powered by integrated AI analysis. Reduce mean time to resolution (MTTR) and gain instant visibility into your cloud infrastructure with Kentik.

View Video

Kentik

Read more about Reducing MTTR with Cloud Pathfinder

The Azure Metrics That Actually Reduce Cloud Costs

May 8, 2025 By LogicMonitor In LogicMonitor

This is the fourth blog in our Azure Monitoring series, and this time, we’re digging into cost efficiency. Azure makes it easy to scale, but just as easy to overspend. Idle VMs, forgotten disks, and silent data transfer fees add up fast. The result is budget overruns that catch teams off guard and force reactive cuts. This blog breaks down the Azure metrics that actually help you reduce waste, improve visibility, and keep cloud spend aligned with business priorities. Missed our earlier posts?

Read Post

LogicMonitor

Read more about The Azure Metrics That Actually Reduce Cloud Costs

Know When to Fold Legacy Tools-And When to Go All In on AIOps

May 8, 2025 By ScienceLogic In ScienceLogic

Managed Service Providers (MSPs) are under intense pressure: 24/7 uptime, seamless client experiences, and proactive service models are table stakes. But many are still operating with fragmented, manual-heavy toolsets designed for a simpler era.

Read Post

ScienceLogic

Read more about Know When to Fold Legacy Tools-And When to Go All In on AIOps

Introducing Coralogix Continuous Profiling

May 8, 2025 By Coralogix In Coralogix

Debug faster, improve application performance, and lower your cloud costs - without slowing down production. Traditional profiling solutions come with a heavy price: added latency, excessive resource consumption, and performance degradation. At Coralogix, we’re changing the game with Continuous Profiling, the first of its kind to offer real-time, kernel-level visibility into application performance without any code changes or production impact.

View Video

Coralogix

Read more about Introducing Coralogix Continuous Profiling

Agentic AI in financial services: The rise of autonomous intelligence

May 8, 2025 By Karen Mcdermott In Elastic

Agentic AI is coming to financial services. Elastic provides the data foundation and tools to make it work. In a recent talk at Stanford University, Jamie Dimon, chairman and CEO of JPMorganChase, addressed the firm’s use of AI and closed by mentioning that agentic AI was the next frontier of AI at the firm, implying it wasn’t ready to be deployed yet. Let’s break down why that may be the case and what the financial services industry can do to become more comfortable with agentic AI.

Read Post

Elastic

Read more about Agentic AI in financial services: The rise of autonomous intelligence

Unleash SaaS Data With the Webhookevent Receiver

May 8, 2025 By Mike Terhar In Honeycomb

There are many vendors, Honeycomb included, where actions on the application can emit a web request that goes to another service for coordination or tracking purposes. Many vendors have pre-built integrations, but some have a fallback that says “Custom Webhook” or similar. If you’re looking to create a full picture of your request flow, you would want these other services to show up in your trace waterfall.

Read Post

Honeycomb

Read more about Unleash SaaS Data With the Webhookevent Receiver

Get Started With WhatsUp Gold in 5 Minutes

May 8, 2025 By Progress WhatsUp Gold In WhatsUp Gold

This video shows how to get up and running with WhatsUp Gold in 5 minutes.

View Video

WhatsUp Gold

Read more about Get Started With WhatsUp Gold in 5 Minutes

How It Works | Discover Obkio's Free Trial

May 8, 2025 By Obkio In Obkio

Discover Obkio’s Free Trial in Just 1 Minute! Are you curious about how Obkio works? This quick video gives you a fast, clear look at how our Network Performance Monitoring tool helps you identify, troubleshoot, and resolve network issues before they impact your users.

View Video

Obkio

Read more about How It Works | Discover Obkio's Free Trial

Common Pain Points Faced by MSPs | Obkio

May 8, 2025 By Obkio In Obkio

MSP Life Got You Like... Juggling client networks, mystery slowdowns, and finger-pointing? Yup, we’ve been there. In this short clip, we’re breaking down the real pain points MSPs face daily—and how Obkio makes them way less painful. Monitor. Troubleshoot. Breathe easy. MSPs, this one's for you.

View Video

Obkio

Read more about Common Pain Points Faced by MSPs | Obkio

End the MSP-Blame Game with Obkio Network Monitoring

May 8, 2025 By Obkio In Obkio

Tired of Playing the Blame Game? Client says it’s the network. ISP says it’s not them. You’re stuck in the middle. With Obkio, you get the proof you need to say, “Not it!” Watch how network monitoring helps MSPs finally put the blame game to rest—for good.

View Video

Obkio

Read more about End the MSP-Blame Game with Obkio Network Monitoring

Grafana Cloud Migration Assistant: from self-hosted to the cloud in minutes

May 8, 2025 By Daniel Ken Lee In Grafana

Moving your existing Grafana instance to Grafana Cloud just got dramatically simpler. Today, we’re excited to announce the general availability of the Grafana Cloud Migration Assistant, a powerful yet intuitive tool designed to streamline your migration journey. Traditionally, migrating from Grafana OSS or Grafana Enterprise to Grafana Cloud required technical expertise with Grafana’s HTTP API or command-line tools like Grizzly.

Read Post

Grafana

Read more about Grafana Cloud Migration Assistant: from self-hosted to the cloud in minutes

Grafana Alloy at 1: What's new and what's next for our OpenTelemetry Collector distribution

May 8, 2025 By Sam Alipio In Grafana

It’s been a year since we launched Grafana Alloy, our OpenTelemetry Collector distribution with built-in Prometheus pipelines and support for metrics, logs, traces, and profiles. OpenTelemetry is quickly becoming an industry standard for telemetry collection, processing, and delivery, and we’re committed to making Alloy the best possible collector for telemetry data, whether you’re using it with Grafana Cloud or not.

Read Post

Grafana

Read more about Grafana Alloy at 1: What's new and what's next for our OpenTelemetry Collector distribution

Track MongoDB Performance Metrics Without the Noise

May 8, 2025 By Anjali Udasi In Last9

When your MongoDB database slows down, it affects your entire application stack. Performance issues can range from minor inconveniences to major outages, making a solid understanding of MongoDB metrics essential for any DevOps engineer. This guide covers the key performance metrics you need to monitor in MongoDB, how to interpret what you're seeing, and practical steps to resolve common issues.

Read Post

Last9

Read more about Track MongoDB Performance Metrics Without the Noise

The Complete Guide to Observing RabbitMQ

May 8, 2025 By Faiz Shaikh In Last9

Message queues quietly power a lot of what happens behind the scenes in distributed systems. RabbitMQ is no exception—when it’s working, you don’t notice it. But when it’s not, things break in ways that are hard to trace. This guide walks through what you need to monitor in RabbitMQ, how to set it up, and how to troubleshoot when things go wrong—so you’re not stuck guessing when messages go missing.

Read Post

Last9

Read more about The Complete Guide to Observing RabbitMQ

How to Control and Optimize Azure Costs Without Losing Visibility

May 8, 2025 By LogicMonitor In LogicMonitor

This is the ninth post in our Azure Monitoring series, and it’s all about taking control of your cloud costs without losing visibility. We’ll unpack why Azure bills tend to spiral, where native tools fall short, and what it really takes to cut spend while keeping performance on point. You’ll walk away with practical ways to spot waste early, act fast, and stay ahead of surprise invoices. Missed the earlier posts? You can catch up anytime.

Read Post

LogicMonitor

Read more about How to Control and Optimize Azure Costs Without Losing Visibility

Why Traditional Monitoring Fails in an Internet-First World

May 8, 2025 By Catchpoint In Catchpoint

Modern applications are more complex than ever—distributed across cloud environments, reliant on APIs, and dependent on internet-based services. Yet, traditional monitoring approaches fail to capture the full picture, leaving IT teams blind to performance-impacting issues.

View Video

Catchpoint

Read more about Why Traditional Monitoring Fails in an Internet-First World

Why IT Must Rethink Monitoring for the Cloud-First Era

May 8, 2025 By Catchpoint In Catchpoint

As applications become more cloud-centric, distributed and service-oriented, traditional system-centric monitoring is no longer enough. The internet is now the backbone of digital experiences, requiring a shift in how organizations monitor uptime and optimize performance.

View Video

Catchpoint

Read more about Why IT Must Rethink Monitoring for the Cloud-First Era

What You Didn't See During the GrafanaCON 2025 Keynote Livestream...

May 8, 2025 By Grafana In Grafana

Our GrafanaCON co-chairs take you on a backstage tour through GrafanaCON 2025 Day 1 — sneak peeks, activities, and the conference magic. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

View Video

Grafana

Read more about What You Didn't See During the GrafanaCON 2025 Keynote Livestream...

Cloud quotas: How to make cloud management easy

May 8, 2025 By Geoffrin Edwin In Site24x7

In the past, a cloud architect's pain point was usually deciding between these two options: To tackle this confusion, major cloud service providers (CSPs) launched quotas (in their own words). To give you examples, here are the different terminologies used by the three major public CSPs: The main ingredient of a well-oiled cloud setup that significantly impacts cloud operations is understanding and managing cloud quotas, also known as service quotas.

Read Post

Site24x7

Read more about Cloud quotas: How to make cloud management easy

Top 3 tools for DORA metrics reporting: SquaredUp vs Power BI vs Jira

May 8, 2025 By Louise Berry In Squared Up

What is it that makes a high-performing software engineering team successful? This was the challenge undertaken by the DevOps Research and Assessment (DORA) team around 2015, who created a set of metrics that could provide a reliable, data-driven way to measure and improve software delivery performance.

Read Post

Squared Up

Read more about Top 3 tools for DORA metrics reporting: SquaredUp vs Power BI vs Jira

Meta-monitoring Loki (Loki Community Call May 2025)

May 8, 2025 By Grafana In Grafana

In this Loki Community Call, we talk about the need for meta-monitoring Loki: why Loki needs to be monitored, what to watch out for, and how to do it. We talk about different ways to get information from Loki that allow you to make it reliable, consistent, and performant, including a Helm chart to deploy a meta-monitoring stack on Kubernetes. We discuss the Loki mixin for Grafana and how to use it to visualize data about Loki. On the call are Jay Clifford, Nicole van der Hoeven, and Dylan Guedes from Grafana Labs.

View Video

Grafana

Read more about Meta-monitoring Loki (Loki Community Call May 2025)

What Is a Network Assessment, and What Is a Network Audit?

May 8, 2025 By Staff Contributor In SolarWinds

These days, networks are larger and more complex than ever. It’s all too easy to fall short when managing performance, security, and compliance. That’s where network assessments and network audits can help. Both network assessments and network audits can give you a more comprehensive understanding of your network and its current strengths, weaknesses, and threats. As a result, you can quickly identify and resolve issues.

Read Post

SolarWinds

Read more about What Is a Network Assessment, and What Is a Network Audit?

It's not just about fixing problems, it's about detecting them before they escalate.

May 8, 2025 By Catchpoint In Catchpoint

IT teams can’t solve what they can’t see. Undetected issues impacting end users lead to lost revenue, brand reputation damage, and frustrated customers. That’s why proactive monitoring is critical. By simulating end-user experiences, you catch small issues before they snowball into major incidents—saving time, money, and operational headaches.

View Video

Catchpoint

Read more about It's not just about fixing problems, it's about detecting them before they escalate.

Digital transformation still a work in progress for 94% of public sector organisations

May 7, 2025 By SolarWinds In SolarWinds

Over half of public sector IT leaders (58%) say skill and talent gaps are a top challenge to digital transformation efforts.

Read Post

SolarWinds

Read more about Digital transformation still a work in progress for 94% of public sector organisations

The Power of Great Design: Introducing the Enhanced Administrative UI for InfluxDB Cloud Dedicated

May 7, 2025 By Ritwika Ghosh In InfluxData

Managing your InfluxDB Cloud Dedicated environment just got easier. We’ve introduced an admin UI to streamline everyday tasks, so you can spend less time navigating settings and more time working with your data. The update is built for speed and usability. Whether you’re creating tables, managing tokens, or checking database status, the new UI helps you move faster with: This update is all about reducing friction for developers and teams managing time series infrastructure at scale.

Read Post

InfluxData

Read more about The Power of Great Design: Introducing the Enhanced Administrative UI for InfluxDB Cloud Dedicated

What Is Snort, How It Works, and Its Integration with SIEM for Cybersecurity

May 7, 2025 By Isaac García In Pandora FMS

You can’t defend against what you can’t see. That’s why the first essential requirement in cybersecurity is to know everything happening in your systems. To achieve this, we implement an IDS (Intrusion Detection System)—a solution that tirelessly monitors every corner of your network like the Eye of Sauron, instantly alerting you to breach attempts and suspicious behavior. Among IDS options, Snort stands out as one of the most popular.

Read Post

Pandora FMS

Read more about What Is Snort, How It Works, and Its Integration with SIEM for Cybersecurity

Network Stress Testing: What It Is & How to Run One

May 7, 2025 By Andrii Kernitskyi In Obkio

You’ve optimized your QoS settings, fine-tuned your firewall, and even upgraded your bandwidth, but what happens when your network gets hit with 10x the normal traffic? Will it hold up, or will it buckle under the pressure, leaving your users staring at spinning wheels and timeout errors? If you’re an IT pro, you know outages don’t happen during idle hours. They strike when traffic spikes.

Read Post

Obkio

Read more about Network Stress Testing: What It Is & How to Run One

Laravel just works. Now your performance monitoring does too.

May 7, 2025 By Will McMullen In Sentry

You remember that first time spinning up a Laravel app? Routes, auth, ORM, queues, all wired up without much effort. It’s one of the reasons Laravel feels productive out of the box. But when something starts slowing down, an Eloquent query drags, a job takes forever, or cache misses creep up, it’s not always obvious where to look. Laravel gives you the tools, but connecting the dots between them is usually on you.

Read Post

Sentry

Read more about Laravel just works. Now your performance monitoring does too.

Kubernetes Alerting That Won't Burn You Out

May 7, 2025 By Anjali Udasi In Last9

Kubernetes production environments require robust alerting to catch problems before they impact users. While monitoring shows system state, proper alerting tells you when something needs attention. This guide outlines 15 key Kubernetes alerts that help DevOps teams avoid outages and minimize downtime. For each alert, we provide implementation guidance and troubleshooting steps to resolve common issues quickly.

Read Post

Last9

Read more about Kubernetes Alerting That Won't Burn You Out

GrafanaCON 2025 Keynote Livestream

May 7, 2025 By Grafana In Grafana

So. Many. Announcements! Grafana 12 is massive. Tune in live to see what it's all about and get a sneak peek of everything we're releasing. We ran into a hiccup during the Git Sync demo. Follow us for the latest and greatest on all things Grafana and our other OSS projects.

View Video

Grafana

Read more about GrafanaCON 2025 Keynote Livestream

Splunk Observability Cloud's AI Assistant in Action | Practical Examples | Part 1

May 7, 2025 By Splunk In Splunk

In this video, we’ll provide practical, real-time examples demonstrating how to effectively use the AI Assistant in Splunk Observability Cloud. You'll learn how the AI Assistant can quickly identify unknown issues in your environment, perform detailed root cause analysis, analyze service performance and deployment impacts, and even help manage infrastructure costs and compliance.

View Video

Splunk

Read more about Splunk Observability Cloud's AI Assistant in Action | Practical Examples | Part 1

Google's Agent-to-Agent (A2A) Protocol is here-Now Let's Make it Observable

May 7, 2025 By Ankit Kumar In Catchpoint

Can your AI tools really work together, or are they still stuck in silos? With Google’s new Agent-to-Agent (A2A) protocol, the days of isolated AI agents are numbered. This emerging standard lets specialized agents communicate, delegate, and collaborate—unlocking a new era of modular, scalable AI systems. Here’s how A2A could transform your workflows, and why making it observable is just as important as making it possible.

Read Post

Catchpoint

Read more about Google's Agent-to-Agent (A2A) Protocol is here-Now Let's Make it Observable

2024 E-Commerce Site Speed & Cart Abandonment Report

May 7, 2025 By Simon Rodgers In WebSitePulse

This report aims to show how site speed affects cart abandonment. It covers global data for 2024, focusing on delays between two and three seconds.

Read Post

WebSitePulse

Read more about 2024 E-Commerce Site Speed & Cart Abandonment Report

Here are 10 ways to prevent website downtime

May 7, 2025 By Mattias Geniar In Oh Dear

Every minute of website downtime costs large organizations an average of $9,000. That’s half a million dollars every hour, damn. And that’s just the average. If your organization heavily relies on your website to do business, that cost can increase even further. Needless to say, preventing website downtime is a top priority.

Read Post

Oh Dear

Read more about Here are 10 ways to prevent website downtime

Essential Python Monitoring Techniques You Need to Know

May 7, 2025 By Anjali Udasi In Last9

Python powers critical applications across countless organizations, from data processing pipelines to web services that handle millions of requests. While Python's readability and extensive ecosystem make it a developer favorite, its performance characteristics require thoughtful monitoring. As systems grow in complexity, understanding what's happening inside your Python applications becomes increasingly important.

Read Post

Last9

Read more about Essential Python Monitoring Techniques You Need to Know

Grafana 12 release: observability as code, dynamic dashboards, new Grafana Alerting tools, and more

May 7, 2025 By Grafana Labs Team In Grafana

Grafana 12 is here! During the opening keynote of GrafanaCON 2025, we unveiled dozens of new reasons to fall in love with everyone’s favorite dashboarding and visualization tool—especially if your job is to keep teams, services, and, of course, a whole lot of beautiful Grafana dashboards organized. Grafana 12.0: Download now!

Read Post

Grafana

Read more about Grafana 12 release: observability as code, dynamic dashboards, new Grafana Alerting tools, and more

GrafanaCON 2025: A guide to all the announcements from Grafana Labs

May 7, 2025 By Grafana Labs Team In Grafana

GrafanaCON 2025 is in full swing in Seattle, where members of our open source community have gathered to explore the latest updates to Grafana Labs’ OSS projects, share their inspiring use cases, and build lasting connections at our biggest community event yet.

Read Post

Grafana

Read more about GrafanaCON 2025: A guide to all the announcements from Grafana Labs

A Detailed Guide on Docker Container Performance Metrics

May 7, 2025 By Preeti Dewani In Last9

Docker containers isolate application environments, making performance monitoring essential for visibility and stability — especially at scale. To manage production effectively, teams need clear insights into resource usage, bottlenecks, and failure points. This guide covers key Docker metrics, how to collect them, and how to use that data to keep your containerized systems running smoothly.

Read Post

Last9

Read more about A Detailed Guide on Docker Container Performance Metrics

Optimising OpenTelemetry Pipelines to Cut Observability Costs and Data Noise

May 7, 2025 By Elizabeth Mathew In SigNoz

Fat bills from observability vendors and tons of not-so-insightful telemetry data have turned out to be a very common issue today. This often leaves teams having to explain the lack of clear ROI, despite the growing costs. If you’re using OpenTelemetry to record your observability data, there are some practical methods you can apply to keep those costs from piling up.

Read Post

SigNoz

Read more about Optimising OpenTelemetry Pipelines to Cut Observability Costs and Data Noise

Modern Logging, Smarter Pricing: Why Graylog's Consumption Model Just Makes Sense

May 7, 2025 By The Graylog Product Team In Graylog

In the world of log management and security analytics, one thing is abundantly clear: data volumes fluctuate. Yet most pricing models haven’t caught up. Traditional ingest-based licensing models force organizations to size their license needs based on a worst-case capacity scenario—the “high-water mark”—whether those spikes are rare and/or expected.

Read Post

Graylog

Read more about Modern Logging, Smarter Pricing: Why Graylog's Consumption Model Just Makes Sense

Kentik Brings Network Intelligence to ServiceNow

May 7, 2025 By Jason McKerr In Kentik

Kentik and ServiceNow are teaming up to bring network intelligence to the ServiceNow AI Platform. This integration enables ServiceNow ITOM customers, even those without deep network expertise, to answer questions about connectivity, performance, and more.

Read Post

Kentik

Read more about Kentik Brings Network Intelligence to ServiceNow

The GrafanaCON 2025 Keynote in 1 minute

May 7, 2025 By Grafana In Grafana

Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more. We also have plans for every use case.

View Video

Grafana

Read more about The GrafanaCON 2025 Keynote in 1 minute

May 2025 Product Updates

May 7, 2025 By Leo Baecker In Hyperping

We're excited to share several improvements, enhancing visibility into your services and making it easier to manage your reliability data.

Read Post

Hyperping

Read more about May 2025 Product Updates

Observability Best Practices: Balancing Sustainability and Cost in a Data-Driven World

May 7, 2025 By Gita Rao Prasad In eG Innovations

Imagine this: Your IT team has invested in cutting-edge observability tools to keep systems running smoothly. But does that imply you are following observability best practices? As your business grows, so does the flood of logs, traces, and metrics—along with a skyrocketing cloud bill. What started as a way to gain better visibility is now a major expense, and suddenly, you’re asking: Are we paying too much for too little value? This challenge is becoming all too common.

Read Post

eG Innovations

Read more about Observability Best Practices: Balancing Sustainability and Cost in a Data-Driven World

How to decide between cloud and on-premise monitoring

May 6, 2025 By Dave Swersky In Raygun

Application performance monitoring systems tend to be available in two modes: on-premise and cloud-based SaaS. Which is the "right" choice? Well, it depends on your situation, but overall cloud-based SaaS offerings have significant benefits when compared to on-premise. However, it's not always so simple. The right selection depends on the facts on the ground. Using my experience working for a large-scale cloud solutions department, I've put together some key things you'll want to consider before you make a decision, starting with some benefits and challenges.

Read Post

Raygun

Read more about How to decide between cloud and on-premise monitoring

IT Monitoring News | May '25 Edition

May 6, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Welcome to the May edition of the NiCE bi-monthly monitoring news! As we move further into the year, we’re here with a fresh roundup of updates, insights, and resources from the world of IT monitoring. Whether you’re looking to stay informed, fine-tune your tools, or catch up on what’s new, this edition has you covered. Enjoy the read!

Read Post

NiCE IT Mgmt

Read more about IT Monitoring News | May '25 Edition

Easiest Way to Monitor Loki Performance With Telegraf

May 6, 2025 By Benjamin Pitts In MetricFire

Loki is a powerful, scalable log aggregation system designed by Grafana to efficiently collect, store, and query logs. It’s often deployed alongside Prometheus as part of modern observability stacks. Loki’s design emphasizes cost-effective storage by indexing only metadata, which makes it a great choice for high-volume environments. But while Loki excels at log ingestion and indexing, many teams overlook the critical task of monitoring Loki itself.

Read Post

MetricFire

Read more about Easiest Way to Monitor Loki Performance With Telegraf

AppSignal Closes $22 Million Growth Investment

May 6, 2025 By 3 authors In AppSignal

In August 2012, we set out to build a fairly priced, developer-centric APM and logging platform that we’d love to use ourselves. Soon after, AppSignal was born and quickly gained traction. Despite operating in a competitive market, we’ve become a household name among many of our peers, now serving over 2,000 organizations across six continents.

Read Post

AppSignal

Read more about AppSignal Closes $22 Million Growth Investment

We built AI-powered Root Cause Analysis that actually works

May 6, 2025 By Nikolay Sivko In Coroot

Figuring out why things break still sucks. We’ve got all the data: metrics, logs, traces, but getting to the actual root cause still takes way too long. Observability tools show us everything, but they don’t really tell us what’s wrong. So why do we even need to automate root cause analysis? First, time. Outages are expensive. And if your system has hundreds or thousands of services, digging through everything by hand just takes way too long.

Read Post

Coroot

Read more about We built AI-powered Root Cause Analysis that actually works

Prometheus native histograms in Grafana Cloud: More precise, easier to use, and better compatibility

May 6, 2025 By Gyorgy Krajcsovits In Grafana

Histograms help you monitor and visualize the distribution of values for key metrics, such as response times or request sizes of a service. They’re frequently used to gain insights into data patterns, anomalies, and trends, making them an important tool for observability.

Read Post

Grafana

Read more about Prometheus native histograms in Grafana Cloud: More precise, easier to use, and better compatibility

Maintaining Effective IT Infrastructure Monitoring in the Public Sector

May 6, 2025 By Progress WhatsUp Gold In WhatsUp Gold

Public sector organizations have needs very different from their commercial counterparts. Cybercriminals go after public sector organizations because they hold confidential, often classified, information: the exact data state-sponsored and other criminal groups salivate over. Funded by tax payments, these organizations serve and answer to the public. Progress WhatsUp Gold offers ample out-of-the-box monitoring features, helping you monitor more of what matters to your organization.

View Video

WhatsUp Gold

Read more about Maintaining Effective IT Infrastructure Monitoring in the Public Sector

You can now log in using passkeys

May 6, 2025 By Freek Van der Herten In Oh Dear

We’ve added a new option to log in to Oh Dear: passkeys! When logging in using a passkey, you don’t have to type an email or password, and you won’t be redirected to a third party for logging in. It’s also super fast.

Read Post

Oh Dear

Read more about You can now log in using passkeys

SQL Server Observability: Monitoring, Troubleshooting, and Best Practices

May 6, 2025 By Preeti Dewani In Last9

For DevOps teams managing mission-critical databases, SQL Server observability is a fundamental capability that provides comprehensive insight into database performance and health. Effective observability practices enable teams to identify potential issues before they impact end users and provide the context necessary to resolve problems efficiently. SQL Server observability involves collecting and analyzing metrics, logs, and traces to build a complete picture of database behavior.

Read Post

Last9

Read more about SQL Server Observability: Monitoring, Troubleshooting, and Best Practices

Visualizing Session Flow With Honeycomb

May 6, 2025 By Ken Rimple In Honeycomb

I want to know what users are doing in my application. A distributed trace is the best way to show the data flow of one user interaction through my application, but it isn’t sufficient to show the overall user experience.

Read Post

Honeycomb

Read more about Visualizing Session Flow With Honeycomb

The Definitive Guide to OpenTelemetry Exporters for High-Performance Monitoring

May 6, 2025 By Anjali Udasi In Last9

In modern distributed architectures, observability has shifted from optional to necessary. OpenTelemetry has emerged as the standard framework for telemetry data collection, with exporters serving as the critical bridge to your backend monitoring systems. For developers at any stage—those new to observability practices or those refining existing monitoring setups—a solid grasp of OpenTelemetry exporters will significantly reduce debugging time and improve system visibility.

Read Post

Last9

Read more about The Definitive Guide to OpenTelemetry Exporters for High-Performance Monitoring

Troubleshooting N+1 Errors with Tracing

May 6, 2025 By Sentry In Sentry

Repository: https://github.com/pry0rity/supa-simple-demo
Try Sentry for free: https://sentry.io
Docs: https://docs.sentry.io

View Video

Sentry

Read more about Troubleshooting N+1 Errors with Tracing

Prometheus native histograms in Grafana Cloud

May 6, 2025 By Grafana In Grafana

In this demo video, Gyorgy Krajcsovits, Senior Software Engineer at Grafana Labs, shows you how to use Prometheus native histograms in Grafana.

View Video

Grafana

Read more about Prometheus native histograms in Grafana Cloud

Emerge Tools is now a part of Sentry

May 6, 2025 By David Cramer In Sentry

Today I'm thrilled to announce that Emerge Tools is joining Sentry. Emerge builds best-in-class mobile tooling trusted by some of the most important brands in the world. You’ve probably seen the work of the team through their relentless efforts to improve mobile builds, efforts we’ve always admired here at Sentry. It was no surprise that when we finally met Emerge founders Josh and Noah we found that we shared a similar view of the world and hit it off instantly.

Read Post

Sentry

Read more about Emerge Tools is now a part of Sentry

WhatsUp Gold IT-Infrastruktur-Monitoring: Überblick, Entwicklungen & Neuerungen

May 5, 2025 By Progress WhatsUp Gold In WhatsUp Gold

Join our exclusive webinar to see how WhatsUp Gold can take your IT monitoring to the next level. We show you the latest features and notable developments from the past two years, and give an exclusive preview of what is coming, including the powerful Network Traffic Analysis Plus (NTA+).

View Video

WhatsUp Gold

Read more about WhatsUp Gold IT-Infrastruktur-Monitoring: Überblick, Entwicklungen & Neuerungen

What Can SCORCH, SCSM & SCVMM Do for SCOM? Find Out at Our Expert Session!

May 5, 2025 By NiCE IT Management Solutions In NiCE IT Mgmt

SCOMathon 2025 | Panel Session by Axians, NiCE, and Kelverion. Are you making the most of Microsoft System Center 2025? Join us for a power-packed expert discussion where we break down how SCOM, SCORCH, SCSM, and SCVMM work together to supercharge your IT operations!

View Video

NiCE IT Mgmt

Read more about What Can SCORCH, SCSM & SCVMM Do for SCOM? Find Out at Our Expert Session!

Mastering Network Configuration for Stability and Security

May 5, 2025 By Yann Guernion In Broadcom

Your network is the central nervous system of your business. Its performance, reliability, and security have a direct impact on your organization’s operations, revenue, and reputation. Yet, lurking within this critical infrastructure is a common source of disruption and risk: network configuration changes.

Read Post

Broadcom

Read more about Mastering Network Configuration for Stability and Security

Microsoft SCOM Management Pack Wonderland by NiCE

May 5, 2025 By NiCE IT Management Solutions In NiCE IT Mgmt

SCOMathon 2025 | NiCE Session. Get a quick intro to all NiCE Management Packs and learn how they can elevate your SCOM monitoring.

View Video

NiCE IT Mgmt

Read more about Microsoft SCOM Management Pack Wonderland by NiCE

How can we modernize payment infrastructure at a global scale?

May 5, 2025 By Kelly Manrique In Elastic

How SWIFT and Elastic are tackling infrastructure complexity, false alerts, and rising compliance demands.

Read Post

Elastic

Read more about How can we modernize payment infrastructure at a global scale?

Stop Playing IT Whack-a-Mole: The Smarter Way to Prevent Outages Before They Happen

May 5, 2025 By ScienceLogic In ScienceLogic

The challenges facing IT operations teams today are bigger than ever before. Hybrid cloud adoption, sprawling infrastructure, the explosive growth of telemetry data, and the accelerating pace of digital business have pushed traditional monitoring approaches to their breaking point. Yet for many organizations, the operational model remains stubbornly reactive: a never-ending game of IT whack-a-mole, where teams are trapped responding to incidents instead of preventing them.

Read Post

ScienceLogic

Read more about Stop Playing IT Whack-a-Mole: The Smarter Way to Prevent Outages Before They Happen

Bring third-party incidents into Better Stack

May 5, 2025 By Nuno Tomas In isDown

Incidents in cloud and SaaS tools block users just as hard as faults in your own code. The fix comes faster when the same on-call queue covers both. IsDown now plugs straight into Better Stack through a native API connection. Every outage that IsDown detects shows up as an incident in Better Stack, follows your existing escalation rules, and clears automatically once the vendor recovers.

Read Post

isDown

Read more about Bring third-party incidents into Better Stack

Agentic AIOps: Why Agent-Driven Solutions Are Defining the Future of IT Operations

May 5, 2025 By LogicMonitor In LogicMonitor

AIOps is overdue for reinvention. The last decade promised faster resolution and smarter alerts—but most tools are still built on outdated assumptions: linear workflows and deterministic rules. Now, a new model is emerging. Not reactive. Not rule-based. Agentic. Agentic AIOps is about taking action. Products like LogicMonitor’s Edwin AI go beyond recommendations—they correlate, decide, and remediate in real time.

Read Post

LogicMonitor

Read more about Agentic AIOps: Why Agent-Driven Solutions Are Defining the Future of IT Operations

Telemetry Flow from #WindowsEvents to #Bindplane Gateway and #GoogleSecOps

May 5, 2025 By Bindplane In ObservIQ

Check out the full @bindplane Google SecOps deep dive workshop.

View Video

ObservIQ

Read more about Telemetry Flow from #WindowsEvents to #Bindplane Gateway and #GoogleSecOps

Logz.io Integration for AWS and Kubernetes Observability

May 5, 2025 By Jade Lassery In logz.io

Ever feel like you’re flying blind in your AWS environment? You’re not alone. In the sprawling universe of microservices, containers, and serverless functions, trying to troubleshoot without proper observability is like trying to find a bug in a datacenter… with the lights off… while wearing sunglasses.

Read Post

logz.io

Read more about Logz.io Integration for AWS and Kubernetes Observability

Reporting CSP Errors in Honeycomb With the OpenTelemetry Collector

May 5, 2025 By Martin Holman In Honeycomb

The HTTP Content-Security-Policy response header is used to control how the browser is allowed to load various content types. It is used to control which URLs, fonts, images, scripts, and more can be loaded onto the page. It’s a great defense against XSS (cross-site scripting), clickjacking, and cross-site vulnerabilities. The header can also specify a URL that will be used to send reports on violations of these properties.

Read Post

Honeycomb

Read more about Reporting CSP Errors in Honeycomb With the OpenTelemetry Collector

How Docker Logging Drivers Work

May 5, 2025 By Anjali Udasi In Last9

Troubleshooting containerized applications can quickly become complex when logs are scattered across multiple systems. Most DevOps teams face this challenge daily—what starts as a simple container deployment often evolves into a complex logging puzzle. This guide explores Docker logging drivers in depth, covering configuration options, best practices, and practical solutions.

Read Post

Last9

Read more about How Docker Logging Drivers Work

React Logging: How to Implement It Right and Debug Faster

May 5, 2025 By Faiz Shaikh In Last9

React logging is the practice of recording relevant information about your application's behavior during runtime. Unlike traditional server-side logging, React logging happens in the browser and focuses on frontend concerns: component lifecycle events, state changes, user interactions, performance metrics, and network requests. Effective logging creates breadcrumbs that help you understand application flow and quickly pinpoint problems.

Read Post

Last9

Read more about React Logging: How to Implement It Right and Debug Faster

Monitoring the Impossible & Other Use Cases - Webinar by 2Steps Tech with David Dick (co-founder)

May 5, 2025 By 2 Steps In 2 Steps

2Steps is changing the landscape of proactive monitoring. Now, in this lunch-and-learn, you get a deeper dive on the platform and how organisations are using it for previously-unsolved problems. Observability professionals have described 2Steps saying, “There is no better way to do it,” “It’s incredibly valuable,” “Nothing can really compare,” and “The only reason lots of businesses aren't doing this already is they simply don't know about it.”

View Video

2 Steps

Read more about Monitoring the Impossible & Other Use Cases - Webinar by 2Steps Tech with David Dick (co-founder)

Parse Windows Event Logs with XML Parser for Google SecOps #opentelemetry #google #secops

May 5, 2025 By Bindplane In ObservIQ

Check out the full @bindplane Google SecOps deep dive workshop.

View Video

ObservIQ

Read more about Parse Windows Event Logs with XML Parser for Google SecOps #opentelemetry #google #secops

Unlocking the Power of LLMs and AI Agents for Network Automation

May 2, 2025 By Dallon Robinette In Selector

Artificial intelligence is reshaping how enterprises manage and secure their networks, but not all AI is created equal, and not all Large Language Models (LLMs) are ready for the job. While tools like ChatGPT and Google Gemini are transforming communication and productivity, applying general-purpose LLMs to something as specialized and high-stakes as network operations is an entirely different challenge. Networks are dynamic, complex, and context-heavy.

Read Post

Selector

Read more about Unlocking the Power of LLMs and AI Agents for Network Automation

Kubernetes Monitoring in 2025: The Complete Guide to Cluster Visibility

May 2, 2025 By Arpit Sharma In Motadata

Modern cloud-native applications rely on Kubernetes as their leading container orchestration platform. The adoption of Kubernetes in 2025 has achieved remarkable heights, making it the primary operator of vital enterprise systems across financial technology and healthcare organizations. Kubernetes environments continue to grow increasingly complex, and their dynamics are evolving, so monitoring has become an essential strategic practice.

Read Post

Motadata

Read more about Kubernetes Monitoring in 2025: The Complete Guide to Cluster Visibility

Grafana Alerting Overview Plus New Features Coming to Grafana 12 | Grafana Labs

May 2, 2025 By Grafana In Grafana

In this walkthrough, Grafana’s Ryan Kehoe dives into the biggest improvements designed to help teams create, manage, and route alerts with less friction and more power. Whether you're wrangling multi-source queries or managing alerts across large environments, these updates are for you.

View Video

Grafana

Read more about Grafana Alerting Overview Plus New Features Coming to Grafana 12 | Grafana Labs

Cribl Edge: Unify Telemetry Collection | Lightboard Demo

May 2, 2025 By Cribl In Cribl

Cribl Edge is a vendor-neutral, intelligent agent designed for the variety and scale of today’s modern architectures. With a unified telemetry collection system, you can have hundreds of thousands of agents at your fingertips to automatically discover and collect data from your Windows, Linux, and Kubernetes environments. Featuring a rich UI, centralized fleet management, and seamless upgrades, it’s time to transform your agent management.

View Video

Cribl

Read more about Cribl Edge: Unify Telemetry Collection | Lightboard Demo

April product updates

May 2, 2025 By Colin Bartlett In StatusGator

April brought some fresh updates to StatusGator! We’ve added a few new features and improvements to help you stay even more informed and in control. Here’s a quick recap of what’s new!

Read Post

StatusGator

Read more about April product updates

Empower Your ServiceNow Users and Eliminate 5 Common Pain Points with Nexthink Adopt

May 2, 2025 By Nexthink In Nexthink

Organizations use ServiceNow to triage incidents, deliver services, and track changes. But just because ServiceNow is frequently used doesn’t mean it’s intuitive for everyone. The real challenge lies in how people use it.

Read Post

Nexthink

Read more about Empower Your ServiceNow Users and Eliminate 5 Common Pain Points with Nexthink Adopt

Product Watch - April 2025

May 2, 2025 By Sheshachalam Ratnala In OpsRamp

Welcome to the latest HPE OpsRamp product update blog! Here, you'll find the latest news on exciting features, enhancements, and releases designed to elevate your experience with our product. Let’s dive into what’s new for the month of April. Transformed Usability Experience, Dashboard Improvements and New Visualizations.

Read Post

OpsRamp

Read more about Product Watch - April 2025

How to triage code issues using AI Agent and the Rollbar MCP Server

May 2, 2025 By Rollbar In Rollbar

How can you triage software code errors and exceptions using an AI agent and the rich exception data from Rollbar's Error Monitoring solution with the Rollbar MCP Server?

View Video

Rollbar

Read more about How to triage code issues using AI Agent and the Rollbar MCP Server

Getting started with Jenkins dashboards

May 2, 2025 By Sameer Mhaisekar In Squared Up

Jenkins is an open-source automation server widely used for continuous integration and continuous delivery (CI/CD), enabling developers to automate the building, testing, and deployment of software projects. Jenkins benefits from a good visualization layer, which provides real-time visibility into pipeline performance, build statuses, test results, and deployment progress.

Read Post

Squared Up

Read more about Getting started with Jenkins dashboards

How Quick User Tests Help Us Make Better UI Decisions in Icinga Web

May 2, 2025 By Florian Strohmaier In Icinga

Designing user interfaces for Icinga Web is always a bit of a balancing act. Once we’ve worked through all the technical and conceptual details of a new feature, it can be tough to step back and see things from a fresh user’s point of view. We as developers know too much — and that makes it hard to guess how others will understand what we’ve built.

Read Post

Icinga

Read more about How Quick User Tests Help Us Make Better UI Decisions in Icinga Web

Easily Query Multiple Metrics in Prometheus

May 2, 2025 By Preeti Dewani In Last9

In monitoring setups, working with a single metric rarely tells the complete story. The real power of Prometheus lies in its ability to query multiple metrics simultaneously, creating connections between different data points that reveal the true state of your systems. This guide will walk you through everything you need to know about crafting effective multi-metric queries in Prometheus – from basic concepts to advanced techniques that will help you monitor and troubleshoot your infrastructure.

Read Post

Last9

Read more about Easily Query Multiple Metrics in Prometheus

Observe VMWare vCenter Cluster and Cloud with Confidence: Achieve Full Stack Observability with DX Operational Observability (DX O2)

May 2, 2025 By Srikant Noorani In Broadcom

As enterprises continue their cloud and container journeys as part of modernization efforts, they are realizing “hybrid reality” is here to stay. For many, moving all services to clouds or containers is not a viable option. As a result, at least some services will be required to remain on premises. This presents unique challenges and ongoing complexity for monitoring and observability.

Read Post

Broadcom

Read more about Observe VMWare vCenter Cluster and Cloud with Confidence: Achieve Full Stack Observability with DX Operational Observability (DX O2)

Apache Logs Explained: A Guide for Effective Troubleshooting

May 2, 2025 By Faiz Shaikh In Last9

Apache logs are a critical tool for monitoring your web server, but they can often feel overwhelming. For DevOps teams, understanding these logs is essential for diagnosing issues and maintaining system reliability. In this guide, we'll explore the setup and analysis of Apache logs, offering practical tips to help you make sense of them and use them effectively for troubleshooting and optimization.

Read Post

Last9

Read more about Apache Logs Explained: A Guide for Effective Troubleshooting

A Practical Guide to Monitoring Ubuntu Servers

May 2, 2025 By Anjali Udasi In Last9

Running Ubuntu servers without proper monitoring can lead to unexpected issues. For DevOps engineers and SREs, effective tracking is crucial for maintaining system health and performance. This guide covers everything you need to know about monitoring Ubuntu servers, from the basics to advanced strategies, helping you keep your systems running smoothly, whether you manage a single server or a large fleet.

Read Post

Last9

Read more about A Practical Guide to Monitoring Ubuntu Servers

New Services Launched: MetrixInsight as a Service & EUC Managed Support

May 1, 2025 By GripMatix In GripMatix

We’re proud to introduce two new services designed to give IT teams better visibility, control, and support across the EUC landscape: MetrixInsight as a Service and EUC Managed Support & Consultancy.

Read Post

GripMatix

Read more about New Services Launched: MetrixInsight as a Service & EUC Managed Support

Monitoring & Debugging a Checkout Flow in Flask & React

May 1, 2025 By Will McMullen In Sentry

When your checkout flow breaks, customers disappear faster than most ‘cutting-edge’ JS metaframeworks. Thankfully, setting up observability for your critical paths—like a customer checkout—is painless with Sentry. Let's walk through how we instrumented, monitored, and fixed a major issue, with minimal effort.

Read Post

Sentry

Read more about Monitoring & Debugging a Checkout Flow in Flask & React

Mission-Critical Visibility: How Observability Empowers the DoD

May 1, 2025 By Jeanne Falick In Splunk

Tech is entering another wave of innovation with AI. With accelerated innovation comes increased complexity in already disparate environments. For Defense, those complexities are compounded by the need to maintain and operate mission critical infrastructure with highly sensitive data in air-gapped environments, often running on custom digital systems and applications. Accelerating the speed of innovation with leading technology is key for the military to maintain its competitive edge.

Read Post

Splunk

Read more about Mission-Critical Visibility: How Observability Empowers the DoD

Redoing My Progress WhatsUp Gold Home Lab with Proxmox: A Journey of Failover, Backup and Recovery

May 1, 2025 By Jason Alberino In WhatsUp Gold

Greetings, tech enthusiasts! I hope you’re all doing well. Today, I’m thrilled to share the story of my recent adventure in rearchitecting my home lab with Proxmox. This journey has been a rollercoaster of unexpected challenges, valuable lessons, and rewarding successes. I built a resilient and efficient setup that exceeded my initial expectations by leveraging modern virtualization and storage technologies.

Read Post

WhatsUp Gold

Read more about Redoing My Progress WhatsUp Gold Home Lab with Proxmox: A Journey of Failover, Backup and Recovery

Top 5 EdTech outages detected by StatusGator in April 2025

May 1, 2025 By Colin Bartlett In StatusGator

In April 2025, leading EdTech platforms experienced outages that impacted students, educators, and administrators worldwide. StatusGator’s Early Warning Signals played a key role in identifying and reporting issues before official sources did, enabling schools and institutions to respond swiftly. These real-time alerts helped reduce disruption during critical learning and administrative operations. Here are the top five EdTech outages detected by StatusGator in April.

Read Post

StatusGator

Read more about Top 5 EdTech outages detected by StatusGator in April 2025

Lunes Sin Luz: Spain and Portugal's Historic Outage

May 1, 2025 By Doug Madory In Kentik

On Monday, April 28, 2025, Spain and Portugal experienced a widespread electrical blackout of historic proportions. In this post, we look into the internet outages caused by the loss of power, including impacts outside of the Iberian Peninsula and to Starlink service in Spain.

Read Post

Kentik

Read more about Lunes Sin Luz: Spain and Portugal's Historic Outage

Configuring and Using the Sentry MCP Server

May 1, 2025 By Sentry In Sentry

MCP (Model Context Protocol) is quickly becoming the standard way that people pull external context into LLM interactions, so we've built a Sentry MCP server using Cloudflare to bring all the context of Sentry into your LLM interactions.

View Video

Sentry

Read more about Configuring and Using the Sentry MCP Server

Product Update - Public Status Pages

May 1, 2025 By Hrishikesh Barua In IncidentHub

A public status page is a page that you can share with your team to show a summary view of your third-party dependencies. We rolled out this feature recently to all IncidentHub users. This article is a quick tour of the feature and how to set it up.

Read Post

IncidentHub

Read more about Product Update - Public Status Pages

Why no one talks about querying across signals in observability?

May 1, 2025 By Srikanth Chekuri In SigNoz

In today’s complex distributed systems, observability has evolved from a nice-to-have feature to a mission-critical engineering discipline. Engineering teams across organizations depend on robust observability to maintain system reliability and quickly diagnose issues when they inevitably arise. However, current observability tooling significantly lags behind user expectations by failing to support a critical capability: querying across different telemetry signals.

Read Post

SigNoz

Read more about Why no one talks about querying across signals in observability?

Top 5 outages detected by StatusGator in April 2025

May 1, 2025 By Colin Bartlett In StatusGator

In April 2025, several major services faced outages that disrupted businesses and users globally. StatusGator provided early detection and real-time updates, helping users stay informed before official announcements. With its Early Warning Signals feature, StatusGator alerted users to potential disruptions even before the affected services acknowledged the issues—giving users a critical edge in responding to outages. Here are the top five outages detected by StatusGator in April.

Read Post

StatusGator

Read more about Top 5 outages detected by StatusGator in April 2025

Simulate Real User Workflows | Introduction to Grafana Cloud Synthetic Monitoring

May 1, 2025 By Grafana In Grafana

Just because your app is up doesn’t mean it’s working. Behind the scenes, users could be facing failed checkouts, broken workflows, or slow page loads — and you may not know until it’s too late. In this video, we’ll show you how Grafana Cloud Synthetic Monitoring helps you proactively simulate real user behavior and monitor the performance of your critical user flows, websites, and APIs from locations around the world — so you can catch issues before your users do.

View Video

Grafana

Read more about Simulate Real User Workflows | Introduction to Grafana Cloud Synthetic Monitoring

k8s-monitoring Helm chart Office Hours 2025-04-25

May 1, 2025 By Grafana In Grafana

In the April edition of the Kubernetes Monitoring Helm chart office hours, we discuss the recent updates to version 2.0. We also discuss the plan for the v2.1 release, going into more detail about the implementation plan and the use of the Alloy Operator. Finally, we end with Q&A.

View Video

Grafana

Read more about k8s-monitoring Helm chart Office Hours 2025-04-25

Azure DevOps agent pools: diving deeper

May 1, 2025 By John Hayes In Squared Up

Most of the time the build and deployment pipelines we create will run on compute provided by the Azure DevOps cloud and the only decision we need to make is whether to select a Windows or Linux Agent. Sometimes though, the specification for the VM that Azure DevOps spins up may not be right for our needs. We may need more memory or a particular OS version. This is when custom agents and Agent Pools come into play.

Read Post

Squared Up

Read more about Azure DevOps agent pools: diving deeper

ScienceLogic Named a Leader in AIOps: Paving the Way for Autonomous IT Operations

May 1, 2025 By ScienceLogic In ScienceLogic

The challenges plaguing IT operations are not new. The exponential growth of hybrid and multi-cloud environments, increasing data volume, complexity, and accelerating pace of change have made traditional approaches to IT operations unsustainable.

Read Post

ScienceLogic

Read more about ScienceLogic Named a Leader in AIOps: Paving the Way for Autonomous IT Operations

Dynamic Demands, Dynamic Solutions: IT's Role in the Next AI Workflow Evolution

May 1, 2025 By Teneo In Teneo

I have just finished reviewing the Microsoft Work Trend Index Annual Report for 2025, which offers fascinating insights into the next wave of organizational evolution. I am particularly excited about the section titled ‘Journey to the Frontier Firm’ and what is possible in phase three, where employees will harness the power of multiple AI agents, creating an ‘agentic swarm’ capable of executing tasks at a scale and speed previously unimaginable.

Read Post

Teneo

Read more about Dynamic Demands, Dynamic Solutions: IT's Role in the Next AI Workflow Evolution

Top Microsoft Teams Metrics: How to Measure & Improve Call Quality

May 1, 2025 By Andrii Kernitskyi In Obkio

As an IT professional, you know that Microsoft Teams is only as good as the network it runs on. Poor call quality (choppy audio, frozen video, or sudden disconnections) can disrupt productivity and frustrate users. But how do you pinpoint the root cause? The answer lies in monitoring Microsoft Teams performance metrics.

Read Post

Obkio

Read more about Top Microsoft Teams Metrics: How to Measure & Improve Call Quality

Monitor the full end-user experience: k6 browser checks in Synthetic Monitoring are generally available

May 1, 2025 By Virginia Cepeda In Grafana

We continue to evolve Grafana Cloud Synthetic Monitoring to help you simulate even the most complex transactions and user journeys, and proactively monitor the performance of your web applications and APIs. In line with this effort, we’re excited to share that k6 browser checks in Synthetic Monitoring are now generally available.

Read Post

Grafana

Read more about Monitor the full end-user experience: k6 browser checks in Synthetic Monitoring are generally available

Operations | Monitoring | ITSM | DevOps | Cloud