The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

A deep dive into Database Monitoring index recommendations

Datadog Database Monitoring (DBM) Recommendations help you proactively optimize performance throughout your database fleet. DBM draws on a wide range of data sources to detect issues such as blocking queries, low disk space, and missing indexes, and to provide actionable guidance on each. In this post, we’ll show you how DBM formulates targeted indexing recommendations to help you optimize database performance.

Increase control and reduce noise in your AWS logs using Datadog Observability Pipelines

Today’s SRE and security operations center (SOC) teams often find themselves overwhelmed by the sheer volume and variety of logs generated by critical AWS services such as VPC Flow Logs, AWS WAF, and Amazon CloudFront. While these logs can be valuable for detecting and investigating security threats, as well as troubleshooting issues in your environment, managing them at scale can be challenging and costly.

Breaking Free from Legacy Observability: Why Service Providers Choose Kentik Over Deepfield

Modern network operators need modern observability tools. In this post, we explore why Deepfield, a traditional network flow analytics platform, falls short of providing the comprehensive insights that network operations now require, and how Kentik’s data platform is purpose-built for today’s infrastructure teams.

How Forbes delivers a premium digital experience with Datadog

Learn how Forbes, a global media powerhouse, successfully migrated to the cloud with Datadog. Discover how they gave teams across their entire tech stack access to IT data so they could make critical improvements. The team maintained 99.5 percent uptime through proactive alerting and improved root cause analysis by 10 percent.

Making sure you get a Checkly alert for every detected failure

It’s every ops team’s biggest anxiety: a monitoring system detects a failure, but the notification either isn’t delivered or isn’t noticed by the team. Now we have to wait for users to complain before our team knows about the problem. Checkly sends an alert every time the system detects a failure, but how can you be sure you’re getting those alerts, and that those alerts are going to the right people?

How to Use OpenSearch with Python for Search and Analytics

If you’re working with search and analytics, you’ve probably heard about OpenSearch, the open-source alternative to Elasticsearch. OpenSearch is a powerful tool, whether you’re building a search engine, running log analytics, or implementing full-text search in your applications. And the best part? You can integrate it easily with Python.
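
As a rough illustration of what that integration can look like, here is a minimal sketch that builds a full-text `match` query and posts it to an index's `_search` REST endpoint using only the Python standard library. The helper names `build_match_query` and `search` are hypothetical, as are the node address and index name mentioned afterwards. The linked article may well use a dedicated client library instead.

```python
import json
import urllib.request


def build_match_query(field, text, size=10):
    """Return a query DSL body for a full-text match on one field.

    Hypothetical helper for illustration; field and text are caller-supplied.
    """
    return {"size": size, "query": {"match": {field: text}}}


def search(base_url, index, body, timeout=5.0):
    """POST the query body to <base_url>/<index>/_search and decode the JSON reply.

    Assumes an endpoint reachable without authentication, which is an
    assumption made only to keep the sketch short.
    """
    request = urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Assuming a local test node at `http://localhost:9200` and an index called `articles` (both placeholders), `search("http://localhost:9200", "articles", build_match_query("title", "monitoring"))` would return the decoded response, with matching documents under `["hits"]["hits"]`.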

OpenTelemetry Visualization Setup: A Developer's Guide

If you’ve ever tried to set up OpenTelemetry visualization, you know it can be a bit overwhelming. But don’t worry: in this guide, we’ll break it all down step by step. Whether you’re just getting started or looking to fine-tune your existing setup, this walkthrough will help you get the most out of your telemetry data.

It was DNS Again: Why Your Status Page Needs Its Own Domain

On February 20, 2025, at 16:22 UTC, StatusGator detected an outage affecting Vultr. The issue appeared to stem from a DNS failure, causing vultr.com and any other services hosted on its domain to become inaccessible. But what does that include? The official Vultr status page. Because Vultr hosts its status page on status.vultr.com, a subdomain of the same domain that hosts its primary website and dashboard, users were left without an official source of updates during the outage.
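
The dependency described above can be checked mechanically. The sketch below is a deliberately naive illustration: it treats the last two DNS labels of a hostname as its registrable domain and flags a status page that shares that domain with the primary site. The function names are hypothetical, and the two-label heuristic is an assumption that breaks for multi-part suffixes such as `co.uk`; a real check would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def registrable_domain(url_or_host):
    """Naive guess at the registrable domain: the last two DNS labels.

    Assumption: single-label public suffixes only (.com, .net, ...).
    """
    # Accept either a full URL or a bare hostname.
    host = urlsplit(url_or_host).hostname if "://" in url_or_host else url_or_host
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def shares_domain(primary, status_page):
    """True when both hosts sit under the same registrable domain,
    so a DNS failure for that domain would take both offline."""
    return registrable_domain(primary) == registrable_domain(status_page)
```

With the hostnames from the incident, `shares_domain("vultr.com", "status.vultr.com")` evaluates to `True`, which is the situation the article warns about. A status page on an unrelated domain would evaluate to `False`.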