
Catch and remediate ECS issues faster with default monitors and the ECS Explorer

Organizations that run applications on Amazon Elastic Container Service (Amazon ECS) often juggle signals across container and task metrics, logs, and events while they hunt for the change or condition that broke a deployment. This work adds operational overhead and extends incident timelines as teams switch between tools and manually correlate symptoms.

Key learnings from the State of Containers and Serverless report

We recently released the 2025 State of Containers and Serverless report, which examines cloud usage data from tens of thousands of Datadog customers. The study shows adoption trends across container orchestration platforms and serverless offerings, and it explores how organizations use those resources to optimize workloads for efficiency, cost, and simplicity.

Bits AI SRE, Flex Frozen, and GPU Monitoring | DASH 2025

Get a first look at Datadog’s biggest product reveals from DASH 2025. Meet Bits AI SRE, your 24/7 autonomous AI Site Reliability Engineer, Flex Frozen for up to 7 years of managed log retention, and GPU Monitoring for full visibility into your AI workloads. Experience the future of observability in action.

Turn fragmented runtime signals into coherent attack stories with Datadog Workload Protection

Security teams face a constant trade-off between detection coverage and alert fatigue. Broad, rule-based detection approaches surface every possible indicator of compromise (IoC) but generate unmanageable alert volumes. Narrow, tightly scoped rules reduce noise but risk missing critical signals. And while individual IoCs can highlight suspicious behavior, they often lack the surrounding context needed to tell a complete story of how an attack unfolded.

Triaging an Incident with a Critical Data Pipeline at #rivian

Rivian makes electric vehicles to advance its mission to keep the world adventurous forever. As software-defined vehicles, Rivian’s R1T and R1S are connected to the cloud from day one, and telemetry data is at the heart of enabling mobile notifications, remote diagnostics, fleet management, and more. With so many critical pipelines in the cloud, observability is a top priority for the data platform.

Safely Roll Out Features with Datadog Feature Flags

In this short demo, see how Datadog Feature Flags help teams release new functionality safely and efficiently. Datadog provides advanced targeting, progressive rollouts, and automatic rollbacks — all integrated with powerful observability data. Learn how you can use simple on–off flags or multi-variant configurations to test and deploy features with confidence. With built-in monitoring of key guardrail metrics, Datadog can automatically pause or reverse rollouts when issues are detected, keeping your releases stable.

Building Smarter AI Products #Datadog #DASH #AI

AI capabilities are advancing faster than ever — transforming how teams design, build, and ship intelligent products. In this teaser from Building Successful AI-powered Products at Datadog DASH, experts discuss the rise of agent-based systems, evolving model capabilities, and how to stay ahead in the new era of automation.

How Datadog is Reinventing On-Call #Datadog #OnCall #DevOps

Datadog is reimagining how engineers handle incidents—moving beyond simple alerts to an intelligent, voice-driven on-call experience. With Datadog On-Call, teams can acknowledge alerts, access runbooks, post to Slack, and collaborate in real time, all before even touching their computer. See how Datadog brings incident response, communication, and automation together so you can respond faster and keep customers informed.

Understand user experience through network performance with Datadog Synthetic Monitoring

When an application slows down or fails, pinpointing the cause isn’t always simple. Is it a backend regression, a misbehaving API, or a bottleneck somewhere deep in the network? Without full visibility, teams waste precious time troubleshooting across disconnected tools and layers. Datadog Synthetic Monitoring now supports Network Path to help you proactively identify whether user-facing issues stem from your code or from the underlying network.