Unify and correlate frontend and backend data with retention filters

Teams can use Datadog Real User Monitoring (RUM) and RUM without Limits to get full visibility into the frontend health of their applications while retaining only the sessions that contain critical problems that affect the end-user experience. But application errors or slowness often result from backend issues, such as database bottlenecks. To diagnose these issues, you need to correlate the frontend data from RUM with the backend data from Datadog Application Performance Monitoring (APM).

Monitor Arista VeloCloud SD-WAN performance with Datadog

As organizations grow their cloud environments and branch office networks, maintaining reliable connectivity and application performance becomes more complex. VeloCloud SD-WAN provides dynamic, policy-based routing to help ensure that your connectivity is dependable and cost-efficient, and that your applications perform consistently.

Building reliable dashboard agents with Datadog LLM Observability

This article is part of our series on how Datadog’s engineering teams use LLM Observability to iterate, evaluate, and ship AI-powered agents. In this first story, the Graphing AI team shares how they instrumented their widget- and dashboard-generation agents with LLM Observability to detect regressions and debug failures faster. Visibility into how large language model (LLM) applications behave in real time is essential for building reliable AI-driven systems at Datadog.

How we built an AI SRE agent that investigates like a team of engineers

We built Bits AI SRE to help engineers investigate and solve production incidents, one of the most difficult aspects of operating distributed systems today. As environments grow more dynamic and complex, resolving issues becomes more challenging. Failures now span more services, involve noisier signals, and encompass larger volumes of telemetry data, making it hard for on-call engineers to find root causes quickly. Today, Bits AI SRE is already helping teams decrease time to resolution by up to 95%.

Automate flaky test fixes with the Bits AI Dev Agent and Test Optimization

Flaky tests are a significant source of inefficiency that impacts many engineering teams. Along with failing your build, they interrupt your entire development flow, generate excessive CI/CD noise, and, critically, compromise developer trust in the test suite itself. Datadog Test Optimization enables you to manage test suites at scale by pinpointing the flakiest tests, analyzing their history across hundreds of runs, and automatically surfacing the root cause.

Datadog integrations 2025 recap: Observability for AI, security, and hybrid cloud

The year 2025 marked a major milestone in the Datadog integrations ecosystem as we surpassed 1,000 integrations. Along the way, we also added over 110 new technology partners and expanded coverage across the fastest-growing software categories, including AI, distributed security, hybrid infrastructure, and data intelligence. This recap highlights the most impactful integrations we released this year and how they connect to these broader technology trends.

Bring faster visibility into AWS Lambda functions with remote instrumentation

Comprehensive observability is critical for running performant, reliable, and secure serverless workloads. However, configuring and maintaining that visibility across hundreds or thousands of serverless functions can be difficult to scale and sustain. Developers across teams often manage serverless functions using different infrastructure as code (IaC) frameworks, as well as different review, deployment, and update processes.

Implement dbt data quality checks with dbt-expectations

dbt is one of the most popular solutions for data transformations and modeling. Many commercial data pipelines rely on dozens, or even hundreds, of individual dbt jobs. Data engineers, data platform engineers, and analytics engineers who own these pipelines need to maintain a testing framework to prevent mistakes in data processing that can compromise analysis.

Troubleshoot faster with the GitLab Source Code integration in Datadog

Developers and SREs who rely on GitLab to develop their services often face significant friction when troubleshooting errors or fixing issues that degrade code quality. To understand the context of a problem, they resort to tab-hopping between observability tools and GitLab, connecting stack traces, spans, and profiles back to the right files and commits.