Leverage Cloudflare logs for cost optimization, troubleshooting, and security

Cloudflare is a content delivery network (CDN) that helps businesses accelerate, protect, and optimize their websites, applications, and APIs. It acts as a reverse proxy, sitting between users and a website’s origin server to provide DDoS protection, a web application firewall (WAF), CDN caching, and load balancing.

Spotlight on Reference Tables: Add Custom Metadata in Datadog

This month we’re putting the spotlight on Reference Tables, which is now generally available and enables teams to add custom metadata to their existing Datadog telemetry. Watch the new episode of This Month in Datadog to learn more.

Simplify multi-cloud cost management with FOCUS and Datadog

When your cloud environment spans multiple cloud service providers (CSPs) and SaaS providers, it can be challenging to collect cost and usage data in a way that gives you complete visibility. Each provider formats its data according to a unique billing model, and these inconsistencies can leave you with fragmented information about your total cloud spend.

This Month in Datadog: Reference Tables is generally available, Attacker Clustering, and more

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit the Datadog website.

This Month in Datadog - March 2025

On the March episode of This Month in Datadog, Jeremy Garcia (VP of Technical Community and Open Source) covers Attacker Clustering, Auto Test Retries, and new Observability Pipelines features, including keyword dictionaries and several integrations. Later in the episode, Jinwu Liu (Product Manager) spotlights Reference Tables, which is now generally available, and Yash Kumar (Product Lead, Cloud SIEM) shows how these tables can be used to add context to detection rules in Cloud SIEM.

Monitor the performance of queues and topics with Azure Service Bus

Azure Service Bus is a fully managed enterprise message broker that enables asynchronous messaging between distributed applications. It is designed to decouple application components, allowing them to communicate reliably, securely, and at scale. With Datadog’s Azure Service Bus integration, you can monitor the performance of your queues and topics.

Enrich your existing Datadog telemetry with custom metadata using Reference Tables

As your applications scale and generate more telemetry, it becomes increasingly difficult to sift through the data and analyze it against cost, business functions, and security measures. Logs, events, and other telemetry on their own may not include enough meaningful context or readable details, leading to slower troubleshooting, inefficient business processes, and higher costs.

Remediate Kubernetes incidents faster using private actions in your apps and workflows

The Datadog Action Catalog provides more than 1,400 actions to help you accelerate remediation across your infrastructure directly within Datadog. With actions, you can use Workflow Automation to configure workflows that automatically address issues as they happen and build custom apps in App Builder that empower anyone in your organization to act when incidents occur.

How we structure on-call rotations at Datadog

A well-structured on-call rotation helps you ensure the reliability of your services and meet your customers’ expectations by designating staff to respond to emerging issues. But the pressures of on-call work—such as long shifts, overnight hours, and dynamic situations—can compromise the well-being of your team members. This makes it harder for them to maximize service uptime during their on-call shifts and can limit the velocity of the feature work they do outside of their on-call duty.

How to create an effective paging strategy

Empowered engineers and effective tools are the foundation of incident management, and having a solid on-call process can help facilitate both. In practice, however, many paging approaches have the opposite effect, often overwhelming responders and increasing burnout. To create an effective paging strategy, organizations should focus responder attention on the most important issues and help facilitate a sense of ownership over them.