
Latest Videos

Datadog Conversations: How Life360 Keeps Families Safe with Observability

Life360 is a family safety app driven by the mission to protect and connect people, pets, and things. Naveen Puvvula, Director of Cloud Operations, and Jesse Gonzalez, Senior Staff Site Reliability Engineer, discuss why observability is critical to achieving reliability and how they continue to deliver real-time location updates for their users even during high-traffic events. Finally, they advise other tech leaders to choose partners who work closely with them to solve problems, and technologies that reduce friction and improve developer joy.

Accelerate incident investigations with Bits AI, Datadog's generative AI co-pilot

Learn how Datadog’s generative AI assistant, Bits AI, can help organizations accelerate incident investigations. It auto-generates summaries to get you up to speed quickly, fetches information about related past events, and updates teams and statuses, all through Slack.

This Month in Datadog: Bits AI for Incident Management, KSPM, New Observability Pipelines, and more

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit Cloud Monitoring as a Service | Datadog. This month, we put the Spotlight on Bits AI for Incident Management.

And What About my User Experience?

Monitoring backend signals has been standard practice for years, and tech companies have long alerted their SREs and software engineers when API endpoints fail. But when you’re alerted about a backend issue, it’s often your end users who are directly affected. Shouldn’t we observe and alert on these user experience issues early on? Because frontend monitoring is a newer practice, companies often struggle to identify signals that can help them pinpoint user frustrations or performance problems.

What is an Anomaly? Avoiding False Positives in Watchdog Detected Anomalies

In 2018, Datadog released Watchdog to proactively detect anomalies in your observability data. But what defines an anomaly? How do you avoid false positives? At Datadog Summit London 2024, Nils Bunge, product manager at Datadog, shared the story of how the first Datadog AI feature (Watchdog Alert) was created, what we learned from it, and how we applied those lessons to the AI functionality added over the years.

Datadog on Site Reliability Engineering

There are many different ways to implement Site Reliability Engineering (SRE). From team structures to roles and responsibilities to planning and prioritization flows, there’s no golden path for how to organize things. As Datadog has shifted from a startup to a quickly growing public company, we’ve seen our own SRE practice evolve. With over 22,000 customers sending trillions of data points each day, keeping Datadog reliable is critical to our business.

Datadog on Data Science

In this episode we'll visit the world of predictive analytics and machine learning and uncover how these cutting-edge technologies are transforming the way Datadog monitors and improves its services. We’ll focus our conversation on two key aspects: using advanced statistical methods for proactive monitoring and the strategic implementation of machine learning for algorithm enhancement.