Paving the Road for Proactive Reliability

Jan 5, 2024

Expedia Group has thousands of engineers and microservices. Years of heterogeneous infrastructure and technology choices created inefficiencies and a need for a set of automated best practices for its engineering teams. Over the past two years, using a data-driven approach, Kaushik Patel and Nikos Katirtzis have worked on a set of platforms that help teams adopt good reliability practices, including chaos engineering, release safety, and automatic failover between cloud regions. In this talk, Kaushik and Nikos will cover the platforms they’ve built, including how they used data to drive their investment decisions. They’ll also describe how those platforms integrate with other internal systems such as observability and continuous delivery. Finally, they’ll explain how, with the right buy-in from leadership, they got teams to adopt a proactive reliability mindset, helping those teams prevent and better prepare for incidents.

#shorts #datadog #devsecops #devops #microservicesarchitecture #cloudinfrastructure #sre #softwareengineer #reliability #expedia #observability