
Chaos Engineering

Tyler Wells on building a culture of reliability at Twilio

What does reliability look like at a company that has thousands of employees and provides critical communication services to over 150,000 customers? We talked with Tyler Wells, Senior Director of Engineering at Twilio, to learn how he and his team created a culture of reliability at Twilio. He talked in depth about his experiences developing reliability goals, building reliability practices, and aligning engineering teams on these objectives.

Improve M&A success rates by testing for system reliability

Get started with Gremlin's Chaos Engineering tools to safely, securely, and simply inject failure into your systems to find weaknesses before they cause customer-facing issues. Coming out of recessions, merger and acquisition volume typically picks up as lower interest rates drop the cost of capital and Corporate Development teams begin executing on the strategies they’ve developed during the holding periods. This year has been no exception, with $350 billion spent on tech acquisitions to date.

Podcast: Break Things on Purpose | Ep. 11: Ryan Kitchens, Senior Site Reliability Engineer at Netflix

We’re excited to kick off Season 2 of Break Things on Purpose next month. In anticipation of our next season, here’s a bonus show from our archives! Subscribe to Break Things on Purpose wherever you get your podcasts. Find us on Twitter at @BTOPpod or shoot us a note at podcast@gremlin.com!

What is Chaos Engineering and Why is it Important?

So, why would you deliberately try to break your services? Chaos engineering does just that: it deliberately terminates instances in your production environment to expose weaknesses before they cause outages. Online video streaming service Netflix was one of the first organizations to popularize the concept with its Chaos Monkey tool.
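
To make the idea of deliberately terminating instances concrete, here is a minimal in-memory sketch. It is not how Chaos Monkey or any Gremlin tool is implemented; the fleet dictionary, the function names, and the `min_running` threshold are all hypothetical and chosen only for illustration. It picks one running instance at random, marks it terminated, and then checks whether enough instances remain for the service to count as healthy.

```python
import random


def pick_victim(instances, rng=None):
    """Return the name of one running instance chosen at random, or None."""
    rng = rng or random.Random()
    running = [name for name, state in instances.items() if state == "running"]
    return rng.choice(running) if running else None


def terminate_random_instance(instances, rng=None):
    """Mark one randomly chosen running instance as terminated (simulation only)."""
    victim = pick_victim(instances, rng)
    if victim is not None:
        instances[victim] = "terminated"
    return victim


def service_is_healthy(instances, min_running=2):
    """Hypothetical health rule: at least `min_running` instances are still up."""
    running = sum(1 for state in instances.values() if state == "running")
    return running >= min_running


if __name__ == "__main__":
    # A made-up fleet of three instances; no real infrastructure is touched.
    fleet = {"web-1": "running", "web-2": "running", "web-3": "running"}
    victim = terminate_random_instance(fleet)
    print(f"terminated {victim}; healthy={service_is_healthy(fleet)}")
```

In a real experiment the termination step would call a cloud provider or orchestration API, and the health check would come from monitoring rather than a simple count.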

How to make an ROI calculator and impress finance (an engineer's guide to ROI)

Think back to the last time you wanted to purchase software for your organization. The software solves real problems and makes your team’s life easier. Then, finance delays or rejects your proposal. What’s going on?

Ensuring a smooth Kubernetes Dockershim Deprecation with Chaos Engineering

Trying to improve the reliability of your Kubernetes deployment? Start with these 5 chaos experiments. Kubernetes 1.20 is scheduled to be released next week, and this version contains a number of amazing enhancements including graceful node shutdown, more visibility into resource requests, and snapshotting volumes. But the change generating the most buzz is the deprecation of Docker as a container runtime.

Embracing virtual connections at AWS re:Invent 2020

This year has seen a complete re-imagining of tech conferences. Some were cancelled or postponed, while others have evolved and embraced the opportunity to go virtual. This meant innovating to bring the in-person event experience online.