Monthly Archive

With SRE, failing to plan is planning to fail

Feb 26, 2021 By Ayelet Sachto In Google Operations

People sometimes think that implementing Site Reliability Engineering (or DevOps for that matter) will magically make everything better. Just sprinkle a little bit of SRE fairy dust on your organization and your services will be more reliable and more profitable, and your IT, product, and engineering teams will be happy. It’s easy to see why people think this way. Some of the world’s most reliable and scalable services run with the help of an SRE team, Google being the prime example.

Overview of Incident Lifecycle in SRE

Feb 23, 2021 By Biju Chacko In Squadcast

Incidents that disrupt services are unavoidable, but every breakdown is an opportunity to learn and improve. Our latest blog post is a deep dive into best practices to follow across the lifecycle of an incident, helping teams build a sustainable and reliable product the SRE way. As the saying goes, “Every problem we face is a blessing in disguise”.

On Not Being a Cog in the Machine

Feb 9, 2021 By Fred Hebert In Honeycomb

This is my first week here as the first dedicated SRE for Honeycomb, and in a welcoming gesture, I was asked if I wanted to write a blog post about my first impressions and what made me decide to join the team. I’ve got a ton of personal reasons for joining Honeycomb that may not be worth being all public about, but after thinking for a while, I realized that many of the things I personally found interesting could point towards attitudes that result in better software elsewhere.