Operations | Monitoring | ITSM | DevOps | Cloud

SRE

The latest News and Information on Service Reliability Engineering and related technologies.

Promoting Continuous Learning with SRE

With the extreme changes we’ve all been through these last several months, it should come as no surprise that our jobs have changed drastically, too. We’re working remotely. We’re dealing with increased resource constraints. Our services are receiving more traffic than usual, and we’re tasked with keeping things up and running. Our work-as-done may not match what we did at the beginning of 2020.

SRE Report 2020 - Balancing 'Dev' and 'Ops'

We recently released Catchpoint’s SRE Report 2020 that analyzed results from the SRE survey we conducted early this year along with a recent addendum survey. The report offers a detailed look at the current state of SRE and how the shift to an all-remote work environment has impacted SRE teams. In this blog, we take a deeper look at one of the report highlights – ‘Heavy Ops Workload Comes at a Cost’.

SRE Leaders Panel: Managing Systems Complexity

In our previous panel, we spoke about how to overcome imposter syndrome in high tempo situations, and how culture directly affects the availability of our systems. Building on that last discussion, we gathered leading minds in the resilience industry to discuss how SRE can manage systems complexity, and how that's tightly intertwined with business health especially in the context of current health and social crises.

Moogsoft Express Helps DevOps and SRE Teams Develop More and Operate Less

“Welcome to Tomorrowland.” That’s how Moogsoft Chairman and CEO Phil Tee kicked off the launch event of Moogsoft Express, the next-generation AIOps and observability solution built from the ground up for DevOps and SRE teams. The reference to a better future is fitting. With its arrival, Moogsoft Express helps these teams maintain visibility and control over increasingly complex CI/CD pipelines, so they can detect issues earlier, fix them faster and prevent outages.

SRE: A Human Approach to Systems

In the world of technology, the stakes have never been higher. The move to the cloud and microservices to maximize agility has given way to digital disruptors and unprecedented competitive threats. As distributed systems become increasingly complex, the scale of ‘unknown unknowns’ increases. On top of this, customer expectations are sky-high. The cost of downtime is catastrophic, with customers willing to churn if their needs are not promptly met.