
Reliability is when customers aren't impacted

Ultimately, a system is reliable when customers and engineers can count on it. Full transcript: When I get to hear stories like, "Hey, we just had our holiday sales event kick off and everything went smoothly and I didn't have to wake up in the middle of the night," that is really the true definition of reliability: these people that are constantly hands-on keyboard, in charge of making sure that people like myself and like you aren't impacted when we're going to, for example, buy a new pair of sneakers, or we're going to get some sort of limited edition release that's coming out, right?

Reliability Intelligence: your reliability expert

For the last decade, Gremlin has helped Fortune 500 organizations with critical uptime requirements proactively uncover reliability risks and prevent costly outages. We started with Chaos Engineering, then built Reliability Management to help teams standardize and scale their testing efforts. Today, we take another leap forward with the release of Reliability Intelligence. Reliability Intelligence draws on Gremlin expertise with each test to show you what happened and recommend remediation.

The riskiest thing you can do is not measure your risk

Hiring good engineers is important, but it’s not enough to prevent outages. You need to measure and track your risk to get real results. Full transcript: My name's Jeff Nickoloff. I'm a principal engineer here at Gremlin. What I hear non-technical functions talk about is really they are much happier to sort of lean on their great engineers. "Oh, we've got a great engineering culture. We don't have reliability issues because we hire the best people."

Avoid the Chaos Engineering bottleneck

Chaos Engineering is great, but by itself it can create bottlenecks that limit your reliability journey. Full transcript: One of the things we've learned while building Gremlin and being the first Chaos Engineering tool to market is with all the greatness that comes with this approach, we've learned some of the downfalls, some of the drawbacks. And one of those is how you scale this practice.

Beyond AI hype: put reliability at the forefront

Reliability is a constant for every technology, whether it’s cloud, microservices, or AI. Full transcript: Just a few years ago everybody was screaming about microservices, "That's the wave of the future," and now everybody's looking at AI. No matter what the change in technology hot topic is, your reliability should still be at the forefront of everything that you're doing.

Reliability is not about mythical perfection

See what reliability means to Ganesh Seetharaman, Managing Director at Deloitte, and why it's more than high uptime. Full transcript: Reliability to me is not about achieving mythical perfection. It's about embracing complexity, recovering quickly from failures or incidents, and building trust through transparency and adaptability.

What to expect in a Gremlin workshop

Gremlin workshops give your team hands-on training with Gremlin so they can get real results and dramatically improve your reliability. Full transcript: The goal of our workshops is really to accelerate you and the team in your reliability journey. Whether you're starting out for the first time, or you're a more advanced user, this workshop is really designed for you to take you to the next level.