Everything You Need to Know About the 4 Stages of Software Reliability

Software reliability is a big deal, especially at the enterprise level, but too often companies are flying blind when it comes to the overall quality and reliability of their applications. It seems like every week, there’s a new report in the news calling out another massive software failure. Sometimes it’s just a glitch on social media causing usability issues, and other times it’s a serious issue in an aircraft system that leads to deadly crashes.


Continuous Reliability: How to Move Fast and NOT Break Things

Development and IT Ops teams commonly find themselves in a game of tug-of-war between two key objectives: driving innovation and maintaining reliable (i.e. stable) software. To drive innovation, we’re seeing the emergence of CI/CD practices and the rise of automation. To maintain reliable software, DevOps practices and Site Reliability Engineering are being adopted to ensure stability in fast-paced dev cycles.


Spring Cleaning at OverOps: How (and Why) We Changed Our DB Cleaning Strategy

There comes a time in the life of any application when the small things we let slide become unignorable issues. For us, it happened when, after years of writing and executing code, our DB’s free disk space started to run out. Each passing day brought us closer to our eventual doom, and we finally had to allocate the time to fix the problem.


The Cake is NOT a Lie: 5 Java Frameworks to Support Your Microservices Architecture

“The microservices trend is becoming impossible to ignore,” I wrote in 2016. It’s still true, although it’s certainly grown to more than just a passing fad. Back then, many would have argued this was just another unbearable buzzword, but today many organizations are reaping the very real benefits of breaking down old monolithic applications, as well as seeing the very real challenges microservices can introduce.

Maintaining Code Quality During the Container Revolution with OverOps

Microservices are taking over the software world, affecting how software is designed, written and delivered. Current tooling, such as log aggregators and APM solutions, struggles to provide the depth of context needed to maintain and troubleshoot these new containerized applications. Meanwhile, large enterprises don’t have enough data to correlate issues across containers, deduplicate them and find the root cause of the problem.