
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

How to manage high cardinality metrics in Prometheus and Kubernetes

Over the last few months, a recurring theme in our conversations with users has been managing observability costs, which are increasing faster than the footprint of the applications and infrastructure being monitored. As enterprises lean into cloud native architectures and the popularity of Prometheus continues to grow, it is not surprising that metrics cardinality (the number of unique time series produced by the Cartesian combination of metric names and label values) also grows.
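To make the Cartesian growth concrete, here is a minimal sketch that estimates the worst-case number of time series for a single metric. The function name, label names, and counts are hypothetical and chosen only for illustration. Real cardinality is usually lower than this upper bound, because not every label combination actually occurs.

```python
from math import prod


def max_series(label_value_counts):
    """Upper bound on series for one metric: the product of the
    number of distinct values each label can take."""
    return prod(label_value_counts.values())


# Hypothetical metric with three labels (illustrative numbers only).
http_requests_labels = {"method": 4, "status": 5, "pod": 50}

print(max_series(http_requests_labels))  # 1000

# Adding one more label with 10 values multiplies the bound by 10.
http_requests_labels["customer_tier"] = 10
print(max_series(http_requests_labels))  # 10000
```

The multiplication is the point: each new label scales the bound by its number of values rather than adding to it, which is why a single high-cardinality label can dominate the series count.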

How to monitor Microservices?

Microservices are being used everywhere, and for good reasons. They provide many benefits, especially improved focus and a shorter time to market. Microservices bring complexities too. Monitoring them is complex simply because there are so many of them. Monitoring a single user transaction requires monitoring many microservices. Manually correlating the data from them to identify the root cause is a nightmare, especially in a complex environment with hundreds or thousands of microservices.

ScienceLogic Achieves Record Q3, Meeting Strong Demand for IT Transformation

I am thrilled to announce that ScienceLogic posted its best quarterly sales bookings in our company’s history! Amid warnings from the broader tech industry slowdown, ScienceLogic proved once again to be an outlier, as current macro trends continue to drive strong demand for IT transformation.

Bringing "Blameless" to Traffic Court | J. Paul Reed (Release Engineering Approaches)

What do modern incident analysis techniques and moving violations have in common? This Quick Bite tells the story of taking the same retrospective techniques the most innovative technology companies in the world use to understand their operational incidents... to traffic court, to help us all understand what really happened. What happened next? Come find out!

When Cloud Native Stacks Misbehave - Pitfalls and Lessons Learned | Itiel Shwartz (Komodor)

In this session, Itiel Shwartz will demonstrate common failure scenarios, both app and infra related. We will laugh a little and cry a little, and then cover monitoring, observability, and troubleshooting best practices and methodologies such as metrics, distributed tracing, logging, network visualization and more. But cheer up! We’ll wrap up by introducing some helpful tools for finding and fixing issues as fast as possible.

11 Top Website Uptime Monitoring Tools to Know

Uptime is the period during which your website and all of its contents are fully functional and accessible. It is expressed as a percentage, so if your website is up 99% of the time, you can expect more than 7 hours of downtime every month. Downtime, as the name suggests, is when all or a portion of your service is unavailable. Disruptions result and potential customers are lost. Uptime monitors are useful for accurately logging, assessing, and reporting on each downtime incident.
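The downtime figure follows from simple arithmetic. The sketch below assumes a 30-day month, which the text does not specify, and uses a hypothetical helper name.

```python
def downtime_hours(uptime_percent, period_hours):
    """Hours of allowed downtime in a period for a given uptime percentage."""
    return (1 - uptime_percent / 100) * period_hours


HOURS_PER_30_DAY_MONTH = 30 * 24  # 720 hours; assumes a 30-day month

# 99% uptime leaves 1% of 720 hours, about 7.2 hours of downtime.
print(round(downtime_hours(99, HOURS_PER_30_DAY_MONTH), 2))  # 7.2
```

Under the same assumption, 99.9% uptime would leave roughly 0.72 hours (about 43 minutes) of downtime per month.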

Zen and the Art of Kubernetes Monitoring

The real beauty of this modern, cloud-fueled, DevOps-driven world that we are living in is that it’s so highly composable. In so many ways, we’ve been freed from the limitations and structures of the previous annals of software and technology history to build things the way that we want to, and however we choose to do so.