The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Top 15 Infrastructure Monitoring Tools

Infrastructure monitoring tools help keep systems performant and available by surfacing potential issues so they can be resolved before they escalate. This article covers the infrastructure monitoring tools available and their impact on business continuity and operational efficiency.

Stile Education's Best-of-Breed Observability Strategy

"One of the best things we’ve gotten out of ChaosSearch is the ability to keep all of our data in S3. It’s cheap and easy to keep all of our data available and indexed. We can search through it at any time to dig deeper into problems that crop up." Learn more about how the Stile team can now retain log data indefinitely, rather than saving only a week or two of data in Elasticsearch. That change has increased the team’s capacity to use log data to solve business problems and unlocked new opportunities to discover deeper product insights.

Our lessons from the latest AWS us-east-1 outage

In case you missed it, AWS experienced an outage or "elevated error rates" on their AWS Lambda APIs in the us-east-1 region between 18:52 UTC and 20:15 UTC on June 13, 2023. If this sounds familiar, it's because it's almost a replay of what happened on December 7, 2021, although that outage was significantly more severe and took longer to restore.

Top 10 Log Management Tools in 2023

Log management tools are crucial for the security and performance of your IT infrastructure. With the right log management system, you can quickly detect and respond to any anomaly or performance issue. There are numerous log management platforms available today, each with its own set of features and benefits. While most of these platforms offer industry-standard capabilities, what sets them apart are their stand-out features, pricing, and overall user experience.

Short Descriptions in BindPlane OP

An easy way to write a short description to distinguish between different file types, fields, etc. About observIQ: observIQ is developing the unified telemetry platform: a fast, powerful, and intuitive next-generation platform built for the modern observability team. Rooted in OpenTelemetry, our platform is designed to help teams reduce, simplify, and standardize their observability data.

Getting Your Logs In Order: A Guide to Normalizing with Graylog

If you work with large amounts of log data, you know how challenging it can be to analyze that data and extract meaningful insights. One way to make log analysis easier is to normalize your log messages. In this post, we’ll explain why log message normalization is important and how to do it in Graylog.
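
As a rough illustration of what normalization means in general (this is not Graylog's own mechanism, and the field names and alias table below are hypothetical), the idea is to map the differently named fields that various sources emit onto one shared schema, so that a single query works across all of them:

```python
from datetime import datetime, timezone

# Hypothetical alias table: source-specific field names mapped to one common schema.
FIELD_ALIASES = {
    "src": "source_ip",
    "src_ip": "source_ip",
    "SourceAddress": "source_ip",
    "user": "user_name",
    "username": "user_name",
    "ts": "timestamp",
    "time": "timestamp",
}


def normalize(message: dict) -> dict:
    """Return a copy of the message with aliased field names replaced by common ones."""
    normalized = {}
    for key, value in message.items():
        # Unknown fields are kept under their original name.
        normalized[FIELD_ALIASES.get(key, key)] = value

    # Assumption for this sketch: numeric timestamps are Unix epoch seconds.
    ts = normalized.get("timestamp")
    if isinstance(ts, (int, float)):
        normalized["timestamp"] = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    return normalized


firewall_event = {"src": "10.0.0.1", "user": "alice", "ts": 0}
app_event = {"SourceAddress": "10.0.0.1", "username": "alice"}
print(normalize(firewall_event))
print(normalize(app_event))
```

After normalization, both events expose `source_ip` and `user_name`, so they can be searched and correlated with the same field names.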

What Is A Time-Series Metric?

Today, businesses and organizations rely heavily on metrics and analytics to make informed decisions. Metrics are important whether you’re a developer, a marketer, or the head of a company. One type of metric that is widely used is a time-series metric. Time-series metrics provide insights into how data changes over time. With time-series data, businesses can track trends, detect anomalies, and make predictions.
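
To make the idea concrete, here is a minimal sketch that treats a time-series metric as a list of (timestamp, value) pairs and flags values that deviate sharply from the recent past. The sample data, the window size, and the three-standard-deviation threshold are all illustrative assumptions, not a recommended configuration:

```python
from statistics import mean, stdev


def find_anomalies(points, window=5, threshold=3.0):
    """Flag points that lie more than `threshold` standard deviations
    from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(points)):
        history = [value for _, value in points[i - window:i]]
        mu, sigma = mean(history), stdev(history)
        ts, value = points[i]
        # Skip flat histories, where the standard deviation is zero.
        if sigma > 0 and abs(value - mu) > threshold * sigma:
            anomalies.append((ts, value))
    return anomalies


# Hypothetical metric: one sample per minute, timestamps in seconds.
samples = [10, 11, 10, 12, 11, 50, 11]
points = [(i * 60, v) for i, v in enumerate(samples)]
print(find_anomalies(points))
```

Because each point carries a timestamp, the same structure supports the other uses mentioned above, such as plotting trends or fitting a forecast.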