Operations | Monitoring | ITSM | DevOps | Cloud

The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

What to Expect in 2023: OpsRamp Technology Leaders Make Their Predictions

2022 saw a return to normalcy on the Covid front as offices reopened, people gathered in large groups indoors again, and mask mandates waned, even though Covid never really went away. Meanwhile, inflation raged through the summer months before subsiding somewhat later in the year, and the Great Resignation gave way to mass layoffs, especially in the tech industry.

Author's Cut: A Sample of Sampling, and a Whole Lot of Observability at Scale

Brick by brick, block by block—if you’ve been with us throughout our Author’s Cut blog series (and if you haven’t, you can go catch up), you’ve seen us build the case for observability from the ground up. We’ve covered structured events, the core analysis loop, and use cases for managing applications in production—and that’s just to start.

How Apache Arrow is Changing the Big Data Ecosystem

This article was originally published in The New Stack and is reposted here with permission. Arrow makes analytics workloads more efficient for modern CPU and GPU hardware, which makes working with large data sets easier and less costly. One of the biggest challenges of working with big data is the performance overhead involved with moving data between different tools and systems as part of your data processing pipeline.

Cloud Providers Health Report - December 2022

Check out our December 2022 health report on the most popular cloud providers. We analyze the health of each cloud provider based on the number of outages and problems during the month. The data comes from the cloud providers themselves, via their status pages. We normalize it and use it to generate the report.
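
As a rough illustration of the normalize-then-count idea described above, here is a minimal Python sketch. It is not the report's actual pipeline: the provider names, the raw status labels, and the `SEVERITY_MAP`, `normalize`, and `outage_counts` names are all hypothetical, chosen only to show how differently worded status-page entries could be mapped to one shared vocabulary before counting.

```python
from collections import Counter

# Hypothetical mapping from provider-specific status labels to a shared
# vocabulary. Real status pages use their own wording; this is an assumption.
SEVERITY_MAP = {
    "service outage": "outage",
    "major outage": "outage",
    "service disruption": "problem",
    "degraded performance": "problem",
}

def normalize(provider, raw_incident):
    """Convert one raw status-page entry into a common record shape."""
    label = raw_incident["status"].strip().lower()
    return {"provider": provider, "severity": SEVERITY_MAP.get(label, "unknown")}

def outage_counts(incidents):
    """Count normalized incidents classified as outages, per provider."""
    return Counter(i["provider"] for i in incidents if i["severity"] == "outage")

# Made-up sample data for two hypothetical providers.
incidents = [
    normalize("provider-a", {"status": "Service Outage"}),
    normalize("provider-a", {"status": "Degraded Performance"}),
    normalize("provider-b", {"status": "Major Outage"}),
    normalize("provider-b", {"status": "major outage "}),
]
counts = outage_counts(incidents)
```

Labels that are not in the map fall through to "unknown" rather than being guessed, so they can be reviewed by hand instead of silently skewing the counts.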

Kubernetes and the Service Mesh Era

Kubernetes is a game-changer for enterprise organizations. Automating deployment, scaling, and management of containerized applications allows organizations to embrace a cloud-native paradigm at scale and more easily employ best practices, such as microservices and DevSecOps. But as with all tech, Kubernetes has its limits. Kelsey Hightower famously tweeted that “Kubernetes is a platform for building platforms. It’s a better place to start; not the endgame.”

How to use the Grafana Ansible collection to manage Grafana Agent across multiple Linux hosts

Anyone who is trying to set up monitoring for multiple machines knows how tough it can get to manage multiple Grafana Agents across them. To make things easier, we recently added the Grafana Agent role to the Grafana Ansible collection, which will help users manage the Agent across multiple Linux hosts. (Need to know how to get started with the Grafana Ansible collection for Grafana Cloud?

Best practices to prevent alert fatigue

As your environment changes, new trends can quickly make your existing monitoring less accurate. At the same time, building alerts after every new incident can turn a straightforward strategy into a convoluted one. Treating monitoring as either a one-time or a purely reactive effort can result in alert fatigue. Alert fatigue occurs when monitoring systems generate an excessive number of alerts, or when alerts are irrelevant or unhelpful, leading to a diminished ability to see critical issues.

Cron Job Monitoring Beta - Because scheduled jobs fail too

Do your cron jobs (aka scheduled jobs) ever fail or not run as expected? Scheduled jobs are supposed to be predictable, as the name implies. But as with many things, predictable != reliable. Cron jobs fail too, and we think you should know when that happens. Crons allows you to monitor the uptime and performance of any scheduled, recurring job in Sentry. Once set up, you’ll get alerts and metrics to help you solve errors, detect timeouts, and prevent disruptions to your service.
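
To make the idea of monitoring a scheduled job concrete, here is a minimal, generic Python sketch of check-in based monitoring. It does not use or represent Sentry's API: the `CheckIn` class, the `evaluate_run` function, and the grace and runtime thresholds are all hypothetical, and the sketch only shows how a missed run or a timeout could be told apart from a healthy run.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class CheckIn:
    """One report from a scheduled job (hypothetical record shape)."""
    started: datetime
    finished: Optional[datetime]  # None if the job never reported completion

def evaluate_run(expected_start: datetime,
                 check_in: Optional[CheckIn],
                 grace: timedelta,
                 max_runtime: timedelta) -> str:
    """Classify one scheduled run as 'ok', 'missed', or 'timeout'."""
    # No check-in at all, or one that started too late, counts as a missed run.
    if check_in is None or check_in.started > expected_start + grace:
        return "missed"
    # A run that never finished, or ran longer than allowed, counts as a timeout.
    if check_in.finished is None or check_in.finished - check_in.started > max_runtime:
        return "timeout"
    return "ok"

# Example: a job expected at midnight, with assumed thresholds.
expected = datetime(2023, 1, 1, 0, 0)
grace = timedelta(minutes=5)
max_runtime = timedelta(minutes=10)
healthy = CheckIn(expected + timedelta(minutes=1), expected + timedelta(minutes=3))
status = evaluate_run(expected, healthy, grace, max_runtime)
```

In a sketch like this, "missed" and "timeout" are the states that would trigger an alert, while the start and finish times of "ok" runs could feed duration metrics.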