
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Downtime happens, fix it faster - Uptime monitoring now in open beta

That moment when everything’s running smoothly—users engaged, conversions flowing—until your site takes a break, and you find out from a tweet. We’ve all been there, scrambling to fix an issue that’s been broken for who knows how long while social media lights up. A few minutes of downtime, and now you’re not just fixing the issue—you’re dealing with frustrated users and a reputation hit.

Top Tips for Querying OpenSearch

OpenSearch lets you store large volumes of data, commonly logs, metrics, and documents. You retrieve useful data from OpenSearch by querying it for specific information, deeper analysis, and insights for decision-making. With OpenSearch, you can perform complex searches using natural language, Boolean operators, and filters to pinpoint relevant information efficiently.
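
As a rough illustration of combining Boolean operators and filters, here is a minimal sketch that builds an OpenSearch Query DSL `bool` query as a plain Python dictionary. The field names (`message`, `level`, `@timestamp`) and the helper `build_log_query` are hypothetical, chosen only for this example; adapt them to your own index mapping.

```python
import json


def build_log_query(text, level, since="now-1h", exclude_text=None):
    """Build a bool query: full-text match plus exact-value and time filters.

    Field names here are assumptions for illustration, not part of any
    particular index.
    """
    bool_clause = {
        # "must" clauses contribute to relevance scoring
        "must": [{"match": {"message": text}}],
        # "filter" clauses restrict results without affecting the score
        "filter": [
            {"term": {"level": level}},
            {"range": {"@timestamp": {"gte": since}}},
        ],
    }
    if exclude_text:
        # "must_not" acts as a Boolean NOT
        bool_clause["must_not"] = [{"match": {"message": exclude_text}}]
    return {"query": {"bool": bool_clause}}


query = build_log_query("timeout", "error", exclude_text="healthcheck")
print(json.dumps(query, indent=2))
```

The resulting dictionary can be sent as the request body of a `_search` call. Keeping exact-value conditions in `filter` rather than `must` is a common choice, since filters skip scoring.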

What is MTTR in Networking?

When a critical system goes down, every second counts. That’s why IT and network professionals need to get comfortable with tracking incident response metrics like MTTR. MTTR (which you’ll soon come to find has several meanings) is a set of key metrics that measure how fast your team can repair and recover from incidents, directly impacting your system uptime and service quality.
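
To make the metric concrete, here is a minimal sketch that takes one common reading of MTTR, mean time to repair, and computes it as the average of resolution time minus detection time. The incident timestamps and the helper `mean_time_to_repair` are made up for illustration; other readings of MTTR would use different start and end points.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (detected_at, resolved_at)
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),  # 30 min
    (datetime(2024, 1, 5, 14, 0), datetime(2024, 1, 5, 15, 30)),  # 90 min
    (datetime(2024, 1, 9, 8, 0), datetime(2024, 1, 9, 9, 0)),     # 60 min
]


def mean_time_to_repair(records):
    """Average repair duration across incidents (assumes resolved >= detected)."""
    if not records:
        raise ValueError("at least one incident is required")
    total = sum((resolved - detected for detected, resolved in records), timedelta())
    return total / len(records)


mttr = mean_time_to_repair(incidents)
print(f"MTTR: {mttr.total_seconds() / 60:.0f} minutes")  # MTTR: 60 minutes
```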

The new era of observability: Why logs matter more than ever

Twenty years ago, software ate the world. The old ways of monitoring, failing over, or routinely rebooting quickly became inadequate, and with a new focus on software excellence, how we monitor and maintain applications had to be rethought. Even back then, when new software was released on an annual basis, it was clear that developers and futurists needed to build, inform, and optimize their approach, which required a deeper understanding of the application experience.

Obkio Autumn Updates: What's New and What's Coming!

At Obkio, we’re always working to enhance your experience with our app. Over the past few months, we’ve rolled out some exciting features and updates to improve the user experience and overall functionality of our app. Here’s a rundown of what’s new and what you can expect in the near future!

Troubleshooting Microservices with Splunk Observability Cloud and the AI Assistant for Observability

In this video, I’m going to show you how to troubleshoot microservices in Splunk Observability Cloud using features like APM’s Service Map and Tag Spotlight to identify what’s causing our microservice to produce high error rates. We’ll then review Related Logs in Log Observer to determine why the error in our service is occurring.

Why use the Opslogix SCOM Data Source for Grafana?

In data-driven environments, effective monitoring and reporting are critical for IT operations. For organizations using Microsoft’s System Center Operations Manager (SCOM), integration with visualization tools like Grafana can make data more accessible and easier to understand. One standout integration solution is the Opslogix SCOM Data Source for Grafana. In this blog post, we will talk about why it can be the next game-changer for your organization.

Common Kafka Security Misconfigurations and How to Avoid Them

Apache Kafka is the go-to solution for companies needing to move data fast and efficiently, but here’s the catch—when you’re handling sensitive data, the stakes are high. One misstep in your security configuration, and you’re not just dealing with a hiccup; you could be looking at full-blown security breaches, unauthorized access, or lost data. No one wants that. Yet, many organizations still stumble into the same security pitfalls.