The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

How To Monitor Status Pages of Popular Apps With Cloud Status

Remember the last time you noticed your app was acting weird, only to discover, after 30 minutes of debugging, that a critical service was down? We’ve all been there, frantically clicking through various status pages trying to figure out what’s broken and wishing we knew how to monitor the status pages of our third-party dependencies.

Catching Up With Fender: How Frontend Observability Powers Better User Experiences

For years, Fender Musical Instruments has been synonymous with iconic guitars and amplifiers. But in recent years, the company has expanded its legacy into the digital realm, offering tools like Fender Play, an innovative learning platform for aspiring musicians. Behind this digital evolution lies a focus on delivering exceptional user experiences for its consumer-facing applications—a mission supported by Honeycomb for Frontend Observability.

Realizing the business value of OpenTelemetry-native observability

Transform your organization's observability strategy with open standards and simplified data collection. Modern organizations face an unprecedented observability challenge. As systems grow more complex and distributed, traditional monitoring approaches are struggling to keep pace. With data volumes doubling every two years and systems spanning multiple clouds and technologies, organizations need a new approach to maintain visibility into their operations.

Why Monitoring as Code Is the Future of Application Reliability for Modern Teams... and how it can save you $1 million!

I recently talked to a Checkly customer who shared some thoughts about Monitoring as Code. Let’s call him Karl in this article. We discussed why Monitoring as Code (MaC) is becoming essential for teams operating at scale. As the Head of Platform Engineering at a major e-commerce company processing millions of transactions daily, Karl has seen firsthand how MaC solves many of the messy challenges that come with traditional synthetic monitoring setups.

How a Global Banking Leader Tackled Memory Overload with HEAL Software

In the financial sector, where system reliability directly impacts customer trust and revenue, even minor IT inefficiencies can spiral into costly crises. For one of the world’s largest banks—supporting 25 million customers, 2,000 branches, and 3,000 ATMs—a hidden challenge threatened its reputation: unpredictable memory consumption in critical applications.

The importance of error budgets for SREs and how to monitor them

Digital-first customers who are always on the go expect a seamless experience. But let’s face it—100% uptime is a myth. Trying to achieve it can drain resources and stifle innovation. This is where error budgets come in. They help site reliability engineers (SREs) find the sweet spot between delivering reliability and development velocity. With error budgets, teams can focus on building a robust system without burning out over perfection.
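
As a rough illustration of the arithmetic behind an error budget, the sketch below converts an availability target into allowed downtime and into the share of a request-based budget that is still unspent. The function names, the 30-day window, and the request-counting model are assumptions made for this example, not taken from the article.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction such as 0.999 (that is, 99.9%).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    Returns 1.0 when nothing is spent, 0.0 when the budget is exactly
    used up, and a negative value when it is overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43 minutes of downtime, and 250 failures out of one million requests would consume about a quarter of the budget.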

Finding Your Way: Using Metrics to Explore Organizational Architecture

Imagine being the new developer in a bustling tech company. Everyone is rushing to meet deadlines, and no one has time to explain the tangled web of services, databases, and messaging systems that make up the organization’s architecture. You search high and low for documentation, but the few diagrams you find are outdated or incomplete. Feeling lost? This is where metrics can come to the rescue.

What is synthetic monitoring?

Synthetic monitoring proactively assesses application performance, allowing us to detect potential issues before they impact users. When combined with tracing, it becomes more effective by linking synthetic tests to actual system traces. This integration offers deeper visibility and granular insights into application behavior, enabling more effective, data-driven decisions to optimize performance.
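
One way a synthetic check might be tied to backend traces is to send a W3C Trace Context `traceparent` header with each probe, so the resulting trace can be looked up by its ID. The following is a minimal sketch under that assumption; `CheckResult`, `run_check`, and the latency threshold are hypothetical names and values chosen for illustration, and whether the trace is actually recorded depends on the instrumentation of the service being probed.

```python
import secrets
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CheckResult:
    url: str
    ok: bool
    status: int
    latency_ms: float
    trace_id: str  # use this to find the matching trace in a tracing backend


def new_traceparent() -> Tuple[str, str]:
    """Return (trace_id, header value) in W3C Trace Context format."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return trace_id, f"00-{trace_id}-{span_id}-01"


def http_get(url: str, headers: Dict[str, str], timeout: float = 5.0) -> int:
    """Default fetcher: issue a GET and return the HTTP status code."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code


def run_check(
    url: str,
    fetch: Callable[[str, Dict[str, str]], int] = http_get,
    max_latency_ms: float = 1000.0,
) -> CheckResult:
    """Probe a URL once and record status, latency, and the trace ID sent."""
    trace_id, traceparent = new_traceparent()
    start = time.monotonic()
    try:
        status = fetch(url, {"traceparent": traceparent})
    except Exception:
        status = 0  # network error or timeout: treat as a failed check
    latency_ms = (time.monotonic() - start) * 1000.0
    ok = 200 <= status < 300 and latency_ms <= max_latency_ms
    return CheckResult(url, ok, status, latency_ms, trace_id)
```

The `fetch` parameter is injectable so the check logic can be exercised without network access; a scheduler would call `run_check` at a fixed interval and alert on results where `ok` is false.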

Create a Splunk pipeline to filter, mask, and route logs - without SPL2

In this video, we will take a look at how you can create a Splunk Data Management pipeline to filter, mask, and route your logs without using any SPL2 code. For this demo we have used Ingest Processor to build our pipeline, but the same concept can be used for Edge Processor as well.

Pod Exec in K8s: Advanced Exec Scenarios and Best Practices

Remember using SSH to access servers? It was the go-to method for troubleshooting or making changes to a system. But in the world of containers, SSH doesn't quite fit. Kubernetes and containers work differently; they're dynamic and spun up and down frequently. That’s where kubectl exec comes in. It lets you run commands inside a pod directly, without needing to rely on SSH or worry about the pod being ephemeral. It’s simple and fits the nature of modern, containerized environments.
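
To make the shape of a `kubectl exec` invocation concrete, here is a small sketch that assembles the argument list for one. The helper name `build_exec_command` and the pod, namespace, and container names in the usage note are hypothetical; the flags used (`-n`, `-c`, `-i`, `-t`, and the `--` separator) are standard kubectl options.

```python
from typing import List, Optional, Sequence


def build_exec_command(
    pod: str,
    command: Sequence[str],
    namespace: Optional[str] = None,
    container: Optional[str] = None,
    interactive: bool = False,
) -> List[str]:
    """Assemble the argv for running a command inside a pod."""
    if not command:
        raise ValueError("command must contain at least one argument")
    argv = ["kubectl", "exec"]
    if namespace:
        argv += ["-n", namespace]
    if interactive:
        argv += ["-i", "-t"]  # keep stdin open and allocate a TTY
    argv.append(pod)
    if container:
        argv += ["-c", container]  # needed when the pod has several containers
    argv.append("--")  # everything after this is passed to the container
    argv.extend(command)
    return argv
```

For example, `build_exec_command("web-0", ["ls", "/tmp"], namespace="prod", container="app")` yields the arguments for `kubectl exec -n prod web-0 -c app -- ls /tmp`, which could then be passed to `subprocess.run`. Building the list explicitly, rather than formatting a shell string, avoids quoting problems in the command being executed.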