Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Service Reliability Engineering and related technologies.

Database Monitoring Metrics: What to Track & Why It Matters

Let’s be honest—your database isn’t just another component. It’s the thing holding everything else together. When it slows down or fails, the ripple effects hit fast and hard. So keeping an eye on its performance? Non-negotiable. The challenge is, there’s no shortage of metrics you could monitor. But not all of them are useful.

The hidden costs of tool sprawl: An SRE's guide to observability consolidation

An overview of the benefits, challenges, and philosophy behind consolidating your observability tools Picture this: It's 3:00 a.m., and your phone is buzzing with alerts from what seems like a dozen different monitoring tools. As you blearily scroll through the notifications, you can't help but wonder, "How did we end up with so many tools, and why can't they just talk to each other?".

Observability vs APM: What's the Real Difference?

Remember when monitoring your apps meant checking if they were up or down? Yeah, those days are long gone. As systems have gotten more complex—microservices talking to other microservices, containers spinning up and down, serverless functions doing their thing—the approach to understanding system health has had to level up too. APM tools have been the bread and butter for DevOps teams for years, but now everyone's talking about observability.

Logging vs Monitoring: What's the Real Difference?

Let's talk about something central to DevOps work: logging vs monitoring. While both are essential components of maintaining system health and reliability, they serve distinct purposes and complement each other in different ways. The distinction between them isn't always clear-cut, especially as tooling continues to evolve. This guide talks about the practical applications, technical differences, and implementation strategies for both logging and monitoring in modern DevOps environments.

Debug Logging: A Comprehensive Guide for Developers

When an app breaks and there's no clear clue why, debug logs often hold the answers. They record what the code was doing at each step, making it easier to trace back and spot what went wrong. This guide covers what debug logging is, why it’s useful, and how to use it without turning logs into a wall of noise.

HAProxy vs NGINX Performance: A Comprehensive Analysis

When architecting high-performance infrastructure capable of handling substantial traffic loads, the choice of load balancer is a critical decision that can significantly impact system reliability, performance, and cost-efficiency. Among the leading contenders, HAProxy and NGINX stand out as mature, battle-tested solutions with distinct strengths and characteristics.

How to Use Prometheus for APM

Monitoring applications shouldn’t be a guessing game. But too often, DevOps engineers end up buried under a pile of metrics that don’t help when things go wrong. That’s where Prometheus APM comes in. It offers a straightforward way to make sense of your systems—especially when you're working with modern, distributed setups like microservices.

Squadcast Strengthens Its Leadership in IT Alerting and Incident Management in the G2 Spring Report

2025 has already started out to be a remarkable year for Squadcast—with our key wins in the G2 Spring Reports, our acquisition by SolarWinds, and a series of impactful product releases and improvements. Our mission has always been clear: to deliver a unified platform that seamlessly integrates On-Call Management and Incident Response, empowering teams to boost service reliability and productivity—all without the burden of context switching.

Metrics That Matter: Measuring Developer Productivity in the AI Era

In this episode, Ryan McDonald is joined by Mark Quigley, Head of Platform Engineering at Ninety.io, for a conversation that cuts through the noise around developer productivity metrics and AI. Mark dives deep into how teams can measure what matters—without falling into the trap of turning every measure into a target. He shares how tools like Developer NPS, DORA metrics, and balanced scorecards can help teams optimize for both output and well-being—but only when framed with the right intent.