The latest news and information on incident management, on-call, incident response, and related technologies.

Captain's Log: Diving into our scheduling design

On-call scheduling is tricky. Like, really tricky. It was one of the scariest parts of our decision to build a modern alerting system earlier this year. We knew we couldn't cut any corners on Day One of our release, because scheduling needed to be a fully loaded feature before anyone could realistically use our product (and replace an incumbent). This meant including windowed restrictions, coverage requests, and rotations ranging from simple to complex.

Ping Command: A Comprehensive Guide to Network Connectivity Tests

The ping network test, a core utility since the 1980s, plays a crucial role in confirming connectivity between IP-networked devices. In this guide, we'll delve into what the ping command is, how to run a ping network test, which IP addresses are commonly pinged, how to interpret the results, and how to troubleshoot errors.

Events vs. Alerts vs. Incidents

Event. Alert. Incident. These terms are bandied about, often interchangeably, in IT operations management. Broadly speaking, they all refer to situations where something is potentially amiss and needs to be investigated and resolved. Each of these three words does, however, have a distinct definition. Because they are used in scenarios where clear communication and timeliness are critical, it’s important to understand the differences and use them appropriately.

Reducing the burden of incident response on your teams

In this webinar, a panel of engineering leaders, including Chris Evans, CPO at incident.io, shares how they reduce the burden of incident response for their teams. They advocate for a culture of shared responsibility across the board, offering practical strategies to educate the business about engineering practices during the chaos of an outage.

How to Route Alerts to Subject Matter Experts Using Squadcast Tagging & Routing Rules?

Effective incident management is crucial for ensuring customer satisfaction and brand loyalty. As systems grow more complex, efficiently directing alerts to the right teams becomes essential. This article delves into the challenges, implementation, and benefits of automating incident categorization.