Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

5 reasons why you shouldn't buy incident.io

Not many companies will tell you why you shouldn’t use their product, but any product that tries to be everything to everyone is doomed to failure. When you build without a specific user in mind, your target becomes the intersection of many viewpoints, and what you build is the lowest common denominator. What usually follows is software that can technically do everything, but feels unfocused, complex, and unpleasant to use. Something everyone is equally unhappy with.

Fast Track series: easily integrate monitoring alert sources

Integrating all of your monitoring alert sources is quite a task. Large enterprises often struggle to aggregate millions of data records from dozens of monitoring, change, and topology tools in real time. Filtering out the noise and prioritizing the most important alerts are crucial to a team’s success. BigPanda makes it simple to integrate with any monitoring alert source through the Open Integration Hub. Currently, we have more than 50 easy-to-use integrations to choose from.

RIA Vendor Selection Matrix for AIOps 2022

In July, the research firm Research In Action (RIA) published the 2022 edition of its annual Vendor Selection Matrix™. Although AIOps is a well-established technology (Moogsoft has customers who have been reaping the benefits of AIOps for many years), selecting a vendor can still be quite difficult, given the plethora of vendors who quickly re-branded their solutions as AIOps. So a vendor selection guide is a valuable resource.

Honeycomb Announces Major Updates to PagerDuty Integration

Today, we’re announcing major new updates to Honeycomb’s PagerDuty integration. These updates put more of the information you need into PagerDuty notifications and allow for greater configurability. These enhancements are available to all users who leverage Honeycomb Triggers and Burn Alerts to send notifications via PagerDuty.

New Feature: New Component Status Types

What’s just as important as resolving an impacted service? Providing detailed yet digestible updates to your communities and stakeholders. A recent update to StatusCast adds three new status types that can be assigned to your components. Detailed communication is an essential component of incident response and management, and additional status types provide your users with a more granular view of incident activity.

SignalFlows to SLOs

How are you tracking the long-term operation and health indicators for your micro and macro services? Service Level Indicators (SLIs) and Service Level Objectives (SLOs) are prized (but sometimes “aspirational”) metrics for DevOps teams and ITOps analysts. Today we’ll see how we can leverage SignalFlow to put together some SLO error budget tracking (or easily spin up the same with Terraform)!
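For readers new to error budgets, the arithmetic behind them can be sketched in a few lines of Python. This is an illustrative sketch only, not SignalFlow or Terraform code from the post; the function name `error_budget_remaining` and the sample numbers are assumptions made up for this example.

```python
def error_budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    Hypothetical helper for illustration; not part of any vendor API.
    slo_target is the SLO as a fraction, e.g. 0.99 for "99% of requests succeed".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        # No traffic in the window, so none of the budget has been spent.
        return 1.0
    # The budget is the number of bad events the SLO tolerates in this window.
    allowed_bad = (1.0 - slo_target) * total_events
    return 1.0 - bad_events / allowed_bad


if __name__ == "__main__":
    # Assumed sample window: 10,000 requests, 25 failures, 99% SLO.
    remaining = error_budget_remaining(0.99, 10_000, 25)
    print(f"Error budget remaining: {remaining:.0%}")
```

With a 99% target, a window of 10,000 requests tolerates roughly 100 failures, so 25 failures spend about a quarter of the budget.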

What you need to know & do to be a world-class cyber incident responder

World-class incident responders are a strategic asset in today’s world, where the frequency and sophistication of cyber security attacks continue to increase every year, as do the associated financial damages. As such, more and more organizations are looking to grow their cyber incident response expertise, both with in-house staff and by engaging third-party experts.

We're making our on-call calculator free

We've all done it: "that'll be simple, I'll just write a quick script and..." In the case of calculating on-call pay, we really have done it before: our team have built the on-call pay scripts for several companies, and each attempt was a painful, error-prone process. While we believe everyone on-call should be paid for their inconvenience, relying on someone's side-project or back-of-napkin maths to calculate pay leads to mistakes, frustration, and wasted time.

What is a Security Operation Center and how do SOC teams work?

With the growing complexity of IT environments, it is essential to have robust security processes that can safeguard them from cyber threats. In this blog, we will explore how security operation centers (SOCs) help you monitor, identify, and prevent cyber threats to safeguard your IT environments. This blog covers the following points.