The latest news and information on incident management, on-call, incident response, and related technologies.

Alerts to Incident Response in Three Easy Steps

You may already be using Splunk to ingest data and generate alerts and dashboards so you can act quickly on problems. But did you know you can start a VictorOps trial and, in three easy steps, have your Splunk alerts escalated to the right teams and people through a mobile app notification, an SMS message, or a live phone call?

PagerDuty and IBM Watson AIOps Team Up to Automate Real-Time Responses

“The way we work has changed forever.” Those are words that our CEO Jennifer Tejada used in her interview with Yahoo Finance a couple of weeks ago. Those words made me stop and think about how much of our customers’ daily work has changed irreversibly. Working from home has changed from a luxury to a necessity, so how do folks in the IT world adapt to this change?

Elastic Observability in SRE and Incident Response

Software services are at the heart of modern business in the digital age. Just look at the apps on your smartphone. Shopping, banking, streaming, gaming, reading, messaging, ridesharing, scheduling, searching — you name it. Society runs on software services. The industry has exploded to meet demands, and people have many choices on where to spend their money and attention. Businesses must compete to attract and retain customers who can switch services with the swipe of a thumb.

Modern ITSM Solutions: Creativity in Incident Response (Bring Your Own Tools)

The IT landscape is constantly evolving. A tool that is heavily used this month may be virtually obsolete the next. In such a dynamic ecosystem, the methods used to implement these tools are unique to every organization. Therefore, it has become crucial for organizations to implement an incident response process that incorporates any combination of tools, even those that are highly siloed and departmental.

Incident Resolution for Remote Teams

People working in IT support and incident management right now are faced with unusual difficulties supporting large remote workforces and managing unpredictable workloads. On Reddit, system admins and other IT pros are bemoaning the hiccups and hassles of working in isolation while trying to resolve issues and maintain high SLAs. You can’t go grab your indispensable SME for troubleshooting, because that person is also home and inundated with messages and alerts from many different tools.

Configure an Intuitive Service Dashboard & Reduce Response Time

Leverage Multiple Alert Sources in Squadcast to reflect your actual system infrastructure on your Service Dashboard.

Having your Incident Management Tool reflect your system architecture is a big milestone in reducing cognitive load on your on-call team. In order to help our users move one step closer to this milestone, we recently released the functionality to add multiple alert sources to a service. You can now model your service dashboard to mimic your actual system architecture.