June 2021

Threat Stack and Squadcast Integration Streamlines Alerts with Greater Context

Jun 30, 2021 By Anusha Ravindra In Squadcast

This is a guest post collaboration between Squadcast & Threat Stack. The move to the cloud has rapidly expanded the cyber threat surface of modern cloud apps. This blog, in partnership with Threat Stack, outlines how you can stay on top of your game with the help of context-rich alerting and resolve security incidents rapidly, along with a few best practices to follow for faster incident response.

Read Post

Wiley Relies on PagerDuty as the World Moves Towards Digital Learning

Jun 30, 2021 By PagerDuty In PagerDuty

John Wiley & Sons, Inc., commonly referred to as Wiley, is a global publishing company founded in 1807 that focuses on academic publishing and instructional materials. Sean Mack, CIO and CISO of Wiley, discusses how PagerDuty is empowering teams to own and support services 24/7/365 as digital learning becomes more prevalent.

View Video

xMatters Lunar Lander Release - New Product Features - xMatters Demo

Jun 30, 2021 By xMatters In xMatters

The xMatters Lunar Lander release is here! Join Sr. Director of Customer Success, Kerin Munro, and Product Manager, Daniel Reich, as they discuss some of the latest and greatest product features that went live with the Lunar Lander release. These updates include new possibilities in xMatters Flow Designer with a create alert step and an incident severity step, updates to Event Flood Control, and more!

View Video

3 Steps For A More Strategic Approach to Incident Reduction

Jun 28, 2021 By Prabhu Kaliaperumal In Nexthink

When an IT incident negatively impacts employee experience, IT teams rush to remedy the issue – understandably, as a widespread incident can have major effects on employees’ productivity, security, and overall experience. Yet, so many IT teams find themselves drowning in support tickets even as they continue to resolve top call drivers (the incidents that affect the most employees and drive the most support requests).

Read Post

How Integrations Lead to Easier, Quicker and Better Decision-Making

Jun 25, 2021 By Richard Whitehead In Moogsoft

Whether from a monitoring tool such as Datadog, a collaboration tool such as Slack, an automation tool such as Chef or a ticketing tool such as ServiceNow or JIRA, AIOps seamlessly integrates data from all of your IT sources. A robust AIOps solution with integrations can help your DevOps and SRE teams know where to begin fixing problems, resolving incidents before they affect services and reducing downtime.

Read Post

Why You Need Real-Time for Faster MTTR

Jun 25, 2021 By Dave McAllister In Splunk

“If you ain't first, you're last.” While that famous one-liner from Ricky Bobby (Will Ferrell) in the cult hit Talladega Nights is more joke than catchphrase, it hits home for those of us in the world of DevOps and Observability. Faster is better. And in our technology-driven world of online transactions and complex environments, faster isn’t just better — it’s crucial.

Read Post

IT Alerting with SIGNL4

Jun 25, 2021 By SIGNL4 In SIGNL4

How SIGNL4 and its mobile alerting app streamline on-call duty in 24/7 IT operations and prevent alert fatigue.

View Video

7 Essential Tools for SREs

Jun 25, 2021 By Quentin Rousseau In Rootly

From chaos engineering to monitoring and beyond, SREs rely on several key types of tools to do their jobs.

Read Post

Have You Herd? Episode 2 | Getting into the DevOps Culture

Jun 25, 2021 By Moogsoft In Moogsoft

Join the Moogsoft Engineering team for our second episode of Have You Herd?! This episode we talk about how you can get into the DevOps culture covering questions like... How do you contribute to a DevOps culture as an individual contributor? What pipelines and tools should a company have set up before embarking on the DevOps journey? What kind of skills should you have to market as a DevOps leading engineer?

View Video

Solarisbank Banks on PagerDuty to Keep Financial Services Online

Jun 24, 2021 By PagerDuty In PagerDuty

Solarisbank is Europe’s leading Banking-as-a-Service platform that enables any business to offer their own financial services. Satyajit Ranjeev, Daria Kameneva, and Jens Hermann discuss how PagerDuty helps teams implement a “you build it, you own it” model and reduce incident response times.

View Video

Customers Choose PagerDuty for Real-Time Operations

Jun 24, 2021 By PagerDuty In PagerDuty

Organizations need a solution that’s designed for today’s dynamic digital reality. Hear customers like Carrefour Bank, IG, The Trevor Project, Vodafone, and Zoom explain how PagerDuty empowers them in an always-on, real-time world.

View Video

PagerDuty Summit21 Fireside Chat: Connecting Customer Experience & Customer Service

Jun 24, 2021 By PagerDuty In PagerDuty

Listen in as PagerDuty SVP Jonathan Rende chats with Clara Shih, CEO of Service Cloud at Salesforce, on how PagerDuty helps Customer Service teams improve CSAT, meet SLAs, and more.

View Video

Can Emails Initiate xMatters Workflows? - Ask Adam

Jun 24, 2021 By xMatters In xMatters

You’ve spotted an incident, but how do you get your team to start working on it? xMatters workflow expert Adam can show you how. Email triggers in xMatters are fast and effective, and a great way to get workflows going with minimal fuss. There are a few steps to getting them configured right, so let's go through it from the beginning.

View Video

Incident Management

How to Introduce Automation to Incident Response with Slack and PagerDuty

Jun 24, 2021 By Slack In PagerDuty

Major-incident war rooms are synonymous with stress. Pressure from executives, digging for a needle in a haystack, too much noise—it’s all weight on your hardworking technical teams. Incident responders clearly need a more effective way to collaborate across various technical teams. A method that both minimizes interruptions and keeps stakeholders up to date while ensuring everyone has the right level of context to do their job.

Read Post

Reliable ticket and incident alerts with ConnectWise and SIGNL4

Jun 23, 2021 By SIGNL4 In SIGNL4

With SIGNL4 your on-call teams and field services engineers will never miss a critical ticket. And they won't suffer alert fatigue, either. SIGNL4 adds reliable mobile alerting by push, text and voice call, event filtering, duty scheduling and much more to ConnectWise within a few minutes.

View Video

Resilience in Action E8: Vanessa Yiu on Crafting Enterprise Architecture

Jun 23, 2021 By Blameless Community In Blameless

Resilience in Action is a podcast about all things resilience, from SRE to software engineering, to how it affects our personal lives, and more. Resilience in Action is hosted by Kurt Andersen. Kurt is a practitioner and an active thought leader in the SRE community. He speaks at major DevOps & SRE conferences and publishes his work through O'Reilly in quintessential SRE books such as Seeking SRE, What is SRE?, and 97 Things Every SRE Should Know.

Read Post

PagerDuty Summit21 Keynote: DigitalOps Now: Go Digital First with Modern Digital Ops Management

Jun 23, 2021 By PagerDuty In PagerDuty

To succeed in a world of digital-first customer experiences, operations must also be digital-first. Join PagerDuty CEO Jennifer Tejada & CPO Sean Scott as they share the latest PagerDuty innovations and our vision for the future of work. Don't miss exclusive fireside chats with Fox Corporation executives Paul Cheesbrough, CTO & President of Digital, and Jeff Dow, EVP for Media and Broadcast, as well as Kim Hammonds, Investor and Board Member at Zoom, Box, Tenable and UiPath and The Goldman Sachs Group, Inc.

View Video

Leverage Observability With OpenTelemetry to Understand Root Cause Quickly

Jun 23, 2021 By Clay Smith In PagerDuty

An observability solution should help any incident responder understand what changed and why. A lot has been written on the difference between monitoring and observability, but an easy way to understand how both are integral to incident response is to consider how customers use PagerDuty—with both monitoring and observability tools—to get to the right answer.

Read Post

SREview Issue #14 June 2021

Jun 22, 2021 By Blameless Community In Blameless

Hoping you're headed towards a fun summer season and some time without masks. Let's avoid a new kind of tan-line! This newsletter shares useful industry content and an exciting Blameless product announcement. Find our fave tweets and events in the SRE and resilience engineering community. We're hiring! Check out the job openings here.

Read Post

xMatters Makes Workflow Automation as Simple as Drag and Drop

Jun 22, 2021 By xMatters In xMatters

xMatters’ low- to no-code integrations make creating automated workflows that align your team and processes as simple as drag and drop. With just a few clicks, your teams can be building workflows that integrate, automate and accelerate your incident response and resolution capabilities. Best of all, xMatters is free to use and you can get started today at xmatters.com/free!

View Video

Incident Management

Red Canary says 43% Lack Readiness to Notify Customers of a Security Breach

Jun 22, 2021 By AlertOps In AlertOps

The phrase “stakeholder management” assumes that stakeholders are truly informed by alerts. However, managers can only send communications out; they cannot force people to address them. To ensure your stakeholders are engaged during an incident, it is vital to set up a defined communication process. Yet, a recent Red Canary report found that 43% of surveyed participants lack readiness to notify the public and/or their customers in the event of a security breach.

Read Post

Everything You Need to Know About Emergency Risk Management

Jun 21, 2021 By Christopher Gonzalez In OnPage

Emergency risk management (ERM) is the process of identifying potential threats and minimizing the impact of disasters on business operations and people. The process requires leaders within an organization to determine how they will keep stakeholders informed and safe during critical events. Leaders must also craft disaster recovery plans to quickly remedy the effects of a catastrophic event on communities, government agencies and organizations.

Read Post

IT Ops Maturity Model - an explainer video

Jun 21, 2021 By BigPanda In BigPanda

Mateo explains what an IT Ops Maturity Model is, and how you can use it to understand your systems' current state and see where you want them to be.

View Video

SCOM Connection Center for Cherwell - Introduction in 3 mins

Jun 17, 2021 By Cookdown In Cookdown

Find out how to convert critical SCOM alerts into actionable Cherwell incidents with real-time, two-way synchronization. This short video illustrates how we use out-of-the-box, code-free integration to get SCOM and Cherwell working as one!

View Video

SCOM Connection Center for Cherwell Introduction

Jun 17, 2021 By Cookdown In Cookdown

Find out how to convert critical SCOM alerts into actionable Cherwell incidents with real-time, two-way synchronization. This deep dive illustrates how we use out-of-the-box, code-free integration to get SCOM and Cherwell working as one!

View Video

Practical Guide to SRE: Incident Severity Levels

Jun 17, 2021 By Nancy Chauhan In Rootly

Incident severity levels are a measurement of the impact an incident has on the business. Classifying the severity of an issue is critical to deciding how quickly and efficiently problems get resolved.

Read Post

Monthly Moo Update | May 2021

Jun 16, 2021 By Adam Frank In Moogsoft

Goodbye May, Hello June! It’s summertime in the northern hemisphere and the sun is shining bright, along with the updates we’ve got for you this month. The team at Moogsoft is working on a few big items that will be sure to put a smile on your face. But let's not forget about some of the smaller items that help you day in and day out.

Read Post

Manage incidents on the go with the Datadog mobile app

Jun 15, 2021 By Sacha Guyon In Datadog

The Datadog mobile app enables you to check your alerts and dashboards from anywhere, so you can triage issues—and stay up to date—regardless of whether you have access to a laptop. You can now be even more productive when responding to issues while away from your keyboard by declaring incidents and notifying responders directly from your mobile device.

Read Post

Uptime is Money

Jun 15, 2021 By PagerDuty In PagerDuty

In today’s digital world, any issue can cost millions. That’s why over 13,000 companies, and more than 60 of the Fortune 100, rely on PagerDuty to identify issues and automate responses for fast resolution. Keep your business always on and your customers happy, and protect your bottom line at pagerduty.com/uptime-is-money/

View Video

Everbridge Control Center - Integrate control of your physical assets

Jun 15, 2021 By Everbridge In Everbridge

For many organizations, physical security management can be a daunting task. Threats are on the rise and risks are becoming increasingly diverse. Operations also continue to grow, involving more systems, more data, and many more users.

View Video

Incident Management

WEX Automates the Triage Process and Delivers a Better Services Experience - xMatters Demo

Jun 15, 2021 By xMatters In xMatters

Does your internal triage process keep you up at night, literally or figuratively? WEX used to have triage and onboarding issues that got in the way of their success too, but with xMatters, they’ve found a better way. Join James Molchanoff (JT), Information Systems Engineer at WEX, John Kallio, Information Systems Engineer at WEX, Will Derksen, Product Advocate at xMatters, and Zoe Na, Customer Success Manager at xMatters, as they discuss how WEX has embraced xMatters to reduce triage and call-out time and simplify their onboarding process.

View Video

Incident Management

Observability and the Monitoring Maturity Model

Jun 15, 2021 By Allyson Barr In StackState

In incident management, observability is the ability of an organization or team to infer a system's internal state from its external outputs.

Read Post

Maximize Collective Knowledge to Deliver Patient Care

Jun 15, 2021 By Ritika Bramhe In OnPage

Medical practitioners must move beyond their own expertise to make informed patient care decisions. This can be achieved by normalizing team collaboration, encouraging providers to access information gathered by other specialists along the patient’s continuum of care. However, healthcare is plagued with fragmented communication due to archaic technology. There is also a lack of accountability when establishing communication roles and responsibilities across care teams.

Read Post

SRE For Enterprise

Jun 15, 2021 By Blameless In Blameless

Kurt (Head of Strategy), Nicolas (Product Manager), and Paul (Customer Success Manager) from Blameless talk about SRE for the enterprise. They conclude the webinar with an exciting product announcement! Stay tuned, stay blameless.

View Video

7 key processes for running a top performing NOC

Jun 14, 2021 By Eyal Katz In Exigence

Much of the fuel for today’s business organizations consists of cloud computing and digital and SaaS applications. So, if something goes wrong with them, there will be a grave impact on productivity, customer satisfaction and even loyalty, as well as on the costs required for resolving the incident, remediating damage, and getting back to business.

Read Post

Complete Guide to Service Level Objectives (SLOs) That Work

Jun 11, 2021 By Noor-ul-Anam Ruqayya In Blameless

Wondering what Service Level Objectives (SLOs) are? In this article, we will explain service level objectives and how they relate to SLAs, SLIs, and error budgets. A Service Level Objective (SLO) is a reliability target that is measured by a Service Level Indicator (SLI) and sometimes serves as a safeguard for a Service Level Agreement (SLA). SLOs represent customer happiness and guide the development team’s velocity.

Read Post

BigPanda and xMatters Can Do What??? - xMatters Demo

Jun 10, 2021 By xMatters In xMatters

Have you ever dealt with two or more separate incidents, but something about them seems suspiciously similar? Well, BigPanda and xMatters might just be the toolset you need to start connecting the dots.

View Video

The MTTR that matters

Jun 10, 2021 By Robert Ross In FireHydrant

“Mean time to X” is a common term used to describe how long, on average, a particular milestone takes to achieve in incident response. There’s mean time to detect, acknowledge, mitigate, etc. And then there’s the elusive “mean time to recover,” also known as “MTTR.” MTTR, a hotly debated acronym and concept, measures how long it takes to resolve an incident on average. The problem with MTTR, though, is that it doesn’t matter.
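To make the arithmetic behind a "mean time to X" figure concrete, here is a minimal sketch in Python. It assumes each incident is recorded as a pair of timestamps (when the clock started and when the milestone was reached); the function name `mean_time_to` and the sample incidents are hypothetical and are not taken from the FireHydrant post.

```python
from datetime import datetime, timedelta


def mean_time_to(milestones: list[tuple[datetime, datetime]]) -> timedelta:
    """Average elapsed time between a start and a milestone timestamp.

    Each tuple is (started_at, milestone_at), for example
    (incident opened, incident resolved) for a mean time to resolve.
    """
    if not milestones:
        raise ValueError("at least one incident is required")
    # Sum the per-incident durations, starting from a zero timedelta.
    total = sum((end - start for start, end in milestones), timedelta())
    return total / len(milestones)


# Hypothetical sample data: one 30-minute and one 90-minute incident.
incidents = [
    (datetime(2021, 6, 1, 10, 0), datetime(2021, 6, 1, 10, 30)),
    (datetime(2021, 6, 2, 12, 0), datetime(2021, 6, 2, 13, 30)),
]
print(mean_time_to(incidents))  # 1:00:00
```

Note how the single average hides the spread between the two incidents, which is one reason a mean on its own can be a weak signal.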

Read Post

Here's what SLIs AREN'T

Jun 10, 2021 By Emily Arnott In Blameless

SLIs, or service level indicators, are powerful metrics of service health. They’re often built up from simpler metrics that are monitored from the system. SLIs transform lower level machine data into something that captures user happiness. Your organization might already have processes with this same goal. Techniques like real-time telemetry and using synthetic data also build metrics that meaningfully represent service health.

Read Post

Press Release: iLert achieves Amazon RDS Ready designation

Jun 9, 2021 By iLert In iLert

Cologne, Germany – iLert GmbH, a SaaS company for alerting, on-call management, and uptime monitoring, announced today that it has achieved the Amazon RDS Ready designation, part of the Amazon Web Services, Inc. (AWS) Service Ready Program. This designation recognizes that iLert has demonstrated successful integration with Amazon Relational Database Service (Amazon RDS).

Read Post

Faster Incident Resolution with Context Rich Alerts

Jun 9, 2021 By Roshan Shetty In Squadcast

Labelling your alert payloads, although simple, can significantly reduce the time it takes for your team to respond to incidents. In this blog, learn how Squadcast's auto-tagging feature can be a game changer by enabling intelligent labelling and routing of alerts to ultimately reduce your MTTR. A frequent problem faced by on-call engineers when critical outages occur is pinpointing the exact point of failure.

Read Post

AIOps as a modern cockpit, and why that matters

Jun 9, 2021 By BigPanda In BigPanda

Join us in a CTO Perspective discussion with Jason Walker, Chief Customer Officer at BigPanda and former marine pilot, to find out exactly how IT Ops is following in the footsteps of the modern cockpit and why that should matter to anyone looking to adopt AIOps.

View Video

AIOps as a modern cockpit, and why that matters

Jun 9, 2021 By Yoram Pollack In BigPanda

Our human capacity for ingesting information and acting on it is constant. As the systems we operate grow more complex, we need to make sure we use technology that presents us with only the relevant information we need, exactly when we need it. In aviation, this lesson was learned long ago, and now IT Ops is catching up.

Read Post

5 Steps to Building an Effective Clinical Communication Plan

Jun 8, 2021 By Christopher Gonzalez In OnPage

Organizations require a well-crafted clinical communication plan to streamline workflows across care teams. The communication plan must include processes, hardware and software that improve how providers perform. An effective communication plan eliminates barriers across departments and ensures that all providers are informed of patient-related incidents. High-level healthcare administrators are responsible for designing, managing and launching the clinical communication plan.

Read Post

Chapter 7: In Which Sarah Experiments with Observable Low-Code

Jun 8, 2021 By Helen Beal In Moogsoft

This is the seventh chapter in a series of blog posts exploring the role that intelligent observability plays in the day-to-day life of smart teams. In this chapter, our DevOps Engineer, Sarah, experiments with low code and Moogsoft in her team’s DevOps toolchain to rush a new feature out the door to keep up with a competitor.

Read Post

Streamline incident management with BigPanda's offering in the Datadog Marketplace

Jun 7, 2021 By Kai Xin Tai In Datadog

BigPanda is a domain-agnostic AIOps platform that helps organizations detect and resolve incidents in their complex IT environments. By unifying and correlating data from monitoring, change, and topology tools, BigPanda enables teams to quickly pinpoint the root cause of issues and prevent costly outages.

Read Post

Are you an MS Teams shop? We've got you Covered with Blameless Incident Resolution

Jun 7, 2021 By Blameless Community In Blameless

We have an exciting announcement. Blameless is providing early access to our Microsoft Teams integration. SRE and engineering teams can now resolve incidents faster without leaving the comfort of their favorite messaging tool. With the Blameless incident resolution product, Microsoft Teams users can now reduce toil in routine incident response processes through automation, codify processes with checklists, and craft retrospectives with the ‘add to timeline’ command.

Read Post

Take the Lead: Jennifer Tejada & Hayden Brown

Jun 4, 2021 By PagerDuty In PagerDuty

Our CEO, Jennifer Tejada, recently sat down with Hayden Brown, CEO of UpWork, to discuss how companies are transitioning to a digital world and utilizing freelancers. Listen in on their conversation!

View Video

Using Elastic for Root Cause Analysis

Jun 4, 2021 By Elastic In Elastic

Elastic allows you to store logs, metrics, and traces in a single datastore. This makes it easier to have unified visibility of your observability data. In this video, you'll learn how this helps you perform root cause analysis.

View Video

Have You Herd? Episode 1 | DevOps vs SRE

Jun 3, 2021 By Moogsoft In Moogsoft

Join the Moogsoft Engineering team for their inaugural stream as we tackle the big questions - How do we define DevOps? And as it becomes more mainstream - will the roles of development & ops combine forever into super powered developers, or does the complexity of our systems require further specialization between the two roles?

View Video

The Incident Review: 4 Times When Typos Brought Down Critical Systems

Jun 3, 2021 By JJ Tang In Rootly

Sometimes, as these 4 incidents highlight, major failure results from a mere typo or configuration oversight.

Read Post

Who is on standby? Simple question, simple answer.

Jun 3, 2021 By Derdack In Derdack

In our feature session for the current Enterprise Alert release, we were asked if it was possible to make the on-call page available to every employee regardless of whether they have a user account in Enterprise Alert or not. This option has existed in Enterprise Alert for a long time, but admittedly it is not very well documented. So I would like to take this opportunity to show you what the on-call overview can offer you and how to share the on-call page.

Read Post

Copy and Paste Multi-Team Schedules

Jun 3, 2021 By Derdack In Derdack

With the release of Enterprise Alert 9, not only have our capabilities for tighter integration with almost any source system imaginable been massively expanded, but our front end has also received some much requested updates. Among them are our multi-team schedules. These allow simple and clear on-call planning for several teams across different time zones, which is especially useful for international companies.

Read Post

Integration of Enterprise Alert 9 with Azure Monitor

Jun 3, 2021 By Derdack In Derdack

Our Azure Monitor connector provides seamless 2-way integration of Enterprise Alert 9 with Azure Monitor. Once added to your Enterprise Alert instance, the connector will read your Azure Monitor alerts fully automatically and trigger alert notifications, e.g. to your team members on duty. It also synchronizes the alert status from Enterprise Alert 9 to Azure Monitor so that if alerts are acknowledged or closed, this status is also updated on the corresponding alert in Azure Monitor.

Read Post

Error Budgets Explained (And How to Make One for Your Team)

Jun 2, 2021 By Noor-ul-Anam Ruqayya In Blameless

Wondering what error budgets (EBs) are and how they are useful? We explain what they are, how they are defined, and how they can help your team. An error budget is the amount of acceptable unreliability a service can have before customer happiness is impacted. If a service is well within its budget, the developers can take more risks in their releases. If not, developers need to make safer choices.
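As a rough illustration of the definition above, an error budget can be computed as the share of a time window that the reliability target allows to fail. The sketch below assumes a time-based availability SLO and a 30-day window; the function names `error_budget_minutes` and `budget_remaining` are hypothetical and are not taken from the Blameless article.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO.

    slo_target is a fraction such as 0.999 for a 99.9% target.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


# A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
print(round(error_budget_minutes(0.999), 1))    # 43.2
# After 10.8 minutes of downtime, about 75% of the budget is left.
print(round(budget_remaining(0.999, 10.8), 2))  # 0.75
```

A team could treat a comfortably positive remaining fraction as room for riskier releases, and a value near or below zero as a signal to make safer choices.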

Read Post

In the Heat of the Page: Coping with the Root Cause of Incident Stress

Jun 1, 2021 By PagerDuty In PagerDuty

J. Paul Reed, Senior Applied Resilience Engineer at Netflix, walks us through how to cope with incident response stress.

View Video

CEO Keynote: It's Time for Digital Ops

Jun 1, 2021 By PagerDuty In PagerDuty

It’s time for Digital Ops. Listen to the inspiring Jennifer Tejada (CEO of PagerDuty) as she discusses the last year in digital transformation with Diya Jolly, Bret Taylor, Andy Jassy, David Williams, and Ebony Beckwith.

View Video
