August 2022

RESOLVE '22: Warp speed to digital innovation

The pandemic accelerated digital transformation in the business world by forcing companies to double down on areas in which they’d already begun investing. The mass moves to video conferencing solutions in healthcare and education are two examples. In other industries, companies were able to survive only by jumping into completely new areas, such as brick-and-mortar retailers diving feet-first into e-commerce after lockdowns and health concerns kept shoppers indoors.

Using incidents to level up your teams

I joined GoCardless as a junior engineer. It was one of my first coding jobs, and in my time there I progressed to senior much faster than I had expected. When I reflect on how this happened, one pattern stands out to me: the big step changes in my understanding, and my ability to solve larger and more complex engineering problems, came as a result of incidents.

What's New: Updates to PagerDuty Process Automation Software & PagerDuty Runbook Automation, Integrations, and More!

We’re excited to announce a new set of updates and enhancements to the PagerDuty Operations Cloud. Recent development and app updates from the product team include PagerDuty® Process Automation, our Partner Integrations and App Ecosystem, as well as Community & Advocacy Events updates. We continue to help customers automate everywhere to optimize cloud operations and reduce the number of issues escalated to other teams.

7 Best Practices for Emergency Managers

By recognizing that hazards, including severe weather events, are unpredictable and cannot be completely prevented, emergency managers can instead focus their efforts on promoting a resilient organization. A community is resilient when it can recover from a disaster or other stressor and get back on its feet as quickly as possible. There are many best practices for emergency managers to prepare for severe weather events such as hurricanes, floods, tornadoes, wildfires, and other risks.

RESOLVE '22: Bit by bit

It is difficult to define a single, solid maturity model for IT Operations. As moderator Jason Walker, BigPanda’s COO, said in our RESOLVE ’22 event Bit by bit, maturity models in “almost every other domain of IT” have not turned into a workable set of guideposts and indicators in the Ops domain. We welcomed Insurity’s Lead Cloud Operations Performance & Monitoring Admin, Ronnel Vergara, to take the stage and talk over this high-level topic at our event.

Round Robin Escalation: An Efficient Way to Distribute On-Call Responsibilities

Nowadays, organizations address a high volume of incidents every day. With so much happening, responders can be overwhelmed by the volume of incidents and may end up de-prioritizing certain important incidents. Hence, it is important to have an efficient on-call scheduling and escalation process in place. In this blog, we will explore how Round Robin Escalations can help distribute on-call load and set up efficient on-call schedules. This blog covers the following pointers.

Bridging the gap between Engineering and Customer Support during incidents

Customer trust and satisfaction are the most important currency your business can own. No matter how brilliant your product, without happy customers your business will struggle. When everything is running smoothly, it’s easy to feel that heady dose of customer love. It’s when things break during an incident that these relationships are really put to the test.

The Five Main Components of a Fully Developed EHR System

The adoption of electronic health record (EHR) systems has seen tremendous growth across geographies, especially in the US. According to American Hospital Association data shared by the Office of the National Coordinator for Health Information Technology, over 93% of American hospitals are enabled by some form of EHR in their organization. Implementing an EHR system in your clinic or hospital is a big decision.

Get started with Grafana OnCall and Terraform

Managing on-call schedules and escalation chains, especially across many teams, can get cumbersome and error prone. This can be especially difficult without as-code workflows. Here on the Grafana OnCall team, we’re focused on making Grafana OnCall as easy to use as possible. We want to make it easier to reduce errors with your on-call schedules, create schedule and escalation templates quickly, and fit on-call management into your existing as-code patterns.

Healthchecks + Squadcast Integration: Routing Alerts Made Easy

Healthchecks is a cron job monitoring service which listens to HTTP requests and email messages ("pings") from your cron jobs and scheduled tasks ("checks"). It lets you update your job to send an HTTP request to the ping URL every time the job runs. When your job does not ping Healthchecks.io on time, then you will receive an alert! If you use Healthchecks for your monitoring needs, you can now integrate it with Squadcast to route detailed alerts from Healthchecks to the right users in Squadcast.
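
The ping step described above can be sketched in Python. This is a minimal illustration, not the integration itself: the UUID in `PING_URL` is a placeholder, the helper names (`http_get`, `run_with_ping`) are hypothetical, and the `/fail` suffix for reporting a failed run is assumed to be enabled for your check.

```python
import urllib.request

# Hypothetical placeholder; substitute the ping URL shown for your own check.
PING_URL = "https://hc-ping.com/your-check-uuid"


def http_get(url, timeout=10):
    """Send a GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def _safe_send(send, url):
    """Ping, but never let a monitoring hiccup break the job itself."""
    try:
        send(url)
    except OSError:
        pass  # urllib.error.URLError and socket timeouts are OSError subclasses


def run_with_ping(job, ping_url=PING_URL, send=http_get):
    """Run job(); ping on success, ping the /fail endpoint if it raises."""
    try:
        result = job()
    except Exception:
        _safe_send(send, ping_url + "/fail")
        raise
    _safe_send(send, ping_url)
    return result
```

A scheduled script would wrap its entry point, for example `run_with_ping(nightly_backup)`, where `nightly_backup` stands in for your own job function. If the job hangs or the host is down, no ping arrives and the missed check is what triggers the alert.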

Introduction to Service Catalog | Service Ownership | Service Classification Squadcast

To make service management a breeze, we bring to you our improved Service Catalog. The Service Catalog is designed to improve Service Classification and bring more transparency to Service Ownership within your org. This video explains how a consolidated summary of all active services from a single dashboard can help you better track your service health.

How To Reduce Incident Tickets

In IT environments, incidents happen all the time and it's impossible to prevent all of them. Regardless of the available software solutions or the level of technical training of both users and developers, no organization is immune to incidents. The increased dependence on IT infrastructure to provide core services means that any disruption in IT services can cause any organization significant financial and reputational harm. For example, IT service providers need to resolve customer support tickets following the service-level agreements (SLAs), and failing to do so makes them liable for breaching such agreements.

What are Runbooks? And why are they needed?

Imagine being an Ops engineer in a team just struck by tragedy. Alarms start ringing, and incident response is in full force. It may sound like the situation is in control. WRONG! There's panic everywhere. The on-call team is scrambling for the heavenly door to redemption. But the one thing that doesn't stop is the stakeholder inquiries. This situation is bad. But it could be worse. Now imagine being a less-experienced Ops engineer in a relatively small on-call team struck by tragedy. If you don't have sufficient guidance, let alone moral support, you're toast.

RESOLVE '22: Expert predictions for AIOps 2022-2025

BigPanda’s RESOLVE ‘22 conference hosted a number of luminaries in the AIOps and IT Ops world, so naturally we needed to get their thoughts on the future of the market and where they see AIOps going in the next few years. Our guests for the session titled Expert predictions for AIOps 2022-2025 were from the press, investor community, analyst community and vendor world.

Using StatusPage at Squadcast | SRE Best practices | Squadcast

Let your customers know how your Services are doing, without them having to ask you about it. One of the core principles of SRE is Transparency and Status Pages help you communicate the status of your Services to your customers at all times, as opposed to you getting to know the status of your Services through support tickets logged by your customers.

What are Canary Deployments and Why are they Important?

Every modification to software comes with the potential for production problems. Application failures often have serious consequences which can result in a loss of revenue and a poor customer experience. Additionally, organizations constantly try to improve their services for a better customer experience. How can you minimize the chance of error and update your application with confidence?

incident.io + Indent - on-demand system access

At incident.io, we empower teams to run incidents quickly and effectively from start to finish. One of the ways we help is by taking the manual admin out of your incidents. More often than not, folks are spending too much time thinking about the process, when the time would be better spent focusing on fixing. Our automated workflows, nudges and prompts help to embed best practices and unlock time for more impactful work.

Mattermost Playbooks How-to: OKR Management

Creating, managing, and tracking high level goals can be incredibly burdensome and complex for organizations with numerous stakeholders and cross-functional collaboration. Team leads and executives manage multitudes of reporting tools and departments while contributors often have little visibility into the process of creating goals or the progress towards achieving those goals.

Performing Postmortems & Postmortem Templates at Squadcast | SRE Best practices | Squadcast

Postmortems are a way to summarize the resolution for an incident once it is resolved. It is also a way for you to create a knowledge-base of failures and fixes that can be shared across your team to help build a culture of shared learning and learning from failures.

Defining a Strategy for Process Automation

As business systems grow to encompass more locations, tools, and organizations, defining processes that keep pace with these changes can’t be left to a hodgepodge of disconnected programs—or worse, manual implementation of paper documentation. You need to automate. Automation within businesses first arose in the 1960s, alongside resource planning systems.

As Tech Innovation Reshapes the Workforce, Employee Safety Looms Large

Everbridge partner Atos, a global leader in cloud and the digital workplace, recently published an interesting take on the future of work and how new technology trends will impact businesses and employees alike, now and in years to come. The company suggests that the combination of 5G and technologically augmented humans will drive some of the most significant changes in the way we work.

Everbridge Live Severe Weather Preparedness and Response

Watch our Everbridge Live 25-min session and learn from a real-life emergency services rescue expert about the intricacies of coordinating a response. You will also get an in-depth view of how Everbridge can help you obtain accurate advance notice of severe weather and make sound decisions. In this session, you'll see how you can: Detect and assess threats. Locate impacted employees, assets, and suppliers. Take quick, decisive action to mitigate or eliminate the impact of the threat. Evaluate actions taken to improve risk resiliency. Hear expert analysis from a current civil defense responder.

Feeling zen, finding DORA, and the policy police

We’ve had a bumper month here at incident.io HQ. We’ve welcomed 3 new joiners, celebrated two 1 year incident.io anniversaries (congrats Lisa and Lawrence!), released a whole load of exciting new features and (for those of you wondering what’s been causing the recent heatwave) we’ve redesigned our website and it is on fire 🔥 😎 Here’s a round-up of some of this month's highlights…

Updating our data stack

It’s been over 6 months since Lawrence’s excellent blog post on our data stack here at incident.io, and we thought it was about time for an update. This post runs through the tweaks we’ve made to our setup over the past 2 months and challenges we’ve found as we’ve scaled from a company of 10 people to 30, now with a 2 person data team (soon to be 3 - we’re hiring)!

PagerDuty Service Standards helps organizations better configure services at scale

Service ownership, a DevOps best practice, is a method that many companies are pivoting towards. The benefits of service ownership are varied and include bringing development teams much closer to their customers, the business, and the value being delivered. The “build it, own it” model has tangible effects on customer experience, as developers are incentivized to innovate and drive customer-facing features that delight.

RESOLVE '22: AIOps: Not just a buzz phrase anymore

Thinking back to the rapidly expanding tech world of the 2010s, it’s easy to list off a number of buzzwords and phrases that became IT Ops mainstays over time. “Internet of things,” “big data” and even ideas as simple as the cloud were all once considered little more than slick marketing talk.

Crisis Management

Everbridge Crisis Management is integrated with the Critical Event Management (CEM) Platform, providing a single solution for business continuity, disaster recovery and emergency communication. In one application, crisis teams coordinate response activities, team members and resources to accelerate recovery times during unanticipated scenarios. Communicate with all stakeholders through one integrated platform, so you will never have to worry that your response plans are not getting executed.

Mean Time to Recovery (MTTR) explained

It's Friday afternoon, and you have mail. Apparently, a user received a 500 error when attempting to sign in. She contacted Customer Service. They didn't know what to do, so they forwarded the email to your engineering team. A close look at the email thread reveals that Customer Service received it... on Tuesday. And they sat on it until today. Hopefully, it was just this one user. You open your browser, navigate to the web application, and attempt to sign in. You also get a 500 error.

PagerDuty and Arize: Integrations for ML Observability

Arize is an ML Observability platform that aims to detect, troubleshoot, and eliminate ML problems faster. Use Arize to monitor your production models and send alerts to PagerDuty when your models deviate from a certain threshold. Arize and PagerDuty help keep your teams in the loop, send more comprehensive metadata through alerts, and debug your models faster than ever before.

A new channel per incident - helpful or harmful?

I caught the tail-end of a Twitter thread the other day which centred around the use of Slack channels for incidents, and whether creating a new channel for each new incident is helpful or harmful. It turns out this is a much more evocative subject than I thought, and since I have opinions I thought I’d share them!

Uptime + Squadcast Integration: Routing Alerts Made Easy

Uptime is a site monitoring solution used to reach various endpoints and notify users via push notifications when downtime is detected. It collects and stores downtime and response time data, which is then made available to users as reports. If you use Uptime for your monitoring needs, you can now integrate it with Squadcast to route detailed alerts from Uptime to the right users in Squadcast. The steps below will help you set up the Uptime and Squadcast integration.

That Rogers Outage is Going to be More Expensive Than You Think

On July 8 of 2022, the Canadian telecom company Rogers Communications suffered a major outage that impacted most of Canada for almost two days. This wasn’t completely unprecedented (they’d had an outage in 2021 that impacted their wireless servers for several hours) but the breadth and severity of this one is going to end up costing them far, far more than it seems at first glance.

See the big picture with the Service Dependency Graph

Understanding the impact and scope of an incident when degradation occurs is critical for returning your service online. This requires modeling the many downstream and upstream relationships between your services. Our new Service Dependency Graph provides a shortcut – a way to surface dependencies quickly, understand the relationship between services, and determine the scope or impact of an incident.

UBS invests in BigPanda to help drive digital disruption and innovation in AIOps

UBS is one of the leaders in the financial sector and one of the early adopters that are leveraging AI to do things better, cheaper and faster to bring their IT Operations in line with their cloud migration and digital transformation strategy. BigPanda is thrilled to have UBS as a customer and an investor to drive real transformation.

August 2022 Update - Change duty status of colleagues, configurable duty notifications and revised password change

Our August update now allows administrators and team administrators to change the service status of other users in the portal. We also made service settings more granular; for example, you can now turn off certain push messages when colleagues’ service statuses change. We have also revised how you change your personal password or remote action PIN in the portal. All details are available in this article.

RESOLVE '22: The SOC and the NOC

In our RESOLVE ’22 event The SOC and the NOC, moderator and 3 Tree Tech VP of Cybersecurity Kris Taylor welcomed two esteemed guests to the stage. As Kris noted at the top of the event, we brought our panelists together to talk about “the culture of the network operating center (NOC) and security operations center (SOC).” Along the way, they discussed different philosophical and practical takes on the high-level topics of networking and security.

IHS Markit: Centralizing Incident Management With PagerDuty & ServiceNow

In today’s digital world, organizations are constantly undergoing change. They’re moving to the cloud and rolling out DevOps at scale—all in the name of driving innovation. But moving from a monolith to microservices can lead to applications becoming increasingly distributed. When problems arise, customers don’t care how many teams and services you have, or how complex your architecture is. They only care that your services work when they need them to.

How DORA will impact incident management at financial entities

The Digital Finance Strategy is a European directive that aims to support and develop digital finance in Europe whilst maintaining financial stability and consumer protection. There are three main components to the package. In this blog post, we’ll attempt to summarise the 113-page DORA proposal, highlighting how it will apply to incident management at financial entities.

New Feature: StatusCast now integrates with Google Translate

Here at StatusCast we understand the importance of a resourceful and communicative status page. A status page is the ambassador of your incident response management process, and like any good ambassador, it needs to speak the language. If your status page is now hosted by StatusCast, it is now fully integrated with Google Translate, a powerful tool that allows your subscribers and even viewers to translate your page into the language most comfortable to them.

Minimizing Data Science Model Drift by Leveraging PagerDuty

PagerDuty has an Early Warning System (EWS) model which helps the Customer Success and Sales departments ascertain the wellness of existing PagerDuty customers based on product usage and external business factors. This Early Warning System model has become critical infrastructure and the first line of defense in identifying poor product usage that could result in account churn.

Fast track video series: Integrate ticketing and messaging tools with BigPanda

BigPanda’s Agnostic Integrations provides powerful bi-directional integration for enterprise ticketing, service desk and collaboration tools such as chat and incident response, so operators can easily share BigPanda incidents with other users in their ticketing and collaboration tools of choice. With BigPanda, teams can easily automate ticket creation as well as notifications and war room creation in chat tools.

Connecting to incident.io with Zapier

At incident.io, we believe that incidents are for everyone. As part of enabling that mission, we think it’s essential to ensure that all users can create, configure, and maintain business processes related to an incident. Today, we have two approaches to support different people, products, and organisational structures. We’re excited to announce that we’re taking this further and adding Zapier to our growing list of options to automate your processes (and focus on fixing)!

Get to the Root (Cause Analysis) in 5 Easy Steps

What is one of the first things you should do when you are assigned an incident via PagerDuty? If you immediately thought “Acknowledge!” you are not wrong, but after that, it’s all about resolving the issue as quickly and painlessly as possible. The first step to resolution is to investigate what caused the incident in the first place so you can easily get a fix in place.

Understanding Cloud Services: IaaS, SaaS, and PaaS

Cloud services have skyrocketed in popularity in the past few years, providing a vast array of resources as well as a cost-effective path for the migration from on-premises servers to the cloud. In fact, cloud services are handling all the computing needs of many businesses. It’s very likely you’re already using cloud services and will continue to use more as time goes on.

Using Squadcast's SLO Tracker | Error Budget | Setting up SLOs and configuring SLIs | Squadcast

With Squadcast, you can define and monitor Service Level Objectives for your services. SLOs allow you to define and enforce an agreement between two parties regarding the delivery of a given service. A Service Level Objective (SLO) is a reliability target, measured by a Service Level Indicator (SLI), and sometimes serves as a safeguard for a Service Level Agreement (SLA). SLOs represent customer happiness and guide the development team’s velocity.
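
To make the relationship between an SLI, an SLO target, and an error budget concrete, here is a small arithmetic sketch. It assumes an event-based SLI (good requests divided by total requests); the function names are illustrative and are not part of Squadcast's SLO Tracker.

```python
def sli(good_events, total_events):
    """Service Level Indicator: the fraction of events that were good."""
    if total_events == 0:
        return 1.0  # no traffic means nothing violated the objective
    return good_events / total_events


def error_budget(slo_target, total_events):
    """Number of bad events the SLO tolerates over the measurement window."""
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target, total_events, bad_events):
    """Fraction of the error budget still unspent; negative means overspent."""
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        return 0.0 if bad_events == 0 else float("-inf")
    return 1.0 - bad_events / budget
```

With a 99.9% target over one million requests, the budget is roughly 1,000 failed requests; after 250 failures about three quarters of the budget remains. A team might slow feature releases as the remaining fraction approaches zero.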

Interrupts in software teams: using unplanned work to your advantage

Interrupts are often seen as a problem that eats away at your team’s productivity, and gets in the way of shipping important things for your customers. It’s often consciously accrued from the tech debt we accept to ship features sooner. However when a team doesn’t have a good strategy for dealing with the consequences of those decisions, the pain is felt much more acutely and much sooner.

PagerDuty Debuts as a Leader in 2022 GigaOm Radar for AIOps Solutions

Every year there is a surprise in a Radar report. While it won’t be a surprise to our thousands of customers who are seeing tremendous benefits with us, PagerDuty is excited to be named a Leader in the 2022 GigaOm Radar for AIOps Solutions. GigaOm uses extensive criteria to evaluate vendors in their Radar.

PagerDuty Incident Response Demo (Extended)

Enjoy this demo that showcases a day in the life of a team handling an incident with PagerDuty's Automated Incident Response solution. PagerDuty enables teams to orchestrate the right response for every incident. It also helps organizations protect revenue and improve customer experiences by resolving critical incidents faster and preventing future occurrences. Now you can bring major incident best practices to your organization with end-to-end response automation and friction-free postmortems.

Arize integration with PagerDuty

Streamline model monitoring with integrated alerts. Arize is an ML Observability platform that aims to detect, troubleshoot, and eliminate ML problems faster. Use Arize to monitor your production models and send alerts to PagerDuty when your models deviate from a certain threshold. Arize and PagerDuty help keep your teams in the loop, send more comprehensive metadata through alerts, and debug your models faster than ever before.

RESOLVE '22: How to get multi-cloud done right

Multi-cloud is inevitable. With AIOps, struggling in its complexity doesn’t need to be. Business technology stacks don’t appear out of a vacuum. For the modern cloud-enabled, cloud-dependent company (that is to say, most of them), the view from the inside looks more like an ongoing evolution than a monolithic choice.

The Power of using Enterprise Alerts Remote Actions via Cloudbridge

For over 20 years Derdack has been developing products that meet the challenges of incident management. It is well documented how Enterprise Alert and SIGNL4 not only filter through the noise with advanced alert policies, but also target the right on-call engineer with sophisticated scheduling, anywhere ad-hoc collaboration and two-way communication back to the originating event source.

We've made it even easier to manage your FireHydrant configuration with Terraform

Many of our customers use FireHydrant’s verified Terraform provider to track configuration changes, ensure consistency, and automate repetitive configuration tasks. Back in March we streamlined our Terraform provider support for service catalog configuration. Today we are releasing extensive Terraform provider improvements for configuring runbooks, task lists, service dependencies, incident roles, and more.

Monitor 3rd-party outages in PagerDuty

We’ve integrated IsDown with PagerDuty so you can manage alerts in the same place you manage all your other alerts. The PagerDuty integration is part of our strategy to make it easy to monitor all the business dependencies that companies have nowadays. We live in a world where SaaS rules, and companies prefer to buy vs. build. But with that comes the problem of monitoring all these dependencies, which are critical to daily operations.

MTTJ - What is Mean Time to Join (MTTJ)?

MTTJ is the time taken to join a meeting. That time, and the delays caused in ensuring the right people are available, can be avoided using software automation and tools. This topic is not often talked about, but I'm sure everyone is directly affected by it. We discuss it in detail here: what it is, why it matters, and how it can be avoided.

Driving a customer-focused incident response process

Deep into an incident, Slack firing, up to your ears in decisions, not sure where to turn next? It’s easy for external communication with your customers to fall far down the list of priorities in these moments. However, these are the exact situations where comms are vital, and where underestimating their importance can have damaging and lasting effects on your organisation.

SRE: From Theory to Practice | What's difficult about tech debt?

In episode 3 of From Theory to Practice, Blameless’s Matt Davis and Kurt Andersen were joined by Liz Fong-Jones of Honeycomb.io and Jean Clermont of Flatiron to discuss two words dreaded by every engineer: technical debt. So what is technical debt? Even if you haven’t heard the term, I’m sure you’ve experienced it: parts of your system that are left unfixed or not quite up to par, but no one seems to have the time to work on.

New! Common Automated Diagnostics for AWS Users

Today’s modern cloud architectures centered on AWS are typically a composite of ~250 AWS services and workflows implemented by over 25,000 SaaS services, house-developed services, and legacy systems. When incidents fire off in these environments—whether or not a company has built out a centralized cloud platform—distinct expertise is often a necessity.

The Do's and Don'ts of Blameless Incident Postmortems

When an incident inevitably occurs, many organizations have a well-prepared incident management team that springs into action. Whether it’s a power outage or security breach, an incident can damage your company’s operations if not handled properly. A strong incident response team is critical to mitigating any negative impacts successfully. Furthermore, once your team resolves the problem, you should initiate a postmortem to detail the incident and record any lessons learned.

Blameless Demo: Streamline ServiceNow Incident Ticketing Workflows

Our Director of Product, Nicolas Phillip, shows you how to create ServiceNow incident tickets from your preferred chat tool or the Blameless interface. Watch his step-by-step tutorial and begin leveraging Blameless to create incident tickets in ServiceNow today.

Episode 6: Mooving to... Real release strategies with Jake Laverty

Every product or application needs a release strategy. It’s how you can double check that everything in your deployment is appropriately tested, validated and verified. Having a standardized release strategy in place allows your team to follow a protocol and reduce the number of unknowns they must face in the product life cycle. However, there are a few considerations to make this critical process run smoothly.

Automate incident response workflows with Eventarc and Datadog

Eventarc is a Google Cloud offering that ingests and routes events between GCP products, such as Cloud Run, Cloud Functions, and Pub/Sub, making it easy to build automated, event-driven workflows in complex environments. By taking care of event ingestion, delivery, authorization, and error handling, Eventarc reduces the development overhead that is required to build and maintain these workflows and helps you improve application resilience.

Tell the story of your incident with timeline curation

It isn’t the first time you’ve heard us say this and it won’t be the last: getting your post-incident process right is a game-changer. Being able to run effective debriefs and create useful postmortems helps us learn from our mistakes, respond better to future incidents and identify how we can build resilience in our product and teams. In short, it’s the thing that shifts the dial from just “fixing” to actually improving.

Anti-patterns in Incident Response that you should unlearn

It is important to invest time and effort in understanding why a system performs the way it does and how we can improve it. Companies continue with practices that yield successful results, but ignoring anti-patterns can be far worse than choosing rigid processes. In this blog we will explore anti-patterns in incident response and why you should unlearn those.

What is Event Orchestration? 7 ways to start using this powerful new feature from PagerDuty to reduce noise and automate away manual toil today

Does your team deal with too much noise? Does your heart sink a bit when you think about how much your rulesets have sprawled in order to manage your event processing needs? That’s why we released Event Orchestration earlier this year to help teams reduce the amount of manual work that goes into event management. Event Orchestration is the next evolution of our Event Rules feature set, which helps to route, enrich, and modify events on ingest to remove noise and automate processes.

Dedicated Incident Channel Improvements for Slack on Webhooks V3 - Early Access

Today, we are excited to open Early Access for our improved Dedicated Incident Slack Channel. To take advantage of these improvements, you need to upgrade to Slack on WebHooks V3 and request Early Access from PagerDuty support. Once you are on the right version and have early access, there are two ways to create a dedicated incident channel.

Integrating Slack & Squadcast - Trigger, Acknowledge, Resolve & Reassign incidents from Slack channel

You can integrate Squadcast and Slack to collaborate efficiently with your team while working on incidents. Squadcast sends a notification to the configured Slack Channel as soon as an incident is triggered.

Integrating Microsoft Teams & Squadcast - Acknowledge, Resolve & Reassign Incidents | Squadcast

Teams using MS Teams can now integrate with Squadcast and easily Acknowledge, Resolve & Reassign incidents using MS Teams. You can configure Squadcast to send a notification to the configured MS Teams channel as soon as an incident is triggered.

Tagging & Routing at Squadcast | Incident Management | Squadcast

Event Tagging is a rule-based, auto-tagging system with which you can define customized tags, based on incident payloads, that are automatically assigned to incidents when they are triggered. Auto-add relevant information like priority, severity or alert type to make incoming incidents context-rich. Route alerts to the right responder(s) based on the tags they carry.
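
As a rough illustration of the idea (not Squadcast's actual rule syntax), rule-based tagging followed by tag-based routing might look like the sketch below. The rule format, the payload fields (`message`, `status_code`), and the responder names are all hypothetical.

```python
# Each tagging rule pairs a condition on the payload with the tags to attach.
TAGGING_RULES = [
    (lambda p: "database" in p.get("message", "").lower(), {"team": "db"}),
    (lambda p: p.get("status_code", 0) >= 500, {"severity": "high"}),
]

# Each route pairs the tags it requires with a responder; first match wins.
ROUTES = [
    ({"team": "db", "severity": "high"}, "db-primary"),
    ({"severity": "high"}, "sre-primary"),
]


def tag_incident(payload, rules=TAGGING_RULES):
    """Collect the tags of every rule whose condition matches the payload."""
    tags = {}
    for condition, rule_tags in rules:
        if condition(payload):
            tags.update(rule_tags)
    return tags


def route(tags, routes=ROUTES, default="default-on-call"):
    """Return the responder of the first route whose required tags all match."""
    for required, responder in routes:
        if all(tags.get(key) == value for key, value in required.items()):
            return responder
    return default
```

Ordering the routes from most to least specific keeps a database outage with the database team instead of falling through to the general high-severity route.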

Alert Suppression Rules in Squadcast to prevent Alert fatigue | Squadcast

Alert suppression can help you avoid alert fatigue by suppressing notifications for non-actionable alerts. Squadcast will suppress the incidents that match any of the Suppression Rules you create for your Services. These incidents will go into the Suppressed state and you will not get any notifications for them.
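
The "match any rule" behaviour described above can be sketched as follows. This is an assumption-laden illustration: the rule shape (a field name plus a regular expression) and the incident fields are hypothetical and do not reflect Squadcast's configuration format.

```python
import re

# Hypothetical rules: suppress test alerts and anything from staging hosts.
SUPPRESSION_RULES = [
    {"field": "message", "pattern": r"(?i)\btest\b"},
    {"field": "source", "pattern": r"^staging-"},
]


def is_suppressed(incident, rules=SUPPRESSION_RULES):
    """True if the incident matches ANY suppression rule."""
    return any(
        re.search(rule["pattern"], str(incident.get(rule["field"], "")))
        for rule in rules
    )


def apply_suppression(incident, rules=SUPPRESSION_RULES):
    """Return a copy of the incident with its state and notify flag set."""
    suppressed = is_suppressed(incident, rules)
    return {
        **incident,
        "state": "suppressed" if suppressed else "triggered",
        "notify": not suppressed,
    }
```

Suppressed incidents are still recorded with their own state, so they can be reviewed later; only the notification is skipped.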

What's New: Automation Actions in the PagerDuty Application for Zendesk

The past few years have led to a significant increase in customer demands, and customer service agents are feeling the pressure. According to a recent Zendesk CX Trends report, 68% of agents report feeling overwhelmed. Here at PagerDuty, we believe that happier customer service agents lead to more positive customer interactions and stronger relationships with your brand.

Key considerations before signing up for cyber insurance

With 2021 seeing 5.1 billion records breached and an annual increase in attacks of 11%, the risk of security incidents grows every year. And when an attack hits, the cost to recover, which includes fines, penalties, legal fees, and much more, is also great. To help minimize the scope of financial damage, many organizations turn to cyber insurance. Although it is a relatively new branch of insurance, demand is already huge and ever increasing.

To require or not require (fields): that is the question

Required fields have been a hot topic at FireHydrant. Choose too many (or the wrong ones), and you unnecessarily annoy your team during an incident or encourage sloppy data entry that someone has to come back and clean up manually. Don't use them at all and risk insufficient data to efficiently propel an incident toward resolution.

Overcome the integration bottleneck with self-service onboarding tools

Data volume and complexity within tech stacks continue to increase, with no sign of slowing down. As a result, many organizations are facing significant challenges related to tool sprawl and the overwhelming amount of data that needs to be exchanged between all the different systems. This rapid pace of data brings a need for faster flow and exchange of information.

Analytics in Squadcast | Incident Management | On-call | SRE | Squadcast

Analyzing incident data plays a key role in doing SRE better. Squadcast's Analytics Dashboard helps you analyze the performance of your Organization/Team for a given time period. It also gives you more insight into past outages that affected your systems.

Integrating Squadcast with Jira (Cloud & Server) - Create tickets & bidirectional sync | Squadcast

You can use this integration guide to install and configure the Squadcast extension in Jira Cloud & Jira Server to create issues in Jira projects when there is an incident in Squadcast. Also learn to automatically or manually sync the status bidirectionally.