Alerting

logsign

The Importance of Alert Grouping

Alerts are one of the most important information sources when it comes to cyber security. They notify and inform your IT team about ongoing cyber threats, security events and any other incident that might threaten your organization. In this article, we will focus on alert grouping and why it is important for the security of your organization.

pagerduty

IDC Finds Substantial ROI for Enterprises Using PagerDuty for Digital Operations Management

In order to keep digital services running around the clock, teams need to be able to solve problems faster—or, ideally, in real time. Many vendors claim to provide value and help organizations bolster their digital operations management.

bigpanda

Root Cause Changes: Real Examples of Modern Root Cause Analysis from our Beta Customers

Root Cause Analysis (RCA) is an all-encompassing process. It is usually very complicated and often requires many people with many different skills – all trying to tackle an incident to determine what happened, when, why, how and ultimately who (to blame). There is, however, a secret sauce today that can help solve many issues before a “full-scale” RCA process is initiated – and that is Root Cause Changes (RCC).

pagerduty

Cherwell & PagerDuty: Getting Real (Time) About Digital Transformation

Digital transformation may be the largest shift the IT industry will experience in a lifetime. It’s a term used throughout the tech industry and in various contexts. Gartner defines it as “…anything from IT modernization (for example, cloud computing), to digital optimization, to the invention of new digital business models,” which has massive implications for almost every organization.

victorops

The Production Environment Review Checklist

You’ve written code, you tested it and built it. Now, your release is ready to deploy into production. But, is your production environment ready for the release? That’s a question that every IT professional and platform engineer should be asking before accepting a new release – whether the release is an update of an existing app or a totally new deployment.

Internet Leader Natural Intelligence Now Resolving Glitches in Minutes Rather than Days

Natural Intelligence runs comparison websites that generate millions in ad traffic. A glitch could easily cost the company thousands in ad revenue. CTO Lior Schachter and other members of the NI team share the difference Anodot Autonomous Analytics has made across the company.

logicmonitor

LogicMonitor and PagerDuty: Beyond the Basics

Out-of-the-box integrations are great, and they help organizations see an immediate return on investment when the technologies they have invested in work together seamlessly. However, a little customization to these integrations can dramatically increase productivity and reduce mean time to resolution. Here we will address a couple of best practices and customizations that can take your PagerDuty and LogicMonitor integration to the next level.

victorops

Incident Management in a Complex Serverless Framework

Serverless frameworks can lead to highly efficient, scalable systems that allow developers to build complex software faster and more reliably. Serverless frameworks allow engineering teams to focus on individual functions across multiple applications or microservices and eliminate numerous problems with maintaining physical hardware. Serverless capabilities are also often referred to as Functions as a Service (or FaaS).

LISA19 - Lightning Talk by Squadcast: How to SRE without an SRE on Your Team

Squadcast is an incident management tool that’s purpose-built for SRE. Create a blameless culture by reducing the need for physical war rooms, centralize SLO dashboards, unify internal and external SLIs, automate incident resolution with Squadcast Actions, and create a knowledge base to handle incidents effectively.