Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

Release Notes: Process Automation and Rundeck OSS 4.4.0

Product managers Forrest Evans and Jake Cohen show off new features and enhancements in PagerDuty Process Automation and Rundeck Open Source version 4.4.0. Version 4.4.0 features two new plugins for #AWS: #Lambda Custom (ephemeral) scripts and #ECS/#Fargate Commands. For more details on other improvements in this release, see the full Release Notes.

3 common pitfalls of post-mortems

Small confession: we currently use the term 'post-mortem' in incident.io despite preferring the term 'incident debrief'. Unless you have particularly serious incidents, the link to death here really isn’t helping anyone. However, we're optimising for familiarity, so we're sticking to the term 'post-mortem' here. Ask any engineer and they’ll tell you that a post-mortem is a positive thing (despite the scary name).

Zero Trust Security: Key Concepts and 7 Critical Best Practices

Zero trust is a security model to help secure IT systems and environments. The core principle of this model is to never trust and always verify. It means never trusting devices by default, even those connected to a managed network or previously verified devices. Modern enterprise environments include networks consisting of numerous interconnected segments, services, and infrastructure, with connections to and from remote cloud environments, mobile devices, and Internet of Things (IoT) devices.

Automating Common Diagnostics for Kubernetes, Linux, and other Common Components

This is the second piece in a series about automated diagnostics, a common use case for the PagerDuty Process Automation portfolio. In the last piece, we talked about the basics around automated diagnostics and how teams can use the solution to reduce escalations to specialists and empower responders to take action faster. In this blog, we’re going to talk about some basic diagnostics examples for components that are most relevant to our users.

What Is a Secure SDLC?

The Software Development Lifecycle (SDLC) framework defines the entire process required to plan, design, build, release, maintain, and update software applications, including the final stages of replacing and decommissioning an application when needed. A Secure SDLC (SSDLC) builds on this process, integrating security at all stages of the lifecycle. When migrating to DevSecOps (collaboration between Development, Security, and Operations teams), teams typically implement an SSDLC.

StatusCast Top Picks: 10 More Awesome Customer IT Status Pages

IT services are a critical backbone of the operations of almost every business and organization. As more IT departments have embraced the need for good governance, transparency has increased. From the perspective of IT service management, this has shown up as much greater openness when communicating about IT service availability.

Remote Actions for IT Remediation, IoT Actions and more

SIGNL4 supports the remote execution of automated tasks or workflows in IT or IoT systems using Remote Actions. These remote actions offer a wide range of applications. You can execute remote actions in response to an alert to trigger some kind of remediation action. But there are many more possible use cases. This article provides some examples and ideas about what is possible.

DevOps Tools

A tool that helps automate the software development process is called a DevOps tool. Such tools largely focus on communication and collaboration between product management, software development, and operations professionals. A DevOps tool also enables teams to automate most software development processes, including builds, conflict management, dependency management, and deployment, which reduces manual effort.