The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Using Real-Time Operations to Save Lives

Voices wield power. Staying silent is not an option. We must speak up and honor those who do. October is National Domestic Violence Awareness Month, when communities come together to support victims and survivors of domestic abuse across the world. Earlier this month, SisterDuty, one of PagerDuty’s Employee Resource Groups (ERGs), led a campaign to build toiletry kits and raise funds to benefit Casa de las Madres, which offers shelter and support to those at risk of abuse.

ITSM Incident Management + the Need for Speed

ITSM incident management might seem like a lot of words and letters thrown together. However, when you examine it in light of managing changes to IT functionality, you quickly realize its importance. ITSM incident management defines how teams should organize themselves and operate their IT services. And key to an effective understanding of this structure is ensuring rapid resolution of IT issues.

Have we discovered the secret sauce for successful offsites?

Offsite meetings can be great for getting things done. Being out of the office can clear the cobwebs, break down barriers, and lead to real breakthroughs. At BigPanda, the marketing team has started experimenting with how we run offsites, with the aim of finding a “secret sauce” that leads to success – maximizing both the team building and the task execution we tackle in our offsites.

This IS NOT Fine: Putting Out (Code) Fires

So the dumpster is on fire. Again. The site’s down. Your boss’s face is an ever-deepening purple. And you begin debating whether you should join the #incident channel or call an ambulance to deal with his impending stroke. Firefighters have clear procedures and a strong hierarchy. The first truck at a scene immediately begins assessing the situation.

Reducing Noise with Event Intelligence

Learn how Event Intelligence, the next-gen approach to Event Management and AIOps, helps teams to cut through the noise and operate at scale. This introductory session will walk through key best practices and requirements such as reducing noise via adaptive machine learning, accelerating triage via integrating machine data with human response, and much more.

Introducing Jira Ops: Respond Faster with Atlassian + PagerDuty

Atlassian’s mission is to unleash the potential of every team. Atlassian’s newest product, Jira Ops, is built on top of Jira with a direct connection to PagerDuty to ensure teams can be successful and respond quickly when things break. This session will cover how PagerDuty and Jira Ops work together to help teams respond to incidents quickly and in real time.

Another Journey of Chaos Engineering

Chaos engineering is here to stay. There's a thriving community, numerous open source projects, a few books, even a startup. Companies are hiring chaos engineers and creating entire teams focused on chaos engineering. This talk is about strategies for launching a chaos engineering movement at your company, as well as the challenges and results you can expect.