Service disruption on October 20, 2025

When the internet goes down, our primary job is to help everyone get back up, as fast as possible. Of the almost half a million incidents we've helped our customers solve, there are some which stand out for both their scale and impact. One of these happened on Monday, October 20, when AWS had a widely covered major outage in their us-east-1 region, from 07:11 to 10:53 UTC. We’re hosted in multiple regions of Google Cloud and so the majority of our product was unaffected by the outage.

Recapping SEV0 San Francisco 2025

Earlier this week, we gathered in San Francisco for our second SEV0—almost a year after our very first event. SEV0 has always been about shining a light on the biggest challenges (and opportunities) in incident response. Last year, we were still talking about the fundamentals: blameless culture, strong processes, and lessons from the best in reliability. This year felt different. AI has moved from background noise to front and center in every conversation, every team, everywhere.

Impact review: Scribe under the microscope

In December 2024 we launched Scribe to help responders never miss a detail from their incident calls. By automatically transcribing calls and highlighting key information, Scribe eliminates manual note-taking, reduces time spent getting up to speed, and preserves valuable context for post-incident analysis. The feature quickly gained popularity among our customers, but with success came an influx of requests for bug fixes, extra functionality, and wider call platform support.

Using Claude to power up your onboarding

I joined incident.io about ten weeks ago, having been in my previous role for four and a half years. Being a new starter was an unusual feeling for me, and there's been a huge amount to learn, but by lunch on my second day (!) I had started shipping value to our customers. A large part of hitting the ground running has been having a colleague alongside me who I can pester with questions, who doesn’t get offended when I write in all capitals, and who often praises me for being absolutely right!

Ready, steady, goa: our API setup

At incident.io, speed is essential. Our product is growing faster than ever: in scope, in range of features, and in the number of people contributing to it. In the early days, when you’re a small startup with just a few hundred endpoints, a basic API setup gets you by. But as things scale, you need to make creating endpoints easy, fast, and reliable.

Pager fatigue: Making the invisible work visible

No matter how hard you try to prevent it, your product will break. And sometimes, it breaks in the middle of the night. Getting paged at 3 a.m. is rough. Getting paged again two hours later because of a follow-up issue you missed the first time is even worse. So how can a manager stay aware when their team is having a tough night or a tough week on call, without relying solely on direct reports?

Breaking through the Senior Engineer ceiling

You’ve made it to Senior engineer. Now what? You’re now staring at the next level, Staff typically, sometimes Principal, or whatever your company calls it. The path feels murky. Your manager gives you feedback like “show more technical leadership” or “think bigger picture”, but what does that actually mean day-to-day? I’ve been there. I’ve also been on the other side, helping engineers grow through whatever explicit (or implicit) levels a company has.

Vibe coding with the incident.io API

Many, many years ago, I was a computer science major at the University of Illinois, hoping someday I’d be able to write code for a living. I started my career in QA hoping to learn the ins and outs of software development. But it turns out I wasn’t very good at coding. I was just good enough to get a role as a sales engineer, where all I had to do was write code that could hold together for 30 minutes in a demo.