Latest Posts

Improving your on-call schedule with runbooks

Incidents are a stressful time for your team: your service isn't working the way you expect, and your customers and stakeholders want to know what's going on. The last thing you want is for your team to improvise everything when responding to incidents. Google's own SRE book has great overall tips for incident management, part of which involves "develop(ing) and document(ing) your incident management procedures in advance", and that's what this article dives into.

I built my HTTP API docs from scratch

You might be thinking “building HTTP API docs from scratch? in 2024? wtf?”, and you’re probably right. After all, Redoc has been around since 2016, and there are hundreds of “generate beautiful documentation from your OpenAPI spec” startups around, some of which even use AI now. To be honest, I didn’t even know it was possible to do it yourself when I started looking into it.

How to get your first ten customers

It'll soon be the third anniversary of publicly launching OnlineOrNot on Twitter, and I often get asked what I did to get my first paying customers - so I felt like sharing. I assume that when most folks ask this, they're looking for the one thing they can do to finally start getting paying customers. Let me be clear: it's never just one thing.

Scaling AWS Lambda and Postgres to thousands of simultaneous uptime checks

When you're building a serverless web app, it can be pretty easy to forget about the database. You build a backend, send some data to a frontend, write some tests, and it'll scale to infinity with no effort, right? Not quite. Especially not with a tiny Postgres server. As the number of users of your frontend increases, your app will open more and more database connections until the database is unable to accept any more. That's just the frontend - it gets worse on the backend.
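
To make that failure mode concrete, here's a toy Python simulation (a sketch, not the code from the post). It assumes every concurrent function invocation opens its own database connection and never shares or releases one. The names `TinyDatabase` and `simulate`, and the limit of 100 connections, are made up for illustration.

```python
from dataclasses import dataclass


class TooManyConnections(Exception):
    """Raised when the simulated database has no free connection slots."""


@dataclass
class TinyDatabase:
    # Hypothetical connection limit for a small server (illustrative value).
    max_connections: int
    open_connections: int = 0

    def connect(self) -> None:
        # Refuse the connection once every slot is taken.
        if self.open_connections >= self.max_connections:
            raise TooManyConnections("no connection slots left")
        self.open_connections += 1


def simulate(concurrent_invocations: int, max_connections: int) -> int:
    """Return how many invocations were refused a connection.

    Assumption: one connection per invocation, all held at the same time.
    """
    db = TinyDatabase(max_connections)
    refused = 0
    for _ in range(concurrent_invocations):
        try:
            db.connect()
        except TooManyConnections:
            refused += 1
    return refused


if __name__ == "__main__":
    for invocations in (50, 100, 1000):
        print(invocations, "invocations ->", simulate(invocations, 100), "refused")
```

Under these assumptions nothing goes wrong until the number of simultaneous invocations passes the limit, and after that every extra invocation fails. That's why sharing connections matters more as concurrency grows.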