Many of us (indeed, more than 1 billion users worldwide) rely on Microsoft for essential work activities and were impacted yesterday (Wednesday, January 25, 2023) when the cloud service provider experienced a prolonged outage. Internet Resilience is a business priority because when critical workforce services like Microsoft's go down, global teams are hugely disrupted.
Network outages are both common and expensive, usually far more expensive than people realize. Yes, the network is down and the organization is losing money, but do you really appreciate how much money? And how much an outage can actually cost on a per-minute basis? It’s not only more than most people think; it’s also something that can be mitigated fairly easily.
Each year, the SRE Report surfaces a trend or anti-pattern that leaps out and makes us pause and reflect. Last year, for example, we found a huge drop in global toil levels. With the whole world working from home for a full year, it made sense that toil would drop, right? But this year, despite the great reopening underway, toil levels dropped even further. It's a paradox, one that will no doubt require its own scrutiny.
Not all Internet outages take a website down. Some may impact only a small subset of users or affect just one part of a site’s functionality. Moreover, because of their relatively “hidden” nature, organizations may not know about them immediately, since fewer users will complain. Such incidents can still have serious consequences, so you want to detect them as soon as possible in order to mitigate and resolve issues quickly.
Dear Santa, I’ve been an extremely good IT Operations Manager this year (which is saying something considering the state of the world at the moment) and I have a few items on my wish list.
Most Internet-centric organizations today use some form of APM tool, as they should. But APM tools alone are insufficient. Over the last ten years, the world has completely changed. If you think about it, in the first decade of this millennium, most businesses had an Exchange server, maybe Siebel CRM, a file share, and a range of other business apps, usually hosted in the same building. Everything was on the LAN. Today, it is the exact opposite. Everything is distributed.
We are on the cusp of an important technology transformation: a discontinuity in technology, as Peter Drucker would call it (precipitated by Covid). For decades, IT organizations invested in building, managing, and monitoring LANs. Everything was on your local network: your CRM, your Exchange email, the file shares, and the print server. Today, many companies are shutting down their “old legacy network” and are running their enterprise without a LAN, a WAN, or an on-prem data center.