The latest news and information on observability for complex systems and related technologies.
As applications in the cloud become more distributed and complex, the Mean Time To Resolution (MTTR) for production issues is getting longer. Modern systems are built with hundreds of distinct, ephemeral, and interconnected cloud components, which can make it exceptionally hard for engineers to understand the current state of their applications, what problems are impacting customers, and why those problems are occurring.
Most SREs and IT Ops teams manage Java applications without source code access or communication with AppDev teams. When applications have performance issues, the SREs or IT Ops teams deploying and maintaining the infrastructure often have to prove that the application is at fault and supply the app supplier with evidence of the issue.
Cloud-native developers and practitioners gathered from around the world to learn, collaborate, and network at KubeCon/CloudNativeCon North America 2023 between November 6th and 9th at McCormick Place in Chicago, IL—myself included. This wasn’t my first time attending—I’ve been coming to KubeCon since 2016—but it was easily one of the most exciting experiences I’ve had as part of the Cloud Native community.
This is the final article of a three-part series. To start at the beginning, read Part 1: Benefiting from multi-cluster setups requires familiarity with common variations and Part 2: Exploring the facets of a multi-cluster observability strategy. As companies scale software production, they lean on Kubernetes as a crucial container orchestration platform for managing and deploying software and ensuring its availability.