
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Centralized Logging with Open Source Tools - OpenTelemetry and SigNoz

Modern-day software systems emit millions of log lines per minute. Cloud computing and containerization have made it easy to have distributed systems. Distributed systems emit logs from multiple sources. While developers have always used logs to debug stand-alone applications, centralized logging solves the challenges of modern-day distributed software systems.

Kubernetes Monitoring - What to Monitor, Tools and Best Practices

Kubernetes has emerged as “THE” container orchestration platform for deploying and managing containerized workloads, thanks to its robust capabilities. However, the complexity of its architecture and its dynamic nature present significant challenges in monitoring deployed workloads and the platform itself. Kubernetes monitoring is crucial for maintaining the health, performance, and reliability of containerized applications.

Scaling Runtime Diagnosis System w/ Grafana Pyroscope | Roblox at ObservabilityCON on the Road 2024

In this video, Xiaofeng and Jialin from Roblox introduce their journey in building a robust runtime diagnostic system using Pyroscope. With over 70 million daily active users and 4.4 million creators contributing to the platform, ensuring reliability and efficiency is paramount. They discuss the challenges faced in debugging production issues and the manual, inefficient methods previously used. Through thorough investigation and collaboration with Grafana Labs, they developed an on-demand profiling workflow, enabling engineers to identify and address performance bottlenecks effectively.

How to Achieve Observability as Code with Grafana | LiveRamp at ObservabilityCON on the Road 2024

Leveraging Terraform alongside Grafana, Kubernetes, and Helm providers, the SRE team at LiveRamp has transformed every aspect of their operational toolkit. By turning everything from agent installations and synthetic checks to Grafana k6 performance testing, notification policies, contact points, and alerts into modular, code-based components, the team is crafting an observability solution powered by Grafana Cloud. Learn how this integration delivers a robust, scalable, and easily manageable infrastructure that is setting new benchmarks for system reliability and efficiency across the business.

What Is the Impact of the Digital Operational Resilience Act (DORA) on My IT?

If you’re in banking, you know the drill. Adhering to stringent EU regulations is standard practice. This involves undergoing extensive audits, closely managing IT assets, maintaining your CIA (Confidentiality, Integrity, Availability) rating, conducting and responding to fire drills, and establishing continuity plans. So far, nothing new, and if you’re in another highly regulated environment, you know that these measures are commonplace.