The latest news and information on observability for complex systems and related technologies.

Agentless monitoring for cloud VMs: Simplify scaling and observability

Managing cloud infrastructure is challenging enough without adding the burden of deploying and maintaining monitoring agents. What if there was a simpler, more efficient way to monitor your virtual machines (VMs)? In the first part of this series, we looked at the (link) and presented a better solution: agentless monitoring. Agentless monitoring is an efficient approach to observability that eliminates the need to install and manage software agents on each monitored device.

The One Where We Meet Cribl Copilot

We’re kicking off our new live weekly product demo series, streaming on YouTube, X, and LinkedIn! Each week, we’ll dive into the latest features and hidden gems from the Cribl Suite of tools to help you unlock the full potential of your telemetry data. For our first session, we’re thrilled to welcome Nikhil Mungel, the visionary behind Cribl Copilot. This AI-powered assistant is designed to instantly surface answers from the documentation and build pipelines with just a simple request.

How to Build Observability into Chaos Engineering

If you've ever deployed a distributed system at scale, you know things break—often in ways you never expected. That’s where Chaos Engineering comes in. But running chaos experiments without robust observability is like debugging blindfolded. This guide will walk you through how observability empowers Chaos Engineering, ensuring that your experiments yield meaningful insights instead of just causing chaos for chaos’ sake.

OpenTelemetry Is Not "Three Pillars"

OpenTelemetry is a big, big project. It’s so big, in fact, that it can be hard to know what part you’re talking about when you’re talking about it! One particular critique I’ve seen going around recently, though, is about how OpenTelemetry is just ‘three pillars’ all over again. Reader, this could not be further from the truth, and I want to spend some time on why.

Optimizing Observability Data Volume and Cost with AI

Struggling with high observability costs? In this video, Jade Lassery breaks down the challenges of managing excessive data and skyrocketing expenses. She introduces the Logz.io AI agent, a powerful solution designed to optimize data usage, reduce unnecessary costs, and improve efficiency. Learn how to take control of your observability spending while maintaining high performance. Watch now to discover smarter data management strategies!

Increase control and reduce noise in your AWS logs using Datadog Observability Pipelines

Today’s SRE and security operations center (SOC) teams often find themselves overwhelmed by the sheer volume and variety of logs generated by critical AWS services such as VPC Flow Logs, AWS WAF, and Amazon CloudFront. While these logs can be valuable for detecting and investigating security threats, as well as troubleshooting issues in your environment, managing them at scale can be challenging and costly.

Integrating OpenTelemetry with Grafana for Better Observability

Modern application observability is essential for ensuring system performance, diagnosing issues, and optimizing user experiences. OpenTelemetry (OTel) and Grafana are two key components in achieving end-to-end visibility. While OpenTelemetry focuses on instrumenting applications to collect telemetry data, Grafana specializes in visualizing this data, making it actionable and insightful.