
The latest News and Information on Observability for complex systems and related technologies.

When we say "Observability AI Reckoning," what are we actually talking about?

We’ve spent the last decade collecting more telemetry. Now AI is analyzing it. Here’s the catch: AI needs the full dependency chain to reason correctly. If it sees spans but not storage contention… Services but not Kubernetes scheduling… Frontend metrics but not downstream providers… It will confidently optimize the wrong thing. AI doesn’t lower the need for observability. It raises the standard.

Profiling Java apps: breaking things to prove it works

Coroot already does eBPF-based CPU profiling for Java. It catches CPU hotspots well, but that's all it can do. Every time we looked at a GC pressure issue or a latency spike caused by lock contention, we could see something was wrong but not what. We wanted memory allocation and lock contention profiling. So we decided to add async-profiler support to coroot-node-agent. The goal: memory allocation and lock contention profiles for any HotSpot JVM, with zero code changes. Here's how we got there.

Accelerate Your OpenTelemetry Migrations With Honeycomb's Agent Skills

Since releasing our hosted MCP server last year, we've been thrilled to see customers not just adopt it but build Honeycomb deeply into their agentic development and observability workflows. Users have embraced it, leveraging Honeycomb to stay in conversation with their code and understand how it runs in production.

AI Needs Better Inputs: Why Observability Is Becoming the Foundation of Enterprise AI Maturity

Organizations across industries are accelerating their investments in AI for operations, yet the path to meaningful impact is proving far more complex than early expectations suggested. Analysts at Gartner, Forrester, Deloitte, and McKinsey continue to highlight the same structural barrier. AI cannot produce accurate predictions or safe automation when the operational data feeding it is fragmented, incomplete, or inconsistent.

Grafana Cloud Demo in Under 5 minutes | Full Stack Observability and more

Overview and demo of how Grafana Cloud provides an end-to-end observability platform that empowers users who have adopted open standards to improve their systems' reliability, using a shift-left approach with performance testing while optimizing their observability costs.

Observability and Security for the AI Era

Datadog has always been driven by a broader vision of helping teams understand and operate complex systems. In this session, you’ll hear from Yrieix Garnier, VP of Product, and Hugo Kaczmarek, Senior Director of Product, as they share the latest updates across the Datadog product suite and discuss how that vision continues to shape the platform’s evolution and support the next generation of AI-driven applications.

The Observability Gap: Why Monitoring Data Should Drive Tests

Most teams already know a lot about production. They have dashboards. They have traces. They have alerts. They have enough telemetry to explain what happened after an incident and enough graphs to argue about it for the rest of the week. Then they go to test a change and start from scratch. The integration tests hit a hand-written mock that returns {"status": "ok"}. The load tests replay a CSV somebody exported months ago. Staging is close enough to production right up until it matters.
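As a rough illustration of the gap described above, here is a minimal Python sketch of a test stub built from recorded responses rather than a hand-written {"status": "ok"} mock. Everything in it is an assumption for illustration: the RECORDED sample, the field names, and the helpers build_stub and status_mix are hypothetical and not taken from the article or from any particular tool.

```python
import json
from collections import Counter

# Hypothetical sample of response bodies captured from production
# (for example, exported from traces or access logs). Invented data.
RECORDED = [
    '{"status": "ok", "items": 3}',
    '{"status": "ok", "items": 0}',
    '{"status": "error", "code": "timeout"}',
    '{"status": "ok", "items": 12}',
]


def build_stub(recorded_bodies):
    """Return a callable that replays the recorded responses in order,
    wrapping around when the sample is exhausted."""
    parsed = [json.loads(body) for body in recorded_bodies]
    state = {"calls": 0}

    def stub():
        response = parsed[state["calls"] % len(parsed)]
        state["calls"] += 1
        return response

    return stub


def status_mix(recorded_bodies):
    """Count how often each status value appears in the recorded sample."""
    return Counter(json.loads(body)["status"] for body in recorded_bodies)


stub = build_stub(RECORDED)
responses = [stub() for _ in range(4)]
```

A test that calls this stub sees the empty result and the timeout that the recorded sample contains, cases a single always-ok mock would never exercise.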

Observability Is Now a Boardroom Priority Even If Nobody Wants to Say It Out Loud

Executives rarely state the full truth publicly, but inside boardrooms the conversation has changed. Observability, once viewed as a technical capability deep within operations, has become a strategic requirement for understanding business performance. Leaders may not always use the term itself, yet they focus intensely on the outcomes it promises. Their environments have grown too fast, too fragmented, and too interdependent for traditional visibility approaches to keep pace.

Scary Things Happen in Production. Context Helps You Find Them.

Production is a rowdy place of chaos, especially at scale. When you have millions of requests per second flowing through your system, weird things are always happening. Outliers, unusual request patterns, spikes and pulses of traffic from unknown sources, port scanning…it’s all there. To the naked eye, it looks like noise. If you know what you are looking for…patterns emerge. The night sky: every dot is a request. Without intent, it's an undifferentiated field of light.

Smarter Alerts, Faster Root Cause, & Proactive IT Ops with SolarWinds AI Observability

Discover how AI is transforming IT operations with SolarWinds Observability. In this video, we showcase powerful new AI-driven features designed to help you detect issues faster, reduce alert noise, and stay ahead of performance problems across your entire stack. From applications and databases to networks, cloud infrastructure, and end-user experience, SolarWinds AI delivers deep insights where it matters most.