
The latest News and Information on Distributed Tracing and related technologies.

Datadog vs Jaeger - Features, Pricing & Use Cases [Updated for 2025]

Datadog and Jaeger are both leading tools in the observability space, but they represent two fundamentally different philosophies. Datadog is a commercial, all-in-one SaaS platform that unifies metrics, traces, and logs. Jaeger is a popular, open-source project focused specifically on distributed tracing. Choosing between them isn't just a technical decision; it's about balancing the convenience of a fully managed, integrated platform against the power and control of a self-hosted, specialized tool.

What Are Traces? A Developer's Guide to Distributed Tracing

One of the most common challenges in modern software engineering is understanding how requests flow through applications. As system architectures shift toward widely distributed, cloud-native designs, keeping track of how an application processes user actions is more difficult than ever. A single user action may trigger events processed in dozens of backend services. Traces help developers meet this challenge by recording the path each request takes through those services.
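As a rough illustration of the idea, a trace can be modeled as a set of spans linked by parent IDs. The sketch below is a toy model in plain Python, not any tracing library's API; the `Span` fields, the service names, and the timings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One unit of work inside a trace (hypothetical, simplified fields)."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    service: str
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def slowest_leaf(spans: List[Span]) -> Span:
    """Return the longest-running span that has no children."""
    parents = {s.parent_id for s in spans if s.parent_id is not None}
    leaves = [s for s in spans if s.span_id not in parents]
    return max(leaves, key=lambda s: s.duration_ms)


def path_to_root(span: Span, spans: List[Span]) -> List[str]:
    """Walk parent links upward and return the request path, root first."""
    by_id = {s.span_id: s for s in spans}
    path = [span]
    while path[-1].parent_id is not None:
        path.append(by_id[path[-1].parent_id])
    return [f"{s.service}:{s.name}" for s in reversed(path)]


# A made-up trace: one user action fanning out to three backend services.
TRACE = [
    Span("a", None, "frontend", "POST /checkout", 0, 900),
    Span("b", "a", "cart", "get_cart", 10, 60),
    Span("c", "a", "payments", "charge", 70, 880),
    Span("d", "c", "payments-db", "INSERT payment", 100, 850),
]

if __name__ == "__main__":
    culprit = slowest_leaf(TRACE)
    print(culprit.duration_ms, " -> ".join(path_to_root(culprit, TRACE)))
```

Following the parent links from the slowest leaf back to the root is what lets a developer see which backend call a slow user action was actually waiting on.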

OpenTelemetry Collector: A Complete Guide [2025]

The OpenTelemetry Collector is a stand-alone service that acts as a powerful, vendor-neutral pipeline for your telemetry data. It can receive, process, and export logs, metrics, and traces, giving you full control over your observability data before it reaches a backend. This guide will provide a comprehensive overview of the OpenTelemetry Collector, its architecture, deployment patterns, and how to configure it for production use.
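The receive, process, export flow can be pictured with a small toy model. This is only a conceptual sketch in plain Python and is not how the Collector is configured or implemented; the function names, the record fields, and the `env` tag are hypothetical.

```python
from typing import Callable, Iterable, List, Optional

Record = dict
Processor = Callable[[Record], Optional[Record]]


def drop_debug_logs(record: Record) -> Optional[Record]:
    """Hypothetical processor: filter out debug-level log records."""
    if record.get("signal") == "log" and record.get("level") == "debug":
        return None
    return record


def add_environment(record: Record) -> Record:
    """Hypothetical processor: tag every record with a deployment environment."""
    return {**record, "env": "prod"}


def run_pipeline(
    received: Iterable[Record],
    processors: List[Processor],
    exporter: Callable[[Record], None],
) -> int:
    """Pass each received record through the processors, then export it.

    Returns the number of records that reached the exporter.
    """
    exported = 0
    for record in received:
        for process in processors:
            record = process(record)
            if record is None:  # a processor dropped the record
                break
        else:
            exporter(record)
            exported += 1
    return exported


if __name__ == "__main__":
    backend: List[Record] = []  # stands in for a real observability backend
    incoming = [
        {"signal": "log", "level": "debug", "msg": "cache hit"},
        {"signal": "metric", "name": "db.connections", "value": 97},
        {"signal": "trace", "name": "GET /orders", "duration_ms": 5000},
    ]
    count = run_pipeline(incoming, [drop_debug_logs, add_environment], backend.append)
    print(count, backend)
```

The point of the model is the ordering: filtering and enrichment happen in one place, before any data reaches a backend, so the same rules apply to logs, metrics, and traces alike.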

Improve Consistency Across Signals with OTel Semantic Conventions

It’s 2 AM. Your API is timing out. Logs show a slow query. Metrics flag a spike in DB connections. Traces reveal a 5-second delay on a database call. But then the questions start:

- Which database?
- Does the query match the delay?
- Why doesn’t this align with the connection pool metrics?

Each tool uses different labels, db.name, database, sometimes nothing at all. Without a shared schema, connecting the dots is slow and frustrating.
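To make the labeling problem concrete, here is a minimal sketch of mapping differently named database labels onto one shared key so that signals can be grouped together. It assumes `db.name` as the shared key only because it appears in the text above; the alias list and sample records are hypothetical, and the current OpenTelemetry semantic conventions should be checked for the actual attribute name to use.

```python
from typing import Dict, List

# Assumption: "db.name" is the shared key, chosen for illustration only.
CANONICAL_DB_KEY = "db.name"
# Hypothetical alternative labels that different tools might emit.
ALIASES = ("db.name", "database", "db", "dbname")


def normalize(record: dict) -> dict:
    """Return a copy of the record with any database alias renamed to the shared key."""
    out = {k: v for k, v in record.items() if k not in ALIASES}
    for key in ALIASES:
        if key in record:
            out[CANONICAL_DB_KEY] = record[key]
            break
    return out


def group_by_database(records: List[dict]) -> Dict[str, List[str]]:
    """Group signal types by database name after normalization."""
    groups: Dict[str, List[str]] = {}
    for record in records:
        db = normalize(record).get(CANONICAL_DB_KEY, "<unknown>")
        groups.setdefault(db, []).append(record["signal"])
    return groups


# Made-up records mirroring the scenario: each signal labels the database differently.
LOG = {"signal": "log", "database": "orders", "msg": "slow query"}
METRIC = {"signal": "metric", "db.name": "orders", "connections": 97}
SPAN = {"signal": "trace", "duration_s": 5.0}  # carries no database label at all

if __name__ == "__main__":
    print(group_by_database([LOG, METRIC, SPAN]))
```

In this made-up data the log and the metric end up under the same database once normalized, while the unlabeled span cannot be correlated at all, which is the gap a shared schema is meant to close at the source rather than after the fact.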

Faster incident response through distributed tracing: Inside Glovo's use of Traces Drilldown

It’s almost 1 p.m. on a Monday afternoon and you’re hungry. You pull up your meal delivery app and select your favorite restaurant and dish. Then you go to check out and nothing happens. Your frustration mounts as you get hungrier by the minute. But there’s frustration on the other side of that transaction as well—engineers are scrambling to figure out what’s wrong as orders drop and revenue losses rise.

Perform Distributed Tracing for your MCP system with OpenTelemetry

2025 has truly been the year of Agentic AI, with MCP (Model Context Protocol) emerging as one of its flashiest and most talked-about innovations. While many products have seamlessly integrated MCP servers into their systems, these servers are increasingly being labelled as black boxes: opaque components that handle critical tasks but offer little visibility into what’s happening under the hood. We prompt an agent, a tool gets invoked, and a response is generated. But what really happens in between? And when something breaks, how do we trace the failure and debug it effectively?

Understanding APM and Distributed Tracing in the Observability Stack

To keep modern applications running smoothly, you need more than just basic monitoring. APM (Application Performance Monitoring) gives you a broad overview, tracking metrics like latency, errors, and system health. Distributed Tracing, on the other hand, shows the full journey of each request across services, helping you pinpoint the root cause of slowdowns or failures.