
The latest News and Information on Application Performance Monitoring and related technologies.

What is DEX? And Why DEX is Important

Digital Employee Experience (DEX) refers to how employees interact with the digital tools, systems, and technologies they use at work, and how those interactions affect their productivity, satisfaction, and overall work experience. DEX encompasses the quality of the digital interactions and services that employees encounter while using workplace technologies, including factors such as application performance, network connectivity, device usability, and overall user satisfaction.

The Hidden Costs and Concerns of Iceberg Maintenance

Everyone talks about how great Apache Iceberg is, but nobody warns you about this: without proper maintenance, your tables will bloat, queries will slow down, and your catalog will run out of memory. Here are the 4 critical operations you MUST run regularly. Expiring snapshots prevents metadata bloat (Datadog learned this the hard way with catalog memory pressure). Deleting orphan files cleans up failed writes. Compacting data files keeps streaming workloads fast. Compacting manifests optimizes query planning.
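As a rough illustration of the four operations above, the sketch below builds the corresponding Iceberg Spark SQL procedure calls (`expire_snapshots`, `remove_orphan_files`, `rewrite_data_files`, `rewrite_manifests`). The catalog name, table name, retention windows, and the helper `maintenance_statements` are assumptions made for this example, not values from the article; tune them to your own workload before running anything against a real table.

```python
from datetime import datetime, timedelta, timezone


def _ts(moment: datetime) -> str:
    """Format a datetime as a Spark SQL TIMESTAMP literal."""
    return "TIMESTAMP '" + moment.strftime("%Y-%m-%d %H:%M:%S") + ".000'"


def maintenance_statements(
    catalog: str,
    table: str,
    now: datetime,
    snapshot_retention_days: int = 7,  # assumed retention window
    orphan_grace_days: int = 3,        # assumed grace period for in-flight writes
    retain_last: int = 10,             # assumed minimum number of snapshots to keep
) -> list[str]:
    """Return the four maintenance calls as Spark SQL strings, in run order."""
    expire_before = _ts(now - timedelta(days=snapshot_retention_days))
    orphan_before = _ts(now - timedelta(days=orphan_grace_days))
    return [
        # 1. Expire old snapshots to keep table metadata small.
        f"CALL {catalog}.system.expire_snapshots(table => '{table}', "
        f"older_than => {expire_before}, retain_last => {retain_last})",
        # 2. Remove files left behind by failed or aborted writes.
        f"CALL {catalog}.system.remove_orphan_files(table => '{table}', "
        f"older_than => {orphan_before})",
        # 3. Compact small data files (common with streaming ingestion).
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        # 4. Compact manifests to speed up query planning.
        f"CALL {catalog}.system.rewrite_manifests(table => '{table}')",
    ]


if __name__ == "__main__":
    # Hypothetical catalog and table names, for illustration only.
    for sql in maintenance_statements(
        "my_catalog", "db.events", datetime.now(timezone.utc)
    ):
        print(sql)  # with a live SparkSession: spark.sql(sql)
```

Building the statements as plain strings keeps the sketch runnable without a Spark cluster; in a scheduled job, each string would be passed to `spark.sql(...)` on a session configured with the Iceberg catalog.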

Improve log utilization with Datadog log exclusion filters | Datadog Tips & Tricks

Want to make your logs easier to work with? Excluding unneeded logs from indexing reduces noise and may reduce log management costs. In this video, see for yourself how to improve log utilization with Datadog Log Patterns and log exclusion filters, then set up an alert to track ingestion spikes.

OpenTelemetry Agents - The Complete Beginner's Guide (2025)

If you search for “OpenTelemetry Agent”, you will likely encounter two completely different definitions. This ambiguity often leads to confusion between infrastructure teams and application developers. SREs and DevOps engineers would describe it as a component deployed as a sidecar, whereas application developers would understand it as a language-specific library. Let’s break it down in the next section.

Setup and Explore OpenTelemetry Demo Application (with Examples)

“Everyone knows that debugging is twice as hard as writing a program in the first place. So, if you’re as clever as you can be when you write it, how will you ever debug it?” (Brian W. Kernighan and P. J. Plauger, The Elements of Programming Style, 2nd ed.) Maybe you can let SigNoz do some heavy lifting for you!

Training Foundation Models on a Trillion Data Points with Apache Iceberg

Training an AI foundation model on over a trillion data points sounds impossible without hitting your production systems. Here's how Datadog did it with Apache Iceberg for their time series forecasting model TOTO. The key challenge: extracting massive historical observability data (metrics spanning years) and running incremental preprocessing pipelines without overwhelming production services. Iceberg solved this by providing schema governance, consistency guarantees, and seamless integration with ML tools like Ray and PyTorch.

OpenTelemetry Metrics with 5 Practical Examples

Picture this: your observability tool already nails the basics like request rates, latency, and memory usage, but you need more insight. Think user churn rates, engagement spikes, or even how many carts get abandoned mid-checkout. That’s where OpenTelemetry steps in, providing a way to track those critical custom metrics with ease.

How Inkeep Monitors Their AI Agent Framework with SigNoz

AI agents are fundamentally different beasts to monitor compared to traditional applications. A single user request can trigger a cascade of 10+ internal operations: sub-agent transfers, tool executions, LLM calls, API requests, each with unpredictable latency and failure modes. When something goes wrong (and with LLMs, things go wrong in creative ways), you need to see the entire execution flow to debug effectively.

Overcoming ClickHouse's JSON constraints to build a high-performance JSON log store

Customer log data is always messy. Being (and building!) an observability platform, we get to see all the beautiful, creative ways it can be messy, every single day. And yet, our customers expect, quite fairly, I might add, perfect query results and peak performance. SigNoz is an open-source observability platform that can be your one-stop solution for logs, metrics, and traces.