
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Pastries with SREs: Enriched logs and filled donuts

In this episode of Pastries and SREs, we take a sweet dive into one of the most exciting evolutions in observability: enriched logs, also known as wide events. Gone are the days of toggling between tools and stitching together logs, metrics, and traces. Enriched logs consolidate the context, providing everything you need to understand and resolve issues in a single log entry.
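To make the idea of a wide event concrete, here is a minimal sketch of one enriched log entry. It is not taken from the episode: the helper `build_wide_event`, the field names, and all values are hypothetical and only illustrate packing request context into a single record.

```python
import json
import time
import uuid


def build_wide_event(service, route, status_code, duration_ms, **context):
    """Return one wide event: a single flat record carrying the request context."""
    event = {
        "timestamp": time.time(),
        "event_id": str(uuid.uuid4()),
        "service": service,
        "http.route": route,
        "http.status_code": status_code,
        "duration_ms": duration_ms,
    }
    # Extra context (trace IDs, user attributes, build info, ...) is merged
    # into the same record instead of being logged separately.
    event.update(context)
    return event


# All values below are made up for illustration.
event = build_wide_event(
    "checkout", "/cart/pay", 502, 1840.5,
    trace_id="hypothetical-trace-id",
    user_tier="free",
    build_sha="hypothetical-sha",
    upstream="payments-api",
)
print(json.dumps(event, sort_keys=True))
```

Because every field lives on the same entry, a single query over these records can relate status codes, latency, and upstream dependencies without joining across tools.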

How to Monitor Java Applications on Windows with SolarWinds Observability | APM Setup Guide

This video provides a step-by-step walkthrough for configuring monitoring for Java applications running on Windows using SolarWinds Observability. The demonstration covers the complete process, from adding a new service to instrumenting the application with the Java APM library and verifying connectivity. This guide is designed for developers, DevOps engineers, and system administrators who need to instrument Java applications on Windows for performance monitoring, distributed tracing, and full-stack observability.

Introducing Bits AI SRE, your AI on-call teammate

Bits AI SRE is your AI on-call teammate, built to autonomously investigate alerts and coordinate incident response. Integrated with Datadog, Slack, GitHub, Confluence, and more, Bits analyzes telemetry, reads documentation, and reviews recent deployments to determine the root cause of alerts—often before you’ve even opened your laptop. In fact, if you're using Datadog On-Call, you can view Bits’s findings right from your phone—so you’re always one step ahead, no matter where you are.

Mezmo + Catchpoint deliver observability SREs can rely on

For SREs juggling multiple services, third-party dependencies, and constant alerts, a critical service slowdown can quickly turn into chaos. APM dashboards may show that everything is fine, yet users are still experiencing problems. That gap between application telemetry and real-world performance can turn a five-minute fix into a two-hour war room.

Build custom apps in seconds with conversational AI in App Builder

Datadog App Builder is a low-code tool for creating internal apps, making use of a drag-and-drop interface that allows engineering teams to troubleshoot issues, optimize operations, and enable self-service while connecting directly to their Datadog data and permissions. Now, with conversational AI, teams can go from idea to working prototype even faster.

What's Special About MCP?

AI agents can interact with the world using tools. Those tools can be generic or specific. The most general ones, like “run a bash command” and “read and write files,” are built into the agent. More specific ones are provided through Model Context Protocol (MCP) servers. Every tool provided to the agent comes with instructions sent as part of the context.
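As a rough illustration of tool instructions travelling in the context, the sketch below defines one tool and renders it as text. The `name`, `description`, and `inputSchema` fields follow the shape MCP uses for tool definitions; the tool `lookup_order`, the helper `render_tool_instructions`, and the output layout are assumptions, since how an agent actually formats its context is implementation-specific.

```python
import json

# Hypothetical tool; only the field names (name, description, inputSchema)
# mirror the MCP tool definition shape.
lookup_order_tool = {
    "name": "lookup_order",
    "description": "Fetch an order by ID from a (hypothetical) orders service.",
    "inputSchema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
}


def render_tool_instructions(tools):
    """Serialize tool definitions into text an agent could place in its context."""
    lines = []
    for tool in tools:
        lines.append(f"Tool: {tool['name']}")
        lines.append(f"Description: {tool['description']}")
        schema = json.dumps(tool["inputSchema"], sort_keys=True)
        lines.append(f"Input schema: {schema}")
    return "\n".join(lines)


context_block = render_tool_instructions([lookup_order_tool])
print(context_block)
```

Every tool added this way takes up room in the context, which is one reason the description and schema are worth keeping short.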

Installing TrackJS on CertKit

I recorded a video showing how to properly set up TrackJS for a new production website, specifically CertKit, our new certificate lifecycle management tool. The key to effective error monitoring isn’t just installing the tracking snippet; it’s configuring the system to surface real issues while filtering out the noise. I configure a forwarding domain (errors.certkit.io) to bypass ad blockers that might prevent error reporting.

<100ms E-commerce: Instant loads with Speculation Rules API

In e-commerce, we all know that speed = money. I know it, you know it, Amazon knows it, eBay knows it, Shopify knows it, everyone knows it. In this article we’ll see how we can improve the perceived performance of our site’s critical pages, like the Product Details page, the Cart page, and the Checkout page. We’re going to use the Speculation Rules API (SRA) to prerender or prefetch them, and also explain how certain frameworks like Next.js offer their own prefetching mechanisms.
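For a sense of what speculation rules look like, here is a minimal server-side sketch that renders a `<script type="speculationrules">` tag with URL lists for prerendering and prefetching. The helper name `speculation_rules_tag` and the example URLs are hypothetical and not taken from the article; a real site would choose the URLs from its own navigation data.

```python
import json


def speculation_rules_tag(prerender_urls, prefetch_urls):
    """Build a speculation rules script tag from two lists of URLs."""
    rules = {}
    if prerender_urls:
        rules["prerender"] = [{"urls": list(prerender_urls)}]
    if prefetch_urls:
        rules["prefetch"] = [{"urls": list(prefetch_urls)}]
    # Escape "</" so that no URL can close the script element early.
    payload = json.dumps(rules, indent=2).replace("</", "<\\/")
    return f'<script type="speculationrules">\n{payload}\n</script>'


# Example URLs are placeholders for a product page, cart, and checkout.
tag = speculation_rules_tag(
    prerender_urls=["/products/example-item"],
    prefetch_urls=["/cart", "/checkout"],
)
print(tag)
```

Prerendering costs more than prefetching, so this sketch reserves it for the single most likely next page and only prefetches the others. Browsers without Speculation Rules support ignore the unknown script type.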