The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Why Alert Fatigue is a Major Challenge in Observability (2025 Survey Insights) | Grafana Labs

Over 1,200 engineers, leaders, and teams shared their biggest observability challenges in our third annual Observability Survey, and the results are in. In this video, Marc Chipouras (Head of Emerging Products, Grafana Labs) breaks down the top insights.

Elastic Powers GitHub's Seamless Developer Experience

David Tippet, Search Engineer at GitHub, shares how Elastic powers GitHub’s massive search platform and enables a seamless developer experience. He explains how GitHub balances AI-driven semantic search with traditional keyword search, ensuring accuracy for millions of diverse users, from engineers to security researchers.

The Observability Problem Isn't Data Volume Anymore: It's Context

For years, the observability industry has been obsessed with one thing: data volume. We've built incredible pipelines, optimized agents, and scaled storage to handle petabytes of logs, metrics, and traces. The promise was simple: collect more data, get more visibility. But we've hit a wall.

How to monitor Claude usage and costs: introducing the Anthropic integration for Grafana Cloud

Generative AI is becoming a core part of modern applications, making it essential to monitor and manage how these services are used. That’s why, today, we’re excited to introduce the Anthropic integration for Grafana Cloud, a new solution that lets you connect directly to the Anthropic Usage and Cost API from within Grafana Cloud.

How to use AI tools more effectively: Tips from Datadog Engineers

A growing number of engineering organizations have adopted or are trialing agentic AI-based coding tools and LLMs in an effort to increase their teams’ development velocity. If you’re a developer, this means you’ve likely had to try out different agentic tools and models and determine how to best incorporate them into your existing workflows.

Why (Enriched) Flow Data Belongs in Every Network Operator's Daily Toolbox

Flow data has always held immense potential, but was often inaccessible because it lacked context and speed. Kentik removes that friction by automatically enriching flow with human-readable context, making it a daily driver for everyone, not just specialists.

AI-Driven Application Monitoring with Checkly and Claude Code

In this webinar, Stefan Judis (Developer Relations at Checkly) and Dan Giordano (VP of Marketing at Checkly) dive into how LLMs and AI tools can be used with application monitoring. You’ll see live demos of integrating Claude Code, Playwright MCP, and Checkly’s Monitoring as Code.

What is Real User Monitoring

Real User Monitoring (RUM) measures how real users interact with your application in production. Unlike synthetic monitoring, which relies on scripted tests, RUM collects data from actual sessions. This means performance is observed across different devices, networks, and usage patterns. The result is a clear view of how the application behaves under real conditions, where latency is introduced, which features take longer to load, and at what points users drop off.

From SEO to AEO: Why Web Performance Is the Key to AI Search Success

Search isn’t what it used to be. The way people discover information online is shifting. Instead of clicking through search results, many now ask AI answer engines like ChatGPT and Perplexity to do the research for them. In March 2025, 13.1% of Google desktop searches featured AI Overviews, doubling from over 6% in January, according to Semrush analysis of 10+ million queries.

Tracking Errors in Absinthe for Elixir with AppSignal

GraphQL provides a powerful approach to building APIs, and Absinthe is the leading GraphQL implementation for Elixir applications. While GraphQL offers many benefits, it can introduce a set of errors and performance bottlenecks that might be challenging to track and debug. In this article, you’ll learn how to use AppSignal to monitor, debug, and resolve errors in your Absinthe-based GraphQL API.