The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

AI Working for You: MCP, Canvas, and Agentic Workflows - Part 2

In the previous post in our series on observability for the agent era, we looked at how Honeycomb provides unique visibility into LLMs operating in your production environment. Now, let’s flip it around and explore how Honeycomb provides observability insights suited to helping your AI agents rapidly diagnose and fix production issues, and to building production feedback into the next round of development.

New Features: Team Members and Additional Email Recipients

DNS Check now supports two features for Enterprise accounts that make it easier to work as a team: Team Members and Additional Email Recipients. Team Members lets multiple people log in and work with your DNS records using their own credentials. Additional Email Recipients sends notification emails to people who need to stay informed but don't need to log in.

Employee Monitoring Software for the Modern Workplace in 2026

Most managers don't want to spy on their employees. But when your team is spread across three time zones and half of them work from home, knowing what's actually getting done isn't spying. It's just good management. Employee monitoring software has changed a lot in the past few years. It's no longer just about clocking in and out or taking screenshots every 10 minutes. The best tools today help teams work better, not just track whether they're working at all.

VictoriaMetrics March 2026 Ecosystem Updates

Welcome to the March release roundup of VictoriaMetrics Stack, covering key enhancements in VictoriaMetrics and VictoriaLogs. These updates deliver improved UI scalability, enhanced authentication flexibility, improved query performance, and logging tools that streamline observability workflows in production environments. This roundup covers releases for.

The single pane of glass approach to cloud monitoring

Any of the dozens of SaaS services you depend on, from Google Workspace and Slack to Shopify, may experience downtime, partial outages, or degraded performance. Most have their own status pages, APIs, or RSS feeds. Juggling all these sources is exhausting, and many teams suffer from alert fatigue, missed early warnings, and fragmented visibility.
Sponsored Post

How to Centralize Incident Notifications in Slack

Even a brief outage in a critical service can disrupt projects. Customers get frustrated and flood the support team with tickets. What's the solution? Centralizing incident notifications and real-time status alerts in Slack. Many teams already collaborate there anyway. So let's take a look at how teams can streamline service monitoring, alerting, and incident workflows in Slack using integrations, automation, and tools like StatusGator.

From alerts to action: Where reliability is actually won

Observability has evolved dramatically in the past decade. The industry has moved from basic uptime checks to full-stack observability (FSO), including metrics, logs, traces, and real user monitoring. Observability tools like ManageEngine FSO can detect anomalies quickly. And yet, outages still last longer than they should. Observability has matured. Response hasn’t. Most IT teams today have the tools to know when something breaks. But knowing is not the same as resolving.

Profiling Java apps: breaking things to prove it works

Coroot already does eBPF-based CPU profiling for Java. It catches CPU hotspots well, but that's all it can do. Every time we looked at a GC pressure issue or a latency spike caused by lock contention, we could see something was wrong but not what. We wanted memory allocation and lock contention profiling. So we decided to add async-profiler support to coroot-node-agent. The goal: memory allocation and lock contention profiles for any HotSpot JVM, with zero code changes. Here's how we got there.

When we say "Observability AI Reckoning," what are we actually talking about?

We’ve spent the last decade collecting more telemetry. Now AI is analyzing it. Here’s the catch: AI needs the full dependency chain to reason correctly. If it sees spans but not storage contention… Services but not Kubernetes scheduling… Frontend metrics but not downstream providers… It will confidently optimize the wrong thing. AI doesn’t lower the need for observability. It raises the standard.