Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

ActiveMQ Dead Letter Queue (DLQ) Management: The Complete Guide

If your Apache ActiveMQ deployment has a growing ActiveMQ.DLQ, you are not alone, and you are looking at the right problem. An unbounded, unmonitored dead letter queue is one of the most common root causes of "invisible" message loss in enterprise messaging environments. DLQ messages land without fanfare, nobody notices, and business-critical data quietly disappears from the processing pipeline.

Apache ActiveMQ vs Apache Artemis: The 2026 Definitive Guide

When engineers search for "Apache ActiveMQ vs Apache Artemis," most of what they find is either a shallow feature checklist or a confident recommendation to "just migrate to Apache Artemis." Neither helps a senior architect deciding whether to stay on a stable, battle-hardened Apache ActiveMQ deployment, or a platform team evaluating both options for a new system with clear eyes.

Setting the Bar for Agentic NetOps

AI has quickly become part of the language of network observability. Many vendors across the observability landscape can describe, summarize, correlate, or explain some data or situation, leveraging basic LLM capabilities. At a distance, many of these offerings sound similar. They promise faster insight, efficient operations, and a more intelligent path through rising complexity. But the industry has reached a point where surface-level similarity is creating noise, not value.

Managing OpenTelemetry Semantic Convention Migrations With the Collector

Real production data tells the story better than I can. Juraci Paixão Kröhling, a friend and fellow observability practitioner at OllyGarden, recently shared an example from an anonymized production environment: 1,830 occurrences of http.url and 23,984 occurrences of url.full in the same dataset. Both attributes describe the same thing. Both are actively being written to the same backend at the same time.

VictoriaMetrics at KubeCon Amsterdam: Community Highlights

KubeCon + CloudNativeCon Europe in Amsterdam brought together about 13,500 attendees this year, the largest turnout yet. The size of the event showed just how much the cloud-native space has grown, and how central observability, platform engineering, and cost control have become. For VictoriaMetrics, this year’s event was a mix of talks, booth conversations, and a lot of direct feedback from users.

Take Control of Cloud Costs with Proactive Budget Alerts

Proactive budget alerts turn cloud cost optimization into an everyday operational practice. If you are responsible for managing cloud infrastructure, you already know the pattern. Costs creep up quietly, and by the time anyone notices, it is the end of the month and you are explaining instead of preventing overruns. According to Flexera’s 2026 State of the Cloud Report, 85% of their respondents say managing cloud costs is their number one priority for the year.

Why Your PromQL Availability Query Returns Nothing When Services Are Healthy

Your SLI query shows 100% availability as No Data. Here's why PromQL returns empty results instead of zero — and the label-preserving fix. Prathamesh works as an evangelist at Last9, runs SRE stories - where SRE and DevOps folks share their stories, and maintains o11y.wiki - a glossary of all terms related to observability.

How Recurring Instability Turns into Clinical Trial Delays

In pharma, reliability becomes an operational priority because research and trial work depend on systems performing consistently across different teams, locations, and conditions. Much of that work sits inside scientific workflows, remote sessions, and compute-heavy environments where behaviour can shift with configuration or load. When that consistency starts to break down, teams keep moving, but time is lost in small increments across the day.