
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Why Does Network Topology Decide How Fast Your Network Recovers?

In this video, learn why network topology plays a critical role in network resilience, troubleshooting, and recovery. Discover how understanding network dependencies, eliminating single points of failure, and maintaining clear visibility can help IT teams reduce downtime and accelerate incident response.

Telemetry Talks ep. 5 - OpenTelemetry in the AI agents era

Telemetry Talks explores how OpenTelemetry’s CNCF graduation arrives at a pivotal moment for AI-powered development. Together with Alex Marshalov, we dive into vibe coding, AI agents, and the growing need for observability in GenAI systems — from prompts and token usage to reasoning chains and distributed traces — using the VictoriaMetrics stack and OpenTelemetry as the foundation for understanding the next generation of autonomous software.

ActiveMQ Protocol Comparison: AMQP vs MQTT vs OpenWire vs STOMP

One of ActiveMQ's most powerful and underappreciated capabilities is its protocol polyglotism: a single broker can simultaneously accept Java JMS clients over OpenWire, Python services over AMQP, IoT sensors over MQTT, and Ruby scripts over STOMP, all routing messages between each other without protocol bridges or translation middleware.
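
As a rough illustration of what "a single broker, four protocols" means in configuration terms, the sketch below lists the listener URIs found in a stock ActiveMQ Classic `activemq.xml` and renders them as a `transportConnectors` fragment. The ports are the usual defaults and may differ in your installation; the shipped configuration also appends query parameters (connection and frame-size limits) that are omitted here. The helper names `connector_port` and `render_transport_connectors` are hypothetical and exist only for this example.

```python
from urllib.parse import urlsplit

# Listener URIs as they commonly appear in a default ActiveMQ Classic
# activemq.xml (assumption: your broker may use different ports or hosts).
DEFAULT_CONNECTORS = {
    "openwire": "tcp://0.0.0.0:61616",   # Java JMS clients
    "amqp": "amqp://0.0.0.0:5672",       # e.g. Python services
    "stomp": "stomp://0.0.0.0:61613",    # e.g. Ruby scripts
    "mqtt": "mqtt://0.0.0.0:1883",       # IoT sensors
}


def connector_port(name: str) -> int:
    """Return the TCP port a named connector listens on."""
    return urlsplit(DEFAULT_CONNECTORS[name]).port


def render_transport_connectors(connectors: dict[str, str]) -> str:
    """Render a <transportConnectors> XML fragment for activemq.xml."""
    lines = ["<transportConnectors>"]
    for name, uri in connectors.items():
        lines.append(f'  <transportConnector name="{name}" uri="{uri}"/>')
    lines.append("</transportConnectors>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_transport_connectors(DEFAULT_CONNECTORS))
```

Each connector is an independent listener on the same broker, which is why clients speaking different protocols can exchange messages through shared destinations without a separate bridge process.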

What Is Your Operating Model Costing Your Business?

The biggest cost in your business may not appear anywhere on your balance sheet because some of the most expensive problems are rarely measured directly. Lost productivity, recurring technology issues, underused applications, and the effort required to manage them all accumulate over time without ever appearing as a line item in a financial report.

Features in Icinga Web 2 Worth Knowing About

When you work closely with Icinga Web 2, developing modules, building dashboards, poking around the internals, you naturally pick up on features that most users never think about. Some are usability improvements that deserve more attention than they get. Others are developer conveniences that turn out to be genuinely useful to regular users in the right situation. They’re just the kind of thing that rarely makes it into the getting-started guide. Not all of these will apply to your daily workflow.

How Worker Safety RTLS Creates Safer Industrial Work Environments

Step onto the floor of any heavy stamping plant, automotive fabrication cell, or high-velocity distribution hub, and you see safety treated like an afterthought wrapped in a compliance checklist. You find yellow lines painted across the concrete, warnings stuck to every pillar, and flashing blue strobe lights mounted on the backs of forklifts. Yet close calls, near-misses, and serious floor injuries keep happening. These old-school safety methods fail because they place the entire burden of survival on human vision and split-second reflexes.

New: Save time during incidents with incident templates

Creating incidents often means filling out the same information over and over again. That’s why we’ve added Incident Templates – a faster way to create incidents using pre-configured settings. With templates, you can save commonly used incident details and apply them with a single click whenever you need them.

Analysing Claude Code telemetry with SquaredUp - diving deeper

In our previous article we looked at the basics. In this article, we take a deeper dive into some of the complexities of configuration as well as some of the nuances of analysing Claude telemetry. Before we dive into the code, let us remind ourselves of our telemetry pipeline: we emit Claude Code telemetry to an OpenTelemetry Collector. The telemetry is then exported to an Application Insights endpoint and stored in Log Analytics tables.