
The latest News and Information on Log Management, Log Analytics and related technologies.

OpenTelemetry Fleet Management: Scalable Control

OpenTelemetry has turned observability pipelines into production infrastructure, but managing them at scale often creates a massive operational burden. In this demo, we show how Coralogix Fleet Management acts as the central control plane for your OTel ecosystem, providing the governance and orchestration required for modern DevOps. Stop the "manual marathon" of PRs and Helm upgrades. Move toward a safer, more predictable operating model where telemetry is consistent, audited, and scalable.

The Best Kubernetes Monitoring Tools of 2026

Effective Kubernetes monitoring in 2026 is critical due to increased cluster scale and microservices complexity, demanding a shift toward unified observability (logs, metrics, and traces). The core focus is leveraging AI-driven features to automate anomaly detection, correlate diverse data, and significantly reduce Mean Time to Recovery (MTTR).

AURA in Practice: Mezmo's SRE bot, demo walkthrough

A walkthrough of the Slack-based SRE bot Mezmo's engineering team built on AURA, the open-source agent harness, running against Mezmo's own production tooling. Adrian Furlong shows the bot answering questions in a DM with tool calls visible inline, then in a shared channel where it reads the conversation before responding. He opens a fresh PagerDuty incident on camera. The webhook fires AURA, and within seconds, the agent posts a triage note back on the incident and a structured analysis in the dedicated incident channel.

Managing OpenTelemetry at Scale: Why OTel Pipelines Need a Control Plane

OpenTelemetry made telemetry possible everywhere – turning observability pipelines into distributed production infrastructure. Distributed infrastructure requires a control plane for inventory, governance, and safe change. At 500 collectors across hybrid environments, operational overhead becomes a production risk. The moment telemetry pipelines become a distributed infrastructure, they inherit the operational problems of one.

Federated Search | From Silos to Insight | AWS S3 Schema Discovery with Splunk-Managed Tables

This walk-through shows how Splunk's crawler, available through the Data Management app, can discover schema and partition keys for S3-backed datasets and create Splunk-managed catalog tables. Once the data is mapped, analysts can search AWS S3 data through Splunk and bring it into broader security, observability, and operational workflows.

The Journey to Production AI: Five Steps for SRE and Platform Teams

In a recent webinar, The Journey to Production AI, Andre Elizondo walked through what separates a working agent demo from an agent worth trusting on a 2 a.m. page. Live polls during the session put numbers behind a pattern most platform teams already feel. Most teams are early. The ones who are further along did not get there by shipping a flashier demo. They got there by treating production AI as a platform problem.

Operational Intelligence and the Hidden Structure in System Logs

Most IT teams do not suffer from a lack of data. They suffer from the amount of effort required to make sense of it. Every network device, application, cloud service, and infrastructure component generates a constant stream of machine output. Logs capture state changes, failures, retries, warnings, and thousands of other small signals about how systems behave. The problem is that raw logs are hard to use at operational speed.

From noise to knowledge: How GenAI is revolutionizing log management and analytics

Focusing on GenAI and logs for IT efficiency: Efficiency is everything for managing today’s digital systems. Technology is constantly transforming, and expanding operations are driving an explosion in data. Consequently, data ingest and storage costs have soared. But it’s not just data storage costs that keep teams behind. The challenge of managing all that observability data forces IT teams to choose between efficiency and the bottom line.

Eliminate noisy log lines with Adaptive Logs drop rules

Most platform and observability teams have logs they know are noise. These could be throwaway health check logs, forgotten DEBUG logs, or verbose INFO logs from little-used services that only serve to inflate your bill. Regardless of what they contain and why they're there in the first place, the hard part is getting rid of them. Centralized teams want to prevent these logs from being ingested quickly and easily, without going through toilsome infrastructure change management to do so.

How one partnership powers search for over 2 million WP Engine users

How do you make search faster, smarter, and more scalable? During our recent webinar, I sat down with Luke Patterson, senior product manager at WP Engine, and Delphin Barankanira, independent software vendor partner engineering lead and data & AI specialist at Google Cloud, to answer that question. We dug into the mechanics behind WP Engine’s ability to deliver near-instant updates to over 2 million users.