Platform engineering unplugged: What nobody tells you about platform engineering at scale

Most platform engineering stories are told in hindsight, with the rough edges smoothed out. On June 17th, we are doing it differently. Join us for Platform Engineering Unplugged, a frank conversation with a practitioner who has navigated the real challenges of building and scaling platform engineering: what worked, what didn't, and what they would do differently. If you lead engineering teams and are thinking seriously about platform engineering, this is the session for you.

From event correlation to autonomous IT: Why observability isn't enough anymore

Most IT war rooms have plenty of data, but not enough time or clarity to find the real answer. Dashboards are crowded, alerts keep piling up, and the real issue gets lost in all the noise. Ever dealt with this situation? You’re not alone, and there’s a simpler way to deal with it. OpManager Nexus closes this gap by moving beyond visibility to help teams actually diagnose and fix problems faster.

Datadog Data Observability: Be the first to know when data fails

Bad data doesn't announce itself. Datadog Data Observability gives you unified visibility across your entire data stack—from source systems and pipelines to dashboards and AI applications—so you catch silent failures before they cascade. Detect data quality and pipeline issues before stakeholders do, pinpoint root causes with end-to-end lineage, and reduce pipeline costs with job, cluster, and query recommendations.

Real-Time CPU and Memory Insights for Harness CI Cloud Builds

When a CI pipeline runs on cloud infrastructure, the build machine is ephemeral. It spins up, executes your build, and disappears. During that window, you have zero visibility into how much CPU and memory your pipeline actually consumes. This blind spot creates real problems. Teams over-provision VMs "just in case," wasting compute spend. Others under-provision and deal with silent OOM-kills or CPU throttling — the only clue being a cryptic exit code 137.
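The "cryptic exit code 137" mentioned above follows a common shell convention: a status above 128 usually means the process was terminated by a signal whose number is the status minus 128, so 137 corresponds to signal 9 (SIGKILL). The sketch below decodes such statuses on a POSIX system. The function name `describe_exit_code` is hypothetical and not part of any Harness tooling, and SIGKILL on its own does not prove an out-of-memory kill; it is only one common cause.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell-style exit status.

    Assumes the POSIX shell convention that a status above 128
    means "terminated by signal (status - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

A build step that reports 137 was killed with SIGKILL, which on a memory-constrained build machine is often, though not always, the kernel's OOM killer.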

What's New in InfluxDB 3.10: Performance Beta Expanded with New Enterprise Features

In our last release, we introduced a beta of performance updates designed for heavier, more complex time series workloads. InfluxDB 3.10 expands that beta to include enterprise features that give teams more control as they scale and manage larger workloads in InfluxDB 3. This release adds end-to-end backup and restore, row-level deletes, bulk import from Parquet, user management, and an RBAC preview to the previous performance beta.

Resilience for an AI-Powered Future: PagerDuty's FY26 Impact Report

The impact vision for PagerDuty.org is to enable mission-driven teams to build a resilient world and a sustainable future for all. As a leader in modern, AI-first operations, we know that operational excellence supercharges social impact. As artificial intelligence rapidly reshapes the social sector, this commitment to resilience and efficiency has never been more vital.

Route Critical Alerts Evenly and Move Faster from Message to Phone Call

It’s been a busy quarter at OnPage. We recently rolled out our updated Enterprise Management Console to a select group of beta customers, and the early feedback has been exciting to see. The new experience gives teams a cleaner, more modern way to manage critical communication workflows, on-call schedules, alerting activity, and team visibility from one place. But we have not stopped there.

OnCall Rotation Software for IT Ops Boosts Response (2026)

The chaos of manual on-call management is a familiar story for many IT Operations teams: frantic phone calls, confusing spreadsheets, missed alerts, and frustrated engineers on the verge of burnout. This reactive approach doesn’t just strain your team; it risks service-level agreement (SLA) breaches and customer churn.

We wrote the docs

Most security vendors hide their documentation behind a login. Some don’t write it at all. You get a sales page, a demo, and a request to install an agent on your servers, and you’re expected to trust that the thing does what the marketing says. That’s backwards. So we wrote the docs, and we put all of them at certkit.io/docs. No login, no account gate, no “contact us for details.” You can read every page before you create an account.