Microsoft 365 backup best practices: A practical guide for IT teams

Microsoft 365 plays a critical role in modern business communication and collaboration with services such as Exchange Online, SharePoint Online, and OneDrive for Business. However, many organizations overestimate Microsoft 365’s native protection and recoverability. In reality, Microsoft 365 operates under a shared responsibility model. While Microsoft ensures infrastructure availability and uptime, organizations are responsible for protecting and recovering their data.

FinOps KPIs for IT Infrastructure: A Practical Field Guide for Cost Visibility

Infrastructure cost visibility has become a critical part of IT decision-making. Performance still matters, but for many infrastructure leaders, that’s no longer the full conversation. Leadership teams increasingly want clarity around cost movement, upgrade exposure, underutilized resources, and whether infrastructure decisions are financially defensible. That creates a different requirement for operations teams: visibility that connects technical behavior to business impact.

The Hybrid Shift: Where Workloads Are Headed and How to Move Them

Migration away from a single public cloud provider has been the direction of travel for UK digital infrastructure for years. As far back as 2020, Barclays found that 43% of enterprise CIOs were already planning to bring workloads back from the public cloud to on-premises or private cloud infrastructure. Since then, IDC, Gartner and a host of vendor surveys have tracked an increase in this intention.

Everything We Talked About at O11yCon 2026

We just wrapped O11yCon 2026, and this year's conversations hit differently. Agent-based software development is here, now. It's no longer optional, and everybody is struggling to understand what their agents are doing and how to make them cost less and perform better. Over the course of fifteen talks, we saw clearly that the old assumptions about how our software gets written, and who (or what) writes it, have been upended. Here are some highlights. We'll have videos available in the near future.

You don't need to pick one: how Sentry and OpenTelemetry work together

You already instrumented the backend with OpenTelemetry. Your services emit spans. Your teams know the OTel APIs. Maybe you already run a Collector. So when you start evaluating Sentry, the obvious question is: Do you need to replace your OpenTelemetry setup with the Sentry SDK? No. The practical answer is usually: keep OpenTelemetry where it already works, add the Sentry SDK where it gives you more application context, and send OpenTelemetry Protocol (OTLP) events to Sentry.

The AI Agent Accountability Gap: Why Network Policies, API Gateways, And RBAC Are Not Enough

In The Five Pillars of AI Agent Accountability: A Diagnostic Framework for Engineering Leaders, we walked through each pillar of AI agent accountability (traceability, authorization provenance, identity and ownership, policy at scale, and human oversight) and argued that most enterprises today sit at Level 0 or Level 1 of the Accountability Maturity Model. The most common reaction we get when we share that framework is some version of: “We’re already covered. We have network policies.

10 Privacy-First Engineering Intelligence Platforms 2026

Engineering leaders need more than raw metrics; they need actionable insights from platforms they can trust with their data. When evaluating engineering intelligence platforms, privacy controls and centralized repository oversight should top your criteria list. The platforms on this list each offer distinct approaches to tracking DORA metrics, developer productivity, and code quality while keeping your data secure.

How to Use Git Blame in Your Editor in 6 Steps (2026)

Tracking down who made a specific change in your codebase can feel like detective work. Whether you’re debugging an issue or trying to understand why a particular piece of logic exists, knowing the history behind each line is invaluable. GitKraken makes this process simple with tools like GitLens for VS Code and GitKraken Desktop, which bring blame annotations directly into your workflow.

Healthy PR Lifecycle Time: Benchmarks & Targets (2026)

Your pull request has been open for three days. Your reviewer hasn’t commented. You’re starting to wonder if anyone will ever look at it—and whether the code you wrote on Monday still makes sense on Thursday. This feeling is common. PR lifecycle time—the duration from first commit to merged code—directly impacts how quickly you ship features, how fresh your code stays, and how engaged your reviewers remain.
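
As a rough illustration of the definition above, here is a minimal Python sketch that measures PR lifecycle time as the gap between first commit and merge, then takes the median across a few pull requests. The function name `pr_lifecycle_time` and the sample timestamps are hypothetical, made up for this example; real values would come from your Git host.

```python
from datetime import datetime, timedelta
from statistics import median


def pr_lifecycle_time(first_commit_at: datetime, merged_at: datetime) -> timedelta:
    """Return the duration from first commit to merge for one pull request."""
    if merged_at < first_commit_at:
        raise ValueError("merged_at precedes first_commit_at")
    return merged_at - first_commit_at


# Hypothetical sample data: (first commit, merged) timestamps for three PRs.
prs = [
    (datetime(2026, 1, 5, 9, 0), datetime(2026, 1, 8, 9, 0)),    # open three days
    (datetime(2026, 1, 6, 10, 0), datetime(2026, 1, 6, 16, 0)),  # merged same day
    (datetime(2026, 1, 7, 8, 0), datetime(2026, 1, 8, 8, 0)),    # merged next day
]

# Convert each lifecycle to hours so the values are easy to compare.
hours = [pr_lifecycle_time(start, end).total_seconds() / 3600 for start, end in prs]

# The median is less sensitive to one long-lived PR than the mean.
median_hours = median(hours)
print(f"Median PR lifecycle time: {median_hours:.1f} hours")
```

The median is used here only as one reasonable summary; a team might equally track a high percentile to catch the slowest pull requests.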

Builder in the loop: Eric Lake on making AURA smarter after every incident

Builder in the Loop is a Mezmo interview series focused on the engineers, product leaders, and operators shaping AURA, an open-source, MCP-native agent harness for production operations. The goal is to get past the polished product layer and talk through the decisions that matter when AI starts interacting with real systems. Key questions include: What should agents be allowed to do? How do they get better over time? Where should humans stay in the loop?