Introducing the ChangeTower Website Monitoring Chrome Extension

Setting up website monitoring has always meant a small but annoying detour. You spot a page worth watching, copy the URL, switch tabs, log into your monitoring tool, paste, configure, save. By the time you’re done, you’ve lost whatever train of thought sent you there in the first place. We’re fixing that. Today we’re excited to announce the ChangeTower Chrome Extension — now open for waitlist signups.

How to automate environment sleeping and stop paying for idle Kubernetes resources

Scaling your deployments to zero is only half the battle. If your cluster autoscaler does not aggressively bin-pack and terminate the underlying worker nodes, you are still paying for idle metal. True environment sleeping requires tight integration between your ingress layer and your node provisioner to actually realize FinOps savings.

Bringing observability data hosting to the UK on AWS

UK organizations increasingly have to design systems around data residency requirements, ensuring that operational data remains within national boundaries. Many teams already run their applications on AWS infrastructure in the UK, but telemetry data can still be processed outside the region, creating gaps in visibility. Datadog’s upcoming UK data hosting on AWS addresses this by keeping telemetry data in the same region as the workloads that generate it.

Identify and fix code issues faster with Datadog's Azure DevOps Source Code integration

Developers and SREs who rely on Microsoft Azure DevOps often face fragmented workflows when investigating issues or reviewing code quality. Troubleshooting an error can require jumping between observability tools and source code repositories as you manually connect traces, stack frames, and commits. At the same time, security vulnerabilities, misconfigurations, and flaky tests may go undetected until later stages of the software delivery life cycle (SDLC), where they are more costly to fix.

Claude Opus 4.7 Pricing In 2026: What It Actually Costs (And Whether It's Worth It)

Claude Opus 4.7 holds at $5/$25 per million tokens — but a new tokenizer inflates costs up to 35% on identical text. Here's what Opus 4.7 actually costs at production scale, how it compares to Sonnet 4.6, and the six levers that determine where your bill lands.
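As a rough illustration of the arithmetic behind that claim, the sketch below takes the $5 (input) and $25 (output) per-million-token prices quoted above and applies the "up to 35%" figure as a worst-case multiplier on token counts. The function name, the example workload, and the choice to apply the inflation equally to input and output tokens are assumptions made for illustration, not details from the article.

```python
# Prices quoted in the excerpt above, in USD per million tokens.
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost(input_tokens: int, output_tokens: int, token_inflation: float = 0.0) -> float:
    """Estimate spend in USD for a given token volume.

    token_inflation is the fractional increase in token count for the same
    text (0.35 models the "up to 35%" worst case). Applying it to both input
    and output tokens is an assumption of this sketch.
    """
    scale = 1.0 + token_inflation
    input_cost = input_tokens * scale / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens * scale / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical monthly workload: 10M input tokens and 2M output tokens.
baseline = estimate_cost(10_000_000, 2_000_000)
worst_case = estimate_cost(10_000_000, 2_000_000, token_inflation=0.35)

print(f"baseline:   ${baseline:.2f}")
print(f"worst case: ${worst_case:.2f}")
```

Because the list price is unchanged, the difference between the two figures comes entirely from the larger token count for the same text.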

How Any FinOps Practitioner Can Use AI Right Now To Save 3-4 Hours/Week Of Tedium

Make AI do the dirty work while you focus your energy on strategy. CloudZero's Ryland Bowles shows you how. Every FinOps engineer is worried that AI is going to steal their job. I’ve worried about it. But I’ve also experimented extensively with AI, and I’ve got a pretty clear sense of what it can and can’t do in a FinOps context.

Why Threshold Monitoring Fails in Distributed Systems

For years, infrastructure stability could be approximated through static limits. If CPU utilization exceeded a defined percentage or response time crossed a fixed boundary, risk was assumed to increase in a predictable way. Monitoring systems were designed around that assumption, and for contained environments, it largely held true.
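The static-limit approach described above can be sketched in a few lines. This is a minimal illustration only: the 80% CPU and 500 ms limits, the `Sample` type, and the `breaches` function are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical fixed limits, chosen only for illustration.
CPU_LIMIT_PERCENT = 80.0
RESPONSE_LIMIT_MS = 500.0


@dataclass
class Sample:
    """One measurement from a single host or service."""
    cpu_percent: float
    response_ms: float


def breaches(sample: Sample) -> List[str]:
    """Return the names of the static limits this sample exceeds."""
    alerts = []
    if sample.cpu_percent > CPU_LIMIT_PERCENT:
        alerts.append("cpu")
    if sample.response_ms > RESPONSE_LIMIT_MS:
        alerts.append("response_time")
    return alerts


print(breaches(Sample(cpu_percent=91.0, response_ms=120.0)))
print(breaches(Sample(cpu_percent=40.0, response_ms=100.0)))
```

Each sample is judged in isolation against fixed numbers, which is the assumption the excerpt says held for contained environments.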

Grafana 13 release: get value from your data faster, manage operations at scale, and more!

Who says 13 is unlucky? With the release of Grafana 13, we're giving the community the most streamlined, flexible, and intuitive Grafana experience yet. Unveiled during the opening keynote of GrafanaCON 2026, the latest major release is all about helping you get value from your data faster, whether you’re spinning up dashboards, operating Grafana at scale, or extending the platform as your requirements change.

Introducing o11y-bench: an open benchmark for AI agents running observability workflows

Evaluating agents is hard. Verifying observability tasks is harder. Yes, AI agents have gotten dramatically and quantifiably better at coding and tool use, but observability presents a different kind of challenge. In a real incident, the hard part is rarely just writing a query. It's deciding which signal matters, figuring out whether a spike is noise or symptom, correlating metrics with logs and traces, and sometimes making a change in Grafana without breaking the dashboard another engineer depends on.