The latest news and information on cloud monitoring, security, and related technologies.

Cultural ROI In FinOps: People Drive Pivots

When I ask clients to picture cloud cost optimization, they think dashboards, policies, maybe a clever right-sizing purchase. What they don’t picture? Meetings. Misunderstandings. Mistrust. To avoid FinOps failures, we need a new starting line: one that gets to the root of spend misalignment.

From Code To Clicks: A Visual Way To Build Dimensions In CloudZero

In early October, we launched Dimension Studio, a new visual editor for engineers and others that brings point-and-click simplicity to the same powerful, precise allocation engine CloudZero is known for. Before that, CloudZero users built cloud cost allocations with our YAML-based CostFormation engine, a code-driven way to describe how cloud and AI costs roll up to products, customers, or teams.

How to solve authentication failures when you have an Azure setup

It is not just your business. Enterprises worldwide face recurring technical issues related to authentication failures and access problems. These errors often pop up in scenarios involving service connection setups, pod/start failures, or integration issues. Most of the time, they indicate failed deployments, pods failing to pull images, or intermittent authentication or access errors.

Single-Cloud Dependency Is a Disaster Waiting to Happen

The AWS outage has reminded many businesses of the risk of relying heavily on centralised cloud infrastructure, especially when so many essential services are concentrated in a single region. At the wider industry level, it is also a warning about the widespread lack of contingency planning for cloud failures. Reactive response must give way to strategically planned disaster recovery protocols that engender a resilient cloud market.

Optimizing GPU Efficiency and AI Costs with Pepperdata

As AI workloads explode, platform owners face an increasingly common challenge: a massive gap between GPU demand and supply. Pending workloads, idle GPUs, and rising costs make it harder than ever to scale AI efficiently. In this video, we explore how Pepperdata.ai helps enterprises regain control of their GPU environments with two breakthrough solutions: Demand Optimization – Get granular visibility into GPU usage across your entire infrastructure. Identify inefficiencies, balance supply and demand, and uncover hidden capacity.

AI Agent for Cloud Cost Optimization: From Blind Spots to Smarter Spend

Cloud has become the backbone of digital enterprises, but managing its cost footprint is proving increasingly difficult. With multiple providers, diverse pricing models, and ever-changing workloads, organizations often find themselves facing spend leakage and unanticipated overruns. The stakes are high—not only in terms of IT budgets but also in ensuring cloud resources deliver maximum business value.

Data Centre Colocation: What UK Businesses Need to Know About Costs

As more UK companies go digital, many are missing critical cost factors when choosing colocation data centres, with location, power bills, and regulatory compliance proving far more expensive than anticipated. With insights from Pulsant, a digital edge infrastructure provider, we take a look at the true cost of colocation.

Kubernetes For AI: The CTO's Guide

Kubernetes began as a tool to help teams keep thousands of microservices running without falling apart. It gave them a way to schedule workloads, recover from failures, and scale services without constant firefighting. Now, AI has brought back the same chaos, only magnified. Training jobs sprawl across GPUs. Inference traffic spikes without warning. Pipelines stretch across clusters, clouds, and compliance boundaries. Left unchecked, it can break both your workload and cloud budget.

Service disruption on October 20, 2025

When the internet goes down, our primary job is to help everyone get back up, as fast as possible. Of the almost half a million incidents we've helped our customers solve, there are some which stand out for both their scale and impact. One of these happened on Monday, October 20, when AWS had a widely covered major outage in their us-east-1 region, from 07:11 to 10:53 UTC. We’re hosted in multiple regions of Google Cloud and so the majority of our product was unaffected by the outage.

Build Vs. Buy? Why Creating Your Own Cost Management Platform Is Futile

The siren song of building a custom, internal cloud cost management platform is enticing. Many brilliant engineering teams are convinced they can come up with a bespoke solution that perfectly fits their needs. They look at their company’s unique infrastructure and decide they can DIY cost management without having to rely on an external vendor. Believe me, I get the temptation.