
AI ROI Dispatches: How a non-engineer solved a $300K problem for under $1K

A year ago, the sentence “I just deployed an app on GitHub” wouldn’t have made sense coming from me. I’m the VP of People at CloudZero; code deployments and I were not close friends. That’s changed. In this AI era, non-engineers are building, and I think that’s a genuinely good thing. But only if it’s tied to something that matters.

Shipped: LiteLLM is probably under-counting your Claude spend

If you run Claude through LiteLLM, some of that spend is probably going uncounted – and you can’t see it, precisely because the data isn’t there. Routing through a gateway is messier than it looks: LiteLLM alone can carry Claude several ways – the OpenAI-compatible endpoint, and the Anthropic pass-through proxy that the native SDK and Claude Code use – and each path describes the same call differently.

Upsun Dispatch is available in prerelease

When we introduced Upsun Dispatch last week, we said we were building the platform layer for everything around the code. Today, you can apply to join as a founding design partner. Starting July 1, 2026, a number of engineering organizations will join us in prerelease. This is a selective, high-touch collaboration with teams who want to help shape what comes next. If you missed the introduction, you can catch up on Upsun Dispatch here.

Difference Between Elasticity and Scalability in Cloud Computing

In cloud computing, teams use elasticity and scalability as if they mean the same thing. In reality, the two describe different ways a system handles load, and they solve different problems. Mixing them up can be very expensive. You either pay for capacity that sits idle, or your app buckles the moment traffic spikes, and the bill and the incident report both feel it.

Cloud freedom with AI built in

Most cloud providers give you the hardware and leave you to figure out the rest. Civo AI is different. Chief Innovation Officer Josh Mesout explains how Civo thinks strategically about AI adoption, guiding organisations through the full lifecycle from planning and infrastructure through to running and scaling workloads, powered by best-in-class NVIDIA GPUs.

How to Choose a Cloud Migration Partner in New Jersey: What IT Leaders Need to Verify

A failed cloud migration does not announce itself in advance. Data loss, extended downtime, misconfigured security controls, and compliance gaps surface during or after the move, when reversing course is expensive and the business is already affected. For New Jersey organisations in financial services, healthcare, legal, and manufacturing, the stakes are high enough that choosing the right migration partner is at least as important as choosing the right cloud platform. The hard part is separating providers who can execute a migration cleanly from those who can describe one convincingly.

Why your team keeps waiting for staging (and what to do about it)

The staging bottleneck: why your framework needs ephemeral preview environments. There's a specific kind of Friday afternoon that frontend and backend developers both recognize. A feature is ready to test. Staging is occupied. Someone else pushed a half-finished migration to the shared database last Tuesday and it's been "almost fixed" ever since. You either wait or you merge blind and hope. Most teams treat this as a scheduling problem. It isn't. It's an architecture problem.

The AI vendors just started watching the meter. CFOs need to watch the return.

On June 18, OpenAI gave ChatGPT Enterprise admins new credit usage analytics and spend controls: a single view of credit consumption broken down by user, product, and model, plus default workspace budgets, per-group limits, and a Cost API for pulling the data into their own systems. Two days earlier, Microsoft shipped Copilot Cowork with spending limits, budget allocation, usage alerts, and user-level caps. This is a step in the right direction.

Customer lifetime value (CLV): formula, calculation, and how to improve it

Customer lifetime value (CLV) is the total revenue a business expects from a single customer over the entire relationship, minus the costs of serving them. The standard SaaS CLV formula: Average Revenue Per Account x Gross Margin % / Monthly Churn Rate. For a $500/month customer with 75% gross margin and 5% churn: CLV = $7,500. That number can swing materially once AI spend per customer is built into gross margin, something many SaaS companies still don't do.
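The formula above can be checked with a few lines of Python. This is a minimal sketch, not a definitive model: the function name and the 65% margin in the second call are assumptions added for illustration, while the $500, 75%, and 5% inputs come from the example in the text.

```python
def customer_lifetime_value(arpa_monthly: float, gross_margin: float,
                            monthly_churn: float) -> float:
    """CLV = average revenue per account x gross margin % / monthly churn rate."""
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")
    return arpa_monthly * gross_margin / monthly_churn


# Inputs from the example in the text: $500/month, 75% gross margin, 5% churn.
baseline = customer_lifetime_value(500, 0.75, 0.05)
print(f"Baseline CLV: ${baseline:,.0f}")  # Baseline CLV: $7,500

# Hypothetical: per-customer AI spend pulls gross margin down to 65%.
with_ai_spend = customer_lifetime_value(500, 0.65, 0.05)
print(f"CLV with lower margin: ${with_ai_spend:,.0f}")
```

Because gross margin sits in the numerator, any cost that lowers it lowers CLV by the same proportion, which is why leaving AI spend out of the margin overstates the figure.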

How Kubernetes Operators May Conflict With Resource Optimization (And How to Avoid It)

A Kubernetes Operator is a method of packaging, deploying, and managing a Kubernetes application. It extends the native Kubernetes API by combining custom resources (CRDs) with a dedicated controller: a custom control loop that continuously watches the state of those resources. The primary purpose of an operator is to automate complex, stateful applications (like databases, message queues, or monitoring suites) that require human operational knowledge to maintain.
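The control loop described above can be sketched in plain Python. This is an illustration of the reconcile pattern only, not the Kubernetes API: `DatabaseSpec`, `ObservedState`, `reconcile`, and `apply` are hypothetical names, and the cluster is simulated with an in-memory object.

```python
from dataclasses import dataclass


@dataclass
class DatabaseSpec:
    """Hypothetical custom resource: the desired state a user declares."""
    replicas: int


@dataclass
class ObservedState:
    """Hypothetical view of what is actually running."""
    replicas: int


def reconcile(spec: DatabaseSpec, observed: ObservedState) -> list[str]:
    """Compare desired and observed state; return the actions needed to converge."""
    diff = spec.replicas - observed.replicas
    if diff > 0:
        return ["start replica"] * diff
    if diff < 0:
        return ["stop replica"] * (-diff)
    return []  # already converged: nothing to do


def apply(actions: list[str], observed: ObservedState) -> None:
    """Stand-in for real cluster calls: just update the in-memory state."""
    for action in actions:
        observed.replicas += 1 if action == "start replica" else -1


spec = DatabaseSpec(replicas=3)
observed = ObservedState(replicas=1)

# A real controller keeps watching indefinitely; this loop stops once converged.
while actions := reconcile(spec, observed):
    apply(actions, observed)

print(observed.replicas)  # 3
```

Because the loop always drives toward the declared spec, a change made directly to the running state by another tool would be undone on the next pass. In this simplified model, that is one way a controller and an external optimizer can end up working against each other.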

The secret behind Carnegie's fortune and the lesson for the AI era

Point A: 1835. Andrew Carnegie is born in a weaver’s cottage in Dunfermline, Scotland. The cottage has one main room, which the Carnegies share with another family. Point B: 1901. Andrew Carnegie becomes the richest man in the world when Carnegie Steel Company wins the Iron vs. Steel industrialists’ war, and he sells the company to J.P. Morgan for the modern equivalent of $450 billion.

Azure FinOps with AI: What's New in Turbo360 v5.2

Turbo360 v5.2 is the biggest AI update we've shipped. Every module now has AI built in - not just to surface data, but to explain it, guide you through it, and help non-experts take action without needing to call in a specialist. In this video, Mike Stephenson walks through every new feature in v5.2, from AI agents that explain cost drivers and rightsizing recommendations, to a brand new Savings Tracker that gives you a better way to prove FinOps impact to management.

Turbo360 for System Integrators: Grow Your Azure Practice

If you deliver Azure integration solutions for clients, this video is for you. Fragmented tooling, unpredictable bills, and support incidents that eat into your consultancy time — these are the problems that limit how fast SI partners can scale. Turbo360 helps you solve all three, and turn each one into a business growth opportunity. In this video, Turbo360 CTO Mike Stephenson (Microsoft MVP) walks through how system integrator partners are using Turbo360 to deliver better outcomes for clients, reduce support overhead, and build managed service revenue alongside their integration practice.

New in Kubex: KAI Scheduler Integration for Shared GPU Inference

Today, we’re launching Kubex support for the KAI Scheduler and automated GPU sharing for inference workloads. As AI inference moves into production, platform teams are being asked to serve more models, support more teams, and control GPU costs at the same time. But many inference workloads do not need an entire GPU all the time. When teams reserve full GPUs or oversized GPU fractions to stay safe, expensive capacity can sit idle across the cluster.

CloudZero Dimension Studio: A drag-and-drop UI at the foundation of AI ROI

The core of ROI is visibility. If you can clearly see (1) what it costs to produce the thing you make and (2) how much money it makes you, then calculating ROI is easy. But with AI, as with the cloud before it, getting that visibility is extremely challenging. Why? Because the cost data associated with each is inherently chaotic.

Introducing Upsun Dispatch

AI has made writing code fast, and you can feel it. Commits are up, pull requests are up, new repos spin up over a weekend, and your engineers swear they are faster. But where are all the new products? If every team really got faster, the software you use every day should be getting visibly better. AI helped your engineers ship more code. It didn't help your team ship more products.

Retention Policies vs Retention Labels in SharePoint (2026): The Difference Admins Constantly Get Wrong

Retention policies apply to locations. Retention labels apply to items. Both live in Microsoft Purview, both retain content, and admins regularly use the wrong one. What each actually does and when to use which.

How Businesses Are Building More Resilient Technology Strategies

In an increasingly digital world, businesses face constant pressure to keep their technology systems secure, efficient, and adaptable. From cyber threats and system outages to changing customer expectations and market disruptions, organizations can no longer rely on technology strategies that focus solely on short-term needs. Instead, companies are investing in resilient technology strategies designed to support long-term growth, minimize risk, and maintain business continuity in uncertain conditions.

Upsun included in IDC ProductScape on worldwide cloud deployment-centric platforms, 2026

Upsun is included in the IDC ProductScape guide to worldwide cloud deployment-centric platform capabilities. Building and scaling applications has never been more complex or more critical. Engineering teams are under constant pressure to ship faster, manage increasingly complex infrastructure, and adapt to the rapid rise of agent and AI-powered development. Choosing the right platform to support these demands is an important decision for technology organizations.

How to track business expenses in 2026: methods, tools, and AI spend

How to track expenses for a business: categorize expense types (operating, software, cloud, travel, capital), choose a tracking method (spreadsheet, accounting software, expense management tool, or cost intelligence platform), connect data sources (bank feeds, cloud billing APIs, SaaS invoices), assign ownership per cost center, set a reporting schedule, and audit quarterly.

AWS Summit London & NYC: what engineers want

Across two AWS Summit events in London and New York City, we had the chance to speak with more than 1,000 engineers. They came from startups building their first production stack, and enterprises managing large AWS and multi-cloud deployments. The energy was exactly what you'd expect: major AWS launches, dozens of new service announcements, wall-to-wall cloud conversations. And HAProxy right in the middle of it.

40+ Cloud Storage Statistics in 2026

Each year, the risks to cloud data security and online privacy increase. Cloud storage statistics help us understand these risks, how to prevent them, and which cloud storage to choose to protect your data. By understanding the common security risks in the cloud, we can also learn how to prevent them with cloud storage services and online privacy tools.

Why your PaaS choice is a governance commitment

Choosing a Platform-as-a-Service (PaaS) is not just an infrastructure decision. It is also a decision about how personal data will be handled over the life of the project. It's a governance commitment made early, with consequences that run late. A PaaS does not remove an organization’s accountability for privacy, security, or regulatory compliance. However, a well-architected PaaS can materially strengthen the control environment in which those obligations are managed.

Cloud Storage vs Local Storage: Everything You Need to Know

In 2026, the world is expected to generate roughly 450 to 500+ million terabytes of data per day. All this data needs to be stored somewhere, but is cloud storage or local storage the better way to manage it? This article aims to give you a deeper understanding of both storage models so you can determine which best suits your personal, business, or enterprise use case.

Where to Find a Cloud Application Development Company Fast?

A cloud application development company decides how fast you ship a product, how well you survive a traffic spike, and whether your customers stick around when a page takes two seconds too long. And yet hunting down the right team for the work somehow always drags. You have got the idea. You have got a launch date that wakes you at 3 a.m. You have got a budget that refuses to stretch the way you need it to. So where do you even begin?

Lock-in is not theoretical: What UK organizations told us about cloud exit barriers

For years, vendor lock-in has been discussed as a theoretical risk. A concern to acknowledge in architecture reviews. A box to tick in compliance frameworks. A future problem that might need addressing. Our latest research reveals something more urgent. For UK organizations, lock-in isn't theoretical anymore. It's structural. It's measurable. And it's preventing organizations from acting on their own strategic priorities.

How to Fix Azure Integration Errors in Minutes Instead of Days

Azure integration errors can be difficult to diagnose when messages flow across multiple services such as Logic Apps, Service Bus, Azure Functions, APIs, and external systems. Support teams often spend hours searching through logs and correlating events across services just to identify where a transaction failed.

The sovereignty debate explained with Nine23

Who really owns your data? Data sovereignty has become one of the defining issues shaping digital infrastructure, cloud strategy and AI adoption. But what does it actually mean, and why has it become a board-level discussion for so many organisations? In Episode 4 of Perspectives from the Edge, Pulsant's Wendy Shearer is joined by Steve Jewell, CEO of Nine23, to explore data sovereignty and its relationship to security, resilience and digital transformation.

EU Data Act Compliance for Cloud and DevOps Teams: What Changes You Need to Make Before the Deadline

If you have been following regulatory trends, you already know that the EU Data Act is quickly becoming one of the most important regulations for companies that manage business data and serve customers in Europe. Many companies have spent the past few years adjusting to privacy laws. However, this new regulation brings different challenges.

Why Cloud Spending Keeps Rising Across the Financial Sector

Financial institutions have spent years modernizing their technology infrastructure, and cloud adoption continues to accelerate. From global banks to fintech startups, organizations across the financial sector are increasing their cloud budgets as they look for greater flexibility, efficiency, and access to advanced technologies.

The bottleneck has moved. AI is rewriting the Software Development Lifecycle

If you've read our previous piece on the 8 stages of AI engineering maturity, you know where your team sits. Turns out adopting AI is the easy part; adapting to its consequences is where most organizations struggle. For more than a decade, software organizations optimized around a single assumption: implementation capacity was scarce.

Eight best practices for a successful cloud migration strategy

Moving to the cloud is one of the most consequential decisions an IT organization makes. A successful cloud migration strategy sets the foundation for how your business scales, innovates, and competes. But too often, cloud migration initiatives stall, underperform, or force organizations to repatriate applications back on-premises because the groundwork wasn’t laid correctly.

Shipped: Stop rebuilding Views from scratch

In Explorer, you build a filter set and group-by to answer a cost question, and often that’s exactly the configuration you’d want to save for later. But saving it as a View meant navigating away from Explorer, opening the Views page, and rebuilding the same configuration from scratch: filter by filter, dimension by dimension. That friction was enough to discourage saving exploratory analysis as a View at all. You can now save any Explorer analysis as a View in place.

AI pricing explained: what AI actually costs and how providers charge for it in 2026

AI pricing covers the cost structures and billing models providers use to charge for AI products: per-token APIs (GPT-4o at $2.50/1M input tokens), per-seat subscriptions (Copilot at $30/user/month), per-conversation billing (Agentforce at $2/conversation), and consumption-based GPU compute (H100 instances at $55.04/hour). There is no standard. The total AI cost is almost always higher than the sticker price.

Alibaba Cloud monitoring: What changes when scale, speed, and cost collide

Alibaba Cloud monitoring isn't AWS or Azure monitoring with a different logo. The way its services scale, absorb load, and send early warning signals follows its own logic, and if you're watching the wrong things, you'll find out too late. Cloud monitoring conversations often follow patterns set by AWS and Azure. The metrics are familiar, dashboards look the same, and operational playbooks are built around expected infrastructure behavior.

LightMesh DHCP Integration: Always Know What's on Your Network

Dynamic Host Configuration Protocol (DHCP) activity changes faster than most IP inventory systems can keep up. Devices reconnect. Leases expire. Infrastructure changes constantly across servers, endpoints, and cloud environments. If your IP inventory cannot reflect those changes automatically, teams quickly lose confidence in the data they rely on to operate the network.

Why developer teams are rethinking their cloud provider this year

The default cloud choice for technically literate teams has shifted. It hasn't shifted dramatically; the major hyperscalers aren't going anywhere, and their enterprise position is still strong, but the conversation that used to start with "which hyperscaler" now genuinely starts with "what do we actually need." That's new.

Shipped: You're emitting AI telemetry. Point it at an engine that turns it into allocated spend.

Your AI calls already emit OpenTelemetry: your LLM gateway exports it, and it’s the open standard your own services can speak. But you don’t have anywhere to turn those spans into spend you can allocate to an outcome. Now you can. CloudZero exposes an OpenTelemetry endpoint that doesn’t care what’s on the other end.

How to monitor and optimize GPU utilization in the cloud

GPU utilization is one of the most expensive metrics in cloud infrastructure to get wrong. A GPU running at 30% utilization costs the same as one running at 90%, but it's doing a third of the useful work. For workloads measured in tens of thousands of GPU-hours, the difference between average utilization in the 30s and average utilization in the 70s is hundreds of thousands of dollars across the life of the workload.
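The arithmetic behind that claim can be sketched as follows. The $4.00 hourly rate and the 50,000 useful GPU-hours are hypothetical figures chosen for illustration, as are the function names. The sketch also assumes that useful work scales linearly with utilization, which is a simplification.

```python
def billed_hours(useful_gpu_hours: float, utilization: float) -> float:
    """GPU-hours you pay for in order to get a given amount of useful work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return useful_gpu_hours / utilization


def workload_cost(useful_gpu_hours: float, utilization: float,
                  hourly_rate: float) -> float:
    """Total bill for the workload at a given average utilization."""
    return billed_hours(useful_gpu_hours, utilization) * hourly_rate


HOURLY_RATE = 4.00         # hypothetical price per GPU-hour
USEFUL_GPU_HOURS = 50_000  # hypothetical amount of useful work required

low = workload_cost(USEFUL_GPU_HOURS, 0.35, HOURLY_RATE)
high = workload_cost(USEFUL_GPU_HOURS, 0.75, HOURLY_RATE)

print(f"Average utilization 35%: ${low:,.0f}")
print(f"Average utilization 75%: ${high:,.0f}")
print(f"Difference:              ${low - high:,.0f}")
```

Under these assumptions, the same work billed at 30% utilization takes three times the GPU-hours it would at 90%, which is the "a third of the useful work" comparison restated from the cost side.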

Building More Resilient Multi-Cloud Operations

The last post in this series looked at how disconnected alerts can slow incident response and how stronger correlation helps teams investigate issues with more clarity. That same operational context has value beyond triage. It also plays an important role in resilience, service assurance, and the ability to maintain confidence across increasingly complex multi-cloud environments. Resilience depends on more than reacting well during an outage.

Shipped: What did the feature cost to ship? What does this customer cost to serve?

You can already split AI spend by team and by model. But that’s not what your CEO asks in the QBR. The question is what you got for it: what did it cost to ship that feature, to launch that campaign, to serve that customer. And is the AI bet behind it paying off? Now you can allocate AI spend to the outcomes you own: customer, product, feature, the strategic bet on the P&L. Not just the team that spent it.

The next era of telco clouds: get open infrastructure choice with Sylva and Canonical Kubernetes

The telco industry is undergoing a fundamental change. Over the past few years, the increasing maturity of cloud-native infrastructure has accelerated the movement from manually operated and hardware-centric systems to automated, software-defined platforms. Underpinning this change are open source initiatives such as the Sylva project. Sylva is hosted by Linux Foundation Europe and heavily backed by major telecom operators and vendors.

How Cloud Computing Is Revolutionizing Prop Firm Technology

The financial trading world has changed dramatically over the past decade, and much of that change has been driven by one thing: cloud computing. For proprietary trading firms, staying competitive means being faster, smarter, and more reliable than ever before. That is where prop firm technology comes in.

Claude Mythos pricing in 2026: Fable 5 costs, Mythos 5 costs, and what every model actually runs

Claude Mythos is now available to the public through Claude Fable 5, released June 9, 2026. Claude Fable 5 pricing is $10 per million input tokens and $50 per million output tokens, exactly 2x Claude Opus 4.8 ($5/$25). Claude Mythos 5 (the restricted Project Glasswing version) has identical pricing. Prompt caching cuts input spend by 90%. Batch API pricing is $5/$25 (50% off). In April 2026, Anthropic announced a model it said was too dangerous to release.
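As a rough illustration, the rates quoted above can be turned into a small cost estimator. This is a sketch under stated assumptions, not a billing implementation: the function name and token volumes are hypothetical, the 90% caching reduction is assumed to apply only to the cached share of input tokens, and the batch and caching discounts are assumed to combine multiplicatively, none of which the text confirms.

```python
# Rates quoted in the text, in dollars per million tokens.
INPUT_PER_MTOK = 10.00
OUTPUT_PER_MTOK = 50.00
BATCH_DISCOUNT = 0.50   # "50% off" for batch pricing
CACHE_DISCOUNT = 0.90   # "cuts input spend by 90%"; assumed to apply to cached input only


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_fraction: float = 0.0, batch: bool = False) -> float:
    """Estimate the dollar cost of a workload under the assumptions above."""
    if not 0 <= cached_fraction <= 1:
        raise ValueError("cached_fraction must be in [0, 1]")
    input_cost = input_tokens / 1_000_000 * INPUT_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PER_MTOK
    # Only the cached share of input tokens receives the caching discount.
    input_cost *= 1 - CACHE_DISCOUNT * cached_fraction
    total = input_cost + output_cost
    if batch:
        total *= 1 - BATCH_DISCOUNT
    return total


# Hypothetical workload: 2M input tokens, 500K output tokens.
print(f"${estimate_cost(2_000_000, 500_000):.2f}")                       # $45.00
print(f"${estimate_cost(2_000_000, 500_000, batch=True):.2f}")           # $22.50
print(f"${estimate_cost(2_000_000, 500_000, cached_fraction=0.5):.2f}")  # $36.00
```

Output tokens dominate in this hypothetical mix, so caching, which only touches input, saves less than the headline 90% might suggest.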

Shipped: Catch the runaway agent while it's still running.

AI spend has no ceiling. An engineer can burn $5,000 in an hour, and a team that spins up an agent on Friday can loop it on a bad prompt all weekend. You find out when the bill lands: the money is already gone, the damage pieced back together from logs. Cloud spend had a natural limit. Tokens don’t. Now you see it as it happens. Connect a source and the calls stream in within seconds. Within minutes they’re broken out by model, provider, agent, and user.

The Two-Sided Scheduling Problem: Reaching the Next Layer of Cloud Savings

You’ve deployed Karpenter or Cluster Autoscaler and tightened your resource requests, but while you saw an initial dip in your cloud bill, your savings have flatlined. Organizations that thought they had the fundamentals of cloud cost under control are now seeing stagnation. The problem isn’t that they need another FinOps tool or better visibility. The problem is that the current state of enterprise cloud cost optimization strategy is fundamentally reactive.

The Inference Paradox: How Split-Brain LLMs Are Killing Your GPU ROI

During the Toronto KCD (Kubernetes Community Days), I attended an insightful talk on AI resource optimization that highlighted a staggering Gartner study: “AI infrastructure is adding $401 billion in new spending this year alone. Yet, real-world audits tell a much darker story, revealing that average GPU utilization in the enterprise is stuck at a dismal 5%”. While many people in the audience were shocked by that number, the data didn’t come as a surprise to us.

Centralize DHCP Visibility with the Windows Discovery Agent

Your Dynamic Host Configuration Protocol (DHCP) server already knows what’s connected to your network. The problem is that DHCP data rarely stays aligned with the rest of your infrastructure systems. Instead, it becomes fragmented across Windows servers, branch offices, spreadsheets, and disconnected operational tools. Lease data ages, assignments go untracked, and teams lose confidence in their network inventory.

7 Best AI-Powered Virtual Labs Software for 2026

Virtual labs have been part of technical training programs for years, but the role of artificial intelligence inside these environments is changing how organizations build, manage, and scale hands-on learning experiences. While many discussions around AI focus on content generation or chat-based assistance, some of the most significant developments are happening behind the scenes.

Azure Deployment Strategies & CI/CD Best Practices | Harness Blog

Learn how to master Azure deployment with CI/CD pipelines, progressive delivery, and feature flags. See how Harness helps engineering teams ship faster and safer on Azure. Azure deployment sounds straightforward. Push code, it runs in the cloud. But if you've managed a 2 a.m. production incident because a deployment went sideways on AKS, you know the gap between "it deploys" and "it deploys safely at scale" is significant.

Why the fastest teams standardize first

There's a version of this conversation that plays out in engineering organizations everywhere. Leadership pushes for standardization. Developers push back. The argument from developers is reasonable on its face: every codebase has different needs, every team has tools they're good at, and adding process feels like slowing down to go faster. It's a genuine tension, and it's also a false one. The teams that ship the most aren't the ones with the most infrastructure freedom.

The 8 stages of AI engineering maturity: a framework for teams

A few months ago, Steve Yegge published his 8 levels of AI-assisted development, and it clicked the moment I read it, because I had lived that exact progression myself, moving from autocomplete to running agents one step at a time. Framed as an AI trust gradient, it finally gave the industry a vocabulary for something most of us were already going through without a name for it. If you haven’t read it, save it for later.

AI Economics Pulse: Your AI line item is winning, but is it working?

This edition of the Pulse is shifting lanes. We’re calling it the AI Economics Pulse now, because the question on every finance leader’s mind is whether AI spend and the returns on it can be made to pair at all. That question came to a head over the last few weeks. The bills came due, and they came due in public. Uber burned through its entire 2026 AI budget in four months and capped employee spending on Claude Code and Cursor at $1,500 a month.

Shipped: The AI spend on your team's laptops is the part you can't see.

Your engineers run Claude Code. Your designers are in Cowork. Half the company has Claude open in a browser tab, and a few are on Cursor. It’s on their laptops, each person authenticated a different way, and none of it touches your gateway. The only record you get is one lump-sum bill at the end of the month. Now you can capture it where it happens – on the laptop.

Claude Code alternatives in 2026: 10 AI coding tools compared on cost, features, and AI ROI

Something unusual happened in the first half of 2026: the most productive AI coding tool on the market became the most financially dangerous. And the companies that discovered this the hard way read like a Fortune 50 roll call.

Shipped: Counting tokens isn't enough. Start connecting them to outcomes.

You’re funding AI across four billing relationships – Anthropic direct, OpenAI, Claude through Bedrock, Claude through Vertex – and the spend climbs every month. When your CEO asks what it’s producing, you have a number and no answer. Not which product it built, which customer it served, or which bet it’s paying off. And you’re being asked to approve more of it.

From Visibility to Real Savings: Turning FinOps Insights into Measurable Cost Reduction

FinOps programs are maturing, and most organizations have better visibility into cloud spend than ever before. Dashboards are full of data. And yet costs keep climbing. The problem isn’t the data. It’s the gap between knowing where the waste is and actually eliminating it. In this joint session, Tangoe and Kubex come together to bridge that gap. Tangoe brings deep expertise in spend management and FinOps discipline, while Kubex delivers infrastructure-level optimization across cloud, Kubernetes, and the AI and GPU workloads that are rapidly becoming the next frontier of cost pressure.

A practical guide to standardizing app delivery without rebuilding everything internally

Standardize the route from code to production. Everything else is a team decision, not a platform problem. Most app delivery problems do not start with bad engineering. They start with too much variation. One team provisions environments manually. Another keeps deployment notes in a wiki. A third has a staging setup that only one engineer understands. Security reviews happen late because the platform does not make the safe path obvious.

The Future of FinOps: Engineering, Applications & Cloud Cost Accountability

In this episode of the FinOps on Azure Podcast, Michael Stephenson is joined by Ben DeBow, Founder and CEO of Fortified, to discuss the next evolution of FinOps and why cloud cost management needs to move beyond dashboards, reporting, and allocation. Ben shares insights from years of helping enterprises optimize cloud spend and explains why the biggest savings opportunities are often hidden inside applications, workloads, and engineering decisions—not infrastructure.

How Operations Teams Use Break-Even Analysis to Improve Business Performance

Operations teams don't usually spend their days thinking about profit margins. They're thinking about systems, processes, staffing, infrastructure, support tickets, and automation projects that may or may not save time. Still, almost every major operational decision comes back to the same question: Is this worth the investment? That's where break-even analysis comes in.

Sovereign GPU cloud: Data residency across training, inference, and model weights

Sovereign cloud conversations usually center on where customer data sits at rest. The provider points at a UK data center, the contract gets signed, and procurement marks the box. For most workloads, that's a defensible position. For GPU workloads, it isn't.

How to Manage Expense Tracking Across Different Cloud Systems

As businesses use more cloud technology, expense management becomes more complex. Many organizations use multiple cloud platforms for accounting, project management, communication, customer service, and data storage. These systems are flexible and scalable, but they can create challenges when financial information lives in different locations. Without a structured approach, expense tracking becomes slow and error-prone.

5 questions you should be asking about cloud dependency

Cloud infrastructure has become the backbone of modern business operations. But as organizations deepen their reliance on cloud providers, a critical question often goes unasked: just how dependent are we, and at what cost? For years, the cloud adoption narrative focused on agility, scalability, and cost efficiency. Those benefits remain real. But the landscape is shifting.

Checklist: how to reduce environment drift without slowing devs or AI agents

Environment drift persists when teams standardize code but leave infrastructure, data, and access decisions to individual teams and manual setup. Most teams know their environments are not identical. What they underestimate is how quietly the gap widens. A database version is out of sync between production and staging; an environment variable is added manually to one server but never tracked; a cron job runs in production but was never captured in the dev config.

What is Cloud Infrastructure? Everything You Need to Know

Modern businesses need infrastructure that can scale as quickly as their demands change. Yet many organizations still struggle with infrastructure that is costly to maintain, difficult to expand, and slow to adapt to new requirements. As applications, users, and data continue to grow, managing resources efficiently becomes increasingly challenging. Cloud infrastructure provides a more flexible approach.

UK GDPR compliance for cloud and hosting: requirements, risks and responsibilities

UK organisations using cloud services carry a clear legal obligation: they must demonstrate compliance with UK GDPR and the Data Protection Act 2018, not simply assert it. The shift to cloud and hosted infrastructure does not transfer that responsibility to a provider. It distributes it across a chain of controllers and processors that regulators expect you to understand and manage. Post-Brexit, that obligation is set within a distinct legal framework.

Shipped: Keep your cost allocation logic out of the wrong hands

CostFormation is how your organization models cost allocation. As more teams adopt it, protecting that logic matters. RBAC for CostFormation Namespaces lets you scope access at the namespace level, so the right people can view and edit Dimensions, and everyone else can’t.

Minga cut infra costs 30-40% - and it scales itself | Control Plane

Minga checks in 1.5 million students across the eastern seaboard by 8:30 AM Eastern — then lets that infrastructure wind down an hour later. After migrating to Control Plane, they cut infrastructure costs 30–40% and traded fragile, manual scaling for a platform that scales itself.

What is Azure Cost Management? Complete Beginner's Guide (2026)

What is Azure Cost Management and how can it help you control cloud spending? In this complete beginner's guide, Michael Stephenson explains the fundamentals of Azure Cost Management and walks through the core features available in Azure. You'll learn how Azure organizes billing accounts, subscriptions, resource groups, and management groups, and how these structures affect cost tracking and reporting.

10 Enterprise AI Infrastructure Voices Worth Following

Enterprise AI has crossed an inflection point. The model problem is largely covered. What remains unsolved is the operational impact: how to run AI inference and agentic processes continuously, reliably, and at a cost that doesn’t cancel out the value. Most enterprises are discovering this the hard way. GPU utilization dashboards show 80%. Actual compute efficiency is half that. Token demand is compounding at 200-500% annually as agents multiply every action into dozens of model calls.

CloudZero AI Hub: The nexus of autonomous AI cost control

CloudZero originated as a way to make sense of your cloud costs. Costs spread across bills with billions of line items belonging to resources that might or might not have been tagged (or taggable), spun up by engineers working across teams, on different microservices, features, and products, that served a wide range of customers. Kubernetes. Multi-cloud. Check, check, check.

AI ROI: How to measure and provide the return on AI investments in 2026

Every quarter, the same scene plays out in boardrooms across the Fortune 500. The CEO asks: “What is the return on everything the company is spending on AI?” The CTO talks about productivity gains and developer velocity. The CFO points at a cloud bill that doubled but cannot isolate which line items are AI. The board nods politely and tables the discussion until next quarter, when the same question will produce the same non-answer. (If this sounds familiar, you are not alone. Keep reading.)

How to ship a POC in an afternoon: a Claude Code and Upsun walkthrough for product and product marketing

I have an Upsun project that's nothing but proofs of concept. It's a dashboard, basically. Each POC gets its own tile. Click in, and you land on a page with three tabs. The first tab is a written explanation of what the POC argues. The second tab is the POC itself, with a built-in demo that automates a walkthrough of the feature so the recipient can watch it run without me on the call.

Code isn't cheap, but POCs are

I keep hearing the phrase "code is cheap." I don't know who came up with it. Whoever it was clearly has not seen an Anthropic bill. I get what they mean. The cost of writing a line of code has cratered, AI does most of the typing, you know the rest. Fine. But the phrase is combative in a way that doesn't help anyone, especially the engineers in the room. "Code is accessible" lands better. Less swagger, more honesty. Either way, here's the line my friend Guillaume gave me that finally cracked it open.

How to deploy Canonical Managed Kubeflow on Microsoft Azure?

Learn how to deploy Canonical Managed Kubeflow on Microsoft Azure step by step. Canonical's Managed Kubeflow on Azure gives enterprise and startup AI teams a fully operational, open source MLOps platform in under an hour. It is managed 24/7 by Canonical's engineers. This means you can focus entirely on building models rather than running infrastructure.

How LivePerson optimized Logstash and Kafka performance on GCP through benchmarking

By benchmarking five GCP machine types across both Logstash and Kafka, LivePerson's observability team found that infrastructure selection (not just pipeline configuration) is one of the highest-leverage cost optimization decisions at scale.

Claude Opus 4.8: Pricing, benchmarks, and which model to actually run

Anthropic shipped Claude Opus 4.8 on May 28, 2026, exactly 41 days after Opus 4.7. The SERP was empty for two days after launch. Not because nobody cared. Because engineering managers and finance teams were doing the math on whether the bill changes.

The AI ROI Company's new groove: CloudZero's new UI, and what it means for customers

Customizability. Feature velocity. Performance. Capabilities that are critically important to all B2B software users. And capabilities in which CloudZero’s brand-new platform specializes. Pitching a total frontend overhaul didn’t necessarily make me CloudZero’s most popular new PM. But it’s made CloudZero faster, more customizable for a wider range of personas, and easier to update with the new features that matter most to our customers. And, if I may say, it also looks beautiful.

What High-Performing DevOps Teams Get Right About Cloud Security

Most DevOps teams understand that cloud security matters, but the gap between understanding the problem and operationalizing it effectively remains fairly large. Cloud environments move quickly, infrastructure changes constantly, and teams are under pressure to deploy faster without creating unnecessary friction inside development pipelines.

AI ROI is an allocation problem

AI spend is going parabolic, and the labels on the bill (OpenAI, Anthropic, Gemini) are about all a CXO gets to work with. The hard part of tying that spend to outcomes is structural. A major portion of AI spend isn’t COGS. It’s the spend on coding agents producing the software, the spend on building marketing content, the spend on custom sales tooling, the spend on Intercom agents and Sybill analysis.

How platform standardization will help you deliver on your KPIs

IT leaders rarely think they have an infrastructure problem. When a roadmap slips or an audit finding lands, the reflex is to hire more senior engineers, a bigger platform team, another DevOps lead. But headcount is rarely the real lever. The bottleneck is the "hidden factory": the undocumented, invisible work that sits between a developer writing code and that code reaching customers. It doesn't show up in post-mortems because engineers treat the workarounds as normal.

Migrate to Azure Managed Redis with Datadog and Eden

Azure Managed Redis is a Microsoft first-party, fully managed in-memory data store, replacing Azure Cache for Redis tiers. It includes Redis Enterprise features such as RediSearch for vector search and full-text search, in addition to RedisJSON, RedisTimeSeries, and Active Geo-Replication. As Azure Cache for Redis reaches end of life, more teams are planning migrations to Azure Managed Redis in search of better performance, lower cost, and modern capabilities for AI and real-time workloads.

A deep dive into AWS data perimeter misconfigurations

In AWS environments, a data perimeter is a set of preventative controls that help ensure that your trusted cloud identities (principals or AWS services acting on your behalf) are accessing trusted resources from authorized networks. You can apply these controls at various levels of your infrastructure, such as per resource or across all resources in your AWS account.