
The latest news and information on DevOps, CI/CD, automation, and related technologies.

How to launch a Deep Learning VM on Google Cloud

Setting up a local Deep Learning environment can be a headache. Between managing CUDA drivers, resolving Python library conflicts, and ensuring you have enough GPU power, you often spend more time configuring than coding. Google Cloud and Canonical work together to solve this with Deep Learning VM Images, which are built on Ubuntu Accelerator Optimized OS. These are pre-configured virtual machines optimized for data science and machine learning tasks.

Capture and Use Network Response Data in AI Powered Testing

Learn how to capture and use response data from network calls to build smarter and more reliable AI-driven tests. This walkthrough covers the full workflow, from configuring user actions to extracting backend responses, validating data, and creating dynamic test flows. You will also see how response data improves debugging visibility and supports data-driven automation. Ideal for developers, testers, and platform engineers looking to improve the accuracy and resilience of AI-powered test suites.

Gamifying FinOps (And CloudZero) For Better Adoption

In our increasingly online world, managing cloud, AI, and other tech spend has shifted from a good idea to an absolute necessity. But even when cost management is a priority, how do you get busy development teams and engineers actively engaged in the new practices? New initiatives are often viewed as more work on the team’s plate, which is an understandable deterrent to adoption. That leaves FinOps proponents struggling to get others on board.

The AI Cost Crisis: 'AI Cost Sprawl' Is Crashing Your Innovation (AI Cost Sprawl Explained + How To Fix It)

AI should speed up innovation, not inflate your cloud bill. But today, the biggest GenAI challenge for SaaS teams isn’t model quality; it’s cost. And increasingly, that cost comes from AI cost sprawl. That’s not because anyone is doing something wrong, but because AI operates differently from the cloud services we’ve all spent a decade learning how to manage.

Accelerating Our Mission to Bring AI to Everything After Code

Since launching Harness in 2017, we’ve been on a mission to unlock faster innovation by removing the bottlenecks that slow software engineering teams down. From day one, we believed that the biggest obstacles in engineering weren’t in writing code — they were in everything that followed.

Why cloud fragmentation is slowing teams down and how unified platforms solve it

Engineering teams today manage infrastructure spread across multiple clouds and tools. Whether this happened through gradual accumulation or deliberate strategy, the result is the same: complexity that slows teams down. Managing each cloud separately with different tools and workflows is a bottleneck to delivery speed, operational efficiency, and platform reliability.

Cutting tech debt at the source: how cloud application platforms put IT back on offense

For most Central IT leaders, tech debt isn't a surprise. It's the silent tax on every roadmap, every quarterly plan, every conversation about why things take so long. Modern cloud application platforms (true PaaS environments) give IT leaders a path to unwind years of accumulated complexity while simultaneously accelerating innovation. You no longer have to tolerate the tax.

What I Learned From Building an eBPF-Based Traffic Capture Application

I just finished building Speedscale’s eBPF-based component to capture and analyze network traffic in a Kubernetes cluster, and it forced me to confront some uncomfortable truths about observability. While there were certainly some challenges along the way, particularly in dealing with Go applications, the approach was relatively straightforward.

The rhythm of reliability: inside Canonical's operational cadence

In software engineering, we often talk about the “iron triangle” of constraints: time, resources, and features. You can rarely fix all three. At many companies, when scope creeps or resources get tight, the timeline is often the first element of the triangle to slip. At Canonical, we take a different approach. For us, time is the fixed constraint. This isn’t just about strict project management. It is a mechanism of trust.

Harnessing the potential of 5G with Kubernetes: a cloud-native telco transformation perspective

Telecommunications networks are undergoing a cloud-native revolution. 5G promises ultra-fast connectivity and real-time services, but achieving those benefits requires an infrastructure that is agile, low-latency, and highly reliable. Kubernetes has emerged as a cornerstone for telecom operators to meet 5G demands.