Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

{unscripted} AI Verification and Rollback

Our first AI/ML capability, Continuous Verification, made Harness the first Continuous Delivery tool to understand observability telemetry and trigger rollbacks when deployments caused trouble. We knew we could do more to eliminate the friction involved in its setup. Deploying with confidence shouldn't require a coordination meeting between DevOps, SREs, and developers just to configure the right health checks. That’s why we’re introducing the next generation: AI Verification and Rollback.

{unscripted} AI in Chaos Engineering

Harness AI enhances your chaos engineering capabilities by using artificial intelligence to automate and optimize reliability testing and analysis. One challenge of scaling a Chaos Engineering practice within an organization is upskilling users so they can create and run chaos experiments and devise mitigations for the risks those experiments uncover. The Chaos Engineering module includes an AI agent, the "AI Reliability Agent", that helps with both.

API World 2025: Growth, Memories, and Next Steps

A few weeks ago, our team returned from an incredible time at API World 2025, and we’ve since had a chance to decompress and get back into the swing of things. Looking back, the experience was even more rewarding than I had imagined in my Pre-API World blog. This year was especially memorable for me, as I had the opportunity to attend my first tech conference and travel across the country for work. I’m still buzzing from everything I learned and the people I met.

Global Online Meetup: K3k

Multi-tenancy isn't a new concept, but implementing it in Kubernetes comes with its own set of challenges: noisy neighbours, operational complexity, and, of course, security considerations. Sounds like a lot? That's why it's essential to strike a balance between flexibility and optimising resource utilisation. Join Divya Mohan at 2 PM UTC on 25th September as she hosts Rossella Sblendido and Jean-Phillipe Gouin to explore how the K3k project from SUSE helps us achieve all this and more in this edition of the Global Online Meetup.

Why Security Must Include Cost Accountability In The Cloud

A SaaS team once spotted their first breach not in a SIEM dashboard, but in their AWS bill. Their compute costs spiked by 400% overnight. Turns out, an attacker had spun up dozens of high-powered instances for crypto mining. Logs eventually confirmed the intrusion, but the cost anomaly was the first signal that something was wrong. This incident isn’t unusual. Cloud costs often reflect consumption, but they can also reflect compromise.
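
The idea of treating a cost spike as a security signal can be sketched in a few lines. This is a minimal illustration, not the method used by the team in the story: the function name, the 7-day window, the 3x threshold, and the sample figures are all assumptions chosen for the example.

```python
from statistics import median


def find_cost_spikes(daily_costs, window=7, threshold=3.0):
    """Return indices of days whose cost exceeds `threshold` times the
    median of the preceding `window` days (hypothetical heuristic)."""
    spikes = []
    for i in range(window, len(daily_costs)):
        baseline = median(daily_costs[i - window:i])
        # Skip days with no meaningful baseline to compare against.
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            spikes.append(i)
    return spikes


# Made-up daily compute spend in USD. The last day is a 400% increase
# (5x) over the usual level of about 100.
costs = [100, 102, 98, 101, 99, 103, 100, 500]
print(find_cost_spikes(costs))  # prints [7]
```

A median baseline is used here so that a single earlier outlier does not inflate the comparison value. A flagged day is only a prompt to investigate, since legitimate launches or load tests can also cause spikes.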

Monitor Kubernetes Hosts with OpenTelemetry

It’s 3 AM. API latency just spiked from 200ms to 2s. Alerts are firing, and users are frustrated. You SSH into the first server: top, free -h, iostat — nothing unusual. On to the next host. And the next. That’s how most of us learned to debug. The tools worked, and we got good at using them. But as infrastructure became distributed and dynamic, this approach started to break down. Modern monitoring needs more than SSH and top. It needs unified telemetry.

Densify Talks, CNCF, OpenAPI, and Kubernetes with Dan Ciruli from Nutanix

Andrew Hillier sits down with Dan Ciruli, who leads the cloud native product management team at Nutanix.