
Why agentic AI development needs reliability guardrails

May 15, 2026 By Gavin Cahill In Gremlin

AI has massively accelerated code deployment. Since the introduction of agentic coding, GitHub has seen exponential growth in PRs, commits, and new repos. Growth that GitHub originally predicted would require 10X capacity is now estimated to require 30X capacity, and the biggest driver is agentic development. Companies across industries are building agentic pipelines to ship features faster than ever before. That acceleration isn’t without risk.

Learn these 4 Chaos Engineering Principles Before You Break Anything

May 5, 2026 By Harness In Harness

Want to start chaos engineering? Don't randomly break stuff and hope for the best. Real chaos engineering starts with defining your system's steady state metrics like latency, throughput, and error rates. Then you form a clear hypothesis about what should happen when failures occur. Next, you inject controlled failures, starting small with single pod kills or network drops, not production meltdowns. Finally, you limit the blast radius by running experiments in safe environments first.
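The four steps in this summary can be sketched as a small, simulated experiment. This is a minimal illustration, not Harness's implementation: the names `SteadyState`, `simulate_metrics`, and `run_experiment`, the threshold values, and the toy metric model are all assumptions made for the example, and no real failures are injected.

```python
import random
from dataclasses import dataclass


@dataclass
class SteadyState:
    """Step 1: thresholds that define 'healthy' (hypothetical values)."""
    max_p95_latency_ms: float
    max_error_rate: float


@dataclass
class Metrics:
    p95_latency_ms: float
    error_rate: float


def within_steady_state(m: Metrics, s: SteadyState) -> bool:
    return (m.p95_latency_ms <= s.max_p95_latency_ms
            and m.error_rate <= s.max_error_rate)


def simulate_metrics(healthy_pods: int, total_pods: int,
                     rng: random.Random) -> Metrics:
    """Toy model (an assumption): latency and errors grow as pods are lost."""
    lost = (total_pods - healthy_pods) / total_pods
    latency = 120.0 * (1 + 2 * lost) + rng.uniform(0, 10)
    errors = 0.001 + 0.05 * lost
    return Metrics(latency, errors)


def run_experiment(total_pods: int, pods_to_kill: int,
                   max_blast_radius: float, steady: SteadyState,
                   environment: str, rng: random.Random) -> bool:
    """Return True if the hypothesis 'the system stays in steady state' holds."""
    # Step 4: limit the blast radius before touching anything.
    if environment == "production":
        raise ValueError("run in a safe environment first")
    if pods_to_kill / total_pods > max_blast_radius:
        raise ValueError("experiment exceeds the allowed blast radius")

    # Step 1: confirm the system is healthy before the experiment starts.
    baseline = simulate_metrics(total_pods, total_pods, rng)
    if not within_steady_state(baseline, steady):
        raise RuntimeError("system is not in steady state; aborting")

    # Step 3: inject a small, controlled failure (simulated pod kill).
    during = simulate_metrics(total_pods - pods_to_kill, total_pods, rng)

    # Step 2: check the hypothesis against the observed metrics.
    return within_steady_state(during, steady)


if __name__ == "__main__":
    steady = SteadyState(max_p95_latency_ms=200.0, max_error_rate=0.01)
    ok = run_experiment(total_pods=10, pods_to_kill=1, max_blast_radius=0.2,
                        steady=steady, environment="staging",
                        rng=random.Random(0))
    print("hypothesis held" if ok else "weakness found")
```

In a real setup the simulated metrics would be replaced by queries against a monitoring system and the simulated pod kill by an actual fault-injection tool; the guard clauses at the top are the part worth keeping as-is.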

Chaos Engineering vs. Traditional Testing: What's the Difference?

Apr 21, 2026 By Harness In Harness

Stop treating system outages like surprises and start preparing for them. While traditional software testing is the bedrock of development, using unit, integration, and regression tests to verify that code meets specific requirements, it only accounts for what we expect to happen. Chaos Engineering takes a different approach by shifting the focus from bug prevention to system resilience. Instead of asking "does this work?", Chaos Engineering asks "how does this survive?" by injecting real-world turbulence like network latency or pod failures directly into production-like environments.

What is Chaos Engineering? Explained in 60 seconds

Apr 8, 2026 By Harness In Harness

Discover how leading engineering teams proactively build rock-solid applications using Chaos Engineering. Learn why waiting for real outages is risky and how intentionally injecting controlled failures like pod crashes, network latency, and node restarts helps uncover hidden weaknesses before they impact your users. In this short, explore the simple yet powerful practice that turns fragile systems into resilient ones and how Harness makes running chaos experiments effortless and safe with its intuitive Resilience Testing module.

3 Biggest Myths of Chaos Engineering

Mar 30, 2026 By Harness In Harness

Are myths about chaos engineering preventing your team from building more resilient systems? In this video, Matt Schillerstrom, Director of Product Management at Harness and founding engineer of the chaos engineering program at Target.com, breaks down the three most common misconceptions about chaos engineering. Drawing from his experience building large-scale programs, Matt explains how to move past these myths to build confidence in your infrastructure.

The hidden reliability risks in your agentic AI workflows

Mar 17, 2026 By Andre Newman In Gremlin

Artificial intelligence recently took a major leap from “saying” to “doing.” Instead of simple back-and-forth chats, we’re now allowing automated AI processes to take action on our behalf—from responding to emails to building and deploying complete applications. This shift from “assistant” to “actor” can make applications more capable, but it also creates additional failure modes.

Test your AI model training reliability, too

Mar 13, 2026 By Gremlin In Gremlin

Training is at the heart of every LLM, but it’s still an application running on infrastructure, which means it can fail. Our GPU test helps you test your training GPUs so you don’t lose that valuable work. TRANSCRIPT: One of the things we built recently was the GPU Gremlin. So if you are training a bunch of models and you're doing a bunch of GPU testing, we want to give you the tools to be able to go test that, to understand how training the model could fail.

How Gremlin makes disaster recovery testing easier and faster

Mar 4, 2026 By Gavin Cahill In Gremlin

There’s a common saying: “A backup isn’t a backup until you’ve tested it.” The same is true whether it’s a simple database failover or an entire data center/cloud provider failover. You simply won’t know if it works if you don’t test it. When it comes to disaster recovery testing, that can be an expensive, painful, and arduous process. But it’s required by companies for a reason. And not just for disasters like hurricanes, flooding, or earthquakes.

From Chaos Engineering to Resilience Testing: Why We're Expanding How Teams Validate Reliability

Feb 27, 2026 By Uma Mukkara In Harness

At Harness, we’re committed to helping teams build and deliver software that doesn’t just work – it thrives under pressure, scales reliably, and recovers swiftly from the unexpected. Today, we’re taking the next step in that mission by evolving our Chaos Engineering module into Resilience Testing. This evolution reflects how reliability is tested in practice today.

You need to regularly test your reliability

Feb 24, 2026 By Gremlin In Gremlin

Reliability testing isn’t a one-and-done thing. You need to test on a regular schedule to make sure your system stays reliable as it changes.
