Operations | Monitoring | ITSM | DevOps | Cloud

Detect, Communicate, Resolve: Checkly's Agentic Workflow End-to-End

Coding agents are the fastest-growing audience for the Checkly CLI, and we're doubling down on them. In this session, Stefan hands Claude a real e-commerce app, lets it set up monitoring with `npx checkly init`, generate Playwright tests through MCP, and walk an actual alert end-to-end with Rocky AI in the loop.

Two AI agents, one incident: Rocky AI comes to the terminal

A Playwright Check fails at 2 am. The login flow is broken. Until today, that alert triggered a human to get up, open the Checkly dashboard, copy Rocky AI root cause analysis (RCA), and then tell an agent to get to work. There were two AI agents, one incident, and no way for them to talk to each other. The extended checkly checks and new checkly rca CLI commands close that gap. Your coding agent can now pull Rocky AI's analysis into its ongoing work, read the diagnosis, and go fix the code.

Connecting Agents for Real-Time Root Cause Analysis with Checkly's Rocky AI

Rocky, Checkly's AI agent, monitors production sites and provides an analysis for every failing check. Previously, a coding agent couldn't access this analysis, leaving incidents and agents disconnected. Now, you can access all the analyses via the Checkly CLI (or API) and tell your coding agent, "Hey, I got a Checkly alert. Please investigate!" With Rocky's structured analysis delivered inline, the coding agent can start with a strong hypothesis, fix issues, and propose a PR in one session.

Eliminate Manual Authentication Configuration for Fast & Effective API Security Scanning | Harness Blog

Application security testing tools promise coverage and accuracy, but teams often struggle just to get started. One of the biggest friction points in dynamic application security testing is configuring authentication correctly so a scanner can even access a target application, let alone API endpoints that power the functionality. Whether it’s API keys, bearer tokens, or custom auth flows, setting up authentication for scans frequently requires trial-and-error and engineering support.

The Claude Bill is Too Damn High #speedscale #claude #aiagents #aicoding #devops #llms

Stop overpaying for AI reasoning by trading expensive GPU cycles for efficient, deterministic testing. This video explores how tools like linters and traffic replay can complement Claude, helping you fix bugs more accurately while cutting token usage by up to 50%. Visit: speedscale.com to learn more.

Run Local LLMs on Mac to Cut Claude Costs

Part of the motivation for this post is how cloud API economics are shifting: Anthropic is moving large enterprise customers toward per-token, usage-based billing (unbundled from flat seat fees), which makes “always call the API” a moving cost line for teams at scale. A hybrid or local layer is one way to keep spend bounded while you still use premium models where they matter.

Introducing on-demand Pipelines: run pipelines via API

Your CI/CD pipeline doesn’t have to live in a YAML file anymore. With on-demand pipelines, you can generate pipeline definitions programmatically, from scripts, services, or automation tools – and execute them instantly via the Pipelines API. No commit. No pull request. No static configuration to modify. Just build the YAML your situation demands and run it.

Replace API Synthetics with Traffic Replay

The alert fires at 2 AM. Your observability platform’s synthetic test just failed. Login is broken. So you open your laptop, pull up the dashboard, and stare at a single red dot: the browser test. You know the problem is somewhere in the stack, but not where. Is it the auth service? The token validator? The user profile API? The API gateway timing out? You’re now about to spend the next 45 minutes correlating traces, tailing logs, and manually hitting endpoints until you find it.

Dark Code: The AI-Generated Software Nobody Understands

The biggest risk to your product isn’t AI-generated code that doesn’t work. It’s generated code that seems fine. AI doesn’t optimize for correctness. It creates something passable. Something that passes the smell test. And when everybody in the industry is pushed to move faster and do more with less, you end up shipping software that looks correct. It passed your quick visual check. It passed all the tests. But no one ever fully understood it.

Beyond AI Vibes: Deterministic Foundations for Agentic Coding

Every week there is another model drop, another agent framework, and another workflow tweak you are supposed to evaluate. Meanwhile, the largest companies, the ones operating at the highest scale and leaning hardest on AI, are also the ones making headlines for reliability strain: capacity limits, outages, and services that buckle under load.

Your "clean code" is costing the company millions #speedscale #darkcode #coding #aiagents #claude

If it passes the tests, it’s not my problem. If you’re still manually checking every line of code in 2026, you’re just wasting company time. Let the AI cook and go touch some grass. Check out: speedscale.com.

Building Agent-Friendly CLIs - What we learned at Checkly

Building Agent-Friendly CLIs: Why Your AI Agent Already Loves the Checkly CLI Stefan explains why products, docs, and CLIs must be AI-ready as coding agents rapidly become primary users of the Checkly CLI. He outlines key CLI features for agent workflows: Stefan demos how an agent initializes project-tailored Checkly setup from scratch without any human intervention and also shows how agents can entirely automate the incident life cylce from resolution to status page communication.

How to Monitor a Shopify Store with Playwright and Checkly

This is a guest post by Vince Graics, Staff QA Engineer at World of Books. If you're running a Shopify storefront and want reliable synthetic monitoring, you'll hit a wall. Shopify's bot detection doesn't care that your headless browser is friendly; it sees datacenter IPs and acts accordingly. Cart API calls get hit with 429 rate limits, Cloudflare challenge pages pop up mid-check, and you're left wondering whether the bug is in your code or in the platform fighting you.

The Best SKILL.md Is the One You Never Update - Meet Checkly's CLI

Most agent skills are static — frozen documentation snapshots that go stale the moment APIs change or flags get deprecated. Checkly does it differently. Our SKILL.md is just 100 lines of CLI pointers. No baked-in docs. Your coding agent learns what it needs, when it needs it, straight from the Checkly CLI.

When Your Observability Literally Stops Traffic

Last week, a fleet of autonomous robotaxis in China suddenly stopped working—at scale. Over a hundred vehicles stalled across a city, stranding passengers in traffic and raising immediate concerns about safety, reliability, and trust in autonomous systems. This wasn’t just a bad day for self-driving cars. It was a distributed systems failure, one that happened in the physical world, not just in dashboards.

OpenTelemetry Trace Testing for CI Release Gates

OpenTelemetry is great at answering one question: “what just broke?” The problem is that most teams need a different answer first: “what is about to break in this release?” That is where trace-based testing comes in, especially for teams running a vendor-neutral OTel stack (Collector + Tempo/Jaeger + Prometheus) and needing deterministic release gates.

Instrument and monitor Boomi integration flows with OpenTelemetry and Datadog

Boomi is an Integration Platform as a Service (iPaaS) used by thousands of organizations to connect applications, data, and workflows across cloud and on-premises environments. Business-critical processes, from order fulfillment pipelines to customer data synchronization, depend on Boomi Atoms and Molecules running reliably.

Why Autonomous AI Agents Can't Run on SaaS Infrastructure

The era of the “copilot” is ending. We are moving rapidly toward the era of the autonomous software factory, where autonomous agents don’t just autocomplete our code—they investigate, plan, test, and merge entire features while we sleep. But this shift has exposed a critical flaw in how we consume AI. For the past decade, the default motion for enterprise software has been SaaS. It’s easy, frictionless, and managed by someone else.

From Datadog to CI Tests: Catch Regressions Before Deploy

I worked in observability for years, and the same pattern showed up across teams. An alert fired, the on-call rotation scrambled, and everyone did what they had to do to stabilize production. Then came the retrospective. Once the immediate pressure was gone, the conversation shifted to one question: how do we make sure this never happens again? My friend Jade Rubick coined a name for that principle: DRI, “don’t repeat the incident”.

Why You Should Stop Buying SaaS and Start Building It

The "Buy vs. Build" rule is dead. Generic CRMs are too slow for lean startups, so we built our own. In this video, Ken breaks down "Radar," the custom AI dashboard we use at Speedscale to automate prospecting and outreach. Stop fighting bloated SaaS and start building the exact tools you need to solve your distribution problem. Learn more: speedscale.com.

Checkly Playwright Reporter: A Cloud Dashboard for Your Playwright Tests

The Checkly Playwright Reporter is an npm package that sends the results of npx playwright test to Checkly as a cloud test session, including traces, screenshots, videos, and full debugging context. Run your Playwright suite in CI or locally, and every result gets a persistent, shareable home in Checkly with AI-powered analysis, richer trace-derived views, and a direct path to production monitoring. It does not replace Playwright. It makes the output of Playwright much easier to work with.

Why SaaS is Dying (and what's next) #speedscale #saas #data #datasecurity #devops #technews

Traditional SaaS is a data trap. It’s time to stop sending your most valuable asset to third parties. Enter BYOC (Bring Your Own Cloud): the future of data sovereignty, where the software comes to you. Visit: speedscale.com.

Playwright Myths Busted: Speed, Flakiness, Production Monitoring & AI Test Generation

Playwright is too hard, too slow, and too flaky — right? In this webinar, Stefan busts six common end-to-end testing myths and shows how to reuse your Playwright tests as production monitors with Checkly. He covers codegen, trace viewer, UI mode, flakiness root causes (and fixes), and a quick look at Playwright MCP for AI-assisted test generation.