Monthly Archive

How QA engineers use AI to keep up with agentic development

Jun 26, 2026 By Rootly In Rootly

Rootly's QA Lead explains how she has embraced AI to keep up with the pace of AI-driven feature development.

It's always DNS, even at Cisco: behind a weeks-long incident

Jun 26, 2026 By Rootly In Rootly

SRE Lead Ricard Bejarano (Cisco) and Jorge Lainfiesta (Rootly) sit down to talk about a recent intermittent incident that had the team scratching their heads.

High Cardinality in ClickHouse at Scale: What Actually Breaks

Jun 25, 2026 By Prathamesh Sonpatki In Last9

ClickHouse swallows high-cardinality telemetry at ingest, then breaks at query time weeks later. Here is what fails, and how we keep it fast in production. Prathamesh works as an evangelist at Last9, runs SRE stories, where SRE and DevOps folks share their stories, and maintains o11y.wiki, a glossary of observability terms.

Klaudia Under the Hood: How We Built an AI SRE That Actually Earns Trust

Jun 18, 2026 By Asaf Savich In Komodor

In reliability engineering, being ‘mostly right’ is a liability. An AI SRE that sometimes misses the root cause or gives a confident, wrong answer at 2:17 AM has no place in an enterprise cloud environment. In this context, silence is better than noise. That’s the bar Klaudia is built to clear: genuine reliability that you can trust in production. The kind of reliability that earns a place alongside your best engineers. Getting there requires more than just a capable model.

Why API Reliability Is Critical to Modern Finance

Jun 17, 2026 By OpsMatters In OpsMatters

Financial APIs power payments, compliance, and customer services. Learn why observability, monitoring, and API reliability are vital to resilience.

ClickHouse LowCardinality: When It Helps and When It Hurts

Jun 15, 2026 By Prathamesh Sonpatki In Last9

ClickHouse LowCardinality cuts storage and speeds up queries on low-cardinality columns, but backfires on trace IDs. Here is how to tell the difference.

Introducing the Rootly Agent

Jun 11, 2026 By Rootly In Rootly

During an incident, ask the Rootly Agent anything and it'll respond (and act) based on context and your data. The Rootly Agent performs actions on your behalf, so it is bound by the permissions assigned to your user. It will also ask for confirmation before taking significant actions. Rootly admins can turn it on for their workplaces and start running incidents even more efficiently.

Should platform, SRE, and security merge into one function?

Jun 4, 2026 By Cristina Buenahora In Cortex

Platform, SRE, and security are three distinct functions in modern engineering orgs, each shaped by a different problem. SRE was the operations function's answer to scale: how to keep systems reliable when the systems get big. Platform answered a different problem: how to let developers ship without becoming infrastructure experts. Security drew the line on what could safely reach production.

Best Log Management Software for DevOps and SRE Teams in 2026: Feature and Cost Breakdown

Jun 3, 2026 By Libi Michelson In logz.io

TL;DR: Picking the right log management platform in 2026 comes down to three things: how much operational overhead you can absorb, how much AI automation you need, and what you’re willing to spend.

Running AI at Enterprise Scale w/ Anthropic, Descope, Port, Rootly and Twingate

Jun 3, 2026 By Rootly In Rootly

The debate about whether AI can write production code is over. Companies are handing work to fleets of agents, and for many, they write most of the code that ships to production. The next challenge is everything that happens once an entire engineering organization runs this way, at full speed. Teams that generate code 10x faster still review it at human speed, and that mismatch is now the constraint. Code ownership is also becoming an issue, as developers learn to trust agentic processes a little too much. When an agent breaks production, who is responsible?
