
AWS Middle East data center strikes: 92 SaaS platforms report disruptions

StatusGator analysis identifies 92 cloud services that publicly acknowledged disruptions tied to the AWS Middle East incident. Over the weekend, Amazon confirmed that drone strikes damaged AWS facilities in the Middle East, disrupting cloud infrastructure across the region. The strikes affected AWS regions in the United Arab Emirates and Bahrain, causing outages and degraded performance across core cloud services including compute, storage, and databases.

How Gremlin makes disaster recovery testing easier and faster

There’s a common saying: “A backup isn’t a backup until you’ve tested it.” The same is true whether it’s a simple database failover or an entire data center/cloud provider failover. You simply won’t know if it works if you don’t test it. When it comes to disaster recovery testing, that can be an expensive, painful, and arduous process. But it’s required by companies for a reason. And not just for disasters like hurricanes, flooding, or earthquakes.

Best IT Asset Tracking Software in 2026 for Smarter IT Asset Management

IT asset tracking software is becoming a critical operational tool for organizations in 2026. Businesses now manage large inventories of physical IT equipment including laptops, desktops, monitors, networking hardware, and peripherals across multiple departments and locations. Implementing reliable IT asset tracking software allows companies to maintain accurate equipment records, reduce asset loss, strengthen accountability, and improve overall IT asset management efficiency.

Centralizing Docker Logs for Observability and Security

Most people can remember the old game of telephone, in which a whispered sentence or phrase is passed along a group of kids. At each transmission, a different piece of information gets lost or misheard, leaving the last person with an incomplete or incomprehensible statement. Managing Docker logs can feel the same way, especially when an error message is lost or arrives without context.

AI SRE in Practice: Enabling Non-Experts to Troubleshoot Kubernetes

Kubernetes troubleshooting traditionally requires deep platform expertise. Understanding pod lifecycle, decoding error messages, correlating events across resources, and identifying root cause all demand experience that takes years to build. This expertise gap creates a bottleneck where only senior engineers can handle production issues, limiting how quickly teams can resolve incidents.

Database Schema Evolution: Designing for Continuous Change | Harness Blog

Modern database design is no longer a one-time activity but an ongoing process that evolves as business needs, scale, and system behavior change. Instead of large redesigns, teams rely on incremental and backward-compatible schema changes, such as adding columns, indexes, or new tables, to safely adapt the database without disrupting production.
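As a rough illustration of the incremental, backward-compatible change described above, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, the `email` column, the index name, and both function names are hypothetical and chosen only for this example; they do not come from the article. The sketch adds a nullable column and an index, then checks that a writer unaware of the new column still works.

```python
import sqlite3


def apply_additive_migration(conn):
    """Add a nullable column and an index without rewriting existing rows.

    The table and column names are assumptions made for illustration.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(users)")}
    if "email" not in existing:
        # A nullable column with no default is additive: old INSERTs keep working.
        conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_users_email ON users(email)")
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("INSERT INTO users (name) VALUES (?)", ("ada",))

    apply_additive_migration(conn)
    apply_additive_migration(conn)  # safe to run twice

    # A writer that predates the migration and does not know about "email".
    conn.execute("INSERT INTO users (name) VALUES (?)", ("grace",))
    # A writer that uses the new column.
    conn.execute(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        ("linus", "linus@example.com"),
    )
    return conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Because the migration checks for the column before adding it, it can be re-run safely, and rows written before the change simply read back with an empty `email` value.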

How to Lower Your Egress Fees in 2026

Egress fees can quietly drive up cloud costs. Learn practical ways to reduce your cloud egress fees in 2026 without redesigning everything. Cloud egress fees can sneak up on you. One month your cloud bill looks reasonable, and the next it’s clear that data movement is causing your cloud spend to fluctuate. For many network teams, egress is still treated as a fixed cost or something you only revisit during a major architecture change, but that approach doesn’t hold up in 2026.

What's New in Calico: Winter 2026 Release

As anyone managing one or more Kubernetes clusters knows by now, scaling can introduce an exponentially growing number of problems. The sheer volume of metrics, logs, and other data can become an obstacle, rather than an asset, to effective troubleshooting and overall cluster management. Fragmented tools and manual troubleshooting processes introduce operational complexity, leading to inevitable security gaps and extended downtime.

OpenTelemetry traces for Bitbucket Pipelines via webhooks

Continuous delivery is only as good as your ability to understand what’s happening inside your pipelines. When a build is slow, flaky, or burning through capacity, you need more than a green/red status and a wall of logs — you need traces. Bitbucket Pipelines now exposes pipeline execution as OpenTelemetry (OTel) traces via webhook events. This lets you stream detailed pipeline spans into your own observability stack and correlate them with the rest of your system. This post walks through.