Top tips: When "sounds right" isn't right

Top Tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’re looking at why convincing AI answers can still be wrong and how to catch them before they slip through. AI doesn’t fail the way it used to. It doesn’t give obviously wrong answers. It gives answers that are just right enough to trust. And that’s exactly why we stop questioning it. It fits into our workflow so easily.

Improved Microsoft 365 private status integration

Keeping track of your Microsoft 365 services just got easier. We’ve rolled out an update to the Microsoft 365 integration that removes manual setup and improves visibility. All services in your account can now automatically appear as components, so you can monitor them right away.

Reports just got smarter

We’ve upgraded the Reports page in StatusGator to give you more insight directly inside the StatusGator dashboard. Previously, reporting was limited to exports you could use to calculate your own uptime percentages and trends. Now, in addition to exported reports, you can view key reports and metrics without needing to download anything. We’ve also added a one-click download of the most commonly requested report: Uptime percentage by monitor.
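For readers who still calculate uptime from exported data, here is a minimal sketch of the arithmetic. It assumes each outage is a (start, end) pair and that outages do not overlap; the function name and input shape are hypothetical and are not StatusGator's export format or API.

```python
from datetime import datetime, timedelta


def uptime_percentage(window_start, window_end, outages):
    """Return uptime as a percentage of the reporting window.

    `outages` is a list of (start, end) datetime pairs. This sketch
    assumes the pairs do not overlap each other; any part of an outage
    outside the window is clipped away.
    """
    total = (window_end - window_start).total_seconds()
    if total <= 0:
        raise ValueError("window_end must be after window_start")

    down = 0.0
    for start, end in outages:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()

    return 100.0 * (total - down) / total


if __name__ == "__main__":
    start = datetime(2026, 1, 1)
    end = start + timedelta(days=30)
    outage = (start + timedelta(days=3),
              start + timedelta(days=3, minutes=43, seconds=12))
    print(round(uptime_percentage(start, end, [outage]), 3))
```

In a 30-day window, 43 minutes and 12 seconds of downtime works out to 0.1% of the window.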

Coralogix and Atlassian: Full-Stack Observability Inside the Incident Workflow

Incident response has a well-known efficiency problem. The tools teams use to detect and investigate issues are often disconnected from the tools they use to manage and resolve them. Engineers spend a significant portion of each incident switching between platforms, assembling context that should already be at hand. Even when the data is available, correlating signals across user, app, infrastructure, and security events to pinpoint a root cause remains manual and slow.

GitHub Outages 2025 - 2026: Reliability Analysis and Outage History

HashiCorp co-founder Mitchell Hashimoto decided to pull his Ghostty project from GitHub in April 2026, citing GitHub's reliability issues. He did this after 18 years of using GitHub, saying that GitHub "is no longer a place for serious work". GitHub has experienced a significant decline in reliability over the past six months, and Hashimoto is not alone in expressing this sentiment.

Rightsizing Nightmares: When Your Cloud Cost Tool Degrades Performance

Here is what production teams see happen. A vertical pod autoscaler recommendation gets applied automatically. Resource requests come down a notch across a namespace. The cost dashboard registers a small savings win. A few minutes later, health checks start failing. Pods enter crash loops.
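One way to reason about this failure mode is a guard that runs before a recommendation is applied automatically. The sketch below is illustrative only: the `Recommendation` fields, the `should_auto_apply` function, and the 20% headroom and 30% maximum cut are all assumptions, not part of the vertical pod autoscaler or any cost tool's API.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """Hypothetical record for one workload's memory rightsizing proposal."""
    workload: str
    current_request_mib: int
    recommended_request_mib: int
    observed_peak_mib: int


def should_auto_apply(rec, headroom=0.2, max_cut=0.3):
    """Return True only if the proposed request looks safe to apply unattended.

    Two assumed rules:
    1. The new request must stay above observed peak usage plus headroom.
    2. A single change may not cut the request by more than max_cut.
    """
    if rec.current_request_mib <= 0:
        return False

    floor_mib = rec.observed_peak_mib * (1 + headroom)
    if rec.recommended_request_mib < floor_mib:
        return False

    cut_fraction = 1 - rec.recommended_request_mib / rec.current_request_mib
    return cut_fraction <= max_cut


if __name__ == "__main__":
    recs = [
        Recommendation("api", 1000, 800, 600),
        Recommendation("worker", 1000, 650, 600),
        Recommendation("batch", 1000, 500, 300),
    ]
    for rec in recs:
        action = "apply" if should_auto_apply(rec) else "hold for review"
        print(f"{rec.workload}: {action}")
```

Anything the guard rejects would go to a human for review instead of being applied across the namespace.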

Your Team is Using Claude Code. Do You Know What It's Costing You?

The first two weeks of Claude Code are exciting. The third week is when you realize you don’t have visibility into what it’s doing or what it’s costing you. You would not run a production service without metrics, logs, and dashboards or deploy an API without knowing its latency, error rate, or cost per request.

Agentic ITOps is here. Here's what early movers are doing.

We recently brought together IT operations leaders from across financial services, healthcare, airlines, media, and other industries for BigPanda 26, our annual customer event. The theme that emerged above all others during the event’s conversations is that our industry is no longer debating whether AI belongs in ITOps. The debate now is about how quickly it can be implemented, how to measure it, and who’s accountable when it acts. Here are some key learnings from BigPanda 26.

Ticket Taker to Team Leader: Managing an Agentic IT Workforce

The promise of AI in IT service management has been circulating for years. Chatbots that deflect tickets. Virtual agents that answer FAQs. Automation that routes requests. These are useful, but probably not the dream-state you were originally sold. What's different today is the arrival of agentic AI: systems that don't just respond to instructions but reason, act, and adapt across multi-step workflows with real consequences. The question for IT leaders is no longer whether to adopt agentic ITSM.