The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Observability and incident response need resilience testing

There’s a reason why observability and incident response practices have become standard across modern software development. Anyone wanting to minimize downtime and deliver reliable, available applications needs to have fully instrumented systems and playbooks so they can respond quickly and effectively to outages or incidents. But there’s another piece to the reliability puzzle: resilience testing.

10 Best VS Code Extensions for Web Development, According to Redditors

When it comes to improving your coding workflow, Visual Studio Code (VS Code) extensions are a fan favorite among devs worldwide. Well-known extensions like Prettier for code formatting, ESLint for identifying and fixing problems, and Live Server for launching a local server are fantastic, but there are plenty of hidden gems offering unique features these popular tools don’t cover. We turned to Reddit to find out which lesser-known VS Code extensions the community’s talking about.

Streamline Your Development Workflow with Bunnyshell: Achieve Faster Time-to-Market

In today’s fast-paced software development landscape, maintaining consistent and reliable environments across all stages—whether it’s development, testing, or production—is crucial. The "works on my machine" problem is all too familiar, leading to inefficiencies and delays that can derail your projects. Enter Bunnyshell, a game-changer in the world of environment management that can transform your development workflow and drastically accelerate your journey from code to production.

Customer-impacting incidents increased by 43% during the past year; each incident costs nearly $800,000

PagerDuty, Inc. has released a study of 500 IT leaders and decision-makers responsible for IT operations at companies with more than 1,000 employees in the United States, the United Kingdom, and Australia. The study highlights the growing impact of customer-facing incidents and the ways automation can help mitigate them.
Sponsored Post

All-in-One Incident Management: Why Squadcast Trumps Separate On-Call and Alerting Tools

The pressure is on. Incidents happen, and resolving them quickly and efficiently is crucial for meeting your SLAs. But relying on a patchwork of tools for alerting, collaboration, and post-incident analysis can create confusion, delays, and frustration. Separate tools may work, or may have been working perfectly, in your company, but there are a few factors to consider. The list of questions can go on and will differ from organization to organization. These are just a few factors that can help you evaluate whether your current tools are truly effective for incident response, or if it's time to switch to a unified solution like Squadcast.

Managed Apps on Public Cloud: Why Operations Matter, Part I

You might be tempted to think that running an app on a public cloud means you don’t need to maintain it. While that would be wonderful, it would require help from the public cloud providers and app developers themselves, and possibly a range of mythological creatures with magic powers. This is because any app, regardless of the infrastructure on which it runs or its output, requires maintenance in order to yield accurate and reliable outputs.

Monitoring as Code and Checkly Listed in the Gartner Hype Cycle for the Second Consecutive Year

I'm excited to share that Gartner has again included Monitoring as Code (MaC) as an emerging practice in its Hype Cycles for SREs, for the second year in a row. Since we founded Checkly, our vision has been that monitoring should sit in your repository, be codified, and scale with your software development. There is no alternative to MaC, as it allows your engineering team(s) to work together, create and maintain checks, and ultimately own their monitoring.