The latest News and Information on DevOps, CI/CD, Automation and related technologies.

How Visa Cross Border Solutions Reduces Outages by Testing System Resilience in Their SDLC

For global financial services companies, reliability must be built in and validated both before and after shipping to production. Resilience testing is crucial for verifying the reliability of your applications under real-world conditions. But ad-hoc testing and exploratory experiments aren't sufficient: you need to run automated, standardized tests at global scale.

25 Azure Monitoring Tools To Consider For Cloud Optimization

Microsoft Azure is the most popular cloud computing platform after Amazon Web Services (AWS). With over 200 services and resources available, there are plenty of ways to use Azure. This means the Azure public cloud allows hundreds, if not thousands, of unique configurations. This flexibility is ideal for tailoring Azure to your workload’s requirements but also makes cloud management more challenging.

Cortex secures investment from ServiceNow to unify tech operations at the enterprise

This month marks a huge milestone for us at Cortex. We’re excited to announce that ServiceNow, the global leader in digital workflows, has invested in our Series C funding round. Together, we’re pushing forward with our mission to unify tech operations at the enterprise through our industry-leading Internal Developer Portal (IDP).

Why I like discussing action items in incident reviews

Are incident reviews about learning or tracking actions? This question has sparked recent debate in incident management circles, including in my recent panel at SEV0 and in Lorin Hochstein’s post. Should the goal of an incident review be learning, or should it focus on tracking actionable improvements? When is the right time to discuss actions, and are they picked up just to make us feel better? From my experience, learning from incidents and identifying actions are inseparable.

Installing and Upgrading the Flyway CLI | The Tony and Tonie Show

Tony and Tonie discuss a Phil Factor article that explains how the Flyway CLI installation and upgrade process works on Windows and Linux. It also covers three ways to make the process simpler and less time-consuming: a bit of PowerShell automation, running Flyway from a Docker container, or using a package manager.

GitKraken Workshop: Practical Tips for Enhancing Team Alignment with GitKraken.dev

In this workshop, we dive into how GitKraken.dev can help your team stay aligned and work more efficiently. Ken and Jeff take you through key features like Team Launchpad and Insights, showing you how to improve team visibility, track important metrics, and maintain compliance, all within one platform. You'll also get a sneak peek at the upcoming Automations feature, which promises to make repetitive tasks a thing of the past, freeing up your team to focus on what matters most.

Complete Guide: How to Manage IT Infrastructure Remotely

Learning how to manage IT infrastructure remotely is an essential capability for businesses of all sizes, particularly with the rise of distributed workforces. With the right tools and strategies, IT teams can effectively monitor and troubleshoot systems from anywhere, ensuring smooth operations and minimal downtime. This guide will cover best practices for maintaining control over your network, regardless of your team’s location.