The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Why SREs Need to Embrace Chaos Engineering

Reliability and chaos might seem like opposite ideas. But, as Netflix learned in 2010, introducing a bit of chaos—and carefully measuring the results of that chaos—can be a great recipe for reliability. Although most software is created in a tightly controlled environment and carefully tested before release, the production environment is harsher and much less controlled.

MetricFire: A Great Instrumental Monitoring Alternative

Instrumental has decided to shut down its platform starting in August 2022, including its application, servers, and all related APIs. Users will need to migrate to another solution or risk having all their data permanently deleted! But Instrumental users need not fret!

A Data Lake Is Not Enough to Keep Your Observability Ambitions Afloat

Recently I heard one of our prospects talk about a competitor who was promoting their data lake and ask how we are different from that. His question got me thinking about why a data lake alone does not provide the depth of observability you really need. The goal of observability is to help SREs, IT Ops, and DevOps teams run their IT systems with close-to-zero downtime. Consolidating data from across your environment into a data lake is certainly a good step.

IDC report: How autonomous compliance ensures better business outcomes

A new report from IDC emphasizes just how critical autonomous compliance is for companies to ensure that their digital infrastructure environments are consistently hardened, resilient, and compliant. Leaders who prioritize compliance optimize company efficiency while reducing risk. The IDC PeerScape report outlines the best practices of these leaders, who, by implementing autonomous compliance, better protect their businesses.

Episode 5: Mooving to... Practical Postmortems

Episode 5, Mooving to… Practical Postmortems, covers how to use postmortems to learn effectively from failure. Postmortems are commonplace and are now considered a best practice in most modern engineering teams. However, there’s still a lot of confusion about what postmortems should be and, more importantly, what they should NOT be. Thom Duran, Senior Manager of Productivity from Panther, walks us through all that and more in the latest Mooving to… episode!

Our fully-redesigned incident response experience delivers a more intuitive workflow

Today we’re releasing fully redesigned Slack and Command Center experiences for FireHydrant so anyone on your team can intuitively navigate the incident response process — in the app or on the web. There are many things you can do ahead of an incident to help things run smoothly: design and document your process, automate predictable steps, train the team, and run drills.

Default Pull Request Tasks

There are multiple ways to create a task on a pull request. Tasks can be added from the sidebar, top-level pull request comments, file-level comments, or inline comments. Once created, they all appear in the sidebar. On any repository, merge checks can be configured for any branch to allow merging only if all pull request tasks are resolved. This is useful when some tasks must be resolved before changes are merged.

Voice Network Fraud: How to Fight Back with Automated Threat Prevention

Telecommunications fraud is estimated to be a $39 billion a year problem, according to the Communications Fraud Control Association. Despite that, fewer than 50% of enterprises have implemented any strategy to address fraud in their voice infrastructure. Firewalls and session border controllers (SBCs) are not enough to secure a voice network. Enterprises need a more complete approach to network security, one that encompasses the unique vulnerabilities of real-time communications systems, to preempt issues and protect the organization as a whole.