Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Introducing Datadog Application Security

Securing modern-day production systems is expensive and complex. Teams often need to implement extensive measures, such as secure coding practices, security testing, periodic vulnerability scans and penetration tests, and protections at the network edge. Even when organizations have the resources to deploy these solutions, they still struggle to keep pace with software teams, especially as they accelerate their release cycles and migrate to distributed systems and microservices.

Design Considerations for Software Distribution to Edge & IoT Applications

Make no mistake: You can’t overlook software distribution in DevOps. At risk are the reliability, security and speed of your software releases — and your business itself. This is especially true in enterprises that are releasing across numerous edge endpoints or IoT devices. As your releases’ cadence and payload grow, software distribution challenges multiply, particularly at the edge.

Now You can Invoke PagerDuty Rundeck Actions Within the PagerDuty Slack Integration

Last year, we released PagerDuty Rundeck Actions, a PagerDuty add-on product that connects responders to automated diagnostics and remediation for common problems directly in the PagerDuty incident response workflow. After working with our customers and listening to the community, we are excited to announce that PagerDuty Rundeck Actions now integrates with PagerDuty’s Slack integration.

The Question Isn't Whether You're Overspending in the Cloud, It's by How Much

Everyone is doing it. No, I am not talking about the latest Tik Tok challenge… The thing that everybody is doing—every company, that is—is that they are spending more money in the cloud than they need to. In fact, 82% of respondents in our own recent survey admitted that their organizations have incurred unnecessary cloud costs.

SRE: How the role is evolving

The growth of site reliability engineering (SRE) has demonstrated the need for SRE implementations is here to stay for the foreseeable future. LinkedIn voted SRE jobs as the second most promising positions in the US in 2019, and now as we head into 2022, you can be sure to see the evolution of SRE continue to grow and expand. Below, we’ll get into what SRE is, what SRE engineers do, and how SRE will continue to evolve into the future.

The startup guide to sensible incident management

If you’re working at an early stage startup and looking to get some good incident management foundations in place without investing excessive time and effort, this guide is quite literally for you. There’s an enormous amount of content available for organisations looking to import ‘gold standard’ incident management best practices – things like the PagerDuty Response site, the Atlassian incident management best practices, and the Google SRE book.

CFEngine bootstrap with Ansible

CFEngine and Ansible are two complementary infrastructure management tools. Findings from our analysis show that they can be combined and used side by side with joint forces to handle all areas in the best possible way. Part of infrastructure management is hosts deployment, either when building a brand new infrastructure or when growing one by adding new hosts.