
Panel: Handling Incident Response - Dash 2021 (Datadog, PagerDuty)

When customer-impacting downtime happens, it’s crucial that responders are prepared and can resolve the issue as quickly as possible. Knowing the right tools to use, wherever you are working from, and having a well-defined strategy in place help the team come together, work the problem, and get to a solution quickly. In this roundtable discussion, PagerDuty and Datadog engineers chat about incident response and how we use all the tools at our disposal to respond quickly and effectively.

Roundtable: The Complexities of Cloud Migration - Dash 2021 (Datadog, LaunchDarkly, StockX)

Often when completing a migration project, your organisation has to straddle two systems. You’re fighting habits and changing attitudes while also attempting to complete a high-risk operation. Every software team will, at some stage, have to complete a migration. Whether it’s to improve scalability and performance, or to move from an on-prem to a cloud solution, you’ll need a deep understanding of your current environment to create a strategy that minimises downtime for your team.

How to do serverless monitoring right #shorts

Monitoring CPU load and memory usage is common practice, but with serverless no action is required. In this video, we quickly explain that if your Cloud Run instances start hitting high CPU load, Google Cloud will automatically spin up new instances for you, and scale them back down when load drops!

Config best practices: dependency caching

Let’s face it: Creating the optimal CI/CD workflow is not always a simple task. In fact, writing effective and efficient configuration code is the biggest hurdle that many developers face in their DevOps journey. But you don’t need to be an expert to set up a fast, reliable testing and deployment infrastructure. With a few straightforward techniques, you can optimize your config.yml file and unleash the full potential of your CI/CD pipelines.

"Open source done right": Why Canonical adopted Grafana, Loki, and Grafana Agent for their new stack

Michele Mancioppi is a product manager at Canonical with responsibility for observability and Java. He is the architect of the new system of Charmed Operators for observability known as LMA2. Jon Seager is an engineering director at Canonical with responsibility for Juju, the Charmed Operator Framework, and a number of Charmed Operator development teams which operate across different software flavors including observability, data platform, MLOps, identity, and more.

Various policy engines for Kubernetes policies - Saiyam Pathak

Kubernetes configurations are complex to manage across developers and operators. External tools like Helm and Kustomize cannot enforce environment-specific configurations, but admission controllers provide a way to do this. Various tools have evolved over time that help solve this problem: OPA Gatekeeper, Kyverno, Kubewarden and jsPolicy. In this talk from ContainerDays 2021, Saiyam Pathak from Civo goes through the need for a policy engine and discusses how each of the tools helps, the differences between them, and where they are headed.

A CTO's View: Driving Continuous Alignment with Mattermost 6.0

The past few weeks have marked a real milestone for the Mattermost community. My co-founder and longtime colleague, Ian, shared his reflections on our huge v6.0 launch, and I echo his take on the magnitude of the launch and our new product capabilities. As CTO at Mattermost, I have the unique pleasure of leading product development efforts for an open source platform backed by an inspiring community of contributors and enthusiasts.

How Pingdom's Real User Monitoring Can Help Optimize Your WordPress Website

Enterprise web applications or medium-to-large, consumer-facing websites are typically built by teams of engineers, administrators, web developers, and other professionals. However, once a site goes live, the operations team is responsible for keeping the site up and running at optimal performance. Online users aren’t forgiving, often abandoning a site as soon as they encounter an issue with functionality, complexity, or performance.

Working With the WordPress REST API

Logging is an important part of every software application. In addition to capturing user activity, well-structured logs can make it easier to debug problems should they occur. But if your application is split up across several different subsystems, collecting and analyzing disparate logs can be a real challenge. Picture this scenario: You work at a startup that uses a CMS managed by a few admins. You also have a standalone front-end application for users to communicate with your platform via an API.