
Monitoring Kubernetes performance metrics

As explained in Part 1 of this series, monitoring a Kubernetes environment requires a different approach than monitoring VM-based workloads or even unorchestrated containers. The good news is that Kubernetes is built around objects such as Deployments and DaemonSets, which provide long-lived abstractions on top of dynamic container workloads.

Collecting metrics with built-in Kubernetes monitoring tools

The previous post in this series covered the data you should track to properly monitor your Kubernetes cluster. This post shows how to start inspecting your Kubernetes metrics and logs using free, open source tools, and covers several ways of retrieving and viewing observability data from your cluster.

Monitoring Kubernetes with Datadog

If you’ve read Part 3 of this series, you’ve learned how you can use different Kubernetes commands and add-ons to spot-check the health and resource usage of Kubernetes cluster objects. In this post we’ll show you how you can get more comprehensive visibility into your cluster by collecting all your telemetry data in one place and tracking it over time.

NGINX 502 Bad Gateway: PHP-FPM

This post is part of a series on troubleshooting NGINX 502 Bad Gateway errors. If you’re not using PHP-FPM, check out our other article on troubleshooting NGINX 502s with Gunicorn as a backend. PHP-FastCGI Process Manager (PHP-FPM) is a daemon for handling web server requests for PHP applications. In production, PHP-FPM is often deployed behind an NGINX web server. NGINX proxies web requests and passes them on to PHP-FPM worker processes that execute the PHP application.

Introducing wildcard-filtered metric queries

Tags are essential for helping your teams quickly filter the large volume of data generated by your cloud infrastructure and find the information they need. Because modern environments are always changing, with hosts and containers continuously being added or replaced, you need to be able to dynamically scope your queries so that you’re not rewriting the same searches over and over again.

Monitor Apache Airflow with Datadog

Apache Airflow is an open source system for programmatically creating, scheduling, and monitoring complex workflows including data processing pipelines. Originally developed by Airbnb in 2014, Airflow is now a part of the Apache Software Foundation and has an active community of contributing developers. Airflow represents workflows as Directed Acyclic Graphs (DAGs), which are made up of tasks written in Python. This allows Airflow users to programmatically build and modify their workflows.

How to monitor Kubernetes audit logs

Datadog operates large-scale Kubernetes clusters in production across multiple clouds. Along the way, audit logs have been extremely helpful for tracking user interactions with the API server, debugging issues, and understanding our workloads. In this post, we’ll show you how to use Kubernetes audit logs to get deep insight into your clusters.

Monitor email workflows with Datadog Browser Tests

Monitoring your application from end to end is important for ensuring that core functionalities work as designed. Datadog’s browser tests help you verify that key user workflows—such as signing up for a new account—are consistent across devices and locations. Within these workflows, email often plays a key role in onboarding users and providing customers with important information about their accounts and application activity, such as profile changes and order confirmations.