Analyze wait events and in-flight queries with the Datadog Database List

When you’re operating databases at scale, being able to get real-time insights across all your databases is essential for addressing issues and identifying areas for optimization. Datadog Database Monitoring’s Database List allows you to monitor your entire database fleet in one place, so you can quickly identify and troubleshoot overloaded hosts and gauge the impact of problematic queries throughout your infrastructure.

Hardware vs. IT vs. Software Asset Management - Why the need for specific asset monitoring tools?

Hardware asset management (HAM), IT asset management (ITAM), and software asset management (SAM) are closely related fields that deal with the maintenance of IT assets in an enterprise. Asset management means managing and tracking the lifecycle of all your enterprise’s assets, from physical to digital. Together, these practices give your enterprise full visibility into its assets so it can make informed decisions about what to do with them.

Application Snapshots: A Valuable Observability Signal for Developers

Monitoring is often not the first thing on the mind of the modern developer. Yet it’s necessary at many points of the software development lifecycle, including before deprecating an API, before launching a new feature, and after launching the feature. In fact, developers’ monitoring needs can vary much more than those of classic Ops monitoring.

3 Enterprise IT Factors That Will Make MSPs More Successful in 2022

A version of this blog first appeared in APMdigest. A new study by OpsRamp on the state of the Managed Service Provider (MSP) market concludes that MSPs face a market of bountiful opportunities but must prepare for growth by embracing complex technologies like hybrid cloud management, root cause analysis, and automation.

Why More Incidents Are Better

Ask most SREs how many incidents they’d have to respond to in a perfect world, and their answer would probably be “zero.” After all, making software and infrastructure so reliable that incidents never occur is the dream that SREs are theoretically chasing. Reducing actual incidents by as much as possible is a noble goal. However, it’s important to recognize that incidents aren’t an SRE’s number one enemy.

Cloud Monitoring metrics, now in Managed Service for Prometheus

According to a recent CNCF survey, 86% of the cloud native community reports that they use Prometheus for observability. As Prometheus becomes more of a standard, an increasing number of developers are becoming fluent in PromQL, Prometheus’ built-in query language. While it is a powerful, flexible, and expressive query language, PromQL is typically only able to query Prometheus time series data.

Why Operational Maturity Helps Businesses Reduce the Great Resignation Trend

The past few years have led to fundamental business and cultural shifts for both companies and employees. COVID-19 has brought opportunities for companies that invested early in digital operations, while others struggled to maintain the status quo. That struggle gave rise to record employee burnout and what is now commonly referred to as the Great Resignation.