Monitoring

The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Datadog Universal Service Monitoring Demo

See how you can get instant visibility into the health of your entire fleet of services—without requiring you to change a single line of code. By automatically discovering, mapping, and monitoring every service and dependency, Universal Service Monitoring allows you to detect issues faster, monitor service performance and SLOs across entire environments, and centralize all the knowledge about your services in a single place.

Multiple players, one stack: Inside Roblox's centralized observability stack

When you sign into the Roblox platform, you get 30 million immersive experiences, ranging from concerts to fashion shows to, of course, video games. But when the observability team at Roblox logs on, they’re not playing around. The Roblox observability engineers are responsible for keeping more than 214 million monthly users happy and engaged by making the wildly popular gaming platform highly available around the world.

Now Available: The Flight SQL Plugin for Grafana

Today we have exciting news for Grafana customers with Flight SQL data sources: Now there is a new community plugin available for Grafana that allows it to communicate with Flight-SQL-compatible databases. Flight SQL is a client-server protocol developed by the Apache Arrow community for interacting with SQL databases. It utilizes the Flight RPC framework and the Arrow in-memory columnar format.

Why Your Data-Driven Strategies for Network Reliability Aren't Working

What do network operators want most from all their hard work? The answer is a stable, reliable, performant network that delivers great application experiences to people. In daily network operations, that means deep, extensive, and reliable network observability. In other words, the answer is a data-driven approach to gathering and analyzing a large volume and variety of network telemetry so that engineers have the insight they need to keep things running smoothly.

Before Taking the Plunge, Dip Your Toes in OTel

OpenTelemetry was launched in May 2019 as a merger of the OpenCensus and OpenTracing projects. The open-source, vendor-neutral project resides within the Cloud Native Computing Foundation (CNCF), which virtually ensures its longevity and widespread adoption. In fact, OpenTelemetry has gained significant traction in recent years, with support from many major cloud providers and the wider tech industry.

How to throw custom exceptions inside Logic Apps: Using default capabilities - Extract failure information (Part II)

Welcome to the second part of this five-part series of blog posts on how to throw custom exceptions inside Logic Apps. In this second approach, we fine-tune the previous approach by adding the capability to define custom error messages for each condition and, of course, to retrieve that information inside the Catch Scope.

Log Shippers: The Key to Efficient Log Management

Logs are a vital source of information for any system, providing valuable insights into its performance and behaviour. However, with the increasing complexity of modern systems and the massive amount of data generated by them, managing logs can be a daunting task. This is where log shippers come into play. Log shippers are tools designed to simplify the process of collecting and forwarding log data to a centralized location, allowing for easy analysis and troubleshooting.
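
To make the collect-and-forward idea concrete, here is a minimal sketch of what a log shipper does, written in Python. It is not how any particular shipper is implemented: the function names (to_record, batches, ship), the record fields, and the send callback are all hypothetical, and a real tool would add tailing, retries, and backpressure.

```python
import json
from typing import Callable, Iterable, Iterator, List


def to_record(line: str, source: str) -> dict:
    """Wrap a raw log line in a small structured record (field names are assumptions)."""
    return {"source": source, "message": line.rstrip("\n")}


def batches(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group records into lists of at most `size` items."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly partial, batch
        yield batch


def ship(
    lines: Iterable[str],
    source: str,
    send: Callable[[str], None],
    batch_size: int = 2,
) -> int:
    """Collect lines, skip blank ones, and forward them as JSON batches.

    `send` stands in for the transport to the centralized location
    (for example an HTTP POST); here it is any callable taking a string.
    Returns the number of records forwarded.
    """
    records = (to_record(line, source) for line in lines if line.strip())
    shipped = 0
    for batch in batches(records, batch_size):
        send(json.dumps(batch))
        shipped += len(batch)
    return shipped


if __name__ == "__main__":
    # Hypothetical usage: "forward" batches by printing them.
    sample = ["GET / 200\n", "\n", "GET /missing 404\n", "GET /api 500\n"]
    ship(sample, "web-1", print)
```

Passing the transport in as a callback keeps the collection and batching logic independent of where the logs end up, which is the same separation most shippers make between inputs and outputs.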

4 Differences Between DEM & RUM You Should Know

If you want to deliver an outstanding user experience, you must know the differences between DEM and RUM. Businesses are embracing digitization to provide better services to their customers. However, customer expectations and preferences have changed drastically over time. To address customer demands, businesses have started investing in systems and applications that enhance the user experience.

Should Every Incident Get a Retro?

At a recent training session, Jeli spent a great deal of time covering incident retrospectives and what makes an incident worthy of studying. My colleague Ben Hartshorne asked a fascinating question that caught me by surprise. We had a great discussion, and it made me consider approaches I hadn’t before.