Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Generate RUM-based metrics to track historical trends in customer experience

Datadog Real User Monitoring (RUM) provides end-to-end visibility into the user experience and performance of your browser and mobile applications. RUM allows you to capture and retain complete user sessions for 30 days. This means you can pinpoint bugs, prioritize issues, and determine fixes with data collected across an entire month.

5 Reasons Why OpenTelemetry is the Future of Observability

It has been said that open source is eating the world, and in the observability space the project behind this movement is OpenTelemetry. The project is quickly becoming the standard for instrumentation and collection of observability data. Why is an open-standard, open-source approach to instrumentation and data collection so compelling? This talk will provide five reasons why OpenTelemetry is disrupting the observability market.

My Most Surprising Discoveries from The SRE Report 2023

I’ve had the honor and privilege of authoring The SRE Report for the last three years. For the 2023 version, this included working with some amazing individuals like Anna Jones, Kurt Andersen, and Steve McGhee. Download The SRE Report 2023 here (no registration required).

Reducing MTTR for DevOps and SREs with PagerDuty Process Automation and InfluxDB

Mean time to resolution (MTTR) is a metric that transcends industry and technology. It measures how quickly, on average, support teams identify, act on, and resolve IT issues and incidents. Because MTTR directly relates to service quality, maintaining a low MTTR is a critical goal for DevOps and SRE teams. These teams have a vested interest in resolving issues quickly because escalating incidents to higher levels of the support team increases response and resolution times.
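As a rough illustration of the "on average" part of that definition, the sketch below computes MTTR from a list of incidents. The function name `mean_time_to_resolution` and the sample timestamps are hypothetical, made up for this example; it does not use any PagerDuty or InfluxDB API.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average (resolved - opened) duration as a timedelta.

    `incidents` is an iterable of (opened, resolved) datetime pairs.
    """
    incidents = list(incidents)
    if not incidents:
        raise ValueError("need at least one resolved incident")
    # Sum the individual resolution times, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident records: (opened, resolved) timestamps.
incidents = [
    (datetime(2023, 1, 5, 9, 0), datetime(2023, 1, 5, 9, 30)),    # 30 minutes
    (datetime(2023, 1, 7, 14, 0), datetime(2023, 1, 7, 15, 30)),  # 90 minutes
    (datetime(2023, 1, 9, 22, 0), datetime(2023, 1, 9, 23, 0)),   # 60 minutes
]

print(mean_time_to_resolution(incidents))  # 1:00:00
```

A single long-running incident can pull this average up sharply, so teams often look at it alongside the median or a percentile.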

How Do You Measure Application Performance?

Web performance isn’t just about how long a website needs to render all its page elements—it also covers techniques for monitoring an application’s runtime, user-defined transactions, component response times, and network requests. The important thing is using performance data to evaluate the success of your app or service, whether you’re trying to compare different versions or introduce new capabilities.

Reduce Data Costs: Log Sampling with OpenTelemetry and BindPlane OP

Redundant logs are a common nuisance in observability pipelines of all kinds. In large environments, excess logs can drive data costs to unsustainable levels. Log sampling keeps a random subset of logs so that you retain the same valuable insight with a dramatically reduced data flow. Configuring every agent in a pipeline to sample logs appropriately can be a pain. Pipeline managers like BindPlane OP make that process simple and scalable.
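To make the idea of random sampling concrete, here is a minimal sketch in plain Python. It is not how BindPlane OP or the OpenTelemetry Collector implements sampling; the function `sample_logs` and its `keep_ratio` and `seed` parameters are assumptions made up for this example.

```python
import random


def sample_logs(lines, keep_ratio, seed=None):
    """Keep each log line independently with probability `keep_ratio`."""
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be between 0 and 1")
    rng = random.Random(seed)  # seeded only to make runs repeatable
    # rng.random() is in [0.0, 1.0), so keep_ratio=1.0 keeps every line
    # and keep_ratio=0.0 drops every line.
    return [line for line in lines if rng.random() < keep_ratio]


# Hypothetical, highly redundant log stream.
logs = [f"GET /healthz 200 request_id={i}" for i in range(10_000)]

sampled = sample_logs(logs, keep_ratio=0.1, seed=42)
print(f"kept {len(sampled)} of {len(logs)} lines")
```

With a 10% ratio, roughly one line in ten survives, which is usually enough to see the pattern in repetitive entries such as health checks. Rare, high-value entries (errors, for example) are typically excluded from sampling rather than thinned out with everything else.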

5 Tips For Consumers To Shop Safely This Black Friday

While it makes for bleak reading, the frenzy of sales and online shopping activity surrounding Black Friday means this pre-holiday season is a key period for cybercriminals. Each year we see an increase in cyberattacks during what should be a feel-good time. The picture is all the more worrying in 2022, as this Black Friday weekend (25th-28th November) coincides with the USA vs England World Cup game, a highly anticipated day of betting for bookmakers.

Answering the FAQ of CPU temperature monitoring

Have you ever wondered how productive we could be if we could measure and monitor our brains and be alerted every time we overused them? While there’s no practical way to measure the performance of the human brain without expensive medical equipment, you can track metrics like these for your computer’s brain—its central processing unit (CPU). A device’s performance depends on the condition of its CPU; a device cannot function properly without a CPU.