The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Surface and Confirm Buggy Patterns in Your Logs Without Slow Search

Debugging with logs in distributed systems can be a pain. It’s tough to search raw data for a pattern, relate potential causes to other logs, and check trace and metrics data for confirmation. Is finding one pattern enough? What if there are other problems? Who knows how many colliding factors are relevant? At Honeycomb, we’re flipping the script on the log search problem. Hear our resident experts, former Splunk Ninja Michael Wilde and Andy Dufour, discuss how Honeycomb customers have evolved their log analysis process to achieve fast pattern detection, skipping the grep/search loop entirely.

Learn How SumUp Implemented SLOs to Mitigate User Outages and Reduce Customer Churn

Blake Irvin and Matouš Dzivjak from SumUp’s Software Engineering team, along with Honeycomb Solution Architect Michael Sickles and Account Executive Nathan Leary, discuss how SumUp incorporated observability, specifically SLOs, to identify and resolve issues before they grew into customer-noticeable problems.

Join Jeli and Honeycomb for an Incident Response and Analysis Discussion

Solutions Engineers Vanessa Huerta Granda and Emily Ruppe from Jeli, along with Honeycomb’s Field CTO Liz Fong-Jones and SRE Fred Hebert, discuss some of our more interesting recent incidents and how we use Honeycomb and Jeli together for incident response.

See How Coveo Engineers Reduced User Latency

Many teams waste far too much time and energy searching through massive amounts of log data for answers to user latency issues. Metrics data doesn’t help either, as it only tells you that there is a problem, not where to fix it. This is why Coveo turned to observability. By implementing observability with Honeycomb, Coveo was able to reduce its user latency by 50 percent.

Empowering SecOps Admins: Getting the Most Value From CrowdStrike FDR Data With Cribl Stream

In this live stream, Sidd Shah and I discuss how Cribl Stream can empower Security Operations Admins to make the most of their CrowdStrike FDR data. We address the challenges faced by CrowdStrike customers, who generate a vast amount of valuable data each day but struggle to leverage it fully because of its complexity and size.

Introducing the Netdata demo space

Introducing Netdata's Demo Space, a quick and easy way to experience monitoring environments before you set them up yourself. At Netdata, we are always striving to provide the best monitoring experience for our users. We understand that adopting a new monitoring solution can sometimes be challenging, especially when you're unsure of how it will fit your specific environment. That's why we're excited to announce the Netdata Demo Space!

Google Colab Monitoring with Netdata

Hello, fellow data enthusiasts and Google Colab aficionados! Today, we're going to explore how to monitor your Google Colab instances using Netdata. Colab is a fantastic platform for running Notebooks, developing ML models, and other data science and analytics tasks. But have you ever wondered how your Colab instance is performing under the hood? That's where Netdata comes into play!

Reference Architecture Series: Scaling Syslog

Join Ed Bailey and Ahmed Kira as they go into more detail about the Cribl Stream Reference Architecture, with a focus on scaling syslog. In this live stream discussion, Ed and Ahmed will explain guidelines for handling high-volume UDP and TCP syslog traffic. They will also share different use cases and talk about the pros and cons of different approaches to solving this common and often painful challenge.