Syncing PagerDuty Schedules to Slack Groups

We’ve posted before about how engineers on call at Honeycomb aren’t expected to do project work, and that whenever they’re not dealing with interruptions, they’re free to work on whatever will make the on-call experience better. However, all of our engineering rotations rely on hand-off meetings where they update the Slack groups with everyone who’s on call. During my last shift, a small problem kept causing friction for some of our incident management automation.

Investigate Performance issues with SLOs

When an alert goes off because a Service Level Objective (SLO) is in danger of violation, it comes with a lot of context about what has been going wrong and for how long. Then Honeycomb gives you tools to explore the where & why. Here, Martin Thwaites walks through an example of diagnosing slower performance. What service is the problem, and under what circumstances?

AI-Powered Observability: Picking Up Where AIOps Failed

GenAI promises evolutionary changes in how we use observability tools, but meeting expectations means heeding the lessons of our AIOps mistakes. The emergence of generative AI in observability tools was inevitable, but there’s already been an extreme degree of hype in the market. Monitoring, DevOps and ITOps have never been immune to trends, and with GenAI capabilities, the hype machine is running out of control.

4 benefits of observability

Achieving modern observability with a unified data platform and Search AI. If you have a love-hate relationship with your data, we don’t blame you. It’s generated at high velocity and from all sides: your apps, endpoints, networks, and servers. By 2025, global data creation is projected to grow to more than 180 zettabytes. Inside this wealth of data lies better operational resilience, profitability, and innovation.

How to Set Up Real User Monitoring in SolarWinds Observability Platform

Learn how to set up Real User Monitoring in the SolarWinds Observability Platform to track and analyze the real-time performance of your website. This tutorial covers integrating Real User Monitoring with your website, setting performance thresholds, and configuring the tool for single-page applications. By the end, you'll know how to gain valuable insights into your end users' experience and optimize your website's performance.

Improve your observability strategy with AIOps

Change is the only constant in the IT landscape. These changes might involve adding new observability tools, retiring existing monitoring systems, establishing new business units, or integrating IT systems from acquisitions. Managing these changes can challenge even expert ITOps teams. Organizing your monitoring setup can seem overwhelming, especially with issues like monitoring gaps, observability redundancy, complex toolsets, or significant technical debt.

Cloud Observability vs Monitoring: A Practical Guide to Go Beyond Cloud-Native Tools

As organizations move their application workloads to the cloud, understanding the difference between cloud observability vs monitoring is crucial to ensure optimal performance and seamless operations. While both concepts are often mentioned in tandem, they serve different purposes, and mastering each can help organizations thrive in increasingly complex cloud environments.

Getting Started with AWS Monitoring and Observability

It’s no secret that many businesses rely heavily on Amazon Web Services (AWS) for their infrastructure and application needs. While AWS offers scalability, flexibility, and reliability, managing and monitoring cloud resources can be challenging. That’s where AWS monitoring and observability can be a tremendous asset. Today, we will explore how implementing these practices is crucial for ensuring that your cloud environment operates smoothly, efficiently, and securely.

The OTTL Cookbook: Common Solutions to Data Transformation Problems

As our software complexity increases, so does our telemetry—and as our telemetry increases, it needs more and more tweaking en route to its final destination. You’ve likely needed to change an attribute, parse a log body, or touch up a metric before it landed in your backend of choice. At Honeycomb, we think the OpenTelemetry Collector is the perfect tool to handle data transformation in flight. The Collector can receive data, process it, and then export it wherever it needs to go.

Introduction to The Splunk Terraform Provider | Create a Detector in Splunk Observability Cloud

In this video I will demonstrate how to use the Splunk Terraform Provider. I’ll explain what it is and why you should use the Splunk Terraform Provider as part of your overall Observability as Code solution. Using a simple Terraform project, I will walk you through the setup of the provider and the creation of a Detector in Splunk Observability Cloud.

Beyond Backend: Honeycomb for Frontend Observability is Now GA

Real user monitoring (RUM) tools are great if you want to give your developers a very high level view of the health of your frontend. But when it comes to actually debugging issues in your web app, you’re often left piecing together outputs from browser devtools, with details (if you’re lucky) from customer support tickets to replicate issues locally in hopes of identifying the source of the issue. Debugging Core Web Vitals (CWVs) to improve your scores can be even worse.

Debugging INP With Honeycomb for Frontend Observability

Interaction to Next Paint is the newest of Google’s Core Web Vitals. The three metrics that make up the CWVs are Google’s attempt at defining proxy metrics for measuring things they believe are critical to a good user experience on the web. The three metrics are Largest Contentful Paint (LCP), Interaction to Next Paint (INP), and Cumulative Layout Shift (CLS). Debugging and fixing these metrics can be quite complicated. In this post, I’m going to walk through how you can use Honeycomb for Frontend Observability to debug INP, which was just promoted to a stable Core Web Vital in March.

Lower observability bills, reduced MTTR, and more: why companies migrate to Grafana Cloud

There are a lot of factors that go into choosing an observability solution. And even after all that careful consideration, sometimes the platform you initially invest in doesn’t meet your needs, especially as your organization grows and evolves. For that very reason, we’ve seen users begin their observability journeys with another tool, and then decide to migrate to Grafana Cloud, our fully managed cloud-hosted observability platform.

Coroot: The Ultimate eBPF Observability Platform

Explore the benefits of using Coroot for system monitoring, alerting, and inspection. Watch the full "Zero-Instrumentation Observability with eBPF" webinar, and learn from Peter Zaitsev. Coroot is an open source observability platform that helps engineers fix service outages and even prevent them. It continuously audits telemetry data to highlight issues and weak spots in your services. Quick setup, no code required.

Coroot's Approach to eBPF and OpenTelemetry

Discover how Coroot's passive approach to eBPF can provide valuable insights without impacting your system.

Grafana vs Splunk - Features, Pricing, and Performance Compared [2024]

Monitoring and observability tools are critical for organizations to keep their systems running efficiently. Grafana and Splunk are two leading platforms that cater to various observability needs, but they differ significantly in functionality, user base, and cost. This article will explore their features, strengths, and limitations to help you choose the right tool for your use case.

An Engineer's Checklist of Logging Best Practices

The best DevOps and SRE teams have shifted their approach to monitoring and logging their systems. These teams debug problems cohesively and rationally, regardless of the system’s complexity. Gone are the days of having a slew of logs that fail to explain the cause of alerts, system failures, and other unknowns.

Using observability to ship faster with confidence ft. Christine Yen, CEO of Honeycomb

In this episode of The Confident Commit, Rob sits down with Christine Yen, CEO of Honeycomb, to delve into the evolving role of observability in modern software development. They discuss how observability goes beyond traditional metrics and monitoring, and allows developers to be better prepared for the unknown and embrace the complexities of distributed systems. Christine shares insights on how observability not only boosts developer confidence but also enhances productivity by reducing toil and enabling teams to focus on delivering value for customers.

How OpenTelemetry is Transforming Observability

The OpenTelemetry project is changing how organizations approach observability. It aims to standardize monitoring across different systems. OpenTelemetry—commonly referred to as OTel—provides APIs, SDKs, exporters, and collectors. It is making data collection, analysis, and utilization more efficient, leading to better decision-making and technology adoption.

A CoPE's Duty: Indexing on Prod

Odds are that a software engineer today is really focused on one place: pre-prod. Short for “pre-production,” this is slang for an environment where software code operates in a prototype phase of its development lifecycle. Common sense would have one believe that this is a safe space, a workbench of sorts, where problems can be found and remediated.

Splunk vs Dynatrace - Detailed Comparison [2024]

Splunk and Dynatrace are two powerful platforms in the realm of observability and performance monitoring. Each offers unique strengths that cater to different monitoring needs. In this article, we'll explore the features, pros, and cons of both tools, and introduce an exciting alternative that combines the best of both worlds.

The Layers, Not Pillars, of Observability

Remember the Tabs vs. Spaces arguments? It seems that observability has grown up enough that we are arguing over which signals are the “best” signals for observability. Often referred to as the Pillars of Observability, Metrics, Logs, and Traces (sometimes adding Events for MELT) each provide a unique perspective on a system. What happens when we change our perspective from finding the “best” telemetry format to finding the telemetry that aligns with the problems we need to solve?

An Ode to Events

At this point, it’s almost passé to write a blog post comparing events to the three pillars. Nobody really wants to give up their position. Regardless, I’m going to talk about how great events are and use some analogies to try to get that across. Maybe these will help folks learn to really appreciate them and to depreciate a certain understanding of the three pillars. Or maybe not.

Top 11 Grafana Alternatives [comparison 2024]

Grafana is a widely used open-source platform for monitoring and visualization. Grafana has a lot of built-in functionality and also provides a large number of community templates that can improve your overall experience. However, Grafana requires quite a lot of configuration and the documentation can be a bit overwhelming for beginners. In this article, we explore 11 alternatives that can be simpler to use and can provide seamless integration of traces, logs, and metrics.

Top 10 API Monitoring Tools in 2024 [Including Open Source]

API monitoring has become increasingly important due to the growth of microservices, cloud-native architectures, and distributed systems. APIs play a crucial role in facilitating communication between systems, and even small API failures can cause significant disruptions in service delivery. This article delves into the best API monitoring tools available in 2024, encompassing both proprietary and open-source options, to assist you in selecting the most suitable solution for your business requirements.

Introducing The eBPF Agent: A New, No-Code Approach for Cloud-Native Observability

Microservices architecture has become a dominant approach for building scalable, resilient, and flexible applications. However, monitoring these microservices presents unique challenges due to their distributed nature, fixed or limited resources, enterprise scale, and the dynamic nature of environments, such as Kubernetes clusters. The result is that in-process application agents often introduce significant overhead because they rely on intrusive instrumentation and frequent polling.

Broadcom's Vision for Network Observability

The performance monitoring industry has been using the word “observability” to a lot of different ends lately. While the trend towards more visibility into services is a good one, it’s also based on a need we see from customers on a day-to-day basis. The need to take back control of network visibility is strong in the face of complexity that has been rapidly increasing for years.

Is OpenTelemetry Open for Business? September 2024 Update

One of the things about OpenTelemetry that’s easy to miss if you’re not spending the whole day in the ins and outs of the project is just how much stuff it can do—but that’s what I’m here for! Today, I want to go through the project and give you a guide to the various parts of OpenTelemetry, how mature they are, and what you can expect over the next six months or so. I ranked these elements by relative maturity across the entire project.

Introducing Ingest Guard - A Game-Changer for Observability Cost Control

It’s day 1 of SigNoz Launch Week 2.0, and we’re releasing Ingest Guard, a feature that will help platform and finops teams have granular control over data ingestion and observability costs. At SigNoz, we are constantly evolving to meet the needs of modern engineering teams, and this launch week, we're excited to introduce a highly anticipated feature—Ingest Guard.

Introducing Ingest Guard: A Game-Changer for Observability Cost Control

Ingest Guard is a feature that will help platform and finops teams have granular control over data ingestion and observability costs. This new addition to our platform is designed to enhance security, provide better cost control, and offer a streamlined approach to managing observability data.

Kibana vs Grafana - Comparison for Advanced Monitoring and Observability [2024 Guide]

Kibana and Grafana are the leading options when selecting a tool for observability and monitoring in cloud environments. This guide explores their differences in depth to help you select the most suitable option for your requirements. Understanding these tools is crucial for effective system monitoring, whether managing a small startup or a large enterprise.

Beyond Metrics: The Power of eBPF for Deep System Understanding

Discover how eBPF can provide unparalleled visibility into your Kubernetes clusters. Watch the full webinar: "Zero-Instrumentation Observability with eBPF" with Peter Zaitsev.

Beyond Profiling: The Importance of Runqueue Latency

Get tips on choosing the right eBPF-based tool for your Kubernetes environment. Watch the full webinar, "Zero-Instrumentation Observability with eBPF", and learn from Peter Zaitsev.

Open Source Alternatives to Tracealyzer

Tracealyzer is a popular tool for visualizing and analyzing the execution of real-time systems, but its price tag can be a barrier for some developers. This guide explores powerful open-source alternatives that provide similar functionality for free, helping you choose the right tool for your embedded systems projects.

Best Practices for Multi-Cloud Observability

If The Notorious BIG – the artist behind the iconic song "Mo Money Mo Problems" – had been an IT operations engineer, he might instead have labeled his hit "Mo Clouds Mo Problems." Why? Because the more clouds you have to manage and monitor, the more problems you're likely to run into.

How to Tail Docker Logs - Detailed Guide

Managing Docker container logs is essential for debugging and monitoring application performance. Tailing Docker logs allows for real-time insights, quick issue resolution, and optimized performance. This guide focuses on efficient methods for tailing Docker logs, with clear examples and command options to streamline log management.
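As a rough illustration of what tailing looks like in practice, the sketch below builds a `docker logs` invocation and streams its output from Python. It is not taken from the guide itself. The flags used (`--tail`, `--follow`, `--since`, `--timestamps`) are standard `docker logs` options, while the function names and the container name `web` are hypothetical.

```python
import subprocess
from typing import Iterator, List, Optional


def docker_logs_command(
    container: str,
    tail: int = 100,
    follow: bool = True,
    since: Optional[str] = None,
    timestamps: bool = False,
) -> List[str]:
    """Build the argument list for a `docker logs` call (hypothetical helper)."""
    cmd = ["docker", "logs", "--tail", str(tail)]
    if follow:
        cmd.append("--follow")          # keep streaming new lines
    if since is not None:
        cmd += ["--since", since]       # e.g. "10m" or an RFC 3339 timestamp
    if timestamps:
        cmd.append("--timestamps")      # prefix each line with its timestamp
    cmd.append(container)
    return cmd


def stream_logs(container: str, **options) -> Iterator[str]:
    """Yield log lines as they arrive.

    Assumes a local Docker CLI and an existing container; stderr is merged
    into stdout so both container streams are seen.
    """
    cmd = docker_logs_command(container, **options)
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # "web" is a placeholder container name.
    print(" ".join(docker_logs_command("web", tail=50, since="10m")))
```

Keeping the command construction separate from the streaming loop makes the options easy to check without a running Docker daemon.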

Is it Time to Version Observability? Signs Point to Yes

In 2016, we at Honeycomb first borrowed the term “observability” from the Wikipedia entry for control systems observability, where it is a measure of your ability to understand internal system states just by observing its outputs. We then spent a couple of years trying to work out how that definition might apply to software systems. Many Twitter threads, podcasts, blog posts, and lengthy laundry lists of technical criteria emerged from that work, including a whole ass book.

Zero-instrumentation observability based on eBPF

Are you struggling to achieve comprehensive system observability without the burden of instrumentation? Join Peter Zaitsev for a webinar that will revolutionize your approach. Discover how eBPF, a powerful technology, can provide zero-instrumentation observability.

Guide to Crontab Logs - How to Find and Read Crontab Logs

Crontab logs are records of scheduled tasks (or "cron jobs") that are executed by the cron daemon on Unix-like operating systems such as Linux. These logs provide details about the tasks that have been run, when they were executed, whether they completed successfully, and any errors or issues that occurred during their execution. This detailed guide will cover all aspects of crontab logs, from fundamental concepts to advanced strategies for optimization.
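To make the contents of such a log entry concrete, here is a minimal parsing sketch. It assumes the traditional syslog line format that cron commonly produces (for example in `/var/log/syslog` on Debian and Ubuntu, or `/var/log/cron` on RHEL-family systems); systems configured with other timestamp formats would need a different pattern. The sample line, host name, and function name are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed format: "Mon DD HH:MM:SS host CRON[pid]: (user) CMD (command)"
CRON_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<daemon>CROND?)\[(?P<pid>\d+)\]:\s"
    r"\((?P<user>[^)]+)\)\s"
    r"CMD\s\((?P<command>.*)\)$"
)


def parse_cron_line(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of a cron CMD entry, or None if the line does not match."""
    match = CRON_LINE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = "Sep 12 03:15:01 myhost CRON[12345]: (root) CMD (/usr/local/bin/backup.sh --full)"
    print(parse_cron_line(sample))
```

Lines that are not command executions, such as session open and close messages, return `None` and can simply be skipped.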

Strategies For Reducing Observability Costs With OpenTelemetry

Keeping operations smooth and safe now relies on observability. But as the volume of data to track grows, so do the costs. This makes it hard for companies to balance how well things are running against their budgets. OpenTelemetry can help by providing a standard way to collect and process all that data. We're going to share how OpenTelemetry can save you money on observability and why having too much data can be costly.

Top 5 New Relic Competitors & Alternatives in 2024 [Including Open-Source]

While New Relic has long been a popular choice for Application Performance Monitoring (APM), the tech landscape has brought forth several compelling alternatives. This guide provides an in-depth look at the top New Relic competitors and alternatives including open-source, comparing their features, strengths, and use cases to help you make an informed decision for your organization's needs.

The Evolution of Engineering and the Role of Observability 2.0 in Shaping the Future

Engineering has come a long way since the days of delivering discrete, point-in-time products that were often packaged on a CD and shipped to customers. The days of physical media and long development cycles are long gone. The advent of cloud computing and the rise of Software-as-a-Service (SaaS) transformed the landscape, creating a new model of continuous development and service delivery. This shift has not only revolutionized how software is developed, but has also redefined the engineer’s role.

Centralized Observability on Kubernetes with SigNoz

SigNoz is an open-source alternative to Datadog, New Relic, and similar tools, backed by Y Combinator. It helps developers monitor applications and troubleshoot problems in their deployed applications. SigNoz uses distributed tracing to gain visibility into your software stack. If you need any clarification or find something missing, feel free to raise a GitHub issue with the label documentation or reach out to us at the community Slack channel.

Enhance digital resilience through observability

As digital demands grow, so does the pressing need to move to AI and cloud integrations. This is the case for SMEs in Australia that wish to boost agility and responsiveness. These integrations often come with a package of challenges, including increasing costs, security risks, and scalability issues. A recent white paper from ManageEngine addresses this.

How Data Observability is Transforming Modern Enterprise

Modern enterprises are more dependent than ever on data. That's why it's more important than ever for organizations to ensure that their data is accurate, reliable, and easily accessible. Data observability is a modern method that helps achieve this. It involves real-time monitoring of data to detect unusual patterns. By doing so, it ensures data quality and reliability, which boosts operational efficiency and governance.

Getting Started With Refinery: Rules File Template

Sampling is a necessity for applications at scale. We at Honeycomb sample our data through the use of our Refinery tool, and we recommend that you do too. But how do you get started? Do you simply set a rate for all data and a handful of drop and keep rules, or is there more to it? What do these rules even mean, and how do you implement them? To answer these questions, let’s look at a rules file template that we use for customers when first trying out Refinery.
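The general idea of a default sample rate combined with a few drop and keep rules can be sketched in a few lines. This is only a conceptual illustration and not Refinery's rules file format or implementation: the `Rule` class, the `keep_trace` function, and the field names (`http.route`, `status_code`) are all hypothetical. The sketch assumes that the first matching rule wins and that the keep decision is derived from a hash of the trace ID, so every span in a trace gets the same answer.

```python
import hashlib
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical sampling rule: a predicate plus an action."""
    name: str
    matches: Callable[[Dict[str, Any]], bool]
    sample_rate: int = 1   # keep 1 in N matching traces; 1 keeps everything
    drop: bool = False     # True discards every matching trace


def keep_trace(
    trace_id: str,
    fields: Dict[str, Any],
    rules: List[Rule],
    default_rate: int,
) -> bool:
    """Decide whether to keep a trace; the first matching rule wins."""
    rate = default_rate
    for rule in rules:
        if rule.matches(fields):
            if rule.drop:
                return False
            rate = rule.sample_rate
            break
    if rate <= 1:
        return True
    # Deterministic choice: hash the trace ID and keep roughly 1 in `rate`.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % rate == 0


# Example rule set: drop health checks, keep all server errors,
# and sample everything else at the default rate.
RULES = [
    Rule("drop health checks", lambda f: f.get("http.route") == "/healthz", drop=True),
    Rule("keep errors", lambda f: f.get("status_code", 0) >= 500, sample_rate=1),
]

if __name__ == "__main__":
    print(keep_trace("trace-1", {"status_code": 503}, RULES, default_rate=10))
```

Hashing the trace ID rather than calling a random number generator keeps the decision repeatable, which matters when spans from one trace are evaluated separately.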

Observability vs Monitoring [Understanding the Key Differences in 2024]

When systems fail, it's not just a technical hiccup – it's a business problem. Downtime means unhappy customers and lost revenue. That's why teams need effective ways to spot issues fast and fix them even faster. This is where monitoring and observability come into play. Monitoring and observability are two key approaches to keeping your systems running smoothly. Monitoring is like your system's alarm bell – it tells you when something's wrong.

Strategies for Lowering Observability Costs

Learn how to cut IT observability costs with OpenTelemetry. We'll cover ways to streamline data collection, reduce hidden expenses, and optimize data management. Discover practical tips for handling telemetry data efficiently, avoiding vendor lock-in, and improving system performance. Watch this video for actionable insights and real-world examples of using OpenTelemetry to manage costs effectively.

Lumigo Introduces AI to Simplify Observability Workflows

Lumigo is expanding its troubleshooting and observability platform with cutting-edge AI-powered tooling, now available in beta, which will provide developers and DevOps teams with the fastest and most cost-efficient way to debug and observe complex microservices. AI is quickly reshaping the technology landscape. However, observability tools have been slow to find ways to leverage AI in a fashion that provides tangible value.

How Australian local governments can use cloud-native observability

Australian city councils are the command centers of every city, ensuring essential services are delivered with reliability and speed while being available for citizens' queries and requests. Though the IT infrastructure of Australian city councils has predominantly been on-premises, the last decade has seen a substantial digital shift with increasing cloud adoption.

Prometheus vs InfluxDB [Detailed Technical Comparison for 2024]

Prometheus and InfluxDB represent two distinct approaches to time-series data management and system monitoring. As organizations grapple with increasing data volumes and complex infrastructures, choosing the right tool becomes crucial. This analysis dives deep into the technical nuances of Prometheus and InfluxDB, examining their architectures, data models, and performance characteristics.