Monthly Archive

8 Years of Building Obkio: From Network Monitoring to Observability & Network Diagnostics

Feb 27, 2026 By Alyssa Lamberti In Obkio

In 2016, Obkio was just an idea, but it was an idea born from a real problem. Before writing a single line of code, we conducted a market audit to understand why Network Performance Monitoring solutions weren't more mature. We interviewed banks, manufacturing companies, and service providers, and the answer was unanimous: the NPM tools on the market were too complex, and most businesses simply didn't have the internal resources to dedicate full-time to managing them.

Data Observability, AI Guard, Feature Flags, Ambassador program, and more | This Month in Datadog

Feb 27, 2026 By Datadog In Datadog

See how you can ensure trust across the data life cycle in February’s episode of This Month in Datadog. Join us for a spotlight of Datadog Data Observability, which enables you to detect data quality and pipeline issues early, as well as remediate those issues with end-to-end lineage. Plus, we cover: protecting agentic AI applications from real-time threats with Datadog AI Guard; staying up to date and reducing steps to collaborate with five new Incident Management releases; and releasing software with confidence using Datadog Feature Flags.

The rise of agentic AI in production: Can observability systems run themselves?

Feb 27, 2026 By Grafana Labs Team In Grafana

Sometimes the biggest shifts in technology aren’t about collecting more data — they’re about who (or what) gets to act on it. In this episode of “Grafana’s Big Tent” podcast, host Tom Wilkie, Grafana Labs CTO, is joined by Spiros Xanthos, Founder & CEO of Resolve AI, Manoj Acharya, VP of Engineering for Observability at Grafana Labs, and Cyril Tovena, Principal Engineer on the Grafana Assistant team, to discuss agentic AI in observability.

From RCA to Autonomous Ops: The Future of AI in Observability | Big Tent S3E7

Feb 27, 2026 By Grafana In Grafana

SREs are famously skeptical of AI — so how do you convince them to trust agents in production? In this episode of Grafana’s Big Tent, Tom Wilkie talks with Spiros Xanthos (Resolve AI), Manoj Acharya (Grafana Labs), and Cyril Tovena (Grafana Assistant team) about agent-first observability. They unpack knowledge graphs, LLM reasoning, autonomous debugging, pricing models, and the “Claude Code moment” for observability. Is autonomous production ops closer than we think?

Observability Self-Hosted 2026.1 - Routing Insights

Feb 26, 2026 By solarwindsinc In SolarWinds

SolarWinds Evangelist Chrystal Taylor introduces the new routing insights feature in Observability Self-Hosted 2026.1. This first phase enhancement enriches routing table information with detailed context, including forwarding interface names, VRF data, next hop IPs, and timestamps. The update unifies BGP, OSPF, and EIGRP neighbors in a single dashboard, providing visibility into peer identity, flap counts, health status, and admin states.

Let's make alerting great again

Feb 26, 2026 By Nikolay Sivko In Coroot

No one has time to watch dashboards all day. Alerts exist to tell us when something goes wrong or is starting to go wrong, so we can act early. In theory, it sounds simple. Define a rule, set a threshold, get notified when it is crossed. In practice, it rarely works that smoothly.

Colsubsidio transforms business process monitoring with Elastic Observability

Feb 26, 2026 By Amena Siddiqi In Elastic

Colsubsidio is one of the largest and most representative family compensation funds in Colombia. The organization manages and delivers essential social services to millions of users through a broad network spanning health, education, subsidies, recreation, tourism, credit, housing, pharmacies, retail supply, culture, and labor welfare.

Lightrun Launches Industry's First AI SRE With Live Dynamic Runtime Context

Feb 25, 2026 By Lightrun In Lightrun

Autonomously Remediates Software Issues, Generates Missing Runtime Evidence on Demand, and Validates Hypotheses Against Live Execution from Code to Production.

Incident Report: Exercises, Cleanups, and Evacuations

Feb 25, 2026 By Fred Hebert In Honeycomb

Every year, Honeycomb runs disaster recovery scenarios in multiple environments, including in production. Although each of our instances runs in a single region, on at least three Availability Zones (AZs), we have multiple plans for partial regional failures, and particularly, zonal failures. One of these tests was run on December 5th, and after its successful completion came its cleanup steps.

Observability Self-Hosted 2026.1 - Server Configuration Comparisons

Feb 25, 2026 By solarwindsinc In SolarWinds

In this video, SolarWinds Evangelist Chrystal Taylor introduces server configuration comparisons, a new feature in Observability Self-Hosted 2026.1 and Server Configuration Monitor 2026.1. The key highlight is the ability to compare server configurations side by side, enabling users to identify differences in configuration files between nodes or against a defined ideal state. This new functionality aims to help users monitor configuration drift.

Observability Self-Hosted 2026.1 - Additional Cloud Support

Feb 24, 2026 By solarwindsinc In SolarWinds

SolarWinds Evangelist Chrystal Taylor demonstrates the new cloud entity support features in Observability Self-Hosted version 2026.1. The update adds monitoring capabilities for MySQL and PostgreSQL databases on Google Cloud Platform, GCP load balancers, Azure functions, AWS Elastic Kubernetes Service, and AWS Lambda functions. She provides a guided walkthrough of the dashboard interface, showing how users can monitor various metrics including database performance, network traffic, latency, function execution counts, system usage, and costs across different cloud platforms.

Using Core Web Vitals in Honeycomb Frontend Telemetry

Feb 23, 2026 By Ken Rimple In Honeycomb

Google's Core Web Vitals (CWVs) measurements have been used by web administrators and SREs to review frontend application performance metrics, and have been factored into Google's page rankings since 2021. They are also used in Google Analytics, which crawls websites and evaluates performance metrics over a period of multiple days, and with various frontends (desktop web, mobile web, etc.) to establish how well a website performs in production.

A 4-Month Bug Fixed in <10 Minutes with Olly

Feb 23, 2026 By Chris Cooney In Coralogix

In today’s highly interconnected systems, the subtle relationships between services are rarely obvious. Modern, complex architectures generate telemetry that functions less as “flashing signs” and more as faint “breadcrumbs” to be followed across a vast network of signals. In 2025, about two-thirds of outages involved third-party systems like cloud platforms and APIs.

The Next Era of Observability: Founders' Reflections - Additional Q&A

Feb 19, 2026 By Rox Williams In Honeycomb

What happens when the people who helped define observability take a hard look at AI? That’s what Honeycomb co-founders Christine Yen (CEO) and Charity Majors (CTO) dug into during this webinar, starting with the early days of observability (back when it wasn’t even a category yet).

Kiro Can Now Use Lightrun via MCP

Feb 19, 2026 By Lightrun Team In Lightrun

AI code assistants transformed how software is written. They did not transform how it fails. Today, we’re announcing a new MCP integration between Lightrun and Kiro. Kiro now gains live runtime visibility through the Lightrun MCP, grounding AI-assisted development in how code actually behaves at runtime. Kiro, the AI coding assistant from the teams at AWS, is built for velocity and intuition. It helps teams move from specification to production faster by turning intent into working code.

How to Make AI-Generated Code Reliable with Runtime Context

Feb 19, 2026 By Lightrun Team In Lightrun

AI coding assistants like Cursor and Claude Code are driving massive productivity gains, yet they have introduced a critical validation gap in the software delivery lifecycle. While these tools excel at generating syntax, they lack visibility into live production environments. This article explains how Runtime Context, the missing nervous system of AI development, secures production by moving from probabilistic guessing to deterministic, live code validation.

Teaching AI How to Refinery

Feb 17, 2026 By Tyler Helmuth In Honeycomb

At the beginning of February, we released v3.1 of Refinery, our advanced, tail-based sampling solution. The new version comes with more performance enhancements, bug fixes, and a few new pieces of telemetry. In tandem with the 3.1 release, we also released a new tool for our MCP server which helps your AIs understand Refinery, and how Honeycomb handles sampling.

Introducing "Explain Flame Graph": Stop Fighting Fires and Start Explaining Them

Feb 16, 2026 By Jonny Steiner In Coralogix

In a modern observability deployment, it’s simple to get data that helps you understand where your system is failing. However, when we try to understand why, the answer is often buried beneath a mound of stack traces. For many developers, attempting to interpret a flame graph by manually calculating self-time (the resources consumed by the function itself) versus child-frame latency (the time spent waiting on called sub-functions) is both confusing and time-consuming.

Sovereign observability: How UAE data residency powers resilient digital economies

Feb 13, 2026 By Ramkumar Ramaswamy In Site24x7

Cloud observability is a must for IT teams operating in modern digital economies. It allows administrators to see inside complex systems, understand how each component behaves under real conditions, and act before users or regulators feel the impact. In simple terms, observability transforms digital infrastructure from a black box into a transparent, accountable, and resilient system.

Uptrace Errors & Logs Tutorial: Capture Stacktraces, Context, and Traces in One Place

Feb 12, 2026 By Uptrace In Uptrace

Every error tells a story, and Uptrace helps you see the full picture. In this tutorial, you’ll learn how to use Uptrace to capture errors, logs, stacktraces, and request context in a single observability platform. See how errors automatically link to traces, understand exactly what happened, and debug issues faster with rich attributes, user data, and performance impact. Understand not just what broke, but who it affected and why, and fix problems with confidence using Uptrace.

Uptrace Tutorial: Dashboards, Percentiles, Heatmaps & OpenTelemetry Metrics

Feb 12, 2026 By Uptrace In Uptrace

Learn how to use Uptrace to measure what truly matters in your applications using percentiles, heatmaps, and histograms, then turn that data into dashboards that answer questions before they’re even asked. Whether you’re setting up observability for the first time or replacing expensive monitoring tools, this guide shows how Uptrace helps you understand performance, reliability, and user experience, all in one place.

End-to-End Tracing with Uptrace: Follow Any Request Across Your Entire System

Feb 12, 2026 By Uptrace In Uptrace

Stop guessing where requests slow down. With Uptrace, you can follow any request across your entire system and instantly see performance bottlenecks, errors, and latency sources. Build real observability, not just dashboards.

Uptrace Alerts in 10 Minutes: Metrics, Errors, Slack & Telegram

Feb 12, 2026 By Uptrace In Uptrace

Learn how to monitor application metrics, track errors, and configure real-time alert notifications in Uptrace. This step-by-step tutorial is perfect for developers, DevOps engineers, and teams looking for simple, powerful observability.

Happy Birthday to Us: Honeycomb 10 Year Manifesto, Part 1

Feb 11, 2026 By Charity Majors In Honeycomb

Christine and I started Honeycomb in 2016, which means it’s been ten years. Christine, a developer, and I, an operations engineer, were both profoundly unhappy with the state of the art in monitoring and logging tools. The tools we had used at Facebook didn’t spray our signals around to a bunch of siloed-off pillars. They consolidated as much context as possible so we could properly explore it, the way every other non-software engineering team already takes for granted.

ilert now supports a native WhaTap integration

Feb 11, 2026 By Sirine Karray In iLert

ilert now supports a native WhaTap integration, connecting AI-native observability with AI-first incident management in a seamless workflow. This integration allows DevOps, SRE, and IT teams to move instantly from detection to resolution – cutting through alert noise, improving coordination, and dramatically reducing MTTR in even the most complex IT environments.

The Architecture Shift Powering Network Observability

Feb 11, 2026 By Idan Green In Broadcom

If you work in network operations, you know that the only constant is the increasing complexity of the infrastructure you manage. The days of installing a monolithic software package on a single bare-metal server and letting it hum along for years are largely behind you. The software industry has largely shifted toward cloud-native architectures, microservices, and containerization. While these shifts promise agility and scalability, they also introduce significant operational complexity.

Kubernetes Network Observability: Comparing Calico, Cilium, Retina, and Netobserv

Feb 11, 2026 By Reza Ramezanpour In Tigera

Calico, Cilium, Retina, and Netobserv: Which Observability Tool is Right for Your Kubernetes Cluster? Network observability is a tale as old as the OSI model itself, and anyone who has managed a network or even a Kubernetes cluster knows the feeling: a service suddenly can’t reach its dependency, a pod is mysteriously offline, and the Slack alerts start rolling in. Investigating network connectivity issues in these complex, distributed environments can be incredibly time-consuming.

Why distributed observability is straining and what new research reveals

Feb 11, 2026 By OpsMatters In OpsMatters

Distributed systems quietly run much of today's digital world. People expect these systems to work reliably across regions and time zones for everything from money transfers to streaming platforms and AI-driven workloads. As organisations use more microservices, containers, and event-driven architectures, observability has become the main way for teams to understand what is happening in production.

Heartbeat behind the metrics | Muraleedharan on support, scale, and seeing the product in the wild

Feb 9, 2026 By ManageEngine Site24x7 In Site24x7

What does observability look like when you’re responsible for customers at scale? In this episode of Heartbeat Behind the Metrics, Muraleedharan Sadhasivam, Head of Customer Success, talks about his 15-year journey at ManageEngine and the perspective you only get from being close to customers every day. He shares why custom dashboards matter so much, and why AppLogs is a feature he wishes more users explored to complete the MELT story. From querying logs to turning them into alerts and dashboards, he explains how real insights start when data is brought together.

Go Context timeouts can be harmful

Feb 7, 2026 By Vladimir Mihailenco In Uptrace

You should probably avoid context.WithTimeout or context.WithDeadline in code that makes network calls. Here is why.

Monitoring cache stats using OpenTelemetry Go Metrics

Feb 7, 2026 By Vladimir Mihailenco In Uptrace

This article explains how to use the opentelemetry-go Metrics API to collect metrics such as go-redis/cache stats.

How eBPF Improves Open Source Observability

Feb 6, 2026 By Coroot In Coroot

Try it open source on your system. Learn how tools can make gathering and making sense of observability data instant and painless with co-founder Peter Zaitsev.

How we built Grafana Assistant - a conversation about AI development for observability

Feb 6, 2026 By Grafana In Grafana

This conversation with Grafana Labs engineers, Mat Ryer, Cyril Tovena and Sven Großmann, dives deep into the engineering behind Grafana Assistant, exploring how agentic AI is transforming the observability landscape. From hackathon origins to sophisticated backend agents, the team shares candid lessons on building, scaling, and refining AI tools for engineers.

Kiro Can Now Reason With Lightrun's Live Runtime Context

Feb 5, 2026 By Gideon Freud In Lightrun

AI code generation is fast. Making it reliable requires runtime context. Today, Kiro gains live runtime visibility with the Lightrun MCP. This grounds AI-assisted development in how code actually behaves at runtime. Kiro, the AI coding assistant from the teams at AWS, is built for velocity and intuition. It moves from specification to production with speed and structure, helping teams turn intent into working code. But until now, like every AI coding assistant, Kiro had a major blind spot.

How Honeycomb Supercharges OpenTelemetry for AI

Feb 5, 2026 By Fahim Zaman In Honeycomb

It has become common knowledge that the nature of software development has changed as AI-code generation and agent-based features gain adoption. In perhaps a more subtle shift, the fundamentals of software instrumentation are changing too. As OpenTelemetry becomes the standard instrumentation layer across enterprises, with thousands of developers (many from Honeycomb) actively contributing to it, the nature of the telemetry data captured itself is evolving to meet the growing demand for rich context.

Top 9 Observability Tools for AI-Assisted Development & Deployment

Feb 5, 2026 By OpsMatters In OpsMatters

AI-assisted development is rapidly becoming the default way software is built. Code generation, AI copilots, agentic pull requests, and automated refactoring are now embedded directly into engineering workflows. While this shift dramatically increases delivery speed, it also introduces a new operational reality: production systems are changing faster than humans can fully reason about them. This is where observability becomes mission-critical.

Observability trends for 2026 (Part 2): GenAI and OpenTelemetry reshape the landscape

Feb 4, 2026 By David Hope In Elastic

Over the course of my 20 years as a developer, SRE, and now observability product leader, software has typically progressed at a good pace. But now, the emergence of two transformative technologies is fundamentally reshaping enterprise observability: generative AI (GenAI) and OpenTelemetry (OTel). We surveyed over 500 IT decision-makers for a new report: The Landscape of Observability in 2026: Balancing Cost and Innovation.

What's New at Cribl 4.16: On release days, we wear teal.

Feb 4, 2026 By Cribl In Cribl

On release days, we wear teal, y'all! Check out the fun and exciting new features from Cribl releases on a monthly (:fingers-crossed:) basis. Here's what's new in Cribl 4.16.

30+ Top Observability Tools to Monitor Websites and Applications [2026 Updated]

Feb 3, 2026 By Janani In Atatus

By incorporating observability tools into your stack, you can better understand how your complex infrastructure operates, reduce downtime, and empower developers to identify and fix problems quickly. However, it now takes considerably more work, time, and money to build the best observability tools for your infrastructure and applications. According to a Splunk survey, over half of the firms polled employ eight or more observability tools.

OpenTelemetry Instrumentation Best Practices for Microservices Observability

Feb 3, 2026 By Sematext In Sematext

OpenTelemetry instrumentation is the foundation of modern microservices observability, but getting it right in production requires more than just enabling auto-instrumentation. This guide covers production-tested OpenTelemetry best practices that help engineering teams achieve reliable distributed tracing, control observability costs, and extract maximum value from their telemetry data.

Agentic Observability - The Top Observability Trends in 2026

Feb 2, 2026 By Splunk In Splunk

Learn how autonomous agents are using real-time observability telemetry data to diagnose, fix and verify their own work.

Observability vs Monitoring: Getting a Full Picture of the Environment

Feb 2, 2026 By Jeff Darrington In Graylog

Driving down the highway, you usually glance intermittently at your speedometer to ensure that you stay within the speed limit, or whatever window above the speed limit you’re willing to drive. While monitoring your speed mitigates the risk of a ticket, you still need to look out for various threats on the road, like cars going through stop signs. By observing your surroundings, you take in real-time information that can help prevent a crash.
