
February 2024

Negotiating Priorities Around Incident Investigations

There are countless challenges around incident investigations and reports. Aside from sensitive situations revolving around blame and corrections, tricky problems come up when having discussions with multiple stakeholders. The problems I’ll explore in this blog—from the SRE perspective—are about time pressures (when to ship the investigation) and the type of report people expect.

Decoding .NET8: Unveiling Cloud-Native Observability

The .NET programming language is taking cloud native deployment and observability seriously, most notably with the recent announcement of the .NET Aspire stack, unveiled at .NET Conf 2023. In the latest episode of OpenObservability Talks, we reviewed the journey to making .NET a “by default, out of the box observable platform,” as ASP.NET Core creator David Fowler put it.

Critical Automation: Anomaly Detection for Application Observability

There’s no debate — in our increasingly AI-driven, lean and data-heavy world, automating key tasks to increase effectiveness and efficiency is the ultimate name of the game. No matter what job you hold today, you’re likely being pushed to not only do more with less, but also perform your work with a tighter focus on specific outcomes and SLOs.

Much Ado About OpenTelemetry

There is so much good work that OpenTelemetry has done in the software industry, specifically around the domain of observability, in the last five years. Bringing users and vendors together to define the future of telemetry? Check! Unify logs, traces, and metrics under a completely vendor-neutral API? Check! Deprecate other standards by bringing their collaborators to the table to ensure their use cases are met? CHECK!

FOSDEM - Costa Tsaousis: Netdata Open Source Distributed Observability Pipeline Journey & Challenges

Abstract: Netdata is a powerful open-source, distributed observability pipeline designed to provide higher fidelity, easier scalability, and a lower cost of ownership compared to traditional monitoring solutions. This presentation will offer an in-depth overview of the journey we've undertaken in building Netdata, highlighting the challenges we've faced and the innovative solutions we've developed to address them.

OpsRamp and the Rise of Observability

As IT environments become more complex, cloud-based and divided across microservices, containers, and serverless computing, opportunities to optimise efficiency and improve performance open up. From cost and capacity savings to improving the speed, responsiveness and reliability of apps, it’s clear businesses are increasingly making the connection between IT and commercial outcomes.

The Next Generation of Papertrail is Here!

We are excited to unveil the next generation of SolarWinds® Papertrail™: SolarWinds Observability® logging. More powerful and faster than ever, the next generation of Papertrail, SolarWinds Observability logging, aggregates log data from applications, services, infrastructure, databases, and network devices across both cloud-based and on-premises systems.

How SOCAR is driving visibility using Sumo Logic

SOCAR needed an observability solution that could parse logs, monitor ephemeral infrastructure in Kubernetes and ensure high visibility into their application, all at a price that fit their budget. Sumo Logic checked all those boxes and has already boosted team collaboration. Learn more about their purchase decision and how they're already making unexpected discoveries.

APM From a Developer's Perspective

In twenty years of software development, I did not have the privilege of being on call, of tending to my software in production. I’ve never understood what “APM” means. Anybody can tell me what it stands for—Application Performance Monitoring (or sometimes, the M means Management)—but what does it mean? What do people use APM for?

Flight to Success: Birdie's DevOps Evolution Fueled by Observability Insights

Birdie wanted to uplevel observability to a platform that would provide meaningful insights for application performance and debugging. Ensuring customers can provide seamless and timely care to in-home patients stands as a top priority for Birdie, and the development team takes pride in building and maintaining a high-quality platform distinguished by its reliability and responsiveness.

Capturing Security and Observability Data From Oracle Cloud

A couple of years ago, I wrote another blog on how Oracle Cloud Infrastructure (OCI) Object Storage can be used as a data lake since it has an Amazon S3-compliant API. Since then, I’ve also fielded several requests to capture logs from OCI Services and send them through Cribl Stream for optimization and routing to multiple destinations. There are two primary methods to achieve this.

How the open source Caddy server uses Grafana Cloud for full-stack observability

Mohammed Al Sahaf serves as Technical Product Manager at Samsung Electronics Saudi Arabia. Outside his day job, he serves with the Caddy team to tackle the web of problems facing web servers in the third millennium. Mohammed is the author of Kadeessh, formerly caddy-ssh, and the maintainer of numerous Caddy modules. When he isn’t programming, he is trying to catch up on life and sleep with the help of coffee.

Modern Observability for Data Unification for Business Insights Is Here

Today, we’re adding to the groundwork we’ve established to provide enterprise organizations with a modern approach to data unification to improve insights. Our True North unifies all the many layers of data into a single stream to provide better insights so you can make better decisions and adjustments and bring value to the organization.

Three Properties of Data to Make LLMs Awesome

This post first appeared on Phillip's personal blog. Back in May 2023, I helped launch my first bona fide feature that uses LLMs in production. It was difficult in lots of different ways, but one thing I didn’t elaborate on in several blog posts was how lucky I was to have a coherent way to get the data I needed to make the feature useful for users.

Building Your Own Observability Solution vs Implementing a SaaS Solution

Observability is a key component of modern applications, especially highly complex ones with multiple containers, cloud infrastructure, and numerous data sources. You can implement observability in two ways: build your own observability solution or use a SaaS solution like Coralogix.

What Is Network Observability? - 5 Best Platforms for Observability

In today’s world, every business relies on its network infrastructure to achieve its goals. It’s, therefore, critical to monitor your network infrastructure and be aware of how efficient it is. You can achieve this through network observability.

Generative AI in Observability: A Trip or a Trap?

Generative AI, or Generative Artificial Intelligence, in its simplest form means being capable of generating text, images, or other data using generative models, mostly in response to prompts. You have probably all heard of OpenAI’s ChatGPT. It is generative AI in action. Essentially, what do you do in ChatGPT? You type in a topic or a question, and the robot replies with structured answers.

Webinar: Cloud security and observability: When integrity and availability meet

The bad news: It’s no wonder so many organizations find it near impossible to get control of — and ensure — a secure, reliable network. The good news: Technology leaders from Prisma Cloud and StackState show you how you can significantly enhance the integrity and availability of your cloud environment — with just a few lines of code or simple clicks.

What Is Application Performance Monitoring?

Every business is a software business. And by software, we don’t mean code—we mean running software serving customers in production. Those customers may be internal to the company, they may pay you money, or they may represent attention that increases ad revenue—either way, making them happy is your business. And your fast, reliable software makes them happy. Application performance monitoring, also known as APM, represents the difference between code and running software.

OpenTelemetry: 3 questions to ask before choosing an observability solution

As OpenTelemetry rises in popularity, more organizations are implementing, or planning to implement, the open source project to monitor their applications — and, meanwhile, more vendors are offering OpenTelemetry support. In fact, a quick Google search for “OpenTelemetry support” shows results ranging from legacy APM vendors to newer, cloud native solutions like Grafana Cloud.

Cisco Live EMEA '24 - Let's talk Full-Stack Observability!

Listen in on this conversation with Ronak Desai, Cisco AppDynamics SVP and General Manager for Full-Stack Observability, where he discusses recent innovations such as advancements in Digital Experience Monitoring (DEM) and why DEM is important, the Cisco AI Assistant and how Cisco leverages AI, and the Cisco Observability Platform with its huge number of developer-led modules.

The Role of Observability in Telecoms

The rapid growth of 5G technology and the expansion of the telecoms industry have created the need for these organizations to make effective data-driven decisions that enable the future profitability of their companies. This raises the challenge of analyzing data from various sources across complex networks to derive insights and ultimately inform decision making.

Safer Client-Side Instrumentation with Honeycomb's Ingest-Only API Keys

We're delighted to introduce our new Ingest API Keys, a significant step toward enabling all Honeycomb customers to manage their observability complexity simply, efficiently, and securely. Ingest Keys are currently available for Environment & Services customers, with Classic support and programmatic key management capabilities under development and coming soon!

Why ngrok Prioritized a Datadog Integration for Streamlined Monitoring of HTTP Events

ngrok delivers instant ingress to your applications in any cloud, private network, or device, with authentication, load balancing, and other critical controls, using its global points of presence. Hear from Chad Tindel, Field CTO & VP WW Solution Architecture, on why Datadog was their most requested integration and how it provides an easy pathway to ship application and traffic logs into one unified observability platform.

Honeycomb CCP Games Case Study

Imagine a universe in which a massively multiplayer online role-playing game (MMORPG) sets Guinness World Records for the size of its online space battles—and that game is built on 20-year-old code. Well, imagine no more. Welcome to the world of EVE Online, where hundreds of thousands of players interact across 7,800+ star systems and participate in more than one million daily market transactions. As you might guess, updating and maintaining this codebase without interrupting game play could pose quite a challenge.

Data-centric AIOps: The Next Frontier With Observability Pipelines

Data-centric AI is the new frontier in AI, where the models themselves now remain stationary while tools, techniques and engineering practices improve data quality. As Andrew Ng puts it, “Data-centric AI is the discipline of systematically engineering data to build an AI system.”

Advancing Observability Maturity: Core Benefits

One of the major trends in software development in the last decade has been “shifting left”: moving responsibilities that have traditionally been under the operations domain to earlier in the software development life cycle (SDLC). It first came in the form of DevOps, where many software engineering best practices were introduced to the deploy, operate, and monitor phases. Examples include continuous integration and continuous deployment (CI/CD) and Infrastructure as Code (IaC).

Streamlining Cloud Operations by Unifying Security & Observability

Many companies are using cloud technologies to become more agile, scalable, and cost-effective during their digital transformation. However, this change brings new challenges in maintaining the security and performance of applications and infrastructure in the cloud. Security and observability go hand-in-hand.

Why AI is crucial to your hybrid observability strategy: LogicMonitor's latest innovations

At LogicMonitor, we are deeply committed to a mission that goes beyond the conventional: revolutionizing IT monitoring through hybrid observability powered by AI. This ambition is not merely a slogan but the cornerstone of our entire approach. Our LM Envision platform was purposely designed to bring together diverse IT environments under one seamless, integrated experience. Enterprises have complex IT ecosystems.

The Key Role of Cloud Observability in Ensuring Security

The utilization of cloud-based technologies developed to optimize and streamline business operations is far from a novel idea. In fact, research suggests at least 90% of modern organizations currently use cloud platforms and related technologies to oversee essential processes.

Open Source Observability with OpenTelemetry and Checkly

We need to monitor our service's performance, but large closed SaaS options are expensive and complex. OpenTelemetry is the 'wave of the future' for observability, but is it ready for your team? Yes! Join Nočnica to see a demonstration of instrumenting a demo application and learn what OpenTelemetry can do. We'll also add external site monitors with Checkly synthetics checks.

Komodor Joins Forces with Cisco FSO to Elevate Kubernetes Management to New Heights

We at Komodor are excited to announce our groundbreaking integration with Cisco Full-Stack Observability (FSO). This collaboration marks a significant milestone in Kubernetes Continuous Reliability, bringing together the best of both worlds to redefine Kubernetes management.

Improving workflow performance through a unified observability experience

The unified observability experience from Cisco delivers seamless observability across your hybrid and cloud native applications. Focus on lower Mean Time to Resolution (MTTR)! In today’s press release, Cisco Unveils New Innovations on the Cisco Observability Platform, we announced a host of exciting innovations.

Data Sovereignty and OpenTelemetry

In today’s economic and regulatory environment, data sovereignty is increasingly top of mind for observability teams. The rules and regulations surrounding telemetry data can often be challenging to interpret, leaving many teams in the dark about what kind of data they can capture, how long it can be stored, and where it has to reside. In the past, addressing these issues at scale was a costly endeavor.

Combining tracing and profiling for enhanced observability: Introducing Span Profiles

In today’s complex data landscape, continuous profiling has become essential for detailed insights into application resource usage. Grafana Labs is now advancing this field with the introduction of Span Profiles in Grafana 10.3. The Span Profiles feature represents a major shift in profiling methodology, enabling deeper analysis of both tracing and profiling data. Traditional continuous profiling provides a system-wide view over fixed intervals.

What is the Benefit of Including Security with Your Observability Strategy?

Observability strategies are needed to ensure stable and performant applications, especially when complex distributed environments back them. Large volumes of observability data are collected to support automatic insights into these areas of applications. Logs, metrics, and traces are the three pillars of observability that feed these insights. Security data is often isolated instead of combined with data collected by existing observability tools.

Kubernetes 2024: Challenges and solutions

Kubernetes has become the world's leading container orchestration platform, aiding small-scale to large-scale businesses in automating, autoscaling, and managing application deployments. Before delving deeper, let's understand why cloud-native solutions like Kubernetes have become the world's—especially organizations'—favorite technology. Creating highly scalable, resilient applications requires flexible infrastructure management.

6 Benefits of an AI-Powered Observability Pipeline

Observability Pipelines have become vital tools for DevOps and Security teams to manage, control, store, route, and optimize telemetry data analyzed by Security Information and Event Management (SIEM), Application Performance Monitoring (APM), and Log management platforms. These teams spend hours every week trying to fit an increasingly large volume of data into the same size box.

Where Does Honeycomb Fit in the Software Development Lifecycle?

“Mommy, where does software come from?” “Software grows in a circle, just like this!” The software development lifecycle (SDLC) is always drawn as a circle. In many places I’ve worked, there’s no discernable connection between “5. Operate” and “1. Plan.” However, at Honeycomb, there is. More on that later.

eBPF: Revolutionizing Observability for DevOps and SRE Teams

Whether you're a system administrator, a developer, or any other DevOps or Site Reliability Engineering (SRE) professional, you know that staying ahead in cloud-native computing is crucial. One way to keep your competitive edge in the technology game is to embrace the benefits of eBPF (Extended Berkeley Packet Filter). On top of advances in security and networking, eBPF-based tools are particularly impacting the observability landscape.
Sponsored Post

Take control of all your Telemetry Data with CloudFabrix Robotic Observability Pipelines

CloudFabrix, the Robotic Data Automation Fabric inventor, announced “Data Observability Pipelines” for dynamic Data Ingestion and automation for any data source and destination. The solution acts as a data management and integration service that uses robotic processes to automate data tasks, such as data integration, data ingestion, cleansing, transformation, and enrichment. Automated data management saves time, improves data quality, and streamlines data workflows.

Avoid Stubbing Your Toe on Telemetry Changes

When you have questions about your software, telemetry data is there for you. Over time, you make friends with your data, learning what queries take you right to the error you want to see, and what graphs reassure you that your software is serving users well. You build up alerts based on those errors. You set business goals as SLOs around those graphs.

Delivering Value with a Flat Budget

Join us for an important conversation with Cribl's Ed Bailey and Jackie McGuire, as we navigate the intricate balance of maximizing organizational value with a constrained budget. In today's challenging economic climate, where maintaining operations often means minimal to no additional spending, adaptive strategies become crucial. This is more than just a best-case scenario; it's a necessary approach for business resilience. Ed and Jackie will share innovative ideas and strategies to help leaders skillfully manage tight budgets while delivering significant value to their organizations.