August 2023

Calculating Sampling's Impact on SLOs and More

What do mall food courts and Honeycomb have in common? We both love sampling! Not only do we recommend it to many of our customers, we do it ourselves. But once Refinery (our tail-based sampling proxy) is set up, what comes next? Since sampling is inherently lossy, it’s good to be sure the organization’s most important measurements aren’t negatively affected.
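
To make the concern concrete, here is a minimal sketch of how a success-ratio measurement might be estimated from sampled events by weighting each kept event by its sample rate. The `Event` type, its field names, and the numbers are hypothetical illustrations, not taken from Honeycomb or Refinery.

```python
from dataclasses import dataclass


@dataclass
class Event:
    ok: bool          # hypothetical field: did the request meet the objective?
    sample_rate: int  # hypothetical field: this kept event stands for N originals


def estimated_success_ratio(events):
    """Estimate the success ratio, weighting each kept event by its sample rate."""
    total = sum(e.sample_rate for e in events)
    if total == 0:
        raise ValueError("no events to estimate from")
    good = sum(e.sample_rate for e in events if e.ok)
    return good / total


def naive_success_ratio(events):
    """Unweighted ratio over kept events; skewed when sample rates differ."""
    if not events:
        raise ValueError("no events to estimate from")
    return sum(1 for e in events if e.ok) / len(events)


# Illustrative data: successes kept 1-in-100, failures kept 1-in-1.
events = [Event(ok=True, sample_rate=100)] * 10 + [Event(ok=False, sample_rate=1)] * 5
weighted = estimated_success_ratio(events)
naive = naive_success_ratio(events)
```

With these made-up numbers the unweighted ratio badly understates success, because failures were kept far more often than successes. The weighted estimate compensates for that difference in sample rates.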

Why Observability Architecture Matters in Modern IT Spaces

Observability architecture and design are becoming more important than ever among all types of IT teams. That’s because core elements of observability architecture are pivotal in ensuring the smooth functioning, reliability and resilience of complex software systems. And observability design can help you achieve operational excellence and deliver exceptional user experiences. In this article, we’ll delve into the vital role of observability design and architecture in IT environments.

Grafana Pyroscope 1.0 release: continuous profiling for a modern open source observability stack

When we launched Pyroscope in 2021, we had one clear goal: Give developers a powerful open source continuous profiling tool for collecting, storing, and analyzing profiling data. Grafana Labs had a similar goal when they released Grafana Phlare, a horizontally scalable, highly available open source profiling solution inspired by databases like Grafana Loki, Grafana Mimir, and Grafana Tempo.

Advanced Monitoring and Observability Tips for Kubernetes Deployments

Cloud deployments and containerization let you provision infrastructure as needed, meaning your applications can grow in scope and complexity. The results can be impressive, but the ability to expand quickly and easily makes it harder to keep track of your system as it develops. In this type of Kubernetes deployment, it’s essential to track your containers to understand what they’re doing.

How to deploy Hello World Elastic Observability on Google Cloud Run

Elastic Cloud Observability is the premier tool to provide visibility into your running web apps. Google Cloud Run is the serverless platform of choice to run your web apps that need to scale up massively and scale down to zero. Elastic Observability combined with Google Cloud Run is the perfect solution for developers to deploy web apps that are auto-scaled with fully observable operations, in a way that’s straightforward to implement and manage.

Blackhat 2023 Recap: How Will Advanced AI Impact Cybersecurity?

Ed Bailey and Jackie McGuire from Cribl will recap Black Hat 2023, focusing on emerging trends in cybersecurity, including the rise of advanced AI. We’ll share insights and anecdotes from our time at the event. Tune into the live stream for an engaging discussion, and come prepared with your thoughts and questions about Black Hat and the future of cybersecurity.

When Two Worlds Collide: AI and Observability Pipelines

In today's data-driven world, ensuring the stability and efficiency of software applications is not just a need but a requirement. Enter observability. But as with any evolving technology, there's always room for growth. That growth, as it stands today, is the convergence of artificial intelligence (AI) with observability pipelines. In this blog, we'll explore the idea behind this merge and its potential.

Comparing Six Top Observability Software Platforms

When it comes to observability, your organization will have no shortage of options for tools and platforms. Between open source software and proprietary vendors, you should be able to find the right tools to fit your use case, budget and IT infrastructure. Observability should be cost-efficient and easy to implement, and customers should receive the best support possible.

Honeycomb + Tracetest: Observability-Driven Development

Our friends at Tracetest recently released an integration with Honeycomb that allows you to build end-to-end and integration tests, powered by your existing distributed traces. You only need to point Tracetest to your existing trace data source—in this case, Honeycomb. This guest post from Adnan Rahić walks you through how the integration works.

Breaking the Cloud Illusion: The Hard Truth about Successful Migrations

Join our Kentik experts and Andrew Green, Research Analyst at GigaOm for a panel discussion on common challenges organizations face as they move their workloads to the cloud. They will discuss some tales from the field and ways organizations can mitigate some of these challenges, such as cost overruns, connectivity interruptions, and security considerations.

The Future of Observability: Navigating Challenges and Harnessing Opportunities

Observability solutions can easily and rapidly get complex — in terms of maintenance, time and budgetary constraints. But observability doesn’t have to be hard or expensive with the right solutions in place. The future of your observability can be a bright one.

Splunk and the Four Golden Signals

Last October, Splunk Observability Evangelist Jeremy Hicks wrote a great piece here about the Four Golden Signals of monitoring. Jeremy’s blog comes from the perspective of monitoring distributed cloud services with Splunk Observability Cloud, but the concepts of the Four Golden Signals apply just as readily to monitoring traditional on-premises services and IT infrastructure.

Dashboards & Reports for New-Age Observability with DX UIM from Broadcom

In this 10-minute how-to video, 2nd in a series, learn more about DX UIM for new-age infrastructure observability. Watch to learn about inventory view and grouping, creating metric view dashboards and reports, Performance Reports Designer, List View Designer, and a sneak peek at a unified view.

Sneak Peek: New-Age Infra Observability Viewer with DX UIM from Broadcom

In this 6-min. video, see how an upcoming feature, Observability Viewer, will provide a new, consolidated view across the infrastructure estate. The DX UIM Observability Viewer feature is intended to allow customers to more quickly understand their operational situation.

Using Traces for Testing - SigNoz Community Call with TraceTest and DevOps Educator Paulo

This week we welcomed the TraceTest team to talk about how TraceTest can use your OpenTelemetry traces to do truly deep end-to-end tracing of your stack. We also had Globo engineer and DevOps wizard Paulo Henrique de Morais Santiago, who, along with experimenting with SigNoz as a New Relic alternative for observability, is the author of one of the top DevOps courses on Udemy. Check out his course at.

Observability and the DORA metrics

The Accelerate State of DevOps Report highlights four key metrics (known as the DORA metrics, for DevOps Research & Assessment) that distinguish high-performing software organizations: deployment frequency, lead time for changes, time-to-restore, and change fail rate. Observability can kickstart a virtuous cycle that improves all the DORA metrics.
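
As a rough illustration of the four metrics named above, the sketch below computes them from a list of deployment records. The `Deployment` record, its field names, and the sample data are all hypothetical. Real pipelines define "failure" and "restored" in their own ways, so treat this as one possible reading rather than a standard formula.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def dora_metrics(deploys, window_days):
    """Compute the four DORA metrics over a window of deployments."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deploys) / window_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deploys),
        "change_fail_rate": len(failures) / len(deploys),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Illustrative data: four deployments over a two-day window, one of which failed.
t0 = datetime(2023, 8, 1, 9, 0)
h = timedelta(hours=1)
deploys = [
    Deployment(t0, t0 + 2 * h),
    Deployment(t0 + 1 * h, t0 + 5 * h, failed=True, restored_at=t0 + 6 * h),
    Deployment(t0 + 24 * h, t0 + 30 * h),
    Deployment(t0 + 26 * h, t0 + 34 * h),
]
metrics = dora_metrics(deploys, window_days=2)
```

Medians are used here instead of means so that a single slow change or long outage does not dominate the result. That is a design choice of this sketch, not something the report prescribes.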

Upgrading NPM and SAM to Hybrid Cloud Observability

This video discusses and demonstrates upgrading an Orion Platform installation running NPM and SAM to Hybrid Cloud Observability – advanced license. The video covers system requirements and installation methods, and walks through a full demonstration of the upgrade. This video is suitable for anyone who wishes to understand more and see an upgrade from a module-based install to Hybrid Cloud Observability.

Simplifying Data Lake Management with an Observability Pipeline

Data Lakes can be difficult and costly to manage. They require skilled engineers to manage the infrastructure, keep data flowing, eliminate redundancy, and secure the data. We accept the difficulties because our data lakes house valuable information like logs, metrics, traces, etc. To add insult to injury, the data lake can be a black hole, where your data goes in but never comes out. If you are thinking there has to be a better way, we agree!

10 Observability Tools in 2023: Features, Market Share and Choose the Right One for You

Understanding what's happening within your systems is a necessity. Have you ever wondered how experts keep an eye on systems to make sure everything's running smoothly? That's where observability tools come in! Observability tools are like helpers that give you a peek inside your tech. In this blog, we will talk about observability tools and how they can be used in different situations so it's easier for you to choose the right one for your organization.

Checking your observability and communication platforms with Reliably

In this video, we will use a chaos engineering experiment, one we expect to fail, to verify that our open tracing and communication platforms are correctly set up. Using the Honeycomb and Slack integrations provided by Reliably, we will send traces and messages and observe if they are triggered as expected.

Infinite Retention with OpenTelemetry and Honeycomb

The needs of observability workloads can sometimes be orthogonal to the needs of compliance workloads. Honeycomb is designed for software developers to quickly fix problems in production, where reducing 100% data completeness to 99.99% is acceptable to receive immediate answers. Compliance and audit workloads require 100% data completeness over much longer (or "infinite") time spans, and are content to give up query performance in return.

4 Ways a Consistent Schema Drives More Value From Your Observability Data

One of the hardest challenges in computer science is deciding what to name things. Adoption of consistent nomenclature is difficult because there is no one right answer. In fact, it’s not uncommon for different teams within organizations to choose different names for the same technologies. In the world of monitoring and observability, this can create quite a lot of confusion – not to mention wasted resources.

Getting _____________ for Less from Your Analytics Tools

Your analytics system of choice is probably pulling triple duty for your enterprise: data collection, data storage, and its primary goal, analytics for monitoring, reporting and taking action. In this session we discuss considerations for various use cases, and why and how to use Cribl Stream to customize the processing and routing of various data sources to optimize, enrich, and route your data based on its content, value, and purpose.

Apica Acquires LOGIQ.AI to Revolutionize Observability

In the world of observability, having the right amount of data is key. For years Apica has led the way, utilizing synthetic monitoring to evaluate the performance of critical transactions and customer flows, ensuring businesses have important insight and lead time regarding potential issues.

Optimizing cloud resources and cost with APM metadata in Elastic Observability

Application performance monitoring (APM) is much more than capturing and tracking errors and stack traces. Today’s cloud-based businesses deploy applications across various regions and even cloud providers. So, harnessing the power of metadata provided by the Elastic APM agents becomes more critical. Leveraging the metadata, including crucial information like cloud region, provider, and machine type, allows us to track costs across the application stack.

From Disruptions to Resilience: The Role of Splunk Observability in Business Continuity

In today's market, companies undergoing digital transformation require secure and reliable systems to meet customer demands, handle macroeconomic uncertainty and navigate new disruptions. Digital resilience is key to providing an uninterrupted customer experience and adapting to new operating models. Companies that prioritize digital resilience can proactively prevent major issues, absorb shocks to digital systems and accelerate transformations.

Reducing Mean Time to Diagnosis: How Salary Finance Uses Honeycomb to Ask the Right Questions

Salary Finance is a UK-based financial well-being employee benefit program. Over the last seven years, the company grew from a startup to a scaleup, earning rave reviews along the way from its more than 4,000 customers. However, with fast growth also comes natural growing pains. As their customer base expanded, so did the number of incidents they experienced, which also became harder to diagnose due to lack of visibility into their increasingly complex environment.

Managing your applications on Amazon ECS EC2-based clusters with Elastic Observability

In previous blogs, we explored how Elastic Observability can help you monitor various AWS services and analyze them effectively. One of the more heavily used AWS container services is Amazon ECS (Elastic Container Service). While there is a trend toward using Fargate to simplify the setup and management of ECS clusters, many users still prefer using Amazon ECS with EC2 instances.

Effective Remote Debugging with VS Code

This post will discuss remote debugging in VS Code and how to improve the remote debugging experience to maximize debugging productivity for developers. Visual Studio Code, or VS Code, is one of the most popular IDEs. Within ten years of its initial release, VS Code has garnered the top spot among popularity indices, and its community is growing steadily. Developers love VS Code not only for its simplicity but also due to its rich ecosystem of extensions, including the support for debugging.

The Evolution of the Service Model In the Data Industry

Cribl’s Ed Bailey will lead a great discussion with nth degree’s Paul Stout and Scott Gray about the evolution of the service model from time and materials to outcome-based services. We will share our own stories about our experiences with services and how to make them better. Join the live stream for a fun discussion and come armed with suggestions for how to make your next services engagement better.

How Qonto used Grafana Loki to build its network observability platform

Christophe is a self-taught engineer from France who specializes in site reliability engineering. He spends most of his time building systems with open-source technologies. In his free time, Christophe enjoys traveling and discovering new cultures, but he would also settle for a good book by the pool with a lemon sorbet.

Splashing into Data Lakes: The Reservoir of Observability

If you’re a systems engineer, SRE, or just someone with a love for tech buzzwords, you’ve likely heard about “data lakes”. Before we dive deep into this concept, let’s debunk the illusion: there aren’t any floaties or actual lakes involved! Instead, imagine a vast reservoir where you store loads and loads of raw data in its natural format. Now, pair this with the idea of observability and telemetry pipelines, and we have ourselves an engaging topic.

Three Code Instrumentation Patterns To Improve Your Node.js Debugging Productivity

In this age of complex software systems, code instrumentation patterns define specific approaches to debugging various anomalies in business logic. These approaches offer more options beyond the built-in debuggers to improve developer productivity, ultimately creating a positive impact on the software’s commercial performance. In this post, let’s examine the various code instrumentation patterns for Node.js.

Introducing the Telemetry Cloud: An All-In-One Observability Platform All Enterprises Can Afford

We’re excited to announce that we just released the next-generation of our observability platform – the Circonus Telemetry Cloud™. Here’s a closer look at what it is and why we think it’s a standout in the monitoring and observability space.

Free Jaeger Alternatives [comparison 2023]

Jaeger, a renowned distributed tracing system, has been a trusted companion for developers and operations teams seeking to unravel the complexities of microservices architectures. However, as the landscape continues to evolve, the time has come to explore Jaeger alternatives that offer distinct features and advantages.

Unify your observability signals with Grafana Cloud Profiles, now GA

Observability has traditionally been conceptualized in terms of three core facets: logs, metrics, and traces. For years, these elements have been seen as the “pillars” of observability, serving as the foundational components for system monitoring and delivering key insights to improve system performance. However, with the exponential growth in system complexity, a more comprehensive and unified perspective on observability has become necessary.

Mainframe Observability with Elastic and Kyndryl

As we navigate our fast-paced digital era, organizations across various industries are in constant pursuit of strategies for efficient monitoring, performance tuning, and continuous improvement of their services. Elastic® and Kyndryl have come together to offer a solution for Mainframe Observability, engineered with an emphasis on organizations that are heavily reliant on mainframes, including the financial services industry (FSI), healthcare, retail, and manufacturing sectors.

Deliver exceptional digital experiences with Cisco Cloud Native Application Observability

From the application layer down to your Kubernetes® infrastructure, Cisco Cloud Native Application Observability delivers cross-domain visibility with correlated MELT data and AI/ML-driven insights to simplify the complexity of observing the performance of modern applications, multi-cloud Kubernetes, and hybrid cloud infrastructure.

Cloud Observability: Unlocking Performance, Cost, and Security in Your Environment

A robust observability strategy forms the backbone of a successful cloud environment. By understanding cloud observability and its benefits, businesses gain the ability to closely monitor and comprehend the health and performance of various systems, applications, and services in use. This becomes particularly critical in the context of cloud computing. The resources and services are hosted in the cloud and accessed through different tools and interfaces.

Rethinking Observability with MinIO and CloudFabrix

While the growth trajectory for data in general is extraordinary, it is the growth of log files that really stands out. As the heartbeat of digital enterprise, these files contain a remarkable amount of intelligence – across a stunning range, from security to customer behavior to operational performance. The growth of log files, however, presents particular challenges for the enterprise. They are not “readable” per se; they require machine intelligence.

Send your logs to multiple destinations with Datadog's managed Log Pipelines and Observability Pipelines

As your infrastructure and applications scale, so does the volume of your observability data. Managing a growing suite of tooling while balancing the need to mitigate costs, avoid vendor lock-in, and maintain data quality across an organization is becoming increasingly complex. With a variety of installed agents, log forwarders, and storage tools, the mechanisms you use to collect, transform, and route data should be able to evolve and adjust to your growth and meet the unique needs of your team.

Anything But Tech Debt

Tech debt is usually one of the most fraught topics on engineering teams. Engineers often feel they aren’t allowed enough time to address tech debt. Product partners wonder why engineers spend so much time working on it—or at least talking about it. “The business” always seems to insinuate that engineers should do less of it, instead focusing on shipping value to customers.

Using UX and Observability to Track Application Health

UX (user experience) is a core factor that determines the success of an application or platform in a distributed system. Specifically, developers need to understand the infrastructure within an entire application stack to improve and refine the user experience to meet customer expectations without guesswork. System downtime remains a significant source of revenue and reputational losses for enterprises, employees, and customers.

How to Tackle Spiraling Observability Costs

As today’s businesses increasingly rely on their digital services to drive revenue, the tolerance for software bugs, slow web experiences, crashed apps, and other digital service interruptions is next to zero. Developers and engineers bear the immense burden of quickly resolving production issues before they impact customer experience.

Golang Monitoring using OpenTelemetry

When it comes to monitoring Golang applications, there are various tools and practices you can use to gain insights into your application's performance, resource usage, and potential issues. By using OpenTelemetry for monitoring in your Go applications, you can gain valuable insights into the behavior, performance, and resource utilization of your distributed systems, allowing you to troubleshoot issues, optimize performance, and improve the overall reliability of your software.

Data Observability's Impact on Business Decisions and Strategies

In today's data-driven world, businesses rely heavily on data to make important decisions and formulate strategies. However, one crucial element is often overlooked - data observability. Observing and understanding the behavior and performance of data systems and applications is vital to making accurate and informed decisions. In this blog post, Dennis Bonnen will explore the impact of data observability on business decisions and strategies.
Sponsored Post

3 Reasons to Prioritize Observability as part of Application Integration Strategy

Most companies in today's business landscape that deal with large amounts of data want to integrate their applications so that they can pass data between them seamlessly and easily. Being able to ensure that you can see exactly what is happening at every stage of the process is key, and this is where approaching the process with observability in mind can make a real difference. Deciding at the outset that observability is something that you want to be baked into the process means that you can plan and execute with that in mind.

Automatic Instrumentation for OpenTelemetry Go

The OpenTelemetry Go project now supports automatic instrumentation via eBPF! This is a big milestone for the project and makes it significantly easier to generate data from your Go apps. The automatic instrumentation agent is still in s/alpha/beta today, but it’s ready for you to try on your applications!

Announcing Easy Connect - The Fastest Path to Full Observability

Logz.io is excited to announce Easy Connect, which will enable our customers to go from zero to full observability in minutes. By automating service discovery and application instrumentation, Easy Connect provides nearly instant visibility into any component in your Kubernetes-based environment – from your infrastructure to your applications. For as long as applications have been monitored, collecting logs, metrics, and traces has often been siloed and complex.