
March 2023

Cloud Migrations with Cribl.Cloud

Cribl’s suite of products helps you gain the control and confidence you need to successfully migrate to the cloud. With routing, shaping, enriching, and search functionality, your data becomes more manageable and you can work more efficiently. By routing data from existing sources to multiple destinations, you can ensure data parity in your new cloud destinations before turning off your on-premises (or legacy) analytics, monitoring, storage, or database products and tooling.

Elastic Observability 8.7: Enhanced observability for synthetic monitoring, serverless functions, and Kubernetes

Elastic Observability 8.7 introduces new capabilities that drive efficiency into the management and use of synthetic monitoring and expand visibility into serverless applications and Kubernetes deployments. Observability 8.7 is available now on Elastic Cloud, the only hosted Elasticsearch offering to include all of the new features in this latest release.

Lightrun's Product Updates - Q1 2023

During the past quarter, Lightrun has been busy producing a wealth of developer productivity tools and enhancements, aiming for better troubleshooting of distributed workload applications and greater cost efficiency. Read on for the main new features and the key product enhancements released in Q1 of 2023!

Building a Distributed Security Team With Cjapi's James Curtis

Join Cribl's Ed Bailey and Cjapi's James Curtis as they discuss the challenges of building a distributed global security team. Talent is hard to find and companies are hiring all over the world to build the best teams possible, but this trend has a price. Traditional management processes do not work, from building culture to the basics of assigning, tracking, and measuring work. Team leads and managers rarely have the experience and training to handle remote teams, which can impact team effectiveness and thus weaken the enterprise security posture.

Best Practices for Effective Monitoring and Observability - Civo.com

In the first talk, "You're doing Observability wrong, breaking down the 3 pillars of observability," Matt Gibiec, Sr. Solutions Engineer at Dynatrace, will discuss the common misconceptions around observability and the importance of going beyond metrics, traces, and logs. He will break down the three pillars of observability and provide actionable insights into what is required to truly achieve observability in your systems.

Four Things That Make Coralogix Unique

SaaS Observability is a busy, competitive marketplace. Alas, it is also a very homogeneous industry. Vendors implement the features that have worked well for their competition, and genuine innovation is rare. At Coralogix, we have no shortage of innovation, so here are four features of Coralogix that nobody else in the observability world has.

Comprehensive Kubernetes Observability with LogicMonitor's Kube-State-Metrics Integration

With the growing popularity of Kubernetes, the need for effective monitoring solutions has become crucial. LogicMonitor, a leading cloud-based monitoring and observability platform, has rolled out a new set of DataSources in its Kubernetes monitoring solution, LM Container, that uses data from the kube-state-metrics service to provide enhanced visibility into the state of Kubernetes objects.

Splunk Synthetics in Observability Terraform Provider Released

“How do you know your web properties and APIs are up and functioning as expected for users, not just nationally, but across the entire planet?” Splunk Synthetic Monitoring provides an effective solution to monitor and track the reliability of web properties from locations all over the globe. By generating simulated user or API requests with Splunk Synthetics, you’ll quickly be able to measure response times from various locations, devices, and connection types.

Twelve-Factor Apps and Modern Observability

The Twelve-Factor App methodology is a go-to guide for people building microservices. In its time, it presented a step change in how we think about building applications that scale and are agnostic of their hosting. As applications and hosting have evolved, some of these factors need to evolve too, specifically factor 11: Logs (which I’d also argue should be a lot higher up in the ordering).

Elastic Observability: Built for open technologies like Kubernetes, OpenTelemetry, Prometheus, Istio, and more

As an operations engineer (SRE, IT Operations, DevOps), managing technology and data sprawl is an ongoing challenge. Cloud Native Computing Foundation (CNCF) projects such as Kubernetes, OpenTelemetry, Prometheus, Istio, and more are helping minimize sprawl and standardize technology and data. Kubernetes and OpenTelemetry are becoming the de facto standards for deploying and monitoring cloud native applications.

Trace at Your Own Pace: Three Easy Ways to Get Started with Distributed Tracing

Stepping through a trace is an invaluable debugging workflow, providing a way to follow requests from service to service even as the applications we manage become more complex and distributed. That same complexity can make getting started with distributed tracing feel overwhelming, but it’s important to remember that instrumenting your code is an additive process—you don’t need to boil the ocean. A trace through a thousand services starts with a single ID.

Learn How NS1 Uses Distributed Tracing to Release Code More Quickly and Reliably

Chris Bertinato, Software Architect at NS1, and Nate Daly, Head of Architecture at NS1, along with Honeycomb Developer Advocate Jessica Kerr and Account Executive Scott Phillips, discuss how NS1 used distributed tracing to scale their organization and accelerate their migration from a monolith to microservices.

Discover Unknown Service Interaction Patterns With Istio & Honeycomb

Istio service meshes enable organizations to secure, connect, and monitor microservices to modernize their enterprise apps more swiftly and securely. With the addition of distributed tracing and powerful observability tooling, platform operators can gain immediate actionable insights about their applications.

Intercom: Building a More Resilient Ecosystem Through Observability

Learn how Intercom implemented Honeycomb’s distributed traces to learn about production. Kesha Mykhailov, Product Engineer at Intercom, joins Honeycomb Developer Advocate Jessica Kerr and Account Executive Michael Wilde to discuss how Intercom uses distributed traces to streamline their observability workflows, allowing their product engineers to learn about and from production to increase Intercom’s resilience.

What Is Observability? Examples of How It Can Help You

Observability is a powerful concept that can help you gain insight into the performance of your systems and applications. It refers to the ability to measure, monitor, analyze, and manage different aspects of an infrastructure or application—from hardware components to application code. With observability techniques such as distributed tracing, monitoring metrics, log analysis, and anomaly detection, organizations can ensure their applications run smoothly without downtime or disruption.

Learn How SumUp Implemented SLOs to Mitigate User Outages and Reduce Customer Churn

Blake Irvin and Matouš Dzivjak from SumUp’s Software Engineering team, together with Honeycomb Solution Architect Michael Sickles and Account Executive Nathan Leary, discuss how SumUp incorporated observability, specifically SLOs, to identify and resolve issues before they grew into customer-noticeable problems.

Surface and Confirm Buggy Patterns in Your Logs Without Slow Search

Debugging with logs in distributed systems can be a pain. It’s tough to search raw data looking for a pattern, relate potential causes to other logs, and check trace and metrics data for more confirmation. Is finding one pattern enough? What if there are other problems? Who knows how many colliding factors are relevant? At Honeycomb, we’re flipping the script on the log search problem. Hear our resident experts, Michael Wilde (a former Splunk Ninja) and Andy Dufour, discuss how Honeycomb customers have evolved their log analysis process to achieve fast pattern detection, skipping the grep/search loop entirely.

How Much Should Your Observability Stack Cost?

Observability is critical to any software development. It is a term that describes the ability to monitor the performance and health of applications, services, and infrastructure. Observability aims to quickly identify and troubleshoot problems before they become full-blown incidents that can lead to costly downtime. But how much should you invest in an observability stack? Regarding the cost of your observability stack, there is no one-size-fits-all answer.

Reference Architecture Series: Scaling Syslog

Join Ed Bailey and Ahmed Kira as they go into more detail about the Cribl Stream Reference Architecture, with a focus on scaling syslog. In this live stream discussion, Ed and Ahmed will explain guidelines for how to handle high volume UDP and TCP syslog traffic. They will also share different use cases and talk about the pros and cons for using different approaches to solve this common and often painful challenge.

See How Coveo Engineers Reduced User Latency

Many teams are wasting far too much time and energy searching through massive amounts of log data trying to find answers to user latency issues. Metrics data doesn’t help either as it only tells you that there is a problem, not where to fix it. This is why Coveo turned to observability. Through implementing observability with Honeycomb, Coveo was able to reduce their user latency by 50 percent.

Join Jeli and Honeycomb for an Incident Response and Analysis Discussion

Solutions Engineers Vanessa Huerta Granda and Emily Ruppe from Jeli, along with Honeycomb’s Field CTO Liz Fong-Jones and SRE Fred Hebert, discuss some of our more interesting recent incidents and how we use Honeycomb and Jeli together for incident response.

The future of observability: Trends and predictions business leaders should plan for in 2023 and beyond

If the past year has taught us anything, it’s that the more things change, the more things stay the same. The whiplash and pivot from the go-go economy post-pandemic to a belt-tightening macroeconomic environment induced by higher inflation and interest rates has been seen before, but rarely this quickly. Technology leaders have always had to do more with less, but this slowdown may be unpredictable, longer, and more pronounced than expected.

The Ultimate Guide to Digital Workplace Observability

The digital workplace has evolved dramatically over the past decade, both in terms of the increased reliance on technology for daily operations and the complexity of that technology. In order to manage and improve the digital workplace, service desk teams need more than just a comprehensive view of their IT environments; they need to be able to analyze that data in real time to make faster, consistently effective decisions. Enter: digital workplace observability.

Introduction to Kubernetes Observability

Cloud has become the de facto standard for new application development. Kubernetes solves many problems of modern-day cloud infrastructure. It has made microservices-based distributed software systems possible, enabling organizations to provide on-demand scaling. But at the same time, Kubernetes has also increased operational complexity. In simple terms, Kubernetes is a container orchestration tool. Container environments are dynamic and ephemeral.

Top 10 AIOps & Observability Capabilities for the Banking and Finance Sector

Maintaining trust in the business services your customers rely on is everything. With ever-increasing customer expectations and the promise of ‘always-on’ services, poor digital experiences and outages can cause significant harm to your business. The Interlink Software AIOps and Observability platform strengthens IT teams’ capability to deliver more reliable, available digital services and reduce the risk of customer-impacting disruption.

Ask Miss O11y: Is There a Beginner's Guide On How to Add Observability to Your Applications?

I want to make my microservices more observable. Currently, I only have logs. I’ll add metrics soon, but I’m not really sure if there is a set path you follow. Is there a beginner's guide to observability of some sort, or best practice, like you have to have x kinds of metrics? I just want to know what all possibilities are out there. I am very new to this space.

Splunk Observability in Less Than 2 Minutes

Splunk Observability is the most comprehensive observability solution available today, combining application, infrastructure, and digital experience monitoring with log management, AIOps, and incident response in a single solution experience. With Splunk Observability, software engineering and IT operations teams can fix problems faster, improve reliability, and build exceptional customer experiences.
Sponsored Post

The Risks and Pitfalls of Too Many Monitoring Tools

If you are like most organizations, your technology environment is a complex mixture of tools needed to run your business. In this environment, monitoring and observability are critical to making sure everything is running smoothly. You use monitoring tools to measure server resources, log-parsing tools for troubleshooting, application tools to observe application performance, and audit-request tools to comply with regulations. While these are all valid observability needs, there are risks to overdoing it by introducing too many tools. Here are some ways to avoid monitoring proliferation when developing your observability strategy.

Level Up Your Observability Game With the Cribl Suite of Products: All About Our 4.1 Release

After our recent company-wide offsite in New Orleans, the Cribl employees are feeling like they’ve leveled up in more ways than one. Not only did we indulge in delicious beignets and king cakes, but we also came back motivated to create some kick-ass new product features with our 4.1 release. It’s like we soaked up all the good vibes and brought them back with us.

Easily configure Elastic to ingest OpenTelemetry data

Watch how to easily configure Elastic to ingest OpenTelemetry data from your application. About Elastic: Elastic is the leading platform for search-powered solutions, and we help everyone (organizations, their employees, and their customers) find what they need faster, while keeping applications running smoothly and protecting against cyber threats. When you tap into the power of Elastic Enterprise Search, Observability, and Security solutions, you’re in good company with brands like Netflix, Uber, Slack, Microsoft, and thousands of others who rely on us to accelerate results that matter.

SaaS Observability Platforms: A Buyer's Guide

Observability is the ability to gather data from metrics, logs, traces, and other sources, and use that data to form a complete picture of a system’s behavior, performance, and health. While monitoring alone was once the go-to approach for managing IT infrastructure, observability goes further, allowing IT teams to detect and understand unexpected or unknown events.

How Do We Cultivate the End User Community Within Cloud-Native Projects?

The open source community talks a lot about the problem of aligning incentives. If you’re not familiar with the discourse, most of this conversation so far has centered around the most classic model of open source: the solo unpaid developer who maintains a tiny but essential library that’s holding up half the internet. For example, Denis Pushkarev, the solo maintainer of popular JavaScript library core-js, announced that he can’t continue if not better compensated.

Platform Engineering Is the Future of Ops

Ops and DevOps roles as we know them are on their way to becoming extinct—the future is platform engineering. While DevOps engineers typically focus on the application layer, platform engineers focus on the underlying infrastructure layer. Without a solid and reliable platform, it can be challenging to deploy and maintain software applications effectively. This can result in downtime, poor performance, and security vulnerabilities. Platform engineering enables software applications and services to run effectively and efficiently and has a direct impact on the user experience and the success of the entire organization.

How We Define SRE Work, as a Team

Last year, I wrote How We Define SRE Work. This article described how I came up with the charter for the SRE team, which we bootstrapped right around then. It’s been a while. The SRE team is now four engineers and a manager. We are involved in all sorts of things across the organization, across all sorts of spheres. We are embedded in teams and we handle training, vendor management, capacity planning, cluster updates, tooling, and so on.

MIAX and Cribl Stream: Enriching Data for Improved Observability and Faster Time to Value

Using Cribl Stream for observability is a given, but what about using Cribl Stream to get MORE from your data? Observability is all about being able to collect, route, store, and search your data. Implementing enrichment with observability provides more context and elevates your ho-hum data to robust information. This is key to faster, more confident decision-making!
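To make the idea of enrichment concrete, here is a minimal Python sketch of the general technique: joining each event against a lookup table to attach extra context. It is not Cribl Stream's API; the `HOST_CONTEXT` table, the `enrich` function, and the field names are all hypothetical and chosen only for illustration.

```python
# Hypothetical lookup table mapping a host name to ownership context.
HOST_CONTEXT = {
    "web-01": {"team": "storefront", "env": "prod"},
    "db-07": {"team": "data-platform", "env": "staging"},
}


def enrich(event, lookup=HOST_CONTEXT, key="host"):
    """Return a copy of the event with matching lookup fields merged in.

    Fields already present on the event take precedence, so enrichment
    never overwrites data that came from the source.
    """
    context = lookup.get(event.get(key), {})
    return {**context, **event}


raw = {"host": "web-01", "msg": "upstream timeout"}
enriched = enrich(raw)
```

With the added `team` and `env` fields, a downstream search or alert can route on ownership without another lookup at query time.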

Gain real-time observability into your software supply chain with the New Relic Log Analytics Integration

JFrog’s new log analytics integration with New Relic brings together powerful observability capabilities to monitor, analyze, and visualize logs and metrics from self-hosted JFrog environments. The integration is free for all tiers of self-hosted JFrog customers and utilizes the powerful, open source log management tool, Fluentd, to collect, process, and surface data in New Relic dashboards.

Metrics vs. Logs vs. Traces (vs. Profiles)

In software observability, we often talk about three signal types: metrics, logs, and distributed traces. More recently I've been hearing about profiles as another signal type. In this article I will explain, clearly and concisely, the different observability signals and when to use them.

A Guide to Enterprise Observability Strategy

Observability is a critical step for digital transformation and cloud journeys. Any enterprise building applications and delivering them to customers is on the hook to keep those applications running smoothly to ensure seamless digital experiences. To gain visibility into a system’s health and performance, there is no real alternative to observability. The stakes are high for getting observability right — poor digital experiences can damage reputations and prevent revenue generation.

The Importance of Observability Pipelines in Gaining Control over Observability and Security Data

Today’s enterprises must have the capability to cope with the growing volumes of observability data, including metrics, logs, and traces. This data is a critical asset for IT operations, site reliability engineers (SREs), and security teams that are responsible for maintaining the performance and protection of data and infrastructure. As systems become more complex, the ability to effectively manage and analyze observability data becomes increasingly important.

Panel Discussion: Observability

Watch the Observability Panel discussion to learn how observability takes monitoring to the next level by making it simpler to discover the root cause of IT issues before services are disrupted. There is no shortage of observability platforms today; the challenge is determining the best practices that should be put in place to employ them most effectively.

Deploys Are the WRONG Way to Change User Experience

I'm no stranger to ranting about deploys. But there's one thing I haven't sufficiently ranted about yet, which is this: Deploying software is a terrible, horrible, no good, very bad way to go about the process of changing user-facing code. It sucks even if you have excellent, fast, fully automated deploys (which most of you do not). Relying on deploys to change user experience is a problem because it fundamentally confuses and scrambles up two very different actions: Deploys and releases.

Getting Started with Instant Evaluation

Learn how to leverage the Instant Evaluation feature within the SolarWinds Platform to easily trial and evaluate different solutions like Hybrid Cloud Observability. See how you can expand on the functionality available now and gain more integrated insights for streamlined issue resolution and performance monitoring in your environment.

Why Seven.One Entertainment Group Chose Datadog RUM for Client-side Observability

Hear why Seven.One Entertainment Group, a subsidiary of ProSiebenSat.1 Media SE, which is Germany’s top commercial broadcaster, chose Datadog Real User Monitoring and how the solution enabled them to better understand client-side issues.

How Coveo Reduced User Latency and Mean Time to Resolution with Honeycomb Observability

When you’re just getting started with observability, a proof of concept (POC) can be exactly what you need to see the positive impact of this shift right away. Coveo, an intelligent search platform that uses AI to personalize customer interactions, used a successful POC to jumpstart its Honeycomb observability journey—which has grown to include 10,000+ machine learning models in production at any one time. Wondering how Coveo got there? So were we.

Beyond Logging: The Power of Observability in Modern Systems

Observability has now become a key aspect of designing, building, and maintaining modern systems. From logs and distributed locking to distributed tracing, observability as a function has gone beyond logging. With so many aspects to take care of, it becomes essential to have an observability toolchain that is comprehensive without being complex. In this blog, we will explore the underlying motivations behind observability, the various tools available to enable it, and its various components.

Empowering Security Observability: Solving Common Struggles for SOC Analysts and Security Engineers

Join Ed Bailey and GreyNoise founder Andrew Morris as they share insights on how Cribl and GreyNoise help SOC analysts overcome common struggles and improve security detections and incident resolution. Through personal stories and real customer use cases, they'll demonstrate how combining these solutions can make a real difference in the day-to-day lives of SOC analysts. You'll also gain valuable insights into data flow and architecture, and learn how GreyNoise can drive outsized value. Don't miss this opportunity to enhance your security observability skills.

5 key takeaways from the Grafana Labs Observability Survey 2023

Observability is coming into its own, as SREs and DevOps practitioners increasingly seek to centralize the sprawl of tools and data sources to better manage their workloads and respond to incidents faster, and to save time and money in the process. That was the overarching message from more than 250 observability practitioners who took part in Grafana Labs’ first-ever Observability Survey.

Data Gravity in Cloud Networks: Distributed Gravity and Network Observability

So far in this series, I’ve outlined how a scaling enterprise’s accumulation of data (data gravity) struggles against three consistent forces: cost, performance, and reliability. This struggle changes an enterprise; this is “digital transformation,” affecting everything from how business domains are represented in IT to software architectures, development and deployment models, and even personnel structures.

Caring for Complex Systems: We Can Do This

When we work at it, professionals are pretty good at analysis. We can break down a simple system, look at its parts and their relations, and master it. Given enough time and teammates, we can analyze a very complicated system and fix it when it breaks. But complex systems don’t yield to analysis. We have to add another skill: sense-making. Complex systems have parts that learn and change, with relations that vary with state and history. They respond to and influence their environment.

How Can You Optimize Business Cost and Performance With Observability?

Businesses are increasingly adopting distributed microservices to build and deploy applications. Microservices directly streamline the production time from development to deployment; thus, businesses can scale faster. However, with the increasing complexity of distributed services comes visual opacity of your systems across the company. In other words, the more complex your system gets, the harder it becomes to visualize how it works and how individual resources are allocated.

Debugging Serverless Functions with Lightrun

Developers are increasingly drawn to Functions-as-a-Service (FaaS) offerings provided by major cloud providers such as AWS Lambda, Azure Functions, and GCP Cloud Functions. The Cloud Native Computing Foundation (CNCF) has estimated that more than four million developers utilized FaaS offerings in 2020. Datadog has reported that over half of its customers have integrated FaaS products in cloud environments, indicating the growth and maturity of this ecosystem.

Understanding Distributed Tracing with a Message Bus

So you're used to debugging systems using a distributed trace, but your system is about to introduce a message queue—and that will work the same… right? Unfortunately, in a lot of implementations, this isn't the case. In this post, we'll talk about trace propagation (manual and OpenTelemetry), W3C tracing, and also where a trace might start and finish.
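As a rough illustration of manual trace propagation over a queue, the Python sketch below carries a W3C Trace Context `traceparent` value in the message headers, so the consumer continues the producer's trace instead of starting a new one. The in-memory list standing in for the message bus and the `publish` and `consume` helpers are assumptions made for this example; a real system would use its broker's header mechanism or OpenTelemetry's propagators.

```python
import re
import secrets

# W3C Trace Context "traceparent" layout: version-traceid-parentid-flags.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(trace_id=None, sampled=True):
    """Build a traceparent value; start a new trace if no trace_id is given."""
    trace_id = trace_id or secrets.token_hex(16)  # 16 bytes -> 32 hex chars
    span_id = secrets.token_hex(8)                # 8 bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def publish(queue, body, traceparent):
    """Producer side: attach the current trace context to the message headers."""
    queue.append({"headers": {"traceparent": traceparent}, "body": body})


def consume(queue):
    """Consumer side: keep the producer's trace ID and mint a new span ID."""
    message = queue.pop(0)
    match = TRACEPARENT_RE.match(message["headers"].get("traceparent", ""))
    if match is None:
        # No usable context on the message, so the consumer starts a new trace.
        return message["body"], new_traceparent()
    trace_id, _parent_span_id, flags = match.groups()
    sampled = bool(int(flags, 16) & 1)  # lowest bit is the "sampled" flag
    return message["body"], new_traceparent(trace_id, sampled)


queue = []  # hypothetical stand-in for a real message bus
parent = new_traceparent()
publish(queue, "order-created", parent)
body, child = consume(queue)
```

Here `child` shares the trace ID of `parent` but has its own span ID, which is what lets a tracing backend stitch both sides of the queue into one trace.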

Observability from Development to Production with Platform.sh Observability

With Platform.sh and Blackfire.io, monitor, profile, and test your application even before it is released in production. Get actionable insights to improve your code rather than spend time figuring out what’s wrong. Ensure optimal performance and user experience for your web applications.
Sponsored Post

Machine-Learning Automation: Processing, Storing, & Analyzing Data in the Digital Age

The world of software is growing more complex, and simultaneously changing faster than ever before. The simple monolithic applications of recent memory are being replaced by horizontal cloud-native applications. It is no surprise that such applications are more complex and can break in infinitely more ways (and ever new ways). They also generate a lot more data to keep track of. The pressure to move fast means software release cycles have shrunk drastically from months to hours, with constant change being the new normal.

How Monitoring, Observability & Telemetry Come Together for Business Resilience

Systems going down because of an unforeseen incident? Got problems with your app or website? Is your audience missing out on products and services because your load times are too slow? Then monitoring and observability (and telemetry) should be of interest to you! In this long article, we’re covering everything! I’ll start with the concepts and how they work.

Industry Experts Discuss Cybersecurity Trends and a New Fund to Shape the Future

Cribl's Ed Bailey and angel investor Ross Haleliuk discuss trends in the cybersecurity industry, and Ross will make a big announcement about his new fund to shape the industry's future. Ross is a big believer in focusing on the security practitioner, investing early in companies he thinks will provide practical solutions to common issues. Ed and Ross will also discuss common struggles that both Cribl and the new fund seek to address by adding value and giving security practitioners choice and control over how they run their security programs.

How to Achieve Full Stack Observability in Highly Distributed Environments Webinar

Your modern IT infrastructure has become an increasingly complicated mix of on-premises, public and private cloud applications, devices and environments. Forward-thinking organizations are addressing this complexity by transitioning to a proactive “observability” approach for infrastructure management. This methodology produces and then applies actionable data to optimize and secure the entire network.

How 3 Companies Implemented Distributed Tracing for Better Insight into Their Systems

Distributed tracing enables you to monitor and observe requests as they flow through your distributed systems to understand whether these requests are behaving properly. You can compare tiny differences between multiple traces coming through your microservices-based applications every day to pinpoint areas that are affecting performance. As a result, debugging and troubleshooting are simpler and faster.

Reduce 60% of your Logging Volume, and Save 40% of your Logging Costs with Lightrun Log Optimizer

As organizations adopt more FinOps Foundation practices and try to optimize their cloud computing costs, engineering plays an essential role in that maturity. Traditional troubleshooting of applications relies heavily on static logs and legacy telemetry that developers added either when first writing their applications or during troubleshooting sessions where they lacked telemetry and needed to add more logs in an ad hoc fashion.