May 2024

Generative AI for Kubernetes: Meet K8sGPT Open Source Project

Troubleshooting within Kubernetes environments can be a daunting task. If only we had a magical artificial intelligence advisor that could gather all the data about what goes on in the system, tell us what’s wrong, and even suggest how to solve it. Wouldn’t it be nice? K8sGPT is a young open source project that uses generative AI to give Kubernetes superpowers to everyone. It recently turned a year old, and is now part of the Cloud Native Computing Foundation (CNCF).

Mastering CloudTrail Logs, Part 2

In part 1 of this series, we took a look at what CloudTrail logs are, the value they provide, and some of the problems involved in processing and storing them. In part 2, we will look at how Observo helps organizations process CloudTrail logs at scale and derive value from them. As a quick recap, let’s take a look at what a CloudTrail event looks like.

The Leading OpenSearch Training Resources

OpenSearch has grown to be one of the most widely used open-source search engine projects. The high flexibility of the solution enables it to be the perfect option for a broad range of use cases, such as log and event data analysis, application monitoring and metrics analysis, and security information and event management (SIEM).

Tackling the Unsustainable Skills Challenge in Cybersecurity and Observability

This is the third and final post in a series of blog posts about the disconnect between modern IT and security teams and the vendors they’re forced to work with. If you’re looking for the first and second posts, you can find them here and here.

Simplify Log Management Across Any Cloud

Developers waste countless hours managing logs and juggling tools. Control Plane centralizes log management, making it easy to filter and analyze logs from apps running on any cloud: AWS, Azure, GCP, on-prem, etc. In this video, we demonstrate how Control Plane simplifies log management for your applications deployed across any cloud (or multi-cloud). We showcase the intuitive LogQL query language, built-in Grafana integration, and the flexibility to ship logs to your favorite external log providers like Datadog, S3, Elastic, and CloudWatch.

Finding a Better Way to Work in the Cloud!

With the 4.6 release, Cribl.Cloud Enterprise users now have the opportunity to opt in to a new cloud experience. As a deeply customer-centric company, we listened to your feedback, and we heard you! We are making our user experience efficient, secure, and flexible. As we work to refine this new experience, we invite you to partner with us and share your input to influence this transformation as it makes its way across the entire Cribl suite!

Comprehensive Guide to Server Uptime Monitoring

This guide offers a deep dive into server uptime monitoring, focusing on the strategies and tools essential for seasoned IT professionals to implement. We’ll explore advanced metrics, fine-tune the deployment of tools like Heartbeat, and dissect integration practices with the ELK stack. Designed for technical leaders who manage complex infrastructures, this guide aims to enhance your methodologies in maintaining high availability and optimizing operational performance across your server ecosystems.

OpenTelemetry: The Key To Unified Telemetry Data

OpenTelemetry (OTel) is an open-source framework designed to standardize and automate telemetry data collection, enabling you to collect, process, and distribute telemetry data from your system across vendors. Telemetry data is traditionally in disparate formats, and OTel serves as a universal standard to support data management and portability.

Modern Observability 101

In technology, having “modern” capabilities is standard. Staying ahead of the curve is critical, and keeping outdated technology or processes going can be a recipe for disaster in a complex, ever-changing landscape. Ensuring the smooth functioning and performance of software systems is paramount. This is where modern observability—a sophisticated approach to monitoring and understanding the inner workings of applications and infrastructure—is required.

Deploy The ELK Stack on Kubernetes with Helm

The main objective of the ELK (Elasticsearch, Logstash, and Kibana) stack is to aggregate logs. However, with the increased usage of ELK and Kubernetes as a pairing, the solution can go beyond the aggregation of standard logs and include monitoring and analysis of Kubernetes telemetry data. As a result, more users are looking at deploying the ELK stack on Kubernetes. Doing so can be a complex task, but with the assistance of Helm charts the process is much simpler.

Introducing the Elastic distribution of the OpenTelemetry Java Agent

As Elastic continues its commitment to OpenTelemetry (OTel), we are excited to announce the Elastic distribution of the OTel Java Agent. In this blog post, we will explore the rationale behind our unique distribution, detailing the powerful additional features it brings to the table. We will provide an overview of how these enhancements can be utilized with our distribution, the standard OTel SDK, or the vanilla OTel Java agent.

Navigating the Maze of Incumbent Pricing Models in IT and Security

This is the second in a series of blog posts about the disconnect between modern IT and security teams and the vendors they’re forced to work with. If you’re looking for the first and last posts, you can find them here. In the dynamic world of managing observability and telemetry data, pricing models for tools and platforms are showing their age, creating a significant disconnect between vendors and the IT and security teams they serve.

Multi-Project Routing For Google Cloud

When sending data to Google Cloud, like logs, metrics, or traces, it can be beneficial to split the data up across multiple projects. This division may be necessary because each team has its own project, because a central project is used for security audit logs, or for any other reason specific to your organization. BindPlane has effective tools to manage this process. In this walkthrough, we will add fields to telemetry entries, allowing us to associate entries with a specific project and properly route them.

How to Create an S3 Bucket with AWS CLI

Managing an Elasticsearch cluster can be complex, costly, and time-consuming, especially for large organizations that need to index and analyze log data at scale. In this short guide, we’ll walk you through the process of creating an Amazon S3 bucket, configuring an IAM role that can write into that bucket, and attaching that IAM role to your Amazon EC2 instance, all using the AWS Command Line Interface (CLI).

Mastering Data Distribution with OpenSearch Shards and Replicas

OpenSearch is an open-source distributed search and analytics engine created for scalability, performance, and ease of use. It is built on Apache Lucene and is a fork of Elasticsearch, designed in response to concerns about Elastic's decision to move away from open-source licensing for certain features in Elasticsearch and Kibana.

Reducing MTTR with the Elastic Observability AI Assistant

In this quick overview, discover how the Elastic Observability AI Assistant can streamline your operations and significantly reduce Mean Time to Recovery (MTTR). In just a minute or two, we'll highlight the key features and benefits of integrating AI into your observability strategy. Perfect for IT professionals and SREs who are looking for an efficient solution to improve system uptime and performance. Watch now to learn how AI can make a real difference in your response times!

Performance Optimization with Elastic Observability

Welcome to our quick overview of Performance Optimization with Elastic Observability! In this video, we explore the basics of how Elastic Observability can enhance your system’s performance monitoring and management. Discover key features that help you keep your applications running smoothly and efficiently, without deep diving into complexities. Perfect for anyone looking to get a quick grasp of what Elastic Observability can offer.

Incident Management and Troubleshooting with Elastic Observability

Welcome to our quick guide on enhancing your incident management and troubleshooting capabilities using Elastic Observability. In this brief overview, we'll highlight how Elastic Observability can streamline your operations and help you quickly pinpoint and resolve issues. Whether you're looking to improve your response times or just want a snapshot of what Elastic can offer, this video is the perfect starting point.

Finding unknown/unknowns in logs for SREs with Elastic Observability

Welcome to a quick overview of how Elastic Observability can help SREs tackle the elusive unknown/unknowns in their system logs. In just a minute or two, this video will introduce you to the basic strategies and tools that Elastic provides to enhance your site reliability through smarter data insights. Perfect for professionals looking to quick-start their monitoring capabilities without getting overwhelmed. Dive in and discover how to transform your logs into actionable insights!

Custom Alerts, SLOs, and Anomaly Detection with Elastic Observability

In this overview, we'll introduce you to the key features of Elastic Observability, focusing on custom alerts, service level objectives (SLOs), and anomaly detection. Whether you're managing infrastructure, ensuring service reliability, or overseeing software performance, these tools are essential for maintaining system health and efficiency. This video provides a quick glimpse into how Elastic Observability can streamline your monitoring tasks and alert you to issues before they impact your services. Perfect for those looking to enhance their observability strategy.

Kubernetes Logging | Set Up K8s Log Monitoring with OpenTelemetry

Kubernetes is a powerful orchestration tool for managing containers, but it comes with its own set of challenges. One of the biggest hurdles is effectively logging what's happening in your system. As your applications grow and spread across clusters, keeping track of their behavior becomes crucial. In this article, we will discuss logging in Kubernetes, common Kubernetes log types, and how logs can be effectively tracked and managed.

The Journey to 100x-ing Control Plane Scale for Cribl Edge

At Cribl, we value the simplest and quickest path to shipping new things. This is especially true with shipping new products. We took this approach with Cribl Edge, so we could get it into the hands of existing and potential customers as soon as possible to learn more about their needs and requirements. In order to ship a high-quality Edge product quickly, we based all of the systems for management and data streaming directly on the existing, battle-tested systems we built for Stream.

Elastic's RAG-based AI Assistant: Analyze application issues with LLMs and private GitHub issues

As an SRE, analyzing applications is more complex than ever. Not only do you have to ensure the application is running optimally to deliver great customer experiences, but in some cases you must also understand its inner workings to help troubleshoot. Analyzing issues in a production service is a team sport. It takes SRE, DevOps, development, and support teams to get to the root cause and potentially remediate. If the issue is impacting customers, it's even worse, because there is a race against time.

Introducing Elastic's OpenTelemetry Distribution for Node.js

We are delighted to announce the alpha release of the Elastic OpenTelemetry Distribution for Node.js. This distribution is a light wrapper around the OpenTelemetry Node.js SDK that makes it easier to get started using OpenTelemetry to observe your Node.js applications.

What's New With Mezmo: Real-Time Alerting

Here at Mezmo, we see the purpose of a telemetry pipeline as helping to ingest, profile, transform, and route data to control costs and drive actionability. There are many ways to do that, as we’ve previously discussed in our blogs, but today I’m going to talk about real-time alerting on data in motion: yes, on streaming data, before it reaches its destination.

Monitoring vs Observability: What is Reality?

Before we start, I have a confession: I absolutely love Digg (people are still Digging things, right?) errr...Reddit. It actually is my front page to the internet, where I research upgrades for my home lab/VR/other niche hobbies, watch silly videos, ingest low-effort memes, judge if people are ‘AHs’ or not on /r/amitheasshole, and occasionally talk trash to other Redditors about my Michigan-based sports teams.

Kubectl Logs Tail | How to Tail Kubernetes Logs

The kubectl logs tail command is a tool that allows users to stream the logs of a pod in real-time while using Kubernetes. This command is particularly useful for debugging and monitoring applications, as it enables users to view log output as it is generated and quickly identify any issues or problems with their application. In this article, we will see how to use the kubectl logs tail command to stream logs, the benefits of using the command, and an advanced tool for streaming logs.
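As a rough illustration of the idea, the sketch below assembles a `kubectl logs` invocation that shows the most recent lines of a pod's log and then keeps streaming new output. It relies on the `--tail` and `--follow` flags of `kubectl logs`; the helper name `build_logs_command` and the pod and namespace names are hypothetical and not taken from the article.

```python
import shlex
from typing import List, Optional


def build_logs_command(
    pod: str,
    namespace: Optional[str] = None,
    container: Optional[str] = None,
    tail: Optional[int] = None,
    follow: bool = True,
) -> List[str]:
    """Build the argument list for a `kubectl logs` call (hypothetical helper)."""
    if tail is not None and tail < 0:
        raise ValueError("tail must be zero or a positive line count")

    cmd = ["kubectl", "logs", pod]
    if namespace:
        cmd += ["--namespace", namespace]
    if container:
        # Only needed when the pod runs more than one container.
        cmd += ["--container", container]
    if tail is not None:
        # Limit the initial output to the last `tail` lines.
        cmd.append(f"--tail={tail}")
    if follow:
        # Keep the stream open and print new lines as they are written.
        cmd.append("--follow")
    return cmd


if __name__ == "__main__":
    # "web-0" and "prod" are placeholder names for illustration only.
    print(shlex.join(build_logs_command("web-0", namespace="prod", tail=100)))
```

The resulting list can be passed to `subprocess.run` on a machine where `kubectl` is installed and configured for the target cluster; building the arguments as a list rather than a single string avoids shell-quoting problems with unusual pod names.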

Scaling in the Cloud with Cribl's Universal Receiver

Scaling cloud services is a critical task for Site Reliability Engineers, and it’s a challenging one. As organizations grow, the amount of data and the number of people using it grow rapidly, pushing traditional data management methods to their limits. SREs not only have to keep everything running, they’ve got to make sure it runs smoothly, efficiently, and swiftly.

A New Era of Cloud Security with Cribl and Wiz

Cribl is an integrations company at heart. We want to help every company develop a data strategy that gives them more control, improves security, and provides flexibility to adapt to their ever-changing data needs. Today, we’re thrilled to announce that we are a Certified Wiz Integration (WIN) Partner to help customers take their cloud security game to the next level.

Announcement: New Integration With Panther Labs SIEM

Observo.ai is excited to share that we now integrate with Panther Labs, a modern SIEM built for the cloud. This enables Panther users to leverage Observo.ai’s powerful telemetry data pipeline features. Observo.ai was created to help Security and DevOps teams solve their biggest telemetry problems. Using Artificial Intelligence, Observo.ai optimizes and transforms data from any source and routes it to the destinations where it has the most value.

Does Your Observability Practice Lack Maturity? Here's What to Do.

Observability isn’t new. But organizations are struggling to adopt mature observability practices, and the impact on business is palpable. Organizations are seeing the value of observability for their applications and infrastructure; the results of our 2024 Observability Pulse survey of 500 global IT professionals reflect that across the board.

Deploying The ELK Stack on Kubernetes

The ELK (Elasticsearch, Logstash, and Kibana) stack’s main objective is to aggregate logs, but the hugely popular open-source project has numerous uses beyond log aggregation. ELK can easily integrate with Kubernetes and is a common solution that enables users to gather, store, and examine Kubernetes telemetry data. However, with the continual rise of microservice architecture, users are searching for an improved method of aggregating and searching through logs for debugging purposes.

Square Pegs, Round Holes: The Challenge of Integrating MELT Data into Traditional Data Warehouses

This is the first in a series of blog posts about the disconnect between modern IT and security teams and the vendors they’re forced to work with. If you’re looking for the second and third posts, you can find them here and here. Imagine this scenario: You’re grappling with the ever-escalating costs of your legacy solutions. What’s the logical next step? For many, it’s exploring the new wave of tools emerging, such as data warehouses.

Cribl Collaborates with Microsoft: Empowering Enterprises to Strengthen their Security Operations

The cybersecurity landscape is becoming more and more complex, and it seems like we hear about a major breach at a different company every day. Enterprises are looking for robust solutions to help them manage the surge in data and security incidents. That’s why our recent collaboration announcement with Microsoft means so much to us. It’s not just a piece of paper; it’s a testament to our dedication to providing customers with the best tools and solutions for the job.

Database Monitoring: troubleshooting from the bottom up

A healthy relationship between services and databases is fundamental to overall application performance. Unchecked database issues can compromise application efficiency, user experience, and ultimately, your organization’s bottom line. To steer clear of these consequences, monitoring your databases should be a key component of your observability—and with the launch of Coralogix Database Monitoring, it can be.

Managing High Volume with OpenTelemetry

As your systems grow, so do the challenges of managing high-volume telemetry data. From horizontal scalability strategies to efficient data aggregation and storage techniques, we'll cover everything you need to know to keep pace with your expanding infrastructure. Don't let scalability constraints hinder your observability efforts—learn how OpenTelemetry can empower you to manage high volumes of telemetry data effectively and efficiently.

Building a Custom OTel Collector: A Step by Step Guide

Ready to tailor your telemetry data collection to fit your exact needs? Watch as we go step-by-step through constructing a custom OpenTelemetry Collector. From defining requirements to implementing custom processors and exporters, leave this feeling empowered to create a collector perfectly aligned with your infrastructure and observability goals.

SolarWinds Observability simplifies searching live event messages and log archives

New reverse tail UI, API-based searches, and copy-paste permalinks

Searching event data in SolarWinds® Observability just got easier. A new reverse tail display option lets you move the log search bar and change the scroll of the events from bottom to top. For SolarWinds Papertrail™ fans, moving the search bar and changing the scroll will make you feel right at home. To access this customization feature, select display options and toggle the reverse tail option.

The benefits of utilizing locally hosted models with Elastic AI Assistant

A way for public sector organizations to leverage generative AI today to solve security challenges

With its ability to sift through large amounts of data to find unusual patterns, generative AI now plays a key role in helping teams protect their organizations from cyber threats. It also helps security professionals by augmenting their skills and bridging gaps in their knowledge.

Getting started with the Elastic AI Assistant for Observability and Amazon Bedrock

Elastic recently released version 8.13, which includes the general availability of Amazon Bedrock integration for the Elastic AI Assistant for Observability. This blog post will walk through the step-by-step process of setting up the Elastic AI Assistant with Amazon Bedrock.

The Art of Visibility: Constructing an OpenTelemetry Observability Pipeline

Craft an observability pipeline that offers unparalleled insights into your systems and applications. Watch as we explore the art of constructing an OpenTelemetry observability pipeline, from instrumenting your codebase to effectively analyzing and visualizing telemetry data. Whether you're aiming to enhance troubleshooting, optimize performance, or gain a deeper understanding of your environment, this video series will equip you with the knowledge and tools to elevate your observability game.

The Leading Redis Monitoring Tools

Redis, which stands for remote dictionary server, is an open-source, in-memory data structure store that is commonly used as a database, cache, and message broker. Utilizing Redis provides numerous benefits for your team and organization, which have helped drive the tool's increase in popularity. A key example of this is speed: Redis works primarily in memory, making it particularly fast for data operations.

Why an Observability Pipeline is a Must Have for Security

Security is paramount for organizations of almost any size. With the rapid pace of technological advancements and the increasing reliance on digital infrastructure, organizations face an ever-evolving landscape of cyber threats and risks. Protecting sensitive data, intellectual property, and customer information is no longer optional; it is a critical component of maintaining trust and credibility in the marketplace.

The OpenTelemetry Collector: A Deep Dive

Delve into the intricate workings of the OpenTelemetry Collector in this comprehensive webinar. Watch as we explore advanced features, optimization techniques, and best practices for maximizing the efficiency of your telemetry data collection. Whether you're a seasoned user or just getting started, this deep dive promises to unlock invaluable insights into harnessing the full potential of the OpenTelemetry Collector.

Top Security Data Types: Exploring the OCSF Framework

In cybersecurity, handling diverse data formats across various platforms is a big challenge. The Open Cybersecurity Schema Framework (OCSF) aims to address this by standardizing security data formats and simplifying the process of threat hunting. Major players like IBM, AWS, and others are working together to standardize data with this open-source project, emphasizing its importance.

The Best ELK Training Courses

The ELK Stack combines three tools (Elasticsearch, Logstash, and Kibana) into a complete solution that numerous organizations and teams utilize. Mastering a new tool or process can be challenging enough, but learning three at once, including how these three tools interact with each other, is particularly difficult. To ease the learning process, there are numerous training courses and certifications for the ELK stack to help you deeply grasp how it operates and how it can best be utilized.

Identity Governance in Cribl.Cloud

This blog post explores Cribl.Cloud’s approach to Identity Governance (IG), a crucial strategy for securing access to critical systems and data. Learn how Cribl.Cloud leverages IG to ensure security, compliance, efficiency, and customer trust, while also tackling the challenges of managing custom SaaS APIs within an IG framework.

When Your Open Source Turns to the Dark Side

Not that long ago, in a galaxy that isn’t remotely far away, a disturbance in the open source world was felt with wide-ranging reverberations. Imagine waking up one morning to find out that your beloved open source tool, which lies at the heart of your system, is being relicensed. What does it mean? Can you still use it as before? Could the new license be infectious and require you to open source your own business logic? This doomsday nightmare scenario isn’t hypothetical.

Pipeline Talk: Between Two Fernders Edition

Cribl’s co-founders, Clint Sharp, Dritan Bitincka, and Ledion Bitincka, recently took time to host a Between Two Fernders edition of Pipeline Talk at the Cribl offices to discuss a wide variety of topics, including Cribl Lake, the N-Gage, WWE aspirations, fishing poles, how CAT6 cabling is not named after actual cats, and wondering if Apple’s iPhone will be a consumer hit (Yes, we know what year it is, but the host clearly doesn’t).

Elastic Observability on Google Cloud - Access insights in real-time with AI

With the power of Elastic on Google Cloud, you can bring your logs, metrics, traces, and profiling together at scale for unified visibility and AI-powered insights across your entire ecosystem. Discover how organizations of all sizes unify and visualize all their data in one place using the combined innovation of Elastic and Google Cloud.

Sentry vs Coralogix: Comparison of RUM capabilities, pricing & more

As Coralogix is a full-stack observability platform with log analytics, RUM, APM, SIEM and more, it’s hard to really compare it to Sentry’s very limited offering of error tracking and some other real user monitoring functionality. Sentry is also insanely expensive in comparison to Coralogix. Nonetheless, we shall attempt to assess how Sentry’s RUM offering stacks up.

The Best 15 Interactive Dashboard Examples

Your organization, irrespective of its size, is likely creating a substantial amount of data, and deriving value and insights from this data is vital. This is where dashboards can assist you. With reporting dashboards, you can cut through the noise, and select the metrics that are pivotal to your team to begin visualizing them and the trend of these metrics through continuous monitoring, enabling your team to acquire actionable insights.

Data Storage Costs Keeping You Up at Night? Meet Archived Metrics

We have all been there! Getting the largest metrics plan available, turning on real-time monitoring, and… you know what happens next… BIG BILL! With the explosion of telemetry from microservices, containers, and cloud stacks, engineering teams often have to choose between data and budget. To help our Splunk champions, we are introducing Archived Metrics to make storing data up to ten times cheaper.