Operations | Monitoring | ITSM | DevOps | Cloud

Differentiating Sumo Logic Mo Copilot using Amazon Bedrock

Sumo Logic Mo Copilot is a natural language assistant that helps first responders derive insights from logs and resolve issues faster using contextual suggestions and plain English queries. It has been in preview since May 2024 with dozens of customers. Choosing a foundation model was a critical step in its development. Let’s explore our high-level requirements for Copilot, the role of foundation models and the rationale for standardizing on Amazon Bedrock.

Big Data, Zero Hassle: Cribl Edge for Centralized Agent Management

Today’s IT and security environments have gone from “big” to “massive” in just a decade or two—endpoints have practically exploded (think hundreds of thousands of servers, not just a hundred). Add in a dizzying array of data types and vendors, and what do you get? A whole lot of chaos. So why, oh why, does agent management still feel like it’s stuck in the early 2000s?

Introducing the Logz.io AI Agent, Accelerating the Future of Observability

Logz.io introduces its AI Agent in Beta, using GenAI to revolutionize observability. The AI Agent simplifies monitoring with automated data analysis and root cause detection, accelerating issue resolution by 3-5x for beta users—marking a critical step toward fully autonomous observability.

From stateful to stateless: Sumo Logic's transition from Lucene to Parquet-based architecture

Ensuring scalability, performance, and cost-effectiveness is a constant challenge for cloud-native log management and observability. At Sumo Logic, we faced this challenge head-on by transitioning from a stateful, Lucene-based architecture to a completely stateless, Parquet-based architecture. This transformation lets us improve data storage efficiency, streamline operational complexity, and meet the demands of an ever-increasing data scale.

Threat Hunting with Cribl Search

Imagine you’re the protector of a castle. Your walls are tall, the gates are strong, and the guards are well-trained. But what if an intruder was still able to slip past your defenses? Even with the best security tools, not every threat will be caught. Threat hunting is the proactive approach to finding attackers that might have bypassed your defenses.

The Path to Autonomous Observability

Autonomous observability for system monitoring and management aims to use GenAI and machine learning to automatically detect, diagnose and resolve issues. In conversations about cloud observability today, discussions often shift from “what’s possible” to “what’s practical.” Too often, these conversations highlight the shortcomings of current observability processes, tools and financial models.

Enhancing Log Analysis with Machine Learning (ML)

Log analysis has been a beneficial practice for organizations for many years, and it has evolved continuously over that time. This has been driven in part by the increasing volume of logs that companies are required to monitor. Now, log analysis is shifting again, incorporating machine learning (ML) and artificial intelligence (AI) to assist data analysts in identifying system log patterns and anomalies.

October '24 BindPlane Update

I'm covering our powerful new feature: the coalesce processor in BindPlane! I’ll walk you through how to use it to simplify your telemetry data by merging mismatched field names—like user and username—into one unified field (usr). We’ll configure a BindPlane Gateway, capture telemetry from various sources, and route it all to Honeycomb and S3. With the coalesce processor, field names get standardized quickly, making your dashboards and alerts far more intuitive.

Encoding HAProxy logs in machine-readable JSON or CBOR

Standardized logging formats are important for teams that rely on logging for observability, troubleshooting, and workflow integration. Using structured formats simplifies parsing and eliminates the need to interpret fields manually, ensuring consistency across logging formats. This reduces manual work, prevents brittleness from unstructured logs, and simplifies integration between teams that feed logs into a shared aggregation system.

Scaling Culture on Purpose: How Cribl is Building for the Future After Our Series E

Cribl’s recent $319M Series E round marks a significant milestone in our journey to becoming a generational company. While this growth opens the door to new opportunities for our company, it also presents a challenge: how do we ensure our amazing culture scales alongside the business? At Cribl, we believe in Culture on Purpose—an intentional, values-led approach to evolving our culture as we grow.

State of Observability 2024 Reveals How Leaders Outpace Their Peers

In 2024, simply having an observability practice is a given. In this era of observability, a high-functioning team will set leaders apart from their peers. Leading observability practitioners don’t fix issues by putting hundreds of people into a virtual room, or frantically messaging in a temporary Slack channel to find root causes. Because leaders embed observability into their development practices early, a feature launch is a quiet non-event.

Reduce Observability Costs with OpenTelemetry Setup

Maintaining and visualizing telemetry data efficiently is super important for DevOps and SecOps teams. OpenTelemetry, a fantastic open-source observability framework, can really help with this without being too costly. Picture having a simple process that improves your data and helps your team make smart decisions without spending too much money. Let's chat about some budget-friendly ways to set up OpenTelemetry agents.

Accelerate Visibility and Analysis With New Cribl Search Packs

Our new Cribl Search Packs give you a framework for packaging, sharing, and installing config bundles that align with a given data source or use case. Similar in concept to our original Cribl Stream Packs framework, Cribl Search Packs help users find value in their datasets more quickly across common use cases. In fact, Stream Pack users were a powerful driver in the development of Search Packs.

Debugging Kubernetes Autoscaling with Honeycomb Log Analytics

Let’s be real, we’ve never been huge fans of conventional unstructured logs at Honeycomb. From the very start, we’ve emitted structured wide events and distributed traces with well-formed schemas from our own code. Fortunately (because it avoids reinventing the wheel) and unfortunately (because it doesn’t adhere to our standards for observability) for us, not all the software we run is written by us.

Master debugging with four ways to visualize your traces

In a world where microservices rule and distributed architectures are the norm, understanding how a single request flows through your system can be an overwhelming challenge. But don’t worry—there’s light at the end of the tunnel! And not just one light, but four.
Sponsored Post

How to Detect Threats to AI Systems with MITRE ATLAS Framework

Cyber threats against AI systems are on the rise, and today's AI developers need a robust approach to securing AI applications that address the unique vulnerabilities and attack patterns associated with AI systems and ML models deployed in production environments. In this blog, we're taking a closer look at two specific tools that AI developers can use to help detect cyber threats against AI systems.

Laptops, Desktops, and Data-Oh My! Cribl Edge Has You Covered

As organizations continue to become more reliant on distributed and hybrid workforces, the need for comprehensive data collection across every endpoint—servers, applications, desktops, and laptops—has never been more critical. But let’s be real: agents can be a total headache. That’s where Cribl Edge comes in, now with support for desktops and laptops (in preview)!

Effortless Data Compliance with Cribl Lake

Organizations generate, collect, and store vast amounts of telemetry data. With this data comes the growing responsibility to ensure compliance with various regulations, from GDPR to HIPAA. Data compliance ensures data is handled, stored, and processed according to laws and standards protecting personal information. But what makes compliance regulations scary is that they’re ever-changing and vary across industries, making them complex to manage.

What is log analysis? Overview and best practices

In today’s complex IT environments, logs are the unsung heroes of infrastructure management. They hold a wealth of information that can mean the difference between reactive firefighting and proactive performance tuning. Log analysis is a process in modern IT and security environments that involves collecting, processing, and interpreting log information generated by computer systems. These systems include the various applications and devices on a business network.

What are SLOs/SLIs/SLAs?

You’ve likely noticed how some pizza places promise delivery in 30 minutes, or they’ll give you your money back. But what are they really promising? They’re setting a clear performance goal and backing it up with confidence. How do they measure their performance? They track how long each delivery takes. And why do they make this promise? Because fast service is key to keeping their business thriving.

Stronger together: Sumo Logic and AWS partnership expands with five new competencies

For over a decade, we’ve worked closely with AWS to help our joint customers ensure the health and security of their mission-critical applications. That’s why we’re so excited to have recently renewed our Strategic Collaboration Agreement (SCA) with AWS and to announce five new AWS competencies across multiple industries.

Cisco uses Elastic to save 5,000 support engineer hours a month

With the precision of search and the intelligence of AI, Cisco uses Elastic on Google Cloud to create richer search experiences, so support engineers can quickly find the answers they need. Scaling from this success, Cisco's Search team added AI models, semantic search, and vector search to more than 50 internal- and external-facing apps, helping them innovate more quickly and increase overall operational efficiency.

Unlock the Real Value of Logs With Honeycomb Telemetry Pipeline and Honeycomb for Log Analytics

At Honeycomb, we know how important it is for organizations to have a unified observability platform. This is why we’re launching Honeycomb Telemetry Pipeline and Honeycomb for Log Analytics: to enable engineering teams to send data, including logs, to a single, unified platform and analyze it there. For too long, teams have had to wrangle large volumes of logs, their context scattered across multiple teams and tools, leading to knowledge silos.

The Leading Java Performance Monitoring Tools

Java is a flexible and commonly used programming language known for its platform independence, object-oriented design, and robustness. It was originally developed by Sun Microsystems (now owned by Oracle Corporation) in the mid-1990s and soon gained popularity due to its "Write Once, Run Anywhere" (WORA) principle, allowing developers to write code that can operate on any device or platform with a Java Virtual Machine (JVM).

Introducing pipe syntax in BigQuery and Cloud Logging

Writing complex SQL queries can be challenging, but BigQuery's new pipe syntax offers a more intuitive way to structure your code. Learn how pipe syntax simplifies both exploratory analysis and complex log analytics tasks, helping you gain insights faster. Watch along and discover how to leverage pipe syntax in BigQuery for a more efficient analytics experience.

What is Data Center Colocation (Colo)?

As IT costs continue to balloon, many organizations are caught between the desire to scale and the pressure to cut costs. It’s an incredibly delicate balancing act leaders struggle to maintain: while 66% of companies in one study said they plan to increase their IT budgets, 84% were worried about a recession, and 63% struggled to secure IT talent. By spending on infrastructure, organizations are forced to spend less on innovation. But what if there is a way to have both?

What is Digital Experience Monitoring?

Digital experience monitoring (DEM) is the evolution of application performance monitoring (APM) and end user experience monitoring (EUEM) into a comprehensive tool that analyzes the efficacy of an enterprise’s applications and services. Essentially, DEM combines these functions and goes beyond both — all to ensure consistency across the customer experience.

Scaling Product Management for Hyper-Growth: Lessons from Cribl

Cribl has been experiencing rapid growth over the past six years as customers increasingly seek tools to modernize their data strategies. We introduced a new product, Cribl Lake, to help customers address even more diverse data management challenges. With customer data growing at a 28% CAGR, organizations are looking for solutions that can help them manage and optimize their data infrastructure.

Budget-Friendly Logging

OpenTelemetry has quickly become a must-have tool in the DevOps toolkit. It helps us understand how our applications are performing and how our systems are behaving. As more and more organizations move to cloud-native architectures and microservices, it's super important to have great monitoring and tracing in place. OpenTelemetry provides a strong and flexible framework for capturing data that helps DevOps engineers keep our systems running smoothly and efficiently.

Optimize your RAG workflows with Elasticsearch and Vectorize

We’re excited to announce Vectorize now integrates with Elasticsearch vector database! This powerful combination simplifies building retrieval augmented generation (RAG) pipelines, allowing AI engineers to focus on building applications with unprecedented speed and accuracy. Elasticsearch vector database enables fast and efficient real-time search and retrieval of vector data, making it an excellent database for RAG applications.

OpenTelemetry Tips Every DevOps Engineer Should Know

OpenTelemetry has quickly become a must-have tool in the DevOps toolkit. It helps us understand how our applications are performing and how our systems are behaving. As more and more organizations move to cloud-native architectures and microservices, it's super important to have great monitoring and tracing in place. OpenTelemetry provides a strong and flexible framework for capturing data that helps DevOps engineers keep our systems running smoothly and efficiently.

Azure Logging Unleashed: Your Key to Cloud Performance

The Azure Cloud platform processes an extensive variety of data including Eventhub Diagnostic Logs, Kubernetes Metrics, SQL Logs, Activity Logs, Container Activity Logs, and Azure Metrics. Depending on the requirements of your organization, these logs carry varying levels of importance and priority. But it’s more than likely that you will be monitoring a large variety of these logs.

RabbitMQ vs Kafka vs Redis

RabbitMQ, Apache Kafka, and Redis are some of the most popular microservices message brokers on the market. However, while they’re all the same type of tool, they each offer different features that make them better adapted for specific use cases. To further understand this, in this article, we will outline the main similarities and differences between these tools and highlight which is the best tool for various use cases.

Best Practices for Client-Side Logging and Error Handling in React

Logging is an essential part of development. While working on React projects, logging provides a way to get feedback and information about what’s happening within the running code. However, once an app or website is deployed into production, the default console provides no way to continue benefiting from logs.

Simplifying Your Data Node Migration with Graylog

Migrating your data infrastructure can sound daunting, especially when you’re dealing with complex systems like OpenSearch. But what if it could be easier, almost ridiculously easy? If you’re thinking, “Hey, wait a second, could this be as seamless as it sounds?”, you’re in for a pleasant surprise. In this blog, we’re diving into how Graylog simplifies your data node migration and makes the process smooth, secure, and efficient.

Java Logging Basics: Concepts, Tools, and Best Practices

Imagine you’re a detective trying to solve a crime, but all the evidence is invisible. Sounds impossible, right? That’s exactly what it’s like trying to debug a Java application without proper logging. Java logging is your magnifying glass, your fingerprint kit, and your trusty notepad all rolled into one. It’s the unsung hero that helps you understand what’s going on under the hood of your application. But logging isn’t just about catching bugs.

The 3 pillars of observability: Unified logs, metrics, and traces

Understanding telemetry signals for better decision-making, improved performance, and enhanced customer experiences. Telemetry signals have evolved significantly over the years. If you blinked, you could have missed it. In fact, much of the common wisdom about observability needs a refresh. If your observability solution doesn’t consider the current state of telemetry, you might need an upgrade.

How search accelerates your path to "AI first"

The combination of AI and search enables new levels of enterprise intelligence, with technologies such as natural language processing (NLP), machine learning (ML)-based relevancy, vector/semantic search, and large language models (LLMs) helping organizations finally unlock the value of unanalyzed data. Search and knowledge discovery technology is required for organizations to uncover, analyze, and utilize key data.

Troubleshooting Microservices with Splunk Observability Cloud and the AI Assistant for Observability

In this video, I’m going to show you how to troubleshoot microservices in Splunk Observability Cloud using features like APM’s Service Map and Tag Spotlight to identify what’s causing our microservice to produce high error rates. We’ll then review Related Logs in Log Observer to determine why the error in our service is occurring.

The new era of observability: Why logs matter more than ever

20 years ago, software ate the world. The old ways of monitoring, failing over, or routinely rebooting quickly became inadequate, and with a new focus on software excellence, how we monitor and maintain applications had to be rethought. Even back then, when new software was released on an annual basis, it was clear that developers and futurists needed to build, inform, and optimize their approach, which required a deeper understanding of the application experience.

Top Tips for Querying OpenSearch

OpenSearch allows you to store a sizeable amount of data, commonly logs, metrics, and documents. You access useful data within OpenSearch by querying to get specific information, deep analysis, and insights for decision-making. With OpenSearch, you can perform complex searches by using natural language, Boolean operators, and filters to pinpoint relevant information efficiently.

Revisiting improved HTTP logging in ASP.NET Core 8

A few years ago, I had a play with the HTTP logging added in ASP.NET Core 6. ASP.NET Core 8 introduced a set of additional configuration options that I believe are essential to make this feature usable. I will recap the details from the previous post below, but for more context, the first part of this series is here. In this post, I'll go through some of the changes introduced in HTTP logging since then. Before I jump into the improvements, let's recap how to set up HTTP logging.

How to Build a Data Migration Plan? A Step By Step Guide

Data is growing at an extraordinary pace, with a compound annual growth rate (CAGR) of 28% projected over the next few years. For organizations dealing with logs, metrics, and traces, this massive data expansion brings both opportunities and challenges. As data volumes soar, having flexibility in where you store and analyze it (whether in a SIEM, object storage, or other platforms) has become essential.

Understanding Java Logs

Logs are the notetakers for your Java application. In a meeting, you might take notes so that you can remember important details later. Your Java logs do the same thing for your application. They document important information about the application’s ability to function and problems that keep it from working as intended. Logs give you information to help fix coding errors, but they also give your end users information that helps them monitor performance and security.

Launch Week Keynote: Log Management and Analytics

Logs have always been a critical component of observability, extending beyond embedded development. Many embedded teams rely heavily on logs and strive to enhance efficiency. Some struggle to extract value from their logs due to the complexities of collection and analysis. Regardless of your team's current situation, we are here to help. Our new Log Management and Analytics offering streamlines log collection, accelerates investigations, and transforms your logs into easily accessible insights across your entire fleet.

Introducing the Observability Center of Excellence: Taking Your Observability Game to the Next Level

Chasing false alerts — or worse, having your system go down with no alerts or telemetry to give you a heads-up — is the nightmare we all want to avoid. If you’ve experienced this, you’re not alone. Before joining Splunk, I spent 14 years as an observability practitioner and leader for several Fortune 500 companies and in my 2.5 years with Splunk I have had the opportunity to work with customers of all shapes and sizes.

How to Integrate Docker with Logit.io

Docker is an open-source container service provider, designed to help developers build, run, and share container applications. Users building and running these container applications need effective debugging and monitoring practices, and for this they have turned to Docker logging. To explain why this matters, the latest edition of our how-to guide series focuses on Docker.

Using Trace Data for Effective Root Cause Analysis

Solving system failures and performance issues can be like solving a tough puzzle for engineers. But trace data can make it simpler. It helps engineers see how systems behave, find problems, and understand what's causing them. So let’s chat about why trace data is important, how it's used for finding the root cause of issues, and how it can help engineers troubleshoot more effectively.

All about Explore Logs for Grafana Loki (Loki Community Call October 2024)

In this Community Call, Senior Software Engineer Trevor Whitney talks to us all about Explore Logs for Grafana Loki, an open-source app for visualizing logs from Loki in Grafana without needing to learn and write LogQL queries. He is joined by Senior Developer Advocates Nicole van der Hoeven and Jay Clifford. Community Calls are monthly meetings that are open to everyone interested in the development of Loki. They are an opportunity for software engineers working on Loki to discuss new features as well as for open-source users of Loki to ask questions.

The Best Tips for Implementing Effective Alerts

When monitoring your logs, metrics, and traces, it’s crucial that you can detect issues early to ensure the uptime of applications, alleviate bottlenecks, and enhance the performance of your systems. If you’re an experienced developer or IT professional, this is a straightforward task when you’re viewing the data in front of you. However, when you aren’t viewing your data, it's just as important to guarantee that your systems are functioning optimally. This is achieved through alerts.

Transforming cybersecurity with Elastic Search AI: A game-changer for Proficio

How Proficio leveraged Elastic Security on AWS to revolutionize threat detection and response. In today’s rapidly evolving digital landscape, maintaining robust cybersecurity defenses has never been more critical. Proficio, a leading managed security services provider, faces the continual challenge of monitoring an expansive array of data points and potential vulnerabilities.

Webinar Recap | Next Gen Log Management: Maximize Log Value with Telemetry Pipelines

During our webinar, Next Gen Log Management: Maximize Log Value with Telemetry Pipelines, we discussed how you can take your log management strategy to the next level with telemetry pipelines and unlock the full potential of your data. Bill explained that the rapid growth of log data is driving up storage and management costs. He emphasized the need for an intelligent, adaptable log management system to efficiently handle this situation.

Logz.io Earns 15 G2 Badges for Fall 2024: AI-Powered Observability That Delivers

At Logz.io, we believe that observability should be simple, smart, and fast—powered by AI to help teams move with confidence. This Fall, our users recognized that commitment by awarding Logz.io 15 badges on G2 across multiple categories and global markets. From ease of use to fast implementation, users and businesses alike are experiencing how AI-driven observability can transform their operations. Here’s a breakdown of what we achieved and why it matters for you.

Cribl and CrowdStrike Deepen Partnership with Falcon Next-Gen SIEM integration

Cribl is The Data Engine for Security and IT data, and integrations fuel our mission. Since day one, Cribl has been delivering new Stream integrations to meet customers where they are in their data management journey. No matter where customer data resides or needs to go, we want to be there for every customer. It’s your data, and Cribl was created to help you unlock it.

How to Slash Cyber Security Costs with Cribl Stream

Imagine the panic of a business owner who starts the day with a devastating realization: their entire database has been compromised, and the attackers demand a ransom that threatens the very survival of the business. Unfortunately, this isn’t just a nightmare what-if, it’s an all-too-common reality in today’s connected world.

Splunking GenAI Applications for Observability Insights

Has your organization finally developed that game-changing generative AI application? Is your CTO, CIO, or CEO banking on it being a success? I bet they are! Now, here’s the big question: Are you prepared to monitor and troubleshoot your new application once users get engaged? Fear not, my boy Derek Mitchell has you covered with two incredible Splunk Lantern articles which go deep into how Splunk Observability Cloud allows you to instrument GenAI apps to gain critical observability insights.

Top 6 Tips for Forwarding Logs

Log forwarding can be seen as the first step towards centralized log management. With centralized log management, your organization can gain from enhanced visibility, monitoring, and analysis capabilities, making it a coveted practice for numerous organizations. Log forwarding is crucial for maintaining robust IT security and operational efficiency, allowing organizations to manage and analyze logs from multiple systems in a centralized, scalable manner.

Infrastructure and Observability as Code | An Introduction

In this video I will introduce you to the concept of Observability as Code and what that looks like in Splunk Observability Cloud. I’ll first discuss the issues you might encounter managing infrastructure manually, and then define Infrastructure as Code so that you have a better understanding of the motivation behind Observability as Code. We’ll briefly introduce Terraform and then I’ll discuss the benefits of implementing Observability as Code using Splunk’s Terraform provider in Splunk Observability Cloud.

Top PostgreSQL Monitoring Tools

PostgreSQL is a powerful open-source relational database management system (RDBMS) and is one of the most popular relational databases with over 1.5 billion users. It’s renowned for its reliability, robustness, and comprehensive capabilities. It is capable of managing a broad variety of workloads, from small single-machine applications to large-scale enterprise databases.

Agents of Mass Collection: Cribl Edge Set-up and Tips

Collection agents emerged to alleviate the pain of having log files distributed around your application servers. However, they brought new problems since each log analysis tool wanted its own agent, trading in its own protocols and/or formats, usually targeting only a single use case. Meaning you had to install multiple agents for different use cases. Onboarding data and managing all these agents seems to be an afterthought.

What I Wish I Knew Before Building My First OTel Collector

Starting your journey to build your first OTel Collector can be really exciting, but it can also feel a bit overwhelming. OpenTelemetry, or OTel, is an amazing tool that can help standardize the collection of observability data, but it's normal to feel a bit lost at first. There are lots of little details and best practices that can make the whole process easier, but many of us end up learning them the hard way.