How MDR Services Can Optimize Threat Intelligence

Managed Detection and Response (MDR) services play a critical role in cybersecurity. These technologies remotely monitor, detect, and respond to threats, blending threat intelligence with human expertise to hunt down and neutralize potential risks. However, one of the biggest challenges MDRs face is managing the sheer volume and variety of threat intelligence data they receive. This data comes from internal resources and the numerous security technologies their customers use, making it difficult to create a cohesive picture of the threat landscape.

How to Get Started with a Security Data Lake

Modern SecOps teams use Security Information and Event Management (SIEM) software to aggregate security logs, detect anomalies, hunt for threats, and enable rapid incident response. While SIEMs enable accurate, near real-time detection of threats, today's SIEM solutions were never designed to handle the volume of security data organizations generate daily. As daily log ingestion grows, so do the costs of data management.

Mastering Microservices Logging - Best Practices Guide

Microservices architectures have revolutionized software development, enabling scalability and flexibility. However, they also introduce complexities in system monitoring and troubleshooting. Effective logging is crucial for maintaining visibility and diagnosing issues in these distributed environments. This comprehensive guide explores best practices for microservices logging, helping you navigate the challenges and implement robust logging strategies.

Reduce SNMPv3 Trap Volume With Cribl Lookups

Despite new technologies and telemetry formats, like Model-driven Telemetry/Streaming Telemetry and OpenTelemetry, SNMP traps continue to be a significant source of events for monitoring teams. If you’ve been in IT operations, you’ve likely had a request to parse SNMP traps into a human-readable format so that they can be analyzed, probably deduplicated, and passed to a ticketing system for triage and remediation. The challenge? SNMP traps can be excessively chatty.

New GenAI Search Revamps Customer Experience

Splunk has launched a GenAI summary feature in the splunk.com and docs.splunk.com search platforms, designed to give users a quick and accurate glance at the most pertinent information they are looking for. This GenAI feature serves up a contextual high-level summary pulled from various relevant search results on topics ranging from Splunk product and feature usage to general Splunk terminology.

A Day in the Life of a Mezmo SRE

What keeps an SRE at the top of his game? I had an insightful conversation with Jon Duarte, a Site Reliability Engineer (SRE) at Mezmo and he walked me through his role and the various tasks he manages on a typical day. Here’s Jon offering a brief glimpse into the challenges he faces, the thought processes behind his approach, and the innovative solutions SREs come up with.

Once Again, Logz.io is an Observability Visionary

When Gartner publishes their annual observability industry research, it’s always exciting to find your company named among the most successful and high-profile providers in this space. That’s why Logz.io is thrilled to find itself listed as a Visionary for the third consecutive year in the Gartner Magic Quadrant for Observability Platforms (previously known as the Magic Quadrant for Application Performance Monitoring and Observability).

Cribl Closes $319M Series E Round at a $3.5B Valuation to Revolutionize Enterprise Data Management

I’m so excited to share that Cribl has closed a $319M Series E round! The oversubscribed round was led by GV (Google Ventures), joined by new investor CapitalG along with participation from existing investors GIC, IVP, and CRV. This round values Cribl at $3.5 billion, up 40% from our Series D round in 2022, and includes both primary and secondary.

The Best Elasticsearch Alternatives

Elasticsearch is a distributed search and analytics engine that provides real-time operations and scales horizontally. This helps users run quick and effective searches, as well as analyze and visualize huge data volumes. Users commonly commend Elasticsearch for its data indexing and storage capabilities. They highlight its efficiency in indexing text data and its proficiency in managing large data sets for persistence and retrieval.

Introduction to Splunk Synthetic Monitoring in Splunk Observability Cloud

In this video I’m going to introduce you to Splunk Synthetic Monitoring in Splunk Observability Cloud. I’ll explain what synthetic monitoring is and then demonstrate a simple example by creating a browser test for a sample e-commerce site. I’ll also demonstrate how you can link issues found through synthetic monitoring with backend code due to its integration with Splunk APM.

Conquering Data Silos with Cribl: The Universal Receiver Makes Data Integration a Breeze

As a solutions engineer, I always handle the complex challenge of collecting IT and security data. The variety of modern ephemeral systems increases the complexity of collection requirements. Cloud, PCF, and Kubernetes emit metrics, logs, and traces through methodologies like Cloud Foundry’s Nozzle, Prometheus scrapers, and OpenTelemetry collectors. I often find all of these deployed in parallel in a single enterprise environment to meet the evolving needs of IT Ops or SecOps.

Supercharging Engineer Productivity with Real World AI

That’s the assessment of Senior DevOps Engineer and Logz.io user Armin Morattab when discussing the impact of AI on his day-to-day job. He dives deep on AI, observability, and strategies for improving workflows with Logz.io Co-founder Asaf Yigal in our webinar, AI in Observability: Real Engineers Talk Real Uses Cases.

What you should know about Datadog Flex Logs

Late last year, Datadog announced something called Flex Logs, a “more affordable” warm storage tier for log data. Designed for high-volume datasets that are infrequently queried and don't require real-time analysis, the Flex Tier offers Datadog Log Management customers a third option for data storage.

Fundamentals of a Successful Logging and Observability Strategy

Your team is responsible for ensuring the reliability and performance of your organization’s critical applications and infrastructure. What keeps you up at night? Your applications are more complex, distributed and cloud-native than ever, meaning that understanding what’s happening under the hood has never been more complex than it is now. Is it system bugs, or data bottlenecks? Chasing alerts for latency or service degradation that may or may not be business-critical?

Introduction to Log Observer Connect in Splunk Observability Cloud

Log Observer Connect will allow you to connect to and view/query logs from your Splunk Enterprise or Splunk Cloud instance from within Splunk Observability Cloud. In this video, I will introduce you to Log Observer Connect in Splunk Observability Cloud and walk you through a demonstration of how it works. You’ll learn how to view and query logs, as well as save queries for later use. I’ll also walk you through a practical example of when you might use Log Observer Connect through the use of Related Logs.

Setup Log Observer Connect in Splunk Observability Cloud

Log Observer Connect will allow you to connect to and view/query logs from your Splunk Enterprise or Splunk Cloud instance from within Splunk Observability Cloud. In this video, I will briefly explain what Log Observer Connect is and then show you how to connect your Splunk Observability Cloud organization to a Splunk Enterprise instance through Log Observer Connect.

SNMP Traps as Logs | LogicMonitor

In this short demo video, Michael Rodrigues, Senior Product Manager, will give you a tour of SNMP Traps as Logs, a new way to monitor SNMP traps with LogicMonitor. SNMP Traps as Logs enables real-time, event-driven notifications for critical networking issues within a user-friendly interface, unlocking instant insights. By ingesting SNMP traps as logs instead of EventSources, you can consolidate network troubleshooting efforts within a single pane of glass for a holistic Network Monitoring approach, eliminate monitoring gaps, improve reliability, and facilitate resource planning.

Observability Meets Security: Build a Baseline To Climb the PEAK

When we hunt in new environments and datasets, it is critical to build an understanding of what they contain, and how we can leverage them for future hunts. For this purpose, we recommend the PEAK Threat Hunting Framework's baseline hunting process.

The Leading End to End Monitoring Tools

End-to-end monitoring refers to the comprehensive assessment of the whole IT environment to understand the overall state of the IT infrastructure and how it impacts user experience. It differs from traditional monitoring techniques in that it views the IT environment from a more holistic and user-centric perspective.

aNN vs kNN: Understand their differences and roles in vector search

In today's digital era — where data grows exponentially and becomes increasingly complex — the ability to efficiently search and analyze this vast ocean of information has never been more important. But it's also never been more challenging. It's like trying to find a needle in a haystack but with the added challenge of the needle constantly changing its form. This is where vector search emerges as a game-changer, changing how we interact with large data sets.

Your Data Your Cloud: Cribl Stream Managed Worker Groups in Microsoft Azure

One of our most commonly asked questions is when we will support Worker Groups in Azure. We’ve heard you loud and clear; some exciting news will make your data management much more straightforward. We’re introducing a Cribl-managed Cribl Stream data plane, also known as Worker Groups, in Microsoft Azure. These Worker Groups are oil to your engine—essential for data operations, handling everything from shaping and transforming to enriching and processing your data.

What Is Five 9s in Availability Metrics?

What comes to mind when you hear that an IT component has “five 9s availability”? Five 9s availability of >= 99.999% is the peak metric for IT availability. Five 9s predicts that a measured component — whether it is a server, communication line, app, service, or any other item — will be available at least 99.999% of the time during a specific period.
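To make the figure concrete, the downtime budget implied by an availability target can be worked out with simple arithmetic. The following is a minimal sketch, not taken from the original article: the function name `allowed_downtime_seconds` is hypothetical, and the measurement period is assumed to be a 365-day year.

```python
def allowed_downtime_seconds(availability_percent: float, period_seconds: float) -> float:
    """Return the maximum downtime (in seconds) permitted within the period."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    # The unavailable fraction of the period is (100% - availability).
    return period_seconds * (1.0 - availability_percent / 100.0)


SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: non-leap year

# Downtime budget for a five 9s (99.999%) target over one year.
yearly = allowed_downtime_seconds(99.999, SECONDS_PER_YEAR)
print(f"{yearly / 60:.2f} minutes per year")  # prints "5.26 minutes per year"
```

Under these assumptions, a five 9s component may be unavailable for roughly 315 seconds, or a little over five minutes, across an entire year.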

Splunk Named a Leader in the Gartner Magic Quadrant for Observability Platforms

"Transformative Solution" says a Director of IT in a $30B+ retailer. "Best Monitoring and Observability Tool > Splunk," is how a software engineer in a software company labels it. These are only a couple of the terms our customers use when describing the value they are getting from Splunk. With these descriptions in mind, we are elated that Splunk has been named a Leader in the 2024 Gartner Magic Quadrant for Observability Platforms for the second year in a row in this category.

Meet Your New Query Sidekick: The Coralogix AI Query Assistant

Becoming an expert in any query language can take years of dedicated study and practice. At Coralogix, however, we believe observability should be accessible to everyone. That’s why we’re thrilled to announce the launch of our latest innovation (and your new sidekick): the AI Query Assistant. The AI Query Assistant revolutionizes the way you interact with your data.

New Hybrid Worker Group Support in Cribl Lake

Cribl Lake is simple, it’s storage, it’s simplified storage to keep large volumes of IT and security data for long retention periods. And now it’s even easier for you to start using Cribl Lake. In addition to Cribl-managed Cloud Worker Groups, cloud customers can now use self-managed Hybrid Worker Groups to send data directly to Cribl Lake. This means all your worker groups, whether hybrid or cloud, can write data to Cribl Lake — all coordinated by your Cribl.Cloud Leader.

IBM partners with Elasticsearch to deliver Conversational Search with watsonx Assistant

To meet customer needs for scale, speed, and precision, IBM partners with Elasticsearch to deliver retrieval augmented generation (RAG) capabilities that can be seamlessly integrated into the IBM watsonx Assistant’s new Conversational Search feature. Customers using IBM watsonx Assistant and watsonx Orchestrate can now build conversational AI assistants grounded on their company data with comprehensive search capabilities with RAG.

How DPM monitoring helps you manage your metrics volume

At Sumo Logic, we’re committed to helping you scale without breaking your budget. As you may have heard, we recently launched Flex Licensing, a first-of-its-kind economic model that offers free, unlimited log data ingest so different teams can capture and analyze critical data across their enterprise in one place. We’re also committed to tackling related challenges raised by other data sources — like metrics.

Cribl Copilot: Lets You Bypass the Learning Curve

Think of it as your digital concierge to achieve faster time-to-value. IT and security teams face more challenges than ever, with data growing at 28% CAGR and taking numerous shapes and forms. Cribl’s suite of products – Stream, Edge, Search, and Lake – is built on a unified data processing engine specifically designed for IT and security data.

Under Pressure? Let Cribl 4.8 Take the Heat Off Your Data Management Woes

The demands on IT, observability, and security teams have never been greater. With data volumes exploding at a 28% CAGR and hybrid environments becoming the norm, organizations are facing significant challenges: those rapidly growing data volumes I mentioned, the intricacies of hybrid and cloud-native architectures, and the need for real-time insights. Oh, and don’t forget the constant threat of security breaches.

Deploying OpenSearch Effortlessly with Terraform

Creating OpenSearch clusters is crucial for organizations aiming to harness the power of distributed search and analytics. These clusters allow businesses to efficiently store, index, and examine extensive amounts of data in real time, offering valuable insights for decision-making and operational efficiency. A significant advantage of creating OpenSearch clusters is that they support replication and shard allocation, which ensures high availability and fault tolerance.

Stack Overflow rolls out generative AI using Elasticsearch and Azure OpenAI

Stack Overflow puts Elastic at the heart of OverflowAI powered by Azure OpenAI, a new search tool that enables developers to retrieve trusted information from a knowledge base of 60 million questions and answers. About Elastic: Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale. Elastic’s solutions for search, observability, and security are built on the Elastic Search AI Platform, the development platform used by thousands of companies, including more than 50% of the Fortune 500.

Elastic named a Leader in the 2024 Gartner Magic Quadrant for Observability Platforms

Elastic has been named a Leader in the 2024 Gartner Magic Quadrant for Observability Platforms. The need for observability platforms continues to evolve as operations teams deal with increased complexity and exponential data growth. Emerging trends like generative AI are driving a paradigm shift in proactive root cause detection and resolution.

Beyond RAG basics: Advanced strategies for AI applications

Our recent virtual event with Cohere dove deep into the world of retrieval augmented generation (RAG), focusing on the critical considerations for building RAG applications beyond the proof-of-concept stage. Our speakers, Lily Adler, principal solutions architect at Elastic, and Maxime Voisin, senior product manager at Cohere, shared valuable insights on the challenges, solutions, and best practices in this evolving field of natural language processing (NLP).

Navigating Open Source Software: All Your Questions Answered

Open source software refers to computer programs with source code available for anyone to inspect, modify, and distribute. Unlike proprietary software, open source software is developed collaboratively by a community of developers. One of the main benefits of open source software is cost savings. Because the source code is freely available, organizations can use and customize the software without paying licensing fees, reducing costs, especially for large-scale deployments.

observIQ Expands Advanced Support for Sumo Logic in Security and Observability Data

We’re excited to announce that as part of our expanded alliance with Sumo Logic, observIQ extended its support for Sumo’s platform. This allows customers to send logs and metrics to Sumo Logic, leveraging our telemetry pipeline, BindPlane. We’ve also made it possible to automatically recommend processors in our pipeline that format data specifically as Sumo Logic expects—once Sumo Logic is a destination for BindPlane.

Introduction to K8s Horizontal Pod Autoscaling | Monitor Autoscaling in Splunk Observability Cloud

In this video, I’m going to introduce you to Horizontal Pod Autoscaling in Kubernetes and monitoring autoscaling events in Splunk Observability Cloud. I’ll first walk through our simple application deployment definition. We will analyze the metrics of that application in Splunk Observability cloud, identifying that the application is under resource pressure. I’ll then discuss the scaling options at our disposal, and we will walk through an implementation of a Horizontal Pod Autoscaler that will automatically scale our pods according to the load they are receiving.

How the Cribl SRE Team Uses Cribl Products to Achieve Scalable Observability

This is the first of a planned series of blog posts that explain how the Cribl SRE team builds, optimizes, and operates a robust Observability suite using Cribl’s products. Cribl.Cloud operates on a single-tenant architecture, providing each customer with dedicated AWS accounts furnished with ready-to-use Cribl products. This provides our customers with strict data and workload isolation but presents some interesting and unique challenges for our infrastructure and operations.

How to Start Contributing to Open Source with OpenTelemetry

Today, open source software is everywhere – from Linux-based servers, to Android smartphones, to the Firefox Web browser, to name just a handful of open source platforms in widespread use today. But the open source code driving these innovations doesn't write itself. It's developed by open source contributors – and you could be one of them.

Dogfooding at Mezmo: How we used telemetry pipeline to reduce data volume

Like many other organizations, we at Mezmo struggle with a lot of telemetry data, and for a while our team configured our logs to be sent to a global Mezmo Log Analysis account in our SaaS so we would have a single pane of glass to view all of our logs. Our SRE team wanted to make sure that we have experience utilizing our new pipeline product. We set out some goals before we started using telemetry pipeline.

Best Windows Server Monitoring Tools

Server monitoring involves continuously observing and tracking the performance, availability, and health of servers within an IT infrastructure, and it is a vital process for organizations aiming to enhance their servers. Because server monitoring tools continuously track server health and performance metrics, your organization can promptly detect issues such as hardware failures or software glitches and resolve them quickly.

What is Log Aggregation? A Complete Guide

As modern IT infrastructure becomes increasingly complex, businesses generate far more logs in real time than they did in the past. Streamlining this unstructured log data into a more structured form therefore becomes vital. Organizations must collect unstructured log data from various sources, extract meaning from it, and store it in a centralized repository. That’s where Log Aggregation comes in.

Event Logs Explained: Your Guide to System Health

Event logs contain critical information and the analysis of these logs will support organizations in the detection of many security incidents, from auditing user access to observing malicious traffic and even isolating monitor rule changes on a firewall. By collecting event logs systematically and analyzing them, organizations can obtain insights into their IT environment for maintaining operational efficiency and security.

Elastic Search 8.15: Accessible semantic search with semantic text and reranking

In 8.15, great search results are even more accessible for our customers. Our latest release brings semantic reranking, additional vector search tools, and more third-party model providers and promotes our native Learning to Rank (LTR) to generally available. And now search is more performant than ever with additional speed and efficiency improvements.

Elastic Observability 8.15: AI Assistant, OTel, and log quality enhancements

Elastic Observability 8.15 announces several key capabilities: new and enhanced native OpenTelemetry capabilities, Elastic AI Assistant enhancements, and large language model (LLM) observability for Azure OpenAI. Elastic Observability now provides deep visibility on the usage of the Azure OpenAI Service. The integration includes an out-of-the-box dashboard that summarizes the most relevant aspects of the service usage, including request and error rates, token usage, and chat completion latency.

Applying a Data Engineering Approach to Telemetry Data

The exponential growth of telemetry data presents a significant challenge for organizations, who often overspend on data management without fully capitalizing on its potential value. To unlock the true potential of their telemetry data, organizations must treat it as a valuable enterprise asset, applying rigorous data engineering principles to glean the critical insights and accelerated investigations this data is meant to enable. The telemetry data platform approach democratizes access across disciplines and personas and fosters widespread utilization across the organization.

Managing Observability Pipeline Chaos

The cloud environment has generated an unprecedented volume of data, making it increasingly difficult for enterprises to manage. With multiple SaaS and cloud-based applications in play, differentiating which data needs processing for analysis versus storage for regulatory compliance is a significant challenge. The growing number of data sources only complicates this further. So, getting clarity and control over this chaos is the goal, without having to overhaul your entire system.

How to integrate Okta logs with Grafana Loki for enhanced SIEM capabilities

Identity providers (IdPs) such as Okta play a crucial role in enterprise environments by providing seamless authentication and authorization experiences for users accessing organizational resources. These interactions generate a massive volume of event logs, containing valuable information like user details, geographical locations, IP addresses, and more. These logs are essential for security teams, especially in operations, because they’re used to detect and respond to incidents effectively.

Cribl Search Provides an Audit Capability to Assess Your Snowflake Account

Only last month, Cribl added Snowflake to its growing list of accessible data stores it can search. Using Cribl Search, admins can now leverage Cribl’s search-in-place capability to query data located in Snowflake’s data warehouse. Boy, did we have the timing right? Today, Snowflake customers and other incident response teams are still determining the nexus of the incident.

Shh, It's a Secret: Keeping Them Safe in Cribl's Software

Remember when you used to jot down passwords on sticky notes? Well, those days are long gone. In today’s world of data pipelines, secrets, such as API keys, are like digital VIP passes. They open doors to critical systems and keep sensitive info on lockdown. At Cribl, we’re all about top-notch data security, and that means guarding your secrets like treasure. Let’s dive into our game plan for keeping secrets safe throughout the entire software development lifecycle (SDLC).

Cribl Lake Wins CRN 2024 Tech Innovators Award for Data and Information Management

The greatest innovations are often the simplest. They address fundamental needs and make life easier in the most direct way. Cribl Lake was just announced as the winner of CRN’s 2024 Tech Innovators Award for Data and Information Management. We are so happy and honored by this recognition, which solidifies our belief that the best innovations are indeed the simplest.

How to Monitor JVM with OpenTelemetry

The Java Virtual Machine (JVM) is an important part of the Java programming language, allowing applications to run on any device with the JVM, regardless of the hardware and operating system. It interprets Java bytecode and manages memory, garbage collection, and performance optimization to ensure smooth execution and scalability. Effective JVM monitoring is critical for performance and stability. This is where OpenTelemetry comes into play.

An Overview of the OpenTelemetry Collector's Configuration File

In this video, I’ll provide an overview of the OpenTelemetry Collector’s configuration file (config.yaml) with examples from the Splunk distribution. I will briefly explain the components of the Splunk OTel Collector, and walk you through a sample generic configuration of the OTel Collector. We’ll then use the Splunk Observability Cloud interface to construct the commands needed to install the Splunk OTel Collector on a specific host. This installation will copy a default Splunk OTel Collector configuration onto the host, and we’ll review the Splunk specific components of this configuration.

Introducing Squadcast's Audit Logs: Enhanced Visibility and Control

Maintaining comprehensive records of user and entity-related changes within your Incident Management platform is crucial. Organizations have long relied on external analytics tools for these insights. However, the demand for an integrated solution within Squadcast has been growing. We are excited to introduce Squadcast's Audit Logs feature, designed to address this need directly within our platform.

Data Is a Blizzard: Just Because Each Snowflake Is Unique Doesn't Mean Your Search Tools Have to Be Too

Cribl Search is agnostic, allowing administrators to now query Snowflake datasets as they can dozens of other Lakes, Stores, Systems & Platforms. The data that IT and security teams rely on to monitor network operations continues to grow at a 28% CAGR, and it’s stressing many organizations’ ability to analyze all this data effectively. In fact, in some cases, less than 2% of it ever gets looked at.

How to Send Grafana Alloy Logs to Grafana Loki | Ask the Experts | Grafana

In this video, Matt Durham, Sr. Software Engineer on the Grafana Alloy team, shows you how to send Grafana Alloy logs to Loki. Specifically, we address the question: "Is it possible to send data from one Grafana Alloy to another? Could anyone supply me with config examples of such interactions? If I send data from Grafana Alloy directly to Loki, it is working. If I send data from Grafana Alloy to another, and then to Loki, the second instance gives me an error."

You don't need ALL those metrics!

Metrics are key to monitoring system health and performance, but you are probably ingesting far more metrics than you will ever need or use. The issue is that popular tools in this space, such as OpenTelemetry and Prometheus, leverage node exporters to emit a plethora of metrics. OpenTelemetry tracks even the minutest details of system performance. Prometheus exporters can generate a vast array of metrics, ranging from CPU usage to disk I/O, and everything in between.

The Power of Combining a Modular Security Data Lake with an XDR

The 2024 Global Digital Trust Insights survey from PwC reports that 36% of businesses have experienced a data breach that cost more than $1 million to remediate. Cyber threats are clearly on the rise and in today’s volatile threat environment, it is a matter of when - not if - a cybersecurity incident will occur. Digital adversaries are becoming more sophisticated and relying on weak links to exploit company applications and infrastructure.

Decision Intelligence: An Introduction

Every day, employees and leaders of enterprise IT organizations make multiple decisions that affect their company’s success or failure. To stay ahead of the competition and drive innovation, an increasing number of organizations are turning to decision intelligence (DI), a relatively new field combining data science, decision theory and artificial intelligence, to augment and improve decision-making.

Graylog Geolocation: Mapping Your Log Data

In today’s distributed work environment, understanding the geographic origin of network traffic has become more crucial than ever. As organizations adapt to remote work, IT teams face the challenge of monitoring and analyzing an expanding array of IP addresses from various locations. Graylog’s geolocation feature offers a powerful solution to this challenge, allowing teams to extract and visualize geographic information from IP addresses in their logs.

Unlocking Business Insights with Telemetry Pipelines

Imagine running a large company where data-driven decisions give you a competitive edge. You use a lot of business intelligence tools that tap into vast amounts of data, such as sales figures, inventories, and expenses. This analysis tells you how your company is performing. However, it does not reveal how your "company infrastructure" is performing. This crucial information comes from your systems in the form of telemetry data, such as logs and events.

The Leading Network Device Monitoring Tools

Ensuring the security of your network infrastructure is critical for all organizations, and this requires going beyond traditional network monitoring and incorporating the monitoring of network devices, such as routers and switches. Whilst network monitoring includes the monitoring of devices, dedicated network device monitoring is a more thorough process for guaranteeing the health and performance of your organization's network devices.

Unlock the Value of Cloud: Introducing Splunk Cloud Value Calculator

In the rapidly evolving digital landscape, organizations are increasingly turning to the cloud powered with AI capabilities to enhance efficiency, scalability and innovation. Splunk, a leader in security and data observability, has been at the forefront of this transformation.

Setting up and Understanding OpenTelemetry Collector Pipelines Through Visualization

Observability provides many business benefits, but comes with costs as well. Once the (not-insignificant) work of picking a platform, taking an inventory of your applications and infrastructure, and getting buy-in from leadership (both from the business and engineering sides of the house) is done, you then have to actually instrument your applications to emit data, and build the data pipeline that sends that data to your observability system.