December 2022

Efficient Help Desk Processes with Centralized Log Management

Another day starting up your laptop or workstation, logging into programs, and waiting for that first call to come in. As an IT help desk analyst, you love when you can solve people’s problems, but sometimes the number of calls feels overwhelming. Although each analyst tier responds to different customer or employee concerns, you all share the same basic job functions like answering calls, asking questions, and researching answers.

How to Configure the OTel Community Demo App to Send Telemetry Data to Coralogix

If you’re just getting familiar with full-stack observability and Coralogix and you want to send us your metrics and traces using the new OpenTelemetry Community Demo Application, this blog is here to help you get started. In this simple, step-by-step guide, you will learn how to get telemetry data produced by the OpenTelemetry Demo Webstore into your Coralogix dashboard using Docker on your local machine.

Ingesting and analyzing 2022: an LM Logs success story

A new year means a new set of goals. In 2022, we set some lofty goals to help our customers achieve clarity across their modern IT infrastructure. We set out to do this by improving our log collection and analysis within LM Envision, our unified observability platform, which was announced at LogicMonitor’s Elevate user conference this summer. At the conference, we gathered feedback to understand the various ways our customers access and review log data.

What are SysLog formats? How to use them?

Syslog is a standard for message logging that allows devices such as routers, switches, and servers to send event messages to a central log server. The messages sent by these devices are known as syslog messages and include information such as the date, time, device hostname, and message content. Syslog was originally developed as a part of the BSD operating system, but many other operating systems and network devices have since adopted it.
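To make the fields named above concrete, here is a minimal sketch that splits one BSD-style syslog line into its parts. It assumes the classic `<PRI>Mmm dd hh:mm:ss host message` layout, where the priority value encodes facility * 8 + severity; the function name `parse_syslog` and the sample line are illustrative only, and real devices often deviate from this layout.

```python
import re

# Assumed layout (BSD-style syslog): <PRI>Mmm dd hh:mm:ss host message
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<hostname>\S+)\s"
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return the fields of one syslog line as a dict, or None if it does not match."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        "facility": pri // 8,   # priority = facility * 8 + severity
        "severity": pri % 8,
        "timestamp": match.group("timestamp"),
        "hostname": match.group("hostname"),
        "message": match.group("message"),
    }


# Illustrative sample line (hypothetical host and user names).
sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
parsed = parse_syslog(sample)
print(parsed)
```

A central log server would typically apply this kind of parsing to every incoming message before indexing it; lines that do not match the assumed layout come back as `None` so they can be handled separately.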

Logs UI | An intuitive UI for Log Management

A logs UI is a user interface for displaying log data. Logs are records of events that happen on a computer system, such as messages indicating that a particular operation has been performed or an error has occurred. A logs UI typically allows a user to view and search through log data and may also provide features such as filtering and highlighting to help the user find specific log entries of interest.

Logging as a service | Log Management with Open Source

Logging as a service (LAAS) is a type of cloud computing service that allows organizations to store and manage their log data in a central location. This type of service typically includes features such as centralized storage, real-time analytics, and search capabilities, as well as tools for visualizing and analyzing log data. Logs help you debug and troubleshoot your applications. They are also useful for other purposes like auditing and compliance, performance monitoring, and security.

Client logging | Best practices and examples

Client logging refers to the practice of collecting and storing log messages generated by client software, such as a web browser or mobile application. These log messages can provide valuable information about the behavior and performance of the client software, as well as any errors or issues that may have occurred. Client logging is often used by developers to troubleshoot and debug software issues, as well as to gather data for analysis and performance optimization.

Perf8: Performance metrics for Python

One tool for all your Python performance tracking needs. We're building this neat service in Python to ingest data into Elasticsearch from various sources (MySQL, Network Drive, AWS, etc.) for Enterprise Search. Sucking data from a third-party service to Elasticsearch is usually an I/O-bound activity. Your code sits on opened sockets and passes data from one end to the other. That's a great use case for an asynchronous application in Python, but it needs to be carefully crafted.

Coralogix Makes Observability Collaborative

In the world of observability, there are several distinct problems to solve. Fast queries, intuitive visualizations, scalable storage, and more. The technical problems receive the most attention; however, there is another, more subtle problem. How do observability platforms facilitate collaboration on the scale needed by organizations?

2022 Year in Review

If you are like me, you always look forward to reading (here, writing) a company's Year in Review, and this year is no different. However, as I reflect on 2022, I realize we reached a five-year anniversary: the anniversary of completing a very big vision of transforming customers’ cloud object storage, such as AWS S3, into the first stream-based Search+SQL Analytic Database. Initially we provided access via the Elastic (Search) API, then Presto (SQL), at scale and in production.

JSON Logs | Best Practices, benefits, and examples

JSON (JavaScript Object Notation) is a lightweight data-interchange format that has gained widespread popularity in recent years due to its simplicity and flexibility. It is easy for humans to read and write and for machines to parse and generate, making it a great choice for transmitting data in web applications. Logs serve multiple purposes for application developers. They are essential to understand what's happening in your application.
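As a rough illustration of what JSON logs look like in practice, the sketch below uses Python's standard `logging` and `json` modules to emit each log record as one JSON object. The class name `JsonFormatter`, the helper `build_logger`, the logger name `checkout`, and the order ID are all hypothetical choices for this example, not part of any particular product.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name, stream=None):
    """Create a logger that writes JSON lines to the given stream (stderr by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]   # replace any handlers left over from earlier setup
    logger.propagate = False      # avoid duplicate output through the root logger
    return logger


# Usage: write one entry to an in-memory buffer and read it back as JSON.
buffer = io.StringIO()
log = build_logger("checkout", buffer)
log.info("order placed: %s", "A-1001")
entry = json.loads(buffer.getvalue())
print(entry["level"], entry["message"])
```

Because every entry is a self-describing object, a log pipeline can filter on fields such as `level` or `logger` without writing a custom parser for each application.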

How OpsRamp Log Management Helps You to Find and Fix Issues Faster

OpsRamp has enhanced its hybrid observability capabilities by adding an integrated log management solution to unify log, event and alert data within customers’ monitoring and event management command center. Presenting this log data as part of a unified view of IT performance data and integrating it with remediation capabilities will allow enterprises and service providers to expedite the process of identifying and resolving potential issues before they impact their business operations.

Datadog on Building an Event Storage System

When Datadog introduced its Log Management product, it required a new event data storage platform, as storing logs and events is a completely different problem from storing metrics, which was the first Datadog product. Over time, Datadog introduced more and more products that needed to store and index multi-kilobyte timeseries “events”, re-using the Event Platform infrastructure from Log Management. The increased use of the Event Platform and the new feature requirements coming from new products started exposing the limitations of the legacy system and the need for a new approach.

SigNoz - Logs Performance Benchmark

Logs are an integral part of any system that helps you get information about the application state and how it handles its operations. The goal of this blog is to compare the commonly used logging solutions, i.e., Elasticsearch (ELK stack) and Loki (PLG stack), with SigNoz on three parameters: ingestion, query, and storage. Performance benchmarks are not easy to execute. Each tool has nuances, and the testing environments must aim to provide a level playing field for all tools.

How to set up Heartbeat Alerts within Sematext

Monitoring is an essential part of any IT infrastructure, as it allows you to keep track of the performance and health of your systems. However, simply monitoring your systems is not enough - you need to be alerted when something goes wrong so that you can take action to fix the issue. With Sematext, we automatically set up heartbeat alerts for many of our integrations. However, you can always customize your alerts to suit your needs. You can choose to receive these alerts via email, SMS, or even through a chat tool like Slack.

Optimizing the AWS CloudWatch Log Process

Amazon’s native monitoring and management service AWS CloudWatch is great for basic monitoring and alerts. However, on its own, it may not be the best solution for analyzing log data at scale — especially if you need to analyze data outside of AWS. Many teams may find themselves restricted by retention issues and basic analytic features with Amazon CloudWatch logs for troubleshooting use cases.

Dos and Don'ts of Observability: Lessons Learned from RedMonk

On November 16, 2022, I sat down with analyst KellyAnn Fitzpatrick from RedMonk to discuss my favorite topic: observability. This time, we looked at observability in the context of what to do and what to avoid as you start and continue an observability journey. Click the image below (or here) for a replay of the session. A machine-generated transcript is available at the end of the post.

Top 10 Logging Frameworks Across Various Programming Platforms

A logging framework is a software tool that helps developers output diagnostic information during the execution of a program. This information is used to debug the program or monitor its performance. There are many different logging frameworks available, ranging from simple logging libraries to full-fledged logging and observability platforms.

Logstash Alternatives in 2023: 10 Best Options

Data processing involves collecting, organizing, and manipulating data in a systematic manner in order to extract useful information from it. It involves a series of steps that are performed on a set of data to transform it into a more meaningful and functional form for a specific purpose. Starting from collecting the data to the end part of processing it, data undergoes several layers of checks and balances before it is let out as we see it.

Application Performance Monitoring in the Gaming Industry

The gaming industry delivers specialized software at scale to users who expect a flawless interface. Application performance monitoring (APM) will measure critical software performance parameters using telemetry data. By monitoring this data, teams can ensure their game delivers the best user experience and quickly detect when the software needs updates to fix errors or meet key performance indicators (KPIs).

Grafana Agent v0.30: Flow adds support for logging pipelines and graduates to beta

Grafana Agent v0.30 is here! The past couple of Grafana Agent releases have been pretty exciting for us. We introduced Agent Flow as a new way to configure, run, and debug telemetry pipelines. We also announced OpenTelemetry Collector components to expand on our Big Tent philosophy and allow users to switch seamlessly between the Prometheus and OTel ecosystems. This latest release continues that momentum by introducing Loki components for building logging pipelines and marking Flow mode as beta!

11 unique insights into SLOs and reliability management

A quarter has passed since we launched our Reliability Management capabilities that help developers focus on defining, monitoring and managing Service Level Objectives (SLOs) to drive great digital experiences. Reducing alert fatigue and balancing innovation with reliability are common outcomes that customers expect from Reliability Management. If you are new to SLOs, these insights from our customers capture common practices among peer developers.

What is a log shipper - Top 7 Log Shippers that you can use

Centralizing logs (arranging all records in one place) is often challenging as we need to decide whether to use a log shipper or directly log from the application. If you are not familiar with a log shipper, logging directly from the library might be a suitable option for development (it is easy to configure). However, in production, you'll likely want to use one of the available log shippers, mainly due to buffers, since blocking the application or dropping data (immediately) may not be an option.

Configuring OpenTelemetry Agents to Enrich Data and Reduce Observability Costs

BindPlane OP is a powerful, open source tool that makes it easy to build and manage telemetry pipelines to ship data from IT environments of any kind and size to any analysis tool or storage destination. BindPlane OP installs and configures OpenTelemetry agents, which support a wide variety of sources and can be configured to ship data to multiple destinations while enriching or reducing data simultaneously.

Tis the Season: 3D Observability for Prometheus + Grafana + Octoprint

You may get lucky this holiday season with a new 3D printer, either as a gift or something you give yourself as a reward for all your hard work this year. Household 3D printers have made tremendous strides in ease of use and affordability over the last decade.

Log monitoring and unstructured log data, moving beyond tail -f

Log files and system logs have been a treasure trove of information for administrators and developers for decades. But with more moving parts and ever more options on where to run modern cloud applications, keeping an eye on logs and troubleshooting problems have become increasingly difficult.

Modern observability and security on Kubernetes with Elastic and OpenTelemetry

The structured nature of Kubernetes enables a repeatable and scalable means of deploying and managing services and applications. This has led to widespread adoption across market verticals for both on-premises and cloud deployment models. The autonomous nature of Kubernetes operation, however, demands comprehensive, fully-converged observability and security. This is uniquely possible today using the Elastic platform.

ChaosSearch re:Invent 2022 | theCube

At ChaosSearch we transform customer's #AWS #S3 into a Stream-based Search+SQL hot analytic database. Hear how we work with S3 to provide the most simple, scalable and cost-efficient: All on one unified platform (S3 + Chaos = Better Together). Great to see Ed Walsh, Kevin Miller, and David Vellante on the SiliconANGLE & theCUBE at #reinvent2022!

The only Helm chart you need for Grafana Loki is here

The community has spoken: six Helm charts is not enough! We agree! In all seriousness though, six charts is simply too many to maintain. And while it might sound counterintuitive, that’s why we are announcing a new Helm chart. By focusing on the “Grafana Labs way” to run Grafana Loki using Helm, we believe this will help us and the community concentrate our Helm efforts into a single chart. This new chart is released under grafana/loki at Helm version 3 or higher.

Elastic recognized as a Leader in the 2022 Gartner Magic Quadrant for Insight Engines

We’re pleased to announce that Elastic has been named a Leader in the 2022 Gartner® Magic Quadrant™ for Insight Engines. This is our second year of inclusion in the Gartner Magic Quadrant for this category, and this year’s evaluation places Elastic as the furthest entry on the "Completeness of Vision" axis.

Automate Observability Tasks with Logz.io Machine Learning

As an observability provider, we are always confronted with our clients’ goal for faster resolution of problems and better overall performance of their systems. By working on large-scale projects at Logz.io, I see the same main challenge coming up for all: extracting valuable insights from huge volumes of data generated by modern systems and applications.

Product Spotlight: Announcing Power Search for Log Restore

We’re excited to announce significant improvements to our Archive+Restore capabilities, which enable low-cost, long-term log storage in AWS S3 or Azure Blob while providing access to ingest those logs into Logz.io at any time. The first enhancement is Power Search, which will make it faster to restore logs from archived log data in AWS S3 (and soon for Azure Blob) in our Open 360™ platform.

Elastic recognized as a Strong Performer in The Forrester Wave: Artificial Intelligence for IT Operations (AIOps), Q4 2022

We are excited to announce that Elastic has been recognized as a Strong Performer in The Forrester Wave™: Artificial Intelligence for IT Operations (AIOps), Q4 2022 in our first year participating! As organizations modernize their infrastructure and applications, operations and development teams are faced with an exponential growth in data.

A guide to cyber threat hunting with Promtail, Grafana Loki, Sigma, and Grafana Cloud

Fact: The Security Operations team at Grafana Labs loves logs. They are a key pillar of observability for many reasons, such as how they are stuffed full of details to help us diagnose the “why?” when things go wrong. This is especially true when the information pertains not to a series of unfortunate events, but instead to an adversary trying to cause us harm.

Can Your Cloud Migration Strategy Keep Up With the Speed of Business?

A hybrid infrastructure brings business benefits but it also brings new challenges. Migrating workloads to the cloud is a complex operation that generates more data than engineering teams can adequately manage. Traditional monitoring tools are limited in helping teams find and fix problems during and after a cloud migration. This can throw business strategies off course, limit customer value and hurt the bottom line.

Product Spotlight: Smart Tiering + LogMetrics to Optimize Costs

Is all observability data worth the same cost? If you guessed no, then you’d obviously be correct. Anyone familiar with the very nature of gaining targeted observability knows that some data points hold more value than others. Yet, many observability platforms still treat all types of log data the same, and as a result, related costs remain uniform. One of the most persistent observability challenges today is the cost of indexing log data.

Learn about the meaning and value of cloud-native from experts at Atchison Technology, Qumu, Microsoft, and Techstrong Group

In the past decade, we've seen explosive growth in the adoption of the cloud-based infrastructure model. IT organizations are increasingly choosing to reduce their up-front investments in IT infrastructure by deploying their applications into cloud environments. These environments offer on-demand availability of data storage and computing power that organizations need to handle high volumes of data and growing demand for application access and services.

What is FluentD, and how does it work with Kubernetes?

FluentD is a free and open-source data collector. With its decentralized ecosystem, it’s known for its built-in reliability and cross-platform compatibility. One of the biggest challenges in big data collection is the lack of standardization between collection sources. They just aren’t able to talk to each other. With FluentD, you can address one of the biggest challenges to big data log collection.

Building Resilience in Manufacturing with the Power of Data

Resilience has become the new strategic imperative for manufacturers during these testing times. As the world’s challenges make headlines, so do the innovative responses of manufacturing leaders. Savvy manufacturers automate, overhaul fundamental processes, modernize their security posture and reduce their CO2 footprint. Forward-focused organizations double down on their cloud investment to become more agile and resilient. And none of it is possible without data.

Grafana Loki top 5 query performance tips

In this video, we will discuss some key tips and techniques you can use to optimize the performance of your queries in Grafana Loki. By following these best practices, you can ensure that your Loki queries are executed efficiently and effectively. Start correlating your data with Grafana Cloud and the new FREE tier. Special thanks to Ed Welch for the inspiration.

Why Observability Is Important for IT Ops

Every day when you come into work, you’re bombarded with a constant stream of problems. From service desk calls to network performance monitoring, you’re busy from the moment you log in until the moment you click the “shut down” option on your device. Even more frustrating, your IT environment consists of an ever-expanding set of network segments, applications, devices, users, and databases across on-premises and cloud locations.

What You Need to Know About Log Management Architecture

You’ve made the decision to implement a centralized log management solution because you know that it’s going to save you time and money in the long term. However, to get the most bang for your log management buck, you need to understand how the different parts of your log management deployment work. Once you understand each resource, you can implement a more efficient log management architecture.

Splunk - The Data Platform for the Automotive World | Driving Transformation with Data

Tackling the mobility revolution from visibility to action, fast and at scale. The automotive industry is transforming. From being led by engineering to competing through software. From internal combustion to electrification. From a driver-focus to autonomous driving. From personal ownership to shared mobility. Automakers need to master more of their value chain and establish greater dependencies with key technology partners.

Data Warehouse vs Database: Comparing Common Data Storage

Knowing the differences between data warehouses and databases can clear up a lot of confusion for many people, especially with the volume of data we have these days. In this blog post, I'll discuss the differences between these two types of data systems. I'll also provide some examples to help illustrate the points I make. Let's get started! (This article was written by Austin Chia.)

How Universal Profiling unwinds stacks without frame pointers and symbols

Elastic Universal Profiling is based on technology that came into Elastic as part of the acquisition of optimyze.cloud — a startup that had developed Prodfiler.com, the world’s first frictionless fleet-wide in-production multi-runtime profiler that was launched in August 2021. In order to bring the vision of frictionless deployability, low performance overhead, “just run it everywhere” magic to the broader market, a number of technical innovations were necessary.

Announcing Logz.io's Data Optimization Hub

To help our customers reduce their overall observability costs, we’re excited to announce the Data Optimization Hub as part of our Open 360™ platform. The new hub inventories all of your incoming telemetry data, while providing simple filters to remove any data you don’t need. Gone are the days of paying for observability data you never use.

Logging and global error handling in .NET 7 WPF applications

While developing elmah.io support for WPF, I had the chance to look into WPF for the first time in many years. I couldn't stop myself from digging down into all sorts of details about how logging has evolved in WPF since I last wrote a WPF app. In this post, I'll share some of the findings I made in this rediscovering journey.

How to Augment an Existing Data Lake with Exabeam and Cribl Stream

Organizations have different data lakes they use to search, whether it is Splunk, QRadar, or Sumo Logic, just to name a few. Exabeam (UEBA Advanced Analytics) sits on top of those existing data lakes and pulls specific sources by running continuous queries every few minutes into Exabeam. The image below shows a Splunk query to pull Windows event logs into Exabeam Advanced Analytics over port 8089. The query is complex.

Graylog 5.0 - A New Day for IT & SecOps

We are excited to announce the release of Graylog 5.0! Graylog 5.0 brings updates across our entire product line, including changes to infrastructure, Security, Operations, and our Open offerings. For more detailed information on what’s changed, visit our changelog pages for Graylog Open and Graylog Operations/Graylog Security.

Observability and Its Influence on Scrum Metrics

Scrum metrics are an essential indicator of your team’s progress. In an agile team, they help you understand the pace and progress of every sprint, ascertain whether you’re on track for timely delivery or not, and more. Although scrum metrics are essential, they are only one facet of the delivery process — sure, they ensure you’re on track, but how do you ensure that there are no roadblocks during development? That’s precisely where observability helps.

Three Key Considerations for Deploying Best-of-Breed Observability | AWS re:Invent 2022, Ed Walsh

Organizations today need a broad set of cloud services to modernize their applications, keep their systems secure, and ultimately deliver for their customers. At the same time, application-generated operational data is complex, constantly growing, and coming from a variety of sources. This complexity requires a robust plan to ensure its availability for observability and analytics at scale. With today's solutions, TCO can vary wildly, which makes it critical to understand how costs are generated and quickly mount, including deploying your infrastructure, managing ongoing operations, managing data retention, scaling the stack, and building growth plans. Watch this lightning talk to learn about the three key considerations for success.

FluentD vs Logstash - Choosing a Log collector for Log Analytics

When we have large-scale, distributed systems, logging becomes essential for observability, monitoring, and security. No matter what architecture (monolith or microservices) our systems have, they are complex due to the number of moving parts they have and the challenges they face around management, deployment, and scaling. In this scenario, log management tools come to the rescue of DevOps and SRE teams, helping them monitor and improve performance, debug errors, and visualize events.

Grafana Loki 2.7 release: TSDB index, Promtail enhancements, and more

Grafana Loki 2.7 has arrived! With it comes an experimental feature we are rather excited about: a redesigned index based off of the Prometheus TSDB index. While we are still in the early stages, this enhancement in Grafana Loki, which we previewed at ObservabilityCON 2022, creates a smaller storage footprint, better query performance, and much more that we will dive into below!

Product Spotlight: Logz.io Telemetry Collector for Fast Data Shipping

Today we’re excited to announce Logz.io Telemetry Collector – an agent that can send logs, metrics, and traces to Logz.io in a single installation as part of our Open 360™ platform. With Telemetry Collector, customers can get started monitoring their services with Logz.io faster than ever by simplifying the data collection process.

Too many tools? Best practices for planning and implementing a successful IT tool consolidation strategy

IT tool consolidation is the ongoing and combined effort of all members of an IT organization to ensure that employees (only) use IT hardware, software and services that create and demonstrate explicit value for stakeholders in the business. The best metaphor for tool consolidation is in my kitchen, where common sense principles around value creation provide useful guidelines for any consolidation process.

Alerts in Sematext | Sematext Cloud Guide

Alerts are one of the most necessary features in any monitoring tool. Sematext Cloud comes with a whole host of alerts to keep you informed and updated on your system's performance. Sematext comes with alert presets for 100+ integrations, so you can spend less time creating alerts and more time on what matters. Send notifications and alerts to yourself and the whole team via Slack, OpsGenie, PagerDuty, and many more. Monitor multiple distributed systems from a single UI.