
August 2022

11 Best Redis Monitoring Tools [2022 Review]

Redis is an open-source, BSD 3-Clause licensed, highly efficient in-memory data store that can be used as a distributed key-value store, cache, or message broker. It is known for being extremely fast and reliable and for supporting a wide variety of data structures, which makes it a versatile tool that is widely adopted across the industry. Redis was architected with speed in mind and is designed to keep all of its data in memory.

OpenTelemetry Logs, OpenTelemetry Go, and the Road Ahead

We’ve got a lot of OpenTelemetry-flavored honey to send your way, ranging from OpenTelemetry SDK distribution updates to protocol support. We now support OpenTelemetry logs, released a new SDK distribution for OpenTelemetry Go, and have some updates around OpenTelemetry + Honeycomb to share. Let’s see what all the buzz is about this time! 🐝🐝

How adding Kubernetes label selectors caused an outage in Grafana Cloud Logs - and how we resolved it

Hello, I’m Callum. I work on Grafana Loki, including the hosted Grafana Cloud Logs offering. Grafana Loki is a distributed multi-tenant system for storing log data — ingestion, querying, all that fun stuff. It also powers Grafana Cloud Logs.

Debunking 4 Cybersecurity Myths About Machine Learning

Machine learning has infiltrated the world of security tooling over the last five years. That’s part of a broader shift in the overall software market, where seemingly every product is claiming to have some level of machine learning. You almost have to if you want your product to be considered a modern software solution. This is particularly true in the security industry, where snake oil salesmen are very pervasive and vendors typically aren’t asked to vigorously defend their claims.

Observability: You Can't Buy It, You Must Build It!

In Part 1 of this series, we talked about the origins of observability and why you need it. In this blog (Part 2), we will cover exactly what observability is, what it isn’t, and how to get started. Before we can dive into how to approach observability, let’s get one thing clear: You can’t buy a one-size-fits-all observability solution.

How to monitor Couchbase with Google Cloud Ops

You can now easily monitor Couchbase metrics and logs in Google Cloud. All of our logging and monitoring Google Cloud contributions are available through the Google Ops Agent GitHub repository. You can check it out here! The Google Ops Agent uses the built-in Prometheus exporter and receiver to monitor Couchbase sources running Couchbase 7.0. You can find documentation on the Prometheus exporter in the Couchbase documentation.

Aggregations and Chains: Performance Measurement in Cribl Stream Pipelines

In this post, we’ll discuss two functions in the Cribl Stream arsenal: the Aggregations function, which allows you to perform stats and metrics collection in flight, and the Chain function, which allows you to call one Pipeline from within another. The event flow continues when the chained Pipeline returns. To demonstrate their use, we’ll answer this question: How long did it take for Cribl to process events using your pipeline?

Splunk Synthetic Monitoring in Splunk Observability Cloud - Product Demo

Splunk Synthetic Monitoring is now available in Splunk Observability Cloud, allowing IT and engineering teams to proactively detect issues impacting web and API performance and end-user experience, and to troubleshoot and remediate issues in the web browser, the server, or a third-party dependency, all within a single UI. Watch this quick demo to learn more.

Using Splunk Observability Cloud to Monitor Splunk RUM

As a principal engineer on the Splunk Real User Monitoring (RUM) team who is responsible for measuring and monitoring our service-level agreements (SLAs) and service-level objectives (SLOs), I depend on observability to measure, visualize and troubleshoot our services. Our key SLA is to guarantee that our services are available and accessible 99.9% of the time.
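To make the 99.9% figure concrete, the following is a minimal sketch of the downtime budget that such an availability target implies. The function name and the 30-day and 365-day periods are illustrative assumptions, not part of the Splunk RUM SLA definition.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 30) -> float:
    """Return the minutes of downtime permitted in a period at a given availability.

    The period length is an assumption made for illustration; a real SLA
    defines its own measurement window.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # 99.9% availability over a 30-day window and over a 365-day year.
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes per 30 days")
    print(f"{allowed_downtime_minutes(99.9, 365):.1f} minutes per 365 days")
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30 days, which is why even short outages matter when measuring against it.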

Observability: A Concept That Goes Back to the Founding of the Internet

With its market size reaching more than $2 billion in 2020, you’d think that a universal definition of the term observability would have emerged by now. But it turns out that a clear definition of a term or industry isn’t necessarily a prerequisite for the rapid growth of its market size — just ask everyone at your next dinner party to define blockchain for you and see how many different answers you get!

Elastic Observability helps monitor your Azure workloads on the new Arm-based VMs

Microsoft Azure’s recently launched new Azure Virtual Machines (VMs) feature the Ampere Altra Arm-based processor. These new VMs are engineered to efficiently run horizontally scalable workloads such as web servers, application servers, and open source databases. They deliver excellent price-performance and represent an important addition to Microsoft Azure's portfolio of instance types.

Snowflake DB: Observing a Snowflake From Cloud to Chart

You’ve probably heard something like this before: “It’s a managed service! We don’t need to worry about anything!” But when it comes to your production workloads, database monitoring is imperative. With the new Snowflake Dashboards and Detectors in the Splunk Observability Content Contributors repository you can start seeing the details of individual Snowflakes.

Creating Homebrew Formulas with GoReleaser

We chose to use GoReleaser with our distro of the OpenTelemetry Collector in order to simplify how we build and support many operating systems and architectures. It allows us to build against a matrix of GOOS and GOARCH targets as well as automate creating a wide range of deliverables. So far we have used it to build tarballs, nfpm packages, Docker images, and Homebrew formulas.

The Top 15 Distributed Tracing Tools (Open Source & More)

As distributed environments become more complex, users often use distributed tracing tools to improve the visibility of issues evident within their traces. Throughout this post, we will examine some of the best open-source and other generally popular distributed tracing tools available today.

Building a Cost-Effective Full Observability Solution Around Open APIs and CNCF Projects

A full observability stack aims to give Development, Operations and Security teams centralized visibility into all of the metrics, logs and traces generated by the applications and services under their domain. Many companies address these observability needs by buying a complete application performance management (APM) solution from a single vendor, like Datadog.

Configuring an OpenTelemetry Collector to connect to BindPlane OP

BindPlane OP is the first open source, vendor-agnostic, agent and pipeline management tool. It makes it easy to deploy, configure, and manage agents on thousands of sources, and ship metrics, logs, and traces to any destination. This blog shows you how to configure an existing OpenTelemetry Collector from any source to connect to BindPlane OP without needing to remove or reinstall the collector.

Goats on the Road: Getting More Value From Observability Data

The best part of my job is talking with prospects and customers about their logging and data practices while explaining how Cribl focuses on getting more value from observability data. I love to talk about everything they are doing and hope to accomplish so I can get a sense of the end state. That is vital to developing solutions that provide overall value across the enterprise and not just a narrow tactical win with limited impact.

Overcoming data chaos ft. Thomas Hazel, founder of ChaosSearch

In this episode, Rob is joined by Thomas Hazel, founder and CTO of ChaosSearch. Every software company has tons of data to manage. Have we set ourselves up for failure? How do we recover from a data mess? Learn how Thomas embraces chaos to tackle big data problems by taking risks and embracing failure.

Why Do You Need Smarter Alerts?

The way organizations process logs has changed over the past decade, from random files scattered among a handful of virtual machines to JSON documents effortlessly streamed into platforms. Metrics have seen great strides too, as providers expose detailed measurements of every aspect of their systems. Traces have also become increasingly sophisticated and can now highlight even the most precise details about interactions between our services. But alerts have remained stationary.

5 FinTech Log Analytics Challenges Equifax Solved with ChaosSearch

Global data, analytics and technology companies such as Equifax, and their Engineering teams, depend on log analytics for a variety of operational analytics use cases, from application troubleshooting to streamlining cloud operations and regulatory compliance management. ChaosSearch is uniquely positioned to help companies like Equifax significantly reduce the time, cost, and complexity of log analytics.

How to monitor Solr with OpenTelemetry

Monitoring Solr is critical because it handles the search and analysis of data in your application. Simplifying this monitoring is necessary to gain full visibility into Solr’s availability and ensure it is performing as expected. We’ll show you how to do this using the jmxreceiver for the OpenTelemetry collector. You can utilize this receiver in conjunction with any OTel collector, including the OpenTelemetry Collector and observIQ’s distribution of the collector.

Autoscaling Elasticsearch/OpenSearch Clusters for Logs: Using a Kubernetes Operator to Scale Up or Down

When we say “logs” we really mean any kind of time-series data: events, social media, you name it. See Jordan Sissel’s definition of time + data. And when we talk about autoscaling, what we really want is a hands-off approach at handling Elasticsearch/OpenSearch clusters. In this post, we’ll show you how to use a Kubernetes Operator to autoscale Elasticsearch clusters, going through the following with just a few commands.

SIEM-pler Migrations with Cribl Stream

A SIEM (Security Information Event Management) platform, along with several other tools that make you crave Alphabet Soup (XDR, UBA, NDR, etc), is a critical component of any organization’s security infrastructure. Between a constantly growing volume of logs, increasing attacks and breaches, and challenges finding qualified staff, many organizations may consider a SIEM migration. There could be several reasons for this.

Why You Shouldn't Use OpenTracing In 2022

OpenTracing was an open-source project developed to provide vendor-neutral APIs and instrumentation for distributed tracing across a variety of environments. As it is often extremely difficult for engineers to see the behaviour of requests when they are working across services in a distributed environment, OpenTracing aimed to provide a solution to heighten observability.

Mezmo Named to Inc. 5000's List of Fastest Growing Companies in the Nation

Inc. is shining a light on Mezmo as one of the fastest growing companies in the nation. We are truly honored to be featured alongside innovative brands like Sentry and Calendly, who are building the future of tech. Our position on the list at number 695 reflects our 900% growth in revenue and 300% growth in the size of our team from 2018 to 2021.

Best Practices for Navigating the Security Poverty Line

InfoSec, like any other aspect of IT, is a matter of three factors coming together: people, process and technology. All of these factors cost time and money in some way. The truth is, there are very few organizations out there who can supply their own security programs, staff, technology, processes and everything needed for InfoSec to an efficient degree. Everyone has to compromise in some way.

Are Your Engineers Gonna Need A Bigger Boat?

If you asked your engineering team how well they can handle all of the security and observability data they’re managing, would you get a resounding “Yeah boss, we’re good to go!” in response? Possible, but unlikely. Chances are they feel like they’re stuck on a boat that’s taking on water, spending their day using tiny buckets to scoop some of it out, with no way to plug any of the leaks.

Moving from an IT and Security Data Admin to an Observability Engineer

Join Ed Bailey, Nick Heudecker, and Jordan Perks as they discuss what it means to transition from acting simply as an IT and security data administrator to becoming a true observability engineer. In your role as an observability engineer, you’ll guide an organization on observability data best practices, enhance existing tool functionality, help control cost, and improve overall compliance.

Centralizing Log Data to Solve Tool Proliferation Chaos

As companies evolve and grow, so does the number of applications, databases, devices, cloud locations, and users. Often, this comes from teams adding tools instead of replacing them. As security teams solve individual problems, this tool adoption leads to disorganization, digital chaos, data silos, and information overload. Even worse, it means organizations have no way to correlate data confidently. By centralizing log data, you can overcome the data silos that tool proliferation creates.

What's Missing From Almost Every Alerting Solution in 2022?

Alerting has been a fundamental part of operations strategy for the past decade. An entire industry is built around delivering valuable, actionable alerts to engineers and customers as quickly as possible. We will explore what’s missing from your alerts and how Coralogix Flow Alerts solve a fundamental problem in the observability industry.

Resiliency As the Next Step in the DevOps Transformation

We’ve reached the point in the DevOps transformation where efficiency and automation are no longer the highest objectives. The next step is engineering past automation and towards fully autonomous, self-healing systems. If you aren’t conversing about building this type of resilience into your systems and applications, there’s never been a better time than now to start.

IT Salaries: Trends, Roles, & Locations for 2022-2023

IT roles have never been more in demand and IT salaries have never been higher, according to recent reports and data sources. Whether you are hiring, looking for a career change, or simply work in tech, it’s important to stay up-to-date on the state of employment in the industry. This blog post will review, round up, and summarize some of the latest trends for IT salaries and demand by role and location (among other variables) to help you get a clear view of the landscape.

Memory Profiling for Java Applications, a Splunk APM Product Walkthrough

Splunk’s Product Manager Priit Potter walks you through how to identify memory bottlenecks in Java applications, in this detailed product walkthrough. See how Priit troubleshoots his own application, visualizes memory performance problems, and uses flame graphs to detail the line of code responsible for the problem, all with the help of Splunk Application Performance Monitoring.

How Cloud Network Monitoring Is Critical To Business Success

A rising number of businesses are adopting and utilizing cloud services and capabilities with remarkable success. But embracing cloud tools and services often brings unexpected changes for business leaders and IT teams, especially because of the way in which cloud adoption has altered how networks are monitored and managed.

Data Legends Podcast with Wes Gelpi: Special 2 Part Series

Leading a team of data and analytics professionals isn’t easy; it takes more than just understanding the goal. It’s about the journey and how the people on the journey collaborate. Wes Gelpi, Director of Research & Development at SAS, joins us in a special 2-part episode. Gelpi has a rich history of taking challenging situations and running with them.

What Is Distributed Tracing

Systems and applications alike have become progressively distributed as microservices, open-source tools, and containerisation have gained traction. In order to actively monitor and respond quickly to issues that arise in our environment, distributed tracing has proven to be vital for businesses such as Uber, Postmates, Hello Fresh and TransferWise. It is, however, important to clarify what distributed tracing actually means.

How to Grow Your Own Cybersecurity Talent

The cyberthreat landscape has expanded in recent years, accelerated by enterprises promoting remote work and more reliance on cloud computing. These are a business necessity, and yet, facing down cybersecurity threats often doesn’t come with an expansion of resources to address them. In a future post, I’ll discuss more about the Security Poverty Line, and how organizations deal with its harsh trade-offs and compromises in an uncompromising landscape.

How to Monitor SAP HANA with OpenTelemetry

SAP HANA monitoring support is now available in the open source OpenTelemetry collector. You can check out the OpenTelemetry repo here! You can utilize this receiver in conjunction with any OTel collector, including the OpenTelemetry Collector and observIQ’s distribution of the collector. Below are quick instructions for setting up observIQ’s OpenTelemetry distribution, and shipping SAP HANA telemetry to a popular backend: Google Cloud Ops.

BindPlane OP Reaches GA

Today we’re excited to announce BindPlane OP – the first observability pipeline built for OpenTelemetry – is out of beta and now generally available. You can download the latest version here. Two months ago we released BindPlane OP in beta, and while we were confident we had something special, the response surpassed all of our expectations.

Announcing the Winners of the Cribl Packs Contest

It’s time for the Black Hat conference in the United States, so we’re onsite meeting with customers and prospects looking to untangle their data from the grip of vendors holding their data hostage. We aim to start a rebellion against this lock-in and encourage customers to focus on radical choice and control with their observability data. Pushing back against “The Empire” is challenging, but you can achieve it with Cribl Stream and Edge.

To Observability and Back Again: A Context's Journey

How do you pass context from events that concern Security teams to Development teams who can make changes and address those events? Often this involves a series of meetings and discussion that can take days or weeks to filter down from security event to developer awareness. Compounding the problem, developers generally do not have access to Splunk Core, Cloud or Enterprise indexes used by security teams, and indeed, may use only Splunk Observability for their metrics, traces and even logs.

Improving DevOps Performance with DORA Metrics

Everyone in the software industry is in a race to become more agile. We all want to improve the performance of our software development lifecycle (SDLC). But how do you actually do that? If you want to improve your performance, first determine what KPI you’d like to improve. DORA metrics offer a good set of KPIs to track and improve. They started as research by the DevOps Research and Assessment (DORA) team and Google Cloud (which later acquired DORA) to understand what makes teams high performing.
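The four DORA metrics are commonly given as deployment frequency, lead time for changes, change failure rate, and time to restore service. As a rough sketch of how they might be derived from deployment records, the example below uses a hypothetical `Deployment` record and `dora_summary` helper; the field names and sample data are assumptions for illustration, not the schema of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def dora_summary(deploys: list, period_days: float) -> dict:
    """Compute the four DORA metrics over an observation period."""
    if not deploys:
        raise ValueError("at least one deployment is required")
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        # None when no failed deployment has been restored in the period.
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Made-up sample data: four deployments over two days, one of which failed.
t0 = datetime(2022, 8, 1, 9, 0)
day = timedelta(days=1)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4), failed=True,
               restored_at=t0 + timedelta(hours=5)),
    Deployment(t0 + day, t0 + day + timedelta(hours=6)),
    Deployment(t0 + day, t0 + day + timedelta(hours=8)),
]
summary = dora_summary(sample, period_days=2)

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value}")
```

Medians are used here rather than means so that a single slow change does not dominate the result; that is a design choice of this sketch, not a requirement of the metrics.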

Lessons Learned From Building a Company and Raising Kids

When I had my first child almost six years ago, I expected that most of my time would be spent in the role of a teacher rather than a student. I have two kids now, and I’m certainly teaching them as much as I can as they grow and learn to navigate the world, but if someone were keeping score, my kids might end up on top when it comes to who’s taught whom more. Another thing that surprised me is how similar building a family is to building a company from the ground up.

What is Infrastructure as Code?

Cloud services were born in the early 2000s, with companies such as Salesforce and Amazon paving the way. Simple Queue Service (SQS) was the first service to be launched by Amazon Web Services, in November 2004. It was offered as a distributed queuing service and it is still one of the most popular services in AWS. By 2006 more and more services were added to the offering list.

Monitoring smart city IoT devices with Grafana and Grafana Loki: Inside the Fuelics observability stack

For smart cities of the future, monitoring infrastructure metrics like fuel and water levels is vital to optimizing operations. Fuelics PC designs and deploys battery-operated narrowband IoT (NB-IoT) sensors that monitor fuel, water, waste, and even parking capacity at the edge, then transmit that data to the cloud for easy viewing and monitoring.

The Real Opportunity for Improving Outcomes with Monitoring and Observability

If you were pulled into a meeting right now and asked to give your thoughts on how to achieve better outcomes with monitoring and observability, what would you recommend? Would you default to suggesting that your team improve Mean Time To Detect (MTTD)? Sure, you might make some improvements in that area, but it turns out that most of the opportunities lie in what comes after your system detects an issue. Let’s examine how to measure improvements in monitoring and observability.

Elasticsearch on Docker Tutorial | Elastic Docker Containers Configuration - Sematext

In this Elasticsearch/Docker tutorial, we will install and run an Elasticsearch cluster on a single Docker host. We will pull an Elasticsearch Docker image (and Kibana), create a Docker network for the cluster, and deploy it on a local host. Containerizing instances of Elasticsearch helps create a scalable and mobile infrastructure, while not sacrificing system performance. Follow along to create and configure a truly open-source Elasticsearch cluster in Docker.

Goats on the Road: What Customers Are Telling Us

The best part of my job is talking with prospects and customers about their logging and data practices. I love to talk about everything they are currently doing and hope to accomplish so I can get a sense of overall goals and understand current pain points. It’s vital to come up with solutions that provide broad value across the enterprise and not just a narrow tactical win with limited impact.

How to Overcome Datadog Log Management Challenges

Datadog has made a name for itself as a popular cloud-native application performance monitoring tool, measuring a system’s health and status based on the telemetry data it generates. This telemetry includes machine-generated data, such as logs, metrics and traces. Cloud based applications and infrastructure generate millions (even billions) of logs – and analyzing them can generate a wealth of insights for DevOps, security, product teams and more.

Data Observability Explained: How Observability Improves Data Workflows

Organizations in every industry are becoming increasingly dependent upon data to drive more efficient business processes and a better user experience. As the data collection and preparation processes that support these initiatives grow more complex, the likelihood of failures, performance bottlenecks, and quality issues within data workflows also increases.

Web Browser Update Problems: How to Monitor Website Performance Anomalies Caused by New Browser Versions

When new web browser versions are released, new bugs are inevitably introduced, which can degrade a website’s performance and increase the overall page load time. This can severely impact a user’s engagement and a business’s bottom line.

Log Forwarding with HAProxy and Syslog

Developing a strategy for collecting application-level logs necessitates stepping back and looking at the big picture. Engineers developing the applications may only see logging at its ground level: the code that writes the event to the log—for example a function that captures Warning: An interesting event has occurred! But where does that message go from there? What path does it travel to get to its destination?
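The split between the code that writes an event and the path the event then travels can be sketched with Python's standard `logging` module. This is a minimal illustration, not the article's setup: the logger name and helper functions are hypothetical, and an in-memory stream stands in for the real destination.

```python
import io
import logging


def build_logger(stream) -> logging.Logger:
    """Attach a single handler that decides where log records end up."""
    logger = logging.getLogger("demo.app")  # hypothetical logger name
    logger.setLevel(logging.WARNING)
    logger.handlers.clear()   # keep the example repeatable
    logger.propagate = False  # do not also send records to the root logger
    # The handler is the "path": replacing it with, for example,
    # logging.handlers.SysLogHandler would send records to a syslog
    # endpoint without touching the application code below.
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def do_work(logger: logging.Logger) -> None:
    """Application code only emits the event; it does not know the destination."""
    logger.warning("An interesting event has occurred!")


buffer = io.StringIO()
do_work(build_logger(buffer))

if __name__ == "__main__":
    print(buffer.getvalue(), end="")
```

Keeping the destination in the handler configuration is what lets a forwarding layer be introduced later without changing the functions that write the events.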

Splunk Snags Six 'Best of' Awards From Customer Reviews on TrustRadius

Splunk is honored to be the recipient of a series of six new awards from TrustRadius—all based on customer reviews. In this round, TrustRadius grants its “Best of” Awards to the top three products per Best Feature Set, Best Value for Price and Best Relationship in each respective category.

Monitorama 2022: the good, the bad and the beautiful (Part 2)

ICYMI, this year’s Monitorama marks a return to the in-person event following a pandemic hiatus. In Part 1 of this series, I shared what it was like to navigate a tech conference in the post-pandemic world and the most engaging themes of the conference including tracing, SLIs, OpenTelemetry, and more. Now for Part 2, let’s dive into the talks. As usual the Monitorama talk selection team did a bang-up job. Every talk was interesting, but a few jumped out at me for some very specific reasons.

A Complete Guide to Tomcat Monitoring: How to, Metrics & Choosing the Best Tools

Apache Tomcat is an open-source implementation of the Jakarta Servlet, Jakarta Server Pages, Jakarta Expression Language, Jakarta WebSocket, Jakarta Annotations and Jakarta Authentication specifications, all of which are part of the Jakarta EE Platform. That is the official description of Apache Tomcat.

Introducing Splunk Operator for Kubernetes 2.0

The Splunk Operator for Kubernetes team is extremely pleased to announce the release of version 2.0! This represents the culmination of many months of work by our team and continues to deliver on our commitment to provide a high-quality experience for our customers wishing to deploy Splunk on the Kubernetes platform.

Alerting Techniques for an observable platform

Observable and secure platforms use three connected data sets: logs, metrics, and traces. Platforms can link these data to alerting systems to notify system administrators when an event requires intervention. There are nuances to setting up these alerts so the system is kept healthy and the system administrators are not chasing false positive alerts.

How to Leverage Cribl and Exabeam: Parser Validating

Organizations leverage many different cybersecurity and observability tools for different departments. It’s common to see the IT department using Splunk Enterprise, while the SOC uses Exabeam. Both of these tools use separate agents, each feeding different data to their destinations. Normally this isn’t a problem unless you’re talking about domain controllers. Domain controllers only allow a single agent, meaning you can’t feed two platforms with data.

Cribl.Cloud Simplified with Consumption Pricing

One year ago, we launched Cribl.Cloud as a cloud-hosted option for our industry-leading data pipeline product, Cribl Stream. Customers had a choice of either deploying on-premises with a subscription-based tiered license model or opting for our cloud service with a similar tiered billing model. Fast-forward one year, and Cribl is now a multi-product company with several unique observability products (Stream, Edge, AppScope, and soon Search) to offer our customers.

Announcing the General Availability of Synthetic Monitoring Within Splunk Observability

Today we’re proud to announce the general availability of best-in-class Synthetic Monitoring capabilities within Splunk Observability Cloud. Now, IT and engineering teams can proactively measure, monitor and troubleshoot their critical user flows, APIs and services, connected across Splunk Observability.

Postcard From .conf22: Customers Inspire Our Latest Release

They say, “What happens in Vegas, stays in Vegas,” but I wanted to highlight the role our customers played at last month’s .conf22, our annual users’ event at the MGM Grand. It was awesome meeting customers in person again, and connecting virtually with thousands more. We had a terrific turnout with 8,200+ customers and partners representing 113 countries and more than 6,500 organizations.

Get better visibility into DevOps performance in one place with Atlassian integrations

Every company is a software company and every company wants to get better at it. That’s why Sumo Logic built a set of integrations with Atlassian DevOps solutions. Leveraging data from Atlassian, Sumo Logic now enables you to visualize the key, actionable insights behind the DevOps Research and Assessment (DORA) metrics to continuously improve your software delivery performance. Sumo Logic’s observability platform presenting Atlassian data brings the following benefits, to name a few.