Operations | Monitoring | ITSM | DevOps | Cloud

January 2021

Using AWS Athena with Coralogix S3 Archive

Coralogix can be configured to automatically and dynamically archive logs to an S3 bucket. This saves Coralogix customers money, but of course there are times when the data needs to be reindexed. This operation counts the reindexed logs against the daily quota. Many times customers would like to search and focus on the exact logs to be reindexed or even query the logs outside of Coralogix all together.

Log Management in Hosted Platforms Like DigitalOcean

With DigitalOcean Monitoring, you can collect metrics for visibility, monitor Droplet performance, and receive alerts when problems arise in your infrastructure. Many users often want to extend this infrastructure monitoring with application-level monitoring. This means debugging issues requires expertise, familiarity with your product and infrastructure, and often the involvement of many people in various fields—all to chase down a single problem.

Cloud-First Strategy and Its Benefits for Business

A cloud-first strategy can feel like a big jump from traditional setups. One of the benefits of a hybrid or on-premises strategy is you feel like you’re in control. You and your team know where your critical servers live. You can touch them. Your team understands your security processes, and you can easily verify security personnel follow them. Those are all significant benefits. However, a growing number of software teams are choosing to move to cloud-first strategies.

Coralogix - Panel Discussion: Elasticsearch is Not Open Source Anymore

Does SSPL license endanger your intellectual property? As of January 2021, Elasticsearch is no longer open source. From version 7.11 and onwards, all ELK products (Elastic, Logstash, Kibana) will be registered under the new SSPL license created by Mongo and now adopted by Elastic. In this panel, our IP expert lawyer discusses the new license and helps explain whether it impacts your business or puts it at risk.

Open Source in Application Monitoring

Open source projects are a powerful way to accelerate application development. Open source as a support function to monitoring can help support standards and better Observability and Monitoring practices. Learn about the OpenTelemetry project as a tool to improve the quality and flexibility of traces, spans, logs for better monitoring and Observability practices.

observIQ's Stanza Log Agent Now Part Of OpenTelemetry Project

Today I’m happy to announce that observIQ’s Stanza Log Agent will become a key part of the OpenTelemetry project. This has been in the works for many months and the team at observIQ is thrilled to see it becoming a reality. We’re particularly pleased to see it happening just as we launch our log management platform which will be the first platform to take full advantage of the log agent technology now incorporated into OpenTelemetry.

Building Autocomplete with ANTLR and CodeMirror

At Sumo Logic, we’re dealing with a large amount of data. To help our customers explore the data quickly and effectively, our product lets them write Logs, Metrics, and Tracing queries. One of the challenges we dealt with recently was improving the query building experience in our new, revamped Metrics UI.

Centralized Log Management and Cloud Environments

Even before new hybrid workforce models, many companies already moved a lot of services to the cloud. COVID-19 digital transformation strategies instantly increased the number of access points and endpoints. This led to a rapid increase in event log data followed by all kinds of other issues -- performance, availability, security, and ultimately increased IT costs amongst other things. A centralized log management solution for your cloud environment can help you manage the above and more.

A Practical Guide to Logstash: Input Plugins

In a previous post, we went through a few input plugins like the file input plugin, the TCP/UDP input plugins, etc for collecting data using Logstash. In this post, we will see a few more useful input plugins like the HTTP, HTTP poller, dead letter queue, twitter input plugins, and see how these input plugins work.

Getting to Know Google Cloud Audit Logs

So you've set up a Google Cloud Logging sink along with a Dataflow pipeline and are happily ingesting these events into your Splunk infrastructure — great! But now what? How do you start to get meaningful insights from this data? In this blog post, I'll share eight useful signals hiding within Google Cloud audit logs that will help you uncover meaningful insights. You'll learn how to detect: Finally, we’ll wrap up with a simple dashboard that captures all these queries in one place.

Secure Your Endpoints with Sophos & Logz.io

Intercept X is Sophos’ endpoint security solution, including anti-ransomware, zero-day exploit prevention, plus managed endpoint defense and response. It employs a layered approach reliant on multiple security techniques for endpoint detection and response (EDR). Those tactics include app lockdown, data loss prevention, web control and malware detection.

Unify your data with Grafana, wherever it lives: The ElastiSpLoki dashboard

At Grafana Labs, we believe you should unify your data, not your database. We want to help you with your observability, not own it But what if you have multiple teams using multiple open source and commercial solutions? Not a problem. To give an example, here is a quick demo of Splunk, Elastic, and Loki logs combined into one UI in #Grafana This is more than a dashboard; it's a composite panel with transformations of all three sources Your teams should be able to use best-of-breed technologies rather than being locked into one

The Central Source of Truth: Fall Guys and Mediatonic

Mediatonic is a sprawling video game studio based in the UK, with a number of successful titles to their name: Heavenstrike Rivals, Gears POP!, and Murder by Numbers among them. In 2020, they struck gold again with Fall Guys: Ultimate Knockout. But this game would be special, and the need of handling these kinds of gaming logs at this kind of scale would be, too. This battle royal-style fighting game pits 60 players against each other until one reigns supreme.

Is the New Elasticsearch SSPL License a Threat to Your Business?

The recent changes to the Elasticsearch license could have consequences on your intellectual property. On the 14th of January 2021, Elastic announced through their blog that Elasticsearch and Kibana will be moving over to a Server Side Public License (SSPL). This license change, effective from Elasticsearch version 7.11, has business owners that rely on the ELK stack rightly concerned.

Truly Doubling down on open source #2

Earlier this week, I wrote a blog stating our intention to fork Kibana and Elasticsearch. This was a huge decision on our end, one that we did not take lightly. A few days have passed since this announcement and I wanted to share how humbled and excited we are with the responses from companies and individuals who are eager to participate and contribute.

Troubleshooting Kubernetes Job Queues on DigitalOcean, Part 2

Kubernetes work queues are a great way to manage the prioritization and execution of long-running or expensive menial tasks. DigitalOcean managed Kubernetes services makes deploying a work queue straightforward. But what happens when your work queues don’t operate the way you expect? SolarWinds® Papertrail™ advanced log management complements the monitoring tools provided by DigitalOcean and simplifies both the debugging and root cause analysis process.

How to Save Hundreds of Hours on Lambda Debugging

Although AWS Lambda is a blessing from the infrastructure perspective, while using it, we still have to face perhaps the least-wanted part of software development: debugging. In order to fix issues, we need to know what is causing them. In AWS Lambda that can be a curse. But we have a solution that could save you dozens of hours of time. TL;DR: Dashbird offers a shortcut to everything presented in this article.

Solr Performance: Troubleshooting Solr Slow Queries Using Logs and Metrics

Let’s say you get an alert that one or more queries is slow. Or that your users complain, whichever comes first 🙂 We’ve all been there… How do you find the root cause for this slowness and then fix it? In this article, I’ll go through my usual thought process: first, I’d try to find which queries are slow. Then, I’d dig deeper: Let’s take a specific example and run through each step.

Truly Doubling Down on Open Source

A couple of days ago, Elastic announced that it will change the licensing of Elasticsearch and Kibana as of the 7.11 release to a proprietary dual license (under the SSPL license) and away from the open-source Apache-2.0 license. This move has caused extensive turmoil and frustration in the open-source community, especially with organizations that rely on Elasticsearch. Let me start with the end in mind.

Centralized Log Management for Optimizing Cloud Costs

Centralized Log Management offers the visibility you need to optimize your cloud usage to keep infrastructure costs down. Cloud-first infrastructures are the future of modern business operations. As organizations like Google and Twitter announce long-term plans for enabling a remote workforce, maintaining a competitive business model includes scaled cloud services adoption. While the cloud offers scalability that can save money with pay-as-you-need services, managing the costs is challenging.

Kubernetes is eating the world; you can digest K8's plume

Innovation in hypervisor technology in the early 2000’s from both commercial and open source projects was the genesis for the public cloud as we know it today. Virtualization and Moore’s law, together with advances in storage technology, mobile and wireless, created a data explosion that continues to accelerate through today.

The Elastic SSPL licensing change & ChaosSearch: FAQs

There’s no question that Elastic has built a truly amazing company, based on the Apache 2.0 open source business model, and on the shoulders of other projects like Lucene. Last week, Elastic announced that, starting with version 7.11, Elasticsearch will now be licensed via SSPL, a license that Mongo released in 2018. So you may be wondering what this all means. Here are what we anticipate will be a few Frequently Asked Questions around this Elasticsearch licensing change.

Network Security: The Journey from Chewiness to Zero Trust Networking

Network security has changed a lot over the years, it had to. From wide open infrastructures to tightly controlled environments, the standard practices of network security have grown more and more sophisticated. This post will take us back in time to look at the journey that a typical network has been on over the past 15+ years. From a wide open, “chewy” network, all the way to zero trust networking. Let’s get started.

A Practical Guide to Logstash: Parsing Common Log Patterns with Grok

In a previous post, we explored the basic concepts behind using Grok patterns with Logstash to parse files. We saw how versatile this combo is and how it can be adapted to process almost anything we want to throw at it. But the first few times you use something, it can be hard to figure out how to configure for your specific use case.

Amazon: NOT OK - why we had to change Elastic licensing

We recently announced a license change: Blog, FAQ. We posted some additional guidance on the license change this morning. I wanted to share why we had to make this change. This was an incredibly hard decision, especially with my background and history around Open Source. I take our responsibility very seriously. And to be clear, this change most likely has zero effect on you, our users. It has no effect on our customers that engage with us either in cloud or on premises.

The Importance of Cloud Performance and Security Platforms

Work, education, and even many of our leisure activities have all moved on-line at an incredible pace due to current social distancing mandates. The digital backbone of the Internet and the SaaS services that drive our personal and professional lives are now foundational. Ensuring that these systems are operating optimally and securely is of paramount importance.

How to Troubleshoot AWS Lambda Log Collection in Coralogix

AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you. The code that runs on the AWS Lambda service is called Lambda functions, and the events the functions respond to are called triggers. Lambda functions are very useful for log collection (think of log arrival as a trigger), and Coralogix makes extensive use of them in its AWS integrations.

Cloud Profiler provides app performance insights, without the overhead

Do you have an application that’s a little… sluggish? Cloud Profiler, Google Cloud’s continuous application profiling tool, can quickly find poor performing code that slows your app performance and drives up your compute bill. In fact, by helping you find the source of memory leaks and other errors, Profiler has helped some of Google Cloud’s largest accounts reduce their CPU consumption by double-digit percentage points.

Multi-Cloud Archive & Restore: Azure Blob Storage and AWS S3 Support

Logz.io has recently launched its Smart Tiering solution, which gives you the flexibility to place data on different tiers to optimize cost, performance and availability. Our mission has been to make Smart Tiering a multi-cloud and multi-region service. As part of this launch, we are glad to announce that the Historical Tier now supports Microsoft Azure Blob Storage, alongside AWS S3.

Kusto: Table Joins and the Let Statement

In this article I’m going to discuss table joins and the let statement in Log Analytics. Along with custom logs, these are concepts that really had me scratching my head for a long time, and it was a little bit tricky to put all the pieces together from documentation and other people’s blog posts. Hopefully this will help anyone else out there that still has unanswered questions on one of these topics.

Kusto: Custom Logs in Log Analytics

In this article, I’m going to discuss custom logs in Log Analytics. Along with table joins and the let statement that I discuss in another blog, custom logs is a concept that I struggled to wrap my head around for a long time, as there don’t seem to be very many comprehensive guides out there as of yet. Here is a summary of everything I have managed to piece together from documentation and other people’s blog posts.

How to Monitor Amazon DynamoDB Performance

One of Amazon Web Services’ (AWS) most well-known services is AWS DynamoDB. Some of AWS’s most notable customers use DynamoDB for their database needs – companies such as Netflix, The Pokemon Company, and Snapchat. DynamoDB is relatively simple to set up and configure, and it integrates well with many web-based applications. DynamoDB supports technology solutions in gaming, retail, bank and finance, and the software industry.

How to get started quickly with metrics, logs, and traces using Grafana Cloud integrations

Grafana Cloud is the easiest way to get started observing metrics, logs, traces, and dashboards. When we say “easiest,” we mean it: Grafana Cloud is designed so that even novice observability users can use it. As a new user, you are not required to dive into the complexity of setting up Prometheus and figuring out how to create Grafana dashboards from scratch. Integrations are the reason why.

Kick off 2021 by learning Elastic solutions with free 15-minute guides

Elastic solutions solve many different business challenges from powering search bars to creating observable systems to detecting and responding to threats. And with the amount of capabilities each offers, learning how to maximize the power of our solutions for enterprise search, observability, and security is critical to realizing Elastic's full value. But finding the time to build new skills can be challenging.

Not Another New Year's Resolution

I hope I’m not alone in starting 2021 with some sense of optimism. While several hard months remain ahead of us, I am hopeful and also expecting that some sense of normality will return by the summer months. Either way, this gives us an opportunity to reflect on the challenges we have faced. 2020 was testing. We learnt a lot about ourselves and our businesses in the most challenging of circumstances.

Embracing Open Source data collection

Open source has come a long way. One of my favorite reports on the subject is Red Hat’s State of Enterprise Open Source. For 2020, 95% of respondents said that open source is strategically important to their business needs. Here, I will be recapping my recent Illuminate presentation about embracing open source data collection and I thought it’s important to first talk about how open source has changed.

A Practical Guide to Logstash: Syslog Deep Dive

Syslog is a popular standard for centralizing and formatting log data generated by network devices. It provides a standardized way of generating and collecting log information, such as program errors, notices, warnings, status messages, and so on. Almost all Unix-like operating systems, such as those based on Linux or BSD kernels, use a Syslog daemon that is responsible for collecting log information and storing it.

The new Grafana Cloud: the only composable observability stack for metrics, logs, and traces, now with free and paid plans to suit every use case

Oftentimes users of open source are told to go download it and figure it out… or pay for a managed solution in the cloud. So the typical choice is free and do-it-yourself or expensive and easy. With our new changes to Grafana Cloud, we are making it both free and easy to have a real, composable observability solution.

Recapping Re:Invent 2020

As with many things in 2020, this year’s AWS re:Invent was quite different from any previous iterations. For starters, instead of a week of live talks, face-to-face sessions, and a room full of booths, this year the event was fully online and stretched out for three weeks. As sponsors of this year’s event, we were excited to participate and continue to make an impact on the AWS community.

Yes, Virginia, There is a -Santa Claus- Way to Detect Unemployment Fraud

Fraud rates for Unemployment Insurance Benefits (UIB) and Pandemic Unemployment Assistance (PUA) are out of control. In May 2020, Brian Krebs of Krebsonsecurity published two articles detailing fraud that was occurring in several different state’s UIB portals. These states had been warned by the US Secret Service to be on the lookout for this. Reading the articles, the common theme is that many states are missing rudimentary controls for combating fraud.

Improve Your Security Posture By Focusing on Velocity, Visibility, and Vectors

In the wake of the widely publicized FireEye breach and the alarming SolarWinds supply chain attack, this presents an ideal opportunity for reflection on the broader shift taking place across the world—the transition from legacy on-prem infrastructures to the cloud.

10 Best Tools for Monitoring Apache Cassandra in 2021

A large amount of data requires special tools. Apache Cassandra is one of those databases that can handle a large amount of data spread among many commodity servers, providing high availability and fault tolerance without a single point of failure. Developed under the umbrella of Apache Software Foundation, it ensures full visibility into the code base and being free of charge.

Splunk Cloud Self-Service: Announcing The New Admin Config Service API

In our last blog, "What's New in Splunk Cloud: Part 1," we reviewed a host of new Splunk Cloud features that we have delivered through our accelerated releases since the beginning of 2020. A large part of this effort focused on empowering Splunk Cloud admins and making their experience as self-service as possible. In this blog, we will examine our latest effort to continue this empowerment: Splunk Cloud’s Admin Configuration Service (ACS).

Ship Your ModSecurity Logs to Logz.io Cloud SIEM

Now, you can ship ModSecurity logs to Logz.io to automatically surface high-priority attacks identified by ModSecurity. Logz.io will automatically parse those logs to project a greater bird’s-eye-view of your security situation within dedicated dashboards. ModSecurity is a prolific web application firewall (WAF) popularly used to help secure web servers. It supports Apache HTTP, IIS, and NGINX. It can deploy either as a proxy server or within a web server itself.

Troubleshooting Kubernetes Job Queues on DigitalOcean, Part 1

Kubernetes work queues are a great way to manage the prioritization and execution of long-running or expensive menial tasks, such as processing large volumes of employee migration to a new system, ranking and sorting all the planets in the universe by Twitter tags, or even post-processing every frame of the latest Avengers movie.

Getting started with Elastic Cloud

Elastic Cloud puts the power of the Elastic Stack in your hands within minutes. Whether you’re trying to add search capabilities with Elastic Enterprise Search, monitor critical systems and applications with Elastic Observability, or protect your organization from cyber threats with Elastic Security, taking the first step is easy.

PostgreSQL vs MySQL: Use Cases & Attributes To Help You Choose

Choosing whether to go with PostgreSQL or MySQL depends on your needs as they are both great databases to use under different circumstances. In this article we will run through a few of the top reasons and use cases to help you choose between these choices for database creation. Note: As a matter of fact, MySQL is so popular it became part of the LAMP stack (Linux, Apache, MySQL, PHP) used for building many web servers.

Find logs fast with new "tail -f" functionality in Cloud Logging

When you’re troubleshooting an app or a deployment, every second counts! Cloud Logging helps you troubleshoot by aggregating logs from across Google Cloud, on-premises or other clouds, indexing, aggregating logs into metrics, scanning for unique errors with Error Reporting and making logs available for search, all in less than a minute. And now, we’ve built two new features for streaming logs to give you even fresher insights from your logs data.

Elastic Contributor Program: How to contribute code

We created the Elastic Contributor Program to encourage knowledge sharing in our community and to recognize and reward the hard work of our awesome contributors. There are six different contribution types accepted in the program: event organization, presentation, written content, video, translation, and code. In this blog post, we’ll cover how to contribute code in the many free and open projects that Elastic maintains.

Service Map & Dashboards (beta) Provide Insight into Health and Dependencies of Microservice Architecture

With almost every blog you read about monitoring, troubleshooting, or more recently, the observability of modern application stacks, you’ve probably read a statement saying that complexity is growing as a demand for more elasticity increases which makes management of these applications increasingly difficult. This blog will be no exception, but there’s a good reason for that: we just enabled the first Sumo Logic customers with powerful new tools to tackle these exact challenges.

Centralized Log Management and a Successful 2021

With 2020 dominated by a global pandemic, organizations expedited their digital transformation strategies. (According to TechFirst podcast, COVID19 accelerated digital transformation by an average of 6 years.) One of the most significant changes was the rapid move to a remote workforce. This required stopgap measures to keep the business running. While these measures met the company’s immediate needs, the measures also introduced anticipated and unanticipated issues.

How to escape special characters with Loki's LogQL

In my ongoing Loki how-to series, I have already shared all the best tips for creating fast filter queries that can filter terabytes of data in seconds. In this installment, I’ll reveal how to correctly escape special characters within a string in Loki’s LogQL. When writing LogQL queries, you may have realized that in multiple places you have to write strings delimited by double quotes.

Monitoring Microservices the Right Way

This article was originally published on InfoQ at December 3rd 2020. If you’ve migrated from a monolith to a microservices architecture you probably experienced it: Modern systems today are far more complex to monitor. Microservices combined with containerized deployment results in highly dynamic systems with many moving parts across multiple layers.

How to Connect Elastic Security to Jira - Version 7.10

Elastic Security cases provide the ability to open and track incidents directly in the app, which you can send to external systems like Atlassian’s Jira. Case connection for Atlassian’s Jira includes Jira Service Desk, Jira Core, and Jira Software. In this video, you’ll learn how to connect Elastic Security to the Jira Service Desk.

How to migrate from self-managed Elasticsearch to Elastic Cloud on AWS

Increasingly, we are seeing on-prem workloads being moved onto the cloud. Elasticsearch has been around for many years with our users and customers typically managing it themselves on-prem. Elasticsearch Service on Elastic Cloud — our managed Elasticsearch service that runs on Amazon Web Services (AWS), Google Cloud, and Microsoft Azure across many different regions, is the best way to consume the Elastic Stack and our solutions for enterprise search, observability, and security.

How to Contribute to Detection Rules in Elastic Security - Version 7.10

Elastic Security has open sourced all our detection rules to work alongside the security community to stop threats at scale and arm every analyst. As part of our belief in the power of open source, Elastic includes prebuilt rules within the Security App to detect threats automatically. In this video, you’ll learn how you can contribute by creating a new rule, adding your new rule to the detection rules repo, and getting credit for it in the Elastic contributor program.

Stop Enforcing Security Standards. Start Implementing Policies.

In days gone by, highly regulated industries like pharmaceuticals and finance were the biggest targets for nefarious cyber actors, due to the financial resources at banks and drug companies’ disposal – their respective security standards were indicative of this. Verizon reports in 2020 that, whilst banks and pharma companies account for 25% of major data breaches, big tech, and supply chain are increasingly at risk.

Is CloudWatch Really Cost Efficient?

One of the keys to CloudWatch’s success is its no bang, no buck billing system. The pricing structure has been designed from the outset to ensure that CloudWatch users only pay for what they actually use. In addition, the CloudWatch Free Tier allows first time users to test the waters without shelling out. The downside of this flexibility and adaptability is complexity.

Scale Your Prometheus Metrics Indefinitely with Thanos

Prometheus metrics are an essential part of your observability stack. Observability comes hand in hand with monitoring, and is covered extensively here in this Essential Observability Techniques article. A well-monitored application with flexible logging frameworks can pay enormous dividends over a long period of sustained growth, but Prometheus has a problem when it comes to scale.