Operations | Monitoring | ITSM | DevOps | Cloud

June 2021

Introducing Logz.io's New Lookz!

When Logz.io was founded in 2015, we set out to simplify logging with the ELK Stack by delivering Elasticsearch and Kibana as a managed cloud service. But logs only tell part of the story – DevOps teams also need metric and trace data to better monitor the health and performance of their environment and quickly pinpoint the root cause of new problems. Importantly, using multiple tools to collect and analyze this data adds complexity and extra work.

Introducing Live Tail

At observIQ, we pride ourselves on delivering simple and powerful functionality, quickly. We’re excited to announce the addition of Live Tail to the observIQ feature set. Live Tail emulates the terminal experience, giving you the ability to analyze, visualize and debug live – all in a single place. Never worry about the outcome of your deployment, because Live Tail lets you troubleshoot, react and reassess issues in your deployment in real time.

What We Learned About Enterprise Cloud Services From the 2021 Azure Outage

Azure, AWS, and GCP cloud services are invaluable to their enterprise customers. When providers like Microsoft are hit with DNS issues or other errors that lead to downtime, it has huge ramifications for their users. The recent Azure cloud services outage was a good example of that. In this post, we’ll look at that outage and examine what it can teach us about enterprise cloud services and how we can reduce risk for our own applications.

How Psyonix wins with better logging

When you grow your peak concurrent users by 5x nearly overnight, ensuring that your operations can successfully support that growth can make or break your success. Rocket League is a popular online multiplayer game created by Psyonix, described as arcade-style soccer and vehicular mayhem. In the summer of 2020, the game maker decided to switch the business model of the game from an upfront purchase to a free-to-play model.

That's A Data Problem - The New Normal | Daniel Newman & Splunk's Kristen Robinson

The COVID-19 pandemic accelerated digital transformation directly impacting companies’ people strategies. As companies adjust to the new normal, leadership must keep human centricity, flexibility and employee well-being at the top of their agenda - putting people at the heart of the process. Tech Analyst Daniel Newman and Chief People Officer at Splunk, Kristen Robinson, discuss how to manage the new normal. The pair touches on culture, connectivity and how to prepare for the unexpected.

Monitoring Kubernetes with the Elastic Stack using Prometheus and Fluentd

Kubernetes is an open source container orchestration system for automating computer application deployment, scaling, and management, and seems to have established itself as the de facto standard in this area these days. The shift from monolithic applications to microservices brought by Kubernetes has enabled faster deployment, where dynamic environments become commonplace. But on the other hand, this has made monitoring applications and their underpinning infrastructure more complex.

Art of Data: Bringing Data to Esports

You may have seen the announcement that Splunk and McLaren Racing have expanded their partnership, which sees Splunk as an Official Global Partner of the McLaren Shadow Esports team and the Logitech McLaren G Challenge. As a budding Esports fan and data enthusiast, it’s really exciting to see these two worlds collaborate and accelerate the virtual racing experience.

Understanding Where You Fit in the Web Performance Maturity Curve

We all know that faster is better. Research and results clearly indicate that faster experiences with fewer errors result in increased usage, conversion, and revenue. With the desire to improve business metrics in mind, organizations often seek immediate improvements in customer experience across digital properties. However, without proper planning and coordination, these attempts consistently fail.

Full Stack Django Monitoring, Part 2

In the first part of this series, we deployed a Django application on a DigitalOcean Droplet and created a simple Django application. To monitor our Django application, we installed the SolarWinds® APM Integrated Experience featuring AppOptics™, Loggly®, and Pingdom®. In the conclusion of this article, we’ll explore the different types of monitoring provided by the APM Integrated Experience.

An Intro Guide to Game Engine Logging & Locating Your Logs

Game development is an entirely different beast to other industries. Marketing, development, and release are more tightly interwoven than in other sectors, with a lot of pressure to meet community-anticipated milestones and launch. As such, it’s important to have game engine logging and monitoring pipelines set up for your projects. In other platforms, version upgrades and roll-outs tend to be sudden, with no definitive date set.

Product Demo

This 45-minute product demo shows how Coralogix is disrupting the application monitoring and observability market with our game-changing technology. We're working to redefine the way organizations approach logging in their modern DevOps and CI/CD environments. We are increasing developer productivity (less time searching the logs, more time developing), and saving companies upwards of 60% on the overall cost of data volume storage (due to our underlying architecture).

[Webinar] Troubleshooting in Fast Paced Environments with Komodor & Coralogix

On June 2nd, 2021, we participated in a live panel discussion with our friends from Coralogix, featuring our CTO & co-founder, Itiel Shwartz, and Coralogix’s Head of DevSecOps, Oded David. Widespread adoption of agile methodologies, CI/CD pipelines, distributed architectures, and more have enabled software development to reach a rate and scale that would have seemed unimaginable just a few years ago. Of course, along with the benefits of new methodologies and technologies comes a new set of troubleshooting challenges that need to be addressed as well.

That's A Data Problem - Thriving in an Uncertain World | Daniel Newman & Splunk's Doug Merritt

The COVID-19 pandemic unveiled the importance of business resiliency. Moving forward, the case for prioritizing business resilience is beyond doubt. Leadership must leverage data and system resilience to meet new threats that could impact their business model and operations. Tech Analyst Daniel Newman and CEO of Splunk, Doug Merritt, discuss how to build business resilience focusing on data strategies, people-first leadership and investing to be ready for a future of uncertainty.

That's A Data Problem - How Do Security Programs Drive Business Results?

The sheer number of cybersecurity attacks against companies continues to grow, and with accelerated cloud transformation, IT teams are facing new challenges. To drive innovation and stay competitive, companies need to ensure they are using cloud securely, prioritizing a security first approach and mitigating risks to drive business results.

Why You Need Real-Time for Faster MTTR

“If you ain't first, you're last.” While that famous one-liner from Ricky Bobby (Will Ferrell) in the cult hit Talladega Nights is more joke than catchphrase, it hits home for those of us in the world of DevOps and Observability. Faster is better. And in our technology-driven world of online transactions and complex environments, faster isn’t just better — it’s crucial.

Log Management Challenges in Modern IT Environments

Modern IT environments have presented many difficult-to-overcome challenges to organizations in recent times. One such challenge is gaining visibility into the systems. One may argue that due to cloud computing and limitless storage, it is now very easy to overcome some of the conventional challenges regarding visibility. However, the architecture has changed into infrastructure scheduling and microservices. Hardware and software programs are now more complex, with their own set of challenges.

The Importance of Log Management for Your Home Network

The team at observIQ is just like every one of you reading this: we are avid programmers, gamers, traders, thinkers, and innovators who build elaborate home networks for fun, for work, and for the simple reason that we enjoy technology. We are constantly growing the size and footprint of our home networks and labs as well – adding custom apps, devices, and servers, making it challenging to gauge our technical footprint.

How to Use Observability to Reduce MTTR

When you’re operating a web application, the last thing you want to hear is “the site is down." Regardless of the reason, the fact that it is down is enough to cause anyone responsible for an app to break out into a sweat. As soon as you become aware of an issue, a clock starts ticking — literally, in some cases — to get the issue fixed. Minimizing this time between an issue occurring and its resolution is arguably the number one goal for any operations team.

How Log Analytics Powers Cloud Operations: Three Best Practices for CloudOps Engineers

At the turn of the 20th Century, enterprises shut down their clunky generators and started buying electricity from new utilities such as the Edison Illuminating Company. In doing so, they cut costs, simplified operations, and made profound leaps in productivity. The promise of modern cloud computing invites easy comparisons to those first electric utilities: outsource to them, save money and simplify.

Is Operational Resilience in Financial Services actually just a data problem?

Operational resilience is currently a hot topic in Financial Services, largely because of the impact that COVID has had on how customers interact with financial institutions. Almost overnight, the financial services industry had to cope with a large volume of transactions moving to digital channels at the same time as its employees were forced to set up home offices so that they could continue to work remotely.

Easily ingest data to Elastic via Splunk

As organizations migrate to Elastic from incumbent vendors, quickly onboarding log data from their current solution into Elastic is one of the first orders of business. Data onboarding often involves having to adjust ingestion architecture and implement configuration changes across data sources. We want to ensure that users trialing or migrating to Elastic can get data in quickly to start seeing the power of Elastic solutions as quickly as possible.

New in Kibana: How we made it easier to manage visualizations and build dashboards

Our Kibana team has been hard at work implementing and executing on a new Kibana strategic vision to streamline the dashboard creation process and sand down the rough edges of creating visualizations for dashboards. We accomplished our goal and reduced the overall time it takes users to go from a blank slate to a meaningful dashboard that conveys insights about the data.

Using pre-built Monitors to proactively monitor your application infrastructure

SREs, developers and DevOps staff for mission-critical modern apps know that being notified in real time when or before critical conditions occur can make a massive difference in end-user digital experiences and in meeting a 99.99% availability objective.

The Spike Protection Bundle with Index Rate Alerting

For DevOps teams that want to accelerate release velocity and improve reliability, logs can unlock the insights you need to move faster. But for managers and budget owners, logging can be an unpredictable pain. Trying to estimate logging spend, especially with the adoption of microservices and container-based architecture, seems like an impossible task.

Announcing LogDNA Agent 3.2 GA: Take Control of Your Logs

The LogDNA Agent is a powerful way for developers and SREs to aggregate logs from their many applications and services into an easy-to-use web interface. With only 3 kubectl commands, the installation process is quick and simple to complete for any number of connected systems. To help control the logs that are stored and surfaced in the LogDNA web interface, users can set Exclusion Rules, which enable the exclusion of certain queries, hosts, and tags directly from the UI.

LogDNA | Log Management for DevOps

LogDNA is a modern log management solution that empowers DevOps teams with the insights that they need to develop and debug their applications with ease. Users can get up and running in minutes, see logs from any source instantly in Live Tail, and effortlessly search them with natural language. Custom Parsing, Views, and Alerts put users in control of their data every step of the way.

Instrumenting Java Applications for Tracing with OpenTelemetry and Jaeger

The aim of this article is to demonstrate how you can instrument a Java application using OpenTelemetry and Jaeger. In this example, we will be instrumenting our Java application using OpenTelemetry and the OpenTelemetry Java client, and the tracing data will be exported and visualized using Jaeger. We will use the Logz.io Jaeger backend as it is compatible with common tracing standards like Zipkin, OpenTelemetry, and OpenTracing.

Understanding the DoD's Data Strategy: Part 1

As my colleague, Tim Frank, wrote about recently in his blog post, "The Department of Defense Data Strategy: An Important Start," in late 2020 the Department of Defense (DoD) released its new Data Strategy — providing focus and direction for the Department’s efforts to become data-centric at all levels of its enterprise.

Introducing New Cloud Security Monitoring & Analytics Apps

Companies generate data at an exponential rate, and the task of analyzing data to produce relevant security insights can be overwhelming. With evolving market dynamics and threat landscapes, security teams have a greater need for integrated and scalable monitoring that provides real-time and meaningful insights into the state of organizational security posture.

Instrumenting Microservices with Istio for Distributed Tracing

Previously, I wrote a Beginner’s Guide to Jaeger + OpenTracing Instrumentation for Go providing guidance on manually instrumenting Go services. This is useful for cases where we want fine-grained tracing of specific functions. However, what if all we want is to trace a service’s inbound and outbound calls with little to no additional code?

Tutorial: Set Up Event Streams in CloudWatch

When building a microservices system, configuring events to trigger additional logic using an event stream is highly valuable. One common use case is receiving notifications when errors are seen in one of your APIs. Ideally, when errors occur at a specific rate or frequency, you want your system to detect that and send your DevOps team a notification. Since AWS APIs often use stateless functions like Lambdas, you need to include a tracking mechanism to send these notifications manually.
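As a rough illustration of the "errors at a specific rate" idea, the tracking mechanism can be sketched as a sliding-window counter. This is a generic, in-process sketch and not the CloudWatch or Lambda API: the class name `ErrorRateTracker`, the window length, and the threshold are all hypothetical, and a stateless function would need to keep this state in an external store instead of memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: counts errors inside a sliding time window.
class ErrorRateTracker {
    private final long windowMillis;   // length of the sliding window
    private final int threshold;       // errors per window that should trigger a notification
    private final Deque<Long> errorTimes = new ArrayDeque<>();

    ErrorRateTracker(long windowMillis, int threshold) {
        this.windowMillis = windowMillis;
        this.threshold = threshold;
    }

    // Records one error at nowMillis (timestamps are assumed to be non-decreasing)
    // and returns true when the number of errors still inside the window
    // has reached the threshold.
    boolean recordError(long nowMillis) {
        errorTimes.addLast(nowMillis);
        // Evict errors that have fallen out of the window.
        while (!errorTimes.isEmpty() && nowMillis - errorTimes.peekFirst() >= windowMillis) {
            errorTimes.removeFirst();
        }
        return errorTimes.size() >= threshold;
    }

    public static void main(String[] args) {
        // Assumed policy for the example: notify on 3 errors within 60 seconds.
        ErrorRateTracker tracker = new ErrorRateTracker(60_000, 3);
        long[] errorTimestamps = {0, 1_000, 2_000, 100_000};
        for (long t : errorTimestamps) {
            if (tracker.recordError(t)) {
                System.out.println("Threshold reached at t=" + t + " ms: notify the DevOps team");
            }
        }
    }
}
```

Passing the timestamp in explicitly, instead of reading the system clock inside the method, keeps the sketch deterministic and easy to test.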

How to Monitor Full-Stack Django Applications

Modern web applications can be complex. A typical application stack usually involves several components spread across different layers. For example, HTML5 and AngularJS can make up a site’s front end. User inputs and queries from the front end can be passed on to containerized microservices running on a middleware, which in turn could pass the queries to a back-end database. Systems like WAFs and LDAP servers can be used for security and authentication.

Monitor Salesforce logs with Datadog

Visibility into your Salesforce environment is crucial for keeping your data secure and ensuring a seamless user experience. That’s why we are excited to announce that Datadog can now collect Salesforce event logs directly from your Real-Time Event Monitoring stream, giving you deep insights into the security and operational performance of your Salesforce environment.

How to configure Elastic Cloud on Kubernetes with SAML and hot-warm-cold architecture

Elastic Cloud on Kubernetes (ECK) is an easy way to get the Elastic Stack up and running on top of Kubernetes. That’s because ECK automates the deployment, provisioning, management, and setup of Elasticsearch, Kibana, Beats, and more. As logging and metric data — or time series data — has a predictable lifespan, you can use hot, warm, and cold architecture to easily manage your data over time as it ages and becomes less relevant.

What the Fastly Outage Can Teach Us About Observability

On Tuesday June 8th, the Content Delivery Network Fastly experienced an outage that made large swaths of the web unavailable for nearly an hour. To focus on the positive, this outage can serve as a wakeup call for Observability teams, because it shows how much modern sites depend on resources beyond their immediate control, and how hard it is to "observe" these kinds of issues with an incomplete Observability mindset.

Key JVM Metrics to Monitor for Peak Java Application Performance

Monitoring is crucial if you want to see what happens in your system, and JVM-based applications are no different. Some metrics, like memory and garbage collection, require special attention because they play a major role in your application performance. In this blog post, we will look into the key Java Virtual Machine (JVM) metrics that you should monitor if you care about performance and stability: memory, garbage collection, and JVM threads.
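For a sense of what those three metric groups look like in practice, here is a minimal sketch that reads them from the JVM's built-in `java.lang.management` beans. The class name `JvmMetricsSnapshot` and the choice of fields are illustrative assumptions, not something taken from the article; a real setup would more likely ship these values to a monitoring backend than print them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: one-off snapshot of memory, GC, and thread metrics.
class JvmMetricsSnapshot {

    // Heap memory currently in use, in bytes.
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Total number of collections across all garbage collectors.
    // A collector may report -1 when the count is undefined; those are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds, same -1 handling.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Number of live threads, daemon and non-daemon.
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("GC collections:    " + totalGcCount());
        System.out.println("GC time (ms):      " + totalGcTimeMillis());
        System.out.println("live threads:      " + liveThreadCount());
    }
}
```

The values change from run to run, so the sketch shows no expected output.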

Multi-Project Cloud Monitoring made easier

Customers need scale and flexibility from their cloud and this extends into supporting services such as monitoring and logging. Google Cloud’s Monitoring and Logging observability services are built on the same platforms used by all of Google that handle over 16 million metrics queries per second, 2.5 exabytes of logs per month, and over 14 quadrillion metric points on disk, as of 2020.

Splunk Connector for Ivanti Device Control - Now Available!

Ivanti Device Control is all about securing your endpoints while also providing a detailed overview to quickly identify weak links in your environment. The latter has now become much simpler and quicker to perform! Our new Splunk connector enables you to connect directly to your Ivanti Device Control environment, feeding in all reported events and showing you the most important data in a single dashboard.

How to Monitor Application Logs

In the beginning, there was the Log – or to be a bit more precise, there were application logs. At least that's how it was in the early days of application development, when raw log data itself was more often than not the point where troubleshooting began. Now, of course, the starting point for troubleshooting with cloud-based applications is much more likely to be an automatically-generated alert, or an indication on a monitoring dashboard that something isn't quite right.

Monitoring Kafka Performance with Splunk

Today’s business is powered by data. Success in the digital world depends on how quickly data can be collected, analyzed and acted upon. The faster the speed of data-driven insights, the more agile and responsive a business can become. Apache Kafka has emerged as a popular open-source stream-processing solution for collecting, storing, processing and analyzing data at scale.

Collecting Kafka Performance Metrics with OpenTelemetry

In a previous blog post, "Monitoring Kafka Performance with Splunk," we discussed key performance metrics to monitor different components in Kafka. This blog is focused on how to collect and monitor Kafka performance metrics with Splunk Infrastructure Monitoring using OpenTelemetry, a vendor-neutral and open framework to export telemetry data. In this step-by-step getting-started blog, we will.

Classic Event Viewer Retires

The classic event viewer, introduced in June 2011, has been the heart of SolarWinds® Papertrail™. It’s where we spend most of our time, searching, tailing, and sharing event data. Over the last 10 years, Papertrail fans across the globe have shared their ideas with our development team and helped us improve and refine the event viewer.

Panel Discussion: Troubleshooting in Fast-Paced Environments

Widespread adoption of agile methodologies, CI/CD pipelines, distributed architectures, and more have enabled software development to reach a rate and scale that would have seemed unimaginable just a few years ago. Of course, along with the benefits of new methodologies and technologies comes a new set of troubleshooting challenges that need to be addressed as well. In this Panel discussion, we'll cover the new challenges in accelerated pipelines and how to overcome them.

Visualize Humio logs alongside your other data sources in Grafana Cloud with the new plugin for Grafana

Being able to get the big picture and immediately pivot between siloed data is one of the key values Grafana Cloud provides. Our composable observability platform integrates Prometheus and Graphite metrics, Loki logs, and Tempo traces with Grafana — and also allows you to draw data in from other sources of your choice concurrently.

Elastic License Update

In January 2021, we announced that starting with version 7.11, we would be changing the Apache 2.0 portions of Elasticsearch and Kibana source code to be dual licensed under Elastic License and SSPL, at the users’ discretion. As part of that change, we created Elastic License 2.0 (ELv2) as a permissive, fair-code license, which allows free use, redistribution, modification, and derivative works, with only three simple limitations, outlined in our original announcement.

Introducing Sensu

Since 2010, it has been Sumo Logic’s mission to democratize machine data. Naturally, we tend to focus on the outcomes: reliable and secure applications and systems that are the engines of successful modern businesses. But to drive these outcomes, and before the spotlight-hogging analytics kick in, algorithms require data. And this is where the magic starts! Sensu has been working on championing a monitoring as code approach to building observability pipelines for a decade now.

Why Are SaaS Observability Tools So Far Behind?

Salesforce was the first of many SaaS-based companies to succeed and see massive growth. Since they first started out in 1999, Software-as-a-Service (SaaS) tools have taken the IT sector and, well, the world by storm. For one, they mitigate bloatware by moving applications from the client’s computer to the cloud. Plus, the sheer ease of use brought by cloud-based, plug-and-play software solutions has transformed all sorts of sectors.