Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Mastering Microsoft 365 Monitoring for Businesses

In the ever-evolving landscape of modern business, the shift towards cloud-based solutions has been nothing short of transformative. Among these technological advancements, Microsoft 365 has emerged as a cornerstone, offering a comprehensive suite of tools to streamline operations, boost collaboration, and enhance productivity. As organizations increasingly embrace the cloud, the need to ensure the performance, security, and availability of these critical services becomes paramount.

Telegraf Deployment Strategies with Docker Compose

This article, written by Shan Desai, was originally published on his blog and is reposted here with permission. Shan is a software engineer at Emerson Discrete Automation and an open-source contributor and DIY tech enthusiast currently working with Industrial IoT. Telegraf is widely used as a metric aggregation tool thanks to its wide range of plugins, which interface with a multitude of systems without requiring complex software logic.

What causes Azure costs to increase?

As the adoption of cloud computing continues to surge, Microsoft Azure remains one of the leading platforms for businesses seeking scalable and efficient cloud solutions. I have been using Azure for a couple of years now; it provides a wide range of services and features, allowing organizations to host applications, store data, and deploy various workloads on a pay-as-you-go basis.

Simplifying Data Lake Management with an Observability Pipeline

Data lakes can be difficult and costly to manage. They require skilled engineers to manage the infrastructure, keep data flowing, eliminate redundancy, and secure the data. We accept the difficulties because our data lakes house valuable information such as logs, metrics, and traces. To add insult to injury, the data lake can be a black hole, where your data goes in but never comes out. If you are thinking there has to be a better way, we agree!

Lookup Tables and Log Analysis: Extracting Insight from Logs

Extracting insights from log and security data can be a slow and resource-intensive endeavor, which is unfavorable for our data-driven world. Fortunately, lookup tables can help accelerate the interpretation of log data, enabling analysts to swiftly make sense of logs and transform them into actionable intelligence. This article will examine lookup tables and their relationship with log analysis.
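As a rough illustration of the idea, the sketch below enriches raw log records by joining them against a small in-memory lookup table. The table contents, the field names (`status`, `label`, `severity`), and the `enrich` helper are all hypothetical and are not taken from the article; a real log platform would typically provide its own lookup mechanism.

```python
# Hypothetical lookup table: HTTP status code -> human-readable context.
STATUS_LOOKUP = {
    200: {"label": "OK", "severity": "info"},
    404: {"label": "Not Found", "severity": "warning"},
    500: {"label": "Internal Server Error", "severity": "error"},
}

# Fallback used when a key is missing from the lookup table.
UNKNOWN = {"label": "unknown", "severity": "unknown"}


def enrich(record, lookup=STATUS_LOOKUP):
    """Return a copy of the log record with the matching lookup fields merged in."""
    extra = lookup.get(record.get("status"), UNKNOWN)
    return {**record, **extra}


# Hypothetical parsed log records.
logs = [
    {"path": "/login", "status": 200},
    {"path": "/missing", "status": 404},
    {"path": "/teapot", "status": 418},
]

enriched = [enrich(r) for r in logs]
for row in enriched:
    print(row)
```

The lookup is a plain dictionary access, so enrichment stays cheap per record, and the original records are left unmodified.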

Scaling Monitoring Administration with Experience-Driven NetOps: AppNeta and DX NetOps

Today, pretty much every critical business service, every critical employee job function, every critical customer transaction, and so much more are all reliant upon network connectivity. It falls to network operations (NetOps) teams to ensure network connections continue to support these demands. Over time, the scale and the complexity of the networks the organization relies upon have continued to grow, making the job of NetOps teams increasingly challenging.

What is Website Maintenance: Your Ultimate Guide to Keeping Your Site Functional

Website maintenance is not that different from keeping up with the maintenance of real brick-and-mortar stores. Would you shop at a dirty store, filled with broken furniture, and selling outdated products? We didn’t think so. Website maintenance plays the same role: it makes the business inviting, makes you look professional, and engages customers.

Observability and the DORA metrics

The Accelerate State of DevOps Report highlights four key metrics (known as the DORA metrics, for DevOps Research and Assessment) that distinguish high-performing software organizations: deployment frequency, lead time for changes, time-to-restore, and change fail rate. Observability can kickstart a virtuous cycle that improves all the DORA metrics.
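To make the four metrics concrete, here is a minimal sketch of how they might be computed from a list of deployment records. The record layout (`committed`, `deployed`, `failed`, `restored`), the sample data, and the `dora_metrics` helper are assumptions made for illustration only; the use of medians is also a choice of this sketch, not a definition taken from the report.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deployment records; field names are assumptions for illustration.
deployments = [
    {"committed": datetime(2023, 9, 1, 9), "deployed": datetime(2023, 9, 1, 15),
     "failed": False, "restored": None},
    {"committed": datetime(2023, 9, 2, 10), "deployed": datetime(2023, 9, 3, 10),
     "failed": True, "restored": datetime(2023, 9, 3, 12)},
    {"committed": datetime(2023, 9, 4, 8), "deployed": datetime(2023, 9, 4, 12),
     "failed": False, "restored": None},
    {"committed": datetime(2023, 9, 6, 9), "deployed": datetime(2023, 9, 6, 11),
     "failed": False, "restored": None},
]


def dora_metrics(deploys, period_days):
    """Compute the four DORA metrics over an observation window of period_days."""
    if not deploys or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    # Lead time for changes: commit to running in production.
    lead_times = [d["deployed"] - d["committed"] for d in deploys]
    failures = [d for d in deploys if d["failed"]]
    # Time-to-restore: failed deployment to service restored.
    restore_times = [d["restored"] - d["deployed"] for d in failures if d["restored"]]

    return {
        "deployment_frequency_per_day": len(deploys) / period_days,
        "median_lead_time": median(lead_times),
        "change_fail_rate": len(failures) / len(deploys),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


metrics = dora_metrics(deployments, period_days=7)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

In practice these records would come from a CI/CD system and an incident tracker rather than a hard-coded list.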

Less is more: How Grafana Mimir queries run faster and more cost efficiently with fewer indexes

Over the past six months, we have been working on optimizing query performance in Grafana Mimir, the open source TSDB for long-term metrics storage. First, we tackled most of the out-of-memory errors in the Mimir store-gateway component by streaming results, as we discussed in a previous blog post. We also wrote about how we eliminated mmap from the store-gateway and as a result, health check timeouts largely disappeared.