The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

InfluxDays Recap - Paul Dix and the Journey of InfluxDB

According to the old adage, life’s a journey, not a destination. The same can be said for software. It’s unlikely that any developer would ever say that something they built was truly done. There are always bugs to squash, features to add, and updates to implement. For a company intensely focused on time and the context of time, it comes as little surprise that these themes played a significant role in Paul Dix’s presentation for InfluxDays.

Performance Monitoring for AWS NICE DCV VDI and Cloud Protocol

In today’s article, I will highlight eG Enterprise’s monitoring capabilities for Amazon’s AWS NICE DCV VDI protocol, which was first used in Amazon’s AppStream 2.0 and is now also used in WSP 2.0 for the Amazon WorkSpaces service for digital workspaces.

Monitoring Cloud Database Costs with OpenTelemetry and Honeycomb

In the last few years, the usage of databases that charge by request, query, or insert, rather than by provisioned compute infrastructure (CPU, RAM, and so on), has grown significantly. They’re popular for many of the same reasons that serverless compute functions are: the cost scales with your usage. No one is using your site? No problem: you’re not charged.
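
To make the pricing difference concrete, here is a minimal arithmetic sketch. The function names and both prices are hypothetical values chosen for illustration; they are not taken from any vendor's price list.

```python
def request_based_cost(requests: int, price_per_million: float) -> float:
    """Cost when the database bills per request (hypothetical pricing)."""
    return requests / 1_000_000 * price_per_million


def provisioned_cost(hours: float, price_per_hour: float) -> float:
    """Cost when the database bills for provisioned capacity, used or not."""
    return hours * price_per_hour


# Hypothetical prices, for illustration only.
PRICE_PER_MILLION_REQUESTS = 1.25
PRICE_PER_PROVISIONED_HOUR = 0.50

idle_month = request_based_cost(0, PRICE_PER_MILLION_REQUESTS)
busy_month = request_based_cost(2_000_000, PRICE_PER_MILLION_REQUESTS)
always_on = provisioned_cost(720, PRICE_PER_PROVISIONED_HOUR)
```

Under these assumed prices, an idle month costs nothing on the request-based model, while the provisioned model bills for all 720 hours regardless of traffic.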

Modern Canadian MSSP drives next-gen MDR with Logz.io and Tines

Today’s Managed Security Service Providers (MSSPs) are trying to grow their business quickly, improving margins and onboarding customers with high-quality tool sets that scale with the business. This means reducing cost, improving onboarding time and building the next generation of Managed Detection and Response (MDR) to deal with threats that are increasing in volume and sophistication.

How to mute alerts during maintenance windows or scheduled backups?

The health management APIs in Netdata allow teams to eliminate unnecessary alerting during scheduled maintenance, testing, auto scaling events, and instance reboots. For SREs, it is crucial to filter out expected events during maintenance windows so that critical issues in the infrastructure can be pinpointed quickly. Every minute counts during troubleshooting, and any distraction that might hijack the process should be suppressed.
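
As a rough sketch of what muting could look like in practice, the snippet below builds and sends commands to the agent's health management endpoint. It assumes the `/api/v1/manage/health` path, the `cmd` values `SILENCE ALL`, `SILENCE`, and `RESET`, and the `X-Auth-Token` header as described in Netdata's documentation; the helper names, host, and token handling are illustrative, so check them against your agent version before relying on them.

```python
import urllib.request
from urllib.parse import quote, urlencode


def health_command_url(base_url: str, cmd: str, **selectors: str) -> str:
    """Build a health management URL, e.g. cmd='SILENCE ALL'.

    Optional selectors (such as context='load') narrow the command
    to matching alerts. Spaces are percent-encoded as %20.
    """
    params = {"cmd": cmd, **selectors}
    query = urlencode(params, quote_via=quote)
    return f"{base_url}/api/v1/manage/health?{query}"


def send_health_command(base_url: str, token: str, cmd: str, **selectors: str) -> str:
    """Send the command to the agent; the token is assumed to be provided by the caller."""
    request = urllib.request.Request(
        health_command_url(base_url, cmd, **selectors),
        headers={"X-Auth-Token": token},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode()


# Hypothetical usage around a maintenance window:
#   send_health_command("http://localhost:19999", token, "SILENCE ALL")
#   ... run the backup or maintenance task ...
#   send_health_command("http://localhost:19999", token, "RESET")
```

Sending `RESET` after the window is what restores normal alerting in this sketch; wrapping the two calls in a `try`/`finally` would keep alerts from staying muted if the maintenance task fails.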

AIOps (artificial intelligence for IT operations)

Artificial intelligence for IT operations (AIOps) is an umbrella term for the use of big data analytics, machine learning (ML) and other artificial intelligence (AI) technologies to automate the identification and resolution of common IT issues. The systems, services and applications in a large enterprise produce immense volumes of log and performance data. AIOps uses this data to monitor assets and gain visibility into dependencies within and outside of IT systems.

What is Jaeger Distributed Tracing?

Distributed tracing is the ability to follow a request through a software system from beginning to end. While that may sound trivial, in modern distributed architectures a single request can easily spawn multiple child requests to different microservices. These, in turn, trigger further sub-requests, resulting in a complex web of transactions to service a single originating request.
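
The idea can be sketched with a toy data model: every unit of work is a span, all spans for one request share a trace ID, and each child records its parent's span ID. This is a conceptual illustration only, not Jaeger's API; the `Span` class, helper functions, and service names are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    name: str
    trace_id: str                      # shared by every span in one request
    parent_id: Optional[str] = None    # None marks the originating request
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])


def start_trace(name: str, collected: List[Span]) -> Span:
    """Create the root span for a new originating request."""
    span = Span(name=name, trace_id=uuid.uuid4().hex)
    collected.append(span)
    return span


def start_child(parent: Span, name: str, collected: List[Span]) -> Span:
    """Create a child span that inherits the parent's trace ID."""
    span = Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)
    collected.append(span)
    return span


def render_tree(spans: List[Span], parent_id: Optional[str] = None, depth: int = 0) -> List[str]:
    """Rebuild the request tree from parent links, indenting by depth."""
    lines = []
    for span in spans:
        if span.parent_id == parent_id:
            lines.append("  " * depth + span.name)
            lines.extend(render_tree(spans, span.span_id, depth + 1))
    return lines


spans: List[Span] = []
root = start_trace("GET /checkout", spans)
cart = start_child(root, "cart-service", spans)
start_child(cart, "inventory-db", spans)
start_child(root, "payment-service", spans)
tree = render_tree(spans)
```

Because every span carries the same trace ID plus a link to its parent, a tracing backend can reassemble the full tree even when the spans are reported by different services.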