The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Why we generate & collect logs: About the usability & cost of modern logging systems

Logs and log management have been around far longer than monitoring, and it is easy to forget just how useful and essential they can be for modern observability. Most of you will know us for VictoriaMetrics, our open source time series database and monitoring solution. Metrics are our “thing”, but as engineers, we’ve had our fair share of frustrations with modern logging systems that tend to create further complexity rather than remove it.

Enable and use GKE Control plane logs

Are you having issues with the control plane components in your GKE cluster? Are you interested in gaining visibility into the control plane side of the cluster so you can troubleshoot issues yourself? Then GKE Control Plane Logs is a great way to gain insight into what's going on with your cluster. In this video, we provide a quick overview of control plane components and logs, and show how to enable control plane logs on new and existing GKE clusters. Watch this video to learn how to use control plane logs to troubleshoot webhook and control plane latency issues in GKE clusters.

Announcing Easy Connect - The Fastest Path to Full Observability

Logz.io is excited to announce Easy Connect, which will enable our customers to go from zero to full observability in minutes. By automating service discovery and application instrumentation, Easy Connect provides nearly instant visibility into any component in your Kubernetes-based environment – from your infrastructure to your applications. For as long as applications have been monitored, collecting logs, metrics, and traces has often been siloed and complex.

Quickstart network investigations with NPM's story-centric UX

Datadog Network Performance Monitoring (NPM) gives you visibility into all the communication that takes place between the network components in your environment, including hosts, processes, containers, clusters, zones, regions, and VPCs. As organizations scale, and as their networks grow in complexity, the massive volume of network data to be monitored can become overwhelming. Knowing precisely what network data to surface to resolve issues within these larger environments can be a challenge.

3 Steps to Get DX NetOps Events in Slack and Google Chat

Network operations centers (NOCs) play a critical role in any organization’s operational and business continuity. To meet their vital charters, NOC teams must constantly strive to maintain uninterrupted network availability and to minimize the business impact of network issues. Within the NOC, effective collaboration is essential for quick troubleshooting and resolution of network issues.

July Product Updates for Sentry

In July, the Sentry dev team dropped new capabilities to help you better understand, prioritize, and respond to errors and performance problems. From new ways of sorting priority issues to helping you be more proactive in identifying problems earlier in the dev lifecycle, we’ve picked a handful of recent releases to dive into. Plus we’ll highlight a couple of new integrations with our friends at Slack and Atlassian.

Pump the Brakes: Some Key Considerations in Your Journey to AIOps

Every well-oiled machine needs both a gas and a brake pedal. If our article titled How IT Teams Can Leverage AIOps’ Capabilities is the gas pedal in this analogy, then this article is the proverbial brake pedal: here we explore some educational pit stops organizations should make on their way to integrating artificial intelligence (AI) and machine learning (ML) into their IT operations (AIOps).

Automatic Instrumentation for OpenTelemetry Go

The OpenTelemetry Go project now supports automatic instrumentation via eBPF! This is a big milestone for the project and makes it significantly easier to generate data from your Go apps. The automatic instrumentation agent is still in s/alpha/beta today, but it’s ready for you to try on your applications!