
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Easily Query Multiple Metrics in Prometheus

In monitoring setups, a single metric rarely tells the complete story. The real power of Prometheus lies in its ability to query multiple metrics simultaneously, creating connections between different data points that reveal the true state of your systems. This guide walks you through crafting effective multi-metric queries in Prometheus, from basic concepts to advanced techniques that will help you monitor and troubleshoot your infrastructure.

Apache Logs Explained: A Guide for Effective Troubleshooting

Apache logs are a critical tool for monitoring your web server, but they can often feel overwhelming. For DevOps teams, understanding these logs is essential for diagnosing issues and maintaining system reliability. In this guide, we'll explore the setup and analysis of Apache logs, offering practical tips to help you make sense of them and use them effectively for troubleshooting and optimization.

A Practical Guide to Monitoring Ubuntu Servers

Running Ubuntu servers without proper monitoring can lead to unexpected issues. For DevOps engineers and SREs, effective tracking is crucial for maintaining system health and performance. This guide covers everything you need to know about monitoring Ubuntu servers, from the basics to advanced strategies, helping you keep your systems running smoothly, whether you manage a single server or a large fleet.

Unlocking the Power of LLMs and AI Agents for Network Automation

Artificial intelligence is reshaping how enterprises manage and secure their networks, but not all AI is created equal, and not all Large Language Models (LLMs) are ready for the job. While tools like ChatGPT and Google Gemini are transforming communication and productivity, applying general-purpose LLMs to something as specialized and high-stakes as network operations is an entirely different challenge. Networks are dynamic, complex, and context-heavy.

Kubernetes Monitoring in 2025: The Complete Guide to Cluster Visibility

Kubernetes is the leading container orchestration platform for modern cloud-native applications. Its adoption has reached remarkable heights in 2025, and it now runs vital enterprise systems across financial technology and healthcare organizations. As Kubernetes environments grow more complex and more dynamic, monitoring has become an essential strategic practice.

Grafana Alerting Overview Plus New Features Coming to Grafana 12 | Grafana Labs

In this walkthrough, Grafana’s Ryan Kehoe dives into the biggest improvements designed to help teams create, manage, and route alerts with less friction and more power. Whether you're wrangling multi-source queries or managing alerts across large environments, these updates are for you.

Redoing My Progress WhatsUp Gold Home Lab with Proxmox: A Journey of Failover, Backup and Recovery

Greetings, tech enthusiasts! I hope you’re all doing well. Today, I’m thrilled to share the story of my recent adventure in rearchitecting my home lab with Proxmox. This journey has been a rollercoaster of unexpected challenges, valuable lessons, and rewarding successes. By leveraging modern virtualization and storage technologies, I built a resilient and efficient setup that exceeded my initial expectations.

Azure DevOps agent pools: diving deeper

Most of the time, the build and deployment pipelines we create will run on compute provided by the Azure DevOps cloud, and the only decision we need to make is whether to select a Windows or Linux agent. Sometimes, though, the specification of the VM that Azure DevOps spins up may not be right for our needs: we may need more memory or a particular OS version. This is when custom agents and agent pools come into play.