The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

The Benefits of Visibility in Higher Education Networks

Higher education institutions face unique cybersecurity challenges due to their complex networks, diverse user base and open academic environments. With thousands of students, staff and faculty members accessing resources from various locations and devices, universities need visibility into what’s happening on their networks, along with robust and responsive cybersecurity protection to safeguard them.

What is Java Performance Monitoring? [A Guide to DevOps Engineers]

You rolled out a Java application that worked fine in development. Fast, clean, no errors. However, once it went into production, things began to change. Suddenly, the app feels slow. CPU usage climbs without warning. Some users start getting timeouts. You check the dashboards, but nothing jumps out. You look through the logs, but it's mostly noise. And then the questions start coming in: "Is the JVM the problem?" If you've been in that situation, you're not alone.

Advanced Proactive SSL Certificate Monitoring

eG Enterprise version 7.5 introduces advanced capabilities for detailed SSL certificate monitoring, including monitoring of web servers and apps that use SSL. Monitoring SSL certificates is essential to ensure secure connections, prevent service outages, and maintain user trust. Here are a few things you need to monitor and questions you should ask to keep your services and apps running reliably and securely.

Securely query data sources on your Tailscale network using Private Data Source Connect in Grafana Cloud

Balancing security with your observability needs can be a difficult task. We know our users want to leverage platforms like Grafana Cloud to visualize and gain valuable insights into their data, while also keeping their data sources private and secure.

SMS alerts enabled for Early Warning Signals

When service disruptions happen, every second counts. That’s why we’re excited to announce a major update to StatusGator: Early Warning Signals are now available via SMS. Early Warning Signals have already been helping teams stay ahead of outages via email and Slack alerts — and now, with SMS support, you can get real-time notifications directly on your phone, even before incidents are publicly acknowledged.

Use Telegraf Without the Prometheus Complexity

Every system needs observability. You need to know what your CPU, memory, disk, and network are doing, and maybe keep an eye on database query latency or Redis connection counts. But setting that up isn’t always simple. You start with a couple of shell scripts. Then come exporters. Then Prometheus. Before long, you’re managing scrape configs, tuning retention, and watching dashboards fail under load after two days of data.

OpenTelemetry NestJS Implementation Guide: Complete Setup for Production [2025]

NestJS applications require comprehensive monitoring to ensure optimal performance and rapid issue resolution. As your application grows—spanning multiple services, databases, and external APIs—understanding what's happening under the hood becomes critical. That's where OpenTelemetry comes in. OpenTelemetry provides vendor-agnostic observability for your NestJS applications through distributed tracing, metrics, and logs.

Monitoring Ruby on Rails applications with Applications Manager

Ruby on Rails is the go-to framework for organizations to build flexible, database-driven web applications with high speed and efficiency. Enterprises of all sizes rely on it to build user-friendly applications. But like any other modern web stack, optimizing the performance, availability, and reliability of Rails applications, especially in production environments, requires more than just reactive bug fixes.

AIOps in 2025: 4 Components and 4 Key Capabilities

AIOps, or Artificial Intelligence for IT Operations, is the application of artificial intelligence and machine learning to automate and improve IT operations. It combines big data analytics, AI, and machine learning to monitor, manage, and optimize IT environments, enabling organizations to proactively detect, diagnose, and resolve issues more efficiently than traditional methods.

Architecting for Value: A Playbook for Sustainable Observability

You’ve built something amazing. Your services are scaling, your users are happy, and your team is shipping code like never before. Then the cloud bill arrives, and one line item makes your eyes water: observability. That Datadog invoice feels less like a utility bill and more like a ransom note. It’s a modern engineering paradox. The tools that give you sight into your complex systems are the same ones that can blind you with runaway costs.