The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Windows Server Monitoring with Pandora FMS

Pandora FMS is a proactive, advanced, flexible, and easy-to-configure monitoring tool that can be tailored to each business. It adapts to a wide range of needs, covering servers, networked computers, and other devices. In this article, we will focus on Windows Server monitoring, using the software agent installed on our server.

Monitoring Java applications with Elastic: Multiservice traces and correlated logs

In this two-part blog post, we’ll use Elastic Observability to monitor a sample Java application. In the first blog post, we started by looking at how Elastic Observability monitors Java applications. We built and instrumented a sample Java Spring application composed of a data-access microservice supported by a MySQL backend. In this part, we’ll use Java ECS logging and APM log correlation to link transactions with their logs.

Manage Your Splunk Infrastructure as Code Using Terraform

Splunk is happy to announce that we now have a HashiCorp-verified Terraform Provider for Splunk. The provider is publicly available in the Terraform Registry and can be used by referencing it in your Terraform configuration file and then running terraform init. If you're new to Terraform and Providers, note that you will need to download the appropriate binaries for the latest version of Terraform and have it installed before using the provider.

Monitor Alcide kAudit logs with Datadog

Kubernetes audit logs contain detailed information about every request to the Kubernetes API server and are critical to detecting misconfigurations and vulnerabilities in your clusters. But because even a small Kubernetes environment can rapidly generate lots of audit logs, it’s very difficult to manually analyze them.

Introducing Prometheus-style alerting for Grafana Cloud

Hi! My name’s Richard Lam, and I’m the new product manager for Grafana Cloud. I’m really excited for my first contribution to this community, both so I can introduce myself to you all, and so I can highlight an awesome new Grafana Cloud feature that’s coming your way! Happy reading, and hopefully this is just the start of many more communications from me.

TrackJS for Node

TrackJS error monitoring, on your servers. We’re thrilled to announce official support for Node environments and the 1.0.0 release of our Node agent. We’ve actually had Node support since sometime last year, but we’re finally formalizing it as a first-class citizen and fully supported part of TrackJS! Here are some of the cool things you can do with TrackJS for Node.

Using Machine Learning for Root Cause Analysis

From a security breach to a complete system outage, when an incident occurs and your network or service is impacted, it’s typically the result of a chain of events. A problem with one service has impacted another service, and so on until finally, you’re facing a problem that’s compromising availability and damaging your customer experience. In the event of a serious incident, your team’s immediate response is to focus on identifying the root cause and restoring service.

How To Succeed When Adopting A Multi Cloud Environment

Today, the vast majority of companies work with multiple cloud providers. But moving IT operations to the cloud has significant consequences that they need to deal with. Discover how Broadcom helps customers manage critical workloads in multi-cloud environments, simplifying and accelerating the deployment of new business services.

How Automation Helps The Site Reliability Engineer

Automation has been with us for decades now, and with years of experience and experimentation, we are arriving at a best practice known as site reliability engineering. Site reliability engineering seeks to manage the risk imposed by multiple agile changes in order to protect business revenues and sustain positive customer experiences.