
The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

How to run Loki at scale on Kubernetes (Loki Community Call January 2025)

Happy New Year from the Loki Engineering team. To kick off 2025, Nicole and Jay will be joined by Poyzan Taneli from the Loki Engineering team to discuss how to run Loki at scale on Kubernetes. If you are currently running Loki in microservices mode or preparing to do so, we will be discussing best practices for scaling its components to meet the demands of production use cases.

Get Started with the TIG Stack and InfluxDB Core

Time series data is everywhere—from IoT sensors and server metrics to financial transactions and user behavior. To collect, store, and analyze this data efficiently, you need tools purpose-built for the job. That’s where the TIG Stack comes in: Telegraf for data collection, InfluxDB for storage and analytics, and Grafana for visualization. Together, these tools offer a powerful solution for real-time analytics, observability, and monitoring.

Databases and SLOs: How to apply service level objectives to your databases with synthetic monitoring

Wilfried Roset is an engineering manager who leads an SRE team and is a Grafana Champion. He prioritizes sustainability, resilience, and industrialization to guarantee customer satisfaction. Databases are now a common building block of information systems. Whether relational or NoSQL, self-managed or as-a-service, these databases often play a critical role in the overall health of your applications.

AI-Powered Log Management: Faster Troubleshooting with Logz.io

Managing logs in a fast-paced cloud-native world can be tough. Log data is growing, and traditional tools just can’t keep up. That’s where Logz.io comes in—a log management and analytics platform powered by AI to make troubleshooting, performance monitoring, and collaboration faster and easier than ever.

AI in Observability: Mapping Root Causes with Precision

Explore how AI is transforming observability by mapping system connections and uncovering root causes with precision. The Logz.io AI Agent analyzes logs, metrics, and service dependencies to provide actionable insights without the need to sift through overwhelming amounts of data.

DX Operational Observability: Onboarding OpenTelemetry in Minutes

In the era of cloud-native applications, robust observability is critical for maintaining performance, identifying issues, and enhancing user experiences. DX Operational Observability (DX O2) integrates with OpenTelemetry, a leading open-source observability framework. In this blog post, we explore how to onboard the OpenTelemetry Demo Application to DX O2. The demo application provides a hands-on introduction to combining the two tools.