Multiple players, one stack: Inside Roblox's centralized observability stack

When you sign into the Roblox platform, you get 30 million immersive experiences, ranging from concerts to fashion shows to, of course, video games. But when the observability team at Roblox logs on, they’re not playing around. The Roblox observability engineers are responsible for keeping more than 214 million monthly users happy and engaged by making the wildly popular gaming platform highly available around the world.

Learn how to monitor IoT devices with Grafana

IoT devices open the door to all sorts of computing potential, but they can also produce a flood of telemetry data that users need to properly collect and monitor to ensure those devices are working properly. It’s no wonder so many individuals and businesses use Grafana for IoT use cases, whether they’re starting an aquaponic farm in South Africa, managing an industrial-scale electroplating factory in Ohio, or simply keeping tabs on Pretzel the python at its home in the UK.

Observability overload: Insights into the rise of tools, data sources, and environments in use today

With countless observability tools, data sources, and environments to juggle, the organizations that deploy and manage today’s distributed applications often face an uphill battle to gain visibility into their application performance. That was a key takeaway from the Grafana Labs Observability Survey 2023, which incorporated input from more than 250 industry practitioners who are all too familiar with these complexities.

How to troubleshoot memory leaks in Go with Grafana Pyroscope

Memory leaks can be a significant issue in any programming language, and Go is no exception. Despite being a garbage-collected language, Go is still susceptible to memory leaks, which can lead to performance degradation and cause your operating system to run out of memory. To defend itself, the Linux operating system implements an Out-of-Memory (OOM) killer that identifies and terminates processes that consume too much memory and cause the system to become unresponsive.

Grafana Cloud is now available in AWS Marketplace

Grafana Labs is excited to announce that Grafana Cloud is now available in AWS Marketplace. With this new offering, existing AWS customers can procure, deploy, and scale the fully managed Grafana LGTM observability stack (Loki for logs, Grafana for visualization, Tempo for traces, Mimir for Prometheus metrics) with just a few clicks.

Grafana Alerting: Searching for Grafana alerts just got faster, easier, and more accurate

Grafana Alerting enables users to create and customize alert rules as separate entities and link them to Grafana panels. It also supports various data sources with built-in alerting engines, such as Prometheus, Grafana Mimir, and Grafana Loki, allowing users to manage their alert rules directly from Grafana’s UI.

How to collect and query Kubernetes logs with Grafana Loki, Grafana, and Grafana Agent

Logging in Kubernetes helps you track the health of your cluster and its applications. Logs let you identify and debug issues as they occur, and they give you insight into application and system performance. Collecting and analyzing application and cluster logs can also reveal bottlenecks and help you optimize your deployment.

How to monitor Microsoft SQL Server performance with Grafana Cloud

A database is one of the most critical components of almost every application. Making sure it is running with the expected read and write latencies is paramount. This can be the difference between a smooth, pleasing user experience and a slow, error-filled one that makes your customers turn their backs on a product and never come back.

How to get started with monitoring Apache Cassandra with Grafana Cloud

Apache Cassandra is a highly scalable, open source NoSQL database system designed to handle large amounts of data across multiple commodity servers with no single point of failure. Apache Cassandra can be run as a single node, but it starts making sense when it's run in a cluster setup. The system is optimized for high write throughput and is known for its ability to handle big data workloads with ease at very low latencies.

Scrape Azure metrics and monitor AKS using Grafana Agent

As more organizations adopt cloud-based services like Microsoft Azure Kubernetes Service (AKS), it becomes increasingly important to monitor and manage the performance and reliability of these services. If you’re using AKS today, then Grafana Cloud provides the flexibility, performance, and visualizations you need to monitor your distributed applications.