
July 2023

Sponsored Post

5 ELK Stack Pros and Cons

Is your organization currently relying on an ELK cluster for log analytics in the cloud? While the ELK stack delivers on its major promises, it isn't the only search and analytics engine - and may not even be your best option for log management. As cloud data volumes grow, ELK monitoring can become too costly and complex to manage. Fast-growing organizations should consider innovative alternatives offering better performance at scale, superior cost economics, reduced complexity and enhanced data access in the cloud.

Monitoring MLOps Workflows with Flyte-powered Grafana Dashboards - Civo Navigate NA 2023

Learn how to monitor MLOps workflows effectively with Flyte-powered Grafana dashboards in this talk from Navigate NA 2023. Shivay Lamba discusses the importance of monitoring in the MLOps journey, highlighting the unique challenges in monitoring machine learning models compared to standard software development. Discover how Flyte, an open-source project, can help manage and monitor ML tasks efficiently, and see a live demonstration of setting up Grafana dashboards to visualize system metrics like CPU and GPU utilization. Take advantage of this opportunity to enhance your MLOps monitoring skills and optimize your machine learning workflows.

How to monitor your feature flags with LaunchDarkly and Grafana

Feature management is an emerging set of tools and techniques for developing and testing software based around feature flags. It’s intended to increase productivity and performance, as well as improve software quality. Of course, you’ll also need to keep tabs on all those feature flags, so it only makes sense to pair feature management with observability for a more holistic view of your software development cycles.

How Grafana query caching and Amazon Timestream make dashboards faster and more cost-effective

This blog post was co-authored by Igor Shvartser, Senior Technical Product Manager at Amazon Timestream, and Michael Mandrus, Senior Software Engineer at Grafana Labs. Grafana Labs Senior Software Engineers Stephanie Hingtgen and Kevin Minehart also helped with the content.

Simplify managing Grafana Tempo instances in Kubernetes with the Tempo Operator

I’ve been working with Grafana Tempo for about half a year now, and one thing I like about it is that Tempo requires only object storage for storing traces, which is easy to set up in both cloud environments and on-premises. Another outstanding feature is TraceQL, which allows searching for relevant traces with a powerful query language.

Dashboard Fridays: Public Releases

Built using the Jira and Pendo plugins, this SquaredUp dashboard provides a sense of how popular our latest Dashboard Server release is with customers, and whether any bugs have been raised against it. We can now get a quick overview of any issues in the latest release that are affecting our customers. Plus, through the level of uptake of the new release, we can see if we have achieved the level of quality that we were aiming for.

New in Grafana 10: A UI to easily configure SAML authentication

In addition to the built-in user authentication that utilizes usernames and passwords, Grafana also provides support for various mechanisms to authenticate users, so you can securely integrate your instance with external identity providers. We are excited to announce that with the release of Grafana 10.0, we have introduced a new user interface that simplifies the configuration of SAML authentication for your Grafana instances.

Four reasons to try our next-gen dashboards

When you need to troubleshoot faster, rich out-of-the-box content lets you easily monitor the tools in your technology stack. Dashboards are key to our customers’ success — offering you deep insights at a glance and the ability to drill into the details most important to you. A couple years ago, we debuted a new style of dashboards, built on top of a scalable, flexible and extensible charting system.

OpenTelemetry demo app with Grafana, Loki, Prometheus, Tempo (Grafana Office Hours #06)

DevOps Engineer Blueswen Li 劉義瑋 joins us to walk us through some OpenTelemetry demo apps he created, instrumented with Grafana, Loki, Prometheus, and Tempo. He is joined by two of our Developer Advocates, Paul Balogh and Nicole van der Hoeven.

5 steps to start saving on your observability bill with Grafana Cloud Adaptive Metrics

In the observability space, it seems like everyone is talking about how to reduce costs and control the explosion of Prometheus metrics. It’s no wonder — our recent analysis of user environments suggests 20% to 50% of metrics generated are never used, but users are still stuck paying for them.

Using An Infrastructure Monitoring Dashboard

As businesses embrace more cloud-native technologies and IT infrastructure becomes more dispersed, they must connect their business goals and end-user experience with the availability and performance of their IT infrastructure. This change necessitates infrastructure monitoring to assure compatibility with cloud environments, operating systems, storage, servers, virtualized systems, and other components.

Jira Product Discovery Explained

Maidenhead Atlassian Community Event (ACE) are joined by Rina Nir of Radbee and Phill Fox of Adaptavist for a closer look at Atlassian's newest product, Jira Product Discovery, which claims to make it easier for Product Managers to prioritize ideas and create roadmaps. Rina and Phill put it to the test, show us what it can do, and share tips and tricks to get the most from it.

Lessons learned from integrating OpenAI into a Grafana data source

Interest in generative AI and large language models (LLMs) has exploded thanks to a slew of announcements and product releases, such as Stable Diffusion, Midjourney, OpenAI’s DALL-E, and ChatGPT. The arrival of ChatGPT in particular was a bellwether moment, especially for developers. For the first time, an LLM was readily available and good enough that even non-technical people could use it to generate prose, rewrite emails, and generate code in seconds.

How to monitor your Apache Mesos clusters with Grafana Cloud

We’re excited to introduce a dedicated Grafana Cloud solution for Apache Mesos, an open-source project for managing clusters in your data center and at cloud scale. Apache Mesos is a distributed systems kernel, running on every machine in a cluster and providing easy orchestration of every resource in the cluster. This allows you to treat compute units, memory, and disk as a single pool of resources.

How Worldline uses Grafana Enterprise and Grafana Mimir to run its platform-as-a-service at a global scale

According to the World Bank, two-thirds of adults around the globe currently make or receive digital payments. Businesses have come to expect quick, reliable processing, and one company at the forefront of that is Worldline. The global payment service provider (PSP) is a leading payment processor and payment provider in Europe, with about 3.4 billion e-commerce transactions made in 2022.

Dashboard Fridays: Steam Player Data

This is a fun dashboard to capture some Steam player statistics using the WebAPI plugin. Created by SquaredUp's Director of Engineering, Josip Dlaka, this handy dashboard displays how long his kids have been online, how many friends they have, and what they have achieved without even leaving their room! SquaredUp allows you to combine and visualize data from multiple data sources in a meaningful way, so this aesthetically pleasing dashboard gives a good overview of key Steam player metrics in Josip's household.

A practical guide to data collection with OpenTelemetry and Prometheus

Grafana Labs has always been actively involved in the OpenTelemetry community, even working with the predecessor projects OpenTracing and OpenCensus. We have been supporting OTLP as the primary input protocol for our distributed tracing project, Grafana Tempo, since its inception, and our Grafana Agent embeds parts of the OpenTelemetry Collector.

Real user monitoring in Grafana Cloud: Get frontend error tracking, faster root cause analysis, and more

The frontend of a web application is the part that users directly interact with. It’s the last mile of the digital service you deliver to your customers and it’s directly associated with customer satisfaction and business objectives. Knowing performance metrics such as CPU or memory is helpful, but at the end of the day, what you care most about is if the user experience is affected.

Grafana Agent v0.35 release: horizontal auto scaling, easy Flow mode migration, and more

Grafana Agent v0.35 is here! The latest release of the Grafana Agent brings with it loads of new features and enhancements. Today, we’ll highlight our work on horizontal scalability and making it simpler than ever to get started using the Agent. Let’s take a look!

Dashboard Fridays: Antarctic Observation Center

This Antarctic Observation Center dashboard shows important weather data for each of the four key research stations, information on the areas of research that they cover, and the human capacity of each station. Built for fun using the SquaredUp Web API plugin, this dashboard streams data from multiple websites including OpenWeatherMap and the British Antarctic Survey, and combines them into one easy view.

Celebrating Grafana 10: Top 10 Grafana features you need to know about

Since Grafana started 10 years ago, there have been more than 43,000 commits to the open source project. Grafana founder Torkel Ödegaard has made more than 7,600 of those commits, and he recently reflected on some personal favorites he’s worked on, ranging from early query builders to the latest navigation updates. Torkel isn’t the only one who has strong feelings.

OpenSearch Dashboards vs Kibana

In this guide, we compare two leading data visualization tools that are built on open-source software and used for metrics, traces, and log analysis. To help new users decide which solution best suits their needs, we explore the differences between OpenSearch Dashboards and Kibana across various aspects.

How to monitor Kubernetes network and security events with Hubble and Grafana

Anna Kapuścińska is a Software Engineer at Isovalent with rich experience wearing both developer and SRE hats across the industry. She now works on Isovalent observability products such as Hubble, Tetragon, and Timescape, as well as the respective Grafana integrations for all of them.

Monitor the past, present, and future of your Kubernetes resource utilization

Greetings, Kubernetes Time Lords! Through a series of recent updates to our multi-purpose Kubernetes Monitoring solution in Grafana Cloud, we’ve made it easier than ever to assess your resource utilization, whether you’re looking at yesterday, today, or tomorrow. All companies that use Kubernetes, regardless of size, should monitor their available resource utilization. If a fleet is under-provisioned, the performance and availability of applications and services are at serious risk.

How to display a metric on a Graphite dashboard

Graphite is free and open-source software. It is a time-series monitoring tool with which you can collect, store, and display time-series data in real time. To visualize the metrics you monitor, Graphite includes a simple and useful dashboard. This article will show you how to display a metric on your Graphite dashboard. MetricFire specializes in monitoring systems.
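
Before a metric can be displayed on a dashboard, Graphite has to receive it. As a minimal sketch (not taken from the article), the helper below formats one data point in Carbon's plaintext format, which is a metric path, a value, and a Unix timestamp separated by spaces and ended with a newline. The function name `graphite_line` and the metric path `servers.web01.cpu` are hypothetical, chosen only for illustration.

```python
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point as a Carbon plaintext line: '<path> <value> <timestamp>\\n'."""
    # Fall back to the current time when no timestamp is supplied.
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


# Hypothetical metric name and value, for illustration only.
print(graphite_line("servers.web01.cpu", 0.42, 1688169600), end="")
```

A line like this would typically be written to a TCP socket connected to the Carbon listener (port 2003 by default); the sketch stops at formatting so that it runs without a Graphite server.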

Trusted Types: How we mitigate XSS threats in Grafana 10

Grafana is a rich platform for data visualization, giving you full control over how your data should be visualized. However, this flexibility and freedom comes with some challenges from a security perspective — challenges that need to be solved to protect the data in Grafana. For years, cross-site scripting (XSS) has been among the most common web application security vulnerabilities.

Distributed tracing for testing with Grafana Tempo and Tracetest (Grafana Office Hours #05)

Did you know you can use distributed tracing for testing with Grafana Tempo and Tracetest? Distributed tracing can really help you drill down from metrics to root causes, but how can you automate it? Adnan Rahić, Senior Developer Advocate at Tracetest.io, shares how you can do just that, using Grafana + Grafana Tempo + Tracetest.

How we improved Grafana's alert state history to provide better insights into your alerting data

The Prometheus alerting model is a flexible tool in every observability toolkit. When enhanced with Grafana data sources, you can easily alert on any data, anywhere it might live, using the battle-tested label semantics and alerting state machine that Prometheus defines. Often, engineers want to see patterns in their alerts over time, in order to observe trends, make predictions, and even debug alerts that might be firing too often.

Monitor behind a firewall w/ Private Data source Connect on Grafana Cloud (Grafana Office Hours #04)

How do you monitor behind a firewall? With Grafana Cloud Private Data Connect, you can create a secure tunnel to query data sources, even including those in VPCs on public cloud vendors. Senior Software Engineers Stephanie Hingtgen and Georges Chaudy talk to Senior Developer Advocate Nicole van der Hoeven about how it works.

How to visualize time series from SQL databases with Grafana

Relational databases like MySQL, PostgreSQL, Oracle, and others have a wealth of time series data locked inside of them. Often this data can be used to enhance observability dashboards, or keep track of important application factors, like how many users have signed up for a service. In this article, we’re going to show you how to visualize any time series from any SQL database in Grafana using the time series visualization.
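
To make the idea concrete, here is a minimal sketch of turning relational rows into a time series, using the signup example mentioned above. It is an illustration under stated assumptions rather than the article's method: the `users` table, its `created_at` column, and the function name `signups_per_day` are hypothetical, and SQLite stands in for whichever SQL database is actually in use.

```python
import sqlite3


def signups_per_day(conn):
    """Return (day, count) rows: one time-series point per calendar day."""
    query = """
        SELECT date(created_at) AS day, COUNT(*) AS signups
        FROM users
        GROUP BY day
        ORDER BY day
    """
    return conn.execute(query).fetchall()


# Hypothetical in-memory table standing in for a real application database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO users (created_at) VALUES (?)",
    [("2023-07-01 09:15:00",), ("2023-07-01 17:40:00",), ("2023-07-02 08:05:00",)],
)
print(signups_per_day(conn))
```

The shape of the result is what matters: a time column plus one numeric column, ordered by time, is the form a time series panel generally expects from a SQL query.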

Breaking the memory barrier: How Grafana Mimir's store-gateway overcame out-of-memory errors

Grafana Mimir is an open source distributed time series database. Publicly launched in March 2022, Mimir has been designed for storing and querying metrics at any scale. Highly available, highly performant, and cost-effective, Mimir is the underlying system powering Grafana Cloud Metrics, and it’s used by a growing open source community that includes individual users, small start-up companies, and large enterprises like OVHcloud.

DORA Metrics Considerations

DORA metrics, not to be confused with the beloved children’s cartoon character, are a bit trendy at the moment in the world of technology. The DevOps Research and Assessment group (DORA) is run out of Google. They run surveys and do research into what makes organizations successful in the Digital Age. They’re probably best known for their yearly State of DevOps Reports and the book Accelerate.

How to run faster Loki metric queries with more accurate results

Today I want to talk about metric queries. More specifically, I want to talk about an important concept that is going to make your queries run faster, give you more accurate results, and make your Grafana Loki operators (like me) much happier. A metric query in Loki looks like this: And the part I want to talk about is that at the end. Now, if you’re like me and have a short attention span and are already bored — I understand.

How to fix performance issues using k6 and the Grafana LGTM Stack

The Grafana Labs ecosystem is built on a range of different projects that incorporate logs, metrics, traces across load testing, and Kubernetes monitoring. I’ll assume you know all of that data (and more!) can be visualized in Grafana. What made my observability dream become reality, though, is how these systems can work together to help you effectively debug performance issues and operate your system with more confidence.