
Latest Posts

Legacy alerting removal: What you need to know about upgrading to Grafana Alerting

Two years ago, when we launched Grafana 9, we announced the deprecation of legacy alerting and introduced Grafana Alerting, the new default alerting system in all editions of Grafana. Since then we have invested in Grafana Alerting, making it easier to create and manage your alerts. Along the way we have also worked to make the transition from legacy alerting to Grafana Alerting as seamless as possible in preparation for the time when we remove legacy alerting altogether from Grafana.

How the Prometheus community is investing in OpenTelemetry

Goutham Veeramachaneni, a product manager at Grafana Labs, and Carrie Edwards, a senior software engineer at Grafana Labs, are both contributors to the Prometheus open source project. This post, which they wrote together, was originally published on the Prometheus.io blog in March 2024. The OpenTelemetry project is an observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs.

Simplified routing in Grafana Alerting: Easy, secure, and powerful

With great power comes great… complexity? When we introduced Grafana Alerting a few years ago, it included a powerful routing feature that teams could use to send alerts to various contact points. Unfortunately, this functionality also came with a fair bit of complexity and an unfamiliar UX. This prevented many users from adopting it, but we’re still big believers in how it can help users.

A better Grafana OnCall: Seamless workflows with the rest of Grafana Cloud

Incident response and management (IRM) doesn’t happen in a vacuum. Your ability to respond to issues in a timely manner depends greatly on how well your on-call engineers can use their IRM tooling and observability tools together to understand what changed and why.

Call me, maybe: designing an incident response process

Hey, I just deployed — and this is crazy. But the server’s down, so call me, maybe? Making your services available at all times is the gold standard of modern software operations. The easiest way to reach this would be to just write bug-free software, but even if you reach this completely unattainable goal — stuff happens! Modern software rarely exists in a vacuum and often depends on a multitude of external services and libraries.

How to automate image analysis with the ChatGPT vision API and Grafana Cloud Metrics

OpenAI’s ChatGPT has an extraordinary ability to process natural language, reason about a user’s prompts, and generate human-like conversation in response. However, as the saying goes, “a picture is worth a thousand words” — and perhaps an even more significant achievement is ChatGPT’s ability to understand and answer questions about images.

CI/CD observability: Extracting DORA metrics from a CD pipeline

Last November, Dimitris and Giordano Ricci wrote a blog post about CI/CD observability that looked into ways to extract traces and metrics in order to get a better understanding of possible issues inside a CI/CD system. That post focused on getting data from a continuous integration (CI) system, and it really resonated with the community.

How to surface trends and make sense of your data with Grafana

There is a Polish proverb: “Co za dużo to niezdrowo,” which more or less translates to “Enough is as good as a feast.” (Or, translated verbatim: “Too much of something can be unhealthy.”) Sometimes this is true for data as well. At Grafana Labs, we’re always introducing products and features that help you make sense of that abundance of data, whether through efficient visualizations, adaptive observability, or apps dedicated to specific workflows and use cases.

How to validate Sigma rules with GitHub Actions for improved security monitoring

Monitoring your identity provider’s logs is critical for identifying potential security threats. These logs are vital for a security team, which may store them in a specialized tool like Grafana Loki for enhanced accessibility and analysis. The ability to pinpoint specific patterns within these logs is key, and by crafting these patterns into Loki queries, you can conduct focused searches across logs.

How shipping/third-party logistics companies reduce MTTR and increase uptime with the Grafana LGTM Stack

These days, everything can be tracked: transportation, deliveries, food orders... For consumers, knowing the location of a package or courier is a bonus, but for companies in the business of shipping, delivering, and third-party logistics, it’s a necessity. And so is having the right observability system to ensure everything gets where it needs to go. After all, errors, downtime, or anything that causes delays will end up delivering unhappy customers and lost revenue.