This is the second of a three-part blog series. Prior to reading this, be sure to check out Part 1. Benefiting from multi-cluster setups requires familiarity with common variations. In your Kubernetes journey, it's highly likely that you'll encounter the need to manage multiple clusters simultaneously.
DevAlert 2.0, which is now immediately available from Percepio, is a major upgrade to our edge observability platform. The upgrade provides much improved diagnostic capabilities, including core dumps for Arm Cortex-M devices. This allows remote inspection of crashes, errors, or security anomalies in full detail, including the function call stack, parameters, and variables, with source code display.
We are pleased to share a sneak peek of Query Assistant, our latest innovation that bridges the world of declarative querying with Generative AI. Leveraging our large language models (LLMs), Coralogix’s Query Assistant translates your natural language request for insights into data queries. This delivers deep visibility into all your data for everyone in your organization.
Observability is important to understand what’s happening in production. But carving out the time to add instrumentation to a codebase is daunting, and often treated as a separate task to writing features. This means that we end up instrumenting for observability long after a feature has shipped, usually when there’s a problem with it and we’ve lost all context. What if we instead treated observability similarly to how we treat tests?
We're thrilled to share an exciting update from Logit.io. As part of our ongoing commitment to providing cutting-edge observability solutions to our users, we've integrated OpenSearch 2.10.0 into our platform, bringing a host of advanced features to enhance your experience. Let's dive into what's new and how these changes can benefit your observability workflows.
Ever since we launched Query Assistant last June, we’ve learned a lot about working with—and improving—Large Language Models (LLMs) in production with Honeycomb. Today, we’re sharing those techniques so that you can use them to achieve better outputs from your own LLM applications. The techniques in this blog are a new Honeycomb use case. You can use them today. For free. With Honeycomb.
The single pane of glass is perhaps the most enduring and elusive goal of enterprise IT operations teams. When we polled our customers a couple of years ago, out of 184 respondents, 99% of them rated it as important to their business – with 64% indicating “extremely important”. The shared dream is to have: But unfortunately, the single pane of glass has become a bit of a myth.
In the ever-evolving world of IT, keeping an eye on application, service, and system performance and addressing issues in real time is crucial both to an organization’s customer experience and to its overall success. Two terms and approaches that have gained significant attention in recent years are AIOps and observability. While they both relate to improving IT monitoring and management, they serve distinct roles in enhancing operational efficiency.
Observability has brought a new approach to IT infrastructure management, easing the workload on IT admins across the world and bringing more accuracy and efficiency. One of the clear beneficiaries of this evolution in IT infrastructure management is incident response. Incident response is the systematic process of identifying, analyzing, and mitigating security threats, breaches, or operational issues to minimize their impact on the continuity of business operations.
Developers and SREs choose to host their applications on Google Cloud Platform (GCP) for its reliability, speed, and ease of use. On Google Cloud, development teams are finding additional value in migrating to Kubernetes on GKE, leveraging the latest serverless options like Cloud Run, and improving traditional, tiered applications with managed services. Elastic Observability offers 16 out-of-the-box integrations for Google Cloud services with more on the way.
A lot of reasoning in content is predicated on the audience being in a modern, psychologically safe, agile sort of environment. It’s aspirational, so folks who aren’t in those environments may feel like the path there includes doing “the new thing” or using “the new tool.” If you write software and your employer hasn’t caught up to all the newest, best ways to work, I hope this pragmatic post helps you sleep better at night.
KubeCon 2023 was more than just another conference to check off my list. It marked my first chance to work in the booth with my incredible Kentik colleagues. It let me dive deep into the code, community, and culture of Kubernetes. It was a moment when members of an underrepresented group met face-to-face and experienced an event that was previously not an option.
In the world of modern Kubernetes, things have come a long way from the days of a single cluster handling one app. Now, it's common to see setups that span multiple clusters across different clouds. Initially, managing those clusters was a complicated operation with many moving parts. Tools such as SUSE Rancher, Red Hat OpenShift, or AWS EKS made managing multiple clusters somewhat easier.
Observability isn’t just about watching for errors or monitoring for basic health signals. Instead, it goes deeper so you can understand the “why” behind the behaviors within your system. CI/CD observability plays a key part in that. It’s about gaining an in-depth view of the entire pipeline of your continuous integration and deployment systems — looking at every code check-in, every test, every build, and every deployment.
Choosing, deploying, maintaining, and rationalizing observability and monitoring tools can be a constant challenge for ITOps, DevOps, and SRE teams. As teams monitor increasingly complex systems, the need for instrumentation that monitors those systems grows at the same rate, leading directly to a growing problem of observability data engineering, integration, and enrichment.
“Isn’t observability just a fancy term for monitoring?” That’d be the response from most IT folks a few years back if you asked them about it. And here we are in 2023, where observability is now as imperative a term as security itself.
Software systems are increasingly complex. Applications can no longer simply be understood by examining their source code or relying on traditional monitoring methods. The interplay of distributed architectures, microservices, cloud-native environments, and massive data flows requires an increasingly critical approach: observability.
ObservabilityCON 2023 took place in London this week, showcasing all the latest and greatest trends in open source observability. Following the opening keynote, the event featured a range of breakout sessions — led by both Grafana Labs experts and members of the Grafana OSS community — that explored observability best practices and lessons learned.
Organizations are constantly looking to grow and expand, which requires establishing strong foundations, especially for the IT infrastructure. The challenge in achieving this is to consistently push the limits of the IT infrastructure to deliver more business excellence. To ensure success, management operations should be fine-tuned, and this often requires improving tool sets, skillsets, and personnel.
In today's digital age, the complexity and scope of dynamic system architectures are expanding at an unprecedented rate. As a result, IT teams find themselves grappling with the challenge of monitoring and addressing conditions across multi-cloud environments. With the increasing complexities, IT operations, DevOps, and SRE teams are searching for enhanced observability within these multifaceted computing environments.
At Grafana Labs, our mission has always been to help our users and customers understand the behavior of their applications and services. Over the past two years, the biggest needs we’ve heard from our customers have been to make it easier to understand their observability data, to extend observability into the application layer, and to get deeper, contextualized analytics.
The Grafana LGTM Stack (Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics) offers the freedom and flexibility for monitoring application performance. But we’ve also heard from many of our users and customers that you need a solution that makes it easier and faster to get started with application monitoring.
As someone who has been living the Honeycomb ops life for a while, I’ve found SLOs to be the bread and butter of our most critical and useful alerting. However, they had severe, long-standing limitations. In this post, I will describe these limitations, and how our brand new feature, budget rate alerts, addresses them. We usually don’t have SREs writing product announcements, but I’m so excited about this one that I said, “Screw it, I’m doing it!”
With this guide, empower your SRE team to achieve enhanced alert remediation and incident management.
As a Site Reliability Engineer (SRE) or DevOps professional, you are well aware of the importance of observability in ensuring the smooth functioning and performance of your applications. Observing and monitoring your applications can help you identify and resolve issues in real-time, resulting in increased reliability and improved user experience. Logs play a crucial role in this process as they provide detailed information about the activity and behavior of your applications.
Fintech, an abbreviation for financial technology, encompasses many firms and technologies that employ innovation and tech to enhance and automate financial services and operations. Their goal is to enhance the efficiency, accessibility, and user-friendliness of financial services. Fintech entities span numerous sectors within the financial industry, such as online payments, lending, digital banking, investing, insurance, and more, all aimed at streamlining financial processes.
Stop me if you’ve heard this one before: you just pushed and deployed your latest change to production, and it’s rolling out to your Kubernetes cluster. You sip your coffee as you wrap up some documentation when a ping in the ops channel catches your eye—a sales engineer is complaining that the demo environment is slow. Probably nothing to worry about, not like your changes had anything to do with that… but, minutes later, more alerts start to fire off.
In telemetry jargon, a pipeline is a directed acyclic graph (DAG) of nodes that carry emitted signals from an application to a backend. In an OpenTelemetry Collector, a pipeline collects signals through a set of receivers, runs them through processors, and then emits them through configured exporters. This blog post hopes to simplify both types of pipelines by using an OpenTelemetry extension called the Headers Setter.
Organizations saw a 243% ROI and $1.2 million in savings over three years. In today’s complex and distributed IT environments, traditional monitoring falls short. Legacy tools often provide limited visibility across an organization’s tech stack and often at a high cost, resulting in selective monitoring. Many companies are therefore realizing the need for true, affordable end-to-end observability, which eliminates blind spots and improves visibility across their ecosystem.
Elastic Observability customers saw 243% ROI and $1.2 million in savings over 3 years. For government and education organizations around the world, facilitating an efficient, reliable customer experience is essential when providing critical services and building trust with stakeholders. As technology infrastructure expands and the IT landscape becomes a complex mix of private cloud, public cloud, and air-gapped environments, the ability to see across all systems and data is challenging yet critical.
In the ever-evolving and fast-changing landscape of cloud computing and modern software development, achieving 360-degree visibility into your critical business services, applications and infrastructure is essential. This is where observability comes into play. Observability, especially in a cloud-based or cloud-native environment, has become a critical aspect of maintaining and optimizing complex systems and services.
In our continuous journey to support teams grappling with the complexities of Kubernetes environments, we’re thrilled to announce the launch of Honeycomb for Kubernetes, a dedicated solution designed to bridge the growing divide between infrastructure/platform teams and application developers. This is available to all plans (including Free!) at no additional cost.
Application Performance Monitoring (APM), tracing, and observability are fundamental software development and system management approaches. Each of these three concepts helps, in its own way, to ensure that your applications operate efficiently, smoothly, and reliably. Your organisation has more than likely already adopted one of these approaches, or even two, potentially all three.