StackState

The Power of Data Correlation: Troubleshooting Made Easy

Oct 13, 2023 By Mark Bakker In StackState

As software engineers, we all know that troubleshooting often involves sifting through heaps of data points — scanning metrics, reading logs, checking resource status and analyzing events. We manually connect the dots, and if we're experienced enough, we might spot an issue that's about to become a problem. At StackState, we've faced these same challenges.

Configuration Drift: Understanding, Avoiding, Managing and Resolving in Kubernetes

Oct 4, 2023 By Jeroen van Erp In StackState

If you work with Kubernetes, you know that any number of issues can pose a serious threat to the stability and security of your deployments. One that's subtly damaging is configuration drift, which occurs when the actual state of your system's configuration strays from the state you defined. Configuration drift in Kubernetes can happen when people make changes manually, systems aren't synchronized properly or monitoring falls short.
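
As a rough illustration of the idea, drift can be thought of as the set of fields where the actual configuration no longer matches the defined one. The sketch below is not taken from the post: the function name `find_drift` and the sample `desired` and `actual` dictionaries are hypothetical, and a real cluster comparison would need to account for defaulted and server-managed fields.

```python
from typing import Any


def find_drift(desired: dict[str, Any], actual: dict[str, Any], prefix: str = "") -> list[str]:
    """Return dotted paths of fields whose actual value differs from the desired one.

    Simplification: a missing key and a key explicitly set to None are treated alike.
    """
    drifted: list[str] = []
    for key in sorted(desired.keys() | actual.keys()):
        path = f"{prefix}.{key}" if prefix else key
        want, have = desired.get(key), actual.get(key)
        if isinstance(want, dict) and isinstance(have, dict):
            # Recurse into nested sections such as resource settings.
            drifted.extend(find_drift(want, have, path))
        elif want != have:
            drifted.append(path)
    return drifted


# Hypothetical example: someone scaled the workload by hand.
desired = {"replicas": 3, "image": "web:1.4.2", "resources": {"cpu": "500m"}}
actual = {"replicas": 5, "image": "web:1.4.2", "resources": {"cpu": "500m"}}

print(find_drift(desired, actual))
```

Here the only reported path is `replicas`, which corresponds to the manual-change case mentioned above.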

Application Dependency Maps: The Secret Weapon for Troubleshooting Kubernetes

Sep 20, 2023 By Andreas Prins In StackState

Picture this: You're knee-deep in the intricacies of a complex Kubernetes deployment, dealing with a web of services and resources that seem like a tangled ball of string. Visualization feels like an impossible dream, and understanding the interactions between resources? Well, that's another story. Meanwhile, your inbox is overflowing with alert emails, your Slack is buzzing with queries from the business side, and all you really want to do is figure out where the glitch is. Stressful? You bet!

Unlocking IT: Considerations for a Powerful Observability Strategy

Sep 13, 2023 By Andreas Prins In StackState

In today's cloud-native landscapes, observability is more than a buzzword; it's a critical element for software development teams looking to master the complexities of modern environments like Kubernetes. Observability is multi-faceted, with levels and dimensions that range from basic metrics to comprehensive business insights. It’s complex and can continue indefinitely…if you let it.

Platform Engineers: Applied Best Practices Are Baked-in to Kubernetes Monitoring

Aug 29, 2023 By Mark Bakker In StackState

Operating Kubernetes reliably and efficiently involves adhering to a set of best practices. These practices help ensure the stability, scalability and maintainability of your Kubernetes clusters and their applications. It's crucial for platform teams (responsible for the infrastructure) and software development teams (responsible for deploying applications) to work together in applying these practices.

A Practical Developer's Guide on How to Troubleshoot HTTP 5XX errors

Aug 24, 2023 By Bram Schuur In StackState

Imagine the following situation: You are on call, and your monitoring dashboard has flickering red lights due to an increased number of 5xx HTTP responses from one or more of your Kubernetes services. Now it is time to start troubleshooting those 5xx errors. Instead of panicking, you can use this blog as a guide.

Troubleshooting and Fixing Kubernetes CrashLoopBackOff

Aug 16, 2023 By Mark Bakker In StackState

In this post, we'll dive into what CrashLoopBackOff actually is and explore the quickest way to fix it. Fasten your seat belts and get ready to ride. Everyone working with Kubernetes will sooner or later see the infamous CrashLoopBackOff in their clusters. No matter how basic or advanced your deployments are, and whether you have a tiny dev cluster or an enterprise multi-cloud cluster, it will happen.

Restarting Kubernetes Pods: A Detailed Guide

Aug 10, 2023 By Mark Bakker In StackState

This blog will help you learn all about restarting Kubernetes pods and give you some tips on troubleshooting issues you may encounter. Kubernetes pods are one of the most commonly used Kubernetes resources. Since all of your applications running on your cluster live in a pod, the sooner you learn all about pods, the better.

From Battlefield to Business: Applying the OODA Loop

Jul 31, 2023 By Andreas Prins In StackState

In today's dynamic world of software development and system operations, making informed decisions and developing effective strategies rely heavily on data. The OODA loop, developed by military strategist John Boyd, consists of a recurring cycle: Observe, Orient, Decide and Act. This is then followed by a Feedback stage (not represented in the OODA acronym for some reason) before the cycle repeats itself, allowing for continuous optimization.

Maximizing System Reliability: The Case for Dedicated Troubleshooting Tools

Jul 26, 2023 By Andreas Prins In StackState

As a leader in IT, the question of whether or not it makes sense to adopt a dedicated software troubleshooting solution probably comes up from time to time. If it has happened in your organization, no worries: you're not alone. Many teams wonder if their current tools, such as an Application Performance Monitoring (APM) solution or a suite of open-source solutions, are sufficient.
