Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Save money on Serverless: common costly mistakes and how to avoid them

When used properly, serverless technologies like AWS Lambda can lower the cost of running a system. You pay for these services only while you’re using them, so you don’t pay for idle capacity. Serverless technologies have other benefits too: they can provide better security, built-in redundancy, and scalability. The biggest plus is that they let you do more with less time and effort, so you can focus on the things that directly add value to your business.
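To make the pay-per-use point concrete, here is a rough sketch of how a usage-based bill might be estimated. The function name and the default per-request and per-GB-second rates are illustrative assumptions, not figures from the post, and the sketch ignores any free tier. Check the current AWS Lambda pricing page for real numbers.

```python
def estimate_monthly_cost(
    invocations: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_million_requests: float = 0.20,   # assumed rate in USD
    price_per_gb_second: float = 0.0000166667,  # assumed rate in USD
) -> float:
    """Estimate a pay-per-use bill: request charge plus compute charge."""
    # Request charge scales with the number of invocations.
    request_cost = invocations / 1_000_000 * price_per_million_requests
    # Compute charge scales with duration (seconds) times memory (GB).
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


if __name__ == "__main__":
    # No traffic means no charge, unlike an always-on server.
    print(estimate_monthly_cost(0, 120, 512))
    # Three million invocations of 120 ms each at 512 MB.
    print(round(estimate_monthly_cost(3_000_000, 120, 512), 2))
```

Because both terms scale with the invocation count, the estimate drops to zero when nothing is running, which is the cost property described above.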

Kubernetes Troubleshooting with Operators and Auto-Tracing

Kubernetes has revolutionized the way we manage and deploy applications, but as with any system, troubleshooting can be a daunting task. Even with the multitude of features and services Kubernetes provides, when something goes awry, tracking down the cause can feel like finding a needle in a haystack. This is where Kubernetes Operators and Auto-Tracing come into play, aiming to simplify the troubleshooting process.

Kubernetes Community Day Munich Recap: A Meeting of Tech Minds and Ideas

This July, the community spirit was profoundly vibrant in the scenic city of Munich, as Kubernetes Community Day (KCD) Munich brought together a meeting of minds and inspired the open-source collaboration we all know and love. The event was a testament to the strength and vitality of the Kubernetes community, which pulsed with an energy of shared intellectual curiosity and passion for all things Kubernetes.

Introducing Trace Endpoint Mapping

At Lumigo, we see ourselves as your reliable ally in the noble mission of detecting and vanquishing troublesome issues that lurk within your serverless and container applications. Our secret sauce? Equipping you with a wealth of detailed trace data, ensuring you’re always well-lit and ready for battle when the nefarious ‘bugs’ make their unsolicited appearances.

Troubleshooting Bad Health Checks on Amazon ECS

Health checks are an important factor when working with containerized applications in the cloud and are the source of truth for many applications in terms of their running status. In the context of Amazon Elastic Container Service (ECS), a health check is a periodic probe that assesses whether a container is functioning. In this blog post, we will explore how Lumigo, a troubleshooting platform built for microservices, can help provide insights into container crashes and failed health checks.

Building, deploying and observing SDKs as a Service - Part 2

In the first part of our series on Building, Deploying, and Observing SDKs as a Service, we delved into the world of APIs and successfully deployed our own REST APIs by wrapping the existing pet store APIs. Now, it’s time to take our journey further and unlock the true potential of SDKs. In this second part, we’ll explore how to build an SDK for the pet store API using the OpenAPI spec and the OpenAPI Generator project.

The Curious Case Of Kubernetes Health Checks

Health checks for cloud infrastructure are the mechanisms and processes used to monitor the health and availability of the components within a cloud-based system. These checks are essential for ensuring that the infrastructure is functioning correctly and that any issues or failures are detected and addressed promptly. Health checks typically involve monitoring various parameters such as system resources, network connectivity, and application-specific metrics.
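As a minimal sketch of the idea, the snippet below exposes a health endpoint that an orchestrator or load balancer could poll. The `/healthz` path, port 8080, the free-disk threshold, and the helper names are assumptions chosen for illustration; they are not taken from the post or from any particular platform's requirements.

```python
import json
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_health(min_free_disk_bytes: int = 100 * 1024 * 1024) -> dict:
    """Run a simple system-resource check and report an overall status."""
    free = shutil.disk_usage("/").free
    checks = {"disk": free >= min_free_disk_bytes}
    status = "ok" if all(checks.values()) else "fail"
    return {"status": status, "checks": checks}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the (assumed) health path is served; anything else is a 404.
        if self.path != "/healthz":
            self.send_error(404)
            return
        report = check_health()
        body = json.dumps(report).encode()
        # 200 signals healthy; 503 signals that the probe should count a failure.
        self.send_response(200 if report["status"] == "ok" else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

A real service would likely add application-specific checks (for example, a database ping) to the `checks` dictionary; the disk check here only stands in for the "system resources" category mentioned above.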

The Art of Using Execution tags to Troubleshoot ECS

In the grand tapestry of software engineering, our journey often winds through labyrinthine layers of application logic. Here, bugs play a compelling game of hide-and-seek, and features dance in an unpredictable ballet. During these instances of fervent exploration, we find ourselves longing for a reliable compass—a secret weapon—to help us decipher the riddles that lie ahead. Cue execution tags, our luminous lighthouse cutting through the dense fog of complexity.

Leveraging OpenTelemetry to Fix Flaky Integration Tests

At Lumigo, we depend heavily on a set of tests to deploy code changes fast. For every pull request opened, we bootstrap our whole application backend and run a set of async parallel checks mimicking users’ use cases. We call them integration tests. These integration tests are how we ensure: Recently, we replaced the old “traditional log traversing” in our integration tests with *amazing* OpenTelemetry trace graphs.