
Latest Posts

Kubernetes Design Patterns For Optimal Observability

Technology is a fast-moving commodity. Trends, thoughts, techniques, and tools evolve rapidly in the software technology space. This rapid change is particularly felt in the software that cloud-native engineers use to build, deploy, and operate their applications. One area that has evolved especially quickly over the past few years is observability.

Faster Debugging with Collaborative Troubleshooting Tools

As developers, we understand the critical role teamwork and collaboration play in solving complex problems. Often, it’s that second set of eyes that uncovers an additional issue or sheds light on the root cause of a stubborn bug. Effective collaboration becomes a critical factor in determining a team’s success or failure, especially when debugging or troubleshooting issues within complex applications.

Troubleshooting Slow Draining SQS Queues

This post is part of an ongoing series about troubleshooting common issues with microservice-based applications. Read the previous one on intermittent failure. Queues are an essential component of many applications, enabling asynchronous processing of tasks and messages. However, queues can become a bottleneck if they don’t drain fast enough, causing delays, increasing costs, and reducing the overall reliability of the system.

KubeCon + CloudNativeCon Europe 2023 Recap

KubeCon Amsterdam was an incredible gathering of like-minded professionals, bringing together DevOps practitioners, software engineers, vendors, and cloud technology enthusiasts from around the world. This year’s event was the biggest KubeCon + CloudNativeCon ever, with a sold-out crowd of 10,000 attendees. The sheer scale of the event was a testament to the growing popularity of cloud native technology and the vibrant community that supports it.

Debugging Containerized React Apps

If you are a frontend developer who works with React, you have probably run into trouble debugging a containerized React application. If so, you’re certainly not alone. Containerization has become an integral part of best practices for software development teams that want to create, test, and deploy applications quickly and efficiently. However, despite its advantages, it also comes with new challenges for debugging and troubleshooting applications.

The Magic Behind the Lumigo Kubernetes Operator

Kubernetes is the container orchestration platform of choice for many teams. In our ongoing efforts to bring the magic experience of Lumigo’s serverless capabilities to the world of containerized applications, we are delighted to share with you the Lumigo Kubernetes operator, a best-in-class operator to automatically trace your applications running on Kubernetes.

Migrating a Web App to AWS Lambda with Lambda Web Adapter

As developers, we all seek to build web applications that can scale seamlessly, adapt to changing needs, and do so without incurring excessive costs. One way to achieve this is by migrating web applications to AWS Lambda, which can provide scalability, flexibility, and cost savings. To make this process even easier, AWS provides the Lambda Web Adapter, a simple and efficient tool that enables you to migrate your web apps quickly.

Distributed Tracing for AWS CDK Applications

The AWS CDK lets users define reliable, scalable, and cost-effective applications in their cloud environments as Infrastructure as Code (IaC). With the AWS CDK, developers can use various supported programming languages to create constructs (reusable cloud components) and compose them together into stacks and applications.

Return large objects with AWS Lambda's new Streaming Response

Lambda has a size limit of 6MB on request and response payloads for synchronous invocations. This affects API functions and how much data you are able to send and receive from a Lambda-backed API endpoint. I have previously written about several workarounds for the request payload limit. But sometimes you also need to return a payload bigger than 6MB, such as a PDF or image file.

Troubleshooting Intermittent Failure in Amazon ECS apps

A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. The components interact in a decentralized manner and work together to achieve a common goal. Working with distributed systems is challenging, because failure often spreads between components and debugging across multiple components is difficult and time-consuming.