
Kafka Tiered Storage in Depth: How Reads and Deletes Flow (Prefetching, Caching)

In this article, we continue our series of deep dives into KIP-405. Now we turn our attention to the internals of the read and delete paths. Just as we did for the write and metadata paths, we focus here on Aiven’s battle-tested, Apache-licensed KIP-405 plugin. What makes the read path particularly interesting is how it delivers latency comparable to local disk or memory systems despite relying on external object storage. Let's dive in!

Top 10 Changes and Key Improvements in Apache Kafka 4.0.0

In this post, we summarize the major changes in the recently released Apache Kafka 4.0.0. We look at its most notable features compared to previous versions and explain what these changes mean in real production environments and what improvements they can bring to your streaming infrastructure.

Apache Kafka Tiered Storage in Depth: How Writes and Metadata Flow

The idea behind KIP-405 is simply to store most of the cluster’s data in another service. As we covered in detail in the last article, it’s a simple-sounding idea that goes a very long way. The service where the data gets stored is pluggable: KIP-405 was designed to make Kafka seamlessly extensible, so that it can store its data in any kind of external store through a solid interface.

A Guide to Fixing Kafka Consumer Lag [Without Jargon]

Have you ever looked at your monitoring dashboard and wondered, "Why is my Kafka consumer lag spiking again?" It’s a common frustration. Consumer lag isn’t just an inconvenience—it’s a sign that something’s wrong with your data pipeline. When lag builds up, you're facing delayed data processing and the risk of system failures.

The critical role of Kafka monitoring in managing big data streams

Apache Kafka is the backbone of modern data streaming architectures, enabling real-time data movement, stream processing, and event-driven applications at scale. It enables high-throughput messaging between data sources and analytics platforms, supports log aggregation, and facilitates scalable extract, transform, load (ETL) pipelines for continuous data transformation and storage.

Guide To Confluent Kafka vs Apache Kafka

Kafka is an open-source distributed streaming platform for high-throughput, fault-tolerant real-time data streaming in large-scale systems. It can integrate with a wide range of data sources and sinks, including databases, message queues, and big data processing frameworks like Apache Spark and Apache Flink.

Out-of-box OpenTelemetry-powered Kafka & Celery monitoring

Messaging queues power modern distributed systems, handling background tasks, event-driven architectures, and real-time data streaming. However, debugging issues in Kafka and Celery queues has traditionally been a black box, with limited correlation between message producers, consumers, and broker metrics. With OpenTelemetry-powered Kafka & Celery monitoring, SigNoz introduces the industry's first fully integrated observability solution for messaging queues powered by OpenTelemetry.

Reducing the Costs and Operational Overhead of Kafka Infrastructures

Kafka is powerful. No doubt about it. But it’s also a beast when it comes to operational complexity and cost. What starts as a simple deployment quickly turns into a resource-hungry system that eats up engineering hours, compute power, and budget. Let’s consider a company that eagerly rolls out Kafka to streamline event streaming. Year one? Smooth sailing. Everything runs fine, and the team feels great. Year two? The cracks start to show.

Multi-Version Connector Support for Apache Kafka Now Available

Connecting the data across your business and getting it where it needs to be can often be challenging, and it can place undue operational stress on your application, infrastructure, and platform teams. Apache Kafka, and in particular the Apache Kafka Connect framework, simplifies these pain points by allowing you to use Kafka to transport data from where it is produced to where it needs to be stored, analyzed, or transformed.