
Querying Arrow tables with DataFusion in Python

InfluxDB v3 allows users to write data at a rate of 4.3 million points per second. However, an incredibly fast ingest rate like this is meaningless without the ability to query that data. Apache DataFusion is an “extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.” It enables 5–25x faster query responses across a broad range of query types compared to previous versions of InfluxDB that didn’t use the Apache ecosystem.

Kubernetes Deep Dive: Key Features, Visibility and Optimization

Kubernetes, or K8s, is an open-source, production-grade container orchestration system for automating the deployment, scaling, and management of containerized applications. A container is a lightweight, standalone, ready-to-run software package that contains everything needed to run an application. It includes the runtime, code, libraries, system tools, and default values for any essential settings.

Flight, DataFusion, Arrow, and Parquet: Using the FDAP Architecture to build InfluxDB 3.0

This article coins the term “FDAP stack”, explains why we used it to build InfluxDB 3.0, and argues that it will enable and power a generation of analytics applications in the same way that the LAMP stack enabled and powered a generation of interactive websites (by the way we are hiring!).

TensorFlow, Postgres, PGVector & Next.js: building a movie recommender

Learn how to build a movie recommender with TensorFlow, Postgres, PGVector, Javascript & Next.js. This is a series of videos where we build a project together step by step.

Real-Time Analytics: Definition, Examples & Challenges

Businesses need to stay agile and make data-driven decisions in real time to outperform their competitors. Real-time analytics is emerging as a game-changer: 80% of companies show an increase in revenue due to real-time data analytics, because it lets them gain valuable insights on the fly. This blog post will explore the concept of real-time analytics, some examples, and the challenges faced when implementing it. Read on for a detailed explanation of this exciting area in data analytics.

Connect and Federate Searches Across Your Cloud Data Lakes with Cribl Search

The way we handle massive volumes of data from multiple sources is about to change fundamentally. Traditional data processing systems don’t always fit into our budgets (unless we have some pretty deep pockets). Our wallets constantly need to expand to keep up with the changing data veracity and volume, which isn’t always feasible. Yet we keep doing it because data is a commodity.

Everything you need to know about IT Operations Analytics

Data is both a challenge and an asset for IT professionals, who rely on IT Operations Analytics (ITOA) to guide them towards operational excellence, system reliability, and swift incident resolution. So whether you’re seeking clarity on what ITOA is and how it connects to related technologies, are contemplating how to use it within your organization, or are curious about its efficiency and cost-saving benefits, we’ve got you covered.

Aiven Workshop: Learn Apache Kafka with Python

What's in the Workshop Recipe? Apache Kafka is the de facto industry standard for data streaming: an open-source, scalable, highly available, and reliable solution for moving data across companies' departments, technologies, or microservices. In this workshop you'll learn the basic components of Apache Kafka and how to get started with data streaming using Python. With the help of some prebuilt Jupyter notebooks, we'll dive deep into how to produce and consume data and how to have concurrent applications read from the same source, empowering multiple use cases with the same streaming data.

Anomaly Detection for Time Series Data: An Introduction

Welcome to the handbook on Anomaly Detection for Time Series Data! This series of blog posts aims to provide an in-depth look into the fundamentals of anomaly detection and root cause analysis. It will also address the challenges posed by the time-series characteristics of the data and demystify technical jargon by breaking it down into easily understandable language. This blog post (Chapter 1) is focused on.

The Advantage of Cold Storage in InfluxDB

Imagine, if you will, having hundreds of devices that you need to monitor. All these devices generate data at sub-second intervals, and you need all that high-fidelity data for historical analysis and to feed machine learning models. Storing all that data can get really expensive, really fast. When that happens, you must decide what’s more important: keeping all your data or sacrificing insights and analysis. It may not be a big stretch of the imagination for many readers.