Analytics

Comparing Apache Hive vs. Spark

Hive and Spark are two popular and successful products for processing large-scale data sets; in other words, both do big data analytics. This article describes the history and features of both products, and a comparison of their capabilities illustrates the complex data processing problems each can address.

Building a Real Time Metrics Database at Datadog

Over its eight years of existence, Datadog has grown its real-time metrics systems, which collect, process, and visualize data, to the point where they now handle trillions of points per day. This growth has been based on an architecture combining open source technologies, such as Apache Cassandra, Kafka, and PostgreSQL, with a lot of in-house software for in-memory data storage and querying.

Announcing Graylog 3.1 RC 1

Today we are releasing the first Release Candidate of Graylog v3.1. This release brings a whole new alerting and event system that provides more flexible alert conditions and event correlation based on the new search APIs that also power the views. In addition, some extended search capabilities introduced in Graylog Enterprise v3.0 are now available in the open source edition in preparation for unifying the various search features.

Leading Chief Data Scientists Weigh in on Building Time Series Anomaly Detection

In our recent webinar on what it takes to build time series anomaly detection, industry experts Arun Kejariwal, Ira Cohen and Ben Lorica shared valuable advice on how to successfully implement and execute anomaly detection systems in today’s increasingly complex corporate world.

Announcing Single Sign-On (SSO) Support for CHAOSSEARCH

We are thrilled to announce that we now offer Single Sign-On (SSO) support for ALL customers on the CHAOSSEARCH platform. You can now integrate your existing identity provider with CHAOSSEARCH and have your users access the platform without needing to manage a separate set of credentials.

Seeing is Believing: Announcing the DevOps Pulse 2019 with a Focus on Observability

In the world of Software Engineering, observability seems to be the talk of the town. We discuss it at conferences, read about it in blogs or articles, and see it promised to us by vendor after vendor. But what is observability? What issues have recently evolved to make it such an integral concept? What strategies are engineers employing to ensure observability? And most importantly of all, why are engineers looking to achieve it?

Running Spark with Jupyter Notebook & HDFS on Kubernetes

Kublr and Kubernetes can help make your favorite data science tools easier to deploy and manage. Hadoop Distributed File System (HDFS) carries the burden of storing big data, Spark provides many powerful tools to process that data, and Jupyter Notebook is the de facto standard UI for dynamically managing queries and visualizing results.