This is a guest post by Noah Zoschke, Engineering Manager at Segment. Segment is the customer data infrastructure that makes it easy for companies to clean, collect, and control their first-party customer data. At Segment, our ultimate goal is to collect data from Sources (e.g., a website or mobile app) and route it to one or more Destinations (e.g., Google Analytics and AWS Redshift) as quickly and reliably as possible.
Apache Hive is an open source interface that allows users to query and analyze distributed datasets using SQL-like queries (HiveQL). Hive compiles each query into an execution plan, which it then runs against your Hadoop deployment. You can customize Hive by using a number of pluggable components (e.g., HDFS and HBase for storage, Spark and MapReduce for execution). With our new integration, you can monitor Hive metrics and logs in context with the rest of your big data infrastructure.
Dashboards provide critical visibility into the performance and health of your environment. But if your organization uses hundreds or thousands of dashboards, or if you’ve recently transitioned to a new company or different team, it’s not always easy to understand the full significance of the data shown on every single dashboard.
Ansible is an automation tool for provisioning, managing, and deploying infrastructure and applications. When you build large-scale applications, Ansible lets you manage and configure your infrastructure across platforms like AWS. Whether you rely on temporary or dedicated hosts, you can use Ansible to create a repeatable process for configuring them with the Datadog Agent.