The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

The mistake boot in engineering: Bill Kennedy - The Reliability Podcast

The Reliability podcast speaks with engineers who have worked on large, complex systems and draws out what they have learned. Which best practices should one adopt? Which lessons are non-negotiable for getting better at a craft? What will ‘engineering’ look like with the advent of AI? We answer these questions and more by tracing the personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Fostering a fearless engineering culture: Bill Kennedy - The Reliability Podcast

The Reliability podcast speaks with engineers who have worked on large, complex systems and draws out what they have learned. Which best practices should one adopt? Which lessons are non-negotiable for getting better at a craft? What will ‘engineering’ look like with the advent of AI? We answer these questions and more by tracing the personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Best One To One Plus Alternatives for K-12 in 2023

One to One Plus is a cloud-based, all-in-one software solution tailored specifically to K-12 institutions. It provides an integrated suite covering IT asset management (ITAM), help desk, and inventory management. As the company describes on its website, the product is “designed for K-12, built by K-12,” and it appears to have been started by tech directors working in K-12.

Essential Metrics for Kafka Performance Monitoring

Apache Kafka is an open-source distributed streaming system that has grown in popularity and usage across the technology industry. Originating from LinkedIn and now part of the Apache Software Foundation, Kafka provides a robust and scalable platform. It’s uniquely designed with an architecture that includes both a storage layer and a compute layer.

Configuring Python StatsD Client

Building and deploying highly scalable, distributed applications in the ever-changing landscape of software development is only half the journey. The other half is monitoring your application's state and instances while recording accurate metrics. There are moments when you want to check how many resources are being consumed, how many files a particular process has open, and so on. These metrics provide valuable insight into how your tech stack is running and how it is being managed.
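To make the idea of recording such metrics concrete, here is a minimal sketch of a StatsD-style client written with only the Python standard library. It is not the API of any particular StatsD package: the class `TinyStatsClient`, the helper `format_metric`, and the metric names are hypothetical, chosen for illustration. It assumes a StatsD daemon listening for UDP datagrams on the conventional port 8125 and uses the plain-text `name:value|type` line format.

```python
import socket


def format_metric(name, value, metric_type, prefix=""):
    """Build one StatsD line of the form <name>:<value>|<type>."""
    full_name = f"{prefix}.{name}" if prefix else name
    return f"{full_name}:{value}|{metric_type}"


class TinyStatsClient:
    """Hypothetical, illustrative client; sends fire-and-forget UDP datagrams."""

    def __init__(self, host="127.0.0.1", port=8125, prefix=""):
        # host, port, and prefix are the usual configuration knobs;
        # 8125 is the conventional StatsD port.
        self._addr = (host, port)
        self._prefix = prefix
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _send(self, name, value, metric_type):
        line = format_metric(name, value, metric_type, self._prefix)
        try:
            self._sock.sendto(line.encode("ascii"), self._addr)
        except OSError:
            # Metrics are best effort: never let them crash the application.
            pass
        return line

    def incr(self, name, count=1):
        """Counter: add `count` to the named counter."""
        return self._send(name, count, "c")

    def gauge(self, name, value):
        """Gauge: record the current value, e.g. number of open files."""
        return self._send(name, value, "g")

    def timing(self, name, milliseconds):
        """Timer: record a duration in milliseconds."""
        return self._send(name, milliseconds, "ms")


if __name__ == "__main__":
    stats = TinyStatsClient(prefix="myapp")  # "myapp" is a made-up prefix
    print(stats.incr("requests"))
    print(stats.gauge("open_files", 12))
    print(stats.timing("db.query", 35))
```

Because UDP is connectionless, the sends succeed whether or not a daemon is listening, which keeps instrumentation from slowing down or breaking the application it measures. A production setup would more likely use an established client library than a hand-rolled one like this.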

What are Prometheus Functions?

Prometheus is a platform for real-time systems and event monitoring and alerting. The Prometheus project is free, open-source, and available on GitHub. Originally developed at SoundCloud, Prometheus became a project of the Cloud Native Computing Foundation in 2016, alongside other popular frameworks such as Kubernetes. The core of the project is the Prometheus server, which acts as the system’s “brain” by collecting various metrics and storing them in a time-series database.

The Best Cloud Infrastructure Automation Tools

The past decade has seen rapid growth in the adoption of public cloud. Two of the primary reasons are its cheaper infrastructure and ease of scaling. With such rapid adoption comes the need for infrastructure automation: teams want to provision infrastructure quickly and automate tasks so that work which took weeks in a traditional data center takes minutes in the public cloud.