August 2023

8 Advanced Tech Solutions for Reducing Business Downtime

In today's fast-paced business landscape, downtime can be a devastating blow to any organization. Every minute of unproductive time translates to potential revenue loss, damage to customer trust, and disruptions in operations. As technology evolves, so do the solutions for mitigating downtime and ensuring seamless business continuity. This article will explore eight advanced tech solutions that can significantly reduce business downtime and keep your operations running smoothly.

Get familiar with "Rusty" kernel programming in Ubuntu Lunar Lobster

The Linux kernel has recently introduced the Rust programming language as an alternative to C for creating kernel modules. Rust is a strongly, statically typed programming language with a focus on memory safety features which produces extremely compact executable code. These properties, paired with its good tooling, make Rust a natural choice for creating many types of kernel modules, including device drivers, network protocols and filesystems.

Democratize Automation with AI-Generated Runbooks

Operational efficiency is as critical within the IT and engineering teams as any other part of the business. Automating repetitive tasks and reducing escalations within and to these teams is of immense value. While automation saves time and boosts productivity, the complexity of developing automation can be a limiting factor and bottleneck. Generative AI is a paradigm shift here, in that it brings consumer-style simplicity to assisting in the development of enterprise-grade automation.

10 Critical Server Performance Metrics You Should Consider

More and more developers are concerned with the end-to-end delivery of online apps as the DevOps movement gains attention. This covers the application's launch, functionality, and upkeep. Understanding the role of the server becomes increasingly important as an application's user base grows in production. To assess the health of your applications, you must collect performance data from the servers hosting your web apps.

Automating Kubernetes Deployments with GitHub Actions

Kubernetes orchestrates the management of containerized applications, with an emphasis on declarative configuration. A DevOps engineer creates deployment files specifying how to spin up a Kubernetes cluster, which establishes a blueprint for how containers should handle the application workloads.

Better Together: Creating an AI-friendly Culture to Optimize Business Outcomes

Are business outcomes, with the potential to make or break an organization’s future, becoming more important than they’ve ever been before? It sure seems that way. Embarking on the journey of business growth despite a treacherous path of challenges, all departments are champing at the bit for stability, solid strategies, and a reliable plan for what’s ahead.

The Power of VMware CLI

Command Line Interface (CLI) for VMware is not just a feature but a cornerstone for effective virtualization management. Think of it as the hidden trapdoor that takes you straight to the control room of a spaceship. It’s less fancy than the graphical user interface (GUI), but it gets you direct access to the nuts and bolts of your VMware environment.

LogicMonitor Envision Dexda Demo

Watch this demo video to learn about our latest offering in AIOps, Dexda. Dexda ingests events from LogicMonitor Envision and seamlessly transforms them into episodes. Advanced machine learning techniques automatically identify features in the alert data to correlate the disparate alerts into connected insights based on time, resources involved, environment, and other significant features of the enriched alert data.

More Reliability, Less Firefighting: How to Build a Proactive Reliability Program

Does it feel like your team spends all its time putting out incident fires? Change the story with a proactive reliability program that actively improves reliability. Join reliability expert and engineering leader Jeff Nickoloff for a webinar that lays out the common traits for successful reliability programs so you can build more reliability and spend less time firefighting. You’ll also get a downloadable checklist worksheet to help you create and evaluate your reliability program.

The Double-Edged Sword of Modern Software Delivery

Kubernetes offers undeniable benefits—scalability, portability, reliability—and enterprises everywhere are jumping on the bandwagon to adopt it. However, as incredible as Kubernetes is, its adopters are learning a difficult lesson: Without taking the steps to standardize Kubernetes adoption across the organization, costs and risk can skyrocket.

Next-Gen Defense: Unleashing the Power of Kubernetes

The U.S. Department of Defense’s Software Modernization Strategy calls for gaining a competitive advantage to achieve strategic and tactical superiority. Leveraging artificial intelligence (AI) and implementing zero trust security are critical parts of the movement to modernize the U.S. military. To this end, U.S. Deputy Secretary of Defense Kathleen H. Hicks issued a memorandum in February 2022 establishing the formation of the DoD Chief Digital and Artificial Intelligence Officer (CDAO).

Introducing Cortex Plugins and Customization

To be a true system of record, your IDP needs to be a source of truth for all the data in your stack. While Cortex offers 50+ out of the box integrations, and the ability to bring in custom data, we know there are occasions where you’ll also want to visualize or take action on data sourced from other places including internally developed tools or repositories. That’s why we’re excited to officially launch the Cortex plugin framework plus UI customization. Now users can.

Simplifying Microservices Debugging on Kubernetes with Istio, OTel, and Apica

Microservices architecture has become increasingly popular in modern software development due to its scalability, resilience, and flexibility. However, with the benefits of microservices come the challenges of debugging and monitoring these distributed systems. Using the Istio service mesh, OpenTelemetry distributed tracing, and Apica’s Kubernetes-native observability platform, developers can easily collect and visualize performance data in real-time to identify and fix issues quickly.

Why the Blameless Mission Matters Today

Blameless was founded over 5 years ago, in a world that looked very different than the world today. We were the first mover in the incident management space, setting the standards for what these tools should achieve. These days, concerns about reliability, incidents, and toil have hit the mainstream. Why have we seen the tech world enter an era where reliability is priority #1? Why do we believe that the Blameless mission matters more today than ever before?

LLMs explained: how to build your own private ChatGPT

Large language models (LLMs) are the topic of the year. They are as complex as they are exciting, and everyone can agree they put artificial intelligence in the spotlight. Once LLMs were released to the public, the hype around them grew and so did their potential use cases – LLM-based chatbots being one of them. While large language models have been available for some time, there are still a lot of challenges when it comes to building your own project.

Why Resilience Engineering Needs To Be A C-Level Strategy & How To Get There

The consequences of downtime and data breaches can be devastating to organizations, leading to substantial financial losses and irreparable damage to a business’s reputation. If last week's outage at the Bank of England is anything to go by, after trillions of pounds per day were lost to downtime, resilience shouldn’t just be an afterthought for organizations.

Latest Developments in Site Reliability Engineering, 2023

Gartner recently published its Hype Cycle for Site Reliability Engineering, 2023 report (July 2023). Inspired by this report, OnPage shares its prediction about the future of site reliability engineering. In this blog, OnPage reviews evolving tools that can improve site reliability engineering practices.

Feature-Based Pricing: A Guide To Per Feature Pricing in SaaS

The best pricing strategy for your SaaS business will depend on your specific business model, target market, and competition. You’ll also want to test different pricing strategies to see which one works best for you. That said, feature-based pricing can be a very profitable way to price SaaS products. Here’s how it works. We’ll also include how to determine your cloud costs per feature so you can set profitable prices.

Why you need an artifact management platform for best-in-class software delivery

Discover the pivotal role of artifact management platforms in software delivery. Learn how Cloudsmith streamlines storage, boosts security, scales effortlessly, and more. Elevate your software delivery with indispensable tools and insights. The tools and strategies you employ in software delivery can make all the difference when distributing and managing software. As the intricacies of software projects amplify, the call for streamlined, secure, and adaptable solutions becomes undeniable.

Cycle's New Interface, Part 1

After a span of 5 long years, we've bid farewell to Cycle's old portal. Our engineering team has been working tirelessly over the last 10 months to bring a fresh, new interface to the platform for our users. This new design encapsulates the wealth of insights we've gained during this period. Just last week, we took the decisive step of launching it into production, and the initial feedback has been overwhelmingly positive.

The Importance of Using Supported Integration Packs for Your Orchestrator Deployment

When looking at any automation task in System Center Orchestrator, it will inevitably need to connect to other enterprise management systems to automate the process. This means using an Integration Pack or writing your own piece of PowerShell script to build an integration yourself.

LogicMonitor Envision Platform UIv4 Overview Demo

Take your user experience to the next level and get the most out of the LM Envision platform with UIv4! LM Envision's New UI provides the fewest clicks to get users where they are trying to go, intuitive next steps, pre-set defaults, consistency of bulk actions, better search and filtering, all coupled together with modern React components that make for fast, reliable, consistent execution of common tasks.

How Detected Risks helps you find reliability risks in minutes, without running any tests

This video showcases Gremlin's Detected Risks feature. Detected risks are high-priority reliability concerns that Gremlin automatically identifies in an environment. These include misconfigurations, bad default values, and reliability anti-patterns. Gremlin prioritizes these risks based on severity and impact, giving instantaneous feedback on risks and action items to improve the reliability and stability of each service.

Four Pillars of a Best-in-Class Reliability Program

Reliability impacts every organization, whether you plan for it or not. Leading companies take matters into their own hands and get ahead of incidents by building reliability programs. But since many of these programs are still nascent, how do you know what good looks like? Of course, the right tools and technology that can enable your team to uncover reliability risks before they impact users play an important role. But improving reliability goes beyond technology.

Enhancing Your Heroku Postgres Performance: A Guide to Effective Monitoring with Hosted Graphite

When it comes to managing your database, monitoring is crucial for maintaining data integrity, optimizing performance, and ensuring efficient resource allocation. In today's fast-paced technological landscape, having real-time insights into your database's health is more important than ever. This is where Heroku Postgres and Hosted Graphite come into the picture.

Hosted Graphite and Printers: Boosting Efficiency and Performance

Printers play a crucial role in various industries, helping businesses efficiently manage their document workflows. However, ensuring optimal printer performance and minimizing downtime can be a challenge. This is where Hosted Graphite comes into the picture. Hosted Graphite is a powerful monitoring tool that allows businesses to graph metrics and gain valuable insights into their printer systems.

Ditch Nagios Errors for A Streamlined Alternative: "Return code of x is out of bounds"

In the realm of IT infrastructure monitoring, Nagios has long been a popular choice due to its robust feature set and flexibility. However, even reliable systems can encounter issues, and one recurring problem that Nagios users might encounter is the "Return code of x is out of bounds" error. In this blog post, we'll dive into the details of this error, what causes it, and how it can impact your monitoring efforts.
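As background for the error named above: Nagios plugins report status through their exit code, and the conventional values are 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN). Anything else is reported as out of bounds. The sketch below is a minimal illustration of keeping a check inside that range; the function names and thresholds are hypothetical and are not taken from the blog post.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
VALID_CODES = {OK, WARNING, CRITICAL, UNKNOWN}


def check_threshold(value: float, warn: float, crit: float) -> tuple[int, str]:
    """Hypothetical check: map a metric value to a status code and message."""
    if value >= crit:
        return CRITICAL, f"CRITICAL - value is {value}"
    if value >= warn:
        return WARNING, f"WARNING - value is {value}"
    return OK, f"OK - value is {value}"


def safe_exit_code(code: int) -> int:
    """Map any code outside 0-3 (for example 127 from a missing command) to UNKNOWN."""
    return code if code in VALID_CODES else UNKNOWN


code, message = check_threshold(93.0, warn=80.0, crit=90.0)
print(message, "->", safe_exit_code(code))
print("raw 127 ->", safe_exit_code(127))
```

A real plugin would print the message and then call `sys.exit()` with the sanitized code as its last step.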

Progressive delivery on Kubernetes with CircleCI and Argo Rollouts

Containers and microservices have revolutionized the way applications are deployed on the cloud. Since its launch in 2014, Kubernetes has become a de facto standard as a container orchestration tool. With traditional approaches to deploying applications in production, developers often release updates or new features all at once, which can lead to problems if bugs or other issues weren’t caught during testing.

What is a Colocation Data Center?

Colocation is a type of service in which organizations rent space in a data center facility to house their IT infrastructure. These facilities provide the power, cooling, and network connectivity that companies require to operate their servers, storage devices, and applications. Colocation services allow companies to reduce costs and avoid the hassle of building and maintaining their own data center facilities.

Data at the edge: Meet modern data processing demands with edge computing

We’ve all experienced latency in some form. It’s unfortunately something we’re all too familiar with. We’ve even gone so far as to accept it as a regular albeit undesirable part of the user experience. Yet despite various steps taken over the years, it still exists and is as disruptive as ever.

Automation + Orchestration = A Continuous Journey to Drive Bigger Business Value

The more people realize the many ways automation makes their jobs easier, the more they want to apply it – and not just in IT, but across business departments and for multiple processes. By 2025, Gartner predicts: IT individuals have been taking advantage of automation for at least 10 years, but as the stats show, organizations as a whole are gravitating toward the value automation can bring.

How New Mexico State University accelerates compliant federal research with Ubuntu

When the stakes are high and national security is on the line, every decision matters. Just ask the team at New Mexico State University’s Physical Science Laboratory (PSL). Founded back in 1946 to support the United States’ space and rocket programs, PSL has been on the leading edge of defence-oriented applied science for over seven decades. But when the Department of Defense (DoD) rolled out new cybersecurity guidelines, PSL found itself at a crossroads.

Closing the Gap: Ubuntu Pro in the AWS Shared Responsibility Model

Deploying your application on a public cloud offers numerous benefits, including improved time to market, elastic capacity, and improved baseline security compared to on-premises solutions. However, this does not guarantee better security coverage for your application and data. For this reason, the major cloud providers provide a Shared Responsibility Model, which outlines the distribution of security responsibilities between the cloud service provider and its customers.

Azure Files: Latest Enhancements and Features

The Azure Files update in 2023 introduced Azure Active Directory support for REST API, enabling SMB file share access with OAuth authentication. This advancement improved the scalability of Azure Virtual Desktop by increasing the root directory handle limit from 2,000 to 10,000. Additionally, the public preview of geo-redundant storage for large file shares enhanced capacity and performance, while the Premium Tier now guarantees a 99.99% uptime SLA for all premium shares.

The Unplanned Show, Episode 11: Donnie Berkholz on ITIL, DevOps and Platforms

In this episode, Donnie breaks down where ITIL came from and where it’s starting to go, and why that’s useful for teams that are trying to adopt DevOps practices in ITIL-oriented organizations. Donnie gives some great examples of building empathy and bringing the ITIL teams along for automating changes and decentralizing Sev 2 incident management. He also lays out his core philosophies on Platform Engineering and how to justify the effort.

Things We've Learned About Software Delivery Principles Through A Pandemic - Civo Navigate NA 2023

Dive into the world of high-performing engineering teams with Jeremy Meiss of CircleCI in this Civo Navigate NA 2023 talk. Discover the pivotal metrics that drive success and learn how to strike the perfect balance between rapid delivery and robust stability in software engineering.

Unlocking the Power of Hosted Graphite and Machine Learning

Monitoring and optimizing IT infrastructure, applications, and networks is crucial for businesses in today's digital landscape. It allows them to proactively identify issues, ensure optimal performance, and deliver a seamless user experience. However, traditional monitoring methods often fall short when it comes to handling the increasing complexity and scale of modern systems. That's where Hosted Graphite and machine learning come into play.

Network Monitoring: A Comprehensive Overview

Imagine this: You’re a doctor. Your patient is a colossal network of computers, servers and cables, all intertwined and humming with activity. Your job? To keep an eye on this complex entity’s vital signs, ensure it runs smoothly and intervene when things start to look a little off. Welcome to the world of network monitoring and the role of network administrators.

Platform Engineers: Applied Best Practices Are Baked-in to Kubernetes Monitoring

Operating Kubernetes reliably and efficiently involves adhering to a set of best practices. These practices help ensure the stability, scalability and maintainability of your Kubernetes clusters and their applications. It's crucial for platform teams (responsible for the infrastructure) and software development teams (responsible for deploying applications) to work together in applying these practices.

Leveraging Cloud Infrastructure for Remote Work

Even though there have been numerous reports about the decline of remote work in the last two years, many companies are still using the model and seeing a lot of success with it. Some of this success has to do with the technology they use. Cloud technology has become popular as businesses have realized its numerous benefits, including its ability to help them reduce operating costs, enhance collaboration, and provide scalability. Here's how these businesses are using cloud infrastructure effectively for remote work.

Telecom security: How to safeguard your open source telco infrastructure

From pure voice to data, and now with the connectivity provided to devices and machines, telco systems make it possible to deliver digital services to society. Thanks to telecom systems, we can keep in touch with each other and reach the information sources we need at any time and anywhere. As we have become increasingly reliant on these systems, we also need to be vigilant about telecom security.

A Practical Guide to Incident Communication

Even the best software fails sometimes. How quickly those failures get addressed, and how your teammates and customers feel about you after the fact, comes down to how well you communicate with them. Users, customer success managers, Ops team members, IT, security, engineering leadership, even the executive team. Each has a vested interest in resolving engineering incidents quickly. All need to be updated with the right information at the right time.

How to schedule an AzCopy Data Transfer

AzCopy is a command-line utility designed for copying data to and from Microsoft Azure Blob and File storage. It is a very powerful tool provided by Microsoft that helps users to copy and transfer data efficiently and securely. One of the key features of AzCopy is the ability to schedule transfers. Scheduled transfers can be extremely useful in managing data and ensuring that data is moved or backed up at the most appropriate times.

Detecting Performance Monitoring Issues in Prometheus & Grafana | 2023 Guide

Stay ahead of performance hiccups with our comprehensive Prometheus & Grafana monitoring tutorial. From setup to advanced detection techniques, this guide ensures your systems run smoothly. Facing challenges or want to exchange tips? Connect with peers and mentors in our dedicated community space.

Advanced Monitoring and Observability Tips for Kubernetes Deployments

Cloud deployments and containerization let you provision infrastructure as needed, meaning your applications can grow in scope and complexity. The results can be impressive, but the ability to expand quickly and easily makes it harder to keep track of your system as it develops. In this type of Kubernetes deployment, it’s essential to track your containers to understand what they’re doing.

LogicMonitor Kubernetes Helm Monitoring

LogicMonitor recently added support for monitoring Kubernetes Helm Charts. This new module helps customers clearly see the health and performance of their Kubernetes applications, quickly respond when configured metrics exceed thresholds or deviate from patterns, and detect anomalies early so they can take action on critical issues.

How to use Key-Based Deduplication in Squadcast | Deduplication Rules | Squadcast

Key Based Deduplication is an efficient way to avoid duplicate entries when processing incoming Events alongside existing Incidents. It generates a Deduplication Key using a user-defined template specific to events from an Alert Source. This key helps identify and group duplicates. This video explains how Key Based Deduplication works and how to set it up effectively.
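The general idea of a template-derived deduplication key can be sketched in a few lines. This is an illustration only: the template syntax, the field names (`host`, `check`), and the helper functions are assumptions made for the example and do not reflect Squadcast's actual template language or API.

```python
from collections import defaultdict
from string import Template

# Hypothetical template: events sharing the same host and check get the same key.
KEY_TEMPLATE = Template("$host:$check")


def dedup_key(event: dict) -> str:
    """Render the deduplication key for one event from the template."""
    return KEY_TEMPLATE.substitute(host=event["host"], check=event["check"])


def group_events(events: list[dict]) -> dict[str, list[dict]]:
    """Group events by key so duplicates attach to a single incident."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for event in events:
        groups[dedup_key(event)].append(event)
    return dict(groups)


events = [
    {"host": "db-1", "check": "disk", "message": "disk at 91%"},
    {"host": "db-1", "check": "disk", "message": "disk at 93%"},
    {"host": "web-2", "check": "cpu", "message": "cpu at 88%"},
]
grouped = group_events(events)
for key, members in grouped.items():
    print(key, len(members))
```

Here the two `db-1` disk events collapse into one group, while the `web-2` event stays separate.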

Helm Dry Run: Guide & Best Practices

Kubernetes, the de facto standard for container orchestration, supports two deployment options: imperative and declarative. Because they are more conducive to automation, declarative deployments are typically considered better than imperative. A declarative paradigm involves: The issue with the declarative approach is that YAML manifest files are static.

Cloud Repatriation: The Unforeseen Reversal in Cloud Computing Trends

At first glance, this may appear counterintuitive. After all, aren’t public clouds lauded for their scalability, flexibility, and cost-efficiency? However, a closer examination reveals a more nuanced reality. This article delves into the driving forces behind cloud repatriation, helping you determine whether it might be the right fit for your organization.

Sponsored Post

Managing On-Call Rotations: Navigating Incident Management from Chaos to Calm

Navigating On-Call rotations can often feel like taming a storm of alerts and constant disruptions, leaving teams overwhelmed and stressed. Hence there is a need to streamline On-Call rotations and use the right software to restore order and peace. In this guide, you'll explore practical tips, best practices, and smart strategies to transform your Incident Management process. Let's get to a more efficient On-Call experience.

Cloud Computing Market Size And Key Insights You Need To Know In 2023

Over the next five years, the cloud computing market is expected to grow at a compound annual growth rate (CAGR) of 18.3%. Cloud-based services are becoming increasingly popular with businesses of all sizes, contributing to the growth of this market. When you consider the other benefits involved, that makes a lot of business sense.
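To put the quoted growth rate in perspective, compounding 18.3% per year over five years can be worked out directly. The sketch below assumes only the figures stated above; the function name is made up for the example.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor after compounding `cagr` annually for `years` years."""
    return (1 + cagr) ** years


multiple = growth_multiple(0.183, 5)
print(f"{multiple:.2f}x")  # 2.32x
```

In other words, a market growing at that rate would be a little over twice its starting size after five years.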

StatefulSet vs. Deployment in Kubernetes

As Kubernetes continues its ascent as a leading container orchestration platform, it's common for users to encounter a perplexing choice between two prominent workload controllers: StatefulSets and Deployments. Despite both controllers being instrumental in managing high-availability workloads, they diverge significantly in terms of features and use cases. Grasping these distinctions is pivotal for fine-tuning the performance and scalability of your Kubernetes infrastructure.

Everything You Need to Know About Kubernetes

Welcome to the world of Kubernetes, a powerful container orchestration platform. Before we dive deep into the concepts of Kubernetes, let's grasp the concept of containers: lightweight, isolated units that package applications along with their dependencies, ensuring seamless deployment and portability. In this blog, you will witness Kubernetes' incredible abilities. It can handle the ups and downs of your applications, ensuring they scale seamlessly, even when facing tough challenges.

Exploring Kubernetes Nodes: Essential Components of Container Orchestration

Kubernetes serves as a robust tool for managing and orchestrating applications across multiple computers. These computers are referred to as 'nodes.' Picture nodes as fundamental units in the ecosystem of your applications. Every node possesses its own computing resources, encompassing memory, processing capabilities, and storage capacity. Your apps are hosted and run by nodes. They give your apps the room and resources they need to work.

5 Best Environment as a Service (EaaS) Platforms in 2023

A paradigm-shifting concept has emerged in the dynamic and ever-evolving world of modern software development — Environment as a Service (EaaS). This innovative approach has swiftly become a cornerstone of streamlined development processes, offering developers the means to effortlessly provision, manage, and collaborate within diverse software environments.

Ubuntu Desktop: charting a course for the future

It has been a little while since we shared our vision for Ubuntu Desktop, and explained how our current roadmap fits into our long term strategic thinking. Recently, we embarked on an internal exercise to consolidate and bring structure to our values and goals for how we plan to evolve the desktop experience over the next few years. This post is designed to share the output of those discussions and give insight into the direction we’re going.

Feature flags for stress-free continuous deployment

Feature flags (also known as feature toggles or switches) are conditional statements in code that determine whether a feature or functionality is visible and accessible to users of an application or service. They offer programmers a powerful tool for managing feature releases. Their capabilities are indispensable in software development, where agility and continuous, automated delivery are paramount.
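As a minimal sketch of the "conditional statement" described above: the flag store and the flag name `new_checkout` are hypothetical, and a real setup would typically load flags from configuration or a flag service rather than an in-memory set.

```python
class FeatureFlags:
    """Tiny in-memory flag store, for illustration only."""

    def __init__(self, enabled=None):
        self.enabled = set(enabled or [])

    def is_enabled(self, name: str) -> bool:
        return name in self.enabled


def render_checkout(flags: FeatureFlags) -> str:
    # The flag is just a conditional: the new path stays hidden until it is switched on.
    if flags.is_enabled("new_checkout"):
        return "new checkout flow"
    return "legacy checkout flow"


flags = FeatureFlags()
print(render_checkout(flags))      # legacy checkout flow
flags.enabled.add("new_checkout")
print(render_checkout(flags))      # new checkout flow
```

Because the new code is deployed but inactive until the flag flips, the release decision is decoupled from the deployment itself.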

Agent vs. Agentless Security: How Do They Stack Up for Secure Infrastructure Automation?

In today’s multi-cloud reality, we at Puppet see that perspectives about agent vs. agentless have changed as security needs have grown in complexity and urgency. Given the range of architecture models and supporting infrastructure (private and public cloud), there is no longer a one-size-fits-all view of the best way to keep a business secure. In this blog, we’ll weigh the pros and cons of agent vs.

A Release Strategy for Continuous Innovation

At Cribl, we take pride in doing things differently. Our Customers First mentality is at the heart of everything we do as an organization, from free education and sandboxes, community programs, and platforms, to streamlining legal reviews on contracts. We strive to solve problems from first principles, understanding root causes to build optimal experiences rather than piecing solutions together. We aim to be a partner, working with you to address your challenges holistically.

Architecting a Data Infrastructure with Kubernetes - Civo Navigate NA 2023

Join George Trujillo, an experienced enterprise architect, who explores architecting data infrastructure with Kubernetes. Learn about real-time AI, data strategies, machine learning, and Kubernetes's crucial role in efficiently deploying models. Hear firsthand stories of his journey to stateful solutions and managing business requirements. Discover insights on technical debt, strategy drift, and how to improve the speed of organizational execution.

Unlocking Azure Data Factory Transparency with Business Activity Monitoring (BAM)

Facing challenges with Azure Data Factory's transparency? This video from Mike Stephenson unveils the power of Business Activity Monitoring (BAM) in bridging this gap. By integrating BAM, you can not only enhance transparency but also elevate the efficiency of your operations, reducing frequent support hitches.

Streamlining incident response: the power of integration in engineering tools

In the ever-evolving world of software development, incidents are bound to happen. Whether it's an unexpected server crash, a critical bug impacting user experience, or a security breach, handling incidents swiftly and effectively is crucial for maintaining a seamless user experience and preserving business reputation. That's where incident response tools come in — to help you automate, document, communicate, and mitigate.

A Practical Developer's Guide on How to Troubleshoot HTTP 5XX errors

Imagine the following situation: You are on call, and your monitoring dashboard has flickering red lights due to an increased number of 5xx HTTP responses from one or more of your Kubernetes services. Now it is time to start to troubleshoot 500 Errors. Instead of panicking, you can use this blog as a guide.

Power Automate Integration for Business Users to Improve Automation Efficiency

Struggling to manage all your Power Automate flows spread across different solutions? Get visibility into your integration processes with Serverless360's Business Activity Monitoring. Business Activity Monitoring (BAM) offers a holistic view of integrations across Power Automate, Logic Apps, and different Azure components. From real-time tracking to insightful dashboards, this video by Michael Stephenson showcases BAM's vital features, facilitating swift troubleshooting and user empowerment.

Serverless Data Integration

Struggling with the complexities of serverless data integration, especially within the realm of Azure? Dive into this illuminating podcast by Ratomir Vukadin from Dev Tech as he delves into the intricacies of serverless data integration and offers essential guidance on overcoming them. Discover solutions ranging from addressing Azure Functions challenges to mastering hybrid methodologies, all backed by real-world case studies and beginner-friendly advice.

Streamlining operations with automation from Spot by NetApp

The increased usage of containerization and virtual machines (VMs) in cloud infrastructure comes with its operational toll. Teams are preoccupied with complex and tedious tasks that prevent them from focusing on applications and architecture and driving the business value they can deliver. Organizations are seeking efficient solutions to optimize their cloud operations and enhance scalability.

Azure and VMware Integration

In today’s rapidly transforming digital landscape, virtualization and cloud computing have become pivotal technologies. Among the leaders in these fields, VMware and Microsoft Azure stand as two monumental names, each carving out unique niches and empowering businesses to reach new heights. The integration between VMware’s virtualization solutions and Azure’s vast cloud services is not merely a technical novelty but a strategic alignment that reshapes the way enterprises operate.

The MOST used metrics to boost deployment frequency

The most used metrics in Sleuth to help software teams boost deployment frequency are batch size, lead time, coding time, review lag, review time and deploy time. Here's how the software engineering team at Gigpro uses them to construct experiments. Give Sleuth a try and see how we empower software teams to build faster by making engineering efficiency easy to improve and measurable — in a way that both managers and developers love.

Modernizing Cybersecurity: New Challenges, New Practices

The practice of cybersecurity is undergoing radical transformation in the face of new threats introduced by new technologies. As a McKinsey & Company survey notes, “an expanding attack surface is driving innovation in cybersecurity.” Kubernetes and the cloud are infrastructure technologies with many moving parts that have introduced new attack surfaces and created a host of new security challenges.

Mastering Communication: 8 Skills for Enhancing Dev to Dev Collaboration

Collaboration: the secret sauce of every software project. But let’s be honest, mastering the communication needed for seamless teamwork is like trying to herd cats wearing roller skates – sounds amusing, but feels impossible. From navigating multiple time zones and illegible pull requests to juggling so many tasks simultaneously that your brain feels like a browser with fifty tabs open – teamwork isn’t always a walk in the park.

Black Hat USA: Adaptable Security From HAProxy

The curtain rose and fell on another spectacular Black Hat USA, the conference set against the backdrop of fabulous Las Vegas in the Mandalay Bay Convention Center. We knew upon hitting the Strip that all the glamor and neon lights were just the preshow for the main event: innovation and the latest in cybersecurity. We couldn’t wait to show attendees and fellow vendors what we had to offer.

More than downtime: the opportunity costs of poor incident management

In my last blog post, I wrote about the explicit costs of incidents — the ones you can easily track based on dollars lost. But the cost of incidents goes beyond the time spent resolving them. While we’re spending our time managing incidents (that includes mitigating and retrospectives), we’re incurring a large opportunity cost in terms of releasing the next big thing.

Job Progress Optimization: Deep Dive into Connections, Improvements & Query Analysis | 2023 Guide

Explore the vast world of SQL job progress. In this guide, we break down the ins and outs of connections, demonstrate innovative improvements, and demystify query analysis. Enhance your database efficiency and get ahead in your SQL journey.

CloudZero Vs. Homegrown Cloud Cost Tooling

Worldwide cloud spending is expected to reach $597.3 billion in 2023, up from $491 billion in 2022. Year over year, that’s an increase of roughly 21.6%. Compare that to the 15.42% increase (at the time of writing) in the Nasdaq-100 Technology Sector Index, and you see a simple, inconvenient truth: Cloud spend is outpacing business growth by a substantial margin. It’s no wonder that cloud efficiency sits at the top of many companies’ priority lists.
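The year-over-year figure in this teaser is a plain percentage change, which can be checked with a few lines of Python. This is a minimal sketch: the function name `yoy_growth_pct` is illustrative, and the two spending values are the ones quoted in the text.

```python
def yoy_growth_pct(previous: float, current: float) -> float:
    """Return the change from previous to current as a percentage of previous."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Worldwide cloud spending in USD billions, as quoted in the teaser.
cloud_spend_2022 = 491.0
cloud_spend_2023 = 597.3

growth = yoy_growth_pct(cloud_spend_2022, cloud_spend_2023)
print(f"Cloud spend growth: {growth:.1f}%")
```

The same helper can be pointed at any pair of yearly totals, such as an index level, to make the kind of comparison the teaser describes.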

Choosing the Right Kubernetes Cluster Setup: A Comprehensive Guide

Kubernetes has revolutionized how modern applications are deployed, managed, and scaled. As the container orchestration platform of choice, Kubernetes provides a dynamic and highly efficient environment for running containerized applications. At the heart of this ecosystem lies the intricate relationship between Kubernetes and the applications residing within its clusters. Applications within Kubernetes clusters are arranged through Pods, which are managed and scaled by various controllers.

A Guide to Kubernetes Core Components

In the ever-evolving landscape of modern software development and deployment, Kubernetes has emerged as a prominent solution to manage and orchestrate applications. This technology has redefined how applications are deployed and maintained, offering a flexible and efficient framework that abstracts the underlying infrastructure complexities. In Kubernetes, you define how network traffic should be routed to different services and pods.

Top Features of Grafana Versions 7 & 8

Grafana is a monitoring system that helps you visualize your infrastructure and provides notifications when errors occur. Some versions introduced notable features, as is the case with v7 and v8. We will go through some of the most interesting ones, in particular the panel editor, tracing UI, bar graph, and visualizations. With MetricFire specializing in hosted monitoring, you can easily make a Grafana dashboard by booking a demo or signing on to the free trial immediately.

Down to the Dollar: Turning Logs into Serverless Estimates - Civo Navigate NA 2023

Learn the nuances of serverless cost estimation in "Down to the Dollar: Turning Logs into Serverless Estimates" with David Strauss. Discover actionable strategies for cost reduction and understand how effective log analysis can lead to better financial models. From predicting your serverless expenses to understanding the impact of architecture choices on cost, this video offers a comprehensive guide for any IT professional.

AI and Big Data Solutions

Big data and artificial intelligence (AI) go hand in hand. Used for tasks like trend prediction, process automation and research, these two technologies can help organisations solve some of the toughest problems. However, the growing volume of data and increasing diversity of data sources make it difficult to use data and AI effectively and at scale.

Unlock rapid developer productivity gains | Sleuth Automations Marketplace

Unlock rapid developer productivity gains with Sleuth's Automations Marketplace. Your team has no time for toil – the manual, repetitive tasks that eat away at your developers. That’s why the time is now for Sleuth Automations Marketplace, where no-code automations let you experiment rapidly to improve your engineering efficiency and save your developers from burnout. We discussed how to make your team awesome with DORA in part one of our series. This event will show you how Sleuth's Automations Marketplace helps you take the DORA philosophy and metrics to the next level to improve your team.

101+ Cloud Computing Statistics That Will Blow Your Mind (Updated 2023)

The trend continues to accelerate - even faster now. Cloud computing was already booming before 2020. But in the following two years, remote work flourished, and cloud adoption soared. Some companies have since returned to the office. Others are adopting hybrid models, where staff work from home and in the office. Yet there’s more to the rise and rise of cloud computing than remote working.

Machine learning in finance: history, technologies and outlook

In its analysis of over 1,400 use cases from “Eye on Innovation” in Financial Services Awards, Gartner found that machine learning (ML) is the top technology used to empower innovations at financial services firms, with operational efficiency and cost optimisation as key intended business outcomes. ML is a branch of artificial intelligence (AI) that involves the development of algorithms and models capable of automatically learning and improving from data.

Introduction to Azure Files Backup

Azure Files is Microsoft’s robust file storage solution, offering the ability to access data seamlessly from various locations using standard protocols. But in the world of IT, where data is the heartbeat of operations, its safety is paramount. That’s where Azure Files Backup comes into play. In a digital era, where data loss can spell catastrophe, backing up your valuable files is more than a best practice; it’s a necessity.

How seamless connectivity is driving the evolution of broadcast media

The way content is created, distributed, and consumed has changed a lot over the last decade with the introduction of streaming and on-demand viewing. Media sector revenues are growing, but so is the cost to produce content and manage complex digital ecosystems. So how can media and broadcasting companies simplify network complexity and improve performance, all while delivering great content to hungry audiences? First, let’s look at the three key areas for connectivity in media and broadcasting.

10 Best Firewalls for Small Business to Use in 2023

Why is it critical to know and implement the best firewalls for small businesses? Well, cybercrime in information technology development has reached new heights, and according to Cybersecurity Ventures, the damage it causes to the online landscape is forecasted to grow to $10.5 trillion annually by 2025. Phishing, malware, account takeover, credential abuse, ransomware, cryptojacking, and zillions of other severe cyber security attacks are commonplace these days.

Announcing the Gremlin Enterprise Chaos Engineering Certification (GECEC) program

We knew Chaos Engineering was in high demand when we first launched the Gremlin certifications in 2021. But we had no idea our Chaos Engineering certification programs would be such a success. There’s a reason: the market is looking for professionals who know how to wield Chaos Engineering well, and Gremlin's certification has become the gold-standard to learn the principles of Chaos Engineering and demonstrate proficiency.

Best Architecture for Dev Collaboration: Monorepo vs. Multi-Repo

Choosing between a monorepo and a multi-repo architecture can significantly impact how development teams work together. In this article, we’ll delve into the advantages and disadvantages of both approaches, helping you make an informed decision to enhance collaboration within your dev teams.

Azure Monitoring Tool: Here's What's New in Serverless360 for June 2023

Looking to optimize your Azure monitoring and management? In this video, Michael Stephenson dives into the latest release updates of #Serverless360, an enterprise-grade management and monitoring tool for Microsoft Azure. Discover the latest features added in June 2023, including failure monitoring for logic app standards, Hybrid Tracking between BizTalk to Azure, and much more.

[New Premium feature] Share pipeline workflow configurations across your repositories

As part of our focus on building features around team scale and performance, we are happy to announce that Bitbucket Pipelines now supports sharing of CI/CD configurations across repositories. This feature is now available as part of our Premium plan. With this feature, your teams can create centralized pipeline YAML workflows and import them into other repositories in your workspace. This brings several benefits. Here’s how it works.

Orchestrating Kubernetes with Terraform + Kustomize - Civo Navigate NA 2023

Jack Ross, Principal Software Development Engineer at Shutterfly, explains the use of Terraform and Kustomize for orchestrating Kubernetes. He distinguishes between Infrastructure as Code and Configuration as Code, outlining their respective benefits. Ross highlights how Terraform interacts with Civo and contrasts the functionality of Helm and Kustomize. The session wraps up with practical code examples illustrating the principles of these tools.

Merging to Main #5: Coexisting Between Kubernetes & Legacy Tech with Mark Panthofer, Nvisia

Are you trying to balance your CI/CD resources and effort between Kubernetes and your legacy tech? Don't know how to encourage adoption of the new processes? On this episode of Merging to Main, we'll cover just that. This episode's guest is Mark Panthofer, VP of DevOps & Cloud at nvisia. During this session, Brandon and Mark discuss how your CI/CD can coexist between two distinct technology worlds.

AzCopy and GDPR compliance

In today’s data-driven world, managing information is more crucial than ever. With the constant flow of data, both individuals and organizations are increasingly concerned about privacy and security. The General Data Protection Regulation (GDPR) has emerged as a key legislative framework in the European Union to protect citizens’ personal data. But how does this relate to the tools we use to manage and transfer data, like Microsoft’s AzCopy?

Develop, Operate, and Optimize with VMware Tanzu: A VMware Explore Round-Up

Applications are the center of your organization’s business. Success (or failure) depends on how quickly you can respond to dynamic market demands driven by cultural shifts, technical innovation, and global events. This business agility is driven by fast, predictable application delivery. The past few years have seen IT leaders in the public sector and private industry alike rushing to get better and faster at delivering applications and services to their customers, employees, and constituents.

Holistic Azure Monitoring and Alerting with Single-Pane Visibility

One of the key ways to resolve Azure incidents faster is to set up monitoring in a business context and achieve single-pane-of-glass visibility across Azure subscriptions and resource groups. It is also important to alert only on potential issues, and to notify only the relevant team members during an incident, to eliminate alert fatigue. In this video, Manish Uppadhaya from Kleinschmidt shares how Serverless360 enhanced their Azure experience by centralizing monitoring and offering tailored alerts.

Why should every organization invest in cybersecurity software?

In today’s digital age, organizations across industries are increasingly reliant on technology for their operations, communication, and data management. While this technological advancement is no doubt beneficial, it also brings with it a heightened risk of cyber threats and attacks. From data breaches and ransomware attacks to intellectual property theft and financial fraud, the consequences of a cybersecurity breach can be devastating for any organization.

21+ Stunning FinOps Statistics You Need To Read

A recent report found that managing cloud costs, alongside resource usage, is the most pressing cloud management challenge for the fifth straight year. This makes sense, considering respondents said a staggering 32% of their cloud budget went to waste in 2022 alone. It doesn't end there. The following FinOps statistics also demonstrate the value of the cloud for many organizations.

How to Log HTTP Headers With HAProxy for Debugging Purposes

HAProxy offers a powerful logging system that allows users to capture information about HTTP transactions. As part of that, logging headers provides insight into what's happening behind the scenes, such as when a web application firewall (WAF) blocks a request because of a problematic header. It can also be helpful when an application has trouble parsing a header, allowing you to review the logged headers to ensure that they conform to a specific format.

The Resolve Automation Flywheel: A "Good to Great" Automation Journey

Organizations lean on IT as the foundational force for helping the business succeed, even more so when stakes are high, and the future is unpredictable. Considering the countless tasks and processes carried out in IT, automation is no longer a nice to have. It’s a requirement for starting, and maintaining, a system of consistency that supports the business through instability and uncertainty.

Exploring Kubernetes Storage: Persistent Volumes and Persistent Volume Claims

In today's world of container-based applications, the role of storage has become more critical than ever. One of the most significant challenges of containerization is the management of stateful applications. Kubernetes, one of the popular container orchestration platforms, provides a solution to this problem - Persistent Volumes (PVs). PVs allow the storage provision to be decoupled from the lifecycle of the Pod, making it easier to manage stateful applications.

Do you know where your engineering bottlenecks are hiding?

Do you know where your software engineering bottlenecks are hiding? Jacpocket used Sleuth to track DORA metrics, which revealed the software development team suffered from infrequent releases. Here's how Sleuth helped. Give Sleuth a try and see how we empower software teams to build faster by making engineering efficiency easy to improve and measurable — in a way that both managers and developers love.

The Iceberg of Engineering Incident Costs

I've long been fascinated with the metaphor of an iceberg to describe a problem whose true magnitude is obscured beneath the surface. If you’re not familiar with this phenomenon, when water freezes it decreases in density. This allows the solid ice to float, partially, atop the water with only a small fraction of it exposed. In fact, icebergs hold nearly 90% of their mass hidden below the water.

How to Achieve Zero-Downtime Application with Kubernetes

I’ve worked on on-premises and managed Kubernetes clusters for more than seven years. What I can say is that containers have drastically changed the hosting landscape! They have simplified what used to require complex setups: running several instances with rolling restarts, zero downtime, health checks, and so on. It used to take a lot of pain and time (implementing a VRRP solution, monitoring and restarting applications with a tool like monit, load balancing with a tool like HAProxy)!

Best Practices and Potential Loopholes for Successful Microservices Architecture

Microservices architecture is a software development approach where an application is built as a collection of small, loosely coupled, independently deployable services. Each service focuses on a specific business capability and operates as an autonomous unit, communicating with other services through well-defined APIs. This architectural style is often used in the context of DevOps to create more efficient, scalable, and manageable systems.

Simplifying the Tooling you Need to Manage Infrastructure at Scale - Civo Navigate NA 2023

In this talk, Alejandro, a Go developer from Civo, introduces OpenCP, a tool aiming to simplify infrastructure management at scale. OpenCP provides a unified API for interacting with different cloud providers, rendering the need to rewrite configuration files when switching providers unnecessary. The tool simplifies infrastructure management by allowing operations to be handled through the kubectl interface, thus ensuring compatibility with most users' existing workflows.

New User Inbox and Circuit Management

As we enter the new era of 4.0, we are thrilled to introduce a range of fresh and user-friendly features. Take a peek into your brand new User Inbox, a centralized hub for all your notifications. Explore the enhanced Connectivity add-on with Circuit Management, allowing you to conveniently bundle your connections. And for those managing multiple outlets, get ready to experience the upgraded Outlet Control functionality that will surely delight you.

Hyperview DCIM 4.0 Software Release

As we enter the new era of 4.0, we are thrilled to introduce a range of fresh and user-friendly features. Take a peek into your brand new User Inbox, a centralized hub for all your notifications. Explore the enhanced Connectivity add-on with Circuit Management, allowing you to conveniently bundle your connections. And for those managing multiple outlets, get ready to experience the upgraded Outlet Control functionality that will surely delight you.

Optimizing Web Security Operations for Remote Work Environments

The shift towards remote work has been one of the most significant transformations in the modern business landscape. While it offers flexibility and a broader talent pool, it also introduces unique challenges, especially in the realm of web security. As businesses adapt to this new norm, optimizing web security operations becomes paramount.

What is a hypervisor? A beginner's guide

In the realm of virtualisation and cloud computing, the hypervisor is a critical component that enables the seamless operation of multiple virtual machines (VMs) on a single host. While virtualisation is a technology, the hypervisor is its actual implementation. In this beginner’s guide, we will explore the fundamentals of hypervisors, their types, and how they differ from container runtimes. We will also review some of the leading hypervisors available today.

Understanding the API economy

APIs (Application Programming Interfaces) have been around for decades. Although the concept emerged as far back as the 1940s, it was the efforts of eBay, Salesforce, and Amazon in the early 2000s that really put them on the map. APIs today are ubiquitous - any company developing software will likely have an API for customers and partners - and research suggests the majority of enterprises consuming APIs are focused on strategies for easier consumption and management of APIs.

Integrating Azure Files with Windows Server

In today’s digital world, where data is considered the new oil, organizations are consistently looking for efficient ways to store and manage their invaluable information assets. Microsoft’s Azure Files and Windows Server are two technologies at the forefront of this technological evolution.

What is the SPACE developer productivity framework?

The SPACE framework is an acronym for an approach to measuring, understanding, and optimizing engineering productivity. Outlined by researchers from GitHub, Microsoft, and the University of Victoria, it encourages leaders to look at productivity holistically, placing leading metrics in context with each other and linking them to output, often team goals, rather than individual effort. The SPACE framework breaks productivity into five metrics.

10 FinOps Diagrams To Help You Better Understand The Value Of FinOps

By now FinOps has proven its value — so much so that the web is full of helpful in-depth guides, broad overviews, bite-sized articles, and resource lists on the subject. But if you're preparing to pitch a FinOps program at your company, you need to convey only the most important points in a way that's both impactful and succinct. That's when it's time to call in the visuals.

How to perform end-to-end message failure tracking in Azure?

When an application is built with distributed Azure Integration services, it becomes essential to track the message request flow end to end in order to detect failures and isolate faulty services. Traditionally, developers do this manually by going through hundreds of failure logs in Log Analytics, which kills their productivity. Also, the Application Map feature in Application Insights provides an oversimplified version of the topology map of the application. For instance, tracking the complete data flow in Azure Durable Functions is impossible.

Shifting Left Stateful Applications In Kubernetes - Civo Navigate NA 2023

In this talk, Viktor Farcic from Upbound demonstrates how to shift left stateful applications, avoid the pitfalls of a Jira ticketing system, and implement a more efficient and self-sufficient system. Plus, see a live demo of building a database in a Kubernetes cluster and deploying an application.

Managing Users and User Groups: A Guide to OKTA and Cloudsmith Integration

Explore Cloudsmith’s powerful OKTA integration for user and user group management. Dive into the benefits, security considerations, and best practices to optimize user access, streamline workflows, and bolster security in your software operations. User management is the backbone of secure and efficient software operations. As businesses grow and evolve, the tools they use must keep pace. Enter OKTA and Cloudsmith.

How to import EKS clusters into Ocean in 5 easy steps

If you are familiar with Spot Ocean, you may already know how quickly you start realizing the value it generates. Ocean offers automation and optimization of container workloads, while providing enterprise-grade service. Today we are glad to share that you can start realizing the value even faster, with Ocean’s new creation wizard. Ocean’s creation wizard is designed to simplify and accelerate setting up Elastic Kubernetes Service (EKS) clusters with Spot Ocean.

Observability and the DORA metrics

The Accelerate State of DevOps Report highlights four key metrics (known as the DORA metrics, for DevOps Research & Assessment) that distinguish high-performing software organizations: deployment frequency, lead time for changes, time-to-restore, and change fail rate. Observability can kickstart a virtuous cycle that improves all the DORA metrics.
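To make the four metrics concrete, here is a minimal Python sketch that derives them from a list of deployment records. Everything in it is an assumption for illustration: the `Deployment` record, its field names, the `dora_summary` helper, and the sample data are hypothetical and do not come from the report or from any particular tool. Real tooling may define the metrics differently, for example by measuring lead time from the first commit rather than the last.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deploys: list[Deployment], window_days: int) -> dict:
    """Compute the four DORA metrics over a window of window_days days."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
    ]
    failures = [d for d in deploys if d.failed]
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployment_frequency_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(lead_hours),
        "change_fail_rate": len(failures) / len(deploys),
        # None when no failed deployment has a recorded restore time.
        "mean_time_to_restore_hours": (
            sum(restore_hours) / len(restore_hours) if restore_hours else None
        ),
    }


# Made-up sample data: four deployments in one week, one of which failed.
t0 = datetime(2023, 8, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(
        t0 + timedelta(days=1),
        t0 + timedelta(days=1, hours=4),
        failed=True,
        restored_at=t0 + timedelta(days=1, hours=5),
    ),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6)),
    Deployment(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=8)),
]
summary = dora_summary(sample, window_days=7)
print(summary)
```

The median is used for lead time so that a single slow change does not dominate the result; a mean or a percentile would be an equally reasonable choice depending on what the team wants to track.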

SolarWinds Replacements: 6 Best Alternatives

SolarWinds is a trusted name in the world of IT management. This comprehensive suite of tools is designed to help organizations manage, monitor and troubleshoot their IT infrastructure. SolarWinds encompasses several capabilities, including network performance monitoring, systems management, IT security, database management, and IT helpdesk. Still, many SolarWinds replacements exist for IT teams looking for an alternative.

What causes Azure costs to increase?

As the adoption of cloud computing continues to surge, Microsoft Azure remains one of the leading platforms for businesses seeking scalable and efficient cloud solutions. I have been using Azure for a couple of years now; it provides a wide range of services and features, allowing organizations to host applications, store data, and deploy various workloads on a pay-as-you-go basis.

Scaling microservices: Challenges, best practices and tools

Scaling the deployment, in order to meet demand or extend capabilities, is a known challenge in many fields, but it’s particularly pertinent when scaling microservices. This article looks at the challenges of scaling microservices and examines best practices to overcome them while maintaining app quality, dev efficiency, and a good developer experience.

Top 5 Preview Environments Products to Consider in 2023

In the ever-evolving landscape of modern software development, where speed, quality, and collaboration are paramount, the concept of preview environments has gained significant traction. These virtual sandboxes serve as vital tools that empower developers to thoroughly test their code changes in isolated environments before integrating them into the main codebase. The advantages are multifaceted, ranging from reduced bugs to enhanced collaboration and accelerated software delivery.

The Top FP&A KPIs And How To Choose Yours

You know this by now. You can only improve an area of your business if you measure its health and performance. Yet, rather than collecting every general finance metric possible, you’ll want to plan, collect, and analyze the Key Performance Indicators (KPIs) that’ll drive strategic decision-making. That means KPIs are a subset of metrics that are most relevant to your goals as a team. Here’s how to leverage the right KPIs as an FP&A team.

Help reduce resource consumption: Put your preview environments on pause

With Platform.sh, every Git branch maps to a preview environment, which is an exact and isolated copy of your live application, including all data, services, and files. Preview environments are usually created to build new features, apply security patches, or upgrade dependencies in full isolation before deploying to production. There is a catch, though: preview environments are often left idle, waiting for someone to review and approve any changes made.

Authenticating Users with Google IAP in Rails

Google Identity-Aware Proxy (IAP) is a Google Cloud service that provides authentication for web applications. This service simplifies the process of building web applications authenticated with Google, eliminating the need to handle user-related concerns within your application code. This is especially valuable for internal applications within organizations that already utilize GSuite. It is straightforward to use, particularly when operating on Google Cloud.

Enhance Business Processes with Azure Logic Apps Monitoring to Unlock Seamless Integration.

Handling Logic Apps' potential for various integration scenarios should be considered when managing integration processes, especially at a higher abstraction level like the Order business process. But with Serverless360's Business Activity Monitoring (BAM) capabilities, you can provide your users: Join us in this video as we dive deep into the seamless integration of Logic Apps with Serverless360, empowering you with efficient process management and real-time visibility.

How to secure your database

Cybersecurity threats are increasing in volume, complexity and impact. Yet, organisations struggle to counter these growing threats. Cyber attacks often intend to steal, damage, hijack or alter value-generating data. In this article around database security, we use the NIST framework to lay out the common controls that you can implement to secure your databases. Let’s start by discussing the potential impact of unsecured databases.

Top 10 End-to-End Testing Products for Web Applications in 2023

In the realm of software development and testing, end-to-end (E2E) testing plays a pivotal role in ensuring the robustness and reliability of web applications. Unlike unit or integration testing, which focuses on specific components or integrations between them, end-to-end testing evaluates the entire system as a whole. This includes testing the flow from the user's perspective, from start to finish, and often involves multiple systems and components working in unison.

Everything I Needed to Know about Securing a DevOps Platform - Civo Navigate NA 23

Join Hannah Sutor as she unravels the key aspects of securing a DevOps platform in this talk at Navigate NA 2023. She brings to light the necessity of incorporating security measures right from the coding stages to deployment, with an emphasis on continuous monitoring, automation, and the power of team collaboration. Her talk extends beyond the use of tools, focusing on the implementation of best practices for maximum security.

Grafana vs. SolarWinds - The Dashboards

Dashboards are great ways to visualize different KPIs in a single place. Metrics from all over your system can be framed together and viewed on a single screen, helping to correlate them and reducing the overall effort of analysis. But when it comes to Grafana vs. SolarWinds, which one is better? It is often difficult to choose between their dashboarding capabilities. Both tools provide their own visualizations and help bring out interactive dashboards for users to use.

10 Observability Tools in 2023: Features, Market Share and Choose the Right One for You

Understanding what's happening within your systems is a necessity. Have you ever wondered how experts keep an eye on systems to make sure everything's running smoothly? That's where observability tools come in! Observability tools are like helpers that give you a peek inside your tech. In this blog, we will talk about observability tools and how they can be used in different situations so it's easier for you to choose the right one for your organization.

How Transtira Delivers Improved Productivity for its Fleet with Remote Access and App Management

Discover how Transtira, an innovative logistics company, overcame the challenges of providing timely technical support, monitoring data usage, and ensuring driver safety with the help of AirDroid Business. With AirDroid Business, Transtira improved productivity, reduced labor and costs, and ultimately enhanced their logistics operations.

Impact of Kubernetes cluster maintenance on application availability

In this video, we will be exploring an interesting scenario that might happen in real life. Let's imagine we have an application running in a Kubernetes cluster inside EKS. If for any reason, two of our three nodes are cordoned and can't be scheduled anymore, what would happen to our users should the last node be cordoned as well? And what if we need to reschedule something?

Checking your observability and communication platforms with Reliably

#reliably #chaosengineering #honeycomb #slack #resilience
In this video, we will use a chaos engineering experiment, that we expect to fail, to verify our open tracing and communication platforms are correctly set up. Using the Honeycomb and Slack integrations provided by Reliably, we will send traces and messages and observe if they are triggered as expected.

Eliminating Bias in Machine Learning: Gold In, Garbage Out

Data scientists have long been aware of the concept of “garbage in, garbage out” — the idea that the quality of results is a direct indicator of the quality of data. Indeed, much effort has been expended in the pursuit of cleansing data to ensure its accuracy. It then should come as no surprise that AI and machine learning (ML) algorithms are also subject to the same quality standards.

One Year of Automation, 100K Staff Hours Saved: A Telco Giant's Big Gain

A leading mobile communications company, based in South Africa, had big plans for its growth in the upcoming months. To ensure customer loyalty as they continued to grow their subscribers, they had to make sure their networks evolved while maintaining performance. That meant the organization’s IT and network teams needed a way to support business goals for growth with their existing capacity.

Incident Management: A Complete Introduction

In the dynamic landscape of IT operations, incidents are bound to occur. Incident management is a structured and proactive approach to address and resolve these unexpected events promptly and effectively. It forms a crucial component of IT service management (ITSM), ensuring smooth operations and minimizing the impact of incidents on an organization’s productivity and customer experience.

How Qovery Could Have Saved Time and Effort in Compare the Market's EKS Migration

During the AWS summit in London, Renee Hunt, the CTO of Compare the Market, shared their journey of migrating from EC2 to EKS and the obstacles they faced along the way. As I listened to their story, I couldn't help but think about how Qovery could have greatly streamlined its migration process; here is my take on the subject.

Maximize Long-Term Savings From Cloud Providers with Densify

One of the first considerations for FinOps teams trying to lower their public cloud spend is investing in long-term savings vehicles available from their Cloud Service Provider. These programs can provide customers with upwards of 72% savings off on-demand prices, in return for a 1-to-3-year usage commitment, so it’s pretty common that we see them in use by our customers.
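
As a rough illustration of the trade-off described above, the sketch below compares paying on demand with paying for a commitment at a given discount. The function names and the $1.00 hourly rate are hypothetical, and the model makes a simplifying assumption: the committed hours are billed in full whether or not they are used.

```python
def committed_cost(on_demand_rate, hours, discount):
    """Cost of committing to `hours` at a fractional discount off on-demand.

    Assumption: the full commitment is billed even if the capacity sits idle.
    """
    return on_demand_rate * hours * (1.0 - discount)


def on_demand_cost(on_demand_rate, hours, utilization):
    """Cost of paying on demand for only the fraction of hours actually used."""
    return on_demand_rate * hours * utilization


def break_even_utilization(discount):
    """Utilization above which the commitment is cheaper than on-demand."""
    return 1.0 - discount


if __name__ == "__main__":
    rate = 1.00          # hypothetical on-demand price per hour
    hours = 24 * 365     # one year of hours
    discount = 0.72      # the "upwards of 72%" figure quoted above

    print(round(committed_cost(rate, hours, discount), 2))
    print(round(on_demand_cost(rate, hours, 0.50), 2))
    print(round(break_even_utilization(discount), 2))
```

Under these assumptions, a 72% discount pays off once the resource is in use more than about 28% of the time. Real programs differ in payment options and flexibility, so treat this as a back-of-the-envelope model only.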

Private Connectivity to IBM Cloud - Portal Demo

Brian Bowman, Solutions Architect at Megaport, will guide you through the process of setting up private connectivity to IBM Cloud via IBM Direct Link, using both the Megaport portal and the IBM Cloud Management Console. This video is perfect for customers who want to enhance their throughput capabilities and establish a secure connection to the IBM Cloud. Get ready to learn and take action!

Monitoring Kubernetes with Prometheus

In part I of this blog series, we understood that monitoring a Kubernetes cluster is a challenge that we can overcome if we use the right tools. We also understood that the default Kubernetes dashboard allows us to monitor the different resources running inside our cluster, but it is very basic. We suggested some tools and platforms like cAdvisor, Kube-state-metrics, Prometheus, Grafana, Kubewatch, Jaeger, and MetricFire.

The 15 Best Cloud Reporting Tools To Consider In 2023

Cloud infrastructure and the applications it hosts generate millions of rows of raw data in short periods. Viewed in the context of everyday business activities, most of the data makes little sense. So many companies amass mountains of this type of data in data lakes and data warehouses, hoping to figure out how to use it later but end up never tapping into it.

DevOps language trends 2023: Top tools used by elite software delivery teams

As organizations continue to embrace CI/CD and DevOps in their quest for shorter, more reliable delivery cycles, the choice of programming languages becomes even more critical. The language used to build your applications can affect everything from developer happiness and productivity to your organization’s performance on the four key software delivery metrics.

Neon and Qovery - The Perfect Match for Preview Environments

Hey guys, it's Romaric from Qovery. In this video, I'll show you how to combine Neon, a Postgres serverless solution, with Qovery to easily create and clone Postgres serverless instances. I'll walk you through the process step by step, demonstrating how to spin up a new serverless instance from Neon and connect it to a to-do application. The key point is that with Neon, you can create a branch from the original environment, make changes in the branch, and those changes will only affect that branch, not the parent environment. It's a powerful feature that allows for easy experimentation and isolation. So let's dive in and see how it works!

4 Ways a Consistent Schema Drives More Value From Your Observability Data

One of the hardest challenges in computer science is deciding what to name things. Adoption of consistent nomenclature is difficult because there is no one right answer. In fact, it’s not uncommon for different teams within organizations to choose different names for the same technologies. In the world of monitoring and observability, this can create quite a lot of confusion – not to mention wasted resources.

Canonical Kubernetes 1.28 is now generally available

Following the release of upstream Kubernetes on 15th of August, Canonical Kubernetes 1.28 is generally available in the form of MicroK8s, with Charmed Kubernetes expected to follow shortly. We consistently follow the upstream release cadence to provide our users and customers with the latest improvements and fixes, together with security maintenance and enterprise support for Kubernetes on Ubuntu.

Troubleshooting and Fixing Kubernetes CrashLoopBackOff

In this post, we'll dive into what CrashLoopBackOff actually is and explore the quickest way to fix it. Fasten your seat belts and get ready to ride. Everyone working with Kubernetes will sooner or later see the infamous CrashLoopBackOff in their clusters. No matter how basic or advanced your deployments are, and whether you have a tiny dev cluster or an enterprise multi-cloud cluster, it will happen anyway.
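
For context on the name: the kubelet restarts a failing container with an exponentially growing delay, which the Kubernetes documentation describes as 10s, 20s, 40s and so on, capped at five minutes. A minimal sketch of that schedule follows; the function name and parameters are hypothetical and are not part of any Kubernetes API.

```python
def backoff_delays(restarts, base=10, cap=300):
    """Return the wait (in seconds) before each of the first `restarts` restarts.

    The delay doubles after every failed restart and is capped at `cap` seconds.
    """
    return [min(base * 2 ** attempt, cap) for attempt in range(restarts)]


if __name__ == "__main__":
    print(backoff_delays(7))
```

This is why a crashing pod appears to restart quickly at first and then sits in the CrashLoopBackOff state for minutes at a time.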

Unveiling Komodor's Network Mapping Capability

I am happy to share that thanks to the power of the open-source community, and our friends over at Otterize, we have now enhanced our Kubernetes offering for developers with another visual aid to streamline operations and troubleshooting – Dependencies Map. The Otterize network mapper is a zero-config tool that aims to be lightweight and doesn’t require you to adapt anything in your cluster.

Prometheus vs. AppDynamics

In the fast-evolving landscape of technology and software applications, ensuring optimal performance and reliability has become paramount. This article delves into two powerful tools that facilitate effective monitoring and management of digital systems: Prometheus and AppDynamics. With a focus on different aspects of application performance, these tools offer distinct advantages to businesses aiming to elevate their user experiences and operational efficiency.

More than downtime: the explicit costs of poor incident management

A cold fact of SaaS Life™ is that you can’t make money when your product or website doesn’t work — and those lost dollars add up fast. Downtime, SLA breach paybacks, compliance fines, and other explicit costs are the easiest to quantify and they’re what most people think of when they think about incidents.

Advance Azure message failure tracking for efficient business transaction management

In today’s fast-paced business world, organizations rely heavily on data to make wise decisions and streamline operations. Efficiently managing and accessing critical information, such as customer data, orders, and bills, is essential for the success of any business. Serverless360 Business Activity Monitoring offers advanced search capabilities and many other features that empower businesses with a self-service portal to find and retrieve relevant data quickly and easily.

What Is Universal Artifact Management, and Why Is It Beneficial?

Dive into the world of universal artifact management with our comprehensive guide. Discover the role of Cloudsmith in streamlining software artifact management, the advantages of a cloud-native approach, and the tangible benefits of a dedicated platform for software distribution. Navigating the software development world can sometimes feel like deciphering a new language.

Database Migrations in the Era of Kubernetes Microservices

In our extensive guide to CI/CD best practices, we included a dedicated section on database migrations and why they should be completely automated and given the same attention as application deployments. We explained the theory behind automatic database migrations, but never had the opportunity to talk about the actual tools and give some examples of how database migrations should be handled by a well-disciplined software team.

From Novice to Ninja: A Deep Dive into Managing Permissions with AzCopy

AzCopy, a command-line utility designed by Microsoft, is the bridge that links data transfer and data management within Azure. Targeting seasoned professionals, it offers a granular level of control, especially when it comes to permissions. Managing permissions is not just about controlling access; it’s about maintaining the integrity of data, ensuring compliance, and optimizing operational efficiency.

Simplify Building Applications for Kubernetes - Civo Navigate NA 2023

In this talk, Robert Sirchia discusses simplifying the process of building applications in Kubernetes using Epinio, an open-source tool. He distinguishes between building applications for Kubernetes and building applications in Kubernetes, emphasizing the importance of having "just enough" knowledge about Kubernetes for developers. Epinio streamlines the deployment process, allowing developers to focus on their code without worrying about containerization.

Establishing a Kubernetes cost management strategy

Kubernetes (k8s) adoption has skyrocketed since its 2014 introduction, and it has become one of the most popular open source container orchestration platforms thanks to its power and flexibility. K8s reduces costs by improving efficiency, optimizing resource use, and eliminating redundancies. But cost savings in Kubernetes can be tricky to maintain. In fact, a 2021 survey of 178 organizations showed that costs associated with k8s can actually increase when monitoring is insufficient, resulting in overspend.

The Azure Automation Series - episode 2

In Episode 2 of The Azure Automation Series, delve into the world of Azure Automation with Sathish Veerapandian, a seasoned Microsoft MVP and Cloud Architect. Explore the potential of Azure Automation for managing crucial automation tasks in Microsoft Azure. Uncover the key aspects of Configuration Management, including inventory tracking, change monitoring, and desired state configuration.

The Azure Automation Series - episode 3

Embark on a journey of process automation in the thrilling conclusion of The Azure Automation Series! Join Sathish Veerapandian, a Microsoft MVP and automation expert, as he unveils the incredible capabilities of Azure Automation's process automation component. Learn how to effortlessly handle recurring tasks, automate repetitive processes, and optimize your workflow efficiency.

What is eBPF?

eBPF, or Extended Berkeley Packet Filter, is a kernel technology available since Linux 4.4. It lets developers run programs without adding additional modules or modifying the kernel source code. Think of it as a lightweight, sandboxed virtual machine (VM) within the Linux kernel that lets you run Berkeley Packet Filter (BPF) bytecode that uses certain kernel resources. Utilizing eBPF removes the need to modify the kernel source code and improves the software’s capacity to use existing layers.

Pro Tip: Enhance Your Coding Environment with the Colorful Themes of Rainglow for VS Code

If you're looking for some fresh new themes to really vibefy your coding experience, then check out the Rainglow extension for VS Code! These themes are designed to give you a stylish and modern coding experience that you'll love. From dark and modern to light and airy, these themes are perfect for any developer looking for a fresh new look. Give the Rainglow extension a shot and see for yourself how it can change your coding experience!

Docker on Mac - a lightweight option with Multipass

For those looking for a streamlined, lightweight command line interface for Docker on Mac, look no further. Multipass is a flexible tool that makes it easy to create and run Ubuntu VMs on any platform, and it comes with built-in tools that make running applications like Docker feel native on platforms such as macOS.

Controlling Our Destiny: Building When Open-Source Is No Longer Open-Source

The dev world was on fire this weekend, as news of yet another major open-source project was revealed to be in the midst of an identity crisis. The unsettling trend is clear: hit a certain adoption threshold, and then swap the licensing in an attempt to turn dedicated fans into revenue streams. With more companies searching for a sustainable business model and attempting to appease shareholders, the only certainty we have is, what was free yesterday, might be paid tomorrow.

Azure Blob Storage Versioning: A Step-by-Step Guide

In the multifaceted world of cloud computing, managing and safeguarding data becomes paramount. Azure Blob Storage Versioning serves as a pivotal feature within the Microsoft Azure platform, providing the essential capacity to control and maintain various versions of data. Whether you’re a small business owner worried about accidental deletions or a large corporation dealing with regulatory compliance, understanding Azure Blob Storage Versioning is crucial.

How Much Does Twitter Spend On AWS And Google Cloud?

It's been a rollercoaster ride at Twitter recently. News of layoffs with "50% higher than legally required" severance pay has dominated the headlines. But Twitter’s new owner, Elon Musk, is also trying to optimize cloud infrastructure costs. Some of the tactics used include refusing to pay a $70 million AWS bill and $8 million to a software services company. More on that below.

Is your cloud provider telling you everything, everywhere, all at once?

Today the Internet IS the new enterprise network your organization relies upon. However, most of your key applications and systems are outsourced to the cloud. In fact, huge parts of your Internet Stack are either outsourced to the cloud or to 3rd-parties who themselves rely upon the cloud. And that's an issue because if any of those cloud-based services go down, your network is going to be impacted.

Maximizing Efficiency and Savings: Explore Virtana's Latest Innovations to IPM and Cloud Cost Management

In today’s rapidly evolving business landscape, where IT infrastructure and cloud costs play a pivotal role, organizations demand advanced solutions that streamline operations, optimize performance, and drive cost efficiency. Virtana, a trailblazer in infrastructure monitoring and observability and true multicloud cost management, has taken another leap forward by introducing a host of groundbreaking features to our flagship products.

CloudOps: Transforming IT Operations in the Cloud

CloudOps, or Cloud Operations, is quickly becoming the standard for managing IT operations in the cloud computing ecosystem. By transforming traditional IT operations to harness the full potential of the cloud, businesses are experiencing greater automation, collaboration, agility, and resilience. This article is a deep dive into the concept of CloudOps, its core components, the advantages it offers, and the steps necessary to implement it effectively within an organization.

Air-Gapping Should Be Head-Slappingly Obvious

When you think of air-gapped security, you imagine a protective distancing that separates your sensitive data from those who would steal it. In practice, the separation is a disconnection from the Internet. If no one can get to your data, no one can steal it. However, air-gapped deployments that are completely disconnected from the Internet are not the case in all instances. It’s true that many clusters are fully air-gapped, particularly in classified government installations.

What is data portability & why should businesses care about it?

If you wanted to switch from one project management tool to another, what would happen to your data? As more and more business operations undergo digital transformations and move online, organizations have become more reliant than ever on their digital data. Many take for granted that their data is their own and that they can take it with them if they change tools. But for those who rely on third-party tools, control of data may not be so simple.

5 Steps to An Easy, Error-free Load Balancer Sanity Reboot

It’s one of the main use cases fit for network automation: the load balancer sanity reboot. Not to be left to manual executions, this long, tiresome task creates too much possibility for errors. Negative effects like unnecessary time and money spent are detriments that organizations can avoid, simply by automating the types of tasks—like load balancer sanity reboots—that include loads of repetitive steps.

eCommerce Load Testing (Step 3): Build the Environment

In this webinar clip from "Ensuring performance: How major retailers leverage user traffic to validate code changes", Speedscale Co-founder, Nate Lee, explains what to consider when building the environment, including backend dependencies and data. He covers how service mocking can help companies test at a higher velocity in today's complex development environments.

Replicating Traffic: Use Cases & Benefits

In this webinar clip from "Ensuring performance: How major retailers leverage user traffic to validate code changes", Speedscale Co-founder, Nate Lee, talks about why traffic-based testing (unlike manually writing script tests) can help companies move faster and test at the speed of development. He covers the top use cases and benefits of leveraging traffic as the new way to test.

Leveraging Neon's Serverless Postgres with Qovery Preview Environments

At Qovery, we are committed to ensuring our users have access to the best development tools in the industry. That’s why we’re excited about Neon — a state-of-the-art serverless Postgres solution. When used in conjunction with Qovery's Preview environments, Neon supercharges your development pipeline.

How a Service Mesh enhances EdgeComputeOps - Civo Navigate NA 2023

In this talk, Marino Wijay discusses Edge Compute Ops and the significance of service mesh technology. He explores the evolution of edge computing, its various forms, and the challenges it poses. The talk highlights Ambient Mesh, a sidecar-less mode of Istio service mesh, as an ideal solution for edge computing due to its adaptability and security features.

Behind the Scenes: Mattermost OpenOps AI Mindmeld | August 15, 2023

Tune in for a behind-the-scenes discussion on the advancement of Mattermost's AI tools and how they're being integrated into the team's current projects. In this installment, the team explores the bot's capabilities, including how much it can remember, and evaluates whether they should focus their efforts on improving the AI model or adding more features to the AI plugin, among other things.

How Real-Time Asset Tracking Transforms Data Center Operations

Using real-time asset tracking, data centers can identify the exact location of any given asset at any given time. This can be extremely useful in ensuring that available resources are optimally utilized. For instance, understanding the current status and location of servers, racks, and other hardware can streamline maintenance operations and reduce downtime.

The importance of a network backup tool

Configurations are regarded as the core of networks due to their importance. With businesses continually advancing and relying on networks for storing, processing, and transmitting critical data, the complexity of network management has increased, leading to difficulties and human errors that can cause significant network downtime.

How to ensure business continuity with IT infrastructure support

Picture this: you’re on a dream vacation with your family on a serene tropical island. The weather is perfect, the sea is mesmerising, and you’re ready to enjoy a relaxing day at the beach. Just as you’re about to unwind, your phone rings: it’s your manager calling to inform you that your IT infrastructure is down, and you need to fix it immediately. If this scenario sounds all too familiar, you’re not alone.

The Sound of Code: Instrument with OpenTelemetry - Civo Navigate NA 2023

Join Henrik Rexed in this insightful talk as he explores "The Sound of Code" and demonstrates how to instrument your code with OpenTelemetry for improved observability. OpenTelemetry enables the generation of traces, metrics, and logs, providing valuable insights into application performance and troubleshooting in production environments. The talk covers the components of OpenTelemetry, how to customize telemetry data, and the importance of context in observability solutions.

Demo: Summarizing all unread messages in a channel with one click | August 14, 2023

In this video, Jesús Espino shows a demo of a new feature that enables users to summarize all unread posts in a channel with a mouse click. This video shows the interface, but the backend work isn't ready yet; we expect this will work as designed once the AI backend is connected.

Why should I consider Salesforce Express Connect?

In this blog, we look at why more businesses are considering Salesforce Express Connect for accessing their Salesforce applications – and how Console Connect can help. Salesforce needs little introduction. It has become the CRM of choice for enterprises, now counting 150,000 customers across various industries, including the likes of Spotify, Amazon Web Services, and U.S. Bank.

How Modern Retailers Approach Software Testing (4 Steps)

In this webinar clip from "Ensuring performance: How major retailers leverage user traffic to validate code changes", Speedscale Co-founder, Nate Lee, shares how modern retailers are thinking about and approaching software testing in increasingly dynamic and complex environments.

eCommerce Load Testing (Step 2): Capture Traffic

In this webinar clip from "Ensuring performance: How major retailers leverage user traffic to validate code changes", Speedscale Co-founder, Nate Lee, explains what comes next after selecting the business-critical APIs to test first: capturing traffic. Capturing traffic for the service can be done through tools like Speedscale, GoReplay, VCR, K6, and JMeter.

Netreo How To: Troubleshooting No Alerts Received

Too many alerts are frustrating and have an even worse trickle-down effect on IT teams. When alert deluge turns to alert fatigue, critical issues may be ignored. But seasoned IT pros will tell you that receiving no alerts causes even greater distress. When monitoring tools go silent, user complaints are sure to follow. Even with Netreo’s high-value, intelligent alert management capabilities, issues can go undetected from time to time.

Exploring the Intelligence Gap: Robocop and Cyber Criminals

As technology professionals, we must consider the evolution of security and its connection to literature, such as George Orwell’s “1984” and Aldous Huxley’s “Brave New World.” The digital threats we face are often unseen, lying dormant until they can be weaponized for both good and evil purposes. Advancements in machine learning and algorithms have revolutionized data analysis, allowing us to observe and analyze behavioral patterns both online and offline.

But It's Not Our Fault! When Third-party Incidents Affect Your Service

Very few SaaS products exist completely independently. Between cloud service providers, payment processors, content delivery networks, and more, chances are you rely on external systems to keep your product working. When these systems fail, it can leave you feeling pretty helpless. In some cases you might have fallback options, but oftentimes all you can do is wait for recovery and clean up the fallout.

Managing Cloud Cost Integration During A Merger Or Acquisition

Merging two companies into one comes with several challenges. One of the most formidable endeavors is maintaining control over cloud costs while a cloud-to-cloud integration — essentially, taking the cloud environments of two companies and combining them into one — is taking place. One or both companies may not be tracking cost data, and even if they are, it’s highly unlikely that their cost-tracking strategies line up well enough to merge seamlessly.

GitKraken Client is Migrating From Libgit2 to Git Executable

In a world where companies pivot for breakfast, lunch and dinner – it’s rare to see a tech house commit to a large, time intensive project. Big projects take a long time to implement. Sometimes too long. But we asked ourselves, “Can we give users an early look at some of the stuff we’re working on?” We wanted something that was in-between a public beta and our own GitKraken preview states.

Efficient Kubernetes Cluster Management: Building Infrastructure-Agnostic Clusters with Cluster API

With the widespread adoption of Kubernetes, the Cloud Native Computing Foundation (CNCF) ecosystem has evolved to include projects that address the challenges of using a container orchestrator system. One such challenge is managing and deploying clusters, which can become complex as organizations scale their Kubernetes requirements. Fortunately, Cluster API (CAPI) provides a solution.

What is an API Gateway?

API Gateways are vital components in today's digital landscape, facilitating seamless communication between systems and applications. To ensure optimal performance, monitoring API Gateways is crucial. MetricFire offers a comprehensive monitoring platform that tracks and analyzes key metrics, providing real-time insights into performance indicators such as latency, error rates, and throughput.

Best Practices for implementing DevOps in Organizations

Are you curious about DevOps and how it’s transforming the world of technology? Look no further! In this blog, we will dive into the fascinating world of DevOps and explore its significance and need in today’s fast-paced digital landscape. From its definition and importance to real-world examples of epic fails and their solutions, we’ll cover it all. So, grab a cup of coffee, sit back, and let’s embark on this DevOps journey together!

Azure Monitoring Agent: Key Features & Benefits

In today's rapidly evolving digital landscape, businesses increasingly rely on cloud computing and infrastructure to support their operations. As organizations migrate their workloads to the cloud, robust monitoring and management tools are paramount to ensure optimal performance, security, and efficiency. In response to this demand, Microsoft Azure has introduced the Azure Monitoring Agent (AMA), a powerful and versatile solution designed to enhance the monitoring capabilities of Azure resources.

Top 5 network management trends in 2023

New trends emerge in network management every year, and 2023 is no exception. This year the industry is set to witness a plethora of advancements and breakthroughs that will revolutionize network administration. From the adoption of sophisticated analytics and machine learning to the proliferation of cloud-based solutions and the surging significance of cybersecurity, here are the top network management trends to watch out for in 2023.

Ensuring performance: How major retailers leverage user traffic to validate code changes

As featured on CMG.org: Software development and testing is ultimately all in preparation for go-live. But what if you could predict how your go-live could go wrong? In this webinar, learn how traffic-based tests and mocks can accurately simulate peak load conditions, ensure performance, and increase your top line revenue.

How to mitigate the challenges of data growth

Over the last decade, I’ve rarely met a data professional whose organization wasn’t experiencing data growth and making more demands of their data. We build and deploy new applications faster than we retire old ones, and new data is accumulating dramatically faster on our existing systems than our ability to decide to delete older information. Additionally, the ever-growing number of users and devices interacting with that data increases the strain on the infrastructure underpinning it.

Managing Kubernetes Log Data at Scale | Civo Navigate NA 2023

In this talk by Matt Miller from Edge Deltas, we delve into the world of managing Kubernetes log data at a significant scale. Matt breaks down different strategies for handling log data, including Kubernetes native tools, persistent storage, open-source log collection, and using centralized log vendors. He explores the benefits and drawbacks of each approach, particularly focusing on the challenges of scalability and cost when dealing with large volumes of data. Matt also shares insights on how to optimize data management using an intelligent edge-first approach.

5 Best FinOps Tools Companies Should Consider In 2023

Cloud financial management has become an increasingly sophisticated business function over the past few years, far beyond what many cloud stakeholders anticipated. As a result, many companies are unable to monitor and allocate cloud costs accurately. In 2022, Gartner estimates companies will waste as much as 30% of their cloud budgets due to this problem. A more holistic approach to cloud cost management is necessary to reduce waste and optimize cloud spending.

Agentless Network Monitoring: An Introductory Guide

From communication and collaboration to data storage and sharing, networks are critical to almost every business operation today. Thus, monitoring the reliability and security of your network infrastructure is more critical than ever. Network monitoring entails observing and analyzing network traffic to identify issues, optimize performance, and ensure security.

The Essential AzCopy Cheat Sheet

AzCopy is a command-line utility designed for copying data to and from Microsoft Azure Blob and File storage. It allows for efficient data transfer, ensuring the integrity of the files and offering a seamless process. In a world where data is considered a valuable asset, AzCopy stands out as a vital tool for data administrators, developers, and Azure users.

VMware Tanzu Application Service and MySQL: Better Together

VMware SQL with MySQL for Tanzu Application Service is a top choice for customers seeking a multi-cloud, easy-to-use, on demand MySQL service for enterprise applications. Customers who have adopted our solution affectionately refer to it as MySQL tile. Our solution provides tangible benefits over open source and third-party offerings for the VMware Tanzu Application Service platform. To call out a few.

How To Write Incident Postmortems

Writing a public postmortem regarding an outage is essential to maintaining transparency and accountability when things go wrong in a service or system. The purpose of writing a postmortem is to analyze and document an incident or event that has occurred, usually with a focus on identifying its root causes, understanding what went wrong, and outlining steps to prevent similar issues from happening in the future.

Where do YOU stand on DORA metrics?

Where does your team stand on DORA metrics? The Jackpocket software development team wanted to know that, too, so they implemented Sleuth to find out. Now, they know exactly where they stand AND where they're making progress. Give Sleuth a try and see how we empower software teams to build faster by making engineering efficiency easy to improve and measurable — in a way that both managers and developers love.

Introducing the Telemetry Cloud: An All-In-One Observability Platform All Enterprises Can Afford

We’re excited to announce that we just released the next-generation of our observability platform – the Circonus Telemetry Cloud™. Here’s a closer look at what it is and why we think it’s a standout in the monitoring and observability space.

How to monitor Azure App Registration Client Secret Expiration Notification?

Security remains a paramount concern in the rapidly evolving landscape of cloud computing. Azure Active Directory (Azure AD) is a cornerstone for securing applications and services within the Azure ecosystem. Azure App Registrations offer a crucial mechanism to manage application identities and enable secure authentication and authorization. However, the expiration of client secrets associated with these app registrations can introduce security vulnerabilities.
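
As a rough illustration of the expiration check described above, the sketch below flags secrets that have expired or will expire within a warning window. The record shape (`name`, `expires_at`) and the 30-day default are assumptions made for this example, not the format returned by Azure AD.

```python
from datetime import datetime, timedelta, timezone


def expiring_secrets(secrets, now, warn_within_days=30):
    """Return names of secrets that are expired or expire within the window.

    `secrets` is a list of dicts with hypothetical keys `name` and
    `expires_at` (a timezone-aware datetime).
    """
    cutoff = now + timedelta(days=warn_within_days)
    return [s["name"] for s in secrets if s["expires_at"] <= cutoff]


# Example data, invented for illustration only.
now = datetime(2023, 8, 1, tzinfo=timezone.utc)
secrets = [
    {"name": "billing-api", "expires_at": datetime(2023, 8, 15, tzinfo=timezone.utc)},
    {"name": "reporting", "expires_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"name": "legacy-sync", "expires_at": datetime(2023, 7, 1, tzinfo=timezone.utc)},
]
flagged = expiring_secrets(secrets, now)
```

Already-expired secrets are deliberately included in the result, since they are the most urgent case for a notification.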

AI With a Purpose: Top Moves for Strategically Applying IT Automation

Artificial intelligence (AI) isn’t the only thing at the heart of what organizations are doing to keep up with digital transformation and drive business growth. People are, too. Development of AI actually began about 40 years ago, but generative AI (genAI) has been around for much less time. The explosion of genAI has brought about an everlasting, first-of-its-kind innovation that’s accessible to just about everyone.

Maximizing Coding Productivity with Large Language Models

Learn how to maximize developer productivity by leveraging large language models for rapid code refactoring. Large language models like ChatGPT have tremendous potential to automate repetitive coding tasks and boost team effectiveness. In this MAAS Show And Tell, Peter Makowski, Senior Web Engineer at Canonical, shares insights and a real-world example of using an LLM for a successful large-scale migration of hundreds of tests from Enzyme to @testing-library/react.

Key questions to ask when setting SLOs

Many organizations rely on service level objectives (SLOs) to help them gauge the reliability of their products. By setting SLOs that define clear and measurable reliability targets, businesses can ensure they are delivering positive end-user experiences to their customers. Clearly defined SLOs also make it much easier for businesses to understand what tradeoffs they may have to make in order to deliver those specific experiences.
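
One way to make the tradeoff concrete is to turn an availability SLO into an error budget. The sketch below is a minimal example under assumed inputs (a 30-day window and downtime measured in minutes); the function names are illustrative and do not come from any particular tool.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the allowed downtime, in minutes, for an availability SLO."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    if budget == 0:
        raise ValueError("a 100% target leaves no error budget")
    return 1.0 - downtime_minutes / budget
```

With a 99.9% target over 30 days, the budget comes to roughly 43 minutes of downtime, which is the number a team would weigh against the cost of shipping riskier changes.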

The Unplanned Show, Episode 8: Platform Engineering with Martin Van Son

In this episode, Martin Van Son provides a simplified definition of platforms in this context: a way for internal users to request anything from environments to deployments. The platform engineering comes in because someone needs to own stitching together and automating away all the complexity involved to complete that action. In the end, both the consumers and the creators save time. Furthermore, platform engineers have an opportunity to encode best practices and cost saving measures that are often forgotten when users are left to their own devices.

Restarting Kubernetes Pods: A Detailed Guide

This blog will help you learn all about restarting Kubernetes pods and give you some tips on troubleshooting issues you may encounter. Kubernetes pods are one of the most commonly used Kubernetes resources. Since all of your applications running on your cluster live in a pod, the sooner you learn all about pods, the better.

Git a Handle on it! A Scalable Approach to GitOps Configuration Patterns - Civo Navigate NA 2023

Discover the world of GitOps with this presentation from John Dietz where he dives into scalable configuration management patterns. The talk emphasizes the challenges faced by developers and administrators while adopting GitOps practices and outlines various scalable strategies and best practices to address these. It presents a variety of patterns, tools, and tactics to manage configuration as code, contributing to improved system stability, team collaboration, and delivery speed.

Rootly Raises $12 Million from Renegade Partners, Google Gradient Ventures, & XYZ Ventures

We are excited to announce that we have raised a $12M round of financing led by Renegade Partners with participation from Google Gradient Ventures (Google’s AI-focused venture fund) and XYZ Ventures. This brings our total funding to date to $15.2M ($20M CAD) alongside our other existing investors Y Combinator and 8VC.

July 2023 newsletter: Changelog-The Deluxe Edition

🎵 Gotta give the people, give the people what they want! 🎵 You've been asking. And we've been listening. Over the past few weeks, we've been shipping frequently requested features to help you bring your incident management to the next level. It may be the dog days of summer, but let's ignore that, yeah? Just take a look at this recent changelog. Note that this is the biggest one we've ever published.

Release 1.42.0 - Integrations Marketplace, SystemD Journal Function, and more!

The Netdata Team is very excited to introduce you to all the new features and improvements in the new version. HIGHLIGHTS: There is now a beta version of the Netdata Marketplace with this release, containing more than 800 integrations, directly from the Dashboard! For each integration, all the information required to get it up and running is included, along with info about metrics, alerts and more!

Checkly Advances Monitoring as Code with New User-Centric Features

Checkly, the leading provider of monitoring solutions powered by a Monitoring as Code (MaC) workflow, has unveiled two groundbreaking features: the Activity Log and Code Exporter. These innovative features not only enhance transparency and simplify the adoption of MaC practices but also mark a significant step forward in Checkly's commitment to advancing the MaC movement, offering users an end-to-end workflow that integrates seamlessly with modern software development practices.

AzCopy and Azure File Sync: How They Work Together

In the ever-expanding landscape of Azure data management, two powerful tools emerge as essential assets for tech professionals: AzCopy and Azure File Sync. While each has its unique capabilities, together they create an intricate symphony that enhances data transfer and synchronization within Azure. In this comprehensive guide, we’ll unravel the functionalities of both, explore their common use cases, delve into their integration processes, and weigh their benefits and drawbacks.

Scaling Software Delivery: Continuous Delivery, Overcoming Challenges, and the Power of Cloudsmith

Explore the intricacies of scaling software delivery, from the nuances of continuous delivery to overcoming common challenges. Dive deep into how Cloudsmith can be the game-changer in your DevOps journey, ensuring agility, security, and efficiency in every release. Every business, from startups to established enterprises, feels the urgency to scale their software delivery. Why?

Modernizing the Air Force: DAFITC 2023

D2iQ is excited to be participating in the Department of the Air Force Information Technology and Cyberpower (DAFITC) 2023, in Montgomery, Alabama, from August 28-30. The theme of this year’s DAFITC conference is “Digitally Transforming the Air & Space Force: Investing for Tomorrow’s Fight.” Digital transformation of the Air Force and Space Force is part of a wider modernization effort that is accelerating across all U.S.

How to optimize your high-performance computing workloads

One contributor to successful research is the ability to process and analyze enormous amounts of data. The speed of accessing information and performing computations is crucial to research institutions and organizations looking to innovate and lead in their domain. Yet to process and analyze all that data, researchers need high-performance computing resources that can handle these workloads.

AWS Data Centers Today: 100+ Locations, 1.5 Million Servers, and More

In 2006, Amazon Web Services (AWS) pioneered cloud computing as we know it today. And with about 34% of the market share, Amazon Web Services remains the largest cloud infrastructure provider today, followed by Microsoft Azure and Google Cloud Platform (GCP). AWS is a Cloud Service Provider (CSP). That means AWS builds, maintains, and continually improves multiple data centers worldwide and then rents the computing infrastructure out to other companies as needed, over the Internet.

G2's Most Recommended WAF & DDoS Protection

In case you missed it, HAProxy Technologies recently put out a press release about our stunning leadership position in G2’s Summer 2023 Grid® Reports for load balancing. We’re incredibly proud of these results, which are a direct result of the hard work and dedication of HAProxy’s community developers and our enterprise product and support teams. Looking at the Momentum Grid® Report for Load Balancing, the gap between HAProxy and the rest is impossible to ignore.

What is virtualization? A beginners' guide.

While information technology continues to evolve rapidly, virtualization remains a cornerstone of modern computing, enabling businesses to maximise resource utilisation, enhance flexibility, and reduce the total cost of ownership (TCO). It is a key building block of the cloud computing paradigm, and millions of organisations use it daily worldwide. All existing cloud platforms, such as AWS, Azure, Google or OpenStack, use virtualization underneath.

Monitoring as Code in Your Software Development Lifecycle

When we launched the Checkly CLI and Test Sessions last May, I wrote about the three pillars of monitoring as code. Code — write your monitoring checks as code and store them in version control. Test — test your checks against our global infrastructure and record test sessions. Deploy — deploy your checks from your local machine or CI to run them as monitors.

How to monitor CoreDNS with Datadog

In Part 1 of this series, we introduced you to the key metrics you should be monitoring to ensure that you get optimal performance from CoreDNS running in your Kubernetes clusters. In Part 2, we showed you some tools you can use to monitor CoreDNS. In this post, we’ll show you how you can use Datadog to monitor metrics, logs, and traces from CoreDNS alongside telemetry from the rest of your cluster, including the infrastructure it runs on.

Tools for collecting metrics and logs from CoreDNS

In Part 1 of this series, we looked at key metrics you should monitor to understand the performance of your CoreDNS servers. In this post, we’ll show you how to collect and visualize these metrics. We’ll also explore how CoreDNS logging works and show you how to collect CoreDNS logs to get even deeper visibility into your Deployment.

Key metrics for CoreDNS monitoring

CoreDNS is an open source DNS server that can resolve requests for internet domain names and provide service discovery within a Kubernetes cluster. CoreDNS is the default DNS provider in Kubernetes as of v1.13. Though it can be used independently of Kubernetes, this series will focus on its role in providing Kubernetes service discovery, which simplifies cluster networking by enabling clients to access services using DNS names rather than IP addresses.
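
To illustrate service discovery by name, here is a small sketch that builds the conventional in-cluster DNS name for a Service. The `cluster.local` domain is the common default but is configurable per cluster, and the helper name is an assumption made for this example.

```python
import socket


def service_dns_name(service: str, namespace: str = "default",
                     cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve_service(service: str, namespace: str = "default") -> list[str]:
    """Resolve a Service name to IP addresses.

    Only meaningful when run inside a cluster, where the cluster DNS
    server (for example CoreDNS) answers the query.
    """
    infos = socket.getaddrinfo(service_dns_name(service, namespace), None)
    return sorted({info[4][0] for info in infos})
```

A client would then connect to `service_dns_name("payments", "shop")` instead of hard-coding a ClusterIP, so the address can change without the client noticing.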

Kubernetes Liveness Probe Guide

Kubernetes liveness probes are a critical component for monitoring the health and availability of application containers running within a Kubernetes cluster. They allow Kubernetes to determine whether a container is running as expected and take appropriate actions if it is found to be unresponsive or in an unhealthy state. Liveness probes periodically check the health of containers by sending requests to a specified endpoint or executing a command within the container.
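
The probing behavior described above can be sketched in a few lines. This is a simplified model rather than kubelet's implementation: it treats an HTTP status from 200 to 399 as healthy and recommends a restart after a run of consecutive failures (3 is used here as the threshold).

```python
import urllib.error
import urllib.request
from typing import Iterable


def http_check(url: str, timeout: float = 1.0) -> bool:
    """Return True if the endpoint answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and 4xx/5xx responses count as failures.
        return False


def should_restart(results: Iterable[bool], failure_threshold: int = 3) -> bool:
    """Return True once `failure_threshold` consecutive checks have failed."""
    consecutive = 0
    for ok in results:
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= failure_threshold:
            return True
    return False
```

Requiring consecutive failures means a single slow response does not trigger a restart; one successful check resets the count.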

9 Popular Kubernetes Distributions You Should Know About

Kubernetes has become the go-to platform for container orchestration, allowing teams to manage their containerized applications more efficiently. Vanilla Kubernetes and managed Kubernetes are the two options available when building a Kubernetes system. A team using vanilla Kubernetes must download the source code files, follow the code route, and set up the machine's environment itself.

SRE in Transition: From Startup to Enterprise

Startups are defined by “ship or die”. As a result, SRE teams at a startup should be focused on enabling product engineers to ship features as quickly as possible. As your startup transitions from “we’ll run out of money in the next 18 months” to “we have more than 1000 engineers”, how should the SRE organization evolve and provide the best value through that transition (including booting one up if you don’t have one)? I will discuss specific ways the organization needs to evolve to meet this challenge, how the SRE org can advocate for and support this change (both in direct actions and in “influence”), and how the overhang of startup technical and cultural debt can make this shift more challenging (but also more necessary).

Integrate Monitoring as Code into your Software Development Lifecycle

Learn how the new Checkly features (code exporter and activity log) enable you to integrate Monitoring as Code into your Software Development Lifecycle. Define and debug your monitoring resources during development, test your preview deployments and start monitoring productions with ease.

Kubernetes Delivers Business Value Beyond IT

Since 2018, our annual State of Kubernetes survey has consistently found that organizations achieve significant operational benefits from using Kubernetes, especially “improved resource utilization.” This year, we wanted to understand how Kubernetes impacts the business as a whole. The results are unequivocal.

ITOps vs. DevOps: what's the difference?

Titles within an organization evolve nearly as fast as the technology itself. For a long time, the title of DevOps was considered a literal interpretation of “Development” and “Operations” – a catch-all term for hybrid roles encapsulating everything from on-prem, cloud, and hybrid infrastructures, to code execution and lifecycle management. Sounds like a lot? It is.

How to Plan for a Crisis with Infrastructure-Agnostic Recovery of Kubernetes Applications

Corey Dinkens and Carol Pereira contributed to this blog post. As enterprises deploy modern containerized applications to their Kubernetes clusters, managing data protection centrally is necessary to run critical business applications, especially in multi-cloud distributed environments.

Kubernetes at the Edge with Portainer | Civo Navigate NA 2023

Dive into the world of Kubernetes at the Edge with Portainer with Neil Cresswell. In this talk, learn how Portainer transforms container management and orchestration by making Kubernetes deployment at the edge effortless and efficient. The aim is to make deploying, updating, and maintaining Kubernetes clusters on edge devices more streamlined and accessible, benefiting industries like telecommunications, healthcare, and manufacturing where low latency and high reliability are critical.

10 Crucial Engineering Metrics Must Follow In 2023

In software engineering, measuring your performance gives you the knowledge you need to make informed decisions regarding your products, features, processes, and even your dev teams. Measuring also tells you if you’re on track to meet your engineering goals. Yet, with an abundance of tasks, data, and other information to keep track of, how do you decide on the right metrics to monitor?

25+ Best Kubernetes Tools By Category In 2023

Over the past few years, Kubernetes (K8s) has become the preferred method of orchestrating containers and microservices. Its self-healing, high scalability, and open-source nature make it appealing to a wide range of users. However, deploying, running, and scaling containerized applications and microservices with Kubernetes can be quite challenging. The Kubernetes community keeps growing, but there still aren’t that many experienced K8s engineers.

Unique Challenges for Software Developers in the Public Sector

If you’re a software developer in the public sector, you’re familiar with the unique challenges public sector professionals face. From grappling with tight deadlines and limited resources to working with complex legacy systems, the journey of a public sector software dev seems to be filled with obstacles at nearly every turn.

Common AzCopy Errors and How to Fix Them

AzCopy is a command-line tool provided by Microsoft to transfer data to and from Azure Storage services like Blob, File, and Table storage. It’s a vital tool for IT professionals who handle large-scale data operations, offering an efficient way to move data where it’s needed. However, as with any robust tool, users might encounter errors or issues while working with AzCopy. In this article, we’ll explore some common AzCopy errors and provide solutions on how to fix them.

Why Andy Warhol would like - and dislike - AI

In a series of blog posts about AI, I’ve been looking at how intelligent ChatGPT is, how good ChatGPT and Bing are when you employ them as a technology writer, and how the engineering team at Redgate is using GitHub Copilot to aid with writing code. Now it’s time to take a look at image creation tools, and where better to start than Andy Warhol? I like Andy Warhol.

Unraveling AWS Lambda: Exploring Scalability and Applicability

In our previous blog, we shared our firsthand experience of implementing a tracing collector API using serverless components. Drawing parallels with Amazon Prime Video’s architectural redesign, we discussed the challenges we encountered, such as cold-start delays and increased costs, which prompted us to transition to a non-serverless architecture for more efficient solutions.

What is a MicroCloud?

A MicroCloud is a new lightweight, featureful, and straightforward cloud for on-demand computing at the edge. MicroClouds differ from IoT, which uses thousands of single machines or sensors to gather data yet does not perform computing tasks. Instead, MicroClouds reuse proven cloud primitives with unattended, autonomous, and clustering features that resolve typical edge computing challenges.

13 Key Cloud Cost Management Strategies (And How CloudZero Can Help)

Managing cloud computing costs is a big deal right now. For instance, 60% of organizations say their cloud costs are higher (over 70%) than they should be, according to our State of Cloud Cost Intelligence report. Over the last five years, several other studies have shown that controlling cloud spend is a top cloud computing challenge. Yet almost all organizations acknowledge that moving more workloads to the cloud is a top cloud initiative in the next year.

CloudZero Launches Advanced Analytics For Deeper Visibility And Savings Insights

We built CloudZero for a simple reason: Bring business fundamentals to cloud-driven organizations without stifling innovation. It sounds simple, but for years, the intrinsic complexities of the cloud and extrinsic pressures to grab SaaS market share made it near-impossible for businesses to achieve this. Until CloudZero.

Benefits and challenges of containerization for IT operations

Your IT teams are critical to improving the efficiency of your operations and ensuring long-term business scalability. But as your organization grows and demands become more complex, the challenges of managing IT operations can become difficult, especially when managing multiple applications across various server environments. Containerization has become a popular solution for some of these challenges.

Exploring distributed vs centralized incident command models

Recently in our Better Incidents Slack channel, there’s been some chatter around how people structure dedicated incident commanders at their company: distributed or centralized. The way I see it, there are two types of commanders: the temporary, distributed role — a hat that an on-call engineer or an engineering manager puts on during an incident. Then there’s the centralized, full-time role, where someone is the designated incident commander (or one of a few) for all incidents.

Stop the death march for your developers

Stop the death march for your developers. Developer toil directly leads to burnout, say software engineering leaders from LaunchDarkly, Okteto, Atlassian and Sleuth. And burnout leads to devs leaving your team. Don't let that happen -- remove toil. Give Sleuth a try and see how we empower software teams to build faster by making engineering efficiency easy to improve and measurable — in a way that both managers and developers love.

Announcing In-Place Upgrade from Ubuntu Server to Ubuntu Pro on Azure

We are pleased to share that Azure is now offering an in-place upgrade from Ubuntu Server to Ubuntu Pro. This functionality, made possible through our strategic partnership with Azure, provides a straightforward way to leverage the advanced features and extended security maintenance of Ubuntu Pro, all without redeploying your Virtual Machine (VM) or scheduling a maintenance window.

Ship faster by integrating AI into your Bitbucket workflow

AI tools have taken the world by storm. In April, we announced Atlassian Intelligence to bring the power of AI into our tools. Leveraging AI through internal models and our collaboration with OpenAI, Atlassian Intelligence will be built into the Atlassian suite of tools, including Bitbucket Cloud.

What's missing from your incident management workflow

The first fifteen minutes of an incident set the tone for the rest of the resolution process. But what makes the difference between a rapid response and a stressful scramble—clear ownership—hasn't always been easy to ascertain. In this article, we’ll cover how Cortex, an internal developer portal, can be your team’s source of truth to accelerate the incident management process, and reduce MTTR.

How to record a business process with Kosli's Audit Trail

Have you ever needed to provide proof that a critical business process actually took place? It’s a painful process involving all kinds of paperwork, but it’s the reality for many organizations working in highly regulated industries. For these companies, records need to be kept for actions like the provisioning of user accounts and access to sensitive records. It’s necessary, but it’s manual and time-consuming work.

Stay on top of every change with Kosli Notifications

In this short blog, you will learn how to set up Kosli Notifications so your whole team can stay on top of environment changes and compliance events in real time. 🚀 In fast-paced technology landscapes, understanding how systems are changing is crucial. Developers, DevOps/Platform/SRE teams, security personnel, and management all need this information to manage operational risk, resolve incidents, and just for basic communication with each other.

Is Kubernetes Too Complicated - Civo Navigate NA 2023

Join @JuliaFMorgado as she takes you through her Kubernetes learning journey and demystifies its complex architecture. She outlines some advice for those just getting started by covering topics such as nodes, pods, containers, and more. The presentation offers insights into Kubernetes' learning curve and essential components, providing valuable advice for those looking to delve into containerization and Kubernetes.

Automate network topology mapping with OpManager's topology software

Network topology mapping is the process of mapping topological relationships between network components and establishing those relationships in the form of network diagrams. Network mapping helps visualize physical and logical connections between all elements and nodes, thus simplifying network management. A network topology mapper is a tool that helps perform network mapping effectively.

How to Transform the Management and Modernization of Your Infrastructure to Maximize Business Outcomes

But whether it’s replatforming legacy applications or migrating them to the cloud, enterprise IT leaders routinely suffer from run-away costs, unforeseen complications, and out-of-control environments on the other side of the modernization process. Yet, as an enterprise IT leader you have little choice but to forge ahead.

Mastering Kubernetes Pod Restarts with kubectl

Managing containerized applications efficiently in the dynamic realm of Kubernetes is essential for smooth deployments and optimal performance. Kubernetes empowers us with powerful orchestration capabilities, enabling seamless scaling and deployment of applications. However, in real-world scenarios, there are situations that necessitate the restarting of Pods, whether to apply configuration changes, recover from failures, or address misbehaving applications.

Monitoring Redis Clusters with Prometheus

This article will outline what Redis database monitoring is and how to set up a Redis database monitoring system with MetricFire. Then we’ll show what the final graphs and dashboards look like when displayed on Grafana. We will be using Prometheus and Grafana to power the monitoring, and we'll use a simulated Redis DB to generate the data for the Grafana dashboards.

SQL Server Terms Translated into PostgreSQL

The rise in popularity of open-source RDBMSs has encouraged many organizations to adopt PostgreSQL, but as a DBA or developer, exploring a new database platform can be challenging no matter how experienced you are. SQL Server has many similarities to PostgreSQL, but there are several big differences too.

The Most Common Ways To Allocate Cloud Spend (+ The Pros And Cons Of Each)

All the major cloud providers allow users to attach business context to their infrastructure in some way. It’s this context that allows users to divide up their cloud bill into more easily digestible bites and keep track of cost trends for different resource types. Thorough cloud cost allocation gives companies the ability to make educated business decisions.
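
As a rough sketch of what tag-based allocation looks like in practice, the example below sums billing line items by the value of one tag. The line-item shape (`cost`, `tags`) and the `untagged` bucket are assumptions for illustration and do not match any specific provider's billing export.

```python
from collections import defaultdict


def allocate_by_tag(line_items, tag_key, untagged_label="untagged"):
    """Sum cost per value of `tag_key`; items without the tag go to one bucket."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, untagged_label)
        totals[owner] += item["cost"]
    return dict(totals)


# Example data, invented for illustration only.
bill = [
    {"cost": 10.0, "tags": {"team": "search"}},
    {"cost": 5.0, "tags": {"team": "checkout"}},
    {"cost": 2.5, "tags": {"team": "search"}},
    {"cost": 4.0, "tags": {}},
]
by_team = allocate_by_tag(bill, "team")
```

Keeping an explicit bucket for untagged spend makes gaps in tagging coverage visible instead of silently dropping those costs.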

Making a move: How migrating to Ubuntu saved a life insurance company 60% in costs

Balancing high performance operations against the need to reduce total operating costs is a classical dilemma faced by both large and small organisations. This dilemma becomes particularly important when you choose the foundation of your IT infrastructure: the operating system. A recent case study by Tech Mahindra, the multinational IT services and consulting firm, details how their partnership with Canonical enabled them to shift the balance for a major Fortune 500 life insurance company.

Understanding and Optimizing CI/CD Pipelines

Building, testing and deploying software is a time-consuming process that many organizations aim to minimize by automating repeatable work wherever possible. To do so, many organizations are utilizing a continuous integration, continuous delivery (CI/CD) philosophy in combination with cloud native tools like Kubernetes to develop and deploy software at scale.

Scaling Up to Keep Costs Down: Automation for Web Application Incident Management

Any organization that’s keeping up with today’s sharp rise in business demands (or better yet, getting ahead of the game) is doing so by getting innovative and jumping at the chance to do things differently. They’re not relying on the old ways or trying to use their existing toolbox. Instead, organizations are looking to the newest technologies and means of adding efficiency to as many day-to-day functions as possible.

Reliability Best Practices: How Gremlin Uses Gremlin

Ensuring software availability is essential for any SaaS company—including Gremlin. To do that, our teams need to identify the reliability risks hiding in our systems. That’s why our development, platform, and SRE teams use Gremlin regularly to perform Chaos Engineering experiments, run reliability tests, and track the reliability of our systems against our standards. Along the way they’ve picked up a thing or two about how to find and fix reliability risks with Gremlin.

Behind the Scenes: Mattermost OpenOps AI Mindmeld | July 27, 2023

Tune in for a behind-the-scenes discussion on the advancement of Mattermost's AI tools and how they're being integrated into the team's current projects. The main topics covered include using AI to create tweets, the potential of using the tool to auto-generate text that resembles a user's tone, how to improve public awareness and involvement in OpenOps, and more.

How to monitor connector's API Connections in Logic Apps?

Let us consider a scenario where a Logic App is used to communicate with SharePoint through API connections, known as connectors. When configuring the connector, it communicates with Azure AD, retrieving a username and password and continuously refreshing the authentication token. When the Logic App calls the connector, it performs operations like uploading files to SharePoint.

SMS Alerts for GitHub Actions - Civo Navigate NA 2023

Rishab Kumar, a Developer Evangelist at Twilio, shares their insights on implementing SMS alerts for GitHub Actions during an informative talk. Their presentation primarily focuses on using GitHub Actions for build and CI/CD tasks due to its efficient cost structure. However, Rishab points out a feature gap: the platform lacks the capability for SMS alerts or phone calls. To address this, they demonstrate how to configure SMS alerts in a manner akin to enterprise tooling such as OpsGenie or PagerDuty.

7 Tips for Remote Data Center Management

As data centers become increasingly decentralized, managing them remotely is now a must-have skill. Data center professionals need to maintain uptime, increase efficiency, and boost productivity across all their global sites without leaving their desk. While this might have once seemed near-impossible, with the right tools and processes, remotely managing your data center can be even better than physically being there.

New Feature: Instantly Clone Your Service

We're excited to announce the general availability of the new "Clone Service" feature on Qovery, which is built to augment the capabilities of our platform and to cater to our user needs more effectively. Qovery has always prided itself on being a user-centric platform, and this new feature continues to uphold that tradition.

Uploading Files Using AzCopy: A Detailed Technical Guide

Data has become a critical asset in today’s digital era, making its storage, management, and accessibility crucial to many organizations’ operations. Microsoft’s Azure provides a suite of cloud storage solutions designed to address these needs. Among the tools provided by Azure is AzCopy, a command-line utility designed to simplify data transfer to and from Azure Blob, File, and Table storage.

Monitoring Digital Ocean with MetricFire

Cloud monitoring is like a health check-up for our online spaces. It tells us what's going well and what we need to improve. It is critical because it lets us fix problems before they get too big and helps our online services work at their best. This article talks about how we can use MetricFire to monitor DigitalOcean environments.

Snowflake Pricing Explained: A 2023 Usage Cost Guide

Snowflake’s scalable architecture, minimal latency, advanced analytics, simplified data handling, flexible pay-as-you-go model, and always-on security make the data cloud a top choice for many businesses. You can also purchase Snowflake resources on-demand or upfront. But if you struggle to control your Snowflake costs, you’re not alone. With the help of this guide, you’ll know how to better manage your Snowflake costs.

Export Your Qovery Configuration into Terraform Manifest in One Click

At Qovery, we've always prided ourselves on the usability and convenience of our web interface. It's where most of our users begin their journey, configuring and deploying applications with ease and speed. Many users start configuring their applications on our intuitive web interface, validate the successful deployment, and then transition to writing their configurations with the powerful infrastructure-as-code tool, Terraform, utilizing the Qovery Terraform Provider.

10 Video Shorts to Level Up Your Git Game

Mastering Git can sometimes feel like trying to untangle a bunch of super tangled – and wired (old school, I know) – headphones, not sure exactly where to pull or loop or flip next. Here are 10 Git tutorial videos that’ll help you level up your Git game. These videos are the perfect combination of fast and helpful.

The Future Skills People Need to Succeed in Tech - Civo Navigate NA 23

In this talk, Tamika Reed discusses the future skills needed to succeed in the tech industry beyond coding. As the founder of Women in Linux, she shares insights into Linux administration, infrastructure building, and investing in tech to secure a prosperous career. Discover the latest trends, challenges, and opportunities in the ever-evolving tech world.

Upgrading to Azure Data Lake Gen2: A Seamless Transition

Microsoft’s Azure Data Lake Storage (ADLS) has been a vital component for organizations aiming to build scalable and secure data lakes. As technology evolves, transitioning from Azure Data Lake Storage Gen1 to Gen2 has become increasingly important. This article aims to guide readers through the essential considerations, detailed processes, and best practices involved in making this shift.

What is Docker Swarm and How Does it Work?

For most organizations, having a stable and reliable IT infrastructure is essential for success. But managing multiple servers, databases, and applications can often be difficult and time-consuming. Container orchestration is a standard solution for handling this complexity. Docker Swarm has gained popularity as a container orchestration solution because of its simplicity and scalability.

How using a database monitoring tool helps DBAs create value for the whole organization

As the size and complexity of database estates increase, with more workloads and data being hosted on more platforms, both on-premises and in the cloud, the appeal of third-party database monitoring tools has also grown. Their ability to provide a holistic view of an entire estate and monitor multiple databases and platforms from a single dashboard has been shown to save DBAs and IT teams many hours of time when compared to home-grown solutions.

IoT Dashboards with Grafana and Prometheus

The Internet of Things (IoT) is a set of physical devices connected to one network, which enables the system to interact with the external world. A great deal of the work surrounding IoT is monitoring, as it’s impossible to react without knowing the situation. For example, we might build a greenhouse system for agriculture that can maintain optimal conditions for growing crops. For this purpose, we need to have sensors picking up information about the temperature and humidity.

incident.io: A scalable incident management solution built for enterprises

For enterprise businesses, a lot is riding on the efficiency of their incident response. These organizations have large customer bases, complex products, and many incidents. They also have loads of incident responders across various roles, making it difficult to coordinate internally.

Unveiling Squadcast's Enhanced Status Pages

Meet Kevin and Mai (again): Navigating the Troublesome Waters of Platform Downtime. Kevin is a Site Reliability Engineer (SRE), constantly on the lookout for potential downtime that could impact their platform, kryptobro.com. Mai is his adept partner, ever-ready to troubleshoot. In their journey, the previous version of Squadcast Status Pages served as a helpful tool, but they soon found room for improvements.

Getting started with AWS CloudWatch

Of the more than 100 services that Amazon Web Services (AWS) provides, Amazon CloudWatch was one of the earliest. CloudWatch was announced on May 17th, 2009, and it was the 7th service released after S3, SQS, SimpleDB, EBS, EC2, and EMR. AWS CloudWatch is a suite of tools that covers a wide range of cloud resources and capabilities, including log and metric collection; monitoring; visualization and alerting; and automated action in response to operational health changes.

Cloud Native Security Must Go Beyond the Perimeter

One month after the MOVEit vulnerability was first reported, it continues to wreak havoc on U.S. agencies and commercial enterprises. Unfortunately, the victim list keeps growing and includes organizations such as the U.S. Department of Health and Human Services, the U.S. Department of Energy, Merchant Bank, Shell, and others.

SRE Redefines IT Operations as Architect of Sustainable Systems

Site Reliability Engineering (SRE) is a term that’s getting attention and gaining momentum – and for a good reason. SRE takes features of software engineering and applies them to various problems in infrastructure and operations. Organizations look to build SRE teams with a couple of goals in mind, including increasing scalability and developing solid software systems.

Kubernetes Community Days Munich Recap

A couple of weeks ago I had the absolute joy of attending KCD Munich for the first time, with my friend and colleague Guy Menahem (whom some of you know simply as The Good Guy on Twitter and YouTube). Besides rooting for Guy and his co-speaker, Arsh Sharma of Okteto, during their session on Backstage.io and IDPs, I enjoyed being untethered from ‘booth duty’ and free to engage with all the beautiful human beings that gathered together for this Kubetastic event!

10 Best Git GUIs for Public Sector Developers

For developers working in the public sector, leveraging secure version control systems like Git is essential to manage code and web content efficiently and safely. Git simplifies collaborative projects between developers working in fields like government, healthcare, banking, and education, but hey, let’s face it – mastering Git via the command line can be like solving a Rubik’s Cube blindfolded. That’s where a Git GUI comes in handy.

Anything But Tech Debt

Tech debt is usually one of the most fraught topics on engineering teams. Engineers often feel they aren’t allowed enough time to address tech debt. Product partners wonder why engineers spend so much time working on it—or at least talking about it. “The business” always seems to insinuate that engineers should do less of it, instead focusing on shipping value to customers.

Using UX and Observability to Track Application Health

UX (user experience) is a core factor that determines the success of an application or platform in a distributed system. Specifically, developers need to understand the infrastructure within an entire application stack to improve and refine the user experience to meet customer expectations without guesswork. System downtime remains a significant source of revenue and reputational losses for enterprises, employees, and customers.

Here's what it feels like to deploy every day

With Sleuth, Gigpro's software engineering team went from one deploy every two weeks to one a day. That made releases less stressful and helped improve team culture. Give Sleuth a try and see how we empower software teams to build faster by making engineering efficiency easy to improve and measurable, in a way that both managers and developers love.

cert-manager can do SPIFFE? - Civo Navigate NA 2023

Ashley Davis, Senior Software Engineer and Maintainer of cert-manager, discusses the capabilities of cert-manager, an easy way to manage certificates in Kubernetes clusters. Ashley highlights the importance of Trust-manager for managing trust bundles, enabling clients to verify certificate legitimacy. Additionally, he explores the potential of using x509 certificates as a universal identity control plane in distributed systems through the concept of "SPIFFE" (Secure Production Identity Framework For Everyone).

Enhanced Ubuntu Experience on Azure: Introducing Ubuntu Pro Updates Awareness

In collaboration with Microsoft, Canonical introduces Ubuntu Pro update notifications into the Azure Update Management Center. This feature enables users to identify Ubuntu instances that aren't receiving all available security updates, including those delivered via Ubuntu Pro. Ubuntu Pro, a subscription by Canonical, provides enhanced security, maintenance, and compliance tools for organizations using Ubuntu on Azure.

Kubernetes Incident Management Best Practices

Creating just any infrastructure on Kubernetes is not enough. There are many basic configurations you could apply to create the infrastructure for your application, and for the time being it might work just fine. But incident response won’t always remain 100% reliable. You will run into newer potholes, and that’s okay.

GitOps the Planet #16: Using SLOs to Improve Software Delivery

Kit Merker is one of the original product managers for Kubernetes and now Chief Growth Officer at Nobl9, where they're delivering a new open standard called OpenSLO. SLOs, or service-level objectives, provide a framework for understanding performance targets and making judgements about software changes and how they impact uptime. But it's not just a standard, it's also code. Come find out about it with Kit in this GitOps the Planet!
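
To make the link between an SLO target and uptime concrete, here is a minimal sketch of the usual error-budget arithmetic. It is plain illustration, not the OpenSLO format or anything from the episode; the function names and the 99.9% / 30-day figures are assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int) -> float:
    """Downtime (in minutes) an availability SLO allows over the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, window_days: int,
                     downtime_minutes: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


# Hypothetical service: 99.9% availability target over a 30-day window.
budget = error_budget_minutes(0.999, 30)
remaining = budget_remaining(0.999, 30, downtime_minutes=10.8)
print(round(budget, 1), round(remaining, 2))
```

A team might use a number like `remaining` to decide whether a risky change can ship now or should wait until the budget recovers.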

What is Scalability?

The number of simultaneous requests that an application can successfully support is a measure of its scalability. The point at which an application can no longer successfully handle more requests is its scalability limit. When a key piece of hardware is exhausted and new or more machines are needed, this limit is reached. Scaling these resources can include any combination of CPU and physical memory (different or more computers), hard disc (larger hard drives, less "live" data, solid state drives), and/or network bandwidth (several network interface controllers, larger NICs, fibre, and so on).

Cloud connectivity and interoperability

The post-pandemic world has transformed our work habits and the landscape of conducting business. Organizations now take the hybrid approach to work, wherein employees may work from an office, while travelling, or from a remote location. This fundamental shift has accelerated the pace of cloud adoption, as the cloud makes data access possible from anyplace, anytime. But the cloud brings with it a set of complexities that must be managed.

Using Grafana and Graphite to monitor server load

Since server outages can lead to a loss of customers, reputation, and other troubles, it is important to get information on the status of the server on time. MetricFire's Hosted Grafana and Graphite will help you monitor server load in a timely and efficient manner. Servers generate a large number of metrics and it is essential to not only track their values but also to observe their changes over time. It is also possible to correlate app statistics with server load metrics.

AWS Cost Allocation: A Guide To Allocating Cloud Spend

Picture this. Over 90% of organizations use the cloud in one form or another, according to O'Reilly's research. Cloud computing is so popular because of its flexibility. Because you can access cloud computing resources on-demand, you can automatically increase or decrease resource usage depending on your workload, which is incredibly appealing and quite different from traditional IT infrastructure. In 2023, controlling cloud spend is one of the greatest challenges for cloud users.

How we use trace-based alerts to reduce MTTR

On-call shifts are part of every developer’s job – we’ve all been there. It’s 3am, suddenly you get an alert for an issue occurring in production. The microservices landscape is complicated and finding the root cause of an issue is like looking for a needle in a haystack. How can you get to the root of what’s happening in the system so you can analyze and resolve the issue quickly and effectively?

Understanding Blameless Postmortems

Progress in organizations is often accompanied by unforeseen challenges and mishaps. Traditionally, these setbacks resulted in pointing fingers, hindering progress, and creating a negative work atmosphere. However, a "Blameless Postmortems" approach transforms how organizations respond to failure. In this blog, we will delve into the importance of cultivating a blameless postmortem culture when faced with setbacks.

Using Helm Dashboard and Intents-Based Access Control for Pain-Free Network Segmentation

Helm Dashboard is an open-source project which graphically shows installed Helm charts, revisions, and changes to their Kubernetes resources. The intents operator is an open-source Kubernetes operator which makes it possible to roll out network policies in a Kubernetes cluster, chart by chart, and gradually achieve zero trust or network segmentation.

AzCopy Installation: Simplifying Data Transfers to the Cloud

Data management and transfer are essential components of the digital era. Whether you are an IT professional, a developer, or simply someone looking to move large amounts of data to the cloud, the efficiency and reliability of the process are paramount. That’s where AzCopy comes into the picture.

Leveraging AWS EventBridge to stay ahead of spot instance interruptions

Amazon EC2 Spot Instances can help you save significantly on your compute costs. However, you should also be aware that Amazon can take them back with a two-minute notice if the demand for the instance type goes up. Fortunately, AWS EventBridge, along with Spot by NetApp, can help you automate the process of detecting and reacting faster to these interruptions.
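
As a rough illustration of the detection step, the sketch below checks whether an incoming EventBridge event is a Spot interruption warning and pulls out the affected instance. The `source`, `detail-type`, and `instance-id` fields follow the event shape AWS documents for this warning; the handler name and the `drain` action are hypothetical, and nothing here reflects how Spot by NetApp reacts.

```python
from typing import Optional

# detail-type AWS uses for the two-minute Spot interruption notice.
SPOT_WARNING = "EC2 Spot Instance Interruption Warning"


def interrupted_instance(event: dict) -> Optional[str]:
    """Return the instance ID if the event is a Spot interruption warning."""
    if event.get("source") != "aws.ec2":
        return None
    if event.get("detail-type") != SPOT_WARNING:
        return None
    return event.get("detail", {}).get("instance-id")


def handler(event: dict, context=None) -> dict:
    """Hypothetical Lambda-style entry point: decide what to do with an event."""
    instance_id = interrupted_instance(event)
    if instance_id is None:
        return {"action": "ignore"}
    # Placeholder reaction: a real handler would drain or replace the node here.
    return {"action": "drain", "instance_id": instance_id}


sample = {
    "source": "aws.ec2",
    "detail-type": SPOT_WARNING,
    "detail": {"instance-id": "i-0123456789abcdef0", "instance-action": "terminate"},
}
print(handler(sample))
```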

Azure Distributed Transaction Performance Monitoring

In this article, we will explore Azure Distributed Transaction Performance Monitoring using Serverless360’s new feature called BAM Duration Monitoring. Our primary focus will be effectively monitoring a long-running business process implemented using the dynamic combination of Logic Apps and Data Factory.

Ep 6: A deeper dive into the cloud featuring James Sanders

On this episode of Cloud Control, we dive into an insightful conversation with James Sanders, a principal analyst for CCS Insight and a recognized expert in cloud and infrastructure technology. We discuss the rapid shift to the cloud during the COVID-19 pandemic and the financial implications this rapid adoption has had for businesses worldwide. We also unpack strategies for cost optimization, the complexities of repatriating workloads, and the human factor involved in technology adoption.

CD for machine learning: Deploy, monitor, retrain

While there are an increasing number of off-the-shelf machine learning (ML) solutions that promise to adapt to your specific requirements, organizations that are serious about investing in ML for the long term are building their own workflows tailored exactly to their data and the outcomes they expect. To make full use of this investment, ML models must be kept up to date and working from the freshest available data.

Securing Access to Cloud Native Resources with Certificates - Civo Navigate NA 2023

In this talk, Alan Vailliencourt, a Senior Solutions Engineer with Teleport, discusses the importance of moving away from passwords and securing access to cloud-native resources using short-lived certificates. He highlights the risks associated with passwords and showcases the benefits of identity-native access, incorporating proof of presence, mutual authentication, and device security. The talk provides practical steps for adopting certificate-based authentication and improving security posture for Kubernetes, databases, and other cloud resources.

The broader approach on Azure monitoring

This episode of Azure On Air podcast tackles the challenges in IT infrastructure monitoring and transitioning from on-premise to the cloud. Pedro Sousa, Microsoft Azure MVP, advocates for a shift from traditional monitoring to a holistic observability approach, starting with an understanding of business needs and working down to infrastructure details. Furthermore, he provides invaluable advice on migrating from on-premise to Azure, emphasizing the consistency of observability principles across environments.

Introducing Squadcast's Key Based Deduplication

We are excited to share another feature update with all our valued customers! We have recently gone live with our Key Based Deduplication feature, enabling you to define dedup keys using customizable templates for configured alert sources. With this feature, you can automatically group similar incidents and effectively deduplicate alerts.
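
The general idea of key-based deduplication can be sketched as follows: render a key from a template for each alert, then group alerts that share a key. This is only an illustration under stated assumptions; the template syntax, field names, and functions below are hypothetical and are not Squadcast's actual template language or API.

```python
from collections import defaultdict
from string import Template

# Hypothetical dedup key template; the field names are illustrative only.
DEDUP_KEY_TEMPLATE = Template("$source:$host:$check")


def dedup_key(alert: dict) -> str:
    """Render the dedup key for one alert payload (extra fields are ignored)."""
    return DEDUP_KEY_TEMPLATE.substitute(alert)


def group_alerts(alerts: list) -> dict:
    """Group alerts that render to the same dedup key into one incident."""
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[dedup_key(alert)].append(alert)
    return dict(incidents)


alerts = [
    {"source": "prometheus", "host": "web-1", "check": "disk", "value": 91},
    {"source": "prometheus", "host": "web-1", "check": "disk", "value": 93},
    {"source": "prometheus", "host": "web-2", "check": "disk", "value": 95},
]
incidents = group_alerts(alerts)
for key, grouped in incidents.items():
    print(key, len(grouped))
```

Here the two `web-1` disk alerts collapse into a single incident, while the `web-2` alert stays separate because its rendered key differs.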

Empowering AIOps With Zenoss Smart View: Unleashing the Power of Intelligent Diagnostics

In this video blog post, I delve into the world of Zenoss Smart View, an indispensable tool that has revolutionized the way IT operations personnel approach diagnostic challenges. In today's fast-paced and complex digital landscape, swift problem resolution is paramount. That's precisely where Smart View shines. Smart View is a critical, differentiated tool in Zenoss’ toolkit to identify critical issues with time-sensitive, contextual information.

LinkPool: Laying the foundations of a crypto ecosystem

Blockchain-generated smart contracts are transforming the way the world creates and settles agreements. Jonathan Huxtable and the team at LinkPool are using platformEDGE™ to connect these contracts with real-world data sources. In 2017, Google searches for the term Bitcoin peaked. It seemed like everyone was talking about cryptocurrencies. For Jonathan Huxtable, this surge in interest didn’t come as a surprise. In fact, he’d predicted it.

Azure's Robust Journey with GDPR Compliance

In an age where data is the new gold, safeguarding personal information has become more vital than ever before. The General Data Protection Regulation, or GDPR, is no longer a buzzword in the corporate corridors of Europe; it’s a binding legislation that has set the global standard for privacy and security. Enter the world of Microsoft Azure, one of the leading cloud computing platforms that’s shaping the way businesses store, manage, and analyze data.

Solving the Never Ending Requirements of Authorization - Civo Navigate NA 2023

In this talk, Alex Olivier shares their personal experience with the challenges of constantly changing authorization requirements in software systems. They discuss the drawbacks of traditional if-else statement-based authorization logic and propose a more efficient and scalable solution using an authorization service called Cerbos. The talk explores the benefits of decoupling authorization logic into policies, providing a centralized and maintainable approach with a clear audit trail.

Why you need an internal status page

When we launched incident.io Status Pages a few months ago, we stressed the importance of communicating clearly with your customers about ongoing issues. To help with this, we spent a lot of time carefully designing a status page that’s easy to understand for everyone - whether they come from a technical background, work in a different area, or just want to get on with their day.

Reducing latency in industrial systems with real-time Ubuntu on Intel SoCs

Delivering a comprehensive real-time solution for industrial systems requires careful work at every layer of the stack. Since standalone hardware or software components are not sufficient, Canonical and Intel have joined forces to deliver an out-of-the-box real-time solution. This solution is now generally available on Intel Core processors.

AKS Day 2 management made easy

You’ve deployed your Azure Kubernetes Service (AKS) cluster into production. Now what? Deploying AKS clusters is cause for celebration, but don’t rest on your laurels for too long. You are now in the Day 2 Kubernetes management phase and the operational challenges are on the rise. The Kubernetes application lifecycle is broken into three main phases. They are often referred to as Days, but realistically, they take much longer than 24 hours!

Low Disk Space Remediation: Triaging the Explosion of Data and Closing the Loop

Today, there is an explosion of data in IT. As a result, critical infrastructure living in the cloud, on premises in the data center, or even orchestrated in containers can be subject to low disk space issues. How do you respond to the challenging inconvenience of low disk space?

Are you ready for DORA?

Not to be confused with the popular children’s TV character, DORA is a new EU regulation for the financial sector, which stands for the Digital Operational Resilience Act. DORA became law on 16 January 2023 and will start to apply from 17 January 2025, so it’s crucial that senior executives in the financial sector, such as Chief Risk Officers and Chief Information Security Officers, understand its implications and prepare for compliance from day one.

Prometheus Monitoring 101

Prometheus is an increasingly popular tool in the world of SREs and operational monitoring. Based on ideas from Google’s internal monitoring service (Borgmon), and with native support from services like Docker and Kubernetes, Prometheus is designed for a cloud-based, containerized world. As a result, it’s quite different from existing services like Graphite. Starting out, it can be tricky to know where to begin with the official Prometheus docs and the wave of recent Prom content.

Connecting Prometheus and Grafana

Using Prometheus and Grafana together is a great combination of tools for monitoring an infrastructure. In this article, we will discuss how Prometheus can be connected with Grafana and what makes Prometheus different from the rest of the tools in the market. MetricFire's product, Hosted Graphite, runs Graphite (a Prometheus alternative) with Grafana dashboards for you so you can have the reliability and ease of use that is hard to get while doing it in-house.

AWS CloudWatch Custom Metrics vs Prometheus Custom Metrics

Understanding the state of your systems and their underlying infrastructure at all times is paramount for ensuring the stability and reliability of your services. Up-to-date information about the performance and health of your deployments not only helps your team react to issues in real time, but it also gives them the security to make changes with confidence and to safely forecast system failures or performance hiccups even before they occur.

Monitoring Webapp Performance with Sitespeed

In today's digital landscape, optimal web application performance is crucial for business success. Slow loading times, unresponsive pages, and inefficient code can drive away users and harm your reputation. This makes monitoring web app performance extremely important, both to prevent these problems and to provide a smooth user experience. Sitespeed, a powerful web performance monitoring framework, analyzes metrics like page load time, resource usage, and user interactions to identify performance bottlenecks.

An Insider Look at Zero Trust with GDIT DevSecOps Experts

As cyber attacks have become ever more sophisticated, the means of protecting against cyber attacks have had to become more stringent. With zero trust security, the model has changed from “trust but verify” to “never trust, always verify.” Joining D2iQ VP of Product Dan Ciruli for an in-depth discussion of zero trust security were Dr. John Sahlin, VP of Cybersolutions at General Dynamics Information Technology (GDIT), and David Sperbeck, DevSecOps Capability Lead at GDIT.

5 important features to look for in cybersecurity applications

In today’s digital landscape, organizations need the right cybersecurity applications to address evolving cyber threats effectively. To keep security teams aligned and streamline mission-critical workflows, one of the most important cybersecurity applications organizations need is a secure and efficient cybersecurity collaboration platform that enables seamless communication, information sharing, and coordinated incident response.

Enable and use GKE Control plane logs

Are you having any issues with the control plane components in your GKE Cluster? Are you interested in gaining visibility into the control plane side of the cluster to troubleshoot the issues by yourself? Then GKE Control Plane Logs is a great way to gain insights into what's going on with your cluster. In this video, we provide a quick overview of Control Plane components and logs, and show how to enable control plane logs on new and existing GKE clusters. Watch this video to learn how to use Control plane logs to troubleshoot webhook and control plane latency issues in GKE clusters.

Enhancing the Ubuntu Experience on Azure: Introducing Ubuntu Pro Updates Awareness

Canonical works closely with Microsoft to ensure that running Ubuntu on Azure is a great experience. One of the key aspects of this collaboration is ensuring the longevity and security of Ubuntu releases, such as Ubuntu 18.04 LTS, even beyond their Standard Security Maintenance period. We are excited to announce the integration of Ubuntu Pro update awareness into Azure through the Azure Guest Patching Service (AzGPS) and Update Management Center (UMC).

How Novacy Shortened Troubleshooting Time by 90% with Helios

When I first met Uria Franko, the CTO of Novacy, I immediately knew we’d hit it off. He was looking for an observability solution for his team with a specific need around Celery, after they had been using logs but found they lacked the depth and granularity they needed. Luckily, our mission at Helios is to help organizations gain visibility and drill down into services through traces. So this was a perfect match.