Operations | Monitoring | ITSM | DevOps | Cloud

July 2022

Monitor Citrix Hypervisor performance with Datadog

Citrix Hypervisor, formerly known as Citrix XenServer, is a type 1 hypervisor that enables organizations to run and manage an entire virtual infrastructure—including VMs, virtual desktops, and virtual applications. Organizations can also use Citrix Hypervisor to optionally host these virtual workloads with higher availability and flexibility by implementing managed server groups called resource pools.

Sponsored Post

Classifying Severity Levels for Your Organization

Major outages are bound to occur in even the most well-maintained infrastructure and systems. Being able to quickly classify the severity level allows your on-call team to respond more effectively. Imagine a scenario where your on-call team is getting critical alerts every 15 minutes, user complaints are piling up on social media, and, since your platform is inoperative, revenue losses are mounting every minute. How do you go about getting your application back on track? This is where understanding incident severity and priority can be invaluable. In this blog we look at severity levels and how they can improve your incident response process.

A Complete Guide for API endpoint Monitoring

Modern programs run on APIs. In digital transformation efforts, APIs are often the building blocks used to push organizations into the current digital age. Most applications therefore rely on APIs for business-critical transactions. Without truly understanding what’s going on behind the scenes with each API endpoint, organizations create blind spots in their performance. Let us look at what APIs are, why monitoring them matters, and how to monitor an API endpoint easily.

CloudHealth Pricing: How Much Does CloudHealth Cost?

Virtualization technology provider VMware acquired CloudHealth in August 2018. VMware hoped the new addition would enable more organizations to manage services on other cloud platforms like AWS, Azure, and their private data centers. So, what does CloudHealth do and how much does it cost? Is it a good all-in-one platform for cloud management? This guide explores CloudHealth pricing, features, and possible alternatives to help you decide if the solution is right for you (or if there is a better option).

StackPod: Jujhar Singh of Thoughtworks on Why Technology Is Always About People

A few episodes ago, we talked with fellow podcaster and tech evangelist Dotan Horovits. During that episode, Dotan shared that he wrote a blog post with Jujhar Singh called “How Much Observability Is Enough?” which is definitely a recommended read if you’re implementing observability and feeling overwhelmed. After reading this article, we were eager to invite Jujhar to the StackPod as well, to dive into this topic a bit more.

Qovery x Spayr - Managing Multiple Environments Running on Kubernetes Clusters

Qovery makes it easy to deploy on-demand environments on AWS. More than 20,000 DevOps and developers use Qovery to deploy their production, staging, and development environments on AWS in a few seconds. Join Albane (Product Marketing at Qovery) and Pierre (CTO and Co-founder at Spayr) to talk about how Spayr is managing multiple environments running on Kubernetes clusters on Qovery, and is empowering its team, from junior to senior developers, to create a new environment and test new ideas independently.

Site Reliability Engineering (SRE) explained

Google has introduced so many innovations that it’d be impossible to list them all. And we’re not just talking about the obvious things like search engine algorithms or nearly ubiquitous programs and apps (Google Maps, Docs, Gmail), or even self-driving cars. Today, we’re going to talk about one such innovation: Site Reliability Engineering. In a nutshell, SRE is a practical framework for software development that improves on even giants like DevOps. Wait, what?

Multipass 1.10 brings new instance modification capabilities

Developers rejoice! The Multipass team has been listening to your feedback, and we are excited to announce that the latest update to Multipass contains one of our most requested features – instance modification. For those who are just discovering Multipass, it’s software designed to make working with virtual machines as painless as possible. It has an intuitive command line interface, and abstracts away the hard work of configuring, launching, modifying and destroying VMs.

Managing the Looker ecosystem at scale with SRE and DevOps practices

Many organizations struggle to create data-driven cultures where each employee is empowered to make decisions based on data. This is especially true for enterprises with a variety of systems and tools in use across different teams. If you are a leader, manager, or executive focused on how your team can leverage Google's SRE practices or wider DevOps practices, you are definitely in the right place!

How to Change the PuppetDB Port in Puppet Enterprise

Occasionally in Puppet Enterprise, you may need to change the port PuppetDB uses, for instance, if another service requires port 8081. We in Puppet Support recommend that you change the port for the other service instead. If you can’t do that, follow this video and the attached knowledge base article for a guide on changing the port.

Setting up Runbooks in Squadcast | SRE Best Practices | Squadcast

A Runbook is a compilation of routine procedures and operations that are documented for reference while working on a critical incident. Sometimes, it can also be referred to as a Playbook. From this video, learn to create, attach, reference and mark progress for incident resolution using Runbooks.

Spayr Manages Multiple Environments On Kubernetes With Qovery

Albane here, Product Marketing Manager at Qovery 👋 Yesterday we joined forces with Pierre Olive (CTO and co-founder of Spayr) to talk about how they manage multiple environments on Kubernetes with Qovery and much more; if you missed it or would just rather read than listen, here is the recap.

Run dbt tests in parallel with CI/CD

One difficult challenge in the software development cycle is increasing the speed of development while ensuring the quality of the code remains the same. The data world has adopted software development practices in recent years to test data changes before deployment. The testing process can be time-consuming and prone to unexpected errors.

Deploying web applications on Kubernetes with continuous integration

Containers and microservices have revolutionized the way applications are deployed on the cloud. Since its launch in 2014, Kubernetes has become the de facto standard container orchestration tool. In this tutorial, you will learn how to deploy a Node.js application on Azure Kubernetes Service (AKS) with continuous integration and continuous deployment (CI/CD).

Store and manage Datadog configurations as code with Performetriks' offering in the Datadog Marketplace

Performetriks is a service provider that specializes in assessing and improving application performance and security for enterprise clients. To streamline these processes, Performetriks offers frameworks for automation, benchmarking, and security testing, as well as tools that evaluate and improve application performance. This includes their Composer tool, an on-prem piece of software that allows teams to more efficiently manage monitoring settings by storing, tracking, and managing them as code.

Sleep through the night with self-healing infrastructure

What is self-healing infrastructure and why do you need it? The first part is easy; it’s exactly what the name implies. It’s a methodology for creating automation that allows systems to identify and repair errors and misconfigurations without any human action. The “why” is a little more complex, but, like self-healing infrastructure, is well worth the effort.

Making the Most Out of PromQL with VMware Tanzu Observability

Rachna Srivastava contributed to this blog post. Given the popularity of Prometheus and the open source community behind it, it’s no surprise that customers often ask about support for the Prometheus Query Language, PromQL. Many users are already comfortable with PromQL but need the additional performance and scalability of the VMware Tanzu Observability platform.

Research Report Observability at the Speed of Innovation 2022

IT innovation is happening at a record pace. With today’s complexities, you need deep insights into your IT environment—more than traditional monitoring tools can provide. Enter modern observability, a critical application. Observability moves beyond monitoring to help teams understand what is actually happening in the system by bringing together and correlating information from all layers of your IT stack. Observability gives teams deeper, more actionable insights into both the state of a system and the reasons for its behavior.

Kubernetes Load Testing Comparison: Speedscale vs K6

In this article, you’ll be introduced to two different load testing tools that are both able to work with Kubernetes: Speedscale and K6. Throughout this post you’ll be given a comparative view of how each tool performs in five different categories: ease of setup, developer experience, working with the CLI, creating tests, and integration into CI/CD pipelines.

Happy Systems Administrator's Day!

Happy System Administrator’s Day! July 29th is the day we thank and honor all the hard-working system administrators (and network administrators, network engineers, IT helpdesk staff – basically anyone who helps keep the networks up and running) for all their hard work. You’ve spent the last year fixing problems, onboarding users, integrating new systems and keeping the entire world connected.

Recapping Yalla! DevOps 2022

TL;DR Yalla! DevOps 2022 community event — Learning. Networking. Fun. Driven by the DevOps community. All about the DevOps community. Yalla! DevOps was back again this year with an exciting lineup of content ranging from DevOps, DevSecOps, professional development and more. Local speakers from the DevOps community and industry leaders from around the world took the stage making it one of the best DevOps community events this year. Keep reading for a recap of Yalla! DevOps 2022.

Let's get confidential! Canonical Ubuntu Confidential VMs are now generally available on Microsoft Azure

On behalf of all Canonical teams, I am happy to announce the general availability of Ubuntu Confidential VMs (CVMs) on Microsoft Azure! They are part of the Microsoft Azure DCasv5/ECasv5 series, and only take a few clicks to enable and use. Ubuntu 20.04 is the first and only Linux distribution to support Confidential VMs on Azure.

Introducing Our Newest Integration with ServiceNow

Blameless just released a new integration to ServiceNow’s incident management ticketing solution. If you are a modern DevOps team moving towards SRE practices and you want to speed the time to incident resolution through streamlined, automated workflows, this is worth investigating.

What is Bashtop? Setup, Commands, and Shortcuts

Usually, we have top and htop to monitor the Linux system and get to know the running processes along with CPU and memory utilization. But these commands have certain limitations that prevent them from giving a detailed overview of system performance. These limitations are overcome by an alternative called Bashtop. In this blog, we will learn about Bashtop, its advantages and disadvantages, its shortcuts, and how to install it.

Codefresh GitOps CD Launch Webinar

Introducing a hosted GitOps platform: automate GitOps best practices and deliver software faster. Introducing a fully hosted solution for DevOps teams seeking to quickly and easily achieve frictionless, GitOps-based continuous software delivery in the cloud. Codefresh GitOps CD has been very popular for its ease of use and centralized UI providing detailed CI/CD deployment insights and analytics that optimize software delivery for smoother, scalable DevOps automation leveraging open-source Argo.

Looking Beyond SNMP

In a previous blog post, we dove into the wayback machine and looked at Simple Network Management Protocol (SNMP) Traps – a technology that allows devices (including network devices) to send alerts when specific thresholds have been reached. In this post, we are going to be a bit more forward looking and discuss some technologies that will, in theory, replace SNMP. It is important to keep in mind that the demise of SNMP has been predicted for years (actually decades).

Pushing a project to GitHub

GitHub is a web-based platform used for project version control and codebase hosting. GitHub uses Git, a widely-used version control system. GitLab and Bitbucket are similar tools. Using GitHub is a prerequisite of most tutorials on the CircleCI blog, so it is helpful to learn to use it. In this tutorial, I’ll show you how to push a project to GitHub.

Best Practices for Multi-Cloud Provisioning and Governance

Our speakers presented the best practices for multi-cloud provisioning and governance. You'll learn about the challenges and opportunities of managing complex multi-cloud environments. You'll also learn a bit about the integration between ServiceNow and Cloudify that helps to make those processes more agile and efficient. Speakers: Ram Devanathan, Sr. Principal Product Manager at ServiceNow; Jason Hammond, Director of Cloud Solutions at Cloudify; Anthony Critelli, Sr. Solutions Architect at Cloudify.

Why Hyperscale Data Centres Are Becoming More Important for Your Business

As the world propels forward into the digital age, organisations start to rely more and more on technology to organise and store their data. And with the proliferation of cloud computing and big data, businesses that are keen on dominating their landscape are turning to hyperscale data centre services to scale at an unprecedented level.

How to build a workplace developers want to join

With a high demand to fill roles in tech, businesses are left under pressure to create a workplace culture that will attract and retain developer talent. Yet they are struggling to do so. Recent research found that 93% of enterprises face difficulties when it comes to retaining skilled developers, suggesting that developers are seeking, but not finding, organisations that meet their requirements.

How DevOps can improve burnout and communication in healthcare

In healthcare, details matter, and providing the services and support patients need means that zero compromise on quality is key. As the pandemic has put a strain on most industries, those in healthcare may have had the heaviest burden. Regardless of speciality and size, operational inefficiencies and poor patient communication and service remain highly prevalent in medical practice today. These can lead to high rates of physician and administrative burnout as part of a stressful and busy career.

The New Hosted GitOps Platform Experience from Codefresh

Last month we announced the 3 major features we are adding to the Codefresh platform: dashboards for DORA metrics, support for any external Continuous Integration system, and a hosted GitOps service. The hosted GitOps experience (powered by Argo CD) is now available to all new Codefresh accounts (even free ones) so that simply by signing up you can start deploying applications right away to your Kubernetes cluster without having to maintain your own Argo CD installation.

How I made an impact in my first 100 days at Helios

Joining a new dev team can be an exciting but somewhat intimidating experience. On one hand, you’re jumping into new adventures and opportunities. On the other hand, most onboarding experiences are fraught with stress and a sense of being overwhelmed by how much you have to learn, fast, to be able to contribute to your new team. To be honest, I’d never worked at a place where the developer onboarding experience was particularly memorable – until I joined Helios.

3 common pitfalls of post-mortems

Small confession: we currently use the term 'post-mortem' in incident.io despite preferring the term 'incident debrief'. Unless you have particularly serious incidents, the link to death here really isn’t helping anyone. However, we're optimising for familiarity, so we're sticking to the term 'post-mortem' here. Ask any engineer and they’ll tell you that a post-mortem is a positive thing (despite the scary name).

SOC 2: Data Security For Cloud-Based Observability

As more companies adopt SaaS services over on-premise delivery models, there is a natural concern around data security and platform availability. Words on a vendor’s website can provide insights to prospective customers on the process and policies that companies have in place to alleviate these concerns. However, the old adage of “actions speak louder than words” does apply. Trust in a website’s words only goes so far.

Moogsoft Green Credentials

Waste is never a good thing. And rumblings of an economic downturn, alongside dire warnings of climate change, are making it increasingly necessary to address waste. As a society, we need to reduce consumption, data included. First, we all must acknowledge the high cost of data. Despite the prevailing opinion of the 2010s, data isn’t free. There’s a monetary and carbon cost to keeping data alive.

"Why Are My Tests So Slow?" A List of Likely Suspects, Anti-Patterns, and Unresolved Personal Trauma

“Lead time to deploy” means the interval from when the code gets written to when it’s been deployed to production. It has also been described as “how long it takes you to run CI/CD.” How important is it? It’s nigh-on impossible to have a high-performing team if you have a long lead time, and shortening your lead time makes your team perform better, both directly and indirectly.

Regex, character count, and word count validation | 30 Days of Form Demos | Jira Service Management

Have complex data validation needs? No worries! Learn how to use Regex (regular expressions) with your new forms in Jira Service Management. This is Day 22 of the forms for Jira Service Management 30 Days of Demos series.

Network as Code Explained: How Ansible & Automation Support Agile Infrastructure

When considering application source code, the way you maintain consistency throughout environments is mostly straightforward. You write application code, commit it to source control, and then build, test and deploy via a CI/CD pipeline. Since the application is defined by the source code living in source control, the build will be identical in all environments to which it’s deployed. But what about the infrastructure on which an application runs?

A practical approach to Active Directory Domain Services, Part 10: A study into Group Policies and AD

We have covered a plethora of topics on Active Directory (AD) in parts one to nine of this series on Active Directory Domain Services. In this final and 10th part, we will look at one other crucial aspect of AD—Group Policies and Group Policy Objects (GPOs). We will discuss what Group Policies are and what role GPOs play in the effective setup of any AD environment.

Why Preview Environments Are The New Thing in DevOps

Consider the scenario where a complex product is being developed by dozens of engineers working on different features. Not only is the development environment the same for everyone, but the staging environment is also shared. As different features are merged into the shared environment, they break the code, so QA has to wait until this is fixed. A feature or bug fix may be working perfectly on the developer’s own machine, but there is no way for the QA team to test that one feature in isolation.

Announcing CloudZero AnyCost: Cost Intelligence For A Multi-Service World

At CloudZero, we’re confident that we can organize cloud spend better than anyone else out there. For years, we’ve helped companies to efficiently organize spend, while overcoming traditional hurdles to visibility, like shared cost, incomplete tagging, and Kubernetes. We’ve helped customers who have struggled to attribute spend to do so in a matter of days — without saddling engineering teams with new projects or embarking on a resource tagging expedition with no end in sight.

Stop putting off patching!

Let's face it: no one likes patching. When I was a practitioner, we always put off patching until it was absolutely necessary. Until a business need – such as updating an application version or support ending for a version – arose, we didn't patch because "If it ain't broke, don't fix it." We all know this is a bad practice; let's remind ourselves why. The longer a system goes without being patched, the more changes will accumulate.

Celebrating IT's champions: our sysadmins

Sysadmins, short for system administrators, serve as a crucial subset of IT engineers and support staff and are often under-appreciated. Sysadmins are the lynchpins that provide continuity, performance, and security to the systems that connect every corner of the world. When COVID-19 scattered large workforces in offices across small home office networks, organizations relied on their sysadmins more than ever before to maintain work processes.

Announcing GitLab support on CircleCI

Today we are pleased to announce GitLab support on CircleCI. Teams using GitLab SaaS can now build, test, and deploy on CircleCI, and access CircleCI’s most popular features like Docker layer caching and automatic test-splitting. GitLab is now the third version control system we support, in addition to GitHub and Bitbucket.

Protect your cloud with Spot Security

Spot by NetApp is excited to announce that Spot Security is now generally available. Delivering continuous, automated security, Spot Security analyzes, detects, and prioritizes threats to surface the most critical risks and anomalies, while providing prioritized recommendations, guided remediation, and compliance.

How to use GitHub Actions securely

GitHub is one of the most popular source control platforms available. It relies on Git concepts, and millions of developers use it. GitHub Actions embrace all aspects of what source control needs, such as branching, pull requests, feature flags, and versioning. It also integrates nicely into third-party continuous integration and continuous deployment (CI/CD) pipelines or deployment tools like Azure DevOps, Jenkins, GitLab, and Octopus Deploy.

How to Manage Your Data Center During a Heatwave

The recent heatwave that brought record temperatures to the UK caused cooling systems to fail at a London data center, resulting in downtime for Google and Oracle. According to Oracle, “Following unseasonably high temperatures in the UK south (London) region, two cooler units in the data centre experienced a failure when they were required to operate above their design limits.”

Omnichannel Enablement: 4 technology success factors

The days in which a business could thrive by serving customers through brick-and-mortar stores alone are long gone. Almost all retailers now offer a variety of online and offline channels, often with some degree of integration to ensure a smooth customer journey across different touchpoints. However, even these multichannel and cross-channel strategies are increasingly falling short of modern expectations.

Top 105+ DevOps Interview Questions and Answers for 2022

Thinking about breaking into the DevOps space? DevOps has become one of the biggest tech buzzwords. Tech giants – like Facebook, Amazon, or Google – have numerous open positions for DevOps engineers. But it is a competitive field to break into. So if you’ve been prepping for DevOps roles, here are some of the most common interview questions (and potential answers) to expect.

Hyperview DCIM 3.6 Software Release

ServiceNow CMDB integration is now available! We also continue to roll out enhancements to Firmware Management and 3D view. Now you can view, search and report on the current firmware version for all your assets, both Managed and Unmanaged. In 3D view, you can quickly home in on a specific set of racks or assets using Focus Mode. Plus, Hyperview now generates a multi-level Heat Map for more nuanced temperature ranges.

Key metrics for monitoring Cilium

Cilium is a Container Network Interface (CNI) for securing and load-balancing network traffic in your Kubernetes environment. As a CNI provider, Cilium extends the orchestrator’s existing network capabilities by giving teams more control over how they build their applications and monitor traffic. For example, vanilla Kubernetes installations typically rely on traditional firewalls and Linux-based network utilities like iptables to filter pod-to-pod traffic by an IP address or port.

Monitor Cilium and Kubernetes performance with Hubble

In Part 1, we looked at some key metrics for monitoring the health and performance of your Cilium-managed Kubernetes clusters and network. In this post, we’ll look at how Hubble enables you to visualize network traffic via a CLI and user interface. But first, we’ll briefly look at Hubble’s underlying infrastructure and how it provides visibility into your environment.

Monitor Cilium-managed infrastructure with Datadog

In Part 2 of this series, we showed how Hubble, Cilium’s observability platform, enables you to view network-level details about service dependencies and traffic flows. Cilium also integrates with various standalone monitoring tools, so you can track the other key metrics discussed in Part 1. But since the platform is an integral part of your infrastructure, you need the ability to easily correlate Cilium network and resource metrics with data from your Kubernetes resources.

DevOps Tools

A tool that aids in automating the software development process is called a DevOps tool. It largely concentrates on interaction and cooperation between experts in product management, software development, and operations. A DevOps solution also enables teams to automate the majority of software development procedures, including build, conflict management, dependency management, and deployment, and reduces manual labour.

Preserve Stick Table Data When Reloading HAProxy

With HAProxy situated in front of their servers, many people leverage it as a frontline component for enabling extra security and observability for their networks. HAProxy provides a way to monitor the number of TCP connections, the rate of HTTP requests, the number of application errors and the like, which you can use to detect anomalous behavior, enforce rate limits, and catch application-related problems early.

Building Better In The Cloud: Getting Serious About Optimizing Cloud Spend

In the beginning, companies and cloud cost management vendors focused on reducing the absolute cost of the cloud. That would be the equivalent of solely focusing on the total cost of a sales organization versus considering how much new revenue they were booking, or the productivity of the sales team or the cost of customer acquisition. As cloud spend followed its rapid growth trajectory, curbing it most often relied on discounts.

Kubernetes Cluster Sprawl: How to Effectively Manage It Across Distributed, Heterogeneous Environments

If you’re managing multiple Kubernetes clusters at scale, you’ve probably run into Kubernetes cluster sprawl. And if you haven’t, brace yourself, because you’ll likely cross that bridge in the near future.

DevOps Roadmap: 14 Steps to Become a DevOps Engineer

So, you want to become a DevOps engineer? It’s a stimulating, challenging, high-paying career choice, and it is the lynchpin role that holds software development and operations together. We’ve compiled a DevOps roadmap that includes all the steps required to fill the shoes of a DevOps expert. As you know, DevOps is a set of practices and tools to integrate and automate processes between IT and software development teams.

The Current State of Workload Portability

Have you considered cloud portability, i.e., the ability to easily move workloads between on-premises systems and across multiple cloud service providers (CSPs)? The idea is that workloads should run in the environment that delivers the most value for your organization, but as that “optimal” environment can change over time, you need to be able to move your workloads accordingly.

Conditional CircleCI pipeline execution

The DevOps practice of continuous integration and continuous deployment (CI/CD) improves software delivery. CI/CD platforms monitor and automate the application development process ensuring a better application, faster. CI/CD pipelines build code, run tests, and deploy a production-ready version of an application that has passed all automated checks.

Trunk-Based Development vs. GitFlow: Which Source Code Control is Right for You?

Managing source code with a defined method is one vital aspect of implementing effective application development. Today, two strategies for doing this stand above the rest: trunk-based development and GitFlow. Choosing the proper method for source code control often depends on several factors. In this article, let’s define and compare trunk-based development and GitFlow, and look at the factors that drive an organization’s decision between the two.

Use data connections in your forms | 30 Days of Form Demos | Jira Service Management

Want to pull data into your form fields from outside Jira? No worries! Watch this short video to see how you can connect to third-party data sources (including REST API). This is Day 19 of the forms for Jira Service Management 30 Days of Demos series.

What Is TBD? Trunk-Based Development & Its Role in CI/CD

In software development, the name of the game is to develop reliable systems in a fast-paced manner. As development shops have evolved to increase the speed of delivery, many organizations have embraced the Agile development practices of continuous integration and continuous deployment (CI/CD). But the very nature of fast-paced development introduces challenges — particularly around the quality and the reliability of the software being developed.

Everything You Need to Know About Deployment Environments

It's common practice to keep development environments for the same product the same (or at least compatible) for a smooth software development life cycle (SDLC) workflow. That raises the question: why do we need more than one environment for the same product? In today’s modern software development, it is crucial for product development teams to maintain an effective and rapid workflow if they want to gain a competitive edge in the product market.

What is User Datagram Protocol (UDP)?

One of the simplest transport layer protocols available in the TCP/IP protocol suite is the User Datagram Protocol (UDP). The communication mechanism involved is minimal. With UDP, the sender receives no acknowledgement that a packet has been received. This minimalism makes the protocol unreliable, but also easier to process than many other protocols. Although UDP is considered an unreliable transport protocol, it uses IP services to provide best-effort delivery of data.
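As a rough illustration of the "no acknowledgement" behaviour described above, the following Python sketch sends datagrams over the loopback interface with the standard `socket` module. The function names `udp_round_trip` and `send_without_listener`, the loopback address, and the one-second timeout are assumptions chosen for this example and do not come from the original article.

```python
import socket


def udp_round_trip(payload: bytes, timeout: float = 1.0) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # Port 0 asks the OS for any free port on the loopback interface.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout)
        address = receiver.getsockname()

        # No handshake and no connection: the datagram is simply handed
        # to the OS, and nothing comes back to confirm delivery.
        sender.sendto(payload, address)

        data, _ = receiver.recvfrom(4096)
        return data


def send_without_listener(payload: bytes) -> int:
    """Send a datagram to a port that (most likely) nobody listens on."""
    # Reserve a free port, then release it so it is very likely unused.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        probe.bind(("127.0.0.1", 0))
        unused_address = probe.getsockname()

    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # sendto reports how many bytes were handed to the OS,
        # not whether anyone actually received them.
        return sender.sendto(payload, unused_address)


if __name__ == "__main__":
    print(udp_round_trip(b"ping"))
    print(send_without_listener(b"ping"))
```

The second function is the more telling one: the send call completes normally even though no receiver exists, which is why applications that need delivery guarantees over UDP have to add their own acknowledgements and retries.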

New Features and Enhancements in SQL Server 2022

In the era of technology defined by cloud computing, features evolve at the blinding speed of continuous deployment. When large software development organizations like Microsoft deliver semi-annual releases of products, like SQL Server, the volume of new features can be so large they can be hard to grasp. Microsoft continues to push the boundaries on what’s possible, both in on-premises and cloud data platforms, and they aim to make the life of data professionals much more manageable.

What Is AWS Application Cost Profiler? (+ A Better Solution)

One of the best features of the public cloud is that it utilizes a model that enables multiple organizations to share cloud resources. This approach leverages economies of scale, ensuring each tenant sharing those resources receives a lighter bill than if they utilized a private cloud dedicated to just them. Likewise, multi-tenancy in cloud computing is cloud architecture that enables multiple organizations/customers/users to share resources like virtual machines, storage, and server components.

Azure Automation Best Practices

Kelverion have put together this Azure Automation Best Practices Guide to support the creation of automation processes in Azure Automation. Our consultants work with Azure Automation every day and have substantial experience with Azure Automation and IT automation built using other tools. It’s important to recognize that these are recommendations rather than hard and fast rules.

StackPod: Making Customers Successful With Martin Lako of StackState

A while ago, we asked our customers to write reviews about their experiences working with us. With an average rating of 4.6 out of 5 and ten reviews submitted and published within two weeks, we were humbled by the responses. As our CEO, Toffer Winslow, wrote, “Perhaps the thing I was most proud of…was just how frequently our customers commented on the high quality of StackState employees they interact with and the caliber of service we deliver.”

What is Azure Active Directory?

Azure Active Directory (Azure AD) is a comprehensive cloud-based platform used around the world. It is an identity provider and access management service. If a company employs OneDrive, Skype, or Outlook, they are already using Azure in some capacity. Similarly, if a company uses Microsoft Teams or other applications in the Microsoft Office Suite, they are accessing them by logging into Azure AD.

Managing Your Hyperconverged Network with Harvester

Hyperconverged infrastructure (HCI) is a data center architecture that uses software to provide a scalable, efficient, cost-effective way to deploy and manage resources. HCI virtualizes and combines storage, computing, and networking into a single system that can be easily scaled up or down as required.

Automatically scale self-hosted runners in AWS to meet demand

Self-hosted runners allow you to host your own scalable execution environments in your private cloud or on-premises, giving you more flexibility to customize and control your CI/CD infrastructure. Teams with unique security or compute requirements can set up and start using self-hosted runners in under five minutes.

Unified Observability is the Solution IT Has Been Waiting For

IT teams have been relying on observability tools to (theoretically) provide intelligence and insights into operating conditions within an organization’s digital infrastructure for years. But most of these tools have come with significant shortcomings that leave IT teams wanting more.

Introduction to Confidential Computing

Public clouds are great! Yet many users are still reluctant to move their security-sensitive workloads away from their private data centers and into the public cloud because of security concerns. To address these challenges, what we need is a way to perform a privacy-preserving computation that can protect the confidentiality and integrity of your workload. Confidential computing achieves this by running your workload in a hardware-encrypted execution environment that is isolated from the cloud provider’s privileged system software (e.g. hypervisor, host OS, and firmware), as well as its employees.

How Retrospective Data Enhances Reliability Insights

When things go wrong, we try to learn for the next time. Every incident should be a learning opportunity to make your system more reliable for the future. Luckily, with Blameless Reliability Insights, you can see patterns in incidents at a glance, right out of the box. In fact, the ability to tag incidents makes reliability data even more helpful by allowing you to collect granular details about reliability, especially as they pertain to your unique business needs.

IPv4 vs IPv6 - What are the Differences?

An IP (Internet Protocol) address is a numerical label used to identify and locate the network interface of a device connected to a computer network. The most widely used IP version is IPv4, which uses 32-bit addresses. Because IPv4 became so popular that its addresses are being depleted, IPv6, which uses 128-bit addresses, is now being adopted.
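The 32-bit versus 128-bit difference can be checked with Python's standard `ipaddress` module. This is only a small sketch: the helper name `describe` is an assumption for illustration, and the sample addresses come from the ranges reserved for documentation.

```python
import ipaddress
from typing import Tuple


def describe(address: str) -> Tuple[int, int]:
    """Return (IP version, address width in bits) for a textual address."""
    ip = ipaddress.ip_address(address)
    # max_prefixlen is 32 for IPv4 addresses and 128 for IPv6 addresses.
    return ip.version, ip.max_prefixlen


# Sample addresses taken from the documentation-only ranges.
print(describe("192.0.2.1"))    # (4, 32)
print(describe("2001:db8::1"))  # (6, 128)

# Size of each address space, which follows directly from the bit width.
print(2 ** 32)   # 4294967296 possible IPv4 addresses
print(2 ** 128)  # roughly 3.4e38 possible IPv6 addresses
```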

Verify image signatures with GitHub Actions and KeylessPrefix

With the latest releases of Kubewarden v1.1.0 and the verify-image-signatures policy, it’s now possible to use GithubActions or KeylessPrefix for verifying images. Read our previous blog post if you want to learn more about how to verify container images with Sigstore using Kubewarden.

6 Best Practices to Create a Data Center Naming Convention

Naming conventions for assets have a rich, engaging, and often hilarious history. While recently organizations have switched to standardized and more informative naming conventions based on location and function, earlier (and even now) IT departments have had a lot of fun naming assets for their data centers and servers. IT departments have often showcased their inner geek by using names of Star Wars or Star Trek characters for their assets.

FireHydrant Tasks provide turn-by-turn navigation during an incident

An incident has been declared and your runbook has fired. Everyone is gathered in your Slack channel, the tickets are opened, and roles are assigned. Now what? This is when most teams manually update status pages and kick off investigation streams using a patchwork of tribal knowledge and supporting playbook documents.

Data Center Analytics: Top KPIs Chosen by Experts

Today’s data centers generate a lot of data. Intelligent rack PDUs and other metered power infrastructure, environmental sensors, and the constant change in modern data centers all contribute towards a massive volume and variety of data. But data center professionals don’t have the time to collect all the data from its sources, analyze it, and derive insights from it that improve their data center operations.

A birds-eye view with the new dashboard

The Spot family has grown rapidly. Elastigroup, Ocean, and Eco have been joined by Spot PC providing Virtual Desktops, Ocean for Apache Spark, Spot Storage, and Security. Each of these solutions has its own space inside the Spot Console. Today we are excited to unveil a centralized dashboard that provides a full overview of your Spot organization. The new overview dashboard appears as the top option in the side navigation menu accessible to authenticated users.

Get to the Good Stuff

This video blew us away. In 15 seconds, Frito-Lay illustrated that the deliciousness of Tostitos® Salsa comes down to 3 simple, wholesome ingredients, with transparent packaging that enhances the flavor experience. With ultimate visibility, you can see the quality right through the jar. Man, we couldn’t stop talking about the genius of it all. We boiled that 15 seconds down even further. “3 ingredients. Chop, chop. Yum, yum.” (By my count, that’s 8 seconds.)

7 Ways to Accelerate Cloud Native Development

Modern enterprises understand the need to move away from developing monolithic applications to ones that make best use of the cloud to enable business acceleration at scale and speed. That means transforming development to more resilient cloud native architectures that can be readily deployed to cloud, multi-cloud, and hybrid environments. What does it mean to be cloud native?

Save and share reusable dashboard widget groups with Powerpacks

Dashboards allow you to visualize and correlate monitoring data from across disparate data sources, technologies, and infrastructure components to understand what’s going on in your environment. In a growing organization, it’s paramount to standardize how teams build their dashboards to ensure their consistency and legibility.

More reliable merge checks

We are introducing a change to the pull request merge checks that will make them more reliable. Specifically, we will no longer allow pull requests to be merged while a build is in progress. It was possible for a pull request to be merged while some of its builds were still in progress and for those builds to fail after the merge has completed. This created an undesirable situation if build merge checks were enabled.

CICD Tool - Razorops integration with GITLAB

RAZOROPS is the best CI/CD tool since the platform supports running tests, staging, and AWS deployment, all within the pipeline. Razorops helps teams continue to focus on their objective. It helps eliminate queueing, speed up the total build and test cycle, and increase the quality of the code. Razorops integrates easily with GITLAB. You can set up your pipeline within 30 minutes through GitLab.

CICD Tool - Razorops integration with GITHUB

Razorops is a complete container native CI/CD solution handling all aspects of the software lifecycle from the moment a commit is created until it is deployed to production. It is a SaaS-based platform that helps teams be 100% operational on a technical level so they can focus on delivering the best product in a short amount of time. Razorops integrates easily with GITHUB. You can set up your pipeline within 30 minutes through GitHub.

CICD Tool - Razorops integration with BITBUCKET

RAZOROPS is a SaaS-based CI/CD platform. It is one of the best CI/CD tools because it supports running tests and staging, all within pipelines. RAZOROPS is easy to maintain and easy to use. With Razorops, there will be very little overhead, and it helps teams focus on what matters. Razorops integrates easily with BITBUCKET. You can set up your pipeline within 30 minutes through BITBUCKET.

Developers vs. DevOps - the case for developer ownership

With the introduction of container orchestration frameworks like Kubernetes, the adoption of cloud-native technologies, and the transition to microservices architectures, engineering organizations were empowered to build scalable and complex applications. DevOps engineers have had an indispensable role in this revolution, enabling and supporting these processes.

Why SREs Need to Embrace Chaos Engineering

Reliability and chaos might seem like opposite ideas. But, as Netflix learned in 2010, introducing a bit of chaos—and carefully measuring the results of that chaos—can be a great recipe for reliability. Although most software is created in a tightly controlled environment and carefully tested before release, the production environment is harsher and much less controlled.

MetricFire: A Great Instrumental Monitoring Alternative

Instrumental has made the decision to shut down its platform starting August 2022 including its application, servers, and all related APIs being shut down. Users will need to migrate to another solution or risk all their data being permanently deleted! But Instrumental users need not fret!

A Data Lake Is Not Enough to Keep Your Observability Ambitions Afloat

Recently I heard one of our prospects talk about a competitor who was promoting their data lake and ask, how are we different than that? His question got me thinking about why a data lake alone does not provide the depth of observability you really need. The goal of observability is to help SREs, IT Ops and DevOps teams run their IT systems with close-to-zero downtime. Consolidating data from across your environment into a data lake is certainly a good step.

Container Management Report 2022: Timely Advice for a Surging Enterprise Kubernetes Market

2022 Gartner® Market Guide for Container Management identifies the major trends in the container and Kubernetes market and offers guidance for organizations deploying containerized platforms. D2iQ, whose offerings we believe align closely with Gartner recommendations, is listed as a Representative Vendor for container management. The findings in the Gartner container management report should be taken in context with the analyst firm’s predictions for widespread cloud-native adoption.

IDC report: How autonomous compliance ensures better business outcomes

A new report from IDC emphasizes just how critical autonomous compliance is for companies to ensure that their digital infrastructure environments are consistently hardened, resilient, and compliant. Leaders who prioritize compliance optimize company efficiency while reducing risk. The IDC PeerScape report outlines the best practices of these leaders, who, by implementing autonomous compliance, better protect their businesses.

Episode 5: Mooving to... Practical Postmortems

Episode 5, Mooving to… Practical Postmortems covers how to leverage postmortems to effectively learn from failure. Postmortems are a commonplace reference and are now considered a best practice in most modern engineering teams. However, there’s still a lot of confusion on what postmortems should be – and more importantly, what they should NOT be. Thom Duran, Senior Manager of Productivity from Panther walks us through all that and more in the latest Mooving To.. episode!

How Tanzu Application Platform and the Backstage Developer Portal Improve DevX

As cloud native concepts and adoption take hold, many enterprises are now considering and implementing ways to achieve the primary objective of cloud native technology: enabling engineers to make significant changes to systems easily, frequently, and confidently. More and more enterprises are recognizing that cloud native technologies, such as Kubernetes, can indeed serve as the foundational infrastructure for building their own in-house platforms, greatly empowering their operations teams.

How testing in the cloud delivers value to development teams

Testing is an integral part of the software development process and is one of the key ways development teams can better understand how applications function. Testing also prevents changes in the codebase that can affect other parts of the code, enabling you to measure the quality of the software and eliminate any errors before users can interact with it. Most development teams use unit and integration tests to assess their software.

What should you choose? Docker Swarm vs Kubernetes

Since Linux introduced containerisation many years ago, adoption has shifted from the traditional virtual machine to containers. These tools have made application development much easier than it used to be. Docker Swarm and Kubernetes emerged when the number of containers within a system increased, and they help orchestrate these containers. A question that arises is, which one is the better option?

Topology Is Critical for AIOps

In this video, I explain what topology is and why it’s critical for the success of AIOps projects. Simply adding machine learning to event correlation has proven an ineffective approach for root causing IT issues in environments of any size or complexity. If you’re considering different approaches to AIOps, there are two questions you need to ask about topology. This brief video will arm you with those questions and will help make your AIOps project(s) successful.

Announcing HAProxy Data Plane API 2.6

In HAProxy Data Plane API version 2.6, we continued expanding support for HAProxy configuration keywords. This has been the priority for this release cycle and will remain so in the next one, as we work toward our goal of complete feature parity with both the HAProxy configuration and the Runtime API. This will enable you to use HAProxy Data Plane API for configuring HAProxy without any gaps in functionality.

Building a Custom Grafana Dashboard for Kubernetes Observability

Distributed systems open us up to myriad complexities due to their microservices architecture. There are always little problems that arise in the system. Therefore, engineering teams must be able to determine how to prioritize the challenges. Viewing logs and metrics of such systems enables engineers to know the shared state of the system components, thereby informing the decision-making on what challenge needs to be solved most immediately.

Deliver IT infrastructure faster with Continuous Delivery for Puppet Enterprise

Puppet Enterprise users can deliver infrastructure faster with Continuous Delivery for Puppet Enterprise. Watch this demo to learn how to automate testing and delivery of Puppet code from the commit all the way through deploying code into production.

Ensure quick and safe IT infrastructure changes with Puppet Impact Analysis

As your DevOps practice matures, and more users make more changes to infrastructure, it's essential that your changes are safe and deliberate. Impact Analysis is an extension of the core Puppet Enterprise installation. Watch our demo to learn how it provides the guardrails to operate quickly and safely at scale.

When and Why To Adopt Feature Flags

What if there was a way to deploy a new feature into production — and not actually turn it on until you’re ready? There is! These tools are called feature flags (or feature toggles or flippers, depending on whom you ask). Feature flags are a powerful way to fine-tune your control over which features are enabled within a software deployment. Of course, feature flags aren’t the right solution in all cases.
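To make the idea concrete, here is a minimal in-memory sketch of a feature flag in Python. The `FeatureFlags` class, the `new_checkout` flag name, and the `checkout` function are all hypothetical names invented for this illustration; real deployments typically read flags from a configuration service or a dedicated flag platform.

```python
from typing import Dict, Optional


class FeatureFlags:
    """Hypothetical in-memory flag store; unknown flags default to off."""

    def __init__(self, flags: Optional[Dict[str, bool]] = None) -> None:
        self._flags = dict(flags or {})

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)

    def enable(self, name: str) -> None:
        self._flags[name] = True


def checkout(flags: FeatureFlags) -> str:
    # The new code path is already deployed, but it only runs once the
    # (hypothetical) "new_checkout" flag has been switched on.
    if flags.is_enabled("new_checkout"):
        return "new checkout flow"
    return "legacy checkout flow"
```

With this shape, turning the feature on is a data change (`flags.enable("new_checkout")`) and does not require a new deployment.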

How Does Docker Network Host Work?

Docker is a platform as a service product. With Docker, you can easily deploy applications into Docker containers. Containers are software "packages" that bundle together an application's source code with its libraries, configurations, and dependencies. This helps software run more consistently on different machines. To use Docker containers, you need to understand how Docker networking works. Below, we'll answer the question: "what is Docker network host?". We'll also take a look to see how it works.

Our fully-redesigned incident response experience delivers a more intuitive workflow

Today we’re releasing fully redesigned Slack and Command Center experiences for FireHydrant so anyone on your team can intuitively navigate the incident response process — in the app or on the web. There are many things you can do ahead of an incident to help things run smoothly: design and document your process, automate predictable steps, train the team, and run drills.

Default Pull Request Tasks

There are multiple ways to create a task on a pull request. They can be added from the sidebar, top-level pull request comments, file-level comments, or inline comments. Once created, they all appear in the sidebar. On any repository, merge checks can be configured for any branch to only allow merging if all pull request tasks are resolved. This is very useful functionality if some tasks must be resolved before changes are merged.

Voice Network Fraud: How to Fight Back with Automated Threat Prevention

Telecommunications fraud is estimated to be a $39 billion a year problem according to the Communications Fraud Control Association. Despite that, less than 50% of enterprises* have implemented any sort of strategy to address fraud in their voice infrastructure. Firewalls and SBCs are not enough to provide a secure voice network. Enterprises need a more complete approach to network security—one that encompasses the unique vulnerabilities of real-time communications systems—to preempt issues and protect the organization as a whole.

Production Environment Review: The Ultimate Checklist

You’ve written code, you tested it and built it. Now, your release is ready to deploy into production. But: is your production environment ready for the release? That’s a question every IT professional and platform engineer should be asking before accepting a new release — whether the release is an update of an existing app or a totally new deployment. To that end, here’s a checklist to make sure that your production environment is ready to go.

Sponsored Post

Open Source vs. Commercial Cloud Monitoring Tools: How to Choose

There is a multitude of options on the market when it comes to open source and commercial monitoring platforms that are available for cloud management. It can be hard to sift through the various tools and come to an informed decision on what is the best fit for your team. In this article, we will explore the strengths and weaknesses of both open source and commercial tools and when each option is suitable for deployment.

Is Kubernetes Hard? 12 Reasons Why, and What to Do About It

Getting Kubernetes right is hard. If you’ve ever checked out Kelsey Hightower’s “Kubernetes the Hard Way,” you’ll know what we are talking about. Tell your family and friends you’ll see them sometime in the not-so-near future because Kubernetes will be consuming your life. Although Kubernetes adoption is skyrocketing, not all deployments succeed, and the issues that cause deployments to fail can occur between Day 0 planning and Day 2 operation phases.

A new look for Delight, the free, cross-platform monitoring UI for Spark

Delight is a free, cross-platform monitoring UI for Apache Spark. You can install it on top of any existing Spark infrastructure – EMR, Databricks, Spark-on-Kubernetes open-source, Cloudera/Hortonworks, … – by attaching an open-source agent to your Spark applications. Delight consists of an open-source agent attached to your Spark job, and a hosted backend accessible at delight.datamechanics.co.

What To Look For In A Cloud Cost Tool: Must-Haves And Red Flags

When you’re already strapped for time and bogged down by cloud bills that seem higher than they should be, you may not be inclined to devote months to researching the best cloud cost tools. And unless you’ve used a cost tool before, you may have no idea where to start. What happens if you use multiple cloud providers? Is one payment scheme better than another? And what do you do if your company hasn’t stayed on top of tagging?

The Observability Maturity Model Webinar | StackState, TechStrong Research, Ripple X

Based on research and conversations with enterprises from various industries, StackState created the Observability Maturity Model. This model defines the four stages of observability maturity. The ultimate destination is level four, Proactive Observability with AIOps.

Getting Started With Observability on Kubernetes | Webinar with Ricardo Santos and Andreas Prins

Monitoring has traditionally been a way for IT operations to gain insight into the availability and performance of its systems. However, today IT organizations require more than just monitoring. They need a deeper and more precise understanding of what is happening across their IT environment. This is challenging, as infrastructure and applications span multiple environments and are more dynamic, distributed and have to support more ongoing change than ever before.

Generating Secure Passwords for your Linux Server

Having a strong password is necessary to protect our information from being accessed by others. A strong password should be difficult for attackers to identify, guess, or decrypt. Often, when creating a password, we are prompted to include uppercase and lowercase letters along with numbers and special characters. But thinking of a new password every time is very difficult, and most people end up repeating the same password for every website and application they use.
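One way to avoid inventing passwords by hand is to generate them. The following is a minimal sketch using Python's standard `secrets` module; the function name `generate_password`, the default length of 16, and the rule of requiring one character from each class are assumptions for this example, not recommendations taken from the article.

```python
import secrets
import string


def generate_password(length: int = 16) -> str:
    """Return a random password with lowercase, uppercase, digit and symbol."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit all character classes")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        # secrets.choice draws from the OS's cryptographically secure RNG.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate
```

Candidates that miss a character class are simply discarded and regenerated, which keeps the selection uniform over all passwords that satisfy the rule.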

Automating compliance in software delivery

Software development teams face a large and growing number of obstacles: shifting design requirements, organizational blockers, tight deadlines, complicated tech stacks and software supply chains. One emerging challenge that developers and IT leaders face is the need to stay compliant with regulations and control frameworks that stipulate comprehensive data security, incident response, and monitoring and reporting requirements.

State of DevOps 2022: Report Roundup

DevOps has never been more popular than it is today. Since first popularized nearly 15 years ago by Patrick Debois and Gene Kim, DevOps has become the standard approach for managing IT. In this blog post, we’ll look at key trends and data that paint a picture of today’s State of DevOps. You can learn more about the history and fundamentals of the topic in our article What is DevOps and why is it important?

Don't Let Outages Ruin Your Reputation - Prevent Them With AIOps

The world is increasingly digital. The U.S. Census Bureau estimates e-commerce grew 14.2% from 2020 to 2021, for a total of $870.8 billion in sales. And just look at the trends in remote work. According to a FlexJob and Global Workplace Analytics report, remote work has grown 44% over the last five years and an astonishing 159% over the last 12. Indeed, much of America relies on a slew of digital apps and services to get business done every day. So what does this mean for businesses?

Is MetricFire An Alternative to Grafana?

In this article, we will talk about Graphite and Grafana monitoring systems, and their similarities and differences. Also, we will explain why it is an effective solution to use Graphite and Grafana together to monitor your system metrics. We will also learn about the benefits of using MetricFire. Sign up for MetricFire for free and store and process your system metrics with our hosted Graphite solution.

Edge computing vs cloud computing

By now, almost everyone is familiar with cloud computing in one form or another. Throughout the 2010s, the concept of cloud computing evolved within the software industry, then worked its way into everyday life as a universal household term. Somewhat less familiar is the concept of edge computing. The genesis of the “edge” dates to the first content delivery networks in the 1990s. Since then, the edge concept has primarily been the domain of network engineers.

Rack Diagrams: Why You're Doing Them Wrong

Rack diagrams, also known as rack elevations, are visual representations of the IT equipment in a server rack. They are used to track and manage what assets are in each rack and which U position they are in. Rack diagrams are very useful and commonly used for data center asset management and capacity planning. The information rack diagrams provide allows you to know what equipment you have and where you have space to deploy more, and it can improve the troubleshooting process.

19 Best CFO Dashboard And KPI Examples For 2022

Your job as a Chief Financial Officer (CFO) carries a lot of weight, regardless of the size and industry of your organization. Among other responsibilities, you are directly accountable for ensuring your company's financial health by providing accurate, up-to-date, and actionable insights. Now, if your revenue comes from recurring subscriptions, SaaS metrics form the backbone of your business. They measure your profitability and growth.

Press Release: Kelverion releases new Integration Pack for Azure Active Directory due to Microsoft API changes

Kelverion have released a new version of our Integration Pack for Microsoft Azure Active Directory. This new release is a complete rewrite of our Integration Pack because Microsoft have deprecated the underlying API we have been using. This means this release is a breaking update: any Runbooks built against an earlier version of the Integration Pack may cease to operate when you upgrade to this new version.

Top 10 Reasons For A Site Reliability Platform

A system’s reliability is one of the most important things that engineers should care about. Reliable systems keep customers happy and organizations profitable. Investing in processes and tools that ensure systems are reliable can be critical to company success. Site Reliability platforms are a popular choice when it comes to monitoring and observing software services, as they help make responding to and solving application problems easier.

Analyze VPC Flow Logs for AWS Transit Gateway in Datadog

AWS Transit Gateway is a service that makes it easy to connect multiple Amazon Virtual Private Clouds (VPCs), AWS accounts, AWS Regions, and on-premises networks together through a central hub. For AWS customers operating at global scale with many accounts and VPCs, AWS Transit Gateway greatly simplifies AWS networking architecture by eliminating the need to manage complex peering relationships and massive route tables.

Automate deployment of ASP.NET Core apps to Heroku

Known for its cross-platform compatibility and elegant structure, ASP.NET Core is an open-source framework created by Microsoft for building modern web applications. With it, development teams can build monolithic web applications and RESTful APIs of any size and complexity. Thanks to CircleCI’s improved infrastructure and support for Windows platforms and technology, setting up an automated deployment process for an ASP.NET Core application has become even easier.

How to Avoid Getting Your Pod OOMKilled

In this blog, understand why your pod has OOMKilled errors when provisioning Kubernetes resources and how Speedscale can aid with automated testing. When creating production-level applications, enterprises want to ensure the high availability of services. This often results in a lengthy development process that requires extensive testing for the applications or a new release.

Enhance Kubernetes data plane monitoring by scraping Ocean metrics via Prometheus

Spot Ocean functions as an autopilot for the Kubernetes data-plane, as it delivers container-driven autoscaling to continuously monitor and optimize your cloud infrastructure for the cluster. Positioned at a busy crossroads in your application deployment pipeline, Ocean generates and maintains data in several manners/formats – data which is valuable when monitoring the containerized environment.

Why DevOps Engineers Love and Recommend Qovery

My team and I built Qovery to empower DevOps engineers and Developers to work better together, without compromises. In 2022, DevOps engineers need to build reliable infrastructure on top of the best cloud service providers (e.g. AWS, Azure, GCP), dealing with security concerns, productivity, reliability, and many services. DevOps engineers are responsible for a lot of things in an organization, from CI/CD to running the apps in production and backing up databases.

SCP Port: Secure Copy Protocol Definition & Examples

The SCP port has proven to be a very useful tool for SysAdmins. In short, the Secure Copy Protocol (SCP) is a method for securely transferring computer files between a local host and a remote host, or between two remote hosts. It is based on the Secure Shell (SSH) protocol. In other words, SCP servers help you transfer files to and from servers, computers, and other networking devices using a secure SSH tunnel.

Fewer Alerts is Always Better, Right?

Let’s be honest, alert fatigue is a real thing and anyone telling you otherwise is flat out lying. If you have tools generating tens or thousands of daily alerts, eventually people will burn out and simply start ignoring alerts. Even if you have enough team members to divvy up alert reviews, the approach only works for a while. Trouble is, false positives are always generated when managing alerts, and people will eventually ignore false positives.

How to define and measure the reliability of a service

More and more teams are moving away from monolithic applications and towards microservice-based architectures. As part of this transition, development teams are taking more direct ownership over their applications, including their deployment and operation in production. A major challenge these teams face isn't in getting their code into production (we have containers to thank for that), but in making sure their services are reliable.

What's New with VMware Tanzu RabbitMQ for Kubernetes 1.3

Paula Stack and Roser Blasco co-wrote this post. As a refresher, VMware Tanzu RabbitMQ is based on the hugely popular open source technology RabbitMQ, which is a message broker with event streaming capabilities that connects multiple distributed applications and processes high-volume data in real-time and at scale.

Promoted to SRE Advocate: A Dream Turned Reality

I get chills thinking about a line from the first film adaptation of Roald Dahl's Charlie and the Chocolate Factory. Gene Wilder as Wonka nearly whispers it to Charlie, as if it is secret information: We are the music makers, and we are the dreamers of dreams. For me, the quote (taken from a poem by Arthur O'Shaughnessy) is austere: We are the creators of what we create, and what we create becomes what we are.

Easily Scale Your Graphite Deployment

The Graphite database has engineers feeling stuck. Perhaps you’re one of them. You find yourself collecting metrics that were defined years ago when the system was put in place, likely by someone who is no longer with the company. These pre-aggregations make it necessary to collect more data, which results in increased infrastructure and disk space costs.

Best Open Source Application Monitoring Tools

As businesses grow and develop, so must the tools that help manage them. Application monitoring tools provide enterprises with a way to keep track of the health and performance of their applications and ensure that everything is running smoothly. Application monitoring tools have a wide range of capabilities and data that enterprises can use to help answer questions about the current state of an application.

Monitoring Your Platform From Multiple Locations

Mature start-ups and scale-ups create wonderful and challenging environments for Engineers. As the product they’re creating matures and the brand becomes a successful one, the user base generally starts growing, and, for some companies, in places they might not expect it to grow. As that happens, new challenges arise for Engineers. One of these challenges is pretty straightforward to guess: having a particular product available throughout different regions of the world.

Kubernetes 101: How To Set Up "Vanilla" Kubernetes

Kubernetes is an open source platform that, through a central API server, allows controllers to watch and adjust what’s going on. The server interacts with all the nodes to do basic tasks like start containers and pass along specific configuration items such as the URI to the persistent storage that the container requires. But Kubernetes can quickly get complicated. So, let’s look at Vanilla Kubernetes — the nickname for a K8s setup that’s as basic and elementary as it gets.

Using Automation to Transform IT From Cost Center to Value Driver

In today’s digital age, IT has become the central component of business operations. Yet, despite its critical importance, skilled technicians continue to find their hands tied by time-consuming manual tasks. And, while many of these tasks are essential, they do virtually nothing to drive innovation. Introducing automation into the mix can free up IT talent to focus their skills on more important business initiatives – particularly those that drive change and generate revenue.

CICD Pipeline | Case Study | Razorops | 72pi

72pi is an all-in-one platform that lets you compare and adjust your portfolios based on various parameters, complement your investment process, and become a smarter investor at portfolio construction. With Razorops they are able to move and fix faster. Efficient feedback and a fast CI/CD pipeline allowed their team to do frequent end-to-end production deployments and get products into customers' hands more quickly and efficiently.

How To Minimise Alert Fatigue In SRE

Alert fatigue occurs when people become desensitized to the overwhelming number of alerts they receive and are expected to respond to. Even though these alerts are typically easy to respond to, it is the sheer number of them that ultimately causes people to feel fatigued. The higher the number of alerts, the more likely employees are to begin ignoring them and potentially miss an important alert, leading to bigger consequences.

How Gremlin's reliability score works

In order to make reliability improvements tangible, there needs to be a way to quantify and track the reliability of systems and services in a meaningful way. This "reliability score" should indicate at a glance how likely a service is to withstand real-world causes of failure without having to wait for an incident to happen first. Gremlin's upcoming feature allows you to do just that.

Monitor your T2A-powered GKE workloads with Datadog

Arm processors have become increasingly popular in recent years, providing energy-efficient, cost-effective processing power to both mobile and cloud computing ecosystems. As a part of this growth, more and more organizations are choosing to leverage the many benefits of Arm-based architectures for their containerized workloads. Today, Google Cloud announced its Arm-based Tau T2A virtual machines (VMs), which you can also use to run workloads in Google Kubernetes Engine (GKE).

We've raised $34M to help organisations be resilient in the face of failure

TL;DR: We’ve raised $34M to bring increased resilience to organisations around the world. With this latest round of investment we’re expanding internationally in the US, accelerating our product plans, and growing our amazing team 🎉 As technology becomes more complicated and runs an ever greater part of our lives, failure becomes more inevitable, and more costly.

12 SaaS Renewal Best Practices To Ensure Profitability

This is no secret to you. It costs up to seven times more to attract a new customer than to keep an existing one. Upselling an existing customer is also easier than upselling to a new one. So, keeping customers will increase your revenue and profitability. Not to mention, current customers are likely to buy more goods and services than new ones. But that's not all. The longer you keep a customer, the higher their Customer Lifetime Value (CLV) grows, thus maximizing your return on investment.

3 Pro Tips To Get The Most Out Of Qovery - Part 1

Some people spend hours on a spreadsheet and call themselves “Excel Ninja”; here at Qovery, we spend hours on our console because, in case you don’t know yet, we test and deploy using Qovery for Qovery. After a year of using our console almost every day, I started to make a list of all the small tips and tricks that I was able to gather, and because sharing is caring, here are my top three tips to use on Qovery.

Migrate your PSPs to Kubewarden Policies!

As announced in past blog posts, Kubewarden has 100% coverage of the deprecated, and soon to be removed, Kubernetes PSPs. If everything goes as expected the PSPs will be removed in Kubernetes v1.25 due for release on 23rd August 2022. The Kubewarden team has written a script that leverages the migration tool written by AppVia, to migrate PSP automatically. The tool is capable of reading PSPs YAML and can generate the equivalent policies in many different policy engines.

What You Need to Know in This Year's Upskilling IT Global Report

As a gold sponsor for DevOps Institute’s fourth annual Upskilling IT report, we compiled some key takeaways. However, there is so much more to get from this report – so download and read the full report today. In this technology-driven world, skills have a very limited shelf life. The knowledge, tools and resources we rely on in the moment rarely stay relevant or useful forever – especially with rapidly changing demands.

Configuring a pipeline using multiple CircleCI orbs

Continuous integration/continuous delivery (CI/CD) tools give developers the ability to automate the software development process. As soon as developers push code to git, your CI/CD system can build, test, stage, integration test, deploy, and scale. That’s fantastic! In this tutorial, we will look at CircleCI orbs and how they can support your CI/CD practice. We’ll look at how to use multiple orbs and how orbs can help with multi-builds for a variety of application types.

Lessons learned while scaling Collapsed Reply Threads

When the first supporting server-side infrastructure for Collapsed Reply Threads (CRT) shipped with Mattermost v5.29 (November 2020), it included an ominous release note: “This setting is enabled by default and may affect server performance.” While performance concerns are possible with any new feature, most features don’t require significant architecture and data model changes. Most features don’t ship incrementally across 20 monthly releases. And most features – to their credit?

Continuous Training and Deployment for Machine Learning (ML) at the Edge

Running machine learning (ML) inference in Edge devices close to where the data is generated offers several important advantages over running inference remotely in the cloud. These include real-time processing, lower cost, the ability to work without connectivity and with increased privacy.

Infrastructure as Code (IaC) vs. Infrastructure as a Service (IaaS)

The heart of any software development operation is infrastructure. This combination of virtual and physical assets ensures that the flow, storage, processing, and analysis of data remains efficient and as seamless as possible. When it comes to selecting a model for managing and deploying infrastructure, IT managers typically have two choices: infrastructure as code (IaC) or infrastructure as a service (IaaS).

Announcing HIPAA compliance for Platform.sh

If you’ve worked with us before, you know we take security seriously. We take measures necessary to safeguard your sensitive and personally identifiable information and comply with a variety of compliance standards. These include maintaining best security practices and aligning with regulations such as SOC-2, PCI-DSS, and GDPR.

Elevate App Development and DevSecOps Experience with New Integrations in VMware Tanzu Application Platform

Many businesses today rely on delivering modern applications that provide the best customer experience and competitive advantage on any cloud. Modern applications require a modern cloud native infrastructure. One of the clearest signs of cloud native technology mainstreaming (i.e., Kubernetes) is the rapid growth in the number of clusters being deployed in the multi-cloud environment.

Cloud Spend Is Now A Board-Level Issue, Survey Finds

According to Gartner, organizations spent $410.9 billion on cloud services in 2021. In the same year, executives estimated that as much as 30% of their cloud spend was wasted. That’s an aggregate $123.27 billion of waste — money that could go toward innovation, or that could help insulate companies from one of the worst market downturns in years. Organizations face a dilemma: The cloud is essential to modern business, and it’s only getting costlier.
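
The aggregate waste figure follows directly from the two numbers quoted above. As a quick check, here is a minimal sketch of that arithmetic; the function name `estimated_waste` is purely illustrative and not taken from the survey.

```python
from decimal import Decimal


def estimated_waste(total_spend_billions: Decimal, wasted_share: Decimal) -> Decimal:
    """Return the estimated wasted spend, in billions of dollars.

    Both inputs are the figures quoted in the excerpt; Decimal is used
    so the result is exact rather than subject to float rounding.
    """
    return total_spend_billions * wasted_share


# Figures from the excerpt: $410.9 billion total, up to 30% wasted.
waste = estimated_waste(Decimal("410.9"), Decimal("0.30"))
print(f"${waste:.2f} billion")  # prints "$123.27 billion"
```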

Amazon OpenSearch + Squadcast Integration: Routing Alerts Made Easy

Developers often find comfort in embracing open-source software for numerous reasons. One of the most important reasons is the freedom to use that software anywhere and how they wish to. Amazon OpenSearch is an open-source search and analytics suite derived from Elasticsearch. It lets you perform interactive log analytics and real-time application monitoring with ease.

The Improved xMatters Group Experience: Product Feature Updates

We’re constantly looking for new ways to help DevOps, SREs, and operations teams automate operations workflows, secure infrastructure and applications, and rapidly deliver their products at scale. This commitment to our customers — and yours! — led us to redesign the way you experience groups in xMatters.

7 ways tagging incidents can teach you about system health

One of the most powerful ways to prepare for future incidents is to study and learn from patterns in past incidents. Blameless Reliability Insights highlights these patterns for you, with out-of-the-box dashboards that automatically collect and present all types of statistical information about your incidents.

Making the wrong choice on build vs buy

A few years ago I’d just moved to London and started out at my first software job. I was having a great time building things and making new friends, and one evening a friend and I decided there was a new problem we wanted to solve: we really didn’t like the expenses software. We thought it was confusing and over-complex, and decided we could do better.

Stream application logs into Cloud Logging

Do you have workloads that generate logs inside your Google Compute Engine (GCE) instances? Would you like to troubleshoot your application directly from Google Cloud Platform? Then check out this video to learn how to install and configure the Ops Agent to stream any third party application log into Cloud Logging.

Building World Class Solutions Together

With the launch of the Cycle Partner Program, we are committing to making it easier for companies to work with Cycle by creating more transparent and predictable relationships, offering training, resources, incentives, and benefits, some of which will roll out over time as the program evolves. Interested in partnering with Cycle? Contact our partner lead to schedule a meeting.

Sponsored Post

Using Open Source for API Observability

API Observability isn't exactly new; however, its popularity has grown rapidly in the past few years. API Observability using open source is different from regular API monitoring, as it allows you to get deeper and extract more valuable insights. Although it takes a bit more effort to set up, once you've got an observability infrastructure running it can be immensely helpful not only in catching errors and making debugging easier, but also in finding areas that can be optimized.

10 Best DevOps Tools to Make Your Life Easier

DevOps is not just a set of practices but a culture that focuses on collaboration. Thus, its primary goal is to foster improved communication between software development and operations teams. There’s, more or less, a perfect DevOps tool for every job. This article will cover the 10 best of them so you can make informed choices.

Multi-cloud connectivity tips: Transferring data between cloud regions

When it comes to multi-cloud, it’s easy to assume that the biggest connectivity challenge facing network professionals would be connecting between different cloud providers. But actually moving data between different geographic instances of the same cloud platform is not as straightforward as you might think.

Shutdown by a Cyber Criminal: What To Know About Security Breach Preparedness and Response in 2022

Winston & Strawn LLP Health Care and Life Sciences Summit 2022. Watch this session for an engaging discussion with our VP, Privacy and Security Joey Stanford and our Data Privacy Counsel and Manager of Data Privacy and Compliance Erika Bustamante about recent lessons learned from security breach investigations, including the details that most companies do not think about until they are in the thick of an incident response.

Is Open-Source Kubernetes Free? Yes, "Like a Puppy." Here's Why.

The common misconception of open-source Kubernetes is that it is free—but in reality, it has a lot of associated costs, including labor and potential business losses from wasted time, effort, and being late to market. Just like a puppy, Kubernetes software itself might be free, but a do-it-yourself (DIY) deployment involves a lot of care, patience, and unforeseen costs.

UX Deep Dive: Classify interactions for a more intuitive user interface

We try hard to make our products as intuitive and familiar as possible, but there will always be “advanced” options and rarely-used features. Giving users choice and control over their experience will naturally lead to features that are used less frequently or settings that only a small percentage of users will change. So how do we decide what order and prominence to give to these lesser-used features?

How the best leaders are fueled by failure ft. TaxBit CTO, Tramale Turner

How should leaders be showing up during failure? Rob sits down with TaxBit CTO, Tramale Turner to dive deep on embracing failure. In this episode, learn how to integrate failure as part of your company culture, when to ask pertinent questions, and how to not make the same mistake twice.

Optimizing Developer Experience with Open DevOps

Delivering customer value and fostering development innovation through fast and iterative releases has been top of mind, thanks to the advent of digital transformation. Developers must adapt to an increasing work pace and workload, while navigating a dynamic workplace. Many organizations are relying on industry prescribed methodologies and tools to adapt to this new normal. However, these same organizations are learning this approach doesn’t deliver the “have it all” outcome they expected.

Automatically enroll your AWS accounts with the Onboarding Stackset

Spot’s onboarding process is simple: once you have a new AWS account, you are only a few clicks away from creating a Spot account and associating the two together. But for bigger organizations that are provisioning AWS accounts regularly, this repetitive process becomes a bit laborious. There’s also another quibble with manual onboarding: suppose you want to grant Spot new permissions for a new product you’d like to use in all of your AWS accounts.

Important Ways to Regain Control of Azure Cost Management

This blog briefly covers vital ways to regain control of Azure cloud costs and achieve cost optimization. To start with, let us have a short introduction to Azure cost management. As cloud adoption grows in enterprises, it is getting more difficult for them to manage cloud spending across the organization.

Blueprint for Secure OSS Supply Chains

Open source has become a critical part of global infrastructure. Kubernetes and cloud native adoption is seeing record high growth, especially at large companies. An estimated 5.6 million developers use Kubernetes today. Alongside this growth, software supply chain attacks are on the rise with some reports showing them having increased 650% in 2021. These attacks have had huge knock-on effects to the extent that the White House has issued an executive order and additional guidance with recommendations and upcoming regulation.

DevOps Engineer Job Description

As a manager, it’s up to you to be on-point about who does what in your team. While all of the job titles and descriptions can get a little confusing at times, knowing comes with the territory. Nowadays, DevOps engineers are an increasingly important part of any solid IT team. Today, we’ll take a look at the DevOps engineer job description, so by the time we’re done, you’ll be an expert on the topic. DevOps, as we’ve covered before, is not one thing, but many.

The Difference Between Generation 1 and Generation 2 AIOps Platforms

In this video, I explain the key difference between Generation 1 and Generation 2 AIOps platforms. As organizations develop strategies for implementing AIOps and as they consider different vendor approaches, it’s critical to understand the differences between those approaches. This brief video will help arm you with a key question you need to ask to easily identify the difference between Gen 1 platforms and Gen 2 platforms. It’s all about the types of data being collected.

Creating competitive differentiation in telcos through cloud-native

It’s never too early or late to start talking about cloud-native. By 2025, more than 95% of new workloads will be deployed on cloud-native platforms. Clearly, a lot of organizations are on their way to cloud-native adoption, among them some of the prominent telecom operators of our time. After all, the benefits of cloud-native are most pronounced in the telecom sector, where the need for scale, automation and predictable cost structure at optimal OPEX and CAPEX is more persistent than ever.

6 Snowflake Cost Optimization Strategies And Best Practices

Database costs are a fact of doing business for most cloud-based companies. But unlike a traditional database, which runs all the time and therefore charges you a fairly flat rate for continuous service, Snowflake operates on a different pricing model. Snowflake allows you to set up multiple database warehouses that store your raw data. You can access these warehouses on demand, whenever you execute a query.

What is Transmission Control Protocol (TCP) and How it works?

The Transmission Control Protocol provides reliable, ordered and, sometimes, time-sensitive data flow between applications across a network. It also economizes network use by attempting to improve error-handling capability and providing reliable data transmission. The Transmission Control Protocol is the underlying communication protocol for a wide variety of applications, including web servers and websites, email applications, FTP and peer-to-peer apps.

DevOps Best Practices for Database

DevOps has been bridging the gap between the development and operations teams for more than a decade. It eliminates the organizational barriers between the two and automates the delivery process. It's time to start treating databases the same way we treat the delivery pipeline when applying DevOps. When we have a large database, automation is crucial. When the database has too much information, changing a table can take ages and block further changes like inserts, updates, or deletes.

VMware Application Catalog Now Accessible through VMware Marketplace

Neeharika Palaka and Shagun Tewari co-wrote this blog post. VMware Marketplace is VMware’s one-stop shop for all ecosystem solutions, with a robust catalog of more than 2,000 solutions covering open source software, first-party tools, and commercial software. VMware Marketplace is currently used by thousands of people to download, deploy, subscribe to, and purchase these solutions in a direct and easy way.

That's a Wrap for DevOps Loop 2022: Recap and Highlights

For the second year in a row, the DevOps community came together virtually for our DevOps Loop conference. This event allowed us to examine DevOps and its core principles in the context of modern applications, multi-cloud, and Kubernetes. Organizations are increasingly looking to internal platform teams to deliver an awesome developer experience while ensuring reliability, scalability, and security, by unlocking the path to production for modern apps and helping their products soar!

Chaos Engineering Tools: Build vs Buy

Chaos Engineering, where engineers intentionally inject failure to test the reliability of their systems, is becoming a regular practice for companies who value uptime and availability. As cloud-based systems have grown more complex, Chaos Engineering has become a critical part of the software testing and release process to uncover surprise dependencies, fix problems before they become 3am outages, and bake reliability into every feature.

Redgate SQL Monitor - Monitor SQL server performance and availability

Open up Redgate SQL Monitor and, wherever your servers are, you’ll get the full picture of their health in an instant. Its web-based interface gives you an at-a-glance understanding of your entire estate whether it’s hosted on-premises, on Virtual Machines, or in Azure, AWS or Google Cloud. You can then drill down to analyse both current and historic metrics for your servers such as top queries, waits, tempdb and more.

Cut Your Cloud Burn with Intel and Densify

Intel Cloud Optimizer (ICO), powered by Densify, the market leading resource optimization analytics engine, will tell you if there are immediate and truly actionable opportunities to reduce costly excess resources safely to immediately cut cloud related burn. For qualifying enterprises (based on annual cloud spend) Intel funds the cost of advanced analytics software and expert assistance from both Densify and Intel cloud architects.

CICD Pipeline Using Razorops | Continuous Integration | Continuous Deployment

Teams can have a complete CI/CD pipeline with the aid of Razorops. It is a fully managed continuous integration and deployment platform that builds container images from source code and runs tests on a complete software package that is ready for deployment.

Kubernetes Cluster Autoscaler vs Karpenter

One of the most exciting things when using Kubernetes is the ability to scale the number of nodes up and down based on application consumption. You don’t have to manually add and remove nodes on demand; the cluster adjusts to usage on its own. Obviously, what you want is to keep control over the minimum and the maximum number of nodes to avoid an unexpected bill.

What is LDAP and How Does it Work?

LDAP, or Lightweight Directory Access Protocol, is one of the oldest and most popular protocols used to retrieve information from directory services, authenticate users, and build applications that don’t compromise on security or speed. It’s one of the protocols to manage assets and data over a network and provides secure access to them. So what is LDAP? How does it work? What are some of the best practices while using the LDAP protocol? Let’s have a look.

Bitbucket Cloud migration Q&A

With support for Atlassian Server products ending in February 2024, many of you are likely evaluating or planning your migration to Bitbucket Cloud. To help you navigate the migration process, we've put together a list of frequently asked questions that we hear from customers. Each section below has several resource links to help you learn more and start planning your migration.

Tracing Gorm queries with OpenCensus & Google Cloud Tracing

At incident.io we use gorm.io as the ORM library for our Postgres database, it’s a really powerful tool and one I’m very glad for after years of working with hand-rolled SQL in Go & Postgres apps. You may have seen from our other blog posts that we’re heavily invested in tracing, specifically with Google Cloud Tracing via OpenCensus libraries.

Enabling Trust Driven Development - Shipa Insights

When you think of TDD, you might lean towards Test-Driven Development. In Tomasz Manugiewicz’s ACE 2022 talk, though, the ‘T’ in TDD could also mean Trust, i.e. Trust-Driven Development. The talk boils down to this: if there is trust, there is autonomy. If there is autonomy, creativity flourishes. Building trust is done incrementally; incremental success builds success. Software engineering is a team sport and an exercise in iteration.

Introducing VMware Tanzu GemFire for Redis Apps

The release of VMware Tanzu GemFire 9.15 introduces compatibility with the VMware Tanzu GemFire for Redis Apps add-on. This add-on enables compatibility between Redis applications and Tanzu GemFire for the first time ever, unlocking enterprise-ready features for your Redis applications.

We're increasing the default cron jitter from 5 to 20 minutes

At Platform.sh, we are committed to making your site perform as best as possible. As part of this commitment, we need to smooth down the system load spikes as much as possible—especially when many crons are triggered at the same time on a particular Grid region. To do so, we are increasing the default cron jitter from five minutes to 20 minutes.
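
To illustrate why a wider jitter window smooths load spikes, here is a minimal simulation sketch. It assumes each cron start is delayed by a uniformly random amount within the jitter window; that distribution, and the names `jittered_start` and `peak_starts_per_minute`, are illustrative assumptions and not a description of how Platform.sh actually implements jitter.

```python
import random
from collections import Counter


def jittered_start(scheduled_minute: float, max_jitter_minutes: float,
                   rng: random.Random) -> float:
    """Return the scheduled time plus a random delay within the jitter window."""
    return scheduled_minute + rng.uniform(0, max_jitter_minutes)


def peak_starts_per_minute(n_jobs: int, max_jitter_minutes: float, seed: int = 0) -> int:
    """Simulate n_jobs all scheduled at minute 0 and return the busiest minute's count."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    starts = Counter(
        int(jittered_start(0, max_jitter_minutes, rng)) for _ in range(n_jobs)
    )
    return max(starts.values())


# Compare the busiest minute under the old and new default windows
# for 1,000 hypothetical crons that share the same schedule.
peak_5 = peak_starts_per_minute(1000, 5)
peak_20 = peak_starts_per_minute(1000, 20)
print(f"peak starts per minute: {peak_5} with 5 min jitter, {peak_20} with 20 min jitter")
```

Under this assumption the same number of jobs is spread over four times as many minutes, so the busiest minute sees roughly a quarter of the starts.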

Add traceability to your pipeline with Configuration as Code

Configuring applications, services, and environments by modifying plain text files is a standard part of modern software development. Configuration as Code (CaC) takes this one step further by systematically generating, storing, and managing configuration files. CaC allows development teams to automate config management for their applications and environments while ensuring consistency and traceability throughout the development life cycle.

The future of K3s and Kubernetes

Join us in our roundtable panel as we discuss how the future of k3s is being shaped by the industry and examples of how we utilize k3s applications. Kunal Kushwaha and Kai Hoffman, Developer Advocates at Civo, will address the concepts surrounding k3s as well as where the Kubernetes industry is heading. K3s is designed to be a single binary of less than 40MB that completely implements the Kubernetes API. This is recognized as a fully CNCF (Cloud Native Computing Foundation) certified Kubernetes offering whilst removing a lot of the extra drivers that aren’t needed.

geeks+gurus: Tackling Common DevOps and Security Issues in Game Development

In this 25-minute conversation, Melissa Sussmann and Jason Dunne will lead a discussion with special guest Yuval Dovrat - Amazon Web Services, Solutions Architect. Discussion will cover the unique challenges gaming presents for DevOps practitioners and security engineering teams. We will cover.

Monitor custom serverless metrics with the Datadog Lambda extension

When building serverless applications on AWS Lambda, Amazon CloudWatch provides out-of-the-box metrics that measure the performance, errors, and duration of your functions. Although these standard Lambda metrics provide visibility into your serverless applications, it can also be invaluable to monitor custom metrics that are unique to your use case and application.

Find Flow Podcast - The Past, Present and Future of AIOps

In this video, Sean McDermott of the Find Flow podcast sits down with Ani Gujrathi, chief technical officer for Zenoss, and myself. We dive into the original approach to AIOps, how that has evolved, and how it continues to evolve. We start by exploring the entire purpose of AIOps — figuring out how to accelerate problem resolution in modern, complex IT environments while dealing with the pervasive problem of monitoring tool silos.

Creating your first Pub project with JFrog Artifactory

Developers today need to build software for many platforms in order to reach their users, all while maintaining quality and achieving the best user experience possible. This can be a challenging task when you need to meet the growing needs of software development. This is where Dart and Flutter come into the picture.

Drive Tanzu Mission Control Cluster Configuration and Add-ons with Flux CD

VMware Tanzu Mission Control users can now drive clusters via GitOps. This new feature of Tanzu Mission Control is built on Flux CD and enables users to attach a git repository to a cluster and sync YAML artifacts (using Kustomize) from the repository to the cluster. This feature provides a method for managing cluster configurations with Tanzu Mission Control via continuous delivery from a git repository.

Building Everything-as-Code? Learn These CI/CD Processes and Tools First

Here at Kublr we always emphasize how important it is to understand the foundations of Kubernetes (K8s) and its operations tools so you can more efficiently manage your applications and simplify your cloud-native development workflow. Understanding these components on the front end is equally important as we begin our build processes, especially when building with an everything-as-code approach.

A Path to Legacy Application Modernization Through Kubernetes

Modern application deployments rely heavily on containerization for its scalability, availability and ease of maintenance. Legacy applications implemented before the containerization era often use monolithic, hardware-centric architectures that are difficult to scale and manage. These legacy applications may have multiple services bundled into the same deployment unit without a logical grouping.

Benefits of running continuous integration jobs on self-hosted infrastructure

The first continuous integration (CI) tools were all self-hosted, meaning they ran on a developer’s local computer or server. Although this setup was viewed favorably by dev teams at the time, it has limited flexibility, and developers had to spend time maintaining the infrastructure.

Q2 2022 product retrospective - Last quarter's top features

The second quarter is now over. After the start of our V3 at the beginning of this quarter, we are super happy to announce that it’s now out in Alpha. But there is so much more to speak about, so without further ado, let me show you all the great things we achieved during the past quarter 🚀

How 2bcloud supports clients in setting up and implementing Continuous Integration/Continuous Delivery

Continuous integration (CI) / continuous delivery (CD) is a model that allows software development teams to automate the integration and delivery of code changes in a more frequent and reliable manner. This gives development teams more time to improve the quality of their code and test in greater depth, and leads to more customer deployments overall.

Proactively monitor service performance with SLO alerts

Service level objectives (SLOs) state your team’s goals for maintaining the reliability of your services. Adopting SLOs is an SRE best practice because it can help you ensure that your services perform well and consistently deliver value to users. But to gain the greatest benefit from your SLOs, you need ongoing visibility into how well your services are performing relative to your objectives.

What I learned from leading my first incident

A few weeks ago we had a major incident. We were releasing our Practical Guide to Incident Management, and after posting about it online an incident.io employee noticed that the page wasn’t loading. Just to set the scene, I’ve been at incident.io for 3 months and didn’t have any experience of incidents in my previous role. When the team got paged I expected this to be one of those “follow along and learn how the wizards work their magic” exercises.

A CFO's Guide To Evaluating Cloud Spend

We have a term we like to use when we meet CFOs who have just gotten their biggest AWS bill ever: bill shock. Bill shock is when finance suddenly rings the alarm that the bill is “too high” and gets everyone scrambling to explain what they’re spending money on. It often happens when the bill reaches a new milestone (the first million, ten million, or hundred million) or growth trajectory (it doubled in a quarter!?). The problem with bill shock is that it can be highly disruptive.

Change Failure Rate explained

This post is the third in a series of deeper dive articles discussing DORA metrics. The third metric we’ll examine, Change Failure Rate, is a lagging indicator that helps teams and organizations understand the quality of software that has been shipped, providing guidance on what the team can do to improve in the future.
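
As a rough illustration, Change Failure Rate is commonly computed as the share of deployments that led to a failure in production. The sketch below assumes a hypothetical `Deployment` record with a `failed` flag; the record shape, the function name, and the sample data are illustrative and not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical deployment record; the fields are assumptions for illustration."""
    id: str
    failed: bool  # True if the deployment led to a failure in production


def change_failure_rate(deployments: list[Deployment]) -> float:
    """Return the fraction of deployments that failed (0.0 when there are none)."""
    if not deployments:
        return 0.0
    failures = sum(1 for d in deployments if d.failed)
    return failures / len(deployments)


# Made-up sample data: one failure out of four deployments.
history = [
    Deployment("d1", failed=False),
    Deployment("d2", failed=True),
    Deployment("d3", failed=False),
    Deployment("d4", failed=False),
]
print(f"Change Failure Rate: {change_failure_rate(history):.0%}")
```

What counts as a "failure" (a rollback, a hotfix, an incident) is a team decision, so the `failed` flag here stands in for whatever definition a team agrees on.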

Insurance Provider Reduces Software Licensing Costs, Saving Millions

A large U.S.-based insurance provider was experiencing rising database software licensing costs. In order to reduce the software licensing costs, the organization needed to complete a comprehensive infrastructure analysis of over 200 physical servers. Seventy-five percent of these physical servers supported one software application: their database solution. Additionally, the software routinely utilized only two to four cores, despite each server having 24 cores.
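
To put those figures in perspective, here is a back-of-the-envelope calculation using only the numbers quoted above. It treats "over 200" as exactly 200 and assumes every database server shows the same two-to-four-core usage, which the summary does not state; the variable names are illustrative.

```python
# Figures quoted in the summary above.
TOTAL_SERVERS = 200        # "over 200", so treat this as a lower bound
DB_SHARE = 0.75            # share of servers supporting the database solution
CORES_PER_SERVER = 24
CORES_USED_RANGE = (2, 4)  # cores the software routinely utilized

# Servers dedicated to the database solution.
db_servers = round(TOTAL_SERVERS * DB_SHARE)

# Per-server core utilization at the low and high end of the quoted range.
utilization = [used / CORES_PER_SERVER for used in CORES_USED_RANGE]

# Unused cores across all database servers (assumes uniform usage).
idle_cores = [db_servers * (CORES_PER_SERVER - used) for used in CORES_USED_RANGE]

print(f"Database servers: {db_servers}")
print(f"Core utilization: {utilization[0]:.0%} to {utilization[1]:.0%}")
print(f"Idle cores: {idle_cores[1]} to {idle_cores[0]}")
```

Under these assumptions, roughly 150 servers each use well under a fifth of their cores, which matters because database software is often licensed per core.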

Speedscale Traffic Replay is now v1.0

Nate Lee here, and I’m one of the founders of Speedscale. The founding team’s worked at several observability and testing companies like New Relic, Observe Inc, and iTKO over the last decade. Speedscale traffic replay was born out of frustration with reacting to problems (even if they were minor) that could have been prevented with better testing.

Why Now Is the Best Time to Become a Sysadmin

Does technology fascinate you? Are you curious about and interested in learning about different software, hardware and devices? If you answered yes to these questions, you should become a system administrator, aka sysadmin. A sysadmin is responsible for monitoring and maintaining computer systems in a network or environment that has multiple users. It’s a great time to become a sysadmin now because the technology sector is booming and yet it’s facing a major skills shortage.

Why Companies Are Increasingly Going Multi-Cloud

Multi-cloud strategy – the use of multiple private or public clouds – is increasingly becoming the main method companies use to deploy their IT infrastructure. In the next three years, an estimated 64% of companies will rely on multi-cloud as their main deployment model. Despite the complexities that come from operationalizing it, as we discussed in The Challenges of Building Multi Cloud, the multiple benefits that come from this deployment model can often make it worth the effort.

It's Time for a Straight-Forward Pricing Model

Today, we’re excited to announce the release of Cycle’s new pricing model! With this new model, we aim to make our pricing far more straightforward and better suited for larger deployments and customers. While our current pricing model solved the needs of our customers for the last few years, we’ve learned enough that it’s now time to make a change. Before talking about the new model, let’s dive into how we got here.

Civo Update - July 2022

In June, we hosted our online meetup with ContainIQ surrounding k8s monitoring and observability. You can catch up on the discussion between Matthew Lenhard (Co-founder & CTO of ContainIQ) and Kai Hoffman (Developer Advocate at Civo) here if you missed it. Meanwhile, Kamesh Sampath from our Developer Advocate demo program explains how Civo’s speed and developer experience are great to work with in our latest Civo Shorts.

Upcoming improvements: Reducing deployment downtime, improving caching strategy, and pausing crons

At Platform.sh, we are committed to making your deployment experience as fast and seamless as possible, so that you can continue pushing changes as much as you need, and keep your customers happy. As part of this commitment, we are releasing three new infrastructure improvements, which will greatly improve caching strategy and significantly reduce downtime during deployments.

Monitor Azure Functions with the Datadog extension for Azure App Service

Azure Functions is an on-demand serverless compute offering built on top of Azure App Service that enables you to deploy event-driven code without the need to provision and manage infrastructure. Because applications rely on Azure Functions to handle business-critical tasks such as processing orders or logging in users, it’s important to ensure that your functions respond quickly when they’re invoked.

Using Argo CD and Kustomize for ConfigMap Rollouts

Kubernetes offers a way to store configuration files and manage them via a ConfigMap. Functionally, ConfigMaps seem very similar to Kubernetes Secrets: both constructs are used to store information that can be used in a Pod. This information could be usernames and passwords or a connection string to a database.

What Is Replatforming? Everything You Need To Know

Depending on your business reason for migrating to the cloud, you can either move in one go or incrementally. If, however, you prefer to retain some of your operations, app design, or workflows on-premises, you can still do so based on the cloud migration strategies you use. Sometimes, it makes more sense to modify your existing system rather than make too many changes all at once. Cost, data security, and service availability (and with it, revenue and customer experience) are just three concerns.

Build and deploy a Nuxt3 application to Netlify

Imagine you want to build and deploy a Nuxt3 app on Netlify. Because custom scripts are not allowed on Netlify, you will not be able to perform custom tasks like automated testing before deploying the website to your Jamstack hosting platform. That is where continuous integration/continuous deployment comes in. With a CI/CD system, you can run the kind of automated tests that create successful deployments.

DevOps Release Management Best Practices

Because DevOps practices can bring great speed and reliability to the software delivery lifecycle, release management can seem daunting. But the improved visibility and collaboration brought about by DevOps can also help with the release management process. DevOps-centric release management is the future of software development and IT operations.

NoOps Explained: How Does NoOps Compare with DevOps?

Since the evolution of the IT industry, different concepts have been introduced to enhance and speed application production. Automating processes is gradually becoming the way forward and, so far, the best way to speed the deployment process of projects. Today, though, NoOps has come along. The prevalence of NoOps means manual intervention may not be needed in IT operations, but is this going to mean the extinction of DevOps? Turns out, NoOps might just be a next step in the progression of DevOps.