Citrix Hypervisor, formerly known as Citrix XenServer, is a type 1 hypervisor that enables organizations to run and manage an entire virtual infrastructure—including VMs, virtual desktops, and virtual applications. Organizations can also use Citrix Hypervisor to optionally host these virtual workloads with higher availability and flexibility by implementing managed server groups called resource pools.
Major outages are bound to occur in even the most well-maintained infrastructure and systems. Being able to quickly classify the severity level also allows your on-call team to respond more effectively. Imagine a scenario where your on-call team is getting critical alerts every 15 minutes, user complaints are piling up on social media, and, since your platform is inoperative, revenue losses are mounting every minute. How do you go about getting your application back on track? This is where understanding incident severity and priority can be invaluable. In this blog we look at severity levels and how they can improve your incident response process.
A few episodes ago, we talked with fellow podcaster and tech evangelist Dotan Horovits. During that episode, Dotan shared that he wrote a blog post with Jujhar Singh called “How Much Observability Is Enough?” which is definitely a recommended read if you’re implementing observability and feeling overwhelmed. After reading this article, we were eager to invite Jujhar to the StackPod as well, to dive into this topic a bit more.
Developers rejoice! The Multipass team has been listening to your feedback, and we are excited to announce that the latest update to Multipass contains one of our most requested features – instance modification. For those who are just discovering Multipass, it’s software designed to make working with virtual machines as painless as possible. It has an intuitive command line interface, and abstracts away the hard work of configuring, launching, modifying and destroying VMs.
Many organizations struggle to create data-driven cultures where each employee is empowered to make decisions based on data. This is especially true for enterprises with a variety of systems and tools in use across different teams. If you are a leader, manager, or executive focused on how your team can leverage Google's SRE practices or wider DevOps practices, you are definitely in the right place!
What if we could find a way to protect ourselves from PowerShell Gallery outages, with a more highly-available option? Well, Adil may have just the very solution for you here at Cloudsmith! 😉
One difficult challenge in the software development cycle is increasing the speed of development while ensuring the quality of the code remains the same. The data world has adopted software development practices in recent years to test data changes before deployment. The testing process can be time-consuming and prone to unexpected errors.
Containers and microservices have revolutionized the way applications are deployed on the cloud. Since its launch in 2014, Kubernetes has become the de facto standard for container orchestration. In this tutorial, you will learn how to deploy a Node.js application on Azure Kubernetes Service (AKS) with continuous integration and continuous deployment (CI/CD).
Performetriks is a service provider that specializes in assessing and improving application performance and security for enterprise clients. To streamline these processes, Performetriks offers frameworks for automation, benchmarking, and security testing, as well as tools that evaluate and improve application performance. This includes their Composer tool, an on-prem piece of software that allows teams to more efficiently manage monitoring settings by storing, tracking, and managing them as code.
What is self-healing infrastructure and why do you need it? The first part is easy; it’s exactly what the name implies. It’s a methodology for creating automation that allows systems to identify and repair errors and misconfigurations without any human action. The “why” is a little more complex, but, like self-healing infrastructure, is well worth the effort.
Rachna Srivastava contributed to this blog post. Given the popularity of Prometheus and the open source community behind it, it’s no surprise that customers often ask about support for the Prometheus Query Language, PromQL. Many users are already comfortable with PromQL but need the additional performance and scalability of the VMware Tanzu Observability platform.
Happy System Administrator’s Day! July 29th is the day we thank and honor all the hard-working system administrators (and network administrators, network engineers, IT helpdesk staff – basically anyone who helps keep the networks up and running) for all their hard work. You’ve spent the last year fixing problems, onboarding users, integrating new systems and keeping the entire world connected.
On behalf of all Canonical teams, I am happy to announce the general availability of Ubuntu Confidential VMs (CVMs) on Microsoft Azure! They are part of the Microsoft Azure DCasv5/ECasv5 series, and only take a few clicks to enable and use. Ubuntu 20.04 is the first and only Linux distribution to support Confidential VMs on Azure.
Usually, we use top and htop to monitor a Linux system and see the running processes along with CPU and memory utilization. But these commands have certain limitations that prevent them from giving a detailed overview of system performance. Bashtop is an alternative that overcomes these limitations. In this blog, we will learn about Bashtop, its advantages and disadvantages, its shortcuts, and how to install it.
In a previous blog post, we dove into the wayback machine and looked at Simple Network Management Protocol (SNMP) Traps – a technology that allows devices (including network devices) to send alerts when specific thresholds have been reached. In this post, we are going to be a bit more forward looking and discuss some technologies that will, in theory, replace SNMP. It is important to keep in mind that the demise of SNMP has been predicted for years (actually decades).
GitHub is a web-based platform used for project version control and codebase hosting. GitHub uses Git, a widely-used version control system. GitLab and Bitbucket are similar tools. Using GitHub is a prerequisite of most tutorials on the CircleCI blog, so it is helpful to learn to use it. In this tutorial, I’ll show you how to push a project to GitHub.
Last month we announced the 3 major features we are adding to the Codefresh platform: dashboards for DORA metrics, support for any external Continuous Integration system, and a hosted GitOps service. The hosted GitOps experience (powered by Argo CD) is now available to all new Codefresh accounts (even free ones) so that simply by signing up you can start deploying applications right away to your Kubernetes cluster without having to maintain your own Argo CD installation.
Joining a new dev team can be an exciting but somewhat intimidating experience. On one hand, you’re jumping into new adventures and opportunities. On the other hand, most onboarding experiences are fraught with stress and a sense of being overwhelmed by how much you have to learn, fast, to be able to contribute to your new team. To be honest, I’d never worked at a place where the developer onboarding experience was particularly memorable – until I joined Helios.
Small confession: we currently use the term 'post-mortem' in incident.io despite preferring the term 'incident debrief'. Unless you have particularly serious incidents, the link to death here really isn’t helping anyone. However, we're optimising for familiarity, so we're sticking to the term 'post-mortem' here. Ask any engineer and they’ll tell you that a post-mortem is a positive thing (despite the scary name).
As more companies adopt SaaS services over on-premise delivery models, there is a natural concern around data security and platform availability. Words on a vendor’s website can provide insights to prospective customers on the process and policies that companies have in place to alleviate these concerns. However, the old adage of “actions speak louder than words” does apply. Trust in a website’s words only goes so far.
Waste is never a good thing. And rumblings of an economic downturn, alongside dire warnings of climate change, are making it increasingly necessary to address waste. As a society, we need to reduce consumption, data included. First, we all must acknowledge the high cost of data. Despite the prevailing opinion of the 2010s, data isn’t free. There’s a monetary and carbon cost to keeping data alive.
“Lead time to deploy” means the interval from when the code gets written to when it’s been deployed to production. It has also been described as “how long it takes you to run CI/CD.” How important is it? It’s nigh-on impossible to have a high-performing team if you have a long lead time, and shortening your lead time makes your team perform better, both directly and indirectly.
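As a rough illustration of that definition, lead time can be computed as the gap between two timestamps. This is only a sketch: the function name `lead_time_hours` and the idea of using a commit timestamp as the "code written" moment are assumptions made for the example, not something the post prescribes.

```python
from datetime import datetime


def lead_time_hours(committed_at: datetime, deployed_at: datetime) -> float:
    """Return the hours between a commit and its production deployment."""
    if deployed_at < committed_at:
        raise ValueError("deployment cannot precede the commit")
    return (deployed_at - committed_at).total_seconds() / 3600


if __name__ == "__main__":
    # Hypothetical timestamps, chosen only to exercise the function.
    committed = datetime(2022, 8, 1, 9, 0)
    deployed = datetime(2022, 8, 1, 15, 30)
    print(f"lead time: {lead_time_hours(committed, deployed):.1f} h")
```

Tracking this number per change, rather than as a single average, makes it easier to see which deployments are the slow ones.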
We are happy to announce the first minor release of Kubewarden v1.0: v1.1.1 is now available! For those of you new to Kubewarden, it is a policy manager for Kubernetes.
When considering application source code, the way you maintain consistency throughout environments is mostly straightforward. You write application code, commit it to source control, and then build, test and deploy via a CI/CD pipeline. Since the application is defined by the source code living in source control, the build will be identical in all environments to which it’s deployed. But what about the infrastructure on which an application runs?
We have covered a plethora of topics on Active Directory (AD) in parts one to nine of this series on Active Directory Domain Services. In this final and 10th part, we will look at one other crucial aspect of AD—Group Policies and Group Policy Objects (GPOs). We will discuss what Group Policies are and what role GPOs play in the effective setup of any AD environment.
Let's face it: no one likes patching. When I was a practitioner, we always put off patching until it was absolutely necessary. Until a business need – such as updating an application version or support ending for a version – arose, we didn't patch because "If it ain't broke, don't fix it." We all know this is a bad practice; let's remind ourselves why. The longer a system goes without being patched, the more changes will accumulate.
Sysadmins, short for system administrators, serve as a crucial subset of IT engineers and support staff and are often under-appreciated. Sysadmins are the lynchpins that provide continuity, performance, and security to the systems that connect every corner of the world. When COVID-19 scattered large workforces in offices across small home office networks, organizations relied on their sysadmins more than ever before to maintain work processes.
Today we are pleased to announce GitLab support on CircleCI. Teams using GitLab SaaS can now build, test, and deploy on CircleCI, and access CircleCI’s most popular features like Docker layer caching and automatic test-splitting. GitLab is now the third version control system we support, in addition to GitHub and Bitbucket.
GitHub is one of the most popular source control platforms available. It relies on Git concepts, and millions of developers use it. GitHub Actions embrace all aspects of what source control needs, such as branching, pull requests, feature flags, and versioning. It also integrates nicely into third-party continuous integration and continuous deployment (CI/CD) pipelines or deployment tools like Azure DevOps, Jenkins, GitLab, and Octopus Deploy.
The recent heatwave that brought record temperatures to the UK caused cooling systems to fail at a London data center resulting in downtime for Google and Oracle. According to Oracle, “Following unseasonably high temperatures in the UK south (London) region, two cooler units in the data centre experienced a failure when they were required to operate above their design limits.
The days in which a business could thrive by serving customers through brick-and-mortar stores alone are long gone. Almost all retailers now offer a variety of online and offline channels, often with some degree of integration to ensure a smooth customer journey across different touchpoints. However, even these multichannel and cross-channel strategies are increasingly falling short of modern expectations.
Thinking about breaking into the DevOps space? DevOps has become one of the biggest tech buzzwords. Tech giants – like Facebook, Amazon, or Google – have numerous open positions for DevOps engineers. But it is a competitive field to break into. So if you’ve been prepping for DevOps roles, here are some of the most common interview questions (and potential answers) to expect, including.
Cilium is a Container Network Interface (CNI) for securing and load-balancing network traffic in your Kubernetes environment. As a CNI provider, Cilium extends the orchestrator’s existing network capabilities by giving teams more control over how they build their applications and monitor traffic. For example, vanilla Kubernetes installations typically rely on traditional firewalls and Linux-based network utilities like iptables to filter pod-to-pod traffic by an IP address or port.
In Part 1, we looked at some key metrics for monitoring the health and performance of your Cilium-managed Kubernetes clusters and network. In this post, we’ll look at how Hubble enables you to visualize network traffic via a CLI and user interface. But first, we’ll briefly look at Hubble’s underlying infrastructure and how it provides visibility into your environment.
In Part 2 of this series, we showed how Hubble, Cilium’s observability platform, enables you to view network-level details about service dependencies and traffic flows. Cilium also integrates with various standalone monitoring tools, so you can track the other key metrics discussed in Part 1. But since the platform is an integral part of your infrastructure, you need the ability to easily correlate Cilium network and resource metrics with data from your Kubernetes resources.
An Active Directory (AD) environment has things like forests, trees, domains, organizational units, and objects. After growing acquainted with these concepts, the next step on this learning journey is to understand AD sites.
With HAProxy situated in front of their servers, many people leverage it as a frontline component for enabling extra security and observability for their networks. HAProxy provides a way to monitor the number of TCP connections, the rate of HTTP requests, the number of application errors and the like, which you can use to detect anomalous behavior, enforce rate limits, and catch application-related problems early.
If you’re managing multiple Kubernetes clusters at scale, you’ve probably run into Kubernetes cluster sprawl. And if you haven’t, brace yourself, because you’ll likely cross that bridge in the near future.
Have you considered cloud portability, i.e., the ability to easily move workloads between on-premises systems and across multiple cloud service providers (CSPs)? The idea is that workloads should run in the environment that delivers the most value for your organization, but as that “optimal” environment can change over time, you need to be able to move your workloads accordingly.
The DevOps practice of continuous integration and continuous deployment (CI/CD) improves software delivery. CI/CD platforms monitor and automate the application development process ensuring a better application, faster. CI/CD pipelines build code, run tests, and deploy a production-ready version of an application that has passed all automated checks.
Managing source code with a defined method is one vital aspect of implementing effective application development. Today, two strategies for doing this stand above the rest: trunk-based development and GitFlow. Choosing the proper method for source code control often depends on several factors. In this article, let’s define and compare trunk-based development and GitFlow, and look at the factors that drive an organization’s decision between the two.
In software development, the name of the game is to develop reliable systems in a fast-paced manner. As development shops have evolved to increase the speed of delivery, many organizations have embraced the Agile development practices of continuous integration and continuous deployment (CI/CD). But the very nature of fast-paced development introduces challenges — particularly around the quality and the reliability of the software being developed.
One of the simplest transport layer protocols in the TCP/IP protocol suite is the User Datagram Protocol (UDP). The communication mechanism involved is minimal. With UDP, neither the sender nor the receiver gets any acknowledgement that packets were received. This omission makes the protocol unreliable, but also easier to process than many other protocols. Although UDP is considered an unreliable transport protocol, it uses IP services to make a best-effort attempt to deliver data.
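The fire-and-forget behaviour described above can be seen in a small loopback sketch using Python's standard `socket` module. The helper name `udp_round_trip` and the use of 127.0.0.1 with an OS-chosen port are assumptions made for illustration.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        # UDP gives no delivery guarantee, so never wait forever for data.
        receiver.settimeout(2.0)
        address = receiver.getsockname()

        # sendto() returns once the datagram is handed to the OS. There is
        # no handshake and no acknowledgement from the receiving side.
        sender.sendto(payload, address)

        data, _peer = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_round_trip(b"hello over UDP"))
```

Note that the sender never learns whether the datagram arrived. On a real network, any retry or acknowledgement logic would have to be built by the application itself.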
Kelverion have put together this Azure Automation Best Practices Guide to support the creation of automation processes in Azure Automation. Our consultants work with Azure Automation every day and have substantial experience with Azure Automation and IT automation built using other tools. It’s important to recognize that these are recommendations rather than hard and fast rules.
A while ago, we asked our customers to write reviews about their experiences working with us. With an average rating of 4.6 out of 5 and ten reviews submitted and published within two weeks, we were humbled by the responses. As our CEO, Toffer Winslow wrote, “Perhaps the thing I was most proud of…was just how frequently our customers commented on the high quality of StackState employees they interact with and the caliber of service we deliver.”
Hyperconverged infrastructure (HCI) is a data center architecture that uses software to provide a scalable, efficient, cost-effective way to deploy and manage resources. HCI virtualizes and combines storage, computing, and networking into a single system that can be easily scaled up or down as required.
Self-hosted runners allow you to host your own scalable execution environments in your private cloud or on-premises, giving you more flexibility to customize and control your CI/CD infrastructure. Teams with unique security or compute requirements can set up and start using self-hosted runners in under five minutes.
IT teams have been relying on observability tools to (theoretically) provide intelligence and insights into operating conditions within an organization’s digital infrastructure for years. But most of these tools have come with significant shortcomings that leave IT teams wanting more.
An IP (Internet Protocol) address is a numerical label that identifies and locates the network interface of a device connected to a computer network. The most widely used IP version is IPv4, which uses 32-bit addresses. Because IPv4 became so popular, its addresses are being depleted, so IPv6, which uses 128-bit addresses, is now used as well.
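The 32-bit versus 128-bit difference can be inspected with Python's standard `ipaddress` module. This is a minimal sketch; the helper name `describe` is hypothetical, and the sample addresses come from the ranges reserved for documentation.

```python
import ipaddress


def describe(address: str) -> dict:
    """Return the IP version, address width in bits, and integer value."""
    ip = ipaddress.ip_address(address)
    return {
        "version": ip.version,
        "bits": ip.max_prefixlen,  # 32 for IPv4, 128 for IPv6
        "as_int": int(ip),         # the address is just one large number
    }


if __name__ == "__main__":
    for sample in ("192.0.2.1", "2001:db8::1"):
        print(sample, describe(sample))
    # Size of each address space follows directly from the bit width.
    print("IPv4 addresses:", 2 ** 32)
    print("IPv6 addresses:", 2 ** 128)
```

A 32-bit space holds about 4.3 billion addresses, which is why IPv4 depletion became a practical concern.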
With the latest releases of Kubewarden v1.1.0 and the verify-image-signatures policy, it’s now possible to use GithubActions or KeylessPrefix for verifying images. Read our previous blog post if you want to learn more about how to verify container images with Sigstore using Kubewarden.
An incident has been declared and your runbook has fired. Everyone is gathered in your Slack channel, the tickets are opened, and roles are assigned. Now what? This is when most teams manually update status pages and kick off investigation streams using a patchwork of tribal knowledge and supporting playbook documents.
Today’s data centers generate a lot of data. Intelligent rack PDUs and other metered power infrastructure, environmental sensors, and the constant change in modern data centers all contribute towards a massive volume and variety of data. But data center professionals don’t have the time to collect all the data from its sources, analyze it, and derive insights from it that improve their data center operations.
ITOps and DevOps are technology management practices that have been around long enough that anyone in IT should have a good grasp of what they mean. Here’s our experts’ take on ITOps vs. DevOps.
In a DevOps environment, continuous testing is essential to success. By automating the testing process, you can release new, bug-free code faster, and more efficiently. In this software development tutorial, we will examine continuous testing, its benefits, and best practices.
Dashboards allow you to visualize and correlate monitoring data from across disparate data sources, technologies, and infrastructure components to understand what’s going on in your environment. In a growing organization, it’s paramount to standardize how teams build their dashboards to ensure their consistency and legibility.
With the introduction of container orchestration frameworks like Kubernetes, the adoption of cloud-native technologies, and the transition to microservices architectures, engineering organizations were empowered to build scalable and complex applications. DevOps engineers have had an indispensable role in this revolution, enabling and supporting these processes.
Reliability and chaos might seem like opposite ideas. But, as Netflix learned in 2010, introducing a bit of chaos—and carefully measuring the results of that chaos—can be a great recipe for reliability. Although most software is created in a tightly controlled environment and carefully tested before release, the production environment is harsher and much less controlled.
Instrumental has decided to shut down its platform starting August 2022, including its application, servers, and all related APIs. Users will need to migrate to another solution or risk all their data being permanently deleted! But Instrumental users need not fret!
Hello Netreo Customers! In case you missed our Q1 survey invitation, I’m Netreo’s Vice President of Product Management. I joined Netreo for the opportunity to create great business solutions and customer experiences.
Recently I heard one of our prospects talk about a competitor who was promoting their data lake and ask, how are we different than that? His question got me thinking about why a data lake alone does not provide the depth of observability you really need. The goal of observability is to help SREs, IT Ops and DevOps teams run their IT systems with close-to-zero downtime. Consolidating data from across your environment into a data lake is certainly a good step.
2022 Gartner® Market Guide for Container Management identifies the major trends in the container and Kubernetes market and offers guidance for organizations deploying containerized platforms. D2iQ, whose offerings we believe align closely with Gartner recommendations, is listed as a Representative Vendor for container management. The findings in the Gartner container management report should be taken in context with the analyst firm’s predictions for widespread cloud-native adoption.
A new report from IDC emphasizes just how critical autonomous compliance is for companies to ensure that their digital infrastructure environments are consistently hardened, resilient, and compliant. Leaders who prioritize compliance optimize company efficiency while reducing risk. The IDC PeerScape report outlines the best practices of these leaders, who, by implementing autonomous compliance, better protect their businesses.
Episode 5, Mooving to… Practical Postmortems covers how to leverage postmortems to effectively learn from failure. Postmortems are a commonplace reference and are now considered a best practice in most modern engineering teams. However, there’s still a lot of confusion on what postmortems should be – and more importantly, what they should NOT be. Thom Duran, Senior Manager of Productivity from Panther walks us through all that and more in the latest Mooving To.. episode!
As cloud native concepts and adoption take hold, many enterprises are now considering and implementing ways to achieve the primary objective of cloud native technology: enabling engineers to make significant changes to systems easily, frequently, and confidently. More and more enterprises are recognizing that cloud native technologies, such as Kubernetes, can indeed serve as the foundational infrastructure for building their own in-house platforms, greatly empowering their operations teams.
Testing is an integral part of the software development process and is one of the key ways development teams can better understand how applications function. Testing also prevents changes in the codebase that can affect other parts of the code, enabling you to measure the quality of the software and eliminate any errors before users can interact with it. Most development teams use unit and integration tests to assess their software.
It can be a big can of worms, but tackling IT downtime can be the first step to major cost savings. Here’s everything you need to know about downtime but were too afraid to ask.
Since containerisation was introduced on Linux many years ago, the industry has shifted from traditional virtual machines to containers. Containers have made application development much easier than it used to be. Docker Swarm and Kubernetes emerged as the number of containers within a system grew, helping to orchestrate them. A question that arises is: which one is the better option?
Ciara discusses how to analyze SBOMs for vulnerabilities using open source tools, and how Cloudsmith can take actions like quarantining your images if they contain vulnerabilities above a certain level.
In HAProxy Data Plane API version 2.6, we continued the effort of expanding support for HAProxy configuration keywords. This has been the priority for this release cycle, and it will be for the next one too, so that we meet our goal of complete feature parity with both the HAProxy configuration and the Runtime API. This will enable you to use HAProxy Data Plane API for configuring HAProxy without any gaps in functionality.
Distributed systems open us up to myriad complexities due to their microservices architecture. There are always little problems that arise in the system. Therefore, engineering teams must be able to determine how to prioritize the challenges. Viewing logs and metrics of such systems enables engineers to know the shared state of the system components, thereby informing the decision-making on what challenge needs to be solved most immediately.
What if there was a way to deploy a new feature into production — and not actually turn it on until you’re ready? There is! These tools are called feature flags (or feature toggles or flippers, depending on whom you ask). Feature flags are a powerful way to fine-tune your control over which features are enabled within a software deployment. Of course, feature flags aren’t the right solution in all cases.
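The basic mechanism can be sketched in a few lines: the new code path ships to production, but a flag lookup decides at runtime whether it executes. Everything here is hypothetical (the `FLAGS` dictionary, the `new_checkout` flag, and the `checkout` function); real deployments typically read flags from a configuration service rather than an in-process dictionary.

```python
# Hypothetical in-memory flag store. The new feature is deployed but off.
FLAGS = {"new_checkout": False}


def is_enabled(name: str, flags: dict = FLAGS) -> bool:
    """Unknown flags default to off, so a missing entry cannot enable code."""
    return bool(flags.get(name, False))


def checkout(cart_total: float, flags: dict = FLAGS) -> str:
    """Route to the new or legacy code path depending on the flag."""
    if is_enabled("new_checkout", flags):
        return f"new checkout flow: {cart_total:.2f}"
    return f"legacy checkout flow: {cart_total:.2f}"


if __name__ == "__main__":
    print(checkout(42.0))                          # flag off
    print(checkout(42.0, {"new_checkout": True}))  # flag on, no redeploy
```

Defaulting unknown flags to off is a deliberate choice in this sketch: a typo in a flag name then fails safe instead of exposing unfinished work.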
With ARM-based dev machines and servers becoming more common, it has become increasingly important to build Docker images that support multiple architectures. This guide will show you how to build these Docker images on any machine of your choosing.
Docker is a platform as a service product. With Docker, you can easily deploy applications into Docker containers. Containers are software "packages" that bundle together an application's source code with its libraries, configurations, and dependencies. This helps software run more consistently on different machines. To use Docker containers, you need to understand how Docker networking works. Below, we'll answer the question: "what is Docker network host?". We'll also take a look to see how it works.
Today we’re releasing fully redesigned Slack and Command Center experiences for FireHydrant so anyone on your team can intuitively navigate the incident response process — in the app or on the web. There are many things you can do ahead of an incident to help things run smoothly: design and document your process, automate predictable steps, train the team, and run drills.
To learn more about functional vs non-functional testing, visit: https://circleci.com/blog/functional-vs-non-functional-testing/
You’ve written code, you tested it and built it. Now, your release is ready to deploy into production. But: is your production environment ready for the release? That’s a question every IT professional and platform engineer should be asking before accepting a new release — whether the release is an update of an existing app or a totally new deployment. To that end, here’s a checklist to make sure that your production environment is ready to go.
Getting Kubernetes right is hard. If you’ve ever checked out Kelsey Hightower’s “Kubernetes the Hard Way,” you’ll know what we are talking about. Tell your family and friends you’ll see them sometime in the not-so-near future because Kubernetes will be consuming your life. Although Kubernetes adoption is skyrocketing, not all deployments succeed, and the issues that cause deployments to fail can occur between Day 0 planning and Day 2 operation phases.
The Multi-access Edge Compute (MEC) framework enables mobile operators, application developers, and content providers to deploy predictable cloud-computing capabilities at the network’s edge and in the immediate proximity of mobile networks.
Today we are rolling out our new "operation log viewer." This feature is not just a new page to show logs but a whole new way to find logs for actions taken by you and your team on your applications. The new log viewer page supports historical deployment logs, allowing you to see logs from previous actions.
Mattermost v7.1 (Extended Support Release) is generally available today. The following new features are included (see changelog for more details).
Having a strong password is necessary to keep our information from being accessed by others. A strong password should be difficult for attackers to identify, guess, or crack. When creating passwords, we are usually prompted to include uppercase and lowercase letters along with numbers and special characters. But thinking of a new password every time is very difficult, and most people end up repeating the same password for every website and application they use.
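One way to avoid inventing passwords by hand is to generate them. The sketch below uses Python's standard `secrets` module, which is intended for security-sensitive randomness. The function name `generate_password` and the default length of 16 are assumptions for illustration, and this is not necessarily the tool the post goes on to describe.

```python
import secrets
import string


def generate_password(length: int = 16) -> str:
    """Return a random password with lower, upper, digit and symbol chars."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit all four classes")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented at least once.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

Pairing generated passwords with a password manager removes the temptation to reuse one password everywhere.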
Software development teams face a large and growing number of obstacles: shifting design requirements, organizational blockers, tight deadlines, complicated tech stacks and software supply chains. One emerging challenge that developers and IT leaders face is the need to stay compliant with regulations and control frameworks that stipulate comprehensive data security, incident response, and monitoring and reporting requirements.
DevOps has never been more popular than it is today. Since first popularized nearly 15 years ago by Patrick Debois and Gene Kim, DevOps has become the standard approach for managing IT. In this blog post, we’ll look at key trends and data that paint a picture of today’s State of DevOps. You can learn more about the history and fundamentals of the topic in our article What is DevOps and why is it important?
We’re proud to announce the general availability of dcTrack® 8.2, the latest version of Sunbird’s DCIM Operations software. This release includes exciting new features including a ticket connector to drive automation via integration, automatic data network diagrams for remote visualization, and asset audit via barcode scanning.
The world is increasingly digital. The U.S. Census Bureau estimates e-commerce grew 14.2% from 2020 to 2021, for a total of $870.8 billion in sales. And just look at the trends in remote work. According to a FlexJobs and Global Workplace Analytics report, remote work has grown 44% over the last five years and an astonishing 159% over the last 12 years. Indeed, much of America relies on a slew of digital apps and services to get business done every day. So what does this mean for businesses?
The Multi-access Edge Compute (MEC) framework enables mobile operators, application developers, and content providers to deploy predictable cloud-computing capabilities at the network’s edge and in the immediate proximity of mobile networks.
Is 99.999% uptime realistic? We cover why you should care, and how you can achieve it.
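To put that target in perspective, here is a small worked calculation (a sketch, not taken from the original post) of how much downtime an availability percentage leaves. The function name `allowed_downtime_minutes` and the 365-day year are assumptions made for illustration. At 99.999%, the budget comes to roughly 5.26 minutes per year.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime budget, in minutes, for a given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The downtime budget is the fraction of the period NOT covered by the target.
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

Each additional "nine" cuts the budget by a factor of ten, which is why five nines leaves almost no room for manual incident response.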
In this article, we will talk about Graphite and Grafana monitoring systems, and their similarities and differences. Also, we will explain why it is an effective solution to use Graphite and Grafana together to monitor your system metrics. We will also learn about the benefits of using MetricFire. Sign up for MetricFire for free and store and process your system metrics with our hosted Graphite solution.
By now, almost everyone is familiar with cloud computing in one form or another. Throughout the 2010s, the concept of cloud computing evolved within the software industry, then worked its way into everyday life as a universal household term. Somewhat less familiar is the concept of edge computing. The genesis of the “edge” dates to the first content delivery networks in the 1990s. Since then, the edge concept has primarily been the domain of network engineers.
Rack diagrams, also known as rack elevations, are visual representations of the IT equipment in a server rack. They are used to track and manage what assets are in each rack and which U position they are in. Rack diagrams are very useful and commonly used for data center asset management and capacity planning. The information rack diagrams provide lets you know what equipment you have and where you have space to deploy more, and it can improve the troubleshooting process.
Kelverion have released a new version of our Integration Pack for Microsoft Azure Active Directory. This new release is a complete rewrite of our Integration Pack because Microsoft have deprecated the underlying API we have been using. This means this release is a breaking update: any Runbooks built against an earlier version of the Integration Pack may cease to operate when you upgrade to this new version.
A system’s reliability is one of the most important things that engineers should care about. Reliable systems keep customers happy and organizations profitable. Investing in processes and tools that ensure systems are reliable can be critical to company success. Site Reliability platforms are a popular choice when it comes to monitoring and observing software services, as they help make responding to and solving application problems easier.
AWS Transit Gateway is a service that makes it easy to connect multiple Amazon Virtual Private Clouds (VPCs), AWS accounts, AWS Regions, and on-premises networks together through a central hub. For AWS customers operating at global scale with many accounts and VPCs, AWS Transit Gateway greatly simplifies AWS networking architecture by eliminating the need to manage complex peering relationships and massive route tables.
Known for its cross-platform compatibility and elegant structure, ASP.NET Core is an open-source framework created by Microsoft for building modern web applications. With it, development teams can build monolithic web applications and RESTful APIs of any size and complexity. Thanks to CircleCI’s improved infrastructure and support for Windows platforms and technology, setting up an automated deployment process for an ASP.NET Core application has become even easier.
In this blog, understand why your pod has OOMKilled errors when provisioning Kubernetes resources and how Speedscale can aid with automated testing. When creating production-level applications, enterprises want to ensure the high availability of services. This often results in a lengthy development process that requires extensive testing for the applications or a new release.
Sleuth is pleased to announce a new set of features that enable our customers to measure, compare, and drive efficiency improvements on a per-team basis!
Let’s be honest, alert fatigue is a real thing and anyone telling you otherwise is flat out lying. If you have tools generating tens or thousands of daily alerts, eventually people will burn out and simply start ignoring alerts. Even if you have enough team members to divvy up alert reviews, the approach only works for a while. Trouble is, false positives are always generated when managing alerts, and people will eventually ignore false positives.
More and more teams are moving away from monolithic applications and towards microservice-based architectures. As part of this transition, development teams are taking more direct ownership over their applications, including their deployment and operation in production. A major challenge these teams face isn't in getting their code into production (we have containers to thank for that), but in making sure their services are reliable.
Communications are evolving at a dizzying pace, challenging service providers of all stripes to adopt the latest technologies and reshape their businesses for the new world. Telco Cloud, an area my team and I are deeply focused on, is one of the key components in making that transformation a success.
Paula Stack and Roser Blasco co-wrote this post. As a refresher, VMware Tanzu RabbitMQ is based on the hugely popular open source technology RabbitMQ, which is a message broker with event streaming capabilities that connects multiple distributed applications and processes high-volume data in real-time and at scale.
The Graphite database has engineers feeling stuck. Perhaps you’re one of them. You find yourself collecting metrics that were defined years ago when the system was put in place, likely by someone who is no longer with the company. These pre-aggregations make it necessary to collect more data, which results in increased infrastructure and disk space costs.
As businesses grow and develop, so must the tools that help manage them. Application monitoring tools provide enterprises with a way to keep track of the health and performance of their applications and ensure that everything is running smoothly. Application monitoring tools have a wide range of capabilities and data that enterprises can use to help answer questions about the current state of an application.
Mature start-ups and scale-ups create wonderful and challenging environments for Engineers. As the product they’re creating matures and the brand becomes a successful one, the user base generally starts growing, and, for some companies, in places they might not expect it to grow. As that happens, new challenges arise for Engineers. One of these challenges is pretty straightforward to guess: having a particular product available throughout different regions of the world.
Kubernetes is an open source platform that, through a central API server, allows controllers to watch and adjust what’s going on. The server interacts with all the nodes to do basic tasks like start containers and pass along specific configuration items such as the URI to the persistent storage that the container requires. But Kubernetes can quickly get complicated. So, let’s look at Vanilla Kubernetes — the nickname for a K8s setup that’s as basic and elementary as it gets.
In today’s digital age, IT has become the central component of business operations. Yet, despite its critical importance, skilled technicians continue to find their hands tied by time-consuming manual tasks. And, while many of these tasks are essential, they do virtually nothing to drive innovation. Introducing automation into the mix can free up IT talent to focus their skills on more important business initiatives – particularly those that drive change and generate revenue.
Alert fatigue occurs when people become desensitized to the overwhelming number of alerts they receive and are expected to respond to. Even though these alerts are typically easy to respond to, it is the sheer number of them that ultimately causes people to feel fatigued. The higher the number of alerts, the more likely it is that employees will begin to ignore them and potentially miss an important alert, leading to bigger consequences.
In order to make reliability improvements tangible, there needs to be a way to quantify and track the reliability of systems and services in a meaningful way. This "reliability score" should indicate at a glance how likely a service is to withstand real-world causes of failure without having to wait for an incident to happen first. Gremlin's upcoming feature allows you to do just that.
Arm processors have become increasingly popular in recent years, providing energy-efficient, cost-effective processing power to both mobile and cloud computing ecosystems. As a part of this growth, more and more organizations are choosing to leverage the many benefits of Arm-based architectures for their containerized workloads. Today, Google Cloud announced its Arm-based Tau T2A virtual machines (VMs), which you can also use to run workloads in Google Kubernetes Engine (GKE).
TL;DR: We’ve raised $34M to bring increased resilience to organisations around the world. With this latest round of investment we’re expanding internationally in the US, accelerating our product plans, and growing our amazing team 🎉 As technology becomes more complicated and runs an ever greater part of our lives, failure becomes more inevitable, and more costly.
As announced in past blog posts, Kubewarden has 100% coverage of the deprecated, and soon to be removed, Kubernetes PSPs. If everything goes as expected, the PSPs will be removed in Kubernetes v1.25, due for release on 23rd August 2022. The Kubewarden team has written a script that leverages the migration tool written by AppVia to migrate PSPs automatically. The tool is capable of reading PSP YAML and can generate the equivalent policies in many different policy engines.
As a gold sponsor for DevOps Institute’s fourth annual Upskilling IT report, we compiled some key takeaways. However, there is so much more to get from this report – so download and read the full report today. In this technology-driven world, skills have a very limited shelf life. The knowledge, tools and resources we rely on in the moment rarely stay relevant or useful forever – especially with rapidly changing demands.
Continuous integration/continuous delivery (CI/CD) tools give developers the ability to automate the software development process. As soon as developers push code to git, your CI/CD system can build, test, stage, integration test, deploy, and scale. That’s fantastic! In this tutorial, we will look at CircleCI orbs and how they can support your CI/CD practice. We’ll look at how to use multiple orbs and how orbs can help with multi-builds for a variety of application types.
When the first supporting server-side infrastructure for Collapsed Reply Threads (CRT) shipped with Mattermost v5.29 (November 2020), it included an ominous release note: > This setting is enabled by default and may affect server performance. While performance concerns are possible with any new feature, most features don’t require significant architecture and data model changes. Most features don’t ship incrementally across 20 monthly releases. And most features – to their credit?
If you’ve worked with us before, you know we take security seriously. We take measures necessary to safeguard your sensitive and personally identifiable information and comply with a variety of compliance standards. These include maintaining best security practices and aligning with regulations such as SOC-2, PCI-DSS, and GDPR.
Over the last few months, a number of our users have asked if we can add more context to their alerts. We spoke with them over the live chat on our dashboard and brainstormed the idea of Title Remapper.
"atop" is an advanced system and process monitor used in Linux environments to analyze server performance, which needs to be analyzed continuously. It gives us a report on the activity of all processes running on a server.
Many businesses today rely on delivering modern applications that provide the best customer experience and competitive advantage on any cloud. Modern applications require a modern cloud native infrastructure. One of the clearest signs of cloud native technology mainstreaming (i.e., Kubernetes) is the rapid growth in the number of clusters being deployed in the multi-cloud environment.
We’re constantly looking for new ways to help DevOps, SREs, and operations teams automate operations workflows, secure infrastructure and applications, and rapidly deliver their products at scale. This commitment to our customers — and yours! — led us to redesign the way you experience groups in xMatters.
A few years ago I’d just moved to London and started out at my first software job. I was having a great time building things and making new friends, and one evening a friend and I decided there was a new problem we wanted to solve: we really didn’t like the expenses software. We thought it was confusing and over-complex, and decided we could do better.
With the launch of the Cycle Partner Program, we are committing to making it easier for companies to work with Cycle by creating more transparent and predictable relationships, offering training, resources, incentives, and benefits, some of which will roll out over time as the program evolves. Interested in partnering with Cycle? Contact our partner lead to schedule a meeting.
So, you’re looking for the right instance type for your public cloud workload, but how do you decide? Major cloud providers such as Amazon Web Services (AWS), Microsoft Azure Cloud and Google Cloud Platform (GCP) now offer such a large catalog of IaaS instances that it can become difficult to make sense of it all.
API Observability isn't exactly new; however, its popularity has grown rapidly in the past few years. API Observability using open source is different from regular API monitoring, as it allows you to get deeper and extract more valuable insights. Although it takes a bit more effort to set up, once you've got an observability infrastructure running it can be immensely helpful not only in catching errors and making debugging easier, but also in finding areas that can be optimized.
The common misconception of open-source Kubernetes is that it is free—but in reality, it has a lot of associated costs, including labor and potential business losses from wasted time, effort, and being late to market. Just like a puppy, Kubernetes software itself might be free, but a do-it-yourself (DIY) deployment involves a lot of care, patience, and unforeseen costs.
We try hard to make our products as intuitive and familiar as possible, but there will always be “advanced” options and rarely-used features. Giving users choice and control over their experience will naturally lead to features that are used less frequently or settings that only a small percentage of users will change. So how do we decide what order and prominence to give to these lesser-used features?
The pstree command is a Linux command that displays the running processes as a tree. It is a visual alternative to the similar ps command. The root of the tree in its output is either init or the process with the given pid.
It’s never too early or late to start talking about cloud-native. By 2025, more than 95% of new workloads will be deployed on cloud-native platforms. Clearly, a lot of organizations are on their way to cloud-native adoption, among them some of the prominent telecom operators of our time. After all, the benefits of cloud-native are most pronounced in the telecom sector, where the need for scale, automation and predictable cost structure at optimal OPEX and CAPEX is more persistent than ever.
The Transmission Control Protocol provides reliable, ordered and, sometimes, time-sensitive data flow between applications across a network. It also economizes network use by attempting to improve error-handling capability and providing reliable data transmission. The Transmission Control Protocol is the underlying communication protocol for a wide variety of applications, including web servers and websites, email applications, FTP and peer-to-peer apps.
DevOps has been bridging the gap between the development and operations teams for more than a decade. It eliminates the organizational barriers between the two and automates the delivery process. It's time to start treating databases the same way we treat the delivery pipeline when applying DevOps. When we have a large database, automation is crucial. When the database has too much information, changing a table can take ages and block further changes like inserts, updates, or deletes.
At incident.io, we’re continually building out our integrations to work with all the tools you already know and love. Next on the list is our first bug tracker, Sentry. Try posting a Sentry link on your next incident to check it out.
Neeharika Palaka and Shagun Tewari co-wrote this blog post. VMware Marketplace is VMware’s one-stop shop for all ecosystem solutions, with a robust catalog of more than 2,000 solutions covering open source software, first-party tools, and commercial software. VMware Marketplace is currently used by thousands of people to download, deploy, subscribe to, and purchase these solutions in a direct and easy way.
For the second year in a row, the DevOps community came together virtually for our DevOps Loop conference. This event allowed us to examine DevOps and its core principles in the context of modern applications, multi-cloud, and Kubernetes. Organizations are increasingly looking to internal platform teams to deliver an awesome developer experience while ensuring reliability, scalability, and security, by unlocking the path to production for modern apps and helping their products soar!
Chaos Engineering, where engineers intentionally inject failure to test the reliability of their systems, is becoming a regular practice for companies who value uptime and availability. As cloud-based systems have grown more complex, Chaos Engineering has become a critical part of the software testing and release process to uncover surprise dependencies, fix problems before they become 3am outages, and bake reliability into every feature.
At incident.io we use gorm.io as the ORM library for our Postgres database. It’s a really powerful tool and one I’m very glad for after years of working with hand-rolled SQL in Go & Postgres apps. You may have seen from our other blog posts that we’re heavily invested in tracing, specifically with Google Cloud Tracing via OpenCensus libraries.
When you think of TDD, you might lean towards Test-Driven Development. In Tomasz Manugiewicz’s ACE 2022 talk, though, the ‘T’ in TDD could also mean Trust, i.e. Trust-Driven Development. The talk boils down to this: if there is trust, there is autonomy. If there is autonomy, creativity flourishes. Building trust is done incrementally; incremental success builds success. Software engineering is a team sport and an exercise in iteration.
The release of VMware Tanzu GemFire 9.15 introduces compatibility with the VMware Tanzu GemFire for Redis Apps add-on. This add-on enables compatibility between Redis applications and Tanzu GemFire for the first time ever, unlocking enterprise-ready features for your Redis applications.
At Platform.sh, we are committed to making your site perform as well as possible. As part of this commitment, we need to smooth out system load spikes as much as possible, especially when many crons are triggered at the same time on a particular Grid region. To do so, we are increasing the default cron jitter from five minutes to 20 minutes.
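To illustrate the general idea of cron jitter, the sketch below delays each job by a random amount up to a maximum, so jobs scheduled for the same minute spread out over a window. This is an assumption-laden illustration, not Platform.sh's implementation: the function name `jittered_start`, the uniform distribution, and the optional `rng` parameter are all hypothetical.

```python
import random
from datetime import datetime, timedelta
from typing import Optional


def jittered_start(
    scheduled: datetime,
    max_jitter_minutes: int = 20,
    rng: Optional[random.Random] = None,
) -> datetime:
    """Return the scheduled time plus a random delay of up to max_jitter_minutes."""
    rng = rng or random.Random()
    # Pick a delay uniformly between 0 and the jitter window, in seconds.
    delay_seconds = rng.uniform(0, max_jitter_minutes * 60)
    return scheduled + timedelta(seconds=delay_seconds)


if __name__ == "__main__":
    midnight = datetime(2022, 8, 1, 0, 0)
    # Ten jobs all scheduled for midnight now start at different moments.
    for job in range(10):
        print(f"job {job}: {jittered_start(midnight):%H:%M:%S}")
```

A wider window lowers the peak load at the scheduled minute, at the cost of jobs starting less precisely on time.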
Configuring applications, services, and environments by modifying plain text files is a standard part of modern software development. Configuration as Code (CaC) takes this one step further by systematically generating, storing, and managing configuration files. CaC allows development teams to automate config management for their applications and environments while ensuring consistency and traceability throughout the development life cycle.
When building serverless applications on AWS Lambda, Amazon CloudWatch provides out-of-the-box metrics that measure the performance, errors, and duration of your functions. Although these standard Lambda metrics provide visibility into your serverless applications, it can also be invaluable to monitor custom metrics that are unique to your use case and application.
VMware Tanzu Mission Control users can now drive clusters via GitOps. This new feature of Tanzu Mission Control is built on Flux CD and enables users to attach a git repository to a cluster and sync YAML artifacts (using Kustomize) from the repository to the cluster. This feature provides a method for managing cluster configurations with Tanzu Mission Control via continuous delivery from a git repository.
Ciara details how and when to generate an SBoM with the help of open-source tooling. Learn how to host SBoMs, as well as other SBoM considerations.
Here at Kublr we always emphasize how important it is to understand the foundations of Kubernetes (K8s) and its operations tools so you can more efficiently manage your applications and simplify your cloud-native development workflow. Understanding these components on the front end is equally important as we begin our build processes, especially when building with an everything-as-code approach.
Modern application deployments rely heavily on containerization for its scalability, availability and ease of maintenance. Legacy applications implemented before the containerization era often use monolithic, hardware-centric architectures that are difficult to scale and manage. These legacy applications may have multiple services bundled into the same deployment unit without a logical grouping.
The first continuous integration (CI) tools were all self-hosted, meaning they ran on a developer’s local computer or server. Although this setup was viewed favorably by dev teams at the time, it has limited flexibility, and developers had to spend time maintaining the infrastructure.
Continuous integration (CI) / continuous delivery (CD) is a model that allows software development teams to automate the integration and delivery of code changes in a more frequent and reliable manner. This gives development teams more time to improve the quality of their code and test with greater depth, and it leads to more customer deployments overall.
Service level objectives (SLOs) state your team’s goals for maintaining the reliability of your services. Adopting SLOs is an SRE best practice because it can help you ensure that your services perform well and consistently deliver value to users. But to gain the greatest benefit from your SLOs, you need ongoing visibility into how well your services are performing relative to your objectives.
A few weeks ago we had a major incident. We were releasing our Practical Guide to Incident Management, and after posting about it online an incident.io employee noticed that the page wasn’t loading. Just to set the scene, I’ve been at incident.io for 3 months and don’t have any experience of incidents in my previous role. When the team got paged I expected this to be one of those “follow along and learn how the wizards work their magic” exercises.
This post is the third in a series of deeper dive articles discussing DORA metrics. In previous articles, we looked at: The third metric we’ll examine, Change Failure Rate, is a lagging indicator that helps teams and organizations understand the quality of software that has been shipped, providing guidance on what the team can do to improve in the future.
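The excerpt does not spell out the formula, so the following is a sketch under the commonly used definition: Change Failure Rate is the share of production deployments that resulted in a failure, expressed as a percentage. The function name `change_failure_rate` and the example counts are illustrative assumptions.

```python
def change_failure_rate(total_deployments: int, failed_deployments: int) -> float:
    """Return the percentage of deployments that caused a failure in production."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total_deployments")
    return failed_deployments * 100 / total_deployments


if __name__ == "__main__":
    # Hypothetical month: 40 deployments, 6 of which needed a rollback or hotfix.
    print(f"Change Failure Rate: {change_failure_rate(40, 6):.1f}%")
```

Because the metric is computed after changes have shipped, it describes past quality, which is what makes it a lagging indicator.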
Nate Lee here, and I’m one of the founders of Speedscale. The founding team’s worked at several observability and testing companies like New Relic, Observe Inc, and iTKO over the last decade. Speedscale traffic replay was born out of a frustration from reacting to problems (even if they were minor) that could have been prevented with better testing.
Are you new to OKRs, or perhaps you’ve used them before but are looking for tips and tricks to make them work better for you? Here is a practical how-to guide developed by a VMware Tanzu Labs product manager who has worked with dozens of teams from the Fortune 500 to adopt the OKR framework.
For more information, read this tutorial: https://circleci.com/blog/intro-to-software-testing-life-cycle/
Does technology fascinate you? Are you curious about and interested in learning about different software, hardware and devices? If you answered yes to these questions, you should become a system administrator, aka sysadmin. A sysadmin is responsible for monitoring and maintaining computer systems in a network or environment that has multiple users. It’s a great time to become a sysadmin now because the technology sector is booming and yet it’s facing a major skills shortage.
Multi-cloud strategy – the use of multiple private or public clouds – is increasingly becoming the main method companies use to deploy their IT infrastructure. In the next three years, an estimated 64% of companies will rely on multi-cloud as their main deployment model. Despite the complexities that come from operationalizing it, as we discussed in The Challenges of Building Multi Cloud, the multiple benefits that come from this deployment model can often make it worth the effort.
Today, we’re excited to announce the release of Cycle’s new pricing model! With this new model, we aim to make our pricing far more straightforward and better suited for larger deployments and customers. While our current pricing model solved the needs of our customers for the last few years, we’ve learned enough that it’s now time to make a change. Before talking about the new model, let’s dive into how we got here.
In June, we hosted our online meetup with ContainIQ surrounding k8s monitoring and observability. You can catch up on the discussion between Matthew Lenhard (Co-founder & CTO of ContainIQ) and Kai Hoffman (Developer Advocate at Civo) here if you missed it. Meanwhile, Kamesh Sampath from our Developer Advocate demo program explains how Civo’s speed and developer experience are great to work with in our latest Civo Shorts.
At Platform.sh, we are committed to making your deployment experience as fast and seamless as possible, so that you can continue pushing changes as much as you need, and keep your customers happy. As part of this commitment, we are releasing three new infrastructure improvements, which will greatly improve caching strategy and significantly reduce downtime during deployments.
Improving team health within DevOps is vital for success in any engineering team. In this article, we’ll look at some of the ways that you can improve team health with Reliably so you can keep your developers happier, healthier and free from burnout.
Azure Functions is an on-demand serverless compute offering built on top of Azure App Service that enables you to deploy event-driven code without the need to provision and manage infrastructure. Because applications rely on Azure Functions to handle business-critical tasks such as processing orders or logging in users, it’s important to ensure that your functions respond quickly when they’re invoked.
Kubernetes offers a way to store configuration files and manage them via a ConfigMap. Functionally, ConfigMaps seem very similar to Kubernetes Secrets: both constructs are used to store information that can be used in a Pod. This information could be usernames and passwords of a connection string to a database.
If you’ve ever experienced downtime during sudden traffic spikes, then have no fear: Auto-scaling is here. That’s right. As of today, all Dedicated plans on Platform.sh that have subscribed to the Observability Suite package now benefit from auto-scaling out-of-the-box.
Imagine you want to build and deploy a Nuxt3 app on Netlify. Because custom scripts are not allowed on Netlify, you will not be able to perform custom tasks like automated testing before deploying the website to your Jamstack hosting platform. That is where continuous integration/continuous deployment comes in. With a CI/CD system, you can run the kind of automated tests that create successful deployments.
Because DevOps practices can bring great speed and reliability to the software delivery lifecycle, release management can seem daunting. But, the improved visibility and collaboration brought about by DevOps can also help with the release management process. DevOps-centric release management is the future of software development and IT operations.
Since the evolution of the IT industry, different concepts have been introduced to enhance and speed application production. Automating processes is gradually becoming the way forward and, so far, the best way to speed the deployment process of projects. Today, though, NoOps has come along. The prevalence of NoOps means manual intervention may not be needed in IT operations, but is this going to mean the extinction of DevOps? Turns out, NoOps might just be a next step in the progression of DevOps.