April 2021

Guide to using Docker for your CI/CD pipelines

Docker is a platform for developers and sysadmins to develop, deploy, and run applications using containers. Docker is also referred to as an application packaging tool: an application can be configured and packaged into a Docker image, which is then used to spawn Docker containers that run instances of the application. It provides many benefits including runtime environment isolation, consistency via code, and portability.

How to build a CI/CD pipeline with Docker

I talk with many of my fellow engineers at conferences and other events throughout the year. One thing I like demonstrating is how they can implement a continuous integration/continuous deployment (CI/CD) pipeline into a codebase with very little effort. In this post I will walk through some demo code and the CircleCI config that I use in the demonstration. Following these steps will show you how to implement CI/CD pipelines into your code base.

Five Reasons To Choose Dell and Robin Cloud Native Platform For AI/ML (Blog series - Part 3 of 3)

In part 1 and part 2 of this series, we examined how AI/ML can help improve healthcare and the challenges faced by AI/ML teams in realizing the benefits respectively. In this part, we will explore how Robin and Dell can help overcome these challenges.

Un-Excuse-ing Upgrades

When we talk about upgrades here at SolarWinds, we spend a lot of time discussing the beneficial features, performance, and capabilities you can gain. That’s not by accident. The honest-to-goodness truth is, the most compelling reason to upgrade ANYTHING—from our phone to our game console to our monitoring software—is because we’ll be able to do something both new and useful to us.

Interlink Software and AppDynamics deliver unified, data-driven Service Visualization and faster fault resolution.

We are delighted to share news of our partnership with leading, real-time Application Performance Monitoring (APM) vendor Cisco AppDynamics and are now a fully-fledged member of their Integration Partner Program (IPP). For our mutual enterprise customers, service-affecting issues can lie undetected in the vast volumes of data generated by the multiple, disconnected tools used to monitor their multi-cloud environments, applications and technical solutions.

Announcing support for the AWS managed Lambda Layer for OpenTelemetry

Datadog’s support of OpenTelemetry—a vendor-agnostic, open source set of APIs and libraries for collecting system and application telemetry data—has helped thousands of organizations implement monitoring strategies that complement their existing workflows. Many of our customers leverage OpenTelemetry for their server- and container-based deployments, but also need visibility into the health and performance of their serverless applications running on AWS Lambda.

Monitoring for Success: What All SREs Need to Know

The last ten years have seen a massive change in how IT operations and development enable business success. From virtualization and cloud computing to continuous delivery, continuous integration, and rapid application development, IT has never been more complex or more critical to creating competitive advantage. To support increasingly Web-Scale IT operations and wide-scale cloud adoption, applications now operate as services.

Seamless Cloud account management - The Future of Qovery - Week #8

During the next two weeks, our team will work to improve the overall experience of Qovery. We gathered all your feedback (thank you to our wonderful community 🙏), and we decided to make significant changes to make Qovery a better place to deploy and manage your apps. This series will reveal all the changes and features you will get in the next major release of Qovery. Let's go!

A Guide to AWS Certifications

If you’re interested in cloud computing, AWS certifications are one of the most rewarding paths to a dynamic career. As a worldwide leader in cloud infrastructure service, Amazon prepares certified experts who are highly sought after by IT organizations around the world. Did you know that 94% of organizations use a cloud service and 30% of their IT budgets are allocated to cloud computing?

7 Ways SRE Is Changing IT Ops And How To Prepare For Those Changes

SRE best practices are disrupting and catalyzing change in the ways organizations approach IT Operations. In this blog we look at 7 ways SRE is driving this transition. Site Reliability Engineering is a new practice that has been growing in popularity among many businesses. Also known as SRE, the practice puts a premium on monitoring, tracking bugs, and creating systems and automations that solve problems in the long term.

Accelerating DevOps Using Cloud Native Technologies With AWS, Docker & JFrog

In this webinar we help you gain a deeper understanding of the benefits of migrating and modernizing applications from a monolithic architecture to microservices, in order to accelerate DevOps processes. We outline the efforts required to reach this stage of sophistication in application development and deployment.

How NOT to take a side project to startup - Dev Matters S1E03

What side projects make terrible startups? What should you avoid when trying to make the transition? In this episode of Dev Matters, Don and his guest Dylan Etkin discuss side projects and lessons learned taking Sleuth from a side project to a startup. This episode was recorded in front of a live studio audience on Twitch.

Extinguishing our performance fires and rebuilding for the future

I stepped into the role of Head of Engineering for Bitbucket Cloud in late 2020, having served as one of the team's senior engineering managers for several years. It is an honor and a privilege to lead this team, and I couldn't be prouder of the hard work we've done and continue to do each day to make Bitbucket a world-class product empowering teams to build, test, and deploy software to millions of people around the world. It has been an eventful journey, and the past few weeks are no exception.

Fireside Chat with Jesse Robbins and Kolton Andrus Failover Conf 2021

Long before Chaos Engineering was even a phrase, Jesse Robbins was Amazon.com's "Master of Disaster" using intentional failure to help the company become more reliable. Kolton Andrus (CEO at Gremlin), sits down with Jesse to learn more about his early work with GameDays, the evolution of reliability, and where the future of SRE lies.

Fireside Chat with Ines Sombra and Ana Medina Failover Conf 2021

Reliability is a requirement for the modern internet. Ana Medina joins Inés Sombra, Sr. Director of Engineering at Fastly, to discuss their approach to resilience, how the past year has influenced the way they work, and what practices your engineering organization can adopt to become more reliable.

JFrog Expands APAC Presence To Support Growing DevOps Adoption

At JFrog, we’ve seen DevOps and DevSecOps adoption growing robustly in Asia-Pacific (APAC), as the region’s large enterprises recognize the competitive advantage and importance of DevOps and digital transformation. In fact, by 2025, up to 25% of Asia’s 500 largest companies will become software producers to digitally transform and maintain their A500 status, IDC predicts.

Q&A from the Moogsoft/Datadog Fireside Chat

On April 15th Moogsoft’s VP Marketing, John Haley, welcomed Datadog Product Manager, Alex Vetras, along with DevOps Institute Chief Ambassador, Helen Beal, and Moogsoft’s CTO, Dave Casper, for an informal roundtable exploring how users can now see rich-context incidents from across the full stack in minutes, and the opportunities this presents to organizations.

Atlassian Open DevOps Overview Video

Software and DevOps teams have everything they need to develop, ship, and operate software in Atlassian Open DevOps, a development experience built on Jira Software. Open DevOps starts with Jira Software, Confluence, Bitbucket, and Opsgenie. Teams can easily add the tools they want, such as GitHub or GitLab, with a single click. In this video, get an overview of Open DevOps and how it can supercharge your development.

Building CI/CD pipelines using dynamic config

Creating robust, manageable, and reusable functionality is a big part of my job as a CI/CD engineer. Recently, I wrote about managing reusable pipeline configuration by adopting and implementing pipeline variables within pipeline configuration files. As I showed in that tutorial, pipeline variables and orbs have added some flexibility to this process, but they are still a bit limited.

Introducing dynamic config via setup workflows

With the new release of dynamic config via setup workflows, CircleCI customers can now use jobs and workflows, not only to execute work but to determine the work they want to run. We built dynamic config because we know our users want more dynamism in the CircleCI build process. Historically, our platform has been very deterministic: the config is pre-set in a file based on the revision for a given pipeline.

Leaving the Nest: Guidelines, guardrails, and human error by Laura Santamaria Failover Conf 2021

When we talk about reliable systems, we talk a lot about human error. Human error in an incident or a bug report is often treated with a bit of a facepalm reaction. The term masks a lot of scenarios from accidents to exhaustion to everything in between. However, human error helps us understand where our processes failed and how we can prevent the same error from happening again. In short, we need to think in terms of a framework of guidelines and guardrails. In this short talk, let’s discuss how guidelines like runbooks and guardrails like automation can help us address the fact that everyone will, at some point, make mistakes.

Implementing DevSecOps in the DoD by Nicolas Chaillan Failover Conf 2021

Delivering software quickly and securely is important for every organization, but it's even more important at the US Department of Defense (DoD), where reliability directly impacts national security. Nicolas Chaillan (Chief Software Officer, US Air Force) will discuss the DoD Enterprise DevSecOps Initiative, an initiative he leads along with the DoD’s Chief Information Officer that brings automated software tools, services and standards to DoD programs. He'll also discuss Platform One, the Air Force's DoD-wide DevSecOps Enterprise Level Service that provides managed IT services capabilities, on-boarding, support, and baked-in zero trust security. This insight from operating at the most rigorous level will help you level up your own organization.

Pragmatic Incident Response: Lessons learned from failures by Robert Ross Failover Conf 2021

Incident response is overwhelming. So where do you start? There's a lot of advice out there, but it's mostly theories that aren't taking reality into account. So how do you get a process in place that actually works and scales? In this session, FireHydrant CEO and Co-Founder, Robert Ross, will share quick stories from his experience as an SRE and what tips he’s learned along the way.

What's Next for DevOps by Emily Freeman Failover Conf 2021

For over a decade, the DevOps movement has been using cultural change to power technological transformation and help companies deliver better products faster and more reliably. While many organizations have embraced this change and reaped the benefits, it hasn't come without challenges and many more remain. In this session, Emily Freeman (author of DevOps for Dummies) shares what's next for DevOps and how it will impact your organization.

The Evolution of Observability and Monitoring panel discussion Failover Conf 2021

Observability and monitoring are critical to detecting and troubleshooting problems to build more reliable applications. As our systems become increasingly complex, our tools for getting this crucial visibility and the way we respond need to evolve too. We'll sit down with SRE leaders to discuss the processes they use to get the most insight into their applications, how they've increased the speed of detection and response, and what organizations need to do to stay on top of growing complexity.

The Evolution of Teams & Culture panel discussion Failover Conf 2021

The most successful organizations are the ones that embrace change and use it to become stronger and more resilient. In this panel discussion, we'll talk with engineering leaders about how they adapted to the challenges of 2020, what successes (and failures) they've seen, and where the future of reliable engineering teams is headed.

Atlassian Open DevOps and Codefresh

Codefresh is excited to partner with Atlassian on their new Open DevOps launch. Codefresh is offering native support for connecting the two platforms, giving our mutual customers better visibility into deployments and the features included in each deployment. At the heart of this integration is the Codefresh App, which can be found on the Atlassian Marketplace. Simply define Codefresh as the CI/CD partner that will connect to Atlassian’s DevOps API.

Integrate security into development with Snyk, now a seamless part of Bitbucket Cloud

While PR analysis within Code Insights and Snyk Pipes are available to use right now, we're rolling out a native Security tab in Bitbucket Cloud. This will be a gradual rollout through the month of May so watch out for it in the left nav. We look forward to your feedback. Even small vulnerabilities can cost a team a lot. All too frequently we see news reports of organisations that mishandled their code & build level security, causing customer data to be exposed.

Four Key Challenges To Adopting AI/ML In Healthcare (Blog series - Part 2 of 3)

In part 1 of this series, we examined how AI/ML can help improve healthcare. AI/ML is an ambitious undertaking that promises to revolutionize healthcare. Getting excited is easy, but where do you start and why is it not just another empty promise? In fact, despite all these promises and futures, most AI/ML projects fail and don’t deliver. The failure rate of AI/ML projects is starting to make some wonder if this is real or hype.

JFrog and PagerDuty Extend Ecosystem Integration

JFrog and PagerDuty have deepened their technology integration to further boost IT operators’ and developers’ visibility into the software development lifecycle and accelerate incident resolution. The latest integration, which involves the JFrog Pipelines DevOps pipeline automation solution, simplifies and streamlines how to identify faulty builds that impact production environments.

Monitor cloud endpoint health with Datadog's cloud service autodetection

Your modern cloud-hosted applications rely on a number of key components—such as databases and load balancers—that are managed by the cloud provider. While these cloud resources can reduce the overhead of maintaining your own infrastructure, capturing and contextualizing monitoring data from services you don’t own can be difficult.

What's Changed in VMware vSphere 7 Update 2: All You Need to Know

VMware has recently released vSphere 7 Update 2, and there is a lot of new stuff to look out for. vSphere, VMware’s server virtualization product, has been an industry favorite for a long time. vSphere 7 came out in April 2020, and this is the second update to it so far, hence the name. When you look at the changes they’ve rolled out, you’ll see that they are focusing on a few key areas. As a result, VMware infrastructure is getting pretty solid and modern.

Announcing Services Discovery for tracking and improving service reliability

Gremlin helps teams proactively improve the reliability of their systems by running chaos experiments on infrastructure including hosts, containers, and Kubernetes clusters. But as microservice-based architectures and automated cloud platforms become the norm, engineers are shifting their focus from managing infrastructure to managing services. In order to keep these services as resilient as possible, they need tools that can help them find failure modes, reduce incidents, and improve availability.

How to deploy an application on Friday

No one likes giving their weekends up to fix release issues. Developers and operations teams are traditionally hesitant to make changes or deploy applications on a Friday, in case something goes wrong and they have to spend their weekend making emergency fixes. Or worse, trying to roll back changes that were made. However, with a strong set of practices and a reliable deployment pipeline, there should be no reason why a deployment cannot happen anytime — even on a Friday afternoon.

Chaos Engineering in 60 seconds - Attack a service

Learn how to run a chaos experiment on a distributed service using Services Discovery in Gremlin. Gremlin is the enterprise Chaos Engineering platform on a mission to help build a more reliable internet. Their solutions turn failure into resilience by offering engineers a fully hosted SaaS platform to safely experiment on complex systems, in order to identify weaknesses before they impact customers and cause revenue loss.

Announcing Services Discovery for tracking and improving service reliability

Gremlin announces Services Discovery for tracking and improving the reliability of distributed services. Gremlin is the enterprise Chaos Engineering platform on a mission to help build a more reliable internet. Their solutions turn failure into resilience by offering engineers a fully hosted SaaS platform to safely experiment on complex systems, in order to identify weaknesses before they impact customers and cause revenue loss.

GitOps Use Cases You May Not Have Considered

GitOps is growing in popularity. You’ve probably seen it mentioned on Reddit or dev.to. But what the heck is GitOps? Broadly speaking, GitOps takes the principles of Git and CI-powered workflows favored by software developers — commonly used to automate the process of building, testing and deploying software — and applies them to other business processes.

Test Azure Service Bus Performance by Generating a Million Test Messages

If you use Azure Service Bus namespaces, you often need to verify Azure Service Bus performance by generating test messages against your Service Bus resources to exercise your system integration. You might need this in QA or development for performance testing, load testing, and so on. This blog explains how to simulate the test environment using Serverless360 to check Azure Service Bus performance and throughput.

Introduction to cron job monitoring with Healthchecks

Software teams use cron jobs to handle many important tasks like database backups and maintenance scripts. Cron jobs make sure that your applications are behaving as they should, but cron job failures are often silent and go unnoticed until the problem becomes worse. In this guide, we will learn how to stay aware of cron job failures by using Healthchecks.

Improve Your CMDB for Business Outcomes with Application Dependency Mapping

A configuration management database (CMDB) is a centralized repository that stores information about all the significant entities in your IT environment. These can include your hardware, installed software applications, documents, business services, and even the people who are part of your IT system. The CMDB is designed to help you maintain and support the interrelationships between the configuration items (CIs) within a vast IT structure.

Integrating a Cloudsmith repository with a Harness CD pipeline

In this blog, we will walk through the process of configuring a private Cloudsmith repository as an artifact source for a Harness Continuous Deployment pipeline. Harness is a Continuous Deployment platform that allows you to easily automate the deployment of your software to your infrastructure and environments.

What is Site Reliability Engineering [Simple Intro to SRE]

Wondering what SRE is all about? We will explain what it is, how it works, why it was developed, and how it can help your organization. So what is SRE (Site Reliability Engineering)? SRE is a methodology that fuses software and operations teams, with the goal of producing reliable, resilient, and scalable systems. Site Reliability Engineering (SRE) was developed by Google engineer Ben Treynor Sloss in 2003. Google’s goal was to increase the reliability of its sites and services.

Why VPS Plans are Cheaper than Shared Hosting

You might be looking around for a new web host, or trying to get a better deal. You notice VPS plans are incredibly cheap - as low as $3-5 per month, while the cheapest shared hosting is around $10 per month. You wonder, "How is this possible?!" - especially when people recommend moving to a VPS once a site becomes popular. Let's discuss what these are first, before going into why cheaper doesn't necessarily mean better for you and your business.

How database DevOps can enable the evolving insurance landscape

In 2020, Deloitte reported on The four trends that define insurance and showed that the future of the insurance marketplace is going to be significantly different. Life and Property and Casualty insurers, for example, estimated that 93% of their volume already came from propositions that were not offered five years ago. New propositions were expected to keep on rising, with nearly a quarter of investment spend in insurance allocated to new product development.

Autoscaling with the HAProxy Kubernetes Ingress Controller and KEDA

One of the greatest strengths of containers is the ability to spin more of them up quickly. As the volume of traffic to your application increases, you can create more application containers on the fly to handle it, in almost no time at all. Kubernetes ships with autoscaling baked in, giving you the power to scale out when the system detects an increase in traffic—automatically!

OpenStack CentOS alternatives: 7 reasons to migrate to Ubuntu

Looking for OpenStack CentOS alternatives after recent changes in the CentOS project? Think Ubuntu – the most popular Linux distribution for OpenStack deployments, after CentOS, across development and production environments. Wondering what makes Ubuntu different? Here are seven reasons you should consider Ubuntu when planning your CentOS migration.

5 Digital Transformation Mistakes Infrastructure Leaders Make

Senior IT leaders, motivated by both the changing nature of our economy and more recently, the COVID pandemic, have decisively shifted their focus toward applications. The industry catchphrase for this shift, digital transformation, makes clear its dual nature: directed toward the digital future while at the same time acknowledging that the existing environment must be modernized—in other words, transformed. Tasked with enabling this new breed of applications are operations groups.

How Artificial Intelligence Enhances Customer Service Management?

Throughout the world, the business and service sectors rely on the best customer service management practices to ensure customer retention and boost customer sentiment, increasing the profitability and branding of their business. But amid emerging technological advances, Artificial Intelligence (AI) is turning the tables for businesses in the game of winning customer trust and loyalty.

Featured Post

How should start-ups court software talent?

There is a game of 'speed dating' going on between technology businesses and the software engineering talent that brings amazing solutions to market. In recent years big tech companies, expanding aggressively in Europe, have competed ferociously with locally headquartered tech startups for the best software engineers. These engineers are in short supply. A government-supported Tech Nation report disclosed that 10 per cent of all UK job vacancies were in tech. The report suggested that at current growth there could be 100,000 job openings per month before the end of June this year.

Five Ways Containerized eSBCs Optimize Teams, Zoom & Other Cloud Communications Deployments

Enterprises are using Unified Communications as a Service (UCaaS) solutions like Microsoft Teams and Zoom, and Contact Center as a Service Solutions (CCaaS) like Five9 and Genesys to improve communications, simplify operations, and accelerate IT agility. As the COVID-19 outbreak clearly demonstrated, UCaaS and CCaaS solutions are ideal for delivering enterprise communications services to remote workers, mobile users, and small/home offices.

4 Characteristics of Monitoring Essential to Implementing DevOps

In the new world of rapid releases, continuous change, and increasingly high user expectations, more organizations are embracing DevOps. One of the primary drivers for adopting DevOps is speed — particularly the reduction of risk at speed. As DevOps seeks to reduce risk and deliver insight at an increasingly faster pace, new tools have emerged in the monitoring space. But these tools alone will not deliver us into the low-risk world of DevOps — not without new and updated thinking.

What Comes After Kubernetes?

You probably can’t believe I’m asking that question. It’s like showing up to a party and immediately asking about the afterparty. Is it really time to look for the exit? No…but yes. We used to deploy apps on systems in data centers. Then we moved the systems to the cloud. Then we moved the apps to containers. Then we wrapped it all in Kubernetes for orchestration, and here we are. Each advance in technology unlocks doors we couldn’t reach before.

Software Engineering Daily Podcast

Large portions of software development budgets are dedicated to testing code. A new component may take weeks to thoroughly test, and even then mistakes happen. If you consider software defects as security issues, then the concern goes well beyond an application temporarily crashing. Even minor bugs, though, can cost companies a lot of time: locating the bug, resolving it, retesting it in lower environments, then deploying the fix back to production.

CI/CD Pipeline Security 101

In our previous post, we discussed the recent security incident at Codecov and the following investigation at Mattermost. As a follow-up to that we wanted to share some of the basic design principles as well as a handful of more technical tips and tricks around CI/CD pipeline security that helped Mattermost come out of the incident unscathed.

Using Dokku On DigitalOcean

Dokku can be a cost-effective, convenient way to deploy apps to DigitalOcean. SolarWinds® Papertrail™ can make monitoring the logs of those apps simple and frustration-free. Combine these two technologies and you have an effective deployment process and log management system. Let’s look at Dokku first. Dokku is an open-source platform-as-a-service (PaaS). If you’re familiar with Heroku, you can consider Dokku a private Heroku that you manage.

A DBA's Habits for Success: CMMI (Part One)

Finding the perfect flow for your business can take time and patience. Like almost anything else in life, a business must go through stages of maturity before it reaches its final form and only requires regular maintenance. In this five-part series, we’ll dive into Capability Maturity Model Integration (CMMI) and what phases businesses and their DBAs must go through to successfully manage IT as a business.

New Gartner AIOps Platform Market Guide Shows More Use Cases for Ops and Dev Teams

Gartner jumps right into it, describing a reorientation of a tool that has previously focused on IT service management and automation. AIOps is now also enabling a variety of new observability use cases for DevOps and Site Reliability Engineering (SRE) teams. This blog presents the guide’s major findings and a link so you can read the report for more details.

FireHydrant April 2021 Product Updates: Incident Tags & Customizable Slack Incident Modals

We're excited to announce the release of two new features this month: customizable Slack incident modals and Incident Tags. Keep reading to learn more about how they can help your teams manage incidents better!

Fintech AI/ML on Ubuntu

The financial services (FS) industry is going through a period of change and disruption. Technology innovation has provided the means for financial institutions to reimagine the way in which they operate and interact with their customers, employees and the wider ecosystem. One significant area of development is the utilisation of artificial intelligence (AI) and machine learning (ML) which has the potential to positively transform the FS sector.

What is KFServing?

TL;DR: KFServing is a novel cloud-native multi-framework model serving tool for serverless inference. KFServing was born as part of the Kubeflow project, a joint effort between AI/ML industry leaders to standardize machine learning operations on top of Kubernetes. It aims at solving the difficulties of model deployment to production through the “model as data” approach, i.e. providing an API for inference requests.

Can I Send an Alert to Discord?

This is a great question. The answer is yes. You can send Graylog alerts via email, text, or Slack, and now Discord. Yes, Discord! Discord has grown from a platform used mostly by gamers into a communication platform for businesses. Game developers, publishers, journalists, and community and event organizers all use Discord. Discord lets game developers work in teams with each other on their projects.

Cloud Migration Strategy [Guide]

Migrating to cloud infrastructure is one of the most critical requirements for modern enterprises to ensure long-term sustainability. Initially, there was a general apprehension about adopting cloud technology and developing a cloud migration strategy; certain aspects like security and resilience were concerns. But today, with increasing technological advancements and familiarity, there is not an iota of doubt that the advantages of adopting cloud technology have far outnumbered the few drawbacks.

The easiest way to deploy your database - The Future of Qovery - Week #7

During the next four weeks, our team will work to improve the overall experience of Qovery. We gathered all your feedback (thank you to our wonderful community 🙏), and we decided to make significant changes to make Qovery a better place to deploy and manage your apps. This series will reveal all the changes and features you will get in the next major release of Qovery. Let's go!

10 Benefits Of Virtualization In The Data Center

Are you looking for ways to improve your data center performance and resource utilization? Consider employing virtualization. Virtualization offers a cost-effective solution to satisfy the growing need for storage capacities and IT support required by most organizations. It is a process that allows you to scale up your physical resources to meet your increasing demands. You can virtualize physical servers, networking, storage, and other infrastructure components to enhance your data center operations.

April Online Meetup - Hypper: Dependency-aware package management for Kubernetes

Introducing Hypper, a new package manager for Kubernetes designed with cluster administrators in mind. Hypper is built on Helm and charts but makes some different assumptions around multi-tenancy and dependent charts (which can be useful with CRD handling). Where Helm assumes a user could be one of many users in a multi-tenant cluster, Hypper assumes the user is a cluster administrator managing a cluster.

Comparison: Snyk, Aqua Security, Sysdig

Security testing tools help us to monitor our cloud-native resources for potential vulnerabilities throughout our development lifecycle. By flagging security vulnerabilities early, our teams can react on time to reduce potential damage to our end-users and our business. This post will compare three different security scanning tools that focus on cloud-native infrastructure and application security, namely Snyk, Aqua Security, and Sysdig.

Announcing role based access control for API keys for more control over automation

Today, Gremlin is excited to announce the ability to create an API key that can perform actions with the same set of permissions as your user account. This allows you to automate Gremlin tasks safely and securely.

Collision 2021 - Securing Software Pipelines with Continuous Packaging

Building automation and security into software supply chains requires packaging source code, dependencies and containers into logical, versioned units. But, in 2021, how engineers package their software is more vital than ever, requiring a serious refresh with a name befitting its focus on security within the cloud. We call it Continuous Packaging. Now on-demand, watch our talk from Collision Conference 2021 where we explored Continuous Packaging and how it can help secure your delivery pipelines, from development through to deployment.

Continuous Monitoring: What Is It and How Is it Impacting DevOps Today?

Continuous monitoring (CM), also referred to as continuous control monitoring (CCM), is an automated process that allows DevOps teams to detect compliance and security threats in their software development lifecycle and infrastructure. Traditionally, businesses have relied on periodic manual or computer-assisted assessments to provide snapshots of the overall health of their IT environment.

[Webinar] Observability and Resilience in Microservice Environments with Komodor & Epsagon

Kubernetes has made it easier to manage and scale microservices. However, keeping track of so many moving parts is often challenging for Dev & Ops teams. Achieving clear observability for better monitoring and troubleshooting is key to improving the development process. Part 2 of the webinar, which includes a talk by Komodor's CTO and co-founder, Itiel Shwartz, concluded with a quick demo of Komodor's troubleshooting platform and a Q&A session.

What Is SaaS Finance? (Plus 12 Metrics You Should Be Monitoring)

One of the most important things SaaS companies need to think about in order to be successful is financial modeling. To succeed in the increasingly competitive SaaS space, finance teams need to carefully consider a wide variety of KPIs and find ways to effectively manage both present and future cash flows. Cash flows and liquidity are two of the most common challenges faced in this industry — if effectively addressed, your company can position itself for long-term success.

Monitoring in a Cloud-Native Era

The move to the cloud creates massive opportunities to deliver great applications and experiences to customers and employees, but it also comes with a new set of complexities. These new environments, powered by containers and microservices, among others, are dynamic and ever-changing. The old ways of monitoring don't apply anymore, but the need to ensure the reliability and performance of your applications is more important than ever.

Introducing all new Serverless360 in preview

Towards the end of 2016, it all started with developing a simple platform to manage Microsoft Azure Service Bus namespaces. The then classic Azure portal had limited capabilities to manage Azure Messaging resources like Service Bus Queues and Topics. Paolo Salvatori developed and managed a community tool called Service Bus Explorer. We identified that there are some challenges or limitations in managing and monitoring Azure Messaging resources using the above two.

The future of testing with Launchable

Do we really need to run all the tests every time we make a change to the source code or make a release? That could take minutes or even hours. Wouldn't it be better to run only the tests related to the changes we are making or the phase of the lifecycle of an application? Is the future of testing in AI and ML? Kohsuke Kawaguchi from Launchable might have the answers to those and quite a few other questions.

Ruby on Rails Development Setup for Beginners

Today we will install Ruby on Rails (RoR) on a Debian Linux operating system (Ubuntu 18.04 LTS). With that said, RoR is compatible with other operating systems with just a few tweaks. This blog will assist you in installing RoR with a simple step-by-step process. Your installation may differ; for other operating systems, refer to this site. I am new to developing and have been using Ubuntu 18.04 LTS, a flavor of Debian Linux, for my projects.

Trigger a Codefresh Pipeline from ArgoCD

Codefresh is an awesome platform for doing GitOps deployments to Kubernetes. Starting last year, the Codefresh team has been adding rich integrations with Argo CD and Argo Rollouts, GitOps observability dashboards, and more. Codefresh pipelines, in particular, have played an integral role in our customers’ progressive delivery workflows by allowing them to orchestrate all of the testing, analysis, and rollback activities that work in conjunction with Argo CD synchronization.

Full-stack monitoring for code-to-cloud visibility

Engineering teams are very used to talking about their tech stack as the technologies and tools used to build their application. Monitoring also has a stack, and full-stack monitoring is when you align each layer of your tech stack with a monitoring practice and weave a thread from every layer. True code-to-cloud visibility is only accomplished with full-stack monitoring, and necessary for long-term DevOps success.

The 7 Hues of DevOps

Purple teams. Blue, green, red, black, canary deploys. Golden signals and red metrics. There are oddly a lot of color adjectives used in DevOps terminology, and Dave and Chris cover them all in this episode. They talk about the range of deployment strategies for modern applications, the various types of metrics used to monitor them, and the different approaches to understanding how much visibility is good enough.

Cloud 66 Feature Highlight: Delete Protection

What is Delete Protection? This feature stops specific servers from being accidentally deleted from your account. Delete Protection helps developers to prevent applications from going down and gives more control over your server mix. This includes avoiding deletion of core servers when scaling down via the API, and safeguarding servers with intentionally unique configurations ("snowflakes").

The true cost of IT Ops, the added value of AIOps

Today’s IT landscape is complex, hybrid, and fast-moving, and the adoption of multi-cloud infrastructure, applications, and new digital transformation initiatives is accelerating. IT operations teams, playing a vital role in enabling the delivery of uninterrupted services and creating business value for enterprises, are finding they need to constantly grow their resources to manage all the moving pieces in their IT stack. This can get expensive … but how much are they spending?

A Day in the Life: James the IT Ops Guy Learns How to Connect All that Data

“Morning, mate,” I greeted Dinesh as he walked into the office. “Nice get up for the big day!” He was wearing a pressed shirt, rather than his usual hoodie. “Thought I’d make an effort, you know,” he grinned. We’d been planning intensely for this moment for the last week or so – our meeting with Charlie, the CIO, to present the results of our Moogsoft experiments and ask for permission to extend the rollout across the enterprise.

Deploying Mattermost and Kubeflow on Kubernetes with Juju 2.9

Since 2009, Juju has been enabling administrators to seamlessly deploy, integrate and operate complex applications across multiple cloud platforms. Juju has evolved significantly over time, but a testament to its original design is the fact that the approach Juju takes to operating workloads hasn’t fundamentally changed; Juju still provides fine grained control over workloads by placing operators right next to applications on any platform.

How to monitor HashiCorp Vault with Datadog

In this series, we’ve introduced key HashiCorp Vault metrics and logs to watch, and looked at some ways to retrieve that information with built-in monitoring tools. Vault is made up of many moving parts, including the core, secrets engine, and audit devices. To get a full picture of Vault health and performance, it’s important to track all these components, along with the resources they consume from their underlying infrastructure.

Tools for HashiCorp Vault monitoring

In Part 1, we looked at the key metrics for monitoring the health and performance of your HashiCorp Vault deployment. We also discussed how Vault server and audit logs can give you additional context for troubleshooting issues ranging from losses in availability to policy misconfiguration. Now, we’ll show you how to access this data with tools that ship with Vault.

Connect Civo Kubernetes to Codefresh

Codefresh is a DevOps automation platform with Kubernetes and Docker native tools and features. You can create powerful pipelines and utilize the provided dashboards by connecting different Kubernetes clusters and registries to receive further insights into your deployments. Additionally, by enabling GitOps for your repositories you can reach the highest level of confidence in your Kubernetes deployments.

Five Use Cases for AI/ML in Healthcare (Blog series - Part 1 of 3)

Technology has accelerated changes toward information-based healthcare delivery and management. Today’s multi-disciplinary approach to delivering better healthcare outcomes coupled with advanced imaging and genetic-based customized treatment models depend on AI/ML driven information systems. At Robin.io, we believe machine learning is the life-saving technology that will transform healthcare. AI/ML challenges the traditional, reactive approach to healthcare.

JFrog Artifactory Terraform Provider Gains Xray Functionality

A few months ago, I was asked if I wanted to develop an open-source Terraform provider. Eleanor Saitta, principal at Systems Structure Ltd, had a client who was setting up JFrog Xray across their GitHub repositories but didn’t want to configure each repository by hand. As an SRE who enjoys working on projects that automate away those sorts of pain points (and someone who works extensively with Terraform during their day job), this sounded like an interesting project to work on.

Serverless 101: What Is It, How Serverless Computing Works, Pros and Cons, and More

Serverless computing is one of the fastest-growing areas in software development, along with hybrid clouds and using multi-cloud strategies. The global serverless architecture market is expected to grow at a CAGR of 16.2% in 2021 through 2026. It will reach $10.29 billion from $4.2 billion in 2020, according to MarketWatch. The serverless approach promises to help developers beat the cons of self-managed and even virtual server infrastructure. But is serverless right for you?
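The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The following is a minimal sketch, assuming the span from 2020 to 2026 counts as six compounding periods; the function names are illustrative and not taken from the source.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.16 means 16%)."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` once per year for `years` years."""
    return start_value * (1 + rate) ** years


# Figures quoted in the excerpt, in billions of US dollars.
# Assumption: 2020 -> 2026 is treated as six annual periods.
implied_rate = cagr(4.2, 10.29, 6)
projected = project(4.2, 0.162, 6)

print(f"implied CAGR: {implied_rate:.1%}")
print(f"projected 2026 market size: ${projected:.2f}B")
```

Under that assumption, both results land close to the quoted 16.2% and $10.29 billion, so the two figures are roughly consistent with each other.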

The future of testing with Launchable

In this video with Kohsuke Kawaguchi (KK) from Launchable and Viktor Farcic we talk about testing K8s applications. Do we really need to write all the tests every time we make a change to the source code or make a release? That could take minutes or even hours. Wouldn’t it be better to run only the tests related to the changes we are making or the phase of the lifecycle of an application? Is the future of testing in AI and ML?

Stackify + Netreo Creates a Dev + Ops Powerhouse

TLDR: Stackify is joining with Netreo to bring best-of-breed solutions for developers and IT operations. Together, their observability platform can help both small development teams and the world’s largest enterprises manage and monitor their applications and infrastructure. Stackify has been working for the last 9 years to help software developers monitor and debug their production applications.

Log Shipping Using Fluent Bit and vSphere with Tanzu

One of the new features that came with the latest update of vSphere with Tanzu was the ability to use TKG Extensions. This powerful framework allows simplified deployment and management of multiple open source projects that are backed by VMware support, including alerting with Prometheus, visualization with Grafana, ingress with Contour, and logging with Fluent Bit.

Power Your Consul Service Mesh with HAProxy

Many of you use HashiCorp Consul for service discovery. It makes connecting one backend application or service to another easy: Your Consul servers store a catalog of addresses to all of your services; when an application within the network wants to discover where a service is listening, it asks Consul, which gives it the address.

Resilience in Action E6: Oversize Coffee Mugs, SLOs, and ML with Todd Underwood

Resilience in Action is a podcast about all things resilience, from SRE to software engineering, to how it affects our personal lives, and more. Resilience in Action is hosted by Kurt Andersen. Kurt is a practitioner and an active thought leader in the SRE community. He speaks at major DevOps & SRE conferences and publishes his work through O'Reilly in quintessential SRE books such as Seeking SRE, What is SRE?, and 97 Things Every SRE Should Know.

Cloud Migration: A Race You Shouldn't Try to Win

The race to cloud migration can seem like a good idea for most businesses, but a common trend has been to get there as soon as possible. Much like with the story of the tortoise and the hare, there are lessons to be learned from moving too fast. When it comes to cloud migration, businesses would be smart to choose a more strategic route and take their time.

Building and running FIPS containers on Ubuntu

Whether running on the public cloud or a private cloud, the use of containers is ingrained in today’s devops oriented workflows. Having workloads set up to run under the mandated compliance requirements is thus necessary to fully exploit the potential of containers. This article focuses on how to build and run containers that comply with the US and Canada government FIPS140-2 data protection standard.

Environment variables and Secrets - The Future of Qovery - Week #6

During the next five weeks, our team will work to improve the overall experience of Qovery. We gathered all your feedback (thank you to our wonderful community 🙏), and we decided to make significant changes to make Qovery a better place to deploy and manage your apps. This series will reveal all the changes and features you will get in the next major release of Qovery. Let's go!

Maximizing VMware Performance and Memory Utilization

VMware is one of the top virtualization platforms, allowing you to create virtual machines and make the best use of your resources. One of the major focuses of virtualization solutions is to enable optimized use of resources like memory and computing power, but overcommitting your hypervisor towards greedy resource management can lead to severe degradation in the overall performance.

Top 20 DevOps Blogs to Keep Your Eye on in 2021

The process of developing software applications has evolved over the years. DevOps has taken over the world, and more and more companies are starting to adopt the DevOps culture in their process. DevOps is a practice of operations where engineering teams work together through the entire software development cycle. It includes functions from development to testing and deployment to production support. Because of that, DevOps has become popular in the industry.

How to monitor your cron jobs using Cronitor

Cron jobs handle a lot of background plumbing that keeps applications running smoothly. But cron job failures often go unnoticed and can be disastrous for your users and business. To make sure that you are aware of cron job issues, you should use a cron monitoring tool. In this post, we will see how to get started with Cronitor to monitor your cron jobs.

New SQL Monitor release gives organizations the opportunity to manage their on-premises and cloud databases from a single global dashboard

To help organizations explore and manage the advantages the cloud provides, the latest release of Redgate's popular database monitoring tool, SQL Monitor, now supports Amazon EC2 and RDS, and Azure SQL Database and Azure Managed Instances as well as on-premises SQL Server.

Puppet Releases Remediate 2.0

As we look to continue to provide value to our Remediate customers, we focused on how we create simple and effective workflows in the product. Our customers have told us there are some really important quality of life features that would go a long way in helping reduce the pain and frustration of remediating vulnerabilities and enable them to better communicate with their security partners.

Using Telepresence 2 for Kubernetes debugging and local development

Telepresence 2 was recently released and (like Telepresence 1) it is a worthy addition to your Kubernetes tool chest. Telepresence is one of those tools you cannot live without after discovering how your daily workflow is improved. So what is Telepresence? It is too hard to describe all the functionalities of the tool in a single sentence, but for now I would describe it as the “Kubernetes swiss army networking tool”.

Q&A: Best Practices for Storing and Analyzing Time-Series Data

The exponential growth of machine generated data in recent years has created the need for solutions purpose-built to handle extremely high-frequency telemetry data. This has driven increasingly more organizations to adopt time series databases and address the unique challenges around ingesting, analyzing, and storing massive amounts of time-series data.

Creating Custom Slack Commands

Site Reliability Engineers are expected to know everything that’s happening, all of the time. That’s a lot of things! To help you sift through the noise, we’ve developed a feature that lets you find accurate data about your organization on-demand. You can do this by sending custom-designed commands to FireHydrant directly from your integrated Slack account.

Should you ever reinstall your Linux box? If so, how?

Broadly speaking, the Linux community can be divided into two camps – those who upgrade their operating systems in-vivo, whenever there is an option to do so in their distro of choice, and those who install from scratch. As it happens, the former group also tends to rarely reinstall their system when problems occur, while the latter more gladly jump at the opportunity to wipe the slate clean and start fresh. So if asked, who should you listen to?

Visualize the financial impact of RIs and Savings Plans with Eco's Savings Over Time report

When buying AWS Savings Plans and RIs, Spot Eco makes it easy to create a highly utilized and well-balanced blend of Savings Plans and RIs. Leveraging 3rd party Standard RIs (with shorter terms) alongside Convertible RIs, and Savings Plans, Eco helps ensure maximum savings with minimum commitment.

Sleuth + SOC 2 Type II: Our constant commitment to security

In Sleuth’s continuing efforts to help our customers deliver faster and more safely, we have always put security as a top-level business priority. Security and privacy of our customers’ data are always at the forefront of our design, development, and deployment concerns. We understand the level of trust our customers put in us when they connect key systems together with Sleuth.

Recover automatically from failed deployments with Argo Rollouts and Prometheus metrics

Argo Rollouts is a progressive delivery controller created for Kubernetes. It allows you to deploy your application with minimal/zero downtime by adopting a gradual way of deploying instead of taking an “all at once” approach. Argo Rollouts supercharges your Kubernetes cluster and in addition to the rolling updates you can now do In the previous article, we have seen blue/green deployments.

What's Redgate's plan for PASS?

My blog post from February 1 explains that Redgate took the opportunity to purchase the assets of PASS with the main goal of supporting the community. The PASS association ran for 21 years bringing together a community to connect, share, and learn. The community of course lives on, however the association no longer exists as it once did. Working with SQL Server and the data platform is what unites us all. Data is at the heart of everything we do.

3D-printed, Sleuth logo UNBOXING

Andy, a regular viewer of Don's dev-focused Twitch streams, created a 3D-printed, 100 LED RGB Sleuth logo, and this is its unboxing. Don and Andy also get it working, connected to the internet, then Don extends his Twitch chat bot to allow viewers to change the logo's lights. The stream finishes with Don hooking the logo up to Twitch follow events so that when a viewer starts following, the logo lights up. This video is lightly edited from the original Twitch stream. Huge thanks to Andy for building and sharing such a cool project!

Empowering Founding Engineers

Massive tomes have been written on engineering management, but I thought it might be helpful to take a brief minute to discuss setting up your Founding Engineers (FE) for success. For this post I define FEs as the first wave of engineers hired after the founding team. This round of hiring usually takes place after seed funding has been secured and some semblance of initial product/market fit has been achieved.

Cloud 66 Feature Highlight: Archived Application

Archived Applications allows developers to "park" an application at any time. Archiving an application saves the configuration and state of your deployment, turns your servers off, and puts your application into a dormant state. Once your application is archived, you will not be able to open, edit or access it until you restore it. Servers used by this application will be turned off (deactivated) but not deleted. Please do not delete these servers, or you will not be able to restore your application. Note that you don't pay Cloud 66 for applications that are archived.

Automatically Assess and Remediate the SolarWinds Hack

With software supply chain attacks on the rise, are you wondering how you can recover quickly from the recent SolarWinds breach at your company? Months after its discovery, the devastating SolarWinds hack remains a top concern for business, government and IT leaders. This destructive supply chain attack put the spotlight on software development security — a critical issue for the DevOps community.

Incident triage: a key element in your MTTR

One of the key performance indicators for IT Ops is MTTR (Mean-Time-To-Resolution). MTTR essentially measures the length of your incident management lifecycle: from detection; through assignment, triage and investigation; to remediation and resolution. IT Ops teams strive to shorten their incident management lifecycle and lower their MTTR, to meet their SLAs and maintain healthy infrastructures and services. But that’s often easier said than done.

Lift-and-Shift Cloud Migrations: The Good, the Bad, and How To Avoid the Ugly

There are many paths to the cloud, and the one you choose depends on your particular digital transformation requirements and resources. About a decade ago, Gartner cleverly developed an alliterative nomenclature to describe five different migration strategies: the five Rs. That list has evolved over time and there are a lot of 5-, 6-, and 7-strategy variations out there.

From lightweight to featherweight: MicroK8s memory optimisation

If you’re a developer, a DevOps engineer or just a person fascinated by the unprecedented growth of Kubernetes, you’ve probably scratched your head about how to get started. MicroK8s is the simplest way to do so. Canonical’s lightweight Kubernetes distribution started back in 2018 as a quick and simple way for people to consume K8s services and essential tools.

What are MTTx Metrics Good For? Let's Find Out.

Data helps best-in-class teams make the right decisions. Analyzing your system’s metrics shows you where to invest time and resources. A common type of metric is Mean Time to X, or MTTx. These metrics detail the average time it takes for something to happen. The “x” can represent events or stages in a system’s incident response process. Yet, MTTx metrics rarely tell the whole story of a system’s reliability.
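As an illustration of why an average alone rarely tells the whole story, here is a minimal sketch that computes a mean time to resolve next to the median. The incident timestamps are made up for the example and do not come from the source.

```python
from datetime import datetime
from statistics import mean, median

# Hypothetical incidents as (detected_at, resolved_at) pairs.
incidents = [
    (datetime(2021, 4, 1, 9, 0), datetime(2021, 4, 1, 9, 20)),
    (datetime(2021, 4, 3, 14, 0), datetime(2021, 4, 3, 14, 30)),
    (datetime(2021, 4, 7, 2, 0), datetime(2021, 4, 7, 2, 25)),
    (datetime(2021, 4, 9, 11, 0), datetime(2021, 4, 9, 16, 0)),  # one long outage
]


def minutes_to_resolve(events):
    """Duration of each incident in minutes."""
    return [(end - start).total_seconds() / 60 for start, end in events]


durations = minutes_to_resolve(incidents)
mttr = mean(durations)       # the "mean time to X" figure
typical = median(durations)  # less sensitive to a single outlier

print(f"MTTR: {mttr} min, median: {typical} min")
```

With these sample values, the single five-hour outage pulls the mean far above the median, which is the kind of detail a lone MTTx number hides.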

Multi-instance GPU (MIG) with MicroK8s on NVIDIA A100 GPU

Although Kubernetes revolutionised the software life cycle, its steep learning curve still discourages many users from adopting it. MicroK8s is a production-grade, low-touch Kubernetes that abstracts the complexity and can address use cases from workstations to clouds to the edge. We’ll highlight the details of MicroK8s’ simplicity and robustness and demonstrate the different usage scenarios, running it on NVIDIA DGX, EGX, DPU and Jetson hardware using real applications from NVIDIA marketplace.

Kubernetes Master Class - How to Update Monitoring After Upgrading to Rancher 2.5

Rancher 2.5 introduces a new, improved monitoring integration. It is still based on Prometheus, Grafana and Alertmanager, but much more flexible regarding configuration options and customizations. It also directly ships with much improved dashboards and alerting rules. Unfortunately, due to the necessary internal changes, there is no automatic upgrade path available from the old to the new monitoring. While you can continue to use the old monitoring with 2.5, there are some manual migration steps necessary to get all the benefits from the new monitoring system and keep all the configurations and customizations from the old one.

The More You Monitor - What is AIOps?

You might know AIOps as just another buzzword that's been getting thrown around almost as much as 'cloud' and 'digital transformation', but do you really know what AIOps is and how it uses AI and machine learning to unlock a whole new realm of possibilities and efficiencies? In this episode of The More You Monitor, Lead Sales Engineer Donde Aponte walks you through what AIOps and Observability truly mean and how this new approach to IT operations can save IT and DevOps teams tons of time and stress by dramatically shortening MTTR and reducing the number of outages and slowdowns.

A Guide to Kubernetes Certifications

In an age of virtualization and cloud computing, developers increasingly use Kubernetes’ open-source platform to manage containerized workloads and services. Kubernetes container became popular because it was impossible to define a resource boundary for multiple applications in a traditional CPU environment. Misuse of resources created an inefficient environment.

Using K8s But Not Overhauling Your DevOps Processes

Kubernetes is now the industry standard for organizations that are born in the cloud. Slowly, many enterprises and mid-level companies are adopting it as the default platform for managing their applications. But we all know, Kubernetes adoption has its own challenges, as well as its associated costs. How do we decide when and what to migrate to Kubernetes? Does migrating to Kubernetes mean overhauling all devops processes?

How Much Does AWS Fargate Cost?

Amazon Web Services (AWS) provides numerous solutions for developers. One of these is Fargate, a serverless container service that allows users to run containers on AWS without the need to manage any underlying infrastructure. Containers are helpful because they solve the problem of getting applications to run reliably when the application is moved from one computing environment to another. AWS Fargate works with both Amazon Elastic Container Services (ECS) and Amazon Elastic Kubernetes Service (EKS).

Practitioner's Guide: An Introduction to Kubernetes Multi-tenancy

If your organization is adopting multiple Kubernetes clusters, chances are that multiple users or groups have access to these clusters on the same shared infrastructure. Kubernetes multi-tenancy aims to drive efficient use of infrastructure, while providing operators with robust isolation mechanisms between users, workloads, or teams. Running more applications on the same shared infrastructure means better utilization of resources and a reduction in overall operating costs.

Why Enterprises need to Modernize AIX (WebSphere) Workloads to Linux

IBM’s AIX operating system has powered zillions of mission-critical applications for over three decades, giving enterprise applications the edge to do more. And let’s not forget that a huge chunk of BFSI applications is still hosted on AIX within their own data centers due to its security, performance, and reliability.

Datadog acquires Sqreen to strengthen application security

We began our security journey last year with the release of Datadog Security Monitoring, which provides runtime security visibility and detection capabilities for your environment. Today, we are thrilled to announce that Sqreen, an application security platform, is joining the Datadog team. Together, these products further integrate the work of security, development, and ops teams—and provide a robust, full-stack security monitoring solution for the cloud age.

How Does Microservices Architecture Change Database Deployment?

This question was raised at the recent Redgate Summit: How does the implementation of a microservices architecture affect the implementation of a database DevOps approach? I could even rephrase it a little: Does a microservices architecture affect a database DevOps approach?

Having On-call Nightmares? Runbooks can Help you Wake Up.

You aren't sure how long you've been here, but the view outside the window sure is soothing. Before you can fully take in your surroundings, a siren rips you back into the conscious world. Slowly, you begin to piece together that you exist, and you are on call. The ringing, much louder now, pierces through your skull as you begin to open your bleary eyes. You turn over your pillow, grab your phone, and click through the PagerDuty notification.

A Quick Guide to Developing Steps for Relay

Relay has a substantial library of external services and tools — as of March 2021 there are 60 integrations in our GitHub organization. Each integration repo can contain multiple triggers, containers that receive webhook payloads from other services, and steps, which Relay executes to get stuff done in your workflow.

Continuous integration that you can trust: announcing SOC 2 certification

At CircleCI, we care about security. In 2018, we became the first CI/CD tool to meet the rigorous security and privacy standards required by government agencies to get FedRAMP authorized. Now, CircleCI is SOC 2 certified, adding another industry-recognized security accreditation.

Top 5 Challenges in the Adoption of multi-cloud Strategies

In my previous blog post, we discussed why research shows that over 90% of enterprises are embracing a multi-cloud strategy. Simply put, a multi-cloud approach offers significant advantages to organizations seeking to optimize the Three Cs of Cloud: Cost, Capabilities and Compliance. Still, with all the advantages to be gained with a multi-cloud approach, executives should be aware of the downsides. Here is my take on the top 5 challenges of multi-cloud adoption.

How to Integrate Microsoft Teams and Netreo

Netreo’s metrics, event monitoring, and notification capabilities can be extended to 3rd-party collaboration and messaging platforms for maximum operational efficiency. As outlined in our previous post, integrations with Netreo already include popular services such as Slack, PagerDuty, Jira, ServiceNow, and ZenDesk. Microsoft Teams is another messaging and collaboration application that enterprises are rapidly adopting.

IAM Policies: Good, Bad & Ugly

In my last post we looked at the structure of AWS IAM policies and looked at an example of a policy that was too broad. Let's look at a few more examples to explore how broad permissions can lead to security concerns. By far the most common form of broad permissions occurs when policies are scoped to a service but not to specific actions.
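To make "scoped to a service but not to specific actions" concrete, here is a minimal sketch in Python. It is not taken from the original post: the two policy documents, the bucket name example-bucket, and the helper service_wide_actions are all hypothetical and exist only for illustration. The helper flags Allow statements whose Action is a wildcard such as "s3:*".

```python
# Hypothetical policy documents for illustration; the bucket name is made up.
broad_policy = {
    "Version": "2012-10-17",
    "Statement": [
        # Scoped to a service (S3) but not to specific actions.
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}

scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        },
    ],
}


def service_wide_actions(policy):
    """Return Allow actions that are wildcards, e.g. "*" or "s3:*"."""
    found = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # Statement may be a single object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # Action may be a string or a list
            actions = [actions]
        found.extend(a for a in actions if a == "*" or a.endswith(":*"))
    return found


print(service_wide_actions(broad_policy))
print(service_wide_actions(scoped_policy))
```

A check like this only catches the most obvious case; partial wildcards such as "s3:Get*" and NotAction statements would need separate handling.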

What is an ARP Table?

ARP (Address Resolution Protocol) is the protocol that bridges Layer 2 and Layer 3 of the OSI model, which in the typical TCP/IP stack effectively glues together the Ethernet and Internet Protocol layers. This critical function allows for the discovery of a device’s MAC (media access control) address based on its known IP address. By extension, an ARP table is simply the method for storing the information discovered through ARP.
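As a rough illustration of what "storing the information discovered through ARP" can look like, the sketch below models an ARP table as a small in-memory cache that maps IP addresses to MAC addresses. The class name ArpTable, the 60-second expiry, and the sample addresses are assumptions made for this example; real operating systems manage their ARP caches with their own timeouts and entry states.

```python
import time


class ArpTable:
    """Toy ARP cache: IP address -> (MAC address, time the entry was learned)."""

    def __init__(self, ttl_seconds=60.0):
        # The expiry time is an assumption for this sketch, not a standard value.
        self.ttl_seconds = ttl_seconds
        self._entries = {}

    def learn(self, ip, mac, now=None):
        """Record the MAC address seen for an IP (e.g. from an ARP reply)."""
        now = time.monotonic() if now is None else now
        self._entries[ip] = (mac.lower(), now)

    def lookup(self, ip, now=None):
        """Return the cached MAC for an IP, or None if unknown or expired."""
        now = time.monotonic() if now is None else now
        entry = self._entries.get(ip)
        if entry is None:
            return None  # a real stack would now send an ARP request
        mac, learned_at = entry
        if now - learned_at > self.ttl_seconds:
            del self._entries[ip]  # stale entry: drop it and force rediscovery
            return None
        return mac


table = ArpTable(ttl_seconds=60.0)
table.learn("192.0.2.10", "AA:BB:CC:DD:EE:FF", now=0.0)
print(table.lookup("192.0.2.10", now=30.0))
print(table.lookup("192.0.2.10", now=61.0))
```

The explicit now parameter is there only so the expiry behaviour can be exercised without waiting; callers would normally omit it.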

What is YAML? A Beginner's Guide

YAML is a digestible data serialization language, designed for human interaction, that is often used to create configuration files and works alongside any programming language. It’s a strict superset of JSON, another data serialization language, so it can do everything that JSON can and more.

Can devs and designers get along? - Dev Matters S1E02

In this episode of Dev Matters, Don and his guest Ben Sanders discuss whether software developers and designers can really get along. Ben shares stories, tips, and strategies pulled from his experience of over 15 years in the software industry. This episode was recorded in front of a live studio audience on Twitch.

IDC: Become a Digital Innovation Factory with These 4 Pillars of Modern DevOps

It’s do or die. In today’s brutally competitive digital economy, it is imperative for organizations to transform themselves into software-driven businesses — becoming “digital innovation factories” that can quickly and efficiently create and distribute new digital services. This enables them to be resilient, nimble, and innovative, creating business value and responding to market shifts and to customer needs. What does it take to keep your digital innovation factory humming?

AWS Reserved Instances 101: The Complete Guide

Choosing the right service plan is crucial when you are using Amazon Web Services (AWS). With 170 distinct services, ranging from compute to storage to networking and content delivery — each offered at different price points — the process requires careful consideration to make the right choice for your business. By default, AWS services are available on-demand and you pay a monthly bill for services used.

The Top 8 Tools For Any Java Developer's Toolkit

Java developers are unlikely to ever suffer from a lack of libraries, utilities, and programs at their disposal. There's no shortage of tools that offer niche as well as fundamental features. However, some tools undoubtedly stand out due to their popularity, usefulness, data representation, and in-tool features. These 8 tools listed below are often listed as some of the best Java development tools available on the market.

The State of Robotics - March 2021

It’s never too late to learn. Like any reinforcement learning agent, we get rewarded by the new knowledge that we acquire. Likewise, we learn by doing, by rolling up our sleeves and getting to work. (Do you want a hands-on book on Reinforcement Learning? Here is my personal favourite.) March has shown us great examples of this. From robots learning to encourage social participation to detecting serious environmental problems, it was a learning month.

Visualizing your CloudFormation Template with Stackery

Stackery can be used to create a new CloudFormation template or to quickly visualize an existing one. Code is automatically generated as you simply drag-and-drop resources on a graphical grid. The experience is much more intuitive than previous generation tools like AWS CloudFormation Designer. Stackery visualizes resources the way a human would perceive them, grouping related resources together.

Funding update: $840k secured and more to come

As with all start-ups, especially for a cloud provider, access to funds is imperative to build and scale quickly – after all building out new data centre regions doesn't come cheap! So in recent months we quietly opened a seed round to acquire $2.8m worth of funding – giving Civo a pre-money valuation of $16,800,000. Since launching into beta nearly 2 years ago, we’ve had tons of VC companies knocking on our door, but at this stage we decided not to take VC money.

How to Extend your Monitoring with Automation and Scripting - VirtualMetric Webinar

As API adoption grows, so does the complexity of API use cases. More and more organizations are using APIs to get the most out of their monitoring solutions. With the help of automation and scripting, you can customize your monitoring based on your business-specific needs. Sounds complicated, but we've got you covered.

Qovery goes beyond app deployment - The Future of Qovery - Week #5

During the next six weeks, our team will work to improve the overall experience of Qovery. We gathered all your feedback (thank you to our wonderful community 🙏), and we decided to make significant changes to make Qovery a better place to deploy and manage your apps. This series will reveal all the changes and features you will get in the next major release of Qovery. Let's go!

Reduce Toil with Better Alerting Systems

If not tackled early, increasing toil can affect the morale and productivity of your SRE team. In this blog we look at some of the ways you can counter toil with the help of better alerting systems in place. Are you an SRE or On-call engineer struggling to manage toil? Toil is any repetitive or monotonous activity that can lead to frustration within an incident management team. Also at the business level, toil doesn't add any functional value towards growth and productivity.

Run confidently with secure DevOps

The rapid pace of digital transformation is accelerating the shift to cloud-native applications using containers and Kubernetes to speed the pace of delivery. But application delivery is one thing. Application uptime, performance, and protection are another. For cloud teams already running production, one fact is clear: monitoring and troubleshooting are only the beginning. They also need to own security and compliance for their apps. In cloud-native, DevOps is not enough. It's time for secure DevOps.

HAProxy Forwards Over 2 Million HTTP Requests per Second on a Single Arm-based AWS Graviton2 Instance

For the first time, a software load balancer exceeds 2-million RPS on a single Arm instance. A few weeks ago, while I was working on an HAProxy issue related to thread locking contention, I found myself running some tests on a server with an 8-core, 16-thread Intel Xeon W2145 processor that we have in our lab. Although my intention wasn’t to benchmark the proxy, I observed HAProxy reach 1.03 million HTTP requests per second.

Top Observability Strategies for Distributed Systems

In a distributed IT environment, there are a lot of moving parts, and all of them need to be monitored to ensure everything is working as it should. The rise of more complex infrastructures interweaving the cloud, on-premises, and hybrid architectures makes this a challenge. To make sure you have adequate visibility, you need an IT observability strategy.

Just call us "Major Incident Software Innovation of the Year"

We won an award! We're excited to share that we were named the Major Incident Software Innovation of the Year 2020 at the MIM Awards. Our CEO, Robert Ross (better known as Bobby), accepted over video on our behalf (watch the video below). A lot happened for us in 2020: we won new business, grew as a team, and matured our product. We're excited that MIM felt the same way about us and we're honoured to receive this award!

Kubernetes 1.21 available from Canonical

Today, Canonical announces full enterprise support for Kubernetes 1.21, from cloud to edge. Canonical Kubernetes support covers MicroK8s, Charmed Kubernetes and kubeadm. Starting with 1.21, Canonical commits to supporting N-2 releases as well as providing extended security maintenance (ESM) and patching for N-4 releases in the stable release channel.

Visualizing CloudFormation templates

As your infrastructure grows, getting a handle on all your AWS resources can be overwhelming. While that’s probably an understatement, help could be right around the corner. We’ll cover a few CloudFormation visualizer tools that can help, but let’s level set first. AWS CloudFormation is an established Infrastructure-as-Code solution that allows you to define, provision, organize, manage and update your AWS resources from a text-file template.

AI in the enterprise: Avoid hitting the infrastructure performance wall

“It’s nearly impossible to manage the growing complexity for corporate on-prem and Cloud infrastructure,” says Tim Conley, Principal at The ATS Group & Galileo Suite. “Most IT teams use a mix of tools to monitor and measure the health of their environment. However, this delays incident resolution, contributes to silos within an IT organization, and slows down your business.”

Have your say on the state of database monitoring in 2021

Since 2018, over 2,400 SQL Server professionals have provided valuable insights into how they monitor and manage their estates, and what challenges they’re facing, through the only industry-wide survey of its kind. The results of the annual survey have not only benefited the community but also helped us better understand how we could shape our own product development to deliver more value where organizations need it.

Monitor your SQL Server databases in the cloud and on-premises with one monitoring tool

There’s no doubt the cloud is having a big impact on the nature and make-up of SQL Server estates. The 2021 State of Database DevOps report from Redgate, for example, showed that 58% of organizations now use the cloud either wholly or in combination with on-premises servers, compared to 46% in the same report a year earlier.

How to Ensure Successful Remote Support

In recent times, particularly during the pandemic, working remotely has become the new normal. Not only is it a need of the time, but employers have also started acknowledging the benefits of a remote workforce. Some of these include cost elimination of renting a workspace, access to a wider talent pool, and increased productivity. Furthermore, a better work-life balance also relates to higher employee satisfaction, loyalty and retention.

Enabling Profitable Business Services through OTN Switching

OTN Switching in the Access can change the equation for offering attractively-priced L1 business services. By eliminating the complexities of muxponders and transponders, DWDM optical and traffic engineering, OTN switching technology can enable most common services over a greatly simplified architecture. Efficiencies can be achieved in both capital as well as in operational costs allowing a competitive edge to your services portfolio.

Adding IaC security scans to your CI pipeline

The adoption of Infrastructure as Code (IaC) has skyrocketed in recent years as engineers seek ways to deploy cloud infrastructure faster and more efficiently. IaC refers to the technologies and processes that manage and provision infrastructure using machine-readable languages (code) as opposed to inefficient manual operations.

How Long Does It Take To Get Started With CloudZero?

There’s no question that engineering teams at innovative tech companies are busy. No matter how successful your company is, there’s always a long backlog of feature enhancements that can make it challenging to focus on other priorities. So, it’s no surprise that one of the most common questions we get from companies that engage with us is “How much of an effort will this be for my engineering team?” The short answer is that you can get value within minutes.

Container Sprawl Is the New VM Sprawl

We are seeing organizations struggle to deploy and manage their Kubernetes clusters due to the increasing level of oversight required and the current lack of attention during the planning phase. Day 2 operations can be a “sink or swim” time for these organizations. Without effective Day 2 operations, organizations will face challenges scaling their IT environment and will not be ready to handle new threats to security and availability.

Monitor Azure Service Health events with Datadog

Azure Service Health continuously notifies you of issues that may affect the availability of your environment, such as service incidents, planned maintenance periods, or regional outages. We’ve recently enhanced our Azure integration to include additional support for monitoring Service Health issues, enabling you to keep tabs on the health of your Azure environment and take proactive measures to mitigate downtime.

Run container-optimized clusters with Ocean and Bottlerocket OS

AWS is one of the primary providers for services that help users deploy and manage their containerized applications on the cloud. Since launching ECS in 2014 and EKS in 2017, AWS has learned a lot about running containers at scale and in production. AWS developed Bottlerocket OS, a new operating system for hosting containers. This OS was specifically designed to address gaps left by the ECS and EKS-optimized AMIs, which are based on operating systems that run traditional software applications.

Announcing Relay's General Availability Launch

Today we’re proud to announce the general availability of Relay, a cloud-native workflow automation platform. We launched our public beta of Relay last June, and we’re now officially out of beta and open for business! We’ve been pretty busy during the beta period - early users have executed thousands of workflows, processed tons of events, and given us incredibly helpful feedback.

How Netflix Uses Fault Injection To Truly Understand Their Resilience

Distributed systems such as microservices have defined software engineering over the last decade. The majority of advancements have been in increasing resilience, flexibility, and rapidity of deployment at increasingly larger scales. For streaming giant Netflix, the migration to a complex cloud-based microservices architecture would not have been possible without a revolutionary testing method known as fault injection. With tools like Chaos Monkey, Netflix employs a cutting-edge testing toolkit.

Intro To Flyway - Enabling Cross Database Migrations

In this video Solution Engineer Dave Ong will provide you with an insight into Flyway, Redgate’s cross database solution to help you manage database migrations and achieve your DevOps goals. We will look at how you can leverage both versioned and repeatable migrations with both manual and CI/CD deployments that can be unlocked with the power of Flyway.

Why Stateful Applications Need a Cloud-Native Storage Stack

Enterprises are increasingly moving their applications to containerized architectures (typically using Kubernetes, the most popular container orchestration platform in use today). They’re doing this to take advantage of the significant benefits containers provide—namely portability, flexibility, speed of deployment, DevOps agility and resource efficiency.

A Complete Guide to Enterprise DevOps

It’s easy to assume that DevOps only works for start-ups that build their culture from scratch, or for tech giants with cloud-native roots. But in reality, DevOps best practices can benefit everyone—from agile new businesses to decades-old enterprises. As a result, DevOps adoption is on the rise, with 74% of enterprises adopting DevOps in some form.

Hardened ROS with 10 year security from Open Robotics and Canonical

Canonical ROS ESM customers can now access a long-term supported ROS and Ubuntu environment backed by the Ubuntu and ROS experts. Learn more about ROS ESM. 6 April 2021: Canonical and Open Robotics announced today a partnership for Robot Operating System (ROS) Extended Security Maintenance (ESM) and enterprise support, as part of Ubuntu Advantage, Canonical’s service package for Ubuntu. ROS support will be made available as an option to Ubuntu Advantage support customers.

Integrate FlashDrive inside your multi-cloud strategy

Your company, your infrastructure, and probably your whole business rely more and more on cloud services to serve your clients, and your cloud infrastructure is probably a critical asset for your company. At FlashDrive our mission is to offer a simple and reliable way to deploy apps while we take care of the infrastructure and make sure your apps and services are always online and ready to scale on demand.

How Do You Overcome "We Have Always Done It This Way"?

I work in computers and my son works in manufacturing, but both of us loathe a single phrase: We have always done it this way. Please allow me to be clear on this. If you can back up this statement with “Because…”, and you list out valid points, even if I disagree with them, we’re all good. However, frequently, if you follow up this statement with the simplest of questions, “Why?”, you don’t get good answers. Usually, you’ll get a repetition of the phrase.

So you Want an SRE Tool. Do you Build, Buy, or Open Source?

As your organization’s reliability needs grow, you may consider investing in SRE tools. Tooling can make many processes more efficient, consistent, and repeatable. When you decide to invest in tooling, one of the major decisions is how you’ll source your tools. Will you buy an out-of-the-box tool, build one in-house, or work with an open source project? This is a big decision. Switching methods half-way through adoption is costly and can cause thrash.

Deploying infrastructure with an approval job using Terraform

If you are looking for an Infrastructure as Code (IaC) tool, Terraform probably tops your list. In this tutorial, you will learn how to automate the deployment of changes to your infrastructure using Terraform and CircleCI workflows. The workflows will use Approval Jobs. For this project, we will deploy the infrastructure we build to Google Cloud Platform (GCP).

3 Ways for Administrators to Scale DevOps Projects | JFrog

With the global pandemic transforming the majority of customer interactions and transactions to operate in a contactless world, software development has accelerated to address the market shift towards digital businesses. For many enterprises, managing this increase in the volume and velocity of development projects has added stress to their processes and workforce.

Why Modernizing the Data Layer Requires More than New Tools

While architectures and platforms like Kubernetes get a lot of attention in discussions about application modernization, we ignore the data layer at our own risk. How applications and users access data is a concern that gets more important by the day. It’s a trend we’ve seen playing out for a while, as technological concerns around latency and scalability have ceded ground to business-level concerns around compliance, security, and data privacy.

Using Modern DevOps Practices to Become a Digital Innovation Factory

Today, it is imperative for businesses to transform themselves into digital innovation factories with the agility to quickly and efficiently create and distribute new digital products and services. In this webinar, guest speakers from IDC will share key insights, market data, and best practices around the four strategic pillars for establishing a digital innovation culture as well as the modern DevOps methods and tools required to support these practices.

How to Be the Hero (Company) They Need AND Want (To Work At)

If you’re like most companies, you made many changes to your business throughout the COVID-19 pandemic. You may have adopted new tools, implemented new processes, or completely changed the way your business runs. Your workforce has likely gone fully remote—at least for a while—which requires more tools and processes to support remote employees.

Why We Built the CloudZero Platform on Serverless Infrastructure - and Its 2 Main Advantages

The promise of running a business on the cloud is that — in theory, at least — you should be able to scale your infrastructure up and down with your customer utilization. This should lead, again in theory, to less maintenance and more opportunity for innovation.

Analyze and audit your infrastructure as code with stack.new

Defining and managing your AWS resources using an Infrastructure-as-Code (IaC) approach implemented with CloudFormation templates makes a lot of sense. While implementing IaC is a widely accepted best practice, it does come with challenges. Managing your infrastructure from lines of code and text-file templates, in the case of AWS CloudFormation, can quickly become overwhelming. We built stack.new to ease that pain.

Is the cloud coming to all of us?

During the past twenty years, so much has changed in the IT office. Two decades ago, we were still using dial-up modems. Now, the entire world wide web is at our fingertips, and our world of IT is more efficient but more complicated too. A few significant IT trends have also developed during this time. One of the most important is the cloud, which has also become a common buzzword in business. Like many buzzwords, there is a lot of excitement and confusion surrounding the term.

Puppet on Windows: Top questions (and answers!)

Whether you’re a current customer looking to expand across your Windows estate, or thinking of deploying Puppet across your infrastructure for the first time, we hope this blog post — based on real-world customer questions and problems — can help answer some of the questions you may have about Puppet.

Announcing our latest attacks to deal with meeting fatigue

Gremlin empowers you to proactively root out failure before it causes downtime. See how you can harness chaos to build resilient systems by requesting a demo of Gremlin. With everyone working remotely, video conference tools like Zoom have been a critical part of maintaining business continuity. It’s truly amazing that we can continue to work and connect with one another, even during a time where getting together in an office hasn’t been possible…

How Should your Business Approach Multi-Cloud Adoption?

The year 2020 can be seen as a major win for cloud infrastructure, even though it has been a tough year socioeconomically. Even before the pandemic, experts predicted that 83 percent of enterprise workloads would be residing in the cloud by 2020. Now, as more enterprises are going full cloud, they are considering multi-cloud. As more people work from home, cloud computing is becoming more of a necessity. For a decade now, companies have been using the cloud for daily activities and communication.

Looking back at almost a decade of DevOps and forward to what's coming next

TL;DR: This year’s State of DevOps Report is the 10th anniversary edition of this annual research on how practitioners are making DevOps work for them. Whether you’re a big time CircleCI user or are just beginning your career, we want to hear from you. Please take the survey so your voice is represented in the 2021 State of DevOps.

What is Relay (by Puppet)?

Relay is an easy-to-use workflow automation tool for cloud infrastructure operations. Build and share fully automated workflows in minutes instead of days. Ensure hybrid cloud environments are secure, compliant, and cost-efficient. Improve operational efficiency with increased reliability, minimized hiring of specialized roles, and reduced time to recovery through automation that the whole organization can use.

One Year of Graviton2 at Honeycomb

A year ago, we wrote about our experiences as early adopters of Graviton2, and how we were able to see 30% price-performance improvements on one dogfood workload from switching to the arm64 architecture. In those initial experiments, we validated running 20% fewer shepherd ingest workers, using the m6g instance type, which cost 10% less per instance compared to c5 instances.
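One way to read how those two figures combine is shown below as a quick calculation. It assumes the smaller m6g fleet handles the same ingest load as the original c5 fleet, which the excerpt implies but does not spell out; the function name relative_fleet_cost is made up for this sketch.

```python
def relative_fleet_cost(worker_ratio, price_ratio):
    """Fleet cost relative to the baseline, given fleet-size and per-instance price ratios."""
    return worker_ratio * price_ratio


# 20% fewer workers and 10% lower price per instance, both taken from the text above.
ratio = relative_fleet_cost(worker_ratio=0.80, price_ratio=0.90)
savings_percent = round((1 - ratio) * 100, 1)
print(savings_percent)
```

Under that assumption the fleet costs about 72% of the baseline, a saving of roughly 28%, which is in line with the roughly 30% price-performance figure quoted.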

GitHub vs GitLab: An Impartial Guide

In our latest tools guide, we wanted to gather insights from a number of real users of these two giants in the Git & version control space to help you decide between using GitHub or GitLab for your latest software development project. “GitHub is a common and easy-to-use website to host code in a way that's shareable with a large number of people”, states Melanie, Content Director at KitelyTech.

Virtana Awarded 'Customer First' Status by Gartner for ITOM (IT Operations Management); Receives All 5-Star Reviews

The Virtana team are excited to announce that we have pledged to be a Customer First vendor in the ITOM (IT Operations Management) market for our product(s): VirtualWisdom, CloudWisdom, Virtana Platform, Virtana Migrate (Cloud Migration Readiness). Our team takes great pride in this program commitment, as customer feedback continues to be a critical priority, and shapes our products and services. Everyone at Virtana is deeply proud to be part of the Customer First program.