This is our third post sharing our real-world experience of using Discord for more than a year. I think it will be interesting for any company that wants to weigh the pros and cons of using Discord over Slack. At Qovery, we are a remote-first software company. When we decided to move from Slack to Discord 13 months ago, we were only 3 developers on the team.
Arm processors have long been at the center of mobile computing, powering billions of smartphones, tablets, smartwatches, and other IoT devices. Today, these processors are beginning to see broader adoption in the cloud as they promise better performance, higher energy efficiency, and lower costs than their x86-based predecessors. Just this week, Oracle announced its new Oracle Cloud Infrastructure Ampere A1 Compute platform, built on the Ampere Altra Arm processor.
Five worthy reads is a regular column on five noteworthy items we’ve discovered while researching trending and timeless topics. Distributed cloud allows organizations to bring cloud computing closer to their location. This week we look at why it’s the future of cloud computing.
What are the differences between incident management and incident response? The answer varies widely depending on whom you ask.
It has been five months since we announced project Harvester, open source hyperconverged infrastructure (HCI) software built using Kubernetes. Since then, we’ve received a lot of feedback from the early adopters. This feedback has encouraged us and helped in shaping Harvester’s roadmap. Today, I am excited to announce the Harvester v0.2.0 release, along with the Beta availability of the project!
It’s official: regulations by the US and UK governments are coming down the track to secure the software supply chain.
ECS Anywhere allows you to use Amazon Web Services’ container service outside of the AWS cloud, and Canonical is proud to be a launch partner for this service. Using Ubuntu as the base OS for your ECS clusters on-prem or elsewhere will allow you to benefit from Ubuntu’s world-leading hardware support, professional services, and vast ecosystem, in turn allowing your ECS clusters to run with optimal performance everywhere you need it.
Amazon Elastic Container Service (ECS) is a managed compute platform for containers that was designed to be simple to configure, with opinionated defaults to help users get started quickly. ECS customers can run containerized workloads on either Amazon EC2 instances or the serverless Fargate platform without having to maintain a control plane—and can easily integrate ECS with other AWS resources, like Network Load Balancers, to architect their infrastructure.
In the face of unexpected crises or disruptions, maintaining business continuity has become more important than ever. Last year, businesses around the world had to shift to a remote workforce model overnight. Were their IT departments prepared for this massive shift?
Testing is vital because it helps you discover bugs before you release software, enabling you to deliver a high-quality product to your customers. Sometimes, though, tests are flaky and unreliable, whether because of newly written code or external factors. These flaky tests, also known as flappers, fail to produce accurate and consistent results. If your tests are flaky, they cannot help you find (and fix) all your bugs, which negatively impacts user experience.
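One rough way to spot a flaky test is to rerun it several times against unchanged code and check whether the outcome varies. The sketch below assumes a test is a plain Python callable that raises `AssertionError` on failure. `classify_test`, `make_alternating_test`, and `always_passes` are hypothetical names used for illustration, not part of any testing framework.

```python
from typing import Callable


def classify_test(test: Callable[[], None], runs: int = 5) -> str:
    """Run `test` repeatedly and label it by how consistent the outcomes are."""
    if runs < 2:
        raise ValueError("need at least two runs to detect inconsistency")
    outcomes = []
    for _ in range(runs):
        try:
            test()
            outcomes.append(True)
        except AssertionError:
            outcomes.append(False)
    if all(outcomes):
        return "stable-pass"
    if not any(outcomes):
        return "stable-fail"
    # Mixed results on identical code are the signature of a flaky test.
    return "flaky"


def make_alternating_test() -> Callable[[], None]:
    """Build a deterministic stand-in for a flaky test: it fails on every second call."""
    calls = {"count": 0}

    def test() -> None:
        calls["count"] += 1
        assert calls["count"] % 2 == 1, "simulated intermittent failure"

    return test


def always_passes() -> None:
    assert 1 + 1 == 2


if __name__ == "__main__":
    print(classify_test(always_passes))
    print(classify_test(make_alternating_test()))
```

A real suite would usually rely on its test runner's rerun support rather than a hand-rolled loop. The point here is only that inconsistency across identical runs is the signal to look for.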
With 50% of the US adult population vaccinated, there’s a lot to look forward to this summer. Life no longer feels like it’s on hold, and we’re fully embracing that. Get your fire hoses ready, 'cause extinguishing incidents just got easier. We’re rolling out a summer full of new integrations, product releases, events, and more.
In 1946, a moth found its way to a relay of the Mark II computer in the Computation Laboratory where Grace Hopper was employed. Since that time, software engineers and operations specialists have been plagued by “bugs.” In the age of DevOps, we can catch many bugs before they escape into a production environment. Still, some occasionally escape, and when they do, they can spawn all kinds of unexpected problems.
Manufacturing is more important than ever as governments, businesses, and individuals rely on the industry to drive innovation and economic prosperity through employment and exports, producing both essential and non-essential products that enhance our daily lives.
With CircleCI, there are many different CI/CD flows that can be automated. One such flow is the use of Infrastructure-as-Code (IaC) to build cloud environments. For example, you can use CircleCI to automate the process of building Terraform plans and applying them, in order to create massive production setups in AWS, Azure, GCP, and other cloud environments.
Beaker is a Puppet testing harness focused on acceptance testing via interactions between multiple (virtual) machines. It provides platform abstraction between different Systems Under Test (SUTs), and it can also be used as a virtual machine provisioner: setting up machines, running any commands on those machines, and then exiting. Recently, Vox Pupuli, a collective of Puppet community authors, has taken over the care and feeding of Beaker for its continued widespread community use.
Argo Rollouts, part of the Argo project, recently released its 1.0 version. You can see the changelog and more details on the GitHub release page. If you are not familiar with Argo Rollouts, it is a Kubernetes controller that deploys applications on your cluster. It replaces the default rolling-update strategy of Kubernetes with more advanced deployment methods such as blue/green and canary deployments.
Back in the day, the role of the CIO was relatively clear: the focus was on deploying, managing, and maintaining IT systems across the organization. The CIO’s responsibilities started to blur around the millennium, when end users became more tech-savvy. The reasoning was that ‘they can now get their own technology and don’t need IT to do it for them’. This even led to the much-repeated “death of the CIO” meme.
Without a strategy in place, multi-tenancy will introduce a handful of challenges. Platform teams will be unable to do the following: As you’re defining policies for multi-tenant AKS, EKS, or GKE clusters, consider these tips: To help you get started on the right track, we created this cheatsheet for multi-tenancy success.
When you’re feeling the stress and pain around incidents, making the decision to find an incident management tool is a no-brainer. But how do you choose the one that will work for you, your team, and your business? You might be asking yourself: Where do I start? What do I need to know? What questions do I ask? What are the options? How can I be sure we’re choosing the right tool?
Chatbots can be like double-edged swords. They can either boost your customer service or turn customers away. Hence, you must make sure that you research and prepare properly before committing to them. This way, you will know how to optimize and maintain your chatbots to ensure their effectiveness. There are many chatbot benefits for business. In fact, 78% of businesses have started integrating such technology into their customer service in the past months.
You’re called in. The system is misbehaving. It could be a key metric going crazy, or exceptions starting to fire. You’re troubleshooting, beating around the bush, just to realize that one of the team’s deployments was the one messing things up. Sound familiar? If you’re practicing continuous deployment, you probably experience that several times a week, if not more. Users report that 50% of their outages are due to infrastructure and code changes, namely deployments.
Over 90% of Redfin’s metric data will be represented in Circonus’ log linear OpenHistograms, which will reduce their metric footprint by 50-60%. We’re pleased to announce today that Redfin, the technology-powered real estate brokerage, has selected Circonus to replace its existing metrics platform.
Since Docker popularized the concept in 2013, containers have become a mainstay in application development. Their speed and resource efficiency make them ideal for a DevOps environment as they allow developers to run software faster and more reliably, no matter where it is deployed. With containerization, it’s possible to move and scale several applications across clouds and data centers. However, this scalability can eventually become an operational challenge.
Kubernetes, a popular open source container orchestration system, enables you to easily deploy, monitor, and scale cloud-native application workloads in both private and public cloud environments. In other words, Kubernetes does the hard work of managing containerized applications, giving you more time to spend building them.
Prometheus is free and open-source software for real-time systems and event monitoring and alerting. Originally developed at SoundCloud, Prometheus became a project of the Cloud Native Computing Foundation in 2016, alongside other popular projects such as Kubernetes. To start using Prometheus, you’ll need a solid understanding of all of the tool’s functionality.
While Kubernetes is a very powerful and comprehensive application, it can also be very complicated and confusing to new users. Thankfully, the community is great at pulling together to try to tame the Kubernetes beasts, and as more users join the platform, more handy tools to help you manage your cluster are developed. Kubernetes Resources range from everyday helper tools to development tools to troubleshooting tools, and in this article we’ll discuss fifteen of the best ones.
AWS Lambda extensions enable you to seamlessly integrate third-party tooling with your Lambda environment so you can run custom code or monitoring agents alongside your functions. We’ve partnered with AWS to create a Lambda extension that offers a more cost-effective, simplified process for collecting data from your functions.
In October 2020 we released the community-driven roadmap for 2021. It's time to revisit and see all the things we have completed from the list! I am very proud to say that at Civo we have taken the community suggestions and implemented most of them during the launch on May 4th 2021. Let's dive into each of the features listed in the original blog post and see where we are with the 2021 Civo Roadmap.
HAProxy generates over a hundred metrics to give you a nearly real-time view of the state of your load balancers and the services they proxy, but to get the most from this data, you need a way to visualize it. InfluxData’s InfluxDB suite of applications takes the many discrete data points that make up HAProxy metrics and turns them into time-series data, which is then collected and graphed, giving you insight into the workings of your systems and services.
In a single, monolithic repository, also known as a monorepo, you keep all your application and microservice code in the same source code repository (usually Git). Typically, teams split the code of various app components into subfolders and use a Git workflow for new features or bug fixes. This approach is natural for most applications or systems developed using a monolithic architecture. Code in such a monorepo typically has a single build pipeline that produces the application executable.
There are multiple phases in the software development process that need to be completed before the software can be released into production. Those phases, which are typically iterative, are part of what we call the software development life cycle, or SDLC. During this cycle, developers and software analysts also aim to satisfy nonfunctional requirements like reliability, maintainability, and performance.
The Chia cryptocurrency is based on Proof of Space and distributes tokens according to a mechanism called plotting. In recent weeks, we've seen a lot of new accounts trying to launch and operate Chia miners from FlashDrive's infrastructure. Most of those accounts were created with fake or stolen credit cards for the sole purpose of getting Chia coins for free.
Datadog’s infrastructure list provides a central, high-level view of every host in your environment and pulls together metadata and relevant metrics from across Datadog to help you get the full picture of each one. You can easily filter and sort the list using any host tags, letting you quickly view the status of the parts of your infrastructure you need.
Incidents and outages caused by animals highlight the importance of flexibility and out-of-the-box thinking when it comes to SRE.
Setting and measuring latency Service Level Objectives (SLOs) is a critical responsibility for engineers monitoring the performance and health of their applications and systems. SLOs are an agreement on an acceptable level of availability and performance and are key to helping engineers properly balance risk and innovation.
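As a rough illustration of what measuring a latency SLO can look like, the sketch below computes the fraction of requests that finished within a latency threshold and compares it with a target. The function names, the 100 ms threshold, and the sample latencies are all assumptions made up for this example, not values from any particular monitoring product.

```python
from typing import Sequence


def latency_slo_compliance(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the fraction of requests that completed within `threshold_ms`."""
    if not latencies_ms:
        raise ValueError("no latency samples to evaluate")
    good = sum(1 for value in latencies_ms if value <= threshold_ms)
    return good / len(latencies_ms)


def meets_slo(latencies_ms: Sequence[float], threshold_ms: float, target_fraction: float) -> bool:
    """Check whether the observed compliance reaches the agreed target."""
    return latency_slo_compliance(latencies_ms, threshold_ms) >= target_fraction


if __name__ == "__main__":
    # Hypothetical request latencies in milliseconds.
    samples = [120, 80, 95, 310, 150, 99, 101, 60, 70, 85]
    print(latency_slo_compliance(samples, threshold_ms=100))
    print(meets_slo(samples, threshold_ms=100, target_fraction=0.9))
```

Counting good requests against a threshold, rather than averaging latencies, keeps a handful of very slow requests from being hidden by many fast ones.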
This article will take you through setting up CI/CD integration for building, testing, and publishing libraries to Maven Central using Gradle. With jCenter shutting down, Maven Central is once again the primary destination for all Android and Java libraries. Library publishers will need to port their libraries over to Maven Central to keep their libraries available after jCenter shuts down. This article focuses on CI/CD integration.
The Atlanta Startup Podcast is the briefing room for the innovation ecosystem, featuring the investors, founders and activators creating the fastest emerging venture capital ecosystem in the country. In this episode, Ken Ahrens, founder and CEO of Speedscale, sits down with Valor Investor William Leonard.
Nobody wants to deal with annoying neighbors. Whether it’s the neighbor who always knows everyone’s business or the one who turns up their music late at night, both types of neighbors can have a negative impact on your living environment and daily life. Obnoxious neighbors aren't exclusive to your physical living space. They also show up in the public cloud, where multiple Kubernetes clusters (EKS, AKS, or GKE) serve multiple users (or tenants) who need cluster access.
We’re happy to announce our integration with Google Meet to create incident bridges automatically. Using the power of FireHydrant Runbooks, a Google Meet can be added with fully customizable titles and agendas based on your incident details.
Dashboards are a crucial tool in your monitoring arsenal, as they allow you to visualize and correlate telemetry data from across your stack in a single place. Historically, Datadog offered two dashboard types: Screenboards, for pixel-level control on a canvas, and Timeboards, for troubleshooting a specific point in time. Now, we’re excited to introduce a new dashboard layout that combines the best of Timeboards and Screenboards in a single, seamless editing experience.
When Kubernetes launches and schedules workloads in your cluster, such as during an update or scaling event, you can expect to see short-lived spikes in the number of Pending pods. As long as your cluster has sufficient resources, Pending pods usually transition to Running status on their own as the Kubernetes scheduler assigns them to suitable nodes. However, in some scenarios, Pending pods will fail to get scheduled until you fix the underlying problem.
Datadog Notebooks simplify the way teams across an organization find and share knowledge. By bringing together live data and rich Markdown text, Notebooks help teams create powerful, data-driven documents—from runbooks and support playbooks to incident postmortems and data reports. And with collaboration functionalities like real-time editing and commenting, team members can simultaneously make changes to a document and gather feedback along the way.
Network Policy is a critical part of building a robust developer platform, but the learning curve to address complex real-world policies is not tiny. It is painful to get the YAML syntax right. There are many subtleties in the behavior of the network policy specification (e.g., default allow/deny, wildcarding, rules combination, etc.). Even an experienced Kubernetes YAML-wrangler can still easily tie their brain in knots working through an advanced network policy use case.
In this post we’ll explore the concepts of data lake, data hub and data lab. There are many opinions and interpretations of these concepts, and they are broadly comparable. In fact, many might say they’re synonymous and we’re just splitting hairs. But let’s look again carefully. We can discern some subtle trends in the way people are doing things, and find distinctions in these expressions.
Your users access your application from a wide range of browsers, which have their own implementations of HTML, CSS, and JavaScript. For instance, many modern JavaScript features such as Promises and Arrow Functions are unsupported by some browsers. These inconsistencies can lead to missing elements and malfunctioning workflows that affect some—but not all—of your user base.
IT spending grew to an impressive $3.8 trillion in 2019. With 2020 giving enterprises a reality check on remote working, the spending on digital transformation is expected to grow even further. It goes without saying that IT is an integral part of any company, big or small. When the stakes are so high, there’s very little room for mistakes. However, we’re all human, and we do make mistakes.
We have some exciting news for you about Puppet's support for the PowerShell DSC configuration framework for Windows. In short, content from the PowerShell Gallery will simply appear on the Puppet Forge and can be added to your Puppetfile and used just like any other Puppet module. This makes it by far the most flexible and maintainable iteration of DSC integration we've ever had. Pick and choose whatever DSC Resources you want and get all the VSCode IntelliSense magic you've come to expect.
If you’ve already connected your GitHub integration via OAuth in GitKraken, you’re good to go! GitHub is changing its security policy and will no longer allow username/password-only access. This change goes into effect on August 13th, 2021, and affects all desktop Git applications that offer a GitHub integration, including GitKraken. Users who have already authenticated to GitHub using OAuth will be unaffected. OAuth is the default connection method within your GitKraken profile settings.
The value of Mattermost is significantly enhanced with third-party tool integrations and customization. Today, we are releasing the developer preview of a new Apps Framework for creating application integrations and customized workflows. The Apps Framework complements the existing ecosystem of plugins and allows apps to be written in any language and deployed with serverless hosting.
At Cloudsmith, we believe that packaging should be at the centre of any modern build and deployment process. In fact, we think that Continuous Packaging is the glue that ties Continuous Integration and Continuous Deployment or Delivery together. So with that in mind, in this blog, we will walk through how easy it is to integrate Cloudsmith with a Semaphore CI workflow and push the artifacts and packages that you build to a private repository. TL;DR: It’s super easy.
Ubuntu 21.04 is the latest release of Ubuntu and comes at the mid-point between the most recent Long Term Supported (LTS) release of Ubuntu 20.04 LTS and the forthcoming 22.04 LTS release due in April 2022. This provides a good opportunity to take stock of some of the latest security features delivered in this release, on the road to 22.04 LTS. Ubuntu 21.04 brings with it a vast number of improvements and features across a wide variety of packages.
Knowing how to deploy and run applications has become a key part of modern app development, meaning that developers need expertise in a number of areas beyond their core application code. Whether it’s container orchestration, networking, scaling, or load balancing, there is a steep learning curve to being able to deploy and run an application at scale.
Managed service providers (MSPs) are becoming increasingly important in the IT management industry. The role of an MSP does not just stop with monitoring, managing, and maintaining the IT services of their clients; it extends to keeping a close watch on everyday IT developments and proactively securing clients’ IT networks against cyberthreats. To balance all these responsibilities, MSPs need comprehensive IT management and monitoring solutions that can cater to all their needs.
Get started with Gremlin's Chaos Engineering tools to safely, securely, and simply inject failure into your systems to find weaknesses before they cause customer-facing issues. Break Things on Purpose is a podcast for all things Chaos Engineering. Check out our latest episode below. You can subscribe to Break Things on Purpose wherever you get your podcasts. If you have feedback about the show, find us on Twitter at @BTOPpod or shoot us a note at podcast@gremlin.com!
Spot by NetApp’s Ocean continuously optimizes Kubernetes clusters with a wide feature set tackling different aspects of running and managing Kubernetes containers in a cloud environment. To help users improve the efficiency and performance of their cloud environments, Ocean’s rightsizing capabilities provide recommendations that target over-provisioning and underutilization.
Serverless computing, a model in which the provider manages the server, lets developers focus on writing dedicated pieces of application logic. Serverless computing has been adopted by many development teams because it auto-scales. Auto-scaling relieves developers of allocation management tasks, so they do not need to worry about the allocation of server resources or being charged for resources they are not consuming.
Today’s world is a connected one. The technologies we have allow us to connect with people half a world away. Companies can work with people from different locations. Cloud technologies, for example, are helping people collaborate from afar.
As your DevOps and IT Service Management (ITSM) teams continue to increase, so should your ability to provide self-service infrastructure capabilities. Whether you are a ServiceNow or Puppet administrator looking to expand automation to other groups such as Site Reliability Engineering or the IT Service Desk, the integration of Puppet Enterprise and ServiceNow is now at your fingertips.
Prometheus is an open-source tool that’s meant to monitor and collect metrics from applications. The point of this system is to make it easy for users to see and understand important metrics that let them know how well an application is doing. In fact, Prometheus is able to collect over one million metrics per second, and then store them until you’re ready to retrieve them.
When you hear the word "Thanos," your first thought might be the Marvel Cinematic Universe villain from the Avengers: Infinity War film who seeks to collect the Infinity Stones and end half of all life in the universe. But if you mention the word to a data nerd, you're likely to get a very different response. Prometheus is a free and open-source platform for real-time systems and event monitoring and alerting.
Be aware that we’re not saying that you are on cloud nine, but that you are most likely using the cloud. That is, if you use Google mail or the Microsoft Office 365 suite, or if you take a photo with your cell phone and it gets automatically uploaded to iCloud or something similar, you are using the cloud.
Five worthy reads is a regular column on five noteworthy items we have discovered while researching trending and timeless topics. This week, we explore Anywhere Operations as a new business model.
The spring release of Puppet Enterprise is now available. Puppet continues to build out its flagship product to help organizations scale DevOps initiatives, meet compliance requirements, and deliver on cloud and hybrid initiatives. With this release, we’ve focused on delivering key enhancements to help boost productivity, giving organizations the ability to automate faster and more securely at scale.
Security is essential. It’s top of mind for organizations of all sizes and it’s certainly a top priority for Puppet. The latest release, Puppet Enterprise 2021.1, now offers support for SAML 2.0, providing a more secure and efficient authentication path for our customers to access their Puppet environments, applications and tooling.
How available is your website, service, or platform? What must you monitor and measure to ensure availability? How do you translate uptime into availability? This chart has numbers that every Site Reliability Engineer (SRE) should know. Below the chart, you will find answers to commonly asked questions about SRE and associated metrics.
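To make the uptime-to-availability translation concrete, here is a small sketch of the arithmetic. It assumes availability is expressed as the percentage of time a service is up over a period. The function names and the 30-day period are illustrative choices, not taken from the chart.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def availability_percent(uptime_seconds: float, total_seconds: float) -> float:
    """Translate measured uptime into an availability percentage."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * uptime_seconds / total_seconds


def allowed_downtime_seconds(target_percent: float, period_days: float) -> float:
    """Return the downtime budget, in seconds, implied by an availability target."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    unavailable_fraction = 1 - target_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


if __name__ == "__main__":
    # Downtime budgets over an assumed 30-day period for common targets.
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_seconds(target, period_days=30) / 60
        print(f"{target}% -> about {minutes:.1f} minutes of downtime")
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is why the difference between 99.9% and 99.99% matters far more in practice than the numbers suggest at first glance.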
In most cases, when users start to access and use a new application or a new release, the app performs pretty well. As the user base grows and usage increases, the app can outgrow its infrastructure. Users can start experiencing a dip in performance. Latency increases, bandwidth and memory get exhausted quickly, and some code architectures start to fail because they do not scale well with the increased number of users.
When I asked Charlie for permission to attend this year’s AICon (virtual, natch) I thought it would be a shoo-in; learning’s part of my OKRs after all. But he never makes things easy and his ‘yes’ came with a caveat that’s typical when dealing with him. This time, he claimed he didn’t have the budget for the ticket (a likely story!) and I’d have to find another way to get one.
That's a wrap! We hosted "WTF is Incident Management" on May 12, 2021. We invited four very knowledgeable panelists to discuss how they define incident management, what changes they'd make if they could start again from scratch, how to manage team stress after an incident, and other subjects. Our panelists were: host Matt Stratton (Staff Developer Advocate at Pulumi), Emily Ruppe (Incident Commander at Twilio), Alina Anderson (Sr.
It’s so easy these days to set up your own WiFi network. You order a router online, plug it into the electrical socket, define a password and you’re good to go. WiFi is fast, reliable and easy to use. But if you want to cover a wider area or connect hundreds of small devices, it quickly becomes inefficient and expensive. Is the only way to go to your local mobile network operator and sign a contract? No! Thanks to open source technology, you can build your own LTE or 5G network.
Containers and Kubernetes have changed the way we operate applications. This has been a boon for Site Reliability Engineers (SREs) and DevOps professionals who handle infrastructure management. Yet, it has come at a cost to many who develop and operate applications. Their experience has become more complicated and cumbersome.
Whether you’re new to the cloud native space or an accomplished practitioner, you’re probably aware that there are many Kubernetes distributions to choose from. Maybe you’ve heard about the challenges of getting up and running with Kubernetes. Guess what? It doesn’t have to be hard. This blog provides an introduction to K3s, a lightweight CNCF-certified Kubernetes distribution. We’ll look at what makes K3s different from other Kubernetes distributions.
Security has always been a wide and complex topic. A recent survey from StackRox about the state of containers and Kubernetes security provides some interesting data on these topics. In this blog post, I’ll dive into some of the findings in that survey and introduce you to Kubewarden, an open source policy engine. A staggering 66 percent of the survey participants do not feel confident enough in the security measures they have in place.
In the past year of COVID-related working from home, we have seen increasing burnout in engineering teams worldwide. More and more devs are partially checked out and may not be giving 100% to team activities (planning, grooming, code review, quality checks). In these testing times, we have found some ways to keep your team motivated.
Introduced in 2008, Microsoft’s virtualization platform Hyper-V has become a well-known tool for administrators. Hyper-V offers users a wide range of management options. It includes GUI-based Hyper-V tools such as Hyper-V Manager, and command-line tools like Windows PowerShell. New Hyper-V versions have shipped with Windows Server ever since.
The Gradle Build Cache is designed to help you save time by reusing outputs produced by previous builds. It works by storing (locally or remotely) build outputs, and allowing builds to fetch these outputs from the cache when it determines that inputs have not changed. The build cache gives you the ability to avoid the redundant work and cost of regenerating time-consuming and expensive processes.
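The idea of reusing outputs when inputs have not changed can be sketched in a few lines. The example below is a simplified in-memory illustration of input-keyed caching in general, not Gradle's actual implementation or API. `BuildCache`, `key_for`, and `run` are hypothetical names, and the "compile" action is a stand-in.

```python
import hashlib
from typing import Callable, Dict, Tuple


class BuildCache:
    """Toy cache that stores task outputs under a hash of the task's inputs."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    @staticmethod
    def key_for(task_name: str, inputs: Dict[str, bytes]) -> str:
        """Derive a cache key from the task name and every input's name and content."""
        digest = hashlib.sha256()
        digest.update(task_name.encode("utf-8"))
        for name in sorted(inputs):  # sort so the key does not depend on dict order
            digest.update(b"\0" + name.encode("utf-8") + b"\0")
            digest.update(inputs[name])
        return digest.hexdigest()

    def run(
        self,
        task_name: str,
        inputs: Dict[str, bytes],
        action: Callable[[Dict[str, bytes]], bytes],
    ) -> Tuple[bytes, bool]:
        """Return (output, from_cache); execute `action` only on a cache miss."""
        key = self.key_for(task_name, inputs)
        if key in self._store:
            return self._store[key], True
        output = action(inputs)
        self._store[key] = output
        return output, False


if __name__ == "__main__":
    cache = BuildCache()

    def compile_sources(sources: Dict[str, bytes]) -> bytes:
        # Stand-in for an expensive step: join the sources in a stable order.
        return b"".join(sources[name] for name in sorted(sources))

    sources = {"Main.java": b"class Main {}"}
    print(cache.run("compile", sources, compile_sources)[1])  # miss: action runs
    print(cache.run("compile", sources, compile_sources)[1])  # hit: output reused
```

Changing any input's content produces a different key, so the action runs again. That is the property that makes it safe to skip work when the key matches.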
Your applications rely on third-party software running throughout your infrastructure, and it can be challenging to monitor each of these technologies individually. To give you the visibility you need, Datadog Live Processes now monitors all of your third-party workloads in one place.
The more tools that your teams need to execute their workflows, the more friction and lost productivity there can be, especially if each tool requires a different CLI or set of APIs. Retool is a low-code platform that allows you to build internal web applications using a drag-and-drop interface. By integrating with a number of key backend databases and APIs, Retool enables you to create custom, centralized management tools to serve a wide range of employee-facing use cases.
Service Level Objectives (SLOs) are a key component of any successful Site Reliability Engineering initiative. The questions are: what are SLOs, and how do you determine what your SLOs should be? Once you've done that, how should you use them?
In software development, processing and storing data in different states reflects the business rules an application is built on. The heart and soul of any software application is the data that is persisted in databases for retrieval and further processing. The database system (SQL or NoSQL) chosen for an application must serve the required data processing and storage needs of the application.
The container ecosystem is moving very fast, and new tools designed specifically for Kubernetes clusters are introduced at a rapid pace. Although a new tool often simply implements a well-known mechanism (already present in the VM world) with a focus on containers, every once in a while we see tools that are designed from scratch rather than adapting a preexisting idea. One such tool is Komodor.
We recently announced the development of our new dashboard builder and associated release of several new turnkey service dashboards. The new dashboard builder provides a vastly improved user experience, enabling users to create dashboards in a fraction of the time it took them previously. As of this month, the dashboard builder, which was previously only accessible internally, is now available to all Circonus customers.
HAProxy 2.4 adds exciting features such as support for HTTP/2 WebSockets, authorization and routing of MQTT and FIX (Financial Information Exchange) protocol messages, DNS resolution over TCP, server timeouts that you can change on the fly, dynamic SSL certificate storage for client certificates sent to backend servers, and an improved cache; it adds a built-in OpenTracing integration, new Prometheus metrics, and circuit breaking improvements.
The question I am regularly asked is, “What use is OTN, when services are all IP, and routers handle connectivity directly across optical fiber connections, or wavelengths on optical fiber (IPoWDM)?”
Artificial intelligence is transforming software development. From the code to the deployment, AI is slowly but surely upping its game and helping us discover a brand new paradigm for inventing technology. Algorithm-based machine learning is being used to accelerate the software development lifecycle and AI is supporting developers to optimize software workflow at every stage of the development process.
Now you can interact with your GitHub pull requests directly from GitKraken. We opened our hailing frequencies and heard your communications. Devs from across the galaxy have asked us to help increase the speed of their workflows and we are happy to report on some major activity in that quadrant. 🚀🌃🌠 Announcing GitKraken v7.6: You no longer have to leave GitKraken to work with your GitHub Pull Requests.
In microservices architecture, to identify performance issues—including latency—it’s important to monitor each service and all inter-service communication. Jaeger and VMware Tanzu Observability can help. Jaeger is an open source, distributed tracing system released by Uber Technologies. VMware Tanzu Observability is a high-performance streaming analytics platform that supports 3D observability (e.g., metrics, histograms, and traces/spans).
We are very excited to announce a key integration between VMware Tanzu and VMware Cloud on AWS that provides a significantly enhanced experience for our customers who want to deploy, run, and manage Kubernetes in VMware Cloud on AWS. With this integration, VMware Tanzu Mission Control now supports full lifecycle management—provisioning, upgrading, scaling, and deleting—of Tanzu Kubernetes clusters deployed on VMware Cloud on AWS.
Many industries have taken up Augmented Reality (AR) seriously in order to increase production, sales and customer education. Organizations today are making use of several AR applications for multiple purposes to grab the attention of users. Surveys have predicted vast growth in the AR/VR markets in the near future, as much money has been invested in the field. The application of AR technology is establishing a firm hold over the pharmaceutical industry as well.
A dark launch is a deployment strategy for testing new versions of a service in production. When running a dark launch, you deploy a new version of a service and route a copy of production traffic to it without returning responses to users. This lets you see how a new version of a service handles production load, watch for errors, and compare performance between the old and the new versions—without affecting users.
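The mirroring idea described above can be sketched in a few lines of Python. This is a minimal, in-process illustration rather than a production setup: the names `make_dark_launch_router`, `v1`, `v2`, and `comparisons` are hypothetical, and a real deployment would usually mirror traffic asynchronously at a proxy or service mesh so the new version cannot add latency for users.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], dict]


def make_dark_launch_router(primary: Handler, shadow: Handler,
                            comparisons: List[Dict]) -> Handler:
    """Serve users from `primary` and mirror each request to `shadow`."""

    def route(request: dict) -> dict:
        # The primary response is the only one the user ever sees.
        response = primary(request)
        try:
            # Send a copy so the shadow cannot mutate the original request.
            shadow_response = shadow(dict(request))
            comparisons.append({"request": request,
                                "match": shadow_response == response,
                                "error": None})
        except Exception as exc:
            # Shadow failures are recorded, never surfaced to users.
            comparisons.append({"request": request,
                                "match": False,
                                "error": repr(exc)})
        return response

    return route


# Hypothetical old and new versions of a pricing service.
def v1(req: dict) -> dict:
    return {"total": req["qty"] * 10}


def v2(req: dict) -> dict:
    if req["qty"] == 0:
        raise ValueError("empty order")  # a bug only the new version has
    return {"total": req["qty"] * 10}


comparisons: List[Dict] = []
route = make_dark_launch_router(v1, v2, comparisons)
first = route({"qty": 2})   # both versions agree
second = route({"qty": 0})  # v2 fails, but the user still gets v1's answer
```

After these two calls, `comparisons` holds one matching entry and one entry with a recorded error, which is the kind of evidence a dark launch is meant to collect before any user is exposed to the new version.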
This blog is the first in a four-part series on infrastructure automation for government agencies that are modernizing digital systems while grappling with budget and staffing constraints and the challenges of COVID-19. The last 12 months have been a turning point for many government agencies. The COVID-19 pandemic has accelerated the drive towards modernization and, with it, the need to ensure security and compliance requirements across a host of legacy systems and processes.
Virtual machines (VM) offer great flexibility for hosting web applications. A developer/engineer is able to configure and control every piece of software and every setting that the application needs to run. Azure, one of the largest cloud hosting platforms, has virtual machine offerings for both Linux and Windows-based operating systems. In this tutorial, you will learn how to set up a continuous deployment pipeline to deploy a Node.js application to an Azure virtual machine.
To continuously innovate, many organizations are anchoring their infrastructure on container management solutions. The open source project Kubernetes is now the de facto standard for container management, and its popularity is growing in a number of ways. Here are some stats from a recent Cloud Native Computing Foundation (CNCF) survey.
I have recently seen quite a few articles and talks covering why organizations are aiming at implementing a developer platform to help speed up the adoption of microservices within their organizations. But before we get started on discussing what a developer platform is, the developer experience and productivity on Kubernetes, and how different teams are working through it, let’s define some common ground.
First, we are super excited to share that CloudHedge has successfully completed the IBM Cloud Paks certification. IBM has some stringent requirements for achieving this certification, which include having an Operator-certified product listed on Red Hat Marketplace. CloudHedge’s intelligent App Modernization Platform enables enterprise customers to transform their legacy workloads to the OpenShift container platform efficiently and effortlessly.
Template variables enable you to use tags to filter your Datadog dashboards to the hosts, containers, or services you need for faster troubleshooting. However, there are some cases where it may be difficult to use a standard set of template variables to aggregate all of the data you need without creating a complicated, difficult to manage set of variables. For example, you may use tag values that are a subset of another tag.
The HAProxy Data Plane API 2.3 expands its service discovery mechanisms and introduces native support for discovering AWS EC2 instances and auto-scaling groups. It also adds a new configuration file that supports HCL and YAML, an Inotify configuration watcher, and Syslog support. HAProxy Data Plane API version 2.3 is now available and you will find it in the 2.3 version of the Alpine Docker image.
Today we announced our Series F round of $100M led by Greenspring Associates, with Eleven Prime, IVP, Sapphire Ventures, Top Tier Capital Partners, Baseline Ventures, Threshold, Scale, Owl Rock, and Next Equity Partners. Thank you to our customers, community, partners, investors, and team. This latest investment allows us to invest as well: in our product, our community, and our people. We build for the builders of the digital age: developers.
We are proud to announce that CircleCI has acquired Vamp, the first cloud-native release orchestration platform. This paves the way for CircleCI customers to have first-class release orchestration and continuous validation, all natively within the CircleCI platform.
In today’s data-centric world, metrics or numbers define all performance benchmarks. The time between when an event starts and ends shows how well a system can handle and process such events. One such metric is MTTR. MTTR usually stands for Mean Time To Resolution, but it has held several meanings over the years. MTTR is a metric used to measure how well a system can bounce back from errors and provide long-lasting solutions.
Congratulations, you are finally considering moving your apps to Kubernetes. It is a big day! Here is a checklist to ensure you have not forgotten anything essential to increase your chances of success using Kubernetes. We divided those points into three sections, from the most important to the least. Let’s go.
Observability, which originated from control theory, measures how well you can understand a system’s internal states from its external outputs. Observability uses instrumentation to provide insights that aid monitoring. In DevOps, gaining observability is achieved through a set of monitoring solutions. The shift to use one vendor platform to do so, versus multiple solutions, makes sense as.
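As a rough illustration of the Mean Time To Resolution reading of MTTR, the calculation is just the average of (resolved minus opened) across incidents. The function name and the sample incident timestamps below are made up for the example.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Incident = Tuple[datetime, datetime]  # (opened_at, resolved_at)


def mean_time_to_resolution(incidents: List[Incident]) -> timedelta:
    """Average time between an incident opening and being resolved."""
    if not incidents:
        raise ValueError("at least one resolved incident is required")
    durations = [resolved - opened for opened, resolved in incidents]
    # Summing timedeltas needs an explicit timedelta() start value.
    return sum(durations, timedelta()) / len(durations)


# Illustrative data only: one 30-minute and one 90-minute incident.
incidents = [
    (datetime(2021, 5, 1, 9, 0), datetime(2021, 5, 1, 9, 30)),
    (datetime(2021, 5, 3, 14, 0), datetime(2021, 5, 3, 15, 30)),
]
mttr = mean_time_to_resolution(incidents)
```

With these two sample incidents the mean works out to one hour. Whether "resolved" means mitigated, fixed, or fully closed out is a team decision, and it changes the number considerably.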
Speedscale was built primarily to provide engineering teams with better insight into their applications over time, replaying single transactions for root cause analysis that give developers and SREs confidence that tomorrow’s application code will work just as well in production as it did yesterday.
Spark is one of the most widely-used compute tools for big data analytics. It excels at real-time batch and stream processing, and powers machine learning, AI, NLP and data analysis applications. Thanks to its in-memory processing capabilities, Spark has risen in popularity. As Spark usage increases, the older Hadoop stack is on the decline with its various limitations that make it harder for data teams to realize business outcomes.
It is very important to create backups for your crucial EC2 instances. While AWS provides mechanisms to increase availability, the cloud is not infallible. EC2 provides a native backup format for your EC2 instances in the way of AMI images. But storage costs of the AMI images can build over time.
Software development is changing rapidly. On one hand, you must quickly adapt to evolving requirements, while on the other, your applications need to operate continuously without downtime. DevOps helps you quickly adapt to changes. Among other initiatives, continuous integration (CI) and continuous delivery (CD) are integral to any DevOps practice.
SQL injection is one of the most destructive ways an application can be attacked. This kind of attack is targeted toward the application database, which can result in consequences that are irreversible, lead to loss of money, and reduce user trust in your company. There are far too many application data breaches happening every day, usually when a malicious agent attacks the database.
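To make the attack concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table, the sample rows, and the function names are invented for illustration. It contrasts a query built by string concatenation with a parameterized query, which is the standard defense.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(name: str) -> list:
    # Vulnerable: user input is pasted straight into the SQL text.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list:
    # Parameterized: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


# A classic payload that turns the WHERE clause into an always-true test.
payload = "nobody' OR '1'='1"
leaked = find_user_unsafe(payload)   # returns every row in the table
blocked = find_user_safe(payload)    # returns no rows
```

The unsafe version leaks both users' secrets because the payload rewrites the query's logic. The safe version treats the whole payload as a (nonexistent) user name.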
VMware Marketplace is a one-stop shop for VMware customers to discover, try, and deploy various third-party and open source solutions onto their VMware environments. All deployable assets on VMware Marketplace are pre-tested on their respective VMware environments, which empowers users to deploy them with confidence.
If technology applications are the building blocks of enterprises today, developers comprise the masonry team. At VMware, we seek to empower application developers, architects, platform and digital teams alike by giving them the ability to choose the right set of tools for their unique development needs and goals. We build deep, meaningful partnerships with industry peers to support our customers’ choices across their full technology stacks.
We are happy to announce that VMware Tanzu SQL with MySQL for Kubernetes 1.0 is generally available! Tanzu customers can easily run MySQL at scale on Kubernetes with this new release, which complements our existing Postgres engine for Kubernetes. Even better, with this new release Tanzu Advanced customers now have the two most popular open source operational databases included with their purchase.
In order to manage complex containerized applications, modern devops teams need to have deep visibility into the status of their Kubernetes resources. By listening directly to the Kubernetes API, the open source kube-state-metrics service generates key metrics about your Kubernetes objects, including pods, nodes, and deployments, which are essential for understanding the status and performance of your clusters.
A big part of ensuring the availability of your applications is establishing and monitoring service-level metrics—something that our Site Reliability Engineering (SRE) team does every day here at Google Cloud. The end goal of our SRE principles is to improve services and in turn the user experience. The concept of SRE starts with the idea that metrics should be closely tied to business objectives. In addition to business-level SLAs, we also use SLOs and SLIs in SRE planning and practice.
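A small sketch may help show how an SLI and an SLO relate in practice. It assumes a simple request-based availability SLI (good requests divided by total requests) and an error budget derived from the SLO target. The function names and the numbers are illustrative, not taken from Google's tooling.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests that were served successfully."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def error_budget_remaining(slo_target: float, good_requests: int,
                           total_requests: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failures = (1 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    return 1 - actual_failures / allowed_failures


# Illustrative numbers: a 99.9% SLO over one million requests,
# of which 250 failed.
sli = availability_sli(999_750, 1_000_000)
budget = error_budget_remaining(0.999, 999_750, 1_000_000)
slo_met = sli >= 0.999
```

Here the SLO allows roughly 1,000 failed requests, so 250 failures leave about three quarters of the budget. An SLA would typically sit behind this as a looser, contractual threshold.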
Whether you are new to AWS or have been to every re:Invent since 2012, you may have questions about cloud security and how it impacts your valuable technology and data. In particular, you might be wondering where AWS’s security responsibilities end and where yours begin. Which parts of the cloud can you rely on Amazon’s security team and technology to keep safe, and which parts must you take care of?
Our previous post, “Monitoring for Success: What All SREs Need to Know,” discusses how today’s complex IT environments (virtualization, cloud computing, continuous delivery and integration), coupled with pressures to deploy faster while meeting “always on” customer expectations, have placed greater strains on monitoring teams.
What do Google’s DevOps Research and Assessment (DORA) and Rollbar have to do with each other? DORA identified four key metrics to measure DevOps performance and identified four levels of DevOps performance from Low to Elite. One way for a team to become an Elite DevOps performer is by focusing on Continuous Code Improvement.
Deep specialization of IT administrators is a luxury only the largest organizations can typically afford. Smaller organizations rely on IT administrators with a more generalist skill set because they are—by necessity—responsible for a wide array of different technologies, and there simply isn’t time to specialize in the intricacies for any one of them. Yet modern IT is intricate.
That’s a wrap! Gremlin hosted Failover Conf 2: Fail Smarter on April 27, 2021. In attendance were over 500 SREs, developers, sales engineers, product managers, DevOps experts, C-level execs, and other reliability pros from around the globe! This year’s conference included discussions around the future of DevOps, strategies for building reliable teams, analyzing human error to create better systems, and more.
Today, almost every service is offered in a “Cloud” variant. But what does that really mean? Are all cloud services equal? It’s easy to see why so many vendors rush to add a Cloud edition/variant of established software they sell. Undoubtedly, there has been a move to Cloud services across the industry, as more and more organizations seek to take advantage of the higher reliability and lower total cost of ownership that Cloud platforms promise.
Let's all face it, on-call work isn't fun. But it can be better. Even if you have to work on call, it would be nice to have at least some of the work done for you, before you drag yourself out of bed at 3am to respond to an incident.
When we launched Qovery in January 2020, our product was still a prototype, and we onboarded 53 developers to help them deploy their apps in the cloud. At the time, we were only 2 on the team, and our first employee (Patryk Jeziorowski) decided to join us after being one of our first users. 18 months later, 3004 developers from more than 110 countries use Qovery to deploy their apps on their AWS and Digital Ocean accounts.
In August 2020, Amazon announced Bottlerocket OS, a new open source Linux distribution that is built specifically for running container workloads. It comes out of the box with security hardening and support for transactional updates, allowing for greater ease in automating operating system updates, maintaining security compliance and reducing operational costs. Bottlerocket is designed to be able to run anywhere and, at launch, has a pre-built variant for Amazon EKS.
Testing in production simply means testing new code changes in production, with live traffic, in order to test the system’s reliability, resiliency, and stability. It helps teams solve bugs and other issues faster, as well as effectively analyze the performance of newly released changes. Its overall purpose is to expose problems that can’t be identified in non-production environments for reasons that may include not being able to mimic the concurrency, load, or user behavior.
In case you missed it, this webinar includes code walkthroughs that help you to add observability to your pipelines (using a free Honeycomb account!) so that you and your team can speed up your deployments to prod. This is also a risk-free way to get started with observability if your team isn’t quite yet ready to change your production apps.
Designing and running workloads in the cloud is complex. Many services need to fit together in just the right way for optimal performance. The opportunity for error lurks around every corner. This is a high-stakes game with a huge premium on getting things right from the beginning. Even small mistakes can snowball. To help, AWS studied the architectures of thousands of its customers and supplemented that learning with insights from experts.
To spare users from repetitive desktop tasks, Microsoft has developed an extended service, Power Automate Desktop. Microsoft recently announced it and made it available to Windows 10 users. It is a new low-code robotic process automation (RPA) tool that lets businesses automate repetitive and other manual tasks, so that people can focus on higher-value work and accomplish more in their own areas of work.
Kubernetes enables teams to deploy and manage their own services, but this can lead to gaps in visibility as different teams create systems with varying configurations and resources. Without an established method for provisioning infrastructure, keeping track of these services becomes more challenging. Implementing infrastructure as code solves this problem by optimizing the process for provisioning and updating production-ready resources.
Infrastructure monitoring was difficult enough when entire businesses ran off a few bare metal servers in a dusty, forgotten closet. Other IT infrastructure monitoring tools fell short, unable to provide complete and granular-enough metrics in real time, even when we were only dealing with a handful of systems responsible for running every part of the application stack.
It’s that time again: the latest versions of D2iQ Konvoy and D2iQ Kommander have just been made generally available and the D2iQ Kubernetes Platform (DKP) has some powerful new features. As noted with our last update, DKP is the leading independent Kubernetes platform for enterprise grade production at scale and Konvoy and Kommander are the reason why. You can learn more about Konvoy here, Kommander here, and our general approach here.
Chris Sterling, Shruti Iyer, and Aditya Tripathi contributed to this blog post. APIs—the key component of any company’s microservices model—are driving digital transformation in modern enterprises. Indeed, “66 percent of organizations report using private or B2B APIs,” according to the Gartner report, “Create API Portals That Drive API Adoption Among Internal and External Developer Communities” by Akash Jain and Mark O’Neill, November 2020.
Serverless computing is becoming increasingly popular in software development due to its flexibility of development and the ability it affords to test out and run solutions with minimal overhead cost. Vendors like AWS provide various tools that enable businesses to develop and deploy solutions without investing in or setting up hardware infrastructures. In this post, we’ll cover the many different services that AWS provides for supporting serverless computing.
Kubernetes is one of the current leading technologies. Its adoption has seen tremendous growth in the past few years. Containers appear set to become the predominant medium of software development and deployment in the near future. Containers help maintain consistency across various platforms, as they pack an application with its dependencies to help move it from one platform to another.
Thank you all for joining us last week for Failover Conf 2! We had a great turnout this year, with over 1,800 participants, 20 sponsors, and 9 amazing sessions. After more than a year of virtual events and video calls, we know that Zoom fatigue is real. We tried to make this event different by finding new ways to bring the community together and thinking of fun new ways to shake up the conference formula.
Argo CD has been skyrocketing in popularity, with the CNCF China survey naming Argo as a top CI/CD tool for its power as a deployment automation tool. And it’s no wonder: GitOps is a faster, safer, and more scalable way to do continuous delivery. Most of our own users are embracing GitOps to manage infrastructure and applications at scale in gaming, finance, defense, media, and other industries.
At Speedscale, we’re always trying to find ways to iterate faster and reduce developer toil. In line with that mission, we slant our engineering decisions towards using cutting edge tech because we usually move faster and it also allows us to help our customers later on when they upgrade their own tech stack. Recently, we had the opportunity to upgrade the communication channel between our api-gateway and react front end. This journey provided some unexpected benefits.
Developing a holistic enterprise architecture is the first step to acquiring a wholesome grip over the evolution and management of an organization. Enterprise architecture enhances Business Process Improvement and significantly optimizes costs by standardizing technology – two of the most crucial factors that influence the ROI of an organization. The enormous efficiency and cost savings that enterprise architecture brings about have strengthened the belief in enterprise architecture today.
Let's face facts. Git is not fun. Git is not friendly. No. It's just infuriatingly useful, so we're stuck with it. But what if you could make git more friendly? More convenient? Would that make your day a little less stressful? In this article, Julie Kent shows us how we can do this with just a few simple tweaks.
By definition, a server is a piece of computer software or hardware that provides functionality to other devices or programs, called clients. System administrators often ask a common question about server performance: why is my server down? Inefficient server monitoring and management often make it very difficult to correctly analyze complex and unpredictable information in a data center, and hard to find the reason for a server outage.
We’re proud to announce the release of version 1.6 of the HAProxy Kubernetes Ingress Controller. This version provides the ability to add raw configuration snippets to HAProxy frontends, allows for ACL/Map files to be managed through a ConfigMap, and enables complex routing decisions to be made based on anything found within the request headers or metadata.
Here at LogicMonitor, we’re really big on extensibility and automation. We’re constantly adding to our catalog of monitoring coverage, and we spend a lot of our time ensuring that setup is as simple as possible. We also monitor almost any data you can expose on a network. People have done way more with LogicMonitor than we would have ever imagined, and I’m extremely excited to announce our next step in that commitment to extensibility and automation.
Secret management is one of the most critical areas in deploying and running applications. Codefresh already had native support for Kubernetes secrets or custom secrets on the Codefresh Runner, but more and more customers have asked us for native support for HashiCorp Vault. Today we are pleased to announce our native integration with HashiCorp Vault as another secret provider for Codefresh pipelines.
Cloud MSPs or managed service providers are great at helping companies properly leverage the public cloud, typically handling cloud strategy, implementation and day-to-day operations for their customers. However, when it comes to things like customizable billing, analyzing cloud spend per customer, optimizing cost and increasing profit margins, MSPs are over-burdened with complex, manual processes.