Operations | Monitoring | ITSM | DevOps | Cloud

May 2021

Understanding Azure Logic Apps Resource Types

In recent years, businesses have been innovating and migrating at a faster pace by using Microsoft Azure cloud-native technologies. Azure Integration Services, an industry-leading Integration Platform as a Service (iPaaS), provides multiple offerings, such as Logic Apps, Service Bus, API Management, Event Grid, Azure Functions, and Data Factory, to meet application integration scenarios. Azure Logic Apps is gaining strong interest, with more than 40,000 customers using it.

Comparing The Private Connectivity Offerings Of AWS, Google Cloud & Microsoft Azure

AWS, Google Cloud and Microsoft Azure accounted for an estimated 58% of total cloud spend in Q1 2021. Businesses are considering ways to improve their connectivity to these three leading hyperscale providers - and are increasingly turning to private connectivity. In this blog, we take a look at the private connectivity offerings of AWS, Google Cloud and Microsoft Azure.

The 7 SRE Principles [And How to Put Them Into Practice]

Whether you're just adopting SRE or optimizing your current processes, we can help. We’ll explain the 7 key principles of SRE and how to put them into practice. SRE is a method that operates through principles. Instead of prescribing specific solutions, it guides you with best practices. These SRE principles help organizations decide what's best for them. Once you understand the principles, you can apply them in many areas.

Feedback - From Slack to Discord - 13 months later

This post is the third in which we share our real-world experience of using Discord for more than one year. I think it will be useful for any company interested in the pros and cons of using Discord over Slack. At Qovery, we are a remote-first software company. When we decided to move from Slack to Discord 13 months ago, we were only 3 developers on the team.

Announcing support for Oracle Arm-based Ampere A1 instances

Arm processors have long been at the center of mobile computing, powering billions of smartphones, tablets, smartwatches, and other IoT devices. Today, these processors are beginning to see broader adoption in the cloud as they promise better performance, higher energy efficiency, and lower costs than their x86-based predecessors. Just this week, Oracle announced its new Oracle Cloud Infrastructure Ampere A1 Compute platform, built on the Ampere Altra Arm processor.

Five worthy reads: Distributed cloud is the future of cloud computing

Five worthy reads is a regular column on five noteworthy items we’ve discovered while researching trending and timeless topics. Distributed cloud allows organizations to bring cloud computing closer to their location. This week we look at why it’s the future of cloud computing.

The Confident Commit | Episode 3: Taming infrastructure with HashiCorp's Armon Dadgar

CircleCI CTO and host of The Confident Commit podcast Rob Zuber is joined by HashiCorp co-founder and CTO Armon Dadgar for a conversation about the inspiration of HashiCorp, infrastructure challenges and opportunities, and the future of security. Listen along for the inside story of HashiCorp's origins and early days, as well as keen insights for managing infrastructure and ways to better deliver software to infrastructure environments from two of tech's top leaders.

Announcing Harvester Beta Availability

It has been five months since we announced project Harvester, open source hyperconverged infrastructure (HCI) software built using Kubernetes. Since then, we’ve received a lot of feedback from the early adopters. This feedback has encouraged us and helped in shaping Harvester’s roadmap. Today, I am excited to announce the Harvester v0.2.0 release, along with the Beta availability of the project!

What Happens When I Execute a Query?

To many developers and system administrators—and even to some database administrators—database engines are a black box. They’re complex pieces of software that, in some cases, even have their own operating systems—the database engine manages its own memory, reads and writes to disks, and handles numerous other system functions. In this post, you’ll learn about a specific feature of database engines—query optimization.

How to run ECS Anywhere workloads using Ubuntu on any infrastructure

ECS Anywhere allows you to use Amazon Web Services’ container service outside of the AWS cloud, and Canonical is proud to be a launch partner for this service. Using Ubuntu as the base OS for your ECS clusters on-prem or elsewhere will allow you to benefit from Ubuntu’s world-leading hardware support, professional services, and vast ecosystem, in turn allowing your ECS clusters to run with optimal performance everywhere you need them.

Announcing support for Amazon ECS Anywhere

Amazon Elastic Container Service (ECS) is a managed compute platform for containers that was designed to be simple to configure, with opinionated defaults to help users get started quickly. ECS customers can run containerized workloads on either Amazon EC2 instances or the serverless Fargate platform without having to maintain a control plane—and can easily integrate ECS with other AWS resources, like Network Load Balancers, to architect their infrastructure.

The four best features to look out for in SQL Monitor

I’m a Data Architect and I’ve been working with data and databases for years at companies like LA Fitness, Dell and now Kingston Technology in Fountain Valley, California. Over all of that time, I’ve used SQL Monitor. I loved it from the beginning and the latest updates to the global overview dashboard and other features have stepped it up another few notches.

How to Leverage IT Automation and Cloud To Put Customers First

In the face of unexpected crises or disruptions, maintaining business continuity has become more important than ever. Last year, businesses around the world had to shift to a remote workforce model overnight. Were their IT departments prepared for this massive shift?

Reducing flaky test failures

Testing is vital because it helps you discover bugs before you release software, enabling you to deliver a high-quality product to your customers. Sometimes, though, tests are flaky and unreliable. Tests may be unreliable because of newly-written code or external factors. These flaky tests, also known as flappers, fail to produce accurate and consistent results. If your tests are flaky, they cannot help you find (and fix) all your bugs, which negatively impacts user experience.
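The idea of a flaky test can be made concrete with a minimal sketch. It assumes a test is any callable that raises AssertionError on failure. The names classify_test and make_alternating_test are hypothetical and do not come from the article. The sketch reruns the same test several times without changing the code and labels it flaky when the outcomes disagree.

```python
from typing import Callable, List


def classify_test(test: Callable[[], None], runs: int = 5) -> str:
    """Run `test` several times and return 'pass', 'fail', or 'flaky'."""
    outcomes: List[bool] = []
    for _ in range(runs):
        try:
            test()
            outcomes.append(True)
        except AssertionError:
            outcomes.append(False)
    if all(outcomes):
        return "pass"
    if not any(outcomes):
        return "fail"
    # Mixed results with no code change between runs: the test is flaky.
    return "flaky"


def make_alternating_test() -> Callable[[], None]:
    """Hypothetical stand-in for a test that depends on hidden shared state."""
    calls = {"n": 0}

    def test() -> None:
        calls["n"] += 1
        # Passes on odd-numbered calls and fails on even-numbered ones.
        assert calls["n"] % 2 == 1

    return test


def always_fails() -> None:
    """A consistently failing test, for comparison."""
    assert False


if __name__ == "__main__":
    print(classify_test(make_alternating_test()))
    print(classify_test(lambda: None))
    print(classify_test(always_fails))
```

A consistently failing test points at a real bug, while a flaky one points at the test or its environment, which is why the two are worth separating before deciding what to fix.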

Signed Pipelines Build Trust in your Software Supply Chain

Trust isn’t given, it’s earned. As the Russian proverb advises, Доверяй, но проверяй — or as U.S. President Ronald Reagan liked to repeat, “Trust, but verify.” We designed JFrog Pipelines to securely support a large number of teams, applications, users and thousands of pipelines.

FireHydrant May 2021 Product Updates: The summer of integrations

With 50% of the US adult population vaccinated, there’s a lot to look forward to this summer. Life no longer feels like it’s on hold, and we’re fully embracing that. Get your fire hoses ready, 'cause extinguishing incidents just got easier. We’re rolling out a summer full of new integrations, product releases, events, and more.

Using LogDNA To Troubleshoot In Production

In 1947, a moth found its way to a relay of the Mark II computer in the Computation Laboratory where Grace Hopper was employed. Since that time, software engineers and operations specialists have been plagued by “bugs.” In the age of DevOps, we can catch many bugs before they escape into a production environment. Still, some occasionally slip through, and when they do, they can spawn all kinds of unexpected problems.

3 Reasons Manufacturers Across Asia Pacific & Japan Are Turning to Modern Apps

Manufacturing is more important than ever as governments, businesses, and individuals rely on the industry to drive innovation and economic prosperity through employment and exports, producing both essential and non-essential products that enhance our daily lives.

The top three insights from the 2021 State of Database DevOps report

Last year was a year of unprecedented challenges for everyone in every part of the world and every industry, and it was also a year of big changes in the IT sector. The pandemic underscored the role of the IT department as an enabler and a critical part of the transition to remote working. While digitalization was well underway before 2020, no one could have predicted the acceleration the pandemic brought on.

Adding IaC security scans to your CI pipeline with Indeni

With CircleCI, there are many different CI/CD flows that can be automated. One such flow is the use of Infrastructure-as-Code (IaC) to build cloud environments. For example, you can use CircleCI to automate the process of building Terraform plans and applying them, in order to create massive production setups in AWS, Azure, GCP, and other cloud environments.

The Industry's First Private Distribution Network

Private Distribution Network (PDN) enables enterprises to easily set up and manage a secure, massively scalable, hybrid distribution network for software updates. This new innovative technology accelerates software distribution 40X to speed up deployments and concurrent downloads across large-scale environments spanning hybrid infrastructure, edges, and IoT devices. PDN provides two integrated network utilization and acceleration technologies - HTTP-based, secure P2P, and CDN - that can be rolled out across large-scale mixed-infrastructure and multi-tiered, customizable network topologies, and are managed as-a-service with usage-based pricing.

How to achieve acceptance testing through abstraction

Beaker is a Puppet testing harness focused on acceptance testing via interactions between multiple (virtual) machines. It provides platform abstraction between different Systems Under Test (SUTs), and it can also be used as a virtual machine provisioner: setting up machines, running any commands on those machines, and then exiting. Recently, Vox Pupuli, a collective of Puppet community authors, has taken over the care and feeding of Beaker for its continued widespread community use.

Argo Rollouts, the Kubernetes Progressive Delivery Controller, Reaches 1.0 Milestone

Argo Rollouts, part of the Argo project, recently released its 1.0 version. You can see the changelog and more details on the GitHub release page. If you are not familiar with Argo Rollouts, it is a Kubernetes Controller that deploys applications on your cluster. It replaces the default rolling-update strategy of Kubernetes with more advanced deployment methods such as blue/green and canary deployments.

Crazy Like a Fox: Redis as Your Primary Database

Redis is fast. It’s fast because the data is all in memory. Persistence options are limited. Because of this, many people say, “Redis is for transient data only!” However, sometimes the need for speed and ease of operations can outweigh the durability downsides! In this talk, we look at a real SaaS business using Redis as its (only) datastore. You’ll learn why we decided to go all-in on Redis and the challenges we faced. You’ll learn how we operationalized the setup, handle backups and restores, and how we’ll scale out. Are we making a terrible mistake? You be the judge!

Why the role of the CIO is constantly changing and challenging

Back in the day, the role of the CIO was relatively clear: the focus was on deploying, managing, and maintaining IT systems across the organization. The CIO’s responsibilities started to blur around the millennium, when end users became more tech-savvy. The reasoning was that ‘they can now get their own technology and don’t need IT to do it for them’. This even led to the much-repeated “death of the CIO” meme.

Announcing the Industry's First Private Distribution Network

Today, at our DevOps user conference swampUP, we were thrilled to announce a new groundbreaking innovation from JFrog: The industry’s first Private Distribution Network! Private Distribution Network (PDN) enables enterprises to easily set up and manage a secure, massively-scalable, hybrid distribution network for software updates.

What's New from JFrog: Binary Lifecycle Management at Scale

JFrog’s annual swampUP DevOps conference always brings new, exciting features to further our vision of accelerating releases through liquid software. This year was no exception, as JFrog CTO Yoav Landman and CPO Dror Bereznitsky revealed innovations for the JFrog DevOps Platform that enable end-to-end binary lifecycle management. Enterprise DevOps and large-scale modern application delivery require robust management of binaries, which are the building blocks of applications.

Cloud Economics 101: Here's What You Need To Know

Businesses are increasingly interested in the economics of cloud computing. For instance, what are the financial implications of moving to the cloud versus staying on-premises? And what’s the best strategy for optimizing cloud consumption to get the best value from cloud resources? This article takes a look at some of the key concepts of cloud economics, and how your business can leverage cloud cost intelligence to maximize the value of your investment.

Best Practices to Simplify the Management of Multi-Tenant EKS, AKS, or GKE Clusters

Without a strategy in place, multi-tenancy will introduce a handful of challenges for platform teams. To help you get started on the right track as you define policies for multi-tenant AKS, EKS, or GKE clusters, we created this cheatsheet for multi-tenancy success.

Four things to consider when evaluating incident management platforms

When you’re feeling the stress and pain around incidents, making the decision to find an incident management tool is a no-brainer. But how do you choose the one that will work for you, your team, and your business? You might be asking yourself: Where do I start? What do I need to know? What questions do I ask? What are the options? How can I be sure we’re choosing the right tool?

5 Challenges in Chatbot Optimization and Maintenance

Chatbots can be like double-edged swords. They can either boost your customer service or turn customers away. Hence, you must make sure that you research and prepare properly before committing to them. This way, you will know how to optimize and maintain your chatbots to ensure their effectiveness. Chatbots offer many benefits for businesses. In fact, 78% of businesses have started integrating such technology into their customer service in the past months.

Finding the Bug in the Haystack: Correlating Exceptions with Deployments

You’re called in. The system is misbehaving. It could be a key metric going crazy, or exceptions starting to fire. You troubleshoot, chasing one lead after another, only to realize that one of the team’s deployments was the one messing things up. Sound familiar? If you’re practicing continuous deployment, you probably experience that several times a week, if not more. Users report that 50% of their outages are due to infrastructure and code changes, namely deployments.

Overcoming Database DevOps Challenges: Part 1

As part of our research for the 2021 State of Database DevOps report, we asked 3,000+ recipients what they consider to be the greatest challenge when integrating database changes into a DevOps process. According to the respondents, these are the most important challenges facing database professionals when introducing DevOps practices to database development.

What do site reliability engineers do?

Are you considering adopting SRE? We will explain the roles and responsibilities of an SRE team within your organization, and how to start building one. So what does an SRE team do? An SRE team is responsible for building software that improves the resiliency of systems, implementing fixes, responding to incidents, and automating processes whenever possible. Site reliability engineering is a holistic practice that incorporates various types of work.

Redfin Implements Circonus to Scale its Monitoring, Reduce Costs, and Improve Accuracy of StatsD Analysis

Over 90% of Redfin’s metric data will be represented in Circonus’ log linear OpenHistograms, which will reduce their metric footprint by 50-60%. We’re pleased to announce today that Redfin, the technology-powered real estate brokerage, has selected Circonus to replace its existing metrics platform.

What Is Container Orchestration?

Since Docker revolutionized the concept in 2013, containers have become a mainstay in application development. Their speed and resource efficiency make them ideal for a DevOps environment as they allow developers to run software faster and more reliably, no matter where it is deployed. With containerization, it’s possible to move and scale several applications across clouds and data centers. However, this scalability can eventually become an operational challenge.

Kubernetes automation with Relay

Kubernetes, a popular open source container orchestration system, enables you to easily deploy, monitor, and scale cloud-native application workloads in both private and public cloud environments. In other words, Kubernetes does the hard work of managing containerized applications, giving you more time to spend building them.

What is Prometheus Pushgateway?

Prometheus is free and open-source software for real-time systems and event monitoring and alerting. Originally developed at SoundCloud, Prometheus became a project of the Cloud Native Computing Foundation in 2016, alongside other popular projects such as Kubernetes. To start using Prometheus, you’ll need a solid understanding of all of the tool’s functionality.

Top 15 Kubernetes Resources

While Kubernetes is a very powerful and comprehensive application, it can also be very complicated and confusing to new users. Thankfully, the community is great at pulling together to try to tame the Kubernetes beasts, and as more users join the platform, more handy tools to help you manage your cluster are developed. Kubernetes Resources range from everyday helper tools to development tools to troubleshooting tools, and in this article we’ll discuss fifteen of the best ones.

Blameless Runbook Documentation is Now Generally Available!

At Blameless, our mission is to provide teams with the tools they need to operationalize SRE and embrace a culture of resilience. We help teams automate toil and adopt best practices across integrated incident management, comprehensive retrospectives, service level objectives, reliability insights, and more. We are very excited to announce that Blameless Runbook Documentation is now generally available for all customers.

Introducing Datadog's Lambda extension

AWS Lambda extensions enable you to seamlessly integrate third-party tooling with your Lambda environment so you can run custom code or monitoring agents alongside your functions. We’ve partnered with AWS to create a Lambda extension that offers a more cost-effective, simplified process for collecting data from your functions.

June 2021 Civo Roadmap Update

In October 2020 we released the community-driven roadmap for 2021. It's time to revisit and see all the things we have completed from the list! I am very proud to say that at Civo we have taken the community suggestions and implemented most of them during the launch on May 4th 2021. Let's dive into each of the features listed in the original blog post and see where we are with the 2021 Civo Roadmap.

Understanding The Move To Intelligent Networking

More CIOs are seeing the value of network automation, which can improve network efficiency and cost, as well as help them manage increasingly complex IT environments. But like the adoption of any new technology, network automation presents a number of challenges and considerations for CIOs. In our recent webinar, PCCW Global’s CTO Paul Gampe and VP of Development and Operations Jay Turner shared some tips and insights into how to begin the move to intelligent networking…

Visualize HAProxy Metrics with InfluxDB

HAProxy generates over a hundred metrics to give you a nearly real-time view of the state of your load balancers and the services they proxy, but to get the most from this data, you need a way to visualize it. InfluxData’s InfluxDB suite of applications takes the many discrete data points that make up HAProxy metrics and turns them into time-series data, which is then collected and graphed, giving you insight into the workings of your systems and services.

Benefits and challenges of using monorepo development practices

In a single, monolithic repository, also known as a monorepo, you keep all your application and microservice code in the same source code repository (usually Git). Typically, teams split the code of various app components into subfolders and use a Git workflow for new features or bug fixes. This approach is natural for most applications or systems developed using a monolithic architecture. Code in such a monorepo typically has a single build pipeline that produces the application executable.

The Role of the DBA Is Changing

For good or for ill, technology is constantly shifting and with it, the roles of those who manage that technology also shift. This is no different for a DBA than it is for a developer, an admin, or analyst. As new technology, like the adoption of the cloud, changes the role, people start to question whether or not there’s even a need for a DBA. The shortest possible answer to that question, in my opinion, is “Yes”.

How to Manage Network Configurations + Best Software to Automate Configuration Management

Businesses both small and large require agility in how they handle device firmware and network configurations. For large networks especially, manual monitoring and change implementation can be inefficient. As such, more and more IT departments turn to automated network configuration management tools with bulk change capabilities.

What Is the Database Server Doing?

One of the most common questions database professionals are asked by their systems and virtual machine (VM) administrators is “Why does the database server need so much memory?” You’ll get a more detailed answer to that question later in this post, but it’s important to understand a database engine is almost like a server within a server.

5 Steps to Starting DevOps with a JFrog Free Subscription

The JFrog Free subscription is a SaaS cloud offering of the JFrog DevOps Platform that provides software developers, DevOps Engineers, System Administrators and students a sandbox environment to explore solutions to common DevOps challenges. Here are examples of common DevOps challenges, where having a free subscription to the JFrog Platform helps.

Why Logging Matters Throughout the Software Development Life Cycle (SDLC)

There are multiple phases in the software development process that need to be completed before the software can be released into production. Those phases, which are typically iterative, are part of what we call the software development life cycle, or SDLC. During this cycle, developers and software analysts also aim to satisfy nonfunctional requirements like reliability, maintainability, and performance.

FlashDrive and Chia cryptocurrency

Chia cryptocurrency is based on Proof of Space and distributes tokens according to a mechanism called plotting. In recent weeks, we've seen a lot of new accounts trying to launch and operate Chia miners from FlashDrive's infrastructure. Most of those accounts were created with fake or stolen credit cards for the sole purpose of getting Chia coins for free.

Use the improved infrastructure list to track your hosts' health

Datadog’s infrastructure list provides a central, high-level view of every host in your environment and pulls together metadata and relevant metrics from across Datadog to help you get the full picture of each one. You can easily filter and sort the list using any host tags, letting you quickly view the status of the parts of your infrastructure you need.

Avoid These 4 Common Mistakes When Setting and Measuring Latency SLOs

Setting and measuring latency Service Level Objectives (SLOs) is a critical responsibility for engineers monitoring the performance and health of their applications and systems. SLOs are an agreement on an acceptable level of availability and performance and are key to helping engineers properly balance risk and innovation.
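As a rough illustration of what measuring a latency SLO can look like, the sketch below counts the share of requests served within a latency threshold and compares it with a target. The function names, the 200 ms threshold, the 90% target, and the sample latencies are all hypothetical values chosen for the example. They are not taken from the article.

```python
from typing import Sequence


def latency_slo_compliance(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """Return the fraction of requests that finished within `threshold_ms`."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    good = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return good / len(latencies_ms)


def meets_slo(latencies_ms: Sequence[float], threshold_ms: float, target: float) -> bool:
    """True when the share of fast-enough requests reaches the target."""
    return latency_slo_compliance(latencies_ms, threshold_ms) >= target


# Hypothetical request latencies in milliseconds.
samples = [120, 80, 95, 410, 150, 60, 230, 90, 110, 100]

if __name__ == "__main__":
    compliance = latency_slo_compliance(samples, threshold_ms=200)
    print(f"{compliance:.0%} of requests finished within 200 ms")
    print("SLO met" if meets_slo(samples, 200, target=0.9) else "SLO missed")
```

Counting requests against a threshold, rather than averaging latencies, keeps a few very slow requests from being hidden by many fast ones.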

Continuous deployment for Android libraries to Maven Central with Gradle

This article will take you through setting up CI/CD integration for building, testing, and publishing libraries to Maven Central using Gradle. With jCenter shutting down, Maven Central is once again the primary destination for all Android and Java libraries. Library publishers will need to port their libraries over to Maven Central to keep their libraries available after jCenter shuts down. This article focuses on CI/CD integration.

US Executive Order on Cybersecurity: What it Means for DevOps

The United States Government equates cybersecurity with national security. That’s the crux of the recent Executive Order, which mandates that software applications be vetted and signals upcoming regulations on disclosing all of the components that make up the software. As section 1 notes: “prevention, detection, assessment, and remediation of cyber incidents is a top priority and essential to national and economic security.”

Data Warehouse Vs. Data Lake (Vs. Data Mart): A Full Breakdown

Big data analytics helps organizations use data to explore both new opportunities and opportunities for improvement. Whichever cloud data platform you choose, there are two data storage technologies you will want to understand. Data warehouses and data lakes are the two dominant data solutions commonly used for defining how an organization stores, queries, analyzes, and reports on big data. This post will define what a data warehouse and data lake are, how they work, and their differences.

Key Multi-tenancy Challenges in the Public Cloud and How to Solve for Them

Nobody wants to deal with annoying neighbors. Whether it’s the neighbor who always knows everyone’s business or the one who turns up their music late at night, both types of neighbors can have a negative impact on your living environment and daily life. Obnoxious neighbors aren't exclusive to your physical living space. They also exist in the public cloud, where there are multiple Kubernetes clusters (EKS, AKS, or GKE) and multiple users (or tenants) who need cluster access.

Create powerful data visualizations with the new Datadog dashboards experience

Dashboards are a crucial tool in your monitoring arsenal, as they allow you to visualize and correlate telemetry data from across your stack in a single place. Historically, Datadog offered two dashboard types: Screenboards, for pixel-level control on a canvas, and Timeboards, for troubleshooting a specific point in time. Now, we’re excited to introduce a new dashboard layout that combines the best of Timeboards and Screenboards in a single, seamless editing experience.

How to debug Kubernetes Pending pods and scheduling failures

When Kubernetes launches and schedules workloads in your cluster, such as during an update or scaling event, you can expect to see short-lived spikes in the number of Pending pods. As long as your cluster has sufficient resources, Pending pods usually transition to Running status on their own as the Kubernetes scheduler assigns them to suitable nodes. However, in some scenarios, Pending pods will fail to get scheduled until you fix the underlying problem.

Use Datadog's Notebooks API to programmatically manage your notebooks

Datadog Notebooks simplify the way teams across an organization find and share knowledge. By bringing together live data and rich Markdown text, Notebooks help teams create powerful, data-driven documents—from runbooks and support playbooks to incident postmortems and data reports. And with collaboration functionalities like real-time editing and commenting, team members can simultaneously make changes to a document and gather feedback along the way.

How Query Sampling Improves Database Performance

Given the overwhelming importance of data to organizations, anything they can do to speed up troubleshooting problems in the databases they use is of great value. If a company can speed up or avoid troubleshooting, it frees up time they can invest in doing and building more with their data. Careful attention to database query construction and execution also pays similar dividends, as improved database performance helps organizations get more done faster.

Untangling Network Policies on K8s

Network Policy is a critical part of building a robust developer platform, but the learning curve to address complex real-world policies is not tiny. It is painful to get the YAML syntax right. There are many subtleties in the behavior of the network policy specification (e.g., default allow/deny, wildcarding, rules combination, etc.). Even an experienced Kubernetes YAML-wrangler can still easily tie their brain in knots working through an advanced network policy use case.

Data Lake, Data Lab, Data Hub: what's the difference?

In this post we’ll explore the concepts of data lake, data hub and data lab. There are many opinions and interpretations of these concepts, and they are broadly comparable. In fact, many might say they’re synonymous and we’re just splitting hairs. But let’s look again carefully. We can discern some subtle trends in the way people are doing things, and find distinctions in these expressions.

Datadog Synthetic Monitoring now supports cross-browser testing

Your users access your application from a wide range of browsers, which have their own implementations of HTML, CSS, and JavaScript. For instance, many modern JavaScript features such as Promises and Arrow Functions are unsupported by some browsers. These inconsistencies can lead to missing elements and malfunctioning workflows that affect some—but not all—of your user base.

The Future of Database DevOps

I work as Director at ThoughtWorks in the database and DevOps space. I’ve been here for 20+ years and I vaguely remember my first project at ThoughtWorks in 1999 when we had just started using Agile software development practices. The basic challenge we faced was how to move database changes at the same pace as application code and keep them in sync so that deployments would work. At the time, we had to invent all the tools, processes, and techniques that we needed.

Resilience in Action Episode 7: Killing Ops with Tony Hansmann

Resilience in Action is a podcast about all things resilience, from SRE to software engineering, to how it affects our personal lives, and more. Resilience in Action is hosted by Kurt Andersen. Kurt is a practitioner and an active thought leader in the SRE community. He speaks at major DevOps & SRE conferences and publishes his work through O'Reilly in quintessential SRE books such as Seeking SRE, What is SRE?, and 97 Things Every SRE Should Know.

10 Biggest Mistakes IT Professionals Make And How to Avoid Them

IT spending grew to an impressive $3.8 trillion in 2019. With 2020 giving enterprises a reality check on remote working, the spending on digital transformation is expected to grow even further. It goes without saying that IT is an integral part of any company, big or small. When the stakes are so high, there’s very little room for mistakes. However, we’re all human, and we do make mistakes.

PowerShell DSC: The next generation

We have some exciting news for you about Puppet's support for the PowerShell DSC configuration framework for Windows. In short, content from the PowerShell Gallery will simply appear on the Puppet Forge and can be added to your Puppetfile and used just like any other Puppet module. This makes it by far the most flexible and maintainable iteration of DSC integration we've ever had. Pick and choose whatever DSC Resources you want and get all the VSCode IntelliSense magic you've come to expect.

AIOps for IT Ops - Part One

Industry analyst firm Gartner recently released a new report entitled Market Guide for AIOps Platforms. It’s a 20-page document that offers their perspective on the AIOps market. Unlike a Gartner Magic Quadrant, the Market Guides are not vendor comparisons. Market Guides are often precursors to MQs - they are used for emerging markets that may eventually have an MQ.

Automate Self-Service Workflows from ServiceNow with the Puppet Spoke

Puppet Spoke empowers your team to deliver changes in a faster, more efficient manner by creating a single platform experience between Puppet and ServiceNow. Users can trigger automation workflows such as installing packages, rebooting machines, and orchestrating changes across their environments without ever leaving ServiceNow.

Build Automated, Scalable Enterprise Integration Workflows by Using the Enterprise Integration Pack (EIP) with Azure Logic Apps

In business-to-business (B2B) solutions, where organizations need to communicate seamlessly, the most challenging goals are establishing a standard format for communication across different services and enabling security and trust across those services. But today, you can build automated, scalable enterprise integration workflows with ease by using the Enterprise Integration Pack (EIP) with Azure Logic Apps.

GitHub Authentication Policy Changes Coming August 2021

If you’ve already connected your GitHub integration via OAuth in GitKraken, you’re good to go! GitHub is changing its security policy and will no longer allow username/password-only access. This change goes into effect on August 13th, 2021, and affects all desktop Git applications that offer a GitHub integration, including GitKraken. Users who have already authenticated to GitHub using OAuth will be unaffected. OAuth is the default connection method within your GitKraken profile settings.

DevOps Training: Enabling Continuous Cloud Capacity Optimization

Cloud elasticity makes capacity management irrelevant in the cloud, right? Wrong! In fact, with effectively no limits on potential capacity, optimization and management become critical. Watch as we explore opportunities for DevOps practitioners like you (linchpins for enabling cloud business strategies) to achieve greater agility while providing increased cloud ROI.

Announcing developer preview of the Mattermost Apps Framework and serverless hosting

The value of Mattermost is significantly enhanced with third-party tool integrations and customization. Today, we are releasing the developer preview of a new Apps Framework for creating application integrations and customized workflows. The Apps Framework complements the existing ecosystem of plugins and allows apps to be written in any language and deployed with serverless hosting.

Cloud 66 Feature Highlight: Tag Propagation

What is tag propagation? Some cloud providers support the propagation of your Cloud 66 tags into their own tagging systems. Tag Propagation allows you to easily identify and link components between your Cloud 66 account and supported cloud provider platforms. Currently, tag propagation supports cloud servers and load balancer components.

A Guide to Enterprise Cost Containment for Monitoring Pros

In this enterprise cost containment series, we’ve tackled a range of topics from cloud to professional services and more. Now, I want to dive into the topic you may have expected us to cover from the start: monitoring. After all, at SolarWinds, we create monitoring software. The goal of this post is not to present or sell software, though. It’s our intention to help you have conversations with management and stakeholders—no matter the monitoring you use.

A DBA's Habit for Success: CMMI (Part Two)

Welcome back to our five-part series in which we discuss a top habit for DBAs to increase business functionality. In part one of this series, we discussed the importance of a capability maturity model (CMM) and more specifically Level 1 of the information management maturity model (IMMM) and how the framework can provide a step-by-step process for DBAs to follow while also allowing businesses to gain skills along the way.

Integrating a Cloudsmith repository with a Semaphore CI workflow

At Cloudsmith, we believe that packaging should be at the centre of any modern build and deployment process. In fact, we think that Continuous Packaging is the glue that ties Continuous Integration and Continuous Deployment or Delivery together. So with that in mind, in this blog, we will walk through how easy it is to integrate Cloudsmith with a Semaphore CI workflow and push the artifacts and packages that you build to a private repository. TL;DR – It’s super easy.

What's new in security for Ubuntu 21.04?

Ubuntu 21.04 is the latest release of Ubuntu and comes at the mid-point between the most recent Long Term Supported (LTS) release of Ubuntu 20.04 LTS and the forthcoming 22.04 LTS release due in April 2022. This provides a good opportunity to take stock of some of the latest security features delivered in this release, on the road to 22.04 LTS. Ubuntu 21.04 brings with it a vast number of improvements and features across a wide variety of packages.

Monitor AWS App Runner with Datadog

Knowing how to deploy and run applications has become a key part of modern app development, meaning that developers need expertise in a number of areas beyond their core application code. Whether it’s container orchestration, networking, scaling, or load balancing, there is a steep learning curve to being able to deploy and run an application at scale.

5 factors for evaluating an RMM tool for the modern MSP

Managed service providers (MSPs) are becoming increasingly important in the IT management industry. The role of an MSP does not just stop with monitoring, managing, and maintaining the IT services of their clients; it extends to keeping a close watch on everyday IT developments and proactively securing clients’ IT networks against cyberthreats. To balance all these responsibilities, MSPs need comprehensive IT management and monitoring solutions that can cater to all their needs.

Podcast: Break Things on Purpose | Jose Nino, Staff Software Engineer at Lyft

Get started with Gremlin's Chaos Engineering tools to safely, securely, and simply inject failure into your systems to find weaknesses before they cause customer-facing issues. Break Things on Purpose is a podcast for all-things Chaos Engineering. Check out our latest episode below. You can subscribe to Break Things on Purpose wherever you get your podcasts. If you have feedback about the show, find us on Twitter at @BTOPpod or shoot us a note at podcast@gremlin.com!

Enhanced Kubernetes right-sizing features for fine-tuned application efficiency

Spot by NetApp’s Ocean continuously optimizes Kubernetes clusters with a wide feature set tackling different aspects of running and managing Kubernetes containers in a cloud environment. To help users improve the efficiency and performance of their cloud environments, Ocean’s rightsizing capabilities provide recommendations that target over-provisioning and underutilization.

Keep OSS supply chain attacks off the menu: Tidelift catalogs + JFrog serve known-good components

How does your organization keep track of all of the open source components being used to develop applications and ensure they are secure and properly maintained? Our recent survey data shows that the larger an organization gets, the less confident it is in its open source management practices. In companies with over 10,000 employees, 39% are not very or not at all confident their open source components are secure, up to date, and well maintained.

Continuous deployment for Azure functions

Serverless computing, a model in which the provider manages the server, lets developers focus on writing dedicated pieces of application logic. Serverless computing has been adopted by many development teams because it auto-scales. Auto-scaling relieves developers of allocation management tasks, so they do not need to worry about the allocation of server resources or being charged for resources they are not consuming.

Accelerate operational tasks with Puppet and ServiceNow

As your DevOps and IT Service Management (ITSM) teams continue to increase, so should your ability to provide self-service infrastructure capabilities. Whether you are a ServiceNow or Puppet administrator looking to expand automation to other groups such as Site Reliability Engineering or the IT Service Desk, the integration of Puppet Enterprise and ServiceNow is now at your fingertips.

Boost Productivity Through Optimized Azure Storage Management

In the fast-growing Azure world, productivity is a significant need for organisations. Any enterprise that invests in a tool for its serverless needs aims for optimised management, which boosts productivity. This article explains how to increase productivity when working with a critical resource like Azure Storage, which is essential for organisations whose business is built on Azure. For such a substantial resource to perform effectively, optimised management is mandatory.

What is Prometheus - Use cases

Prometheus is an open-source tool that’s meant to monitor and collect metrics from applications. The point of this system is to make it easy for users to see and understand important metrics that let them know how well an application is doing. In fact, Prometheus is able to collect over one million metrics per second, and then store them until you’re ready to retrieve them.

What Is Thanos - Use Cases

When you hear the word "Thanos," your first thought might be the Marvel Cinematic Universe villain from the Avengers: Infinity War film who seeks to collect the Infinity Stones and end half of all life in the universe. But if you mention the word to a data nerd, you're likely to get a very different response. Prometheus is a free and open-source platform for real-time systems and event monitoring and alerting.

SaaS vs onPremise: Pros, Cons and Cost Analysis

Be aware that we’re not saying that you are on cloud nine, but that you are most likely using the cloud. That is, if you use Google mail or the Microsoft Office 365 office suite, or you take a photo with your cell phone and it then gets automatically uploaded to iCloud or something similar, you are using the cloud.

Careers at a Crossroad: Staying Technical vs. Heading into Management

There’s a point in every IT professional’s career where they inevitably ask themselves, “do I want to stay technical, or get into management networking jobs?” Sometimes this point occurs when they find themselves already in management, either by design, or as I like to say, “by accident”.

SRE vs. DevOps [Understanding Differences & Similarities]

Site Reliability Engineering (SRE) and DevOps share a goal of building a bridge between development and operations. We'll explore and compare both approaches. Wondering which is better for your company, SRE or DevOps? Neither SRE nor DevOps is “better,” exactly, since they’re similar yet different in a few key ways: SRE, or site reliability engineering, is a methodology developed by Google engineer Ben Treynor Sloss in 2003.

Make your Onboarding Experience Better with a Murder Mystery Game

Onboarding a new tool can be boring. Or stressful. Or both. When onboarding an incident response tool, it can be difficult to make sure that your team is getting the most from the experience. Do you opt for a run-of-the-mill meeting, or try to learn while in an incident? Neither option is ideal. That’s why Petal’s DevOps Engineer Michael Cole found a new way to get his team using Blameless for their incident response process.

Understanding The Benefits & Risks Of The Interconnected Enterprise Model

The interconnected enterprise model is a natural response to the high-frequency change that we are seeing across multiple industries. Today’s enterprises now find themselves interconnected with a vast pool of cloud and SaaS providers as well as other business partners, which allows organisations to adapt more rapidly to changing market conditions and opportunities. But like any business model, this presents risks and challenges as well as its rewards…

New Puppet Enterprise release delivers security, performance updates

The spring release of Puppet Enterprise is now available. Puppet continues to build out its flagship product to help organizations scale DevOps initiatives, meet compliance requirements, and deliver on cloud and hybrid initiatives. With this release, we’ve focused on delivering key enhancements to help boost productivity, giving organizations the ability to automate faster and more securely at scale.

Puppet Enterprise 2021.1 release includes support for SAML 2.0

Security is essential. It’s top of mind for organizations of all sizes and it’s certainly a top priority for Puppet. The latest release of Puppet Enterprise 2021.1 now offers support for SAML 2.0 providing a more secure and efficient authentication path for our customers to access their Puppet environments, applications and tooling.

AdventureWorks, classified in under 20 minutes with SQL Data Catalog

It's hard to know where to start with Database Classification and many choose to go column-by-column. In this video join Chris as they show you how to get AdventureWorks classified in under 20 minutes using clever rules, bulk filters and a logical approach with Redgate SQL Data Catalog.

SRE Availability Metrics

How available is your website, service, or platform? What must you monitor and measure to ensure availability? How do you translate uptime into availability? This chart has numbers that every Site Reliability Engineer (SRE) should know. Below the chart, you will find answers to commonly asked questions about SRE and associated metrics.
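As a rough illustration of translating between uptime and availability, the sketch below computes the downtime budget implied by an availability target and the availability implied by measured uptime. The function names `allowed_downtime` and `availability_from_uptime` are hypothetical, and the choice of period (for example a 30-day month) is an assumption rather than something taken from the chart.

```python
from datetime import timedelta


def allowed_downtime(availability_percent: float, period: timedelta) -> timedelta:
    """Return the downtime budget for an availability target over a period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The fraction of the period during which the service may be unavailable.
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(seconds=period.total_seconds() * unavailable_fraction)


def availability_from_uptime(uptime: timedelta, period: timedelta) -> float:
    """Return availability as a percentage: uptime divided by total period."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    return 100 * uptime.total_seconds() / period.total_seconds()


if __name__ == "__main__":
    # Assumed reporting window: a 30-day month.
    month = timedelta(days=30)
    for target in (99.0, 99.9, 99.99):
        print(target, allowed_downtime(target, month))
```

Under these assumptions, a 99.9% target over a 30-day month leaves a budget of roughly 43 minutes of downtime.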

Automating load tests for APIs

In most cases, when users start to access and use a new application or a new release, the app performs pretty well. As the user base grows and usage increases, the app can outgrow its infrastructure. Users can start experiencing a dip in performance. Latency increases, bandwidth and memory get exhausted quickly, and some code architectures start to fail because they do not scale well with the increased number of users.

Messy AWS Tags? Confidently Allocate Costs Without a Perfect Tagging Strategy

AWS tags are a bit like flossing your teeth every day — or getting eight hours of sleep a night. Everyone agrees they’re good habits that will make life easier down the road. But sometimes life gets in the way, and those habits fall a little short. Most teams set out with the intention to tag their infrastructure, but in our experience, it’s rare that companies have a perfect, thorough AWS tagging strategy.

A Day in the Life: Intelligent Observability at Work with our SRE, Dinesh

When I asked Charlie for permission to attend this year’s AICon (virtual, natch) I thought it would be a shoo-in; learning’s part of my OKRs after all. But he never makes things easy and his ‘yes’ came with a caveat that’s typical when dealing with him. This time, he claimed he didn’t have the budget for the ticket (a likely story!) and I’d have to find another way to get one.

WTF is Incident Management? Post-Panel Wrap-Up

That's a wrap! We hosted "WTF is Incident Management" on May 12, 2021. We invited four very knowledgeable panelists to discuss how they define incident management, what changes they'd make if they could start again from scratch, how to manage team stress after an incident, and other subjects. Our panelists were: host Matt Stratton (Staff Developer Advocate at Pulumi), Emily Ruppe (Incident Commander at Twilio), Alina Anderson (Sr.

Introduction to open source private LTE and 5G networks

It’s so easy these days to set up your own WiFi network. You order a router online, plug it into the electrical socket, define a password and you’re good to go. WiFi is fast, reliable and easy to use. But if you want to cover a wider area or connect hundreds of small devices it quickly becomes inefficient and expensive. Is the only way to go to your local mobile network operator and sign a contract? No! Thanks to open source technology, you can build your own LTE or 5G network.

Introduction to K3s

Whether you’re new to the cloud native space or an accomplished practitioner, you’re probably aware that there are many Kubernetes distributions to choose from. Maybe you’ve heard about the challenges of getting up and running with Kubernetes. Guess what? It doesn’t have to be hard. This blog provides an introduction to K3s, a lightweight CNCF-certified Kubernetes distribution. We’ll look at what makes K3s different from other Kubernetes distributions.

Introducing Kubewarden, an Open Source Policy Engine

Security has always been a wide and complex topic. A recent survey from StackRox about the state of containers and Kubernetes security provides some interesting data on these topics. In this blog post, I’ll dive into some of the findings in that survey and introduce you to Kubewarden, an open source policy engine. A staggering 66 percent of the survey participants do not feel confident enough in the security measures they have in place.

Preventing your teams from burning out while working from home

In the past year of COVID-related working from home, we have seen more and more burnout in engineering teams worldwide. More and more devs are partially checked out and may not be giving 100% in team activities (planning, grooming, code review, quality checks). In these testing times, we have found some ways to keep your team motivated.

Five Tips for Optimizing Hyper V

Introduced in 2008, Microsoft’s virtualization platform Hyper-V has become a well-known tool for administrators. Hyper-V offers users a wide range of management options. It includes GUI-based Hyper-V tools such as Hyper-V Manager, and command-line tools like Windows PowerShell. New Hyper-V versions have been released with Windows Server ever since.

KubeCon Operator Day keynote with Mark Shuttleworth

Operators, Models, Kubernetes, Hybrid Clouds, massive scale and bootstrapping quickly - Kubernetes is taking the world by storm. So what's next? Mark Shuttleworth (one time astronaut, founder of Canonical, the company behind Ubuntu) talks with David Booth (VP Cloud Native Applications at Canonical) about the past and lays down a vision for the future.

AI-powered API operations with Apigee

APIs are packages of data and functionality that contain business-critical information. However, as API programs scale, it becomes impossible to manage each API individually. In this video, we demo how Apigee helps simplify API operations and allows you to deliver seamless and connected experiences for your customers.

Turbocharging your Android Gradle builds using the build cache

The Gradle Build Cache is designed to help you save time by reusing outputs produced by previous builds. It works by storing (locally or remotely) build outputs, and allowing builds to fetch these outputs from the cache when it determines that inputs have not changed. The build cache gives you the ability to avoid the redundant work and cost of regenerating time-consuming and expensive processes.
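The idea of reusing outputs when inputs have not changed can be sketched with a small in-memory cache keyed by a hash of the task name and its inputs. This is only an illustration of the general technique, not how Gradle computes its cache keys or stores entries; `BuildCache`, `cache_key`, and `run` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict


class BuildCache:
    """Toy in-memory build cache: outputs are stored under a hash of the inputs."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def cache_key(task_name: str, inputs: Dict[str, bytes]) -> str:
        # Hash the task name plus every input name and content, in a stable order,
        # so identical inputs always map to the same key.
        digest = hashlib.sha256(task_name.encode("utf-8"))
        for name in sorted(inputs):
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")
            digest.update(hashlib.sha256(inputs[name]).digest())
        return digest.hexdigest()

    def run(
        self,
        task_name: str,
        inputs: Dict[str, bytes],
        action: Callable[[Dict[str, bytes]], bytes],
    ) -> bytes:
        key = self.cache_key(task_name, inputs)
        if key in self._store:
            # Inputs are unchanged: fetch the stored output instead of rebuilding.
            self.hits += 1
            return self._store[key]
        # Inputs changed or were never seen: do the work and store the result.
        self.misses += 1
        output = action(inputs)
        self._store[key] = output
        return output
```

Running the same task twice with identical inputs executes the action once and serves the second call from the cache; changing any input's content produces a new key and a fresh execution.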

Correlate software performance and resource consumption with new saved views in Live Processes

Your applications rely on third-party software running throughout your infrastructure, and it can be challenging to monitor each of these technologies individually. To give you the visibility you need, Datadog Live Processes now monitors all of your third-party workloads in one place.

Add Datadog monitoring to your Retool apps

The more tools that your teams need to execute their workflows, the more friction and lost productivity there can be, especially if each tool requires a different CLI or set of APIs. Retool is a low-code platform that allows you to build internal web applications using a drag-and-drop interface. By integrating with a number of key backend databases and APIs, Retool enables you to create custom, centralized management tools to serve a wide range of employee-facing use cases.

Simplifying MLOps with model-driven operators

In early markets such as MLOps, solutions to parts of a large problem arise from multiple open source communities, startups and industry leaders. For the consumer, this entails one problem - integrating pieces of a software puzzle in a maintainable way. Model-driven operators promise a solution by connecting the ops of a single application with declarative integration in a standard that empowers providers.

Introduction to database testing

In software development, processing and storing data in different states reflects the business rules an application is built on. The heart and soul of any software application is the data that is persisted in databases for retrieval and further processing. The database system (SQL or Non-SQL) chosen for an application must serve the required data processing and storage needs of the application.

Troubleshooting Kubernetes Clusters as a Developer with Komodor

The container ecosystem is moving very fast, and new tools designed specifically for Kubernetes clusters are introduced at a rapid pace. A new tool often simply implements a well-known mechanism (already present in the VM world) with a focus on containers, but every once in a while we see tools that are designed from scratch rather than adapting a preexisting idea. One such tool is Komodor.

New Dashboard Builder Now Available to Circonus Customers

We recently announced the development of our new dashboard builder and associated release of several new turnkey service dashboards. The new dashboard builder provides a vastly improved user experience, enabling users to create dashboards in a fraction of the time it took them previously. As of this month, the dashboard builder, which was previously only accessible internally, is now available to all Circonus customers.

Announcing HAProxy 2.4

HAProxy 2.4 adds exciting features such as support for HTTP/2 WebSockets, authorization and routing of MQTT and FIX (Financial Information Exchange) protocol messages, DNS resolution over TCP, server timeouts that you can change on the fly, dynamic SSL certificate storage for client certificates sent to backend servers, and an improved cache; it adds a built-in OpenTracing integration, new Prometheus metrics, and circuit breaking improvements.

6 Ways Artificial Intelligence Improves Software Development

Artificial intelligence is transforming software development. From the code to the deployment, AI is slowly but surely upping its game and helping us discover a brand new paradigm for inventing technology. Algorithm-based machine learning is being used to accelerate the software development lifecycle and AI is supporting developers to optimize software workflow at every stage of the development process.

GitKraken v7.6: Hook a Warp Speed Drive To GitHub Pull Requests

Now you can interact with your GitHub pull requests directly from GitKraken. We opened our hailing frequencies and heard your communications. Devs from across the galaxy have asked us to help increase the speed of their workflows and we are happy to report on some major activity in that quadrant. 🚀🌃🌠 Announcing GitKraken v7.6. You no longer have to leave the bridge GitKraken to work with your GitHub Pull Requests.

CloudWatch Pricing: What You Need To Know

To make sure your company’s cloud-based resources remain continuously available, you need a way to monitor all your applications and quickly detect when something goes wrong — especially if you are running multiple instances and using a variety of products. Amazon’s inbuilt tool, CloudWatch, allows you to do just this. In this article, we’ll cover exactly what AWS CloudWatch is, how it works, and how much it costs to use.

2 Ways to Integrate the Jaeger App with VMware Tanzu Observability Without Code Changes

In microservices architecture, to identify performance issues—including latency—it’s important to monitor each service and all inter-service communication. Jaeger and VMware Tanzu Observability can help. Jaeger is an open source, distributed tracing system released by Uber Technologies. VMware Tanzu Observability is a high-performance streaming analytics platform that supports 3D observability (e.g., metrics, histograms, and traces/spans).

Tanzu Mission Control Supports Lifecycle Management of Tanzu Kubernetes Clusters on VMware Cloud on AWS

We are very excited to announce a key integration between VMware Tanzu and VMware Cloud on AWS that provides a significantly enhanced experience for our customers who want to deploy, run, and manage Kubernetes in VMware Cloud on AWS. With this integration, VMware Tanzu Mission Control now supports full lifecycle management—provisioning, upgrading, scaling, and deleting—of Tanzu Kubernetes clusters deployed on VMware Cloud on AWS.

Augmented Reality in Pharma Industry: How the Game Changing?

Many industries have taken up Augmented Reality (AR) seriously in order to increase production, sales and customer education. Organizations today are making use of several AR applications for multiple purposes to grab the attention of users. Surveys have predicted vast growth in the AR/VR markets in the near future, as much money has been invested in the field. The application of AR technology is establishing a firm hold over the pharmaceutical industry as well.

Best practices for monitoring dark launches

A dark launch is a deployment strategy for testing new versions of a service in production. When running a dark launch, you deploy a new version of a service and route a copy of production traffic to it without returning responses to users. This lets you see how a new version of a service handles production load, watch for errors, and compare performance between the old and the new versions—without affecting users.
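The mirroring pattern described above can be sketched as a small router that always answers from the current version while also sending a copy of each request to the new version and recording what happens. This is a simplified, synchronous illustration; real setups usually mirror traffic asynchronously at a proxy or service mesh. `DarkLaunchRouter` and its fields are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Handler = Callable[[str], str]


@dataclass
class DarkLaunchRouter:
    """Serve users from `primary`; send a copy of each request to `candidate`."""

    primary: Handler
    candidate: Handler
    mismatches: List[Tuple[str, str, str]] = field(default_factory=list)
    candidate_errors: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, request: str) -> str:
        # The user-facing response always comes from the current version.
        response = self.primary(request)
        try:
            shadow = self.candidate(request)
        except Exception as exc:
            # Failures in the dark-launched version are recorded, never surfaced.
            self.candidate_errors.append((request, repr(exc)))
        else:
            if shadow != response:
                # Keep differences so the two versions can be compared later.
                self.mismatches.append((request, response, shadow))
        return response
```

Because `handle` returns only the primary response, errors or differing results from the candidate show up in `candidate_errors` and `mismatches` without affecting users.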

Redgate's roadmap for cross-database DevOps

At Redgate, we strongly believe that all databases should be managed and orchestrated in the same way, with the same standards of security and quality in releases. For the past few years, we’ve been leading the adoption of database DevOps by focusing on the most challenging parts of the process like version control, continuous integration and making deployments consistent, predictable and repeatable.

Automating Government Compliance and Security

This blog is the first in a four-part series on infrastructure automation for government agencies that are modernizing digital systems while grappling with budget and staffing constraints and the challenges of COVID-19. The last 12 months have been a turning point for many government agencies. The COVID-19 pandemic has accelerated the drive towards modernization and, with it, the need to ensure security and compliance requirements across a host of legacy systems and processes.

Continuous delivery with Ketch, GitHub Actions, and k3d

Can we combine the simplicity of deploying applications with Ketch with GitHub Actions and accomplish a fully automated continuous delivery pipeline? Here's what we'll do. We'll create GitHub Actions that will fully automate all the tasks starting from creating a pull request all the way until a release is deployed to production.

Continuous deployment of Node.js to Azure VM

Virtual machines (VM) offer great flexibility for hosting web applications. A developer/engineer is able to configure and control every piece of software and every setting that the application needs to run. Azure, one of the largest cloud hosting platforms, has virtual machine offerings for both Linux and Windows-based operating systems. In this tutorial, you will learn how to set up a continuous deployment pipeline to deploy a Node.js application to an Azure virtual machine.

How Database Performance Analysis Can Inform Database Selection

There’s a big difference between the MySQL database powering your internet of things (IoT) lightbulb and the one powering your website. There’s a chasm between the SQL Server Express database in your lightly used application’s VM and the monster multi-region SQL cluster you have running on Microsoft Azure. This kind of database diversity is everywhere today. But what are the differences between these databases, and how can administrators justify database spend?

7 Strategies to Contain Network Costs (Layer 6 Will Amaze You)

First, thanks for indulging the clickbait title joke. Serialization is the unsung hero of harmonizing network and application relations and deserves the occasional, snarky callout. Moreover, identifying how the unique mix of network clients in your environment consumes your carefully manicured infrastructure is critical for managing network cost. Because today, a rapidly expanding, diverse pile of new technologies all assume the network is a magic grid, no tuning required.

D2iQ Kommander and AWS Kubernetes Services: A Winning Combination

To continuously innovate, many organizations are anchoring their infrastructure on container management solutions. The open source project Kubernetes is now the de facto standard for container management, and its popularity is growing in a number of ways. Here are some stats from a recent Cloud Native Computing Foundation (CNCF) survey.

The rise of the developer platform

I have recently seen quite a few articles and talks covering why organizations are aiming to implement a developer platform to help speed up their adoption of microservices. But before we get started on discussing what a developer platform is, the developer experience and productivity on Kubernetes, and how different teams are working through it, let’s define some common ground.

Automate Containerization of Apps to OpenShift using CloudHedge's App Modernization Platform

First, we are super excited to share that CloudHedge has successfully completed the IBM Cloud Paks certification. IBM has some stringent requirements for achieving this certification, which include having an Operator-certified product listed on Red Hat Marketplace. CloudHedge’s intelligent App Modernization Platform enables enterprise customers to transform their legacy workloads to the OpenShift container platform efficiently and effortlessly.

Celebrities Explain WTF is Incident Management

Our friends Felicia Day, Steve Wozniak, and Brian Baumgartner help us explain what the heck incident management is. FireHydrant is the only comprehensive incident management platform that allows you to create consistency for the entire incident response lifecycle to focus on fighting fires faster. From alert to retrospective, tracking, communicating, and reporting on results: FireHydrant will automate the process so you can focus on resolution. Visit firehydrant.io to learn how you can manage the mayhem.

CircleCI acquires Vamp, adding release orchestration to their CI/CD platform to help engineering teams deliver business growth

CircleCI, the leading continuous integration and continuous delivery (CI/CD) platform, today announced the acquisition of release orchestration platform, Vamp. Combining Vamp's industry-leading release orchestration capabilities with CircleCI's robust CI/CD platform will be transformative for engineering teams amid a growing need for increased change validation in the industry.

Speed up your dashboard workflow with dynamic template variable syntax

Template variables enable you to use tags to filter your Datadog dashboards to the hosts, containers, or services you need for faster troubleshooting. However, there are some cases where it may be difficult to use a standard set of template variables to aggregate all of the data you need without creating a complicated, difficult to manage set of variables. For example, you may use tag values that are a subset of another tag.

SRE Leaders Panel: Business Agility is what matters, SRE can help you get there

Blameless recently had the privilege of hosting SRE leaders Garima Bajpai, Founder at Community of Practice - DevOps Canada and Jason Fraser, Delivery Lead at VMware Tanzu to discuss the value of crisis during incident response, the best and worst tech transformations they’ve seen, how reliability impacts the flow of value, and more.

Announcing HAProxy Data Plane API 2.3

The HAProxy Data Plane API 2.3 expands its service discovery mechanisms and introduces native support for discovering AWS EC2 instances and auto-scaling groups. It also adds a new configuration file that supports HCL and YAML, an Inotify configuration watcher, and Syslog support. HAProxy Data Plane API version 2.3 is now available and you will find it in the 2.3 version of the Alpine Docker image.

We raised $100M in our Series F: here's what we're building next

Today we announced our Series F round of $100M led by Greenspring Associates, with Eleven Prime, IVP, Sapphire Ventures, Top Tier Capital Partners, Baseline Ventures, Threshold, Scale, Owl Rock, and Next Equity Partners. Thank you to our customers, community, partners, investors, and team. This latest investment allows us to invest as well: in our product, in our community, and in our people. We build for the builders of the digital age: developers.

Concrete Steps to Reducing MTTR

In today’s data-centric world, metrics define all performance benchmarks. The time between when an event starts and when it ends shows how well a system can handle and process such events. One such metric is MTTR. MTTR usually stands for Mean Time To Resolution, but it has held several meanings over the years. MTTR is a metric used to measure how well a system can bounce back from errors and provide long-lasting solutions.
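
As a rough illustration of the arithmetic behind the metric, the sketch below averages the time between the start and the end of each incident. The function name `mean_time_to_resolution` and the sample timestamps are hypothetical and are not taken from the original article.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average duration of a list of (start, end) datetime pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: three incidents lasting 30, 90, and 60 minutes.
incidents = [
    (datetime(2021, 5, 1, 9, 0), datetime(2021, 5, 1, 9, 30)),
    (datetime(2021, 5, 3, 14, 0), datetime(2021, 5, 3, 15, 30)),
    (datetime(2021, 5, 7, 22, 0), datetime(2021, 5, 7, 23, 0)),
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 1:00:00
```

A plain mean is sensitive to a single long outage, so teams that track this number may also want to look at the median or at percentiles.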

A guide for CTO: 8 questions to ask before using Kubernetes

Congratulations, you are finally considering moving your apps to Kubernetes. It is a big day! Here is a checklist to ensure you have not forgotten anything essential and to increase your chances of success with Kubernetes. We divided those points into three sections, from the most important to the least. Let’s go.

Observability: It's the User Experience, Stupid!

Observability, which originated from control theory, measures how well you can understand a system’s internal states from its external outputs. Observability uses instrumentation to provide insights that aid monitoring. In DevOps, gaining observability is achieved through a set of monitoring solutions. The shift to using one vendor platform to do so, versus multiple solutions, makes sense as.

Replay Single Transactions for Root Cause Analysis

Speedscale was built primarily to provide engineering teams with better insight into their applications over time, replaying single transactions for root cause analysis that give developers and SREs confidence that tomorrow’s application code will work just as well in production as it did yesterday.

What's New with JFrog Xray and DevSecOps

As we look to improve the quality and capabilities of the JFrog DevOps Platform, especially in the world of DevSecOps, we have added powerful new features to further enhance the award-winning JFrog Xray. The capabilities detailed below cement Xray’s position as a universal software composition analysis (SCA) solution trusted by developers and DevSecOps teams globally to quickly and continuously identify open source software vulnerabilities and license compliance violations.

Snowflake vs. Redshift: Which One Is Best For You?

Data is a key business intelligence tool. Successful businesses rely on data to make decisions. And every business needs a secure destination for storing collected data for later analysis. Cloud-based data warehouses are increasingly becoming the go-to destination. Redshift and Snowflake are two of the big names in this space and they provide similar services. They are big data analytics databases that can read and analyze huge amounts of data.

The technology challenge of mergers and acquisitions in the insurance sector

Mergers and acquisitions are going on all over the market at the moment … and de-mergers as well, actually. Typically in a merger or acquisition, there’s some knowledge that it’s going to happen in advance. But until the Heads of Terms have been signed and there’s a Transition Service Agreement in place, people don’t really get moving with the activity needed to support the move, particularly on the technology side.

Kubernetes vs YARN for scheduling Apache Spark

Spark is one of the most widely-used compute tools for big data analytics. It excels at real-time batch and stream processing, and powers machine learning, AI, NLP and data analysis applications. Thanks to its in-memory processing capabilities, Spark has risen in popularity. As Spark usage increases, the older Hadoop stack is on the decline with its various limitations that make it harder for data teams to realize business outcomes.

Virtual Reality in Healthcare

VR has become an integral part of patient care today. It has numerous use cases in healthcare, such as diagnosis, treatment, medical training, and rehabilitation. VR has also enhanced the way medical practitioners approach diagnosis and training. ZiniosEdge is a major player in developing Virtual Reality solutions for industries, enabling healthcare practitioners to get the best from VR technology and helping healthcare facilities seamlessly develop appropriate VR applications.

Continuous integration with GitOps

Software development is changing rapidly. On one hand, you must quickly adapt to evolving requirements, while on the other, your applications need to operate continuously without downtime. DevOps helps you quickly adapt to changes. Among other initiatives, continuous integration (CI) and continuous delivery (CD) are integral to any DevOps practice.

Preventing SQL injection attacks with automated testing

SQL injection is one of the most destructive ways an application can be attacked. This kind of attack is targeted toward the application database, which can result in consequences that are irreversible, lead to loss of money, and reduce user trust in your company. There are far too many application data breaches happening every day, usually when a malicious agent attacks the database.
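
To make the idea concrete, here is a minimal sketch of an automated check, using Python's built-in `sqlite3` module with an in-memory database. The table layout, the helper names `find_user` and `find_user_unsafe`, and the payload are all hypothetical and are not taken from the original article. The sketch contrasts a query built by string interpolation with one that binds the input as a parameter.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user with a bound parameter, so input is treated as data."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


def find_user_unsafe(conn, username):
    """Deliberately vulnerable: the input is spliced into the SQL text."""
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


# Hypothetical fixture: an in-memory database with two users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
)

# A classic injection payload that turns the WHERE clause into a tautology.
payload = "nobody' OR '1'='1"

# The parameterized query finds no user with that literal name ...
assert find_user(conn, payload) == []
# ... while the interpolated query leaks every row in the table.
assert len(find_user_unsafe(conn, payload)) == 2
```

Assertions like these can run in a CI pipeline, so that a regression from bound parameters back to string concatenation fails the build.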

Continuous and Automated Validation for Tanzu Solutions on VMware Marketplace: What, Why, & How

VMware Marketplace is a one-stop shop for VMware customers to discover, try, and deploy various third-party and open source solutions onto their VMware environments. All deployable assets on VMware Marketplace are pre-tested on their respective VMware environments, which empowers users to deploy them with confidence.

VMware Introduces Continuous and Automated Validation for Its ISV and Ecosystem Solutions

If technology applications are the building blocks of enterprises today, developers comprise the masonry team. At VMware, we seek to empower application developers, architects, platform and digital teams alike by giving them the ability to choose the right set of tools for their unique development needs and goals. We build deep, meaningful partnerships with industry peers to support our customers’ choices across their full technology stacks.

VMware Tanzu SQL: MySQL at Scale Made Easy for Kubernetes

We are happy to announce that VMware Tanzu SQL with MySQL for Kubernetes 1.0 is generally available! Tanzu customers can easily run MySQL at scale on Kubernetes with this new release, which complements our existing Postgres engine for Kubernetes. Even better, with this new release Tanzu Advanced customers now have the two most popular open source operational databases included with their purchase.

Monitor kube-state-metrics v2.0 with Datadog

In order to manage complex containerized applications, modern devops teams need to have deep visibility into the status of their Kubernetes resources. By listening directly to the Kubernetes API, the open source kube-state-metrics service generates key metrics about your Kubernetes objects, including pods, nodes, and deployments, which are essential for understanding the status and performance of your clusters.

Top SRE Toolchain Used By Site Reliability Engineers

We have compiled a list of the most popular and sought-after tools (some you may have heard of) that SREs need in their toolkit at every phase of a production system to keep up with SRE best practices. Site reliability engineering (SRE) practices help organizations by ensuring the smooth functioning of their deliverables with utmost reliability and resilience. This can be achieved with a set of well-defined tools that are deployed at every phase of the production system to keep up with SRE best practices.

SRE fundamentals 2021: SLIs vs. SLAs vs. SLOs

A big part of ensuring the availability of your applications is establishing and monitoring service-level metrics—something that our Site Reliability Engineering (SRE) team does every day here at Google Cloud. The end goal of our SRE principles is to improve services and in turn the user experience. The concept of SRE starts with the idea that metrics should be closely tied to business objectives. In addition to business-level SLAs, we also use SLOs and SLIs in SRE planning and practice.
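
The relationship between an SLI and an SLO can be sketched with a little arithmetic. The example below assumes a simple availability SLI defined as the ratio of good events to total events, and an error budget defined as the share of events the SLO allows to fail. The function names and the request counts are hypothetical and do not come from the original post.

```python
def availability_sli(good_events, total_events):
    """SLI as the fraction of events that were served successfully."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(slo_target, good_events, total_events):
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # The SLO leaves (1 - target) of all events free to fail.
    allowed_failures = (1 - slo_target) * total_events
    actual_failures = total_events - good_events
    return 1 - actual_failures / allowed_failures


# Hypothetical month: one million requests, 400 of which failed,
# measured against a 99.9% availability SLO.
total = 1_000_000
good = 999_600
sli = availability_sli(good, total)
remaining = error_budget_remaining(0.999, good, total)

print(f"SLI: {sli:.4%}")
print(f"Error budget remaining: {remaining:.1%}")
```

With these numbers the SLO allows about 1,000 failed requests, so 400 failures leave roughly 60% of the budget unspent.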

Understanding The AWS Shared Security Model

Whether you are new to AWS or have been to every re:Invent since 2012 you may have questions about cloud security and how it impacts your valuable technology and data. In particular, you might be wondering where AWS’s security responsibilities end and where yours begin? Which parts of the cloud can you rely on Amazon’s security team and technology to keep safe and which parts must you take care of?

4 Key Characteristics of Modern Monitoring

Our previous post, “Monitoring for Success: What All SREs Need to Know,” discusses how today’s complex IT environments — virtualization, cloud computing, continuous delivery and integration — coupled with pressures to deploy faster while meeting demands for “always on” customer expectations – have placed greater strains on monitoring teams.

Accelerating Code Quality with DORA Metrics

What do Google’s DevOps Research and Assessment (DORA) and Rollbar have to do with each other? DORA identified four key metrics to measure DevOps performance and identified four levels of DevOps performance from Low to Elite. One way for a team to become an Elite DevOps performer is by focusing on Continuous Code Improvement.

Diagnosing Database Performance Problems When You Aren't a Database Administrator

Deep specialization of IT administrators is a luxury only the largest organizations can typically afford. Smaller organizations rely on IT administrators with a more generalist skill set because they are—by necessity—responsible for a wide array of different technologies, and there simply isn’t time to specialize in the intricacies for any one of them. Yet modern IT is intricate.

Failover Conf 2021 Wrap-Up

That’s a wrap! Gremlin hosted Failover Conf 2: Fail Smarter on April 27, 2021. In attendance were over 500 SREs, developers, sales engineers, product managers, DevOps experts, C-level execs, and other reliability pros from around the globe! This year’s conference included discussions around the future of DevOps, strategies for building reliable teams, analyzing human error to create better systems, and more.

Cloud-Hosted or Cloud-Native? Discover Why Cloudsmith Was Born in the Cloud

Today, almost every service is offered in a “Cloud” variant. But what does that really mean? Are all cloud services equal? It’s easy to see why so many vendors rush to add a Cloud edition/variant of established software they sell. Undoubtedly, there has been a move to Cloud services across the industry, as more and more organizations seek to take advantage of the higher reliability and lower total cost of ownership that Cloud platforms promise.

Using Distributed Tracing in Microservices Architecture

With the rise of microservices-based cloud applications and their corresponding complexities, the need for observability is greater than ever. This blog looks into the what and why of distributed tracing, along with a few best practices for adopting it in a microservices architecture. Distributed tracing for microservices architecture is an emerging concept that is gaining momentum across internet-based business organizations.

Kubernetes Ingress 101: Services, Ingresses, Load Balancers, and Certificate Management

In this video, Oleg, CTO at Kublr, will explain the basics of Kubernetes (K8s) Ingress traffic management functionality and how it can be used to simplify managing applications across different environments – in the cloud or on premise. Oleg will use a demo environment with clusters in different clouds to show K8s Ingress in practice.

Register for the Qovery v2 beta now!

When we launched Qovery in January 2020, our product was still a prototype, and we onboarded 53 developers to help them deploy their apps in the cloud. At the time, there were only 2 of us on the team, and our first employee (Patryk Jeziorowski) decided to join us after being one of our first users. 18 months later, 3004 developers from more than 110 countries use Qovery to deploy their apps on their AWS and Digital Ocean accounts.

Run Codefresh pipelines on a Bottlerocket Kubernetes cluster

In August 2020, Amazon announced Bottlerocket OS, a new open source Linux distribution that is built specifically for running container workloads. It comes out of the box with security hardening and support for transactional updates, allowing for greater ease in automating operating system updates, maintaining security compliance and reducing operational costs. Bottlerocket is designed to be able to run anywhere and, at launch, has a pre-built variant for Amazon EKS.

Testing in Production: How Did We Get Here?

Testing in production simply means testing new code changes in production, with live traffic, in order to test the system’s reliability, resiliency, and stability. It helps teams solve bugs and other issues faster, as well as effectively analyze the performance of newly released changes. Its overall purpose is to expose problems that can’t be identified in non-production environments for reasons that may include not being able to mimic the concurrency, load, or user behavior.

ICYMI: How Honeycomb Can Help You Achieve the Deployment Part of CI/CD

In case you missed it, this webinar includes code walkthroughs that help you to add observability to your pipelines (using a free Honeycomb account!) so that you and your team can speed up your deployments to prod. This is also a risk-free way to get started with observability if your team isn’t quite yet ready to change your production apps.

Understanding the AWS Well-Architected Framework

Designing and running workloads in the cloud is complex. Many services need to fit together in just the right way for optimal performance. The opportunity for error lurks around every corner. This is a high-stakes game with a huge premium on getting things right from the beginning. Even small mistakes can snowball. To help, AWS studied the architectures of thousands of its customers and supplemented that learning with insights from experts.

What is Microsoft Power Automate Desktop? (Benefits Included)

To help users avoid repetitive desktop tasks, Microsoft has developed an extended service, Power Automate Desktop. Microsoft recently announced it and made it available to Windows 10 users. It is a new low-code robotic process automation (RPA) tool that lets businesses automate repetitive and other manual tasks, so that people can focus on higher-value work in their respective areas.

Automatically create and manage Kubernetes alerts with Datadog

Kubernetes enables teams to deploy and manage their own services, but this can lead to gaps in visibility as different teams create systems with varying configurations and resources. Without an established method for provisioning infrastructure, keeping track of these services becomes more challenging. Implementing infrastructure as code solves this problem by optimizing the process for provisioning and updating production-ready resources.

Kubernetes monitoring and troubleshooting made simple

Infrastructure monitoring was difficult enough when entire businesses ran off a few bare metal servers in a dusty, forgotten closet. Other IT infrastructure monitoring tools fell short, unable to provide complete and granular-enough metrics in real time, even when we were only dealing with a handful of systems responsible for running every part of the application stack.

Highly available Kubernetes in IoT: MicroK8s on RaspberryPi

Learn how to set up a Pi-Hole instance with a single command and a cluster of Raspberry Pis on MicroK8s. High availability, load balancing and Kubernetes configuration included. The Raspberry Pi 4 brings the graphics, RAM and connectivity needed for a Linux workstation, so why not use a cluster to set up your own Pi-Hole, the open source network-level ad blocker that acts as a DNS sinkhole or DHCP server?

The fastest SQL Change Automation, Azure DevOps and AzureSQL DB pipeline

Is it possible to set up an end-to-end deployment pipeline from Dev through to Production in Azure DevOps, with SQL Change Automation and Azure SQL DB in just 10 minutes? Join Chris Unwin, a Redgate Solution Engineer, as they try to break the record for setting up a migrations-first pipeline with SQL Change Automation from Dev to Continuous Integration and finally, deployment.

Fresh Springtime Product Updates: D2iQ Kommander 1.4 and D2iQ Konvoy 1.8 Are GA!

It’s that time again: the latest versions of D2iQ Konvoy and D2iQ Kommander have just been made generally available and the D2iQ Kubernetes Platform (DKP) has some powerful new features. As noted with our last update, DKP is the leading independent Kubernetes platform for enterprise grade production at scale and Konvoy and Kommander are the reason why. You can learn more about Konvoy here, Kommander here, and our general approach here.

API Discovery Is Now at Your Fingertips: API Portal for VMware Tanzu Is GA

Chris Sterling, Shruti Iyer, and Aditya Tripathi contributed to this blog post. APIs—the key component of any company’s microservices model—are driving digital transformation in modern enterprises. Indeed, “66 percent of organizations report using private or B2B APIs,” according to the Gartner report, “Create API Portals That Drive API Adoption Among Internal and External Developer Communities” by Akash Jain and Mark O’Neill, November 2020.

Civo official launch!

Countdown to our official production launch! We'll be giving you a behind-the-scenes look at how we build and provision a new CivoStack region, our custom Kubernetes platform based on K3s, including a specially recorded time-lapse build of our latest location. The event features an introduction from our CEO Mark Boost, and Director of Innovation Dinesh Majrekar will run you through our zero-touch region configuration.

Secure container orchestration at the edge

The cloud-native way of building software allows for consistency across developer environments and massive scalability of application deployments. Both these attributes are useful for edge, but create new challenges related to security and resilience. Watch this demo to see how Canonical’s modular technology stack addresses these challenges by using well-known cloud primitives.

Choosing the Best AWS Serverless Computing Solution

Serverless computing is becoming increasingly popular in software development due to its flexibility of development and the ability it affords to test out and run solutions with minimal overhead cost. Vendors like AWS provide various tools that enable businesses to develop and deploy solutions without investing in or setting up hardware infrastructures. In this post, we’ll cover the many different services that AWS provides for supporting serverless computing.

Splunk Log Observer: Log analysis built for DevOps

Log analysis is a key part of getting answers from your stack, and Splunk Log Observer, part of the Splunk Observability Cloud, is built for fast, powerful log analysis. Trust the industry-leading expert on logs to help you draw insights fast from any volume of data, in real-time, without having to write any queries by hand.

Kubernetes: Weighing Advantages and Disadvantages

Kubernetes is one of the current leading technologies. Its adoption has seen tremendous growth in the past few years. Containers appear set to become the predominant medium of software development and deployment in the near future. Containers help maintain consistency across various platforms, as they pack an application with its dependencies to help move it from one platform to another.

Announcing Ribbon Voice Sync for Regional and Rural Service Providers

Regional and rural providers are challenged every day to keep their network together. Time, obsolescence, razor-thin margins, and changing customer expectations are conspiring to pull it apart. Watch on-demand to learn how Ribbon is synchronizing multiple elements of our portfolio into a new and better solution for regional and rural providers.

Failover Conf follow-up: Your team and culture questions answered!

Thank you all for joining us last week for Failover Conf 2! We had a great turnout this year, with over 1,800 participants, 20 sponsors, and 9 amazing sessions. After more than a year of virtual events and video calls, we know that Zoom fatigue is real. We tried to make this event different by finding new ways to bring the community together and thinking of fun new ways to shake up the conference formula.

Launching Argo CD Autopilot: An Opinionated Way to Manage Your Applications Across Environments Using GitOps at Scale

Argo CD has been skyrocketing in popularity, with the CNCF China survey naming Argo as a top CI/CD tool for its power as a deployment automation tool. And it’s no wonder: GitOps is a faster, safer, and more scalable way to do continuous delivery. Most of our own users are embracing GitOps to manage infrastructure and applications at scale in gaming, finance, defense, media, and other industries.

Out with GraphQL, in with gRPC

At Speedscale, we’re always trying to find ways to iterate faster and reduce developer toil. In line with that mission, we slant our engineering decisions towards using cutting edge tech because we usually move faster and it also allows us to help our customers later on when they upgrade their own tech stack. Recently, we had the opportunity to upgrade the communication channel between our api-gateway and react front end. This journey provided some unexpected benefits.

Managing Users and Groups with SCIM in the JFrog Platform

As your organization grows, managing the lifecycle of users and groups becomes a significant challenge. Your company grows rapidly, hiring new employees and giving them access to more and more of the applications that your organization uses. This means that there are many employee-related actions that need to be taken when an employee changes team or role, or leaves temporarily or permanently (otherwise you may end up with operational, security or compliance issues).

DevOps vs. Agile

DevOps is a term for “a cross-disciplinary practice dedicated to the study of building, evolving and operating, rapidly-changing resilient systems at scale.” (Jez Humble) There is no wall between development and operations, so they work simultaneously and without silos. The system focuses on uniting the development and operations teams in a continuous process. Agile is a software development strategy that focuses on responding to change with cross-functional team communication.

What is Enterprise Architecture & How to Develop .NET-Based Enterprise Architecture?

Developing a holistic enterprise architecture is the first step to gaining a firm grip on the evolution and management of an organization. Enterprise architecture enhances Business Process Improvement and significantly optimizes costs by standardizing technology, two of the most crucial factors that influence the ROI of an organization. The enormous efficiency and cost savings that enterprise architecture brings about have strengthened the belief in enterprise architecture today.

Monitor these Metrics to Keep your Servers Controlled

By definition, a server is a piece of computer software or hardware that provides functionality to other devices or programs, called clients. System administrators often ask a common question about server performance: why is my server down? If server monitoring and management are inefficient, it becomes very difficult to correctly analyze complex and unpredictable information in a data center. It’s hard to find the reason for a server outage.

Announcing HAProxy Kubernetes Ingress Controller 1.6

We’re proud to announce the release of version 1.6 of the HAProxy Kubernetes Ingress Controller. This version provides the ability to add raw configuration snippets to HAProxy frontends, allows for ACL/Map files to be managed through a ConfigMap, and enables complex routing decisions to be made based on anything found within the request headers or metadata.

LogicMonitor's Certified Ansible Content Collection Allows You To Do More With Less

Here at LogicMonitor, we’re really big on extensibility and automation. We’re constantly adding to our catalog of monitoring coverage, and we spend a lot of our time ensuring that setup is as simple as possible. We also monitor almost any data you can expose on a network. People have done way more with LogicMonitor than we would have ever imagined, and I’m extremely excited to announce our next step in that commitment to extensibility and automation.

Announcing Native Integration for Hashicorp Vault Secrets

Secret management is one of the most critical areas in deploying and running applications. Codefresh already supported native Kubernetes secrets and custom secrets on the Codefresh Runner, but more and more customers have asked us for native support for Hashicorp Vault. Today we are pleased to announce our native integration with Hashicorp Vault as another secret provider for Codefresh pipelines.

Model-driven audit trail infrastructure

Graylog is one of the most popular tools for open source monitoring and log management in telco environments. We will show how Charmed OSM, with the help of Juju, eases its deployment and integration with MongoDB, Elasticsearch, and other charmed telco network function elements. The same goes for a basic LMA stack with Prometheus and Grafana.

Improve your Reliability with Blameless SLOs, Now Generally Available

Blameless is excited to announce that our SLO Manager is now generally available! SLO Manager is a new service added to the Blameless platform. This service helps SRE and engineering teams proactively make data-driven decisions about reliability efforts. According to a survey Blameless conducted, over 80% of organizations use SLOs or will in the next 1-2 years.

CostOps: The Overlooked Developer Responsibility

Developers are the kingmakers. Millions of decisions made by tens of thousands of developers are ultimately responsible for the triumph or tragedy of IT. Developers for commercial vendors, open-source projects, cloud and software as a service (SaaS) solutions, managed service providers (MSPs), and internal teams make most technology decisions far upstream from IT pros. This essentially defines ops’ role as the crew who finds a way to make washing machines fly in formation.

How to Connect the Dots: Creating Complex CI/CD with JFrog Pipelines

As software gets more complex, so do software builds. With applications being composed of multiple services — often developed by separate teams — it can be challenging to automate a unified continuous integration process. JFrog Pipelines is unique among DevOps CI/CD solutions in empowering developers to create highly complex DevOps Pipeline workflows. Pipelines can be defined with multiple paths, trigger points and trigger types.

Optimized billing and customer management for AWS MSPs

Cloud MSPs or managed service providers are great at helping companies properly leverage the public cloud, typically handling cloud strategy, implementation and day-to-day operations for their customers. However, when it comes to things like customizable billing, analyzing cloud spend per customer, optimizing cost and increasing profit margins, MSPs are over-burdened with complex, manual processes.