October 2023

How to Manage, Unmanage, and Move Clusters Between Cluster Groups

Learn how to move a cluster using the VMware Tanzu Mission Control UI and how to unmanage and manage a cluster. A managed cluster can be lifecycle managed by Tanzu Mission Control even if it was created outside of the tool. This option goes beyond attaching a cluster because it includes lifecycle management (LCM) capabilities, while attached clusters do not include LCM support.

Building on Chaos Toolkit's Foundation: New Features for Resilience Engineering

On October 26th 2023, we had the pleasure of receiving Manuel Castellin, a seasoned expert in chaos engineering and Terraform, who took us through two real-world examples demonstrating how to overcome the challenges of implementing chaos engineering when your infrastructure isn’t initially prepared for it and securely experiment on production systems. In the second part of the meetup, Sylvain Hellegouarch, Chaos Toolkit lead developer and Reliably CEO, showed a quick demo of how to use Reliably to build your experiments in a less code-centric and more visual way.

Replay messages in Azure Service Bus dead-letter queue

When working with asynchronous messaging patterns (in this case, using Azure Service Bus), you will find many scenarios, depending on the requirements, in which you need to reprocess messages. Sometimes a message failed because a system was offline for a certain period, or there was a bug in the service and specific messages need to be resent, among many other reasons.

Spot Connect: Reflecting on progress and innovation

As we make Spot Connect available for all users today, we’re taking a moment to look back at the journey we’ve embarked upon since its beta release. Spot Connect made its debut with a clear vision—to revolutionize cloud operations by automating workflows, and the response from the Spot by NetApp community has been instrumental in shaping its evolution. Here’s a retrospective on the key milestones and lessons we’ve gained during this transformative year.

Harnessing PowerShell to Optimize SharePoint Online Management

In the realm of collaborative digital environments, SharePoint Online emerges as a quintessential platform fostering seamless interactions, data sharing, and project coordination among teams. As part of the Microsoft 365 suite, it encapsulates a rich array of features tailored to meet the diversified needs of modern organizations.

Webinar: Shining a light on developer productivity

In the last 5 years, we’ve watched the world's fastest growing engineering teams ditch development monoliths in favor of service-oriented architectures that speed time to market. And as microservices multiplied—making it harder to track ownership and quality—Internal Developer Portals (IDPs) emerged to help. But while the prospect of a single portal for developer productivity sounds enticing, veteran leaders know the perception of “one more tool” can make org-wide adoption challenging.

How do you measure software health?

Just like personal health, software health is best managed proactively so you can prevent issues before they occur and avoid costly, stressful outages. Cortex helps you track and improve the health of your software with Scorecards and Initiatives. Scorecards quantify software health by aggregating data from multiple sources to give you a continuous view into the health of your system. Initiatives use Scorecards to drive organizational improvement.

Implementing Backstage 3: Integrating with Existing Tools Using Plugins

This third part of the “Implementing Backstage” series explains how to integrate Backstage with existing tools and plugins. If you’re at an earlier stage of your Backstage implementation, the two previous installments in this series focus on getting started and using the core features. If you’re looking for a more general introduction to Backstage, you can read the first article in the “Evaluating Backstage” series.

AWS vs GCP: Choosing the Best Cloud Provider for Your Needs

If you are confused about whether to go for AWS or Google Cloud (GCP) in 2023, then you should read this article. Both AWS and GCP have made tremendous progress over the past few years. There was a time when AWS was much ahead of Google, but not anymore. Google has reduced the gap to a great extent; it even surpasses AWS in some areas, especially data analytics. However, there is no one-size-fits-all answer to this problem.

Logic Apps Standard: The developer experience

In this session, "Logic Apps Standard: The developer experience", Wagner Silveira, Senior Program Manager at Microsoft, offers an insightful dive into Logic Apps Standard from a developer's lens. He delves into the existing Logic App model, differentiating between Built-in and Azure connections, and highlighting the distinction between portal and local Logic App workflow development. Uncover the latest developer tools, including the enhanced designer 2.0, data mapper for visual component mapping, custom code support, and the convenient export tool.

Running MongoDB on Kubernetes

Containers are a lightweight, portable, and consistent way to package applications and their dependencies. Containers provide an isolated environment, ensuring an application runs reliably across different environments. Enterprises and tech-savvy individuals are using container technologies because of their benefits. However, container orchestration tools have become necessary to manage clusters with the rise in container usage.

5 Reasons Why You Should Migrate to the Cloud in 2023

Even though cloud migration is rapidly increasing globally, larger enterprises may hesitate to adopt this technology. This is typically the result of imagined obstacles like possible dangers, complex migrations, or a deficiency of specialized knowledge. Organizations that choose not to go to the cloud, however, run the risk of suffering a far more significant cost for their inaction. This article outlines the top five arguments for moving your company toward cloud computing and overcoming reluctance.

How environmental parity accelerates automotive software development

A lot of people, like myself, believe automotive is the most innovative sector today, especially when it comes to software. We are living through a critical moment in automotive, where evolution is being pushed onto the market. This rapid software shift is posing challenges that are complicating developers’ progress. Most of these challenges are hardware-related. Gaining access to target development hardware has become an impossible task, and the recent global microchip shortage did not help.

Weighing the Costs and Returns of Colocation

As businesses continue to expand their digital footprints, the demand for efficient and secure data management solutions has never been greater. Colocation, the practice of housing servers and IT infrastructure in third-party data centres, has emerged as an attractive option for enterprises seeking enhanced scalability, reliability, and cost-effectiveness. However, before making a decision on colocation, it is vital to carefully assess the financial aspects associated with this service.

Lessons Learned from a Digital Transformation in the Finance Industry

In May 2023, Thomas Kronawitter, Head of Data-Driven Applications & Services at Grenke AG, joined Redgate CPO David Gummer at the Gartner Data & Analytics Summit to provide insights and advice based on his own digital transformation journey in financial services. This post highlights the key take-aways from David’s conversation with Thomas, including the strategies, challenges, and successes that shaped Grenke’s experience.

The Significance of Event-Driven Architecture in IT Monitoring

In today's fast-paced digital world, the reliability and performance of IT infrastructure are critical to business success. Monitoring technology plays a pivotal role in ensuring that systems and networks operate seamlessly, and one such technology is Zenoss. This blog provides an in-depth look at Zenoss technology and sheds light on the importance of event-driven architecture in modern monitoring.

Migrating from Travis to GitHub Actions

For CFEngine we manage several public and private repositories of code in GitHub for our Open Source and Enterprise products. In order to ensure quality we run many checks on the code both with nightly builds as well as on each pull request. We use a Jenkins server for nightlies which also includes more extensive deployment tests on all of the platforms we support. Previously we had used Travis for many of these checks but that system started to show its age and limitations.

Plan new architectures and track your cloud footprint with Cloudcraft by Datadog

In a rapidly expanding, highly distributed cloud infrastructure environment, it can be difficult to make decisions about the design and management of cloud architectures. That’s because it’s hard for a single observer to see the full scope when their organization owns thousands of cloud resources distributed across hundreds of accounts. You need broad, complete visibility in order to find underutilized resources and other forms of bloat.

Interlink's Service Chain Mapping solution: Helping Banking & Finance Organizations Strengthen Operational Resilience and Meet Regulatory Requirements

Operational resilience is an increasing area of focus and scrutiny for regulators of the banking and financial services industry. In the European Union, the Digital Operational Resilience Act (DORA) looms on the near horizon - with equivalent regulatory frameworks slowly but surely rolling out across the globe.

Azure Functions Distributed Tracing

In today’s cloud-centric, serverless computing landscape, applications are increasingly distributed and complex, composed of numerous microservices, functions, and external dependencies. Azure Functions, a serverless compute service offered by Microsoft Azure, plays a pivotal role in building scalable, event-driven applications.

Introducing Squadcast's Global Event Rulesets | Incident Management | Squadcast

This video gives you a walkthrough of Squadcast's new feature, 'Global Event Rulesets', which helps you simplify alert routing and boost efficiency. Global Event Rulesets enable you to manage alert routing across services and automate actions based on predefined global event rulesets.

Dev X and Platform Engineering | Scaling Enterprise Agility | Atlassian

Guest host Nick Polce, Business Agility Practice Lead at Accenture, speaks with Mirco Hering, Global Offering Lead for Developer Experience and Platform Engineering at Accenture. The conversation, which was recorded live at the Agile Australia Conference, dives into the differing career models that can help engineers thrive in an Agile environment, with practical advice for leadership and project managers alike.

5 Ways to Streamline IT Operations With the UK's Northern Connectivity Hub

The Edinburgh South Gyle colocation data centre is the largest and most connected in Scotland, and one of the most connected north of London. There are 27 carriers available at the centre, more than double the number at any of our other data centres. It’s so large and well-connected that we refer to it as the UK’s Northern Connectivity Hub. But what does that mean for your business, and how can you benefit from the colocation services available?

How to Grow, Nurture, Engage, & Measure by Kim McMahon - Navigate Europe 23

Join Kim McMahon's engaging talk from Navigate Europe 23 as she delves into the dynamic world of open source communities and revenue generation. Learn how to strike the right balance between nurturing tech communities, respecting open source principles, and driving business growth. Kim shares practical examples and valuable insights, making this a must-watch for anyone interested in the intersection of open source and sustainable business strategies.

Secret to Flawless Deployments: Real-Time Canary Deployment tracking with Argo CD & Levitate!

Most of your outages are probably caused by a change, and having observability around that will make a lot of difference. Dive into this walkthrough, where we showcase tracking Canary deployments in Argo CD, correlating events and metrics seamlessly with Levitate. For Site Reliability Engineers, DevOps engineers, Software Engineers, and Product Managers seeking to elevate their observability and ensure smooth deployments every time.

Discover Civo's Latest Cloud Solutions: Kubernetes, GPUs, and More! - Navigate Europe 23

Discover Civo's latest product announcements in this engaging conversation with CTO Dinesh Majrekar, Field CTO Saiyam Pathak, and Chief Innovation Officer Josh Mesout. Learn about fast and affordable cloud solutions, including Kubernetes, managed databases, machine learning with GPUs, sustainable data centers, and cutting-edge WebAssembly services.

Netdata vs Prometheus: Performance Analysis

In an era dominated by data-driven decision making, monitoring tools play an indispensable role in ensuring that our systems run efficiently and without interruption. When considering tools like Netdata and Prometheus, performance isn't just a number; it's about empowering users with real-time insights and enabling them to act with agility.

Key Principles of Successful DevOps Implementation

In software development, DevOps has emerged as a game-changer. It’s not just a buzzword; it’s a cultural and technological shift that allows organizations to accelerate their software delivery while maintaining high quality and reliability. However, successful DevOps implementation is not merely about adopting a set of tools or following a predefined set of rules. It’s a holistic approach that requires a deep understanding of key principles.

Tips To Never Miss An Incident Notification With Squadcast Escalation Policies

Companies implement an Incident Response process to promptly resolve critical issues. Setting up escalation policies to notify engineers is a key step in this process. With traditional escalation policies, alert notifications still get missed which results in higher response times and failure to meet SLAs. So, how can one ensure incident notifications are never missed?

Opsgenie Alternatives: Finding the Right Fit for your Incident Management Teams

In the dynamic landscape of modern IT operations and Incident Management, choosing the right tool is paramount to ensuring the resilience of your organization. Opsgenie, a popular Incident Response and Alerting platform, has been a go-to choice for many. However, as businesses grow and requirements evolve, exploring Opsgenie alternatives becomes essential in the quest to find the perfect fit for your unique operational needs. In this blog, we'll embark on a journey to uncover and evaluate some compelling alternatives to Opsgenie, helping you navigate the vast sea of options and make an informed decision that aligns perfectly with your team's workflows and objectives.

Automation on the go with the Kelverion Portal Mobile App

The shift to remote working has seen an increase in demand to use smartphones to host meetings, check emails and message colleagues on Microsoft Teams. Keeping up with this trend, the Kelverion team have created a mobile app version of our popular Self-Service Automation Portal, which was previously a web-based solution typically used on a desktop device. The portal is available for iPhone devices and can be downloaded from the iOS App Store.

Optimizing SharePoint Online Costs

SharePoint Online (SPO) has become a cornerstone for many organizations seeking a robust, scalable, and collaborative platform. It’s a place where teams can seamlessly work together, share documents, and enhance their workflow efficiency. However, while SPO offers a plethora of benefits, the cost associated with its usage can be a potential hurdle, especially for businesses with large volumes of data.

DevOps & DORA Metrics: The Complete Guide

In order to achieve DevOps success, you must measure how well your DevOps initiatives work. Tracking the right DevOps metrics will help you evaluate the effectiveness of your DevOps practices. In this article, I’ll explain many DevOps metrics, including their significance, the key metrics for various goals, and, best of all, tips for improving the score of each DevOps metric discussed here.

Webinar: Streamlining Incident Management With Automation and Contextual Awareness

In the modern context of distributed teams & complex digital infrastructure, major incidents having a negative impact spanning multiple teams and services can cause a barrage of alerts. While a meticulously designed incident response strategy can aid in restoring order, it's essential to underscore the significance of providing responders with effective tools that offer contextual understanding and facilitate the identification of actionable alerts.

2023 State of DevOps Report Takeaways

Don: The debate is over - how should you structure your software teams? That question is now answered in this year's State of DevOps report 2023. Other questions answered include: How does AI affect my company and team performance? How can we quantify the impact of culture on performance burnout? What even is culture in the first place? All these things are included in the State of DevOps report 2023. We have a very special guest, Eric Maxwell from the DORA group, to offer his takes on the report.

Sustainable Futures: Revolutionizing Data Centers and Tech Practices - Navigate Europe 23

Join us at Navigate Europe 23 for a critical conversation on sustainability in the tech industry, focusing on the challenges and innovative solutions related to data centers. Our expert panelists Amanda Brock, Dinesh Majrekar, John Ridd, Mark Bjornsgaard & Mike Paisley share their insights and experiences, discuss the role of governance and regulation, and emphasize the power of individual and collective action in driving change. Whether you're in the tech industry or simply interested in sustainability, this discussion is not to be missed!

MSPs As NOCs, Handling Multiple Clients

A Managed Service Provider (MSP) should invest in an Incident Management platform to ensure seamless service delivery and customer satisfaction. Such a platform streamlines Incident Response, improves service reliability, and enhances communication among teams. It helps MSPs in reducing Mean Time to Acknowledge (MTTA) and Mean Time to Resolve (MTTR) incidents, thereby minimizing downtime and service disruptions.
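
To make the two metrics concrete, here is a minimal sketch of how MTTA and MTTR could be computed from incident timestamps. The `Incident` record and its field names are hypothetical and are not taken from any particular Incident Management platform; real tools may define the start and end points of each metric differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record; field names are assumptions for illustration.
    triggered_at: datetime
    acknowledged_at: datetime
    resolved_at: datetime


def mtta(incidents: list[Incident]) -> timedelta:
    """Mean time from an incident being triggered to being acknowledged."""
    seconds = [(i.acknowledged_at - i.triggered_at).total_seconds() for i in incidents]
    return timedelta(seconds=mean(seconds))


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time from an incident being triggered to being resolved."""
    seconds = [(i.resolved_at - i.triggered_at).total_seconds() for i in incidents]
    return timedelta(seconds=mean(seconds))


# Two sample incidents with made-up timestamps.
start = datetime(2023, 10, 1, 9, 0)
sample = [
    Incident(start, start + timedelta(minutes=2), start + timedelta(minutes=30)),
    Incident(start, start + timedelta(minutes=4), start + timedelta(minutes=60)),
]
```

With the sample data above, `mtta(sample)` is 3 minutes and `mttr(sample)` is 45 minutes.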

Unleashing the Full Potential of FinOps: Going Beyond the Cloud

FinOps is beginning to take the enterprise by storm, but many enterprise IT leaders may be taking too narrow a view and risk falling into a trap. There’s a good reason for this upswing in FinOps attention: runaway cloud costs are becoming a significant challenge for enterprise IT leaders, particularly as they move legacy workloads to the cloud in earnest.

How to measure operational maturity

All of the most reliable software is driven by great operations. Your organization’s operational maturity is a measure of how consistently you apply best practices for building reliable software. Without tracking your operational maturity, it’s extremely difficult to know where and how to improve—before it’s too late and an incident causes you to lose a customer.

How to Set Up SharePoint Alerts

Microsoft SharePoint stands as a robust web-based collaboration platform that has become indispensable for a myriad of organizations aiming for streamlined and effective workflow management. Its seamless integration with Microsoft 365 unfolds a vista of functionalities, among which the feature of real-time notifications or alerts shines prominently.

AI Explainer: Glossary of Artificial Intelligence Terms

I speak with customers and partners pretty much every week about artificial intelligence. The knowledge levels can differ quite dramatically — some are quite AI savvy while others find the jargon bewildering. This is quite understandable as AI is a rapidly evolving field with its own set of specialized terminology. This blog post is purely meant to provide a beginner-friendly reference for some essential AI terms to make it easier to navigate conversations and articles on the topic.

AppDynamics Talks Optimized Self-healing with Full-stack Observability, Auto-remediation

From an IT perspective, technologists generally agree that the ability to monitor and have visibility into the IT stack across every one of their applications is essential with the now-permanent remote and hybrid work models. It also stems from the fact that digital transformation and IT growth has accelerated by seven years since the pandemic in 2020, analysts say.

The DevOps Security and Compliance Guide

The fast-paced nature of modern software development means developers are capable of deploying changes to production multiple times a day. But, while DevOps allows development teams to deliver new features faster, increased deployment frequency can make it more difficult to stay on top of security threats. It only takes one malicious or incompetent change to dramatically increase the risk exposure of an application.

Running Containers On IPv6 Networks

What better motivation to start adopting a technology than the need to completely replace the alternative? From a world driven on limited resources like gas, coal, and oil to wind, hydro, nuclear, geothermal, and solar… we find ourselves, as a species, evolving past a first generation we envisioned as abundant, but turned sparse. Adopting IPv6 networking is another version of this same story.

Navigating the Future of Technology and Business with Jo Drake & D. Majrekar - Navigate Europe 23

Join Jo Drake and Dinesh Majrekar at Navigate Europe 23 for a comprehensive discussion on the future of technology and business. They share valuable insights on AI, cloud computing, and emerging technologies, emphasizing efficiency, security, and sustainability. Dive into the world of tech with experts and explore the skills and practices needed for the evolving landscape. Don’t miss the wisdom of continuous learning and networking in the fast-paced world of technology!

GitKon 2023 Day 3: Legendary, Free Online Developer Conference

Welcome to GitKon 2023: The Fellowship of Code, hosted by GitKraken. GitKon promises three days of insight, learning, and inspiration specifically designed to help software developers, team leaders, product and project managers, and technical executives up their game. Featuring keynotes from tech titans, Dharmesh Shah of HubSpot and Justin Cormack of Docker, and a special guest appearance by The Lord of the Rings star and mental health advocate, Sean Astin.

RapidSpike + Squadcast: Routing Alerts Made Easy

RapidSpike is a website monitoring solution that focuses on all three key aspects of website health: performance, reliability and security in a single dashboard. If you use RapidSpike for your website monitoring requirements, you can integrate it with Squadcast, an end-to-end Incident Response tool, to route alerts from RapidSpike to the right users in Squadcast with ease.

Why It's So Complex To Build an Internal Developer Platform on Kubernetes?

The modern software landscape thrives on the efficiency and automation that Kubernetes brings to the table. Its orchestration prowess forms the bedrock of an Internal Developer Platform (IDP). However, converting this technical marvel into a developer-friendly haven is a pursuit that demands meticulous attention and a vast amount of unseen effort.

Unveiling the Synergy: SharePoint Online with Azure AD

In the modern digital workspace, collaboration and security are paramount. Microsoft has been at the forefront of providing solutions that enhance these aspects of organizational operations. Two such solutions are SharePoint Online and Azure Active Directory (Azure AD). SharePoint Online, a cloud-based service, empowers organizations to create, share, and manage documents and content in a collaborative environment.

6 Best Azure FinOps Tools for Cost Optimization

FinOps is an evolving concept increasingly practiced in cloud computing organizations to manage and optimize their infrastructure cost. It requires team collaboration among Finance, Engineering and IT Operations to gain a deep understanding of the expenditure, take financial accountability, and make informed decisions to maximize the business performance.

Artificial Intelligence: Friend or Foe?

From the science fiction fantasies of the mid-20th century to today's reality, AI's journey has been a blend of innovation and apprehension. As we contemplate the future of AI, it’s interesting to look back at the early days of AI, how far it’s come and what we might yet expect. AI has the potential to be of huge benefit but could be disruptive in the wrong hands, particularly in the realm of cybersecurity.

What is a Pull Request and Why You Need Them

As an engineer, you're probably familiar with version control systems like Git that let you track changes to your codebase. But are you using one of the most useful features built around Git: pull requests? If not, you're missing out. Pull requests are one of the best ways to collaborate on projects and create better code. In this article, we'll go over what a pull request is, why you should be using them, and how to create your own pull requests.

Unlocking IT Transformation: Synergizing ITIL and AIOps for Enhanced Monitoring and Observability

In the ever-evolving landscape of IT operations, staying competitive and efficient is a paramount concern for organizations. This session aims to be your compass on this transformative journey, offering strategic insights that resonate with both C-suite executives and IT team leaders. Join UnityTech CEO & Founder Jesus Cordoba for a dynamic exploration of how the ITIL framework, DevOps practices, and AIOps technologies can synergize to provide your teams and executives with unprecedented superpowers in the realm of monitoring and observability.

The Dangers Lurking in Open Source Software

This is the first blog in our series on securely consuming OSS. Today, I'll give an overview of some of the most common types of attacks that come with consuming OSS. Open-source software (OSS) fuels innovation. Over 96% of commercial applications rely on at least one OSS component (Synopsys, 2023). At Cloudsmith, we champion OSS and understand its indispensable role in today's software landscape. However, the escalating threat of supply chain attacks targeting OSS demands a robust defence.

GitKon 2023 Day 2: Legendary, Free Online Developer Conference

Welcome to GitKon 2023: The Fellowship of Code, hosted by GitKraken. GitKon promises three days of insight, learning, and inspiration specifically designed to help software developers, team leaders, product and project managers, and technical executives up their game. Featuring keynotes from tech titans, Dharmesh Shah of HubSpot and Justin Cormack of Docker, and a special guest appearance by The Lord of the Rings star and mental health advocate, Sean Astin.

What is Continuous Delivery? The Benefits of a Well-Tuned Continuous Delivery Software Pipeline

What is continuous delivery? And what are the benefits of the continuous delivery pipeline? This strategy has evolved in a world where platform engineering is on the rise and more and more organizations rely on automation through code to achieve their goals. Times have changed. Most organizations now rely on continuous delivery as an essential part of their development pipelines.

What Is Lift And Shift? Is It Right For You?

There are several ways to migrate to the cloud today, none of which is an equal path to modernizing on-premises applications and workflows. Lift and shift migration promises cost savings, speed, and less effort compared to other cloud migration strategies. But are these claims true? In this guide, we’ll cover whether a lift and shift migration strategy actually saves you money and effort when migrating to the cloud.

Elevating Document Management with SharePoint Document Libraries

In the digital era, effective document management is a cornerstone for operational efficiency in organizations. With a surge in data generation and collaboration needs, having a robust system to store, manage, and share documents is imperative. SharePoint Document Libraries emerge as a pivotal tool in this regard, offering a myriad of features to streamline document management, enhance collaboration, and uphold information governance standards.

How to fix and prevent ImagePullBackOff events in Kubernetes

You'll often hear the term "containers" used to refer to the entire landscape of self-contained software packages: this includes tools like Docker and Kubernetes, platforms like Amazon Elastic Container Service (ECS), and even the process of building these packages. But there's an even more important layer that often gets overlooked, and that's container images.

Monitoring vs Observability: What Engineers Need to Know

As systems increasingly shift towards distributed architectures to deliver application services, the roles of monitoring and observability have never been more crucial. Monitoring delivers the situational awareness you need to detect issues, while observability goes a step further, offering the analytical depth to understand the root cause of those issues. Understanding the nuanced differences between monitoring and observability is crucial for anyone responsible for system health and performance.

Understanding the EU Green Deal and Its Impact on Data Centers

Organizations in Europe are currently facing the challenge of reducing energy consumption and improving sustainability in light of the European Green Deal. The EU Green Deal, approved by the European Commission, focuses on decreasing greenhouse gas emissions by 55% compared to 1990 levels by 2030. Europe is striving to be the first climate-neutral continent by 2050.

Platform Engineering - Paving the Way to Accelerate Releases

Platform engineering started appearing in 2017 and was identified as a distinct DevOps role in 2019 in the book Team Topologies. Adoption quickly accelerated, and by 2022 there were platform engineering meetups worldwide, and over 6,000 DevOps professionals attended Platform Con, the first platform engineering conference. With such a fast rise in popularity, many technology professionals still need clarification on the role and how it fits into the DevOps methodology.

Qovery's Vision: Shaping the Future of Internal Developer Platforms

In a landscape inundated with tools and technologies, the real challenge for companies is not just about having an array of options but about ensuring these options harmoniously fit into their unique technical environments. Qovery understands this, and it’s evident in the modularity of its ecosystem. But the journey doesn’t end here. Looking ahead, Qovery envisions a paradigm shift in the way Internal Developer Platforms (IDP) are perceived and utilized.

Policy as Code Tools + Examples to Make Better Infrastructure Easier, Anywhere

You’re scaling your IT infrastructure so you can do more – deploying across clouds and data centers, adding servers, coding like crazy. Great! But how do you keep it all from falling apart? Policy as code is an approach to managing IT that strategically leverages infrastructure as code (IaC) and compliance as code to manage consistent policies across complex IT environments. Sounds perfect, right?

Kubernetes Unpacked: Driving Enterprise Success with Cloud-Native

Supercharge your production timelines by watching "Kubernetes Unpacked: Driving Enterprise Success with Cloud-Native Architecture." In this video, Andreas Prins, StackState's CEO, investigates why Kubernetes has emerged as the leading OS of the cloud and delves into why businesses worldwide are choosing it as their container orchestrator.

Why Invest in Tooling? Benefits and Concerns

When looking to invest money in your engineering teams, what gives the best return? Hiring more staff to enable bigger projects and more diversified skill sets? Training engineers to uplevel their ability and productivity? Increasing salaries to retain the best talent? These are all great ideas that should be exercised often. But there’s one other investment worth considering that can offer huge benefits for relatively small amounts of money: tooling.

Introducing enhanced webhook security

We are excited to announce webhook secrets, a powerful new feature that will provide an extra layer of security for your webhook payloads in Bitbucket Cloud. With the ability to add secrets to webhooks, you can now sign webhook payloads to ensure they are coming from Bitbucket Cloud and protect against unauthorized access.
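
As a rough illustration of how a receiver can check a signed webhook payload, the sketch below computes an HMAC-SHA256 over the raw request body with the shared secret and compares it with the signature sent alongside the request. The `sha256=<hex>` signature format and the function names are assumptions made for this example, not details taken from the announcement; check the provider's documentation for the exact header and format.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Return an HMAC-SHA256 signature of the raw body (assumed 'sha256=<hex>' format)."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(secret: str, body: bytes, received_signature: str) -> bool:
    """Check a received signature against one computed locally from the same secret."""
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))
```

The important design point is to sign the raw bytes exactly as received, before any JSON parsing, because re-serializing the payload can change the bytes and invalidate the signature.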

Spot by NetApp leads the way in GigaOm Radar for Cloud Management Platforms

The Spot by NetApp portfolio has again been recognized as a leader and outperformer in the GigaOm 2023 Radar for Cloud Management Platforms (CMPs). This industry analyst report highlights key CMP vendors whose primary purpose is to help organizations manage the increasing complexity of cloud environments and control costs more effectively.

Optimizing SharePoint Security

In today’s digital-first business landscape, collaborative platforms like Microsoft SharePoint are not merely a convenience but a necessity. They facilitate seamless interaction, information sharing, and collective project management across geographically dispersed teams. However, the enhanced connectivity and accessibility come with a set of security challenges.

Getting started on alerts with Escalation Policies

Escalation policies are essential for making sure that incidents are quickly addressed and resolved. They provide a systematic way to automate alerts and ensure that no incident goes unnoticed. Let’s get you started, shall we? The first point of contact for an incident is an alert that is sent according to the escalation policy.
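
To make the idea concrete, here is a minimal sketch of an escalation policy as an ordered list of levels, each naming who to notify and after how many minutes without acknowledgement. The class, field names, and timings are hypothetical and do not reflect any particular product's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationLevel:
    notify: str          # hypothetical target, e.g. a person or an on-call rotation
    after_minutes: int   # minutes without acknowledgement before this level is alerted


def targets_to_notify(policy: list[EscalationLevel], minutes_unacknowledged: int) -> list[str]:
    """Return every target whose escalation threshold has already been reached."""
    return [
        level.notify
        for level in policy
        if level.after_minutes <= minutes_unacknowledged
    ]


# Example policy (all values are illustrative assumptions).
example_policy = [
    EscalationLevel(notify="primary on-call", after_minutes=0),
    EscalationLevel(notify="secondary on-call", after_minutes=15),
    EscalationLevel(notify="engineering manager", after_minutes=30),
]
```

With this policy, an incident that has gone 20 minutes without acknowledgement would have reached the primary and secondary on-call engineers but not yet the manager.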

5 Ways Companies Gamified FinOps To Drive A Cost-Aware Engineering Culture

Getting engineers to take action to optimize spending can feel like an eternal struggle for many cloud-based companies. In fact, this particular issue is consistently considered to be one of the number-one cost challenges modern companies face. Thankfully, it doesn’t have to be this way. Below are a few creative ways companies have gamified FinOps practices to make the learning process more interesting and rewarding for engineers.

What is Infrastructure as Code? An Introduction to IaC

Infrastructure as Code, or IaC, is the practice of automatically provisioning and configuring infrastructure using code and scripts. IaC allows developers to automate the creation of environments to generate infrastructure components rather than setting up the necessary systems and devices manually.
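
Many IaC tools work declaratively: the code describes the desired infrastructure, and the tool works out what must change to get there. The sketch below illustrates that comparison step only, using plain dictionaries; the resource names and attributes are made up for the example, and no real tool's API is implied.

```python
def plan(desired: dict[str, dict], current: dict[str, dict]) -> dict[str, list[str]]:
    """Compare desired and current resources and list what to create, update, or delete."""
    create = sorted(name for name in desired if name not in current)
    delete = sorted(name for name in current if name not in desired)
    update = sorted(
        name
        for name in desired
        if name in current and desired[name] != current[name]
    )
    return {"create": create, "update": update, "delete": delete}


# Hypothetical resources: what the code declares vs. what currently exists.
desired_state = {
    "web-server": {"size": "medium", "count": 3},
    "database": {"size": "large", "count": 1},
}
current_state = {
    "web-server": {"size": "small", "count": 3},
    "old-cache": {"size": "small", "count": 1},
}

changes = plan(desired_state, current_state)
```

Because the plan is derived from the declared state rather than from a sequence of manual steps, running it again after the changes are applied yields nothing left to do.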

Platform.sh partners with Elasticsearch to supercharge our applications

We are excited to announce that Platform.sh now offers the latest releases of Elasticsearch to all of our customers going forward! Take full advantage of the new features of Elasticsearch version 8 and integrate more artificial intelligence (AI) into your applications today.

Six Types of Metrics Product Managers Should Know

Setting and tracking appropriate target metrics is an important part of a product manager’s job. Goals must be defined, not just as inspiring vision statements, but also as quantifiable targets that can be objectively measured. Metrics can be deployed in different contexts and for different purposes; however, metrics that are useful in one scenario can be misleading in another. The problem comes when you’re not clear about what kind of metric you’re trying to set.

12 Best Practices to Improve Incident Management

Today’s fast-paced digital world can lead to system breakdown and disruptions that strain organizational resources. What truly distinguishes successful organizations is their response when problems occur. Incident management serves this function. At its core, incident management involves teams managing unexpected disruptions quickly with minimal impact to users or business operations. The process is like a safety net that prevents further problems from developing into trust issues.

Optimize your infrastructure with CloudNatix and Datadog

CloudNatix is an infrastructure monitoring and optimization platform for VMs, containers, and other cloud resources. Customers can use CloudNatix’s Autopilot feature to automatically configure and run infrastructure optimization workflows that allocate and run their resources more efficiently. CloudNatix can take action to auto-size Kubernetes and VM workloads, defragment Kubernetes clusters, and create harvest pods from unused VMs, among other key optimizations.

GitKon 2023 Day 1: Legendary, Free Online Developer Conference

Welcome to GitKon 2023: The Fellowship of Code, hosted by GitKraken. GitKon promises three days of insight, learning, and inspiration specifically designed to help software developers, team leaders, product and project managers, and technical executives up their game. The event features keynotes from tech titans Dharmesh Shah of HubSpot and Justin Cormack of Docker, and a special guest appearance by The Lord of the Rings star and mental health advocate Sean Astin. Learn coding tips and techniques, explore emerging technologies, and get lessons in better team communication and collaboration.

The price of building your own incident management tool is not what it seems.

Build or buy? An age-old decision that gets made dozens of times a year. It’s quite possibly one of the most important decisions you make as a company. It impacts roadmaps, productivity, team structure, and customer satisfaction (you know, just a few little things). There are a lot of factors to consider, one of the most prominent being cost. So, what exactly are the costs you need to consider when building your own incident management solution?

From Development to Deployment: Streamlining Workflows with IDPs

Ever wondered how software development teams can efficiently tackle the complexities of modern development challenges? The answer lies in Internal Developer Platforms (IDPs), a powerhouse of tools and capabilities. These platforms provide a comprehensive ecosystem for development and deployment, integrating key functionalities such as version control, CI/CD pipelines, container orchestration, and automated testing.

Aurora Vs. RDS: Choosing The Best AWS Database Solution

There are several high-performing database services available on Amazon Web Services (AWS). When you need to handle caching, Online Transaction Processing (OLTP), real-time data, session stores, or personalization, you can choose one of these options. Examples of these high-performance database services include Amazon RDS, Aurora, MemoryDB, DynamoDB, Neptune, and ElastiCache. We often get questions about which AWS database service is best between RDS and Aurora.

How Our FinOps Account Management Team Helps You Achieve Your Cloud Savings Goals

At CloudZero, we’re in the business of driving positive business change. Whenever we work with a client, our goal is to share insights rooted in our FinOps perspective and platform expertise that help customers save on cloud costs and build strategies for engineering-led optimizations. Each experience is curated for the client’s FinOps maturity stage and unique business goals; that’s what personalized service is all about.

Getting started with Azure Integration Services - Stephen W. Thomas

Stephen W. Thomas takes center stage in this session, reflecting on past Integrate experiences and talking about the Azure Integration Services. Labeling himself a "drag-and-drop developer," Stephen emphasizes the timeliness of adopting Azure Integration, vital for seamless business operations in today's virtual landscapes. From Logic Apps to AI Tools like ChatGPT, get a pulse on current market trends, the rising demand for Azure resources, and insights on strategizing BizTalk migration.

Bare metal vs virtual machines vs containers: Which is the right infrastructure for me?

There are three main infrastructure types to consider when hosting and deploying applications: Bare Metal, Virtual Machines (VMs), and Containers, each with its own advantages and disadvantages depending on your use case. The three technologies are not mutually exclusive, however: both VMs and containers run on top of bare metal servers, and containers can also be deployed inside VMs.

Testing GenAI: How to approach nondeterministic software development

Michael Webster, principal engineer at CircleCI, talks to Rob about testing AI-enabled applications. In this episode, learn how to face the unique challenges posed by the probabilistic and non-deterministic nature of AI output, as well as the importance of subjective evaluation criteria. Webster covers how model-graded evals can be used to test AI applications, and the importance of caution in using this approach.

Common Nagios Errors and What to Do about Them

Nagios is an open-source monitoring system that has become indispensable for system administrators and DevOps teams across the world. However, like any other software, you’re bound to come across errors with Nagios. In this article, we’re going to take a look at some common errors and how to solve them, along with the pros and cons of Nagios, and why MetricFire is the perfect alternative for monitoring.

Blue Matador + Squadcast: Alert Routing Simplified

Blue Matador is the fastest, easiest way to set up AWS infrastructure monitoring, allowing small teams to fully monitor their cloud operations with no manual setup. If you use Blue Matador for your cloud monitoring requirements, you can integrate it with Squadcast, an end-to-end Incident Response tool, to route alerts from Blue Matador to the right users in Squadcast with ease.

Ceph storage for Kubernetes

Storage and container management systems are almost polar opposites of each other. One deals with permanently storing, and protecting data for as long as it’s needed. The other automatically manages highly dynamic workloads, scaling resources up and down as required. More organisations are taking a container-first approach to application deployment and management, but the underlying challenge of safely and securely storing data still remains the same.

How do you measure software security maturity?

Scorecards are a Cortex feature that allows you to understand how well your services are doing on the metrics you care about. Scorecards are customizable to your needs; however, several are common to most organizations. In our previous post, we shared the top three scorecards that we recommend to Cortex customers. Security maturity is one of the first scorecards we recommend organizations create.

Jad Jebara on Reinventing DCIM: Optimizing Hybrid Infrastructures with Hyperview

In an exclusive Digitalisation World podcast, our CEO, Jad Jebara, delves deep into the ever-evolving hybrid infrastructure landscape. Join us as we explore how companies are strategically optimizing application performance and the infrastructure that fuels their digital ambitions.

NOC Success Like Never Before: Automation Strategies for All-new Incident Management

Network Operations might never be the same. But then again, why would anyone want it to be? The power of automation and orchestration can bring incredible value to the Network Operations Center (NOC), including the business-critical call to get proactive and ahead of the incident response and management game. It’s more than a towering volume of events – it’s the complexities involved, too.

Why Real-Time Debugging Becomes Essential in Platform Engineering

Platform engineering has been one of the hottest keywords in the software community in recent years. As a natural extension of DevOps and the shift-left mentality it fosters, platform engineering is a subfield within software engineering that focuses on building and maintaining tools, workflows, and frameworks that allow developers to build and test their applications efficiently.

Don't just build a dashboard! A DORA cautionary tale

Software delivery success isn't just about dashboards and metrics. You also need to think about how to improve as an engineering team. The point of DevOps Research and Assessment (DORA) is improvement. Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact, all in a way that both managers and developers love.

A detailed guide on Azure Reservations

Organizations that invest in cloud technologies like Microsoft Azure might notice that cloud costs can easily get out of control. When cloud services use the Pay-as-you-go payment model, small amounts must be paid each time a cloud service is used. However, when you have deployed hundreds of cloud resources, those small amounts can add up to a much higher monthly bill than expected. By optimizing your cloud costs, you lower your monthly Azure bill and gain cost efficiency and predictability.

Maximizing Your Cloud Migration Success with Application Owner Interviews

In this second episode, we explore a crucial element of your migration strategy - the Application Owner Interviews, a feature within Tidal Accelerator. These interviews are a game-changer, helping you make your migration not just IT-efficient but also business-driven.

Network Infrastructure Monitoring: Getting Started

The rapid evolution of technology profoundly impacts network infrastructure monitoring. New technologies such as containerization, microservices, and serverless computing introduce complexities that require monitoring solutions to adapt. The shift to DevOps practices, where development and operations teams collaborate closely, emphasizes the need for real-time monitoring and feedback loops to ensure continuous integration and delivery of applications and services.

Global AWS Orchestration with Runbook Automation

It is common for companies to have multiple AWS accounts, and as it turns out, there are cases where certain operational tasks need to be performed on EC2 instances that reside in each account. Examples of this include standardizing practices for auditing, patching, and incident response, such as retrieving diagnostics or remediation. This demo showcases how Runbook Automation orchestrates commands and scripts on EC2 instances spanning numerous AWS accounts through an integration with Systems Manager (SSM).

Keynote: State of Microsoft Integration - Slava Koltovich

In this session, Slava Koltovich, Principal Group Product Manager for Azure Integration Services at Microsoft, delves into the rise of digital-first strategies post-COVID-19 and forecasts a software development revolution with 750 million new apps by 2025. Explore Azure Integration Services' vast capabilities, its partnership success story with Royal Mail, and a glimpse into Microsoft’s upcoming tech innovations.

4 Ways to Reduce Your Mean Time to Resolution

Dealing with a high MTTR in your network? Auvik Network Management is a comprehensive network monitoring and troubleshooting solution. With over 50 pre-configured alerts, it keeps you informed about critical network events. Users have the flexibility to customize these alerts and control notification frequency so that they have all the essential context to be able to fix issues.

Simplifying Kubernetes Native Testing with TestKube

As Kubernetes continues to dominate the container orchestration landscape, ensuring the reliability and stability of applications running on this platform is paramount. Testing in a Kubernetes-native environment demands specialized tools that understand the intricacies of containerized deployments. Enter TestKube, a powerful testing framework designed specifically for Kubernetes.

How to Extract Insightful Data From Proxy Protocol Packets

Boosting the transparency of your load balancer traffic is advantageous. Web applications continually pass information back and forth, yet some of this important data is often hard to get during transit. And while the perceived “black box” nature of networking seems overwhelming, what if you could peek behind the curtain to better understand your traffic?
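
For a sense of what a PROXY protocol header carries, the sketch below parses the human-readable version 1 header line, which a proxy sends at the start of a connection to pass along the original client address and port. It is a minimal illustration that assumes the v1 text format only; the function name and returned field names are choices made for this example, and the binary version 2 format is not handled.

```python
def parse_proxy_v1(header: bytes) -> dict:
    """Parse one PROXY protocol v1 header line into its fields."""
    if not header.startswith(b"PROXY ") or not header.endswith(b"\r\n"):
        raise ValueError("not a PROXY protocol v1 header")

    # Drop the trailing CRLF and split on single spaces.
    fields = header[:-2].decode("ascii").split(" ")
    family = fields[1]

    if family == "UNKNOWN":
        # For UNKNOWN connections the remaining fields, if any, are ignored.
        return {"family": "UNKNOWN"}

    if family not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY protocol v1 header")

    _, _, src_ip, dst_ip, src_port, dst_port = fields
    return {
        "family": family,
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }


# Example header line with illustrative addresses.
example = parse_proxy_v1(b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n")
```

A production parser would also validate the addresses and port ranges and enforce the maximum header length; those checks are left out here to keep the sketch short.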

Understanding the Differences Between RabbitMQ and Kafka

This blog was co-written by Howard Twine and Gregory Green. A few years ago, a colleague of ours wrote an informative post to help readers understand when to use RabbitMQ and when to use Apache Kafka. While the two solutions take very different approaches architecturally and can solve different problems, many people find themselves comparing them for situations where there is overlap.

Next Generation of Cost Reallocation: Leveraging APIs to Programmatically Manage Cloud Spend

A couple of weeks ago, I wrote about The Art of True Chargebacks and how VMware Tanzu CloudHealth makes simple work of the cost reallocation of cloud expenses. In addition to being a refresher on cost reallocation with Tanzu CloudHealth, it was also shared as a kickoff of our newly released cutting-edge Cost Reallocation API. This officially available API revolutionizes the seamless automation of chargeback reallocation across diverse geographies, brands, business units (BUs), and product lines.

What is remote access? An open door to productivity and flexibility

What is remote access and how has it transformed work dynamics around the world? Let’s dive in, explore and discover together how this innovative practice has reshaped conventional work structures and opened up a whole range of possibilities!

Navigating the Waters of SharePoint Online Limits

SharePoint Online (SPO) has become a linchpin in fostering collaborative work environments in the modern digital age. Its robust features provide a platform where individuals can share, manage, and collaborate on content seamlessly. As SharePoint Online continues to evolve, understanding its limits and boundaries is crucial for administrators to ensure optimal performance and user satisfaction.

A Deep Dive into Office 365 Enterprise Licensing

Office 365, also known as O365, has become a cornerstone for enterprise productivity. With different plans such as Office 365 E1, E3, and E5, organizations can tailor their licensing to meet specific needs. This article delves into the nuances of Office 365 Enterprise Licensing, aiming to provide a clear understanding to help you make an informed decision.

Behold a brand New Incident Dashboard!

The incidents page, the most visited page on Zenduty, has an all-new look and feel! It's been completely redesigned from the ground up to be faster, easier to use, and more visually appealing. The Incidents list now dedicates more space for important information, such as the title, date, priority, and more. The UI is also more polished, shaving off whitespace where unnecessary. The avatars have been redesigned with more pastel shades, resulting in an overall design far more soothing to the eye.

Managing downtimes with a maintenance window for Azure Logic App integrations

Handle scheduled downtimes in integrated systems: no more disruptions or complex code. Managing downtimes is a critical aspect of integrated systems management. In this video, Michael Stephenson provides an in-depth look at managing downtimes, particularly within Azure Logic App integrations. He discusses the common challenges, such as unexpected disruptions and regular maintenance. More importantly, he unveils practical solutions like setting up automated tasks for effectively pausing and resuming integrations during maintenance.

How Much Does Slack Spend On AWS?

The Slack messaging app is popular because it is easy to use, affordable, and highly customizable. With Slack, you can collaborate in teams and with colleagues in real-time, accessing files from anywhere. Slack is also widely used because it enables users to create and join channels. In addition, it integrates with multiple apps and services, working seamlessly with the tools you already use. Slack also offers cost-effective pricing plans.

Introducing Past Incident Feature | Incident Context and History | Squadcast

Introducing Squadcast's Past Incidents feature which helps incident responders by presenting them with past incidents related to the same service. It employs data science techniques to match and display a historical list of similar incidents from the same service you are currently investigating. This aids in expediting issue resolution by offering valuable insights, such as historical context, prior incident details, timing patterns, and past solutions.

An introduction to real-time Linux

In 22.04, Canonical announced a beta version of the Ubuntu kernel with the PREEMPT_RT patchset integrated. The new real-time kernel serves extreme latency-dependent use cases and provides deterministic response times to service events. By meeting stringent preemption specifications, real-time is suitable across a broad range of verticals, from telco applications to dedicated devices in industrial automation and robotics.

How to fix and prevent CrashLoopBackOff events in Kubernetes

It's one of the most dreaded words among Kubernetes users. Regardless of your software engineering skill or seniority level, chances are you've seen it at least once. There are a quarter of a million articles on the subject, and countless developer hours have been spent troubleshooting and fixing it. We're talking, of course, about CrashLoopBackOff.

Containerizing and Deploying a Production Remix App

In the modern web-app space, there’s been a trend going around that I like to describe as “getting back to basics”. It seems as though over the years, the tooling around building web-apps has gotten more and more complex. In that time, we’ve strayed further from browser primitives into highly abstracted and JavaScript-heavy solutions to solve problems our browsers solved back in the ’90s.

CloudZero Introduces New Optimization Workflow To Help Engineers Find, Act On, And Track Savings

CloudZero’s overarching goal is to help companies maximize the return on their cloud investment. To do that, it’s essential that engineers be accountable for their cloud costs. They must prioritize cost equal to quality and security and keep costs in check from the earliest days of development through the most advanced stages of production. Year after year, getting engineers to take action tops the list of challenges FinOps practitioners face.

The Power of Automation in DevOps

In the ever-evolving world of software development and operations, DevOps has emerged as a game-changer. DevOps, short for Development and Operations, is a set of practices and principles that bridge the gap between these two traditionally siloed domains, fostering collaboration and accelerating the delivery of high-quality software. At the heart of DevOps lies automation, a powerful force that revolutionizes the way software is developed, tested, and deployed.

How to test a MongoDB NoSQL database

Most development teams know that testing the application layer of a system (a.k.a. the codebase) is of vital importance. Testing the data layer (the database) is just as important. To perform database testing, you construct queries to assert and validate the database operations, structures, and attributes required by the application connecting to the database.

How to Transform the Way Your Company Manages Its IT Infrastructure

In today's competitive business world, your company cannot afford to waste time or money when it comes to managing its IT infrastructure. Staying ahead of the curve and ensuring efficiency is key for any organization that wants to see success in the long run. If you're looking to revolutionize how your company manages its IT operations, here are some tips on transforming your tech stack and making sure every department runs like a well-oiled machine.

Puppet 8: The Biggest Changes + How to Get It Now

Puppet 8 is here, and it’s included in the latest release of Puppet Enterprise. It’s the biggest update to Puppet since Puppet 7’s first release in November 2020, and it carries a host of enhancements and improvements to make managing and scaling your infrastructure easier than ever. Read on for a list of the major changes included in Puppet 8, how they benefit you, and how to get going with Puppet 8 fast.

Security and compliance for enterprise collaboration

In today’s increasingly data-driven business landscape, security and compliance are more important for enterprise software than ever before. In an age where high-profile data breaches and regulatory violations seem to make headlines more frequently, enterprises must prioritize the protection of sensitive information while ensuring compliance with an exceedingly complicated labyrinth of legal and industry-specific requirements.

How Much Does Capital One Spend On AWS?

Capital One offers a variety of financial products, such as credit cards, financial accounts, and auto financing. Now, Capital One might appear as a leading financial institution. But it views itself as a technology company that provides financial services — rather than a financial services company that uses technology. Capital One has differentiated itself as an innovator throughout its 30-year history.

Navigating Common SharePoint Pitfalls

Access-related challenges form a significant portion of the common issues encountered within SharePoint environments. These challenges can manifest in numerous forms, including denied access to resources, unresponsive buttons, or dysfunctional links. Unveiling the root causes and solutions to these access challenges is crucial for maintaining a seamless SharePoint user experience.

Service Blueprinting and Orchestration for Elevated Customer Experiences

Chances are, you’re familiar with the strategy of adding another “9” to service level agreements (SLAs) to boost the experiences your organization provides. There are plenty of ways to do so, but one particularly stands out: Service Blueprinting. Banking executive Lynn Shostack first described a service blueprint in a 1984 Harvard Business Review publication.

What is Gremlin?

Today’s technology leaders are facing a reliability gap. Customers expect their apps to be fast and available. But with DevOps and distributed systems driving more speed and complexity, it’s harder than ever to find and fix the reliability risks that can impact customer experience before it’s too late. To close the reliability gap, we need a reliability strategy. One that’s proactive, measurable, built-in and automated. We need a reliability platform.

3 Ways to Sell DORA to Your Boss

If you've bought into the concept of DORA, and now it's time to get your boss on board, these three tips will help you succeed. Just remember: Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact, all in a way that both managers and developers love.

400x deploy frequency? One team's DORA success

Is 400x deploy frequency possible? One team achieved it with the DORA philosophy and metrics. It doesn't happen overnight, but it's possible if you commit to it. Nathen Harvey shares a DORA success story. Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact — all in a way that both managers and developers love.

Gremlin for DORA compliance: how financial services firms build digital resilience and prove it

The Digital Operational Resilience Act (DORA) is set to significantly impact the financial sector. Coming into full effect in 2025, this EU regulation will set new standards for information and communications technology (ICT) risk management. In this landscape, how can financial firms ensure they’re not only compliant, but also operationally resilient?

Canonical announces supported solution for Apache Spark on Kubernetes

Today, Canonical announced the release of Charmed Spark – an advanced solution for Apache Spark® that provides everything users need to run Apache Spark on Kubernetes. Apache Spark is suitable for use in diverse data processing applications including predictive analytics, data warehousing, machine learning data preparation and extract-transform-load (ETL).

Empowering Bridgewise's Financial Revolution with 2bcloud's Cloud Expertise

At 2bcloud, we take pride in our commitment to helping visionary companies like Bridgewise transform their business through innovative cloud solutions. Bridgewise is on a mission to redefine the equity research industry by harnessing the power of advanced AI, making financial insights more accessible and reliable than ever before.

Infrastructure as a Service (IaaS): What It Is, And Why You Need to Know

In the ever-evolving landscape of cloud adoption, resource virtualization has emerged as a transformative force, revolutionizing the way businesses operate and offer services. At the heart of this revolution lies the concept of Infrastructure as a Service (IaaS), which offers a spectrum of service models designed to meet the diverse needs of every company.

Learning Flows: Bringing consistency to your post incident processes

To get the most out of your incident response processes, consistency is crucial. The more predictable you can be whenever issues crop up, whether a small bug or a major outage, the quicker and more confidently you can respond. In practice, incident response is equal parts knowing how to actually resolve the issue and having the confidence that the processes in place will help get you through without added stress.

Introducing GitKraken's New Suite of Dev Tools

Hey, Matt from GitKraken here. I’ll admit, we’re a tad obsessed with developer productivity. Tools like GitKraken Client and GitLens are great for helping any developer go further, faster. But building software is most often a team sport. So over the past year, customers of all sizes have been imploring us to fill two important voids.

Securely Connect Cloudsmith to your CI/CD using OIDC Authentication

Are your CI/CD pipelines at risk? They might be if you use long-lived, static credentials and tokens. Long-lived, static credentials and tokens are one of the most common causes of data breaches in cloud environments. CI/CD tools need access to cloud services to publish artifacts, deploy software, and access resources on their cloud provider. So, they need credentials. It's tempting to hard-code them. But that's a bad idea.

Partner Watch: CI/CD Build Systems for Embedded Development

To excel in embedded development in 2023, it is essential to have a solid understanding of build systems, continuous integration, and deployment strategies. This workshop by Percepio training partner Jacob Beningo aims to provide a comprehensive primer on these practices, equipping participants with the knowledge and skills necessary to tackle complex firmware projects with confidence.

What is Prometheus Alertmanager?

Prometheus Alertmanager is a powerful tool designed to handle various alerts generated by Prometheus. It plays a vital role in the overall monitoring ecosystem, acting as a centralized hub for managing alert notifications. With Prometheus Alertmanager and its robust notification management capabilities, you can efficiently define alert routing and notification policies. This empowers you to take timely actions and mitigate potential issues before they impact your service availability.

SharePoint against OneDrive

In today’s digital age, the need for efficient document storage and collaboration tools is more pressing than ever. Microsoft, being a leader in the enterprise solutions sector, offers two standout products in this category: SharePoint and OneDrive. While both tools hail from the same Microsoft family and integrate seamlessly with other Microsoft 365 apps, they serve distinct purposes. Let’s delve deeper into the nuances of each and understand their primary differences.

A call for community

Open source projects are a testament to the possibilities of collective action. From small libraries to large-scale systems, these projects rely on the volunteer efforts of communities to evolve, improve, and sustain. The principles behind successful open source projects resonate deeply with the divide-and-conquer strategy, a universal approach that has proven effective across multiple disciplines.

Monitoring CPU Temperature with Hosted Graphite

Monitoring CPU temperature is crucial for ensuring the smooth and efficient functioning of computer systems. As processors become more powerful, they generate more heat, which can lead to performance issues, system instability, and even hardware damage. Overheating is a common problem faced by many computer users, especially those who engage in resource-intensive tasks like gaming or running complex software.

Grafana and Graphite Best Practices

Efficient monitoring and visualization of performance metrics are paramount for ensuring seamless user experiences and reliable system operations. Grafana and Graphite, two powerful open-source tools, form an unbeatable combination when it comes to monitoring and analyzing time-series data. Grafana provides a robust and flexible platform for visualizing data, while Graphite acts as a scalable and efficient backend for storing and retrieving metric data.

Internal Developer Platform: What's the ROI?

The Internal Developer Platform (IDP) is a game-changing innovation that has transformed the technology landscape. In the previous article, we discussed in detail the effectiveness of these developer platforms in terms of developer efficiency, faster product releases, frequent collaboration, etc. However, one aspect that is of paramount importance, especially from the business perspective, is the ROI. The first question any manager will ask is, "What will be the ROI of investing in an IDP?"

Customization vs. Standardization: Striking the Right Balance in Developer Platforms

Internal developer platforms (IDPs) have become a necessity for software development in today's ever-changing technological landscape. These platforms not only support engineering team velocity and business product strategies but also enhance communication and information flow, impacting successful product launches.

Implementing Backstage: Kubernetes Plugins

This second-to-last part of the “Implementing Backstage” series explains how to use the Kubernetes plugin in Backstage using real-world scenarios. The previous installments covered getting started, using the core features, integrating with existing tools using plugins, and security and compliance. If you’re entirely new to Backstage and want to learn more, you can read the first entry in the “Evaluating Backstage” series.

Implementing Backstage: Kubernetes Deployment

This final part of the “Implementing Backstage” series focuses on how to deploy Backstage on Kubernetes. This tutorial is a direct continuation of Using the Kubernetes Plugin in Backstage, which you should complete before tackling this one. The other installments in this series covered getting started, using the core features, integrating with existing tools using plugins, and security and compliance.

Challenges to Anticipate When Transitioning to an Internal Developer Platform

Internal Developer Platforms (IDPs) are gaining significance in contemporary software development because they can transform an organization's software delivery by facilitating automation and productivity across large teams or by giving smaller teams without dedicated DevOps engineers the ability to deploy at scale. The migration of existing projects, protocols, and infrastructure to the new platform can make the transition to an IDP challenging for businesses.

Azure API Management monitoring

Azure API Management is a Microsoft Azure cloud-based solution that helps businesses effortlessly create, publish, secure, and analyze APIs (Application Programming Interfaces). APIs are the building blocks of any business and play an essential role in data exchange. Azure API Management monitoring is vital to enable the business to function seamlessly. It supports early problem detection, resource optimization, and data-driven decision-making to increase the quality of the API ecosystem.

What Are S3 Lifecycle Rules And When Should You Use Them?

Amazon Simple Storage Service (S3) has become a cornerstone in the world of cloud computing, offering scalable and secure object storage solutions for a wide range of applications. However, as data accumulates over time, managing it in an efficient manner becomes a challenge. This is where S3 Lifecycle Rules shine. These rules allow us to automate the transition of data between different storage classes and even schedule automatic deletions, which allows users to optimize costs and operational efficiency.
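To make the idea of transitions and scheduled deletions concrete, here is a minimal sketch of what one such rule could look like. It is not taken from the article: the helper name, the "logs/" prefix, and the day thresholds are all illustrative assumptions. The dictionary layout follows the lifecycle configuration structure that S3 accepts (rules with a filter, transitions, and an expiration).

```python
import json


def build_lifecycle_configuration(prefix, ia_after_days, glacier_after_days, expire_after_days):
    """Build a lifecycle configuration dict with one rule (hypothetical helper).

    Objects under `prefix` move to STANDARD_IA, then to GLACIER, and are
    finally deleted. All thresholds are days since object creation.
    """
    if not (0 < ia_after_days < glacier_after_days < expire_after_days):
        raise ValueError(
            "expected 0 < ia_after_days < glacier_after_days < expire_after_days"
        )
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
                "Status": "Enabled",
                # Only objects whose key starts with this prefix are affected.
                "Filter": {"Prefix": prefix},
                # Storage class transitions, applied in order of age.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                # Scheduled automatic deletion.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Illustrative values only: tier log objects at 30 and 90 days, delete at 365.
config = build_lifecycle_configuration("logs/", 30, 90, 365)
print(json.dumps(config, indent=2))
```

A dictionary of this shape could then be supplied as the LifecycleConfiguration argument of boto3's put_bucket_lifecycle_configuration call; the sketch stops short of that so it runs without AWS credentials.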

GenAI in production: how we built AI into CircleCI

In this episode, you’ll learn how to empower your team to do the most challenging thing when it comes to AI - getting started! Rob is joined by Kira Muehlbauer and Ryan Hamilton, two engineers who worked on building a groundbreaking feature at CircleCI called the AI error summarizer. Discover their insights into the process of building AI products, the challenges they faced, and the valuable lessons they learned along the way.

Monitoring systemd logs with Netdata using the systemd journal Function

The systemd journal plugin by Netdata makes viewing, exploring and analyzing systemd journal logs simple and efficient. It automatically discovers available journal sources, allows advanced filtering, offers interactive visual representations and supports exploring the logs of both individual servers and infrastructure-wide journal centralization servers.

Streamlining Kubernetes Operations with Enterprise Workload Automation

Kubernetes integrations are now available for AutoSys, dSeries, and Automic Automation. It wasn’t that long ago that teams in many organizations started dipping their toes into the world of containers and microservices. It didn’t take long for this approach to application development and orchestration to take hold, and for Kubernetes to emerge as a dominant, broadly used technology.

The Power of Data Correlation: Troubleshooting Made Easy

As software engineers, we all know that troubleshooting often involves sifting through heaps of data points — scanning metrics, reading logs, checking resource status and analyzing events. We manually connect the dots, and if we're experienced enough, we might spot an issue that's about to become a problem. At StackState, we've faced these same challenges.

Semper vigilans: how Platform.sh stays ahead of emerging cybersecurity threats (so you don't have to)

October is Cybersecurity Awareness Month. So, we’ve asked Diogo Sousa, Platform.sh Security team manager, to share how his team contributes to helping customers protect their websites and applications from external threats, 24x7.

9 ChatOps tips your team should adopt today

Pandora FMS is an excellent monitoring system that helps collect data, detect anomalies, and monitor devices, infrastructures, applications, and business processes. However, more than monitoring alone is needed to manage the entire incident lifecycle. ilert complements Pandora FMS by adding alerting and incident management capabilities. While Pandora FMS detects anomalies, ilert ensures that the right people are notified and can take action quickly.

Implementing Backstage 4: Security and Compliance

This is the fourth part of the “Implementing Backstage” series and explores how to ensure your Backstage application is secure and how Backstage can contribute to more secure practices in general. The previous installments focused on how to get started, using the core features, and integrating with existing tools using plugins. If you’re unfamiliar with Backstage and need an introduction, check out part one of the “Evaluating Backstage” series.

Using Rails with SES, SNS and SQS to avoid bounce rate

Amazon Simple Email Service (SES) is a cost-effective email service provided by AWS. It is by far the cheapest option available out there. Compared with other services like SendGrid or Mailchimp, sending emails with SES can be 100x cheaper. However, there is a catch. Using SES directly, you will not get some of the features you might need to control the bounce rate of your emails.

Security Considerations for Your Internal Developer Platform

In today's world, where cloud resources and data management tools play an increasingly critical role, the concept of an Internal Developer Platform (IDP) is gaining momentum. Imagine a platform where developers seamlessly design, build, and deploy applications. That's precisely the promise of IDPs. But here's the highlight: with great power comes greater responsibility. Security within IDPs isn't just an optional add-on; it's the core essence.

G2 Fall Report Positions Squadcast among the leading Incident Management, and IT Alerting Tools

Squadcast established itself as a Momentum Leader and High Performer across different regions in the Incident Management and IT Alerting tool categories. We have solidified our leadership in the Mid Market segment across various regions; this recognition stems from our dedicated customer base.

Exploring the 2023 Enhancements in SharePoint Online

In the realm of collaborative platforms, SharePoint Online stands as a robust solution that continually evolves to meet the dynamic needs of modern enterprises. The 2023 updates have notably elevated the platform’s capabilities, particularly in terms of file and document management and integration with other Microsoft 365 offerings.

Staying Ahead of Threats with Continuous Security Monitoring Tools for DevOps

According to the latest CrowdStrike report, in 2022 cloud-based exploitation increased by 95%, and there was an average eCrime breakout time of 84 minutes. Just as significantly, in 2021, the Biden administration issued an executive order to improve the nation’s cybersecurity standards. There are also upcoming laws like DORA in the European Union. So, increased cyber attacks and legislative pressures mean you need to (a) actively protect against threats and (b) prove that you are doing so.

What Is Continuous Security Monitoring Software?

Many DevOps teams work proactively to meet security and compliance standards. They consider security best practices when developing software with open source components, scanning code for vulnerabilities, deploying changes, and maintaining applications and infrastructure. Security is a key feature of many of the tools they’re using, and the policies and industry standards they’re following.

Spot Eco for Azure now supports Cloud Solution Provider (CSP) accounts

Azure Reservations and Savings Plans are discounted pricing models for certain Azure services. By purchasing a reservation or savings plan, you can reduce your costs in Azure by committing to specific usage terms in advance. The commitment gives Azure more visibility into your one-year or three-year resource needs while guaranteeing your usage for the term, and you will enjoy discounts on those core services.

Ubuntu Desktop 23.10: Mantic Minotaur deep dive

The last interim release before an LTS (Long Term Supported release for those new to the Ubuntu terminology) is a particularly exciting time. This is the release where the team aims to land as many major changes as possible to ensure that the community has the chance to take them for a spin and provide feedback for further refinement ahead of Ubuntu 24.04 LTS. These features span the entire Ubuntu Desktop stack, from the user interface, to software management, to core security and architectural changes.

Canonical releases Ubuntu 23.10 Mantic Minotaur

Today Canonical announced the release of Ubuntu 23.10, codenamed “Mantic Minotaur”, available to download and install from https://ubuntu.com/download. “In this release we’ve raised the bar for what secure by default means for Ubuntu and set the stage for our next Long Term Supported release,” said Oliver Smith, Senior Product Manager for Ubuntu at Canonical.

Top three scorecards every organization needs for operational efficiency

Efficiency has always been a goal for organizations, but recent economic headwinds have made it a priority. Budgets have been stretched especially thin recently, leading many organizations to focus on improving operational efficiency. Bugs, security incidents and unreliable services can all slow your organization down and distract from delivering on your priorities. Cortex helps you minimize these distractions with its scorecard feature.

Discover the Root Cause of Your Cloud Spend Issue

If you’re sick with a cold, measuring your body temperature is a wise move. If things are really bad, a visit to a doctor might result in tests against what are considered “normal” levels in order to diagnose the issue: seasonal flu or infection? To improve our health after picking up a bad bug, we do things that bring our situation back to normal levels, at which point we can declare ourselves healthy.

Introduction to Internal Developer Platforms: What, Why, and How?

For top-notch organizations, staying ahead of the curve is not just a choice; it's a necessity. To meet the growing demands of modern development, organizations are increasingly turning to Internal Developer Platforms (IDPs) as a solution to fine-tune their workflows. This article aims to demystify IDPs, shed light on their benefits, and guide you on how to embark on your IDP journey.

Why Your Load Balancer Should Be Fast & Flexible

The intersection of economic uncertainty and digital transformation presents a unique challenge for businesses. With the fear of a recession looming overhead, there’s no doubt that choppy waters await, but what does this mean for IT when tech spending can significantly impact the bottom line? While IT spending is a priority for many, businesses are still seeking ways to reduce non-essential spending and upgrade outdated infrastructure.

Advanced Access Controls with Mattermost Enterprise Edition

While some smaller companies may only need to use standard access controls to shore up systems, large organizations — particularly those with strict security, confidentiality, and compliance requirements — often require advanced functionality that gives them more authority over which users can access what systems and when.

Automation Cheat Sheet: How Telco Leaders Ace the Test of Driving Process Efficiency

Telecommunication (Telco) companies everywhere share a similar vision: future-proofing their organizations for an unpredictable era of challenges and opportunities in an unreliable economy. Rebounds from the pandemic started out slow and patchy, and leading up to the present day, moves like inflation-laced price increases and merger and acquisition (M&A) deals have driven share prices across the global telecoms sector back up from 2020’s rock bottom.

What Happens to DevOps when the Kubernetes Adrenaline Rush Ends?

Kubernetes has been around for nearly 10 years now. In the past five years, we’ve seen a drastic increase in adoption by engineering teams of all sizes. The promise of standardization of deployments and scaling across different types of applications, from static websites to full-blown microservice solutions, has fueled this sharp increase.

Webinar: The Top 5 Use Cases for Internal Developer Portals

Internal developer portals (IDPs) have received a lot of attention lately. Internal Developer Portals serve as the engineering system of record—providing developers with the context and tools they need to ensure services and resources they own align with best practices for deployment readiness, operational maturity, security compliance, and more. But they do more than just act as a system of record for your whole stack. They also help drive alignment, improve MTTR, and can even reduce cloud spend.

Kafka Monitoring Using Prometheus

In this article, we are going to discuss how to set up Kafka monitoring using Prometheus. Kafka is one of the most widely used streaming platforms, and Prometheus is a popular way to monitor Kafka. We will use Prometheus to pull metrics from Kafka and then visualize the important metrics on a Grafana dashboard. We will also look at some of the challenges of running a self-hosted Prometheus and Grafana instance versus the Hosted Grafana offered by MetricFire.

Complete Guide To Grafana Dashboards

Grafana is one of the most popular dashboarding and visualization tools for metrics. Grafana dashboards are a very important part of infrastructure and application instrumentation. In this post, we will deep dive into Grafana dashboards. We will create a Grafana dashboard for a VM’s most important metrics, learn to create advanced dashboards with filters for multiple instance metrics, import and export dashboards, learn to set refresh intervals in dashboards and learn about plugins.

Exploring systemd journal logs with Netdata

Today, we released our systemd journal plugin for Netdata, allowing you to explore, view, search, filter and analyze systemd journal logs. Like most things about Netdata, this is a zero-configuration plugin. You don’t have to do anything apart from installing Netdata on your systems. This is a key design direction for Netdata, since we want Netdata to be able to help even if you install it mid-crisis, while you have an incident at hand.

The Evolution of DevOps From Concept to Best Practice

In software development, the evolution of DevOps has been nothing short of revolutionary. What began as a simple concept has transformed into a best practice that is reshaping the way organisations develop, deploy, and maintain their software. In this blog post, we will take a journey through the evolution of DevOps, from its humble beginnings to its current status as an indispensable part of modern software development.

A Detailed Guide to Setting Up Effective On-Call Rotations

On-Call Schedules are predefined rotations/shifts assigning team members to be available for incident response at specific times. They are essential for ensuring round-the-clock support, swift issue/incident resolution, and continuous service availability. For a robust On-Call system, proper schedules are essential: they serve as the backbone of reliable Incident Response and ensure your team is well-prepared to address technical challenges effectively.
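As a rough illustration of a predefined rotation, the sketch below builds a simple round-robin schedule with fixed-length shifts and looks up who is on call on a given day. It is not taken from the guide: the function names, the team members, the start date, and the weekly shift length are all hypothetical.

```python
from datetime import date, timedelta


def build_rotation(members, start, shifts, shift_days=7):
    """Return a list of (shift_start, shift_end, member) tuples.

    Members are assigned round-robin; shift_end is exclusive.
    """
    if not members:
        raise ValueError("at least one team member is required")
    schedule = []
    for i in range(shifts):
        shift_start = start + timedelta(days=i * shift_days)
        shift_end = shift_start + timedelta(days=shift_days)
        schedule.append((shift_start, shift_end, members[i % len(members)]))
    return schedule


def on_call_at(schedule, day):
    """Return the member on call on `day`, or None if no shift covers it."""
    for shift_start, shift_end, member in schedule:
        if shift_start <= day < shift_end:
            return member
    return None


# Hypothetical team and start date, weekly shifts.
team = ["alice", "bob", "carol"]
rotation = build_rotation(team, date(2023, 10, 2), shifts=4)
for shift_start, shift_end, member in rotation:
    print(shift_start, "to", shift_end, "->", member)
```

Returning None for an uncovered day makes gaps in coverage explicit, which is the kind of hole a real schedule would need to close with overrides or escalation.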

A Conversation on Smart Infrastructure Management

In a recent episode of the Millennium Live Podcast, Galileo partner Charles Araujo, an industry analyst, author, and recognized authority on digital transformation, joined the host, Conor Tuohy, to delve deep into the world of smart infrastructure management. The interview provides valuable insights into the evolving landscape of IT operations, the challenges posed by digital transformation, and the role of Galileo Suite in revolutionizing infrastructure management.

What Is SaaS Architecture? 10 Best Practices For Efficient Design

As an engineer, engineering supervisor, or CTO, you are responsible for making architectural decisions that help your team create innovative products and optimize technology costs. The type of architecture you select affects how much control you have over data, infrastructure, and customization options. The Software-as-a-Service (SaaS) model is one of the major architectures you can use to deliver services to customers — anytime and anywhere.

Aggregation mapping pattern in BizTalk to Azure Integration Services migration

Let’s embark on a new journey as we begin a series of blog posts dedicated to the migration of BizTalk Server to Azure Integration Services. I’d like to highlight that when I mention the migration to Azure Integration Services (AIS), I’m making a clear distinction from Logic Apps. This differentiation is important because, contrary to what some consultants and salespeople may suggest, migrating BizTalk Server entirely to Logic Apps is not a viable path!

5 Tips For Managing Your Internal Developer Platform

Internal Developer Platforms (IDPs) have become the cornerstone of efficient development, serving as the central hub where development teams access the tools and resources necessary for coding, testing, deploying, and maintaining software applications. As software development continues to evolve rapidly, IDPs are crucial in maintaining a competitive edge. This introduction sets the stage for the technical insights that will follow, sharing 5 tips for effective Internal Developer Platform management.

DORA for measuring developers? Beware!

Should you use DORA for measuring developers? Beware! It could lead to unhealthy behaviors that harm the team and organization. DORA metrics are meant to assess application or service-level health and stability, which cross-functional teams, not individual developers, are responsible for. Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact — all in a way that both managers and developers love.

SmartNICs in telco: benefits and use cases

In our previous blog, we introduced smartNICs as technology enablers for next-generation converged data centres. We covered how smartNICs can increase efficiency and drive return on investment. In this blog post, we explain how this innovative technology can help the telecom industry. SmartNIC use cases for the telecom sector are still emerging. However, when they arrive, they will be big for the sector, especially at edge clouds where speed in user plane packet processing matters the most.

Greener Cloud Computing with Deep Green and Civo

With more companies switching to cloud service providers, we are seeing a drastic increase in the amount of electricity required to run the data centers that are hosting all the machines required to run these platforms. Currently, data centers produce 3% of global carbon emissions and account for roughly 1.5% of worldwide electricity demand.

VMware Was Named an Overall Leader in Cloud Security Posture Management by KuppingerCole Analysts AG

KuppingerCole AG published its report assessing Cloud Security Posture Management (CSPM) solutions in the market for 2023. Their leadership compass helps cloud users find an appropriate solution to meet CSPM needs of an organization to monitor, assess, and manage risks associated with the use of cloud services. Fifteen vendors were assessed based on responses to a questionnaire, strategy briefing, and demo.

Grafana vs. Zabbix

Grafana is a visualization tool that allows you to see and analyze all of your metrics in one unified dashboard. Grafana can pull metrics from any source, display that data, and then enable you to annotate and understand the data directly in the dashboard. Grafana dashboards are designed to allow you to visualize information in a ton of ways, from histograms and heatmaps to world maps. Grafana also has an alerting feature that can communicate with you through Slack, PagerDuty, and more.

A guide to post-mortem meetings and how we run them at incident.io

You've just made it through a particularly tough incident. It was a short outage affecting a subset of customers, so not exactly the end of the world, but bad enough that it involved multiple people across a number of teams to resolve. Either way, the incident was well managed, and the dust has settled. Now what? Most guidance would say that putting together a post-mortem document is a good idea, given the severity of the incident. You've also done this, so what's next?

Five database DevOps practices for boosting team productivity

Developing and deploying database changes can be a complex task, made more challenging by the fact that development teams need to move fast, while also protecting an organization’s crown jewels: its data. Speed of delivery and protecting data can often feel incompatible, but there are industry-proven database DevOps practices that bring them together in harmony.

Automate deployment of Java Spring Boot apps to AWS Elastic Beanstalk

The benefits of automating deployments for your Java Spring Boot application are undeniable. Not only is it possible to set up images and run tests or compatibility checks before updating the production environment, but CI/CD providers like CircleCI go a step further by streamlining the entire delivery process from code changes to deployment. Many teams assume that the specifics of their development stack or deployment process will make automation difficult to achieve.

What Is AWS Glue? A Newbie-Friendly Guide

More enterprises continue to adopt managed data integration services like AWS Glue. According to the Data Pipelines Market Study Report, 65% of organizations now prefer cloud-based or hybrid cloud data integration solutions. And how exactly do they stand to benefit? This article explains the craze by covering AWS Glue in detail, which happens to be one of the most popular cloud data integration services today. We’ll cover.

Mastering Group Creation in SharePoint

In the digital world, SharePoint holds a significant position, being a highly robust platform that caters to a multitude of collaborative needs. Among the rich features it offers, SharePoint Groups stand out as a fundamental building block for creating a conducive collaborative environment. These groups not only enhance interaction among team members but also provide a structured framework for managing permissions and access to resources.

Releasing Icinga Ansible collection v0.3.0

This release of the collection will feature a whole set of possibilities to deploy a complete Icinga 2 environment. Before diving deep into the collection, here is a quick recap of all roles that were available and that are included in the current release, v0.3.0. New Roles in v0.3.0: To further enhance the Icinga 2 installation process via Ansible, these roles are vital for a successful deployment. The Icinga DB is the future backend of Icinga 2; it can be handled with our icingadb and icingadb_redis roles.

Logic App Best Practices, Tips, and Tricks: #37 How to handle special characters inside Logic Apps actions?

Today, I will speak about another useful set of Best Practices, Tips, and Tricks that you must consider while designing your business processes (Logic Apps): how to handle special characters inside Logic Apps actions.

HAProxy's Growth Continues with Rave Reviews and Powerful Capabilities

The G2 Fall 2023 Reports are in! And while leaves are on the verge of tumbling downward, HAProxy's acclaim across multiple categories, market levels, and global segments has only risen. For companies looking for—or migrating from—a load-balancing solution, HAProxy delivers simple adoption, scalable performance, strong security, and observable operation. Altogether, across six product categories HAProxy featured in 58 reports and won 30 badges, including “Momentum Leader”.

Everything you need to know about data sovereignty

In today’s digital age, the most effective organizations are using data to fuel innovation and accelerate business strategies. Data continues to be at the heart of business growth. Organizations increasingly rely on technology to manage and store their data. Questions about ownership, control, and security have emerged — leading to the rise of a concept known as data sovereignty. In this post we’ll explore.

The Path to a Dark NOC: Actionable Initiatives to Achieve Full Autonomy

A Dark Network Operations Center (NOC) is one that runs with no IT staff … at least that’s how it’s been defined up until now. But there’s more to interpret. Large, complex networks rely on the NOC, the core of network infrastructure, to keep them healthy and resilient. The NOC’s function allows employees, customers, partners, and other network users to rest a bit easier, and its integrity and accuracy give them peace of mind.

Three Ways to Better Appreciate your SREs and DevOps Engineers

DevOps engineers and Site Reliability Engineers are vitally important to the continued health of your product and business. We all know it’s true, and yet people in these roles often feel underappreciated and undervalued. This sort of work runs into the issue of “when process and infrastructure break, it gets shoved in the spotlight; but when everything works perfectly, no one notices.”

Do these 5 things to get started with DORA

If you're sold on the philosophy of DORA but don't know how to get started, follow our five tips. Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact, all in a way that both managers and developers love.

Spot by NetApp achieves AWS Spot EC2 Service Ready status: Unlocking cost savings and scalability

Spot is thrilled to announce that we have achieved AWS Spot EC2 Service Ready status. This achievement validates Spot’s continued commitment to delivering a reliable and seamless experience for users by leveraging AWS Spot Instances.

Continuous integration for Yii2 APIs with Codeception

Continuous integration (CI) is the process of integrating changes from multiple contributors to create a single software project. A key component for a smooth CI pipeline is testing. Tests prove that the code does exactly what it says on the tin and that it’s safe to merge the code into the central repository. Tests also anticipate edge cases and ensure that the code handles such cases in a deterministic manner.

How California's New Emissions Disclosure Law Will Affect Data Centers

The new law, SB 253, aims to bring more transparency and accountability to the public about how big businesses contribute to climate change. It also hopes to encourage companies to reduce their emissions and align with the state’s ambitious climate goals. By 2030, California plans to lower its greenhouse gas emissions by 40% below what they were in 1990.

Ensuring consistent Kubernetes container versions

One of Kubernetes' killer features is its ability to seamlessly update applications no matter how large your deployment is. Did a developer make a code change, and now you need to update a thousand running containers? Just run kubectl apply -f manifest.yaml and watch as Kubernetes replaces each outdated pod with the new version.
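One way to check whether every running pod actually ended up on the same version is to group pods by image repository and look for repositories running more than one tag. The sketch below does that on a plain dictionary of pod names to image references. It is an assumption-laden illustration, not the article's method: the function names and the example.com images are hypothetical, and in practice the input would come from the cluster (for example, from the output of kubectl get pods).

```python
def split_image(image):
    """Split an image reference into (repository, tag).

    Simplifications in this sketch: a digest suffix (name@sha256:...) is
    dropped, and a reference without a tag is reported as 'latest'.
    """
    name = image.split("@", 1)[0]
    # Only look for the tag in the last path segment, so that a registry
    # port such as localhost:5000/app is not mistaken for a tag.
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        repository, tag = name.rsplit(":", 1)
        return repository, tag
    return name, "latest"


def find_version_drift(pod_images):
    """Return {repository: {tag: [pod names]}} for repositories with several tags."""
    by_repository = {}
    for pod, image in pod_images.items():
        repository, tag = split_image(image)
        by_repository.setdefault(repository, {}).setdefault(tag, []).append(pod)
    return {
        repository: tags
        for repository, tags in by_repository.items()
        if len(tags) > 1
    }


# Hypothetical pods: web-3 is still running the previous version.
pods = {
    "web-1": "example.com/shop/web:1.4.2",
    "web-2": "example.com/shop/web:1.4.2",
    "web-3": "example.com/shop/web:1.4.1",
    "api-1": "example.com/shop/api:2.0.0",
}
drift = find_version_drift(pods)
for repository, tags in drift.items():
    print(repository, tags)
```

Comparing tags only catches drift between differently tagged images; two pods running different builds pushed under the same mutable tag would need a digest comparison instead.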

6 Things Critical to Cloud Optimization Programs

Enterprises face many challenges when FinOps programs turn to infrastructure optimization to find cloud savings. Cloud catalog complexity, reluctant engineers or even having the staff and time to make the many changes required can all slow down or impede progress and savings. In this 20-min session we will cover the 6 things leading large scale cloud users have learned are critical to cloud optimization programs.

A Guide to Assessing Cloud Server Pricing

Businesses and individuals wanting to leverage cloud services must pay close attention to cloud server pricing. They will need to boost their understanding of how cloud servers work and make informed decisions about cloud usage and cost optimization. To address this, this cloud server pricing guide covers the key components of cloud computing and explores computing resources. After that, it considers data transfer costs, network bandwidth, and latency.

Bridging the ITIL vs DevOps Mindset: CI/CD Best Practices for ITIL Organizations

DevOps practices in software development have revolutionized the way updates are released. However, many companies entrenched in ITIL practices find it challenging to seamlessly integrate with the DevOps practice of Continuous Integration and Continuous Delivery/Deployment (CI/CD). This is because ITIL focuses on stability, which suits older systems, while DevOps is ideal for modern setups with its agile, automated practices.

Unveiling SharePoint

In the realm of enterprise solutions, Microsoft SharePoint has emerged as a cornerstone for fostering collaboration and managing content. Originating back in 2001, SharePoint has meticulously evolved, aligning itself with the changing dynamics of the corporate world. Today, it stands as a robust platform enabling organizations to create, manage, and share content in a highly secure environment.

Re-Imagining Cloudsmith.io

When a headhunter reached out to me about the CEO role at Cloudsmith (where I started in August!), one of the first things I did was sign up for a trial account. The product's depth and sophistication really impressed me, and contributed to my decision to go ahead with the interviews. (Glad I did.) They were right; our web interface is still largely a Django web app, tightly coupled to the back end, and you can see the Bootstrap showing everywhere.

Regulating hyperscalers: How the CMA investigation could alter cloud computing

In 2022, Ofcom, a UK regulator, began its market study into the cloud industry to investigate the dominance that hyperscalers, especially AWS and Microsoft, hold over the industry and the limits this creates for customers. This investigation follows concerns surrounding customers feeling “locked in” to a single provider, potentially leading to inflated prices in the market.

Cloud Repatriation: Example + 9 Considerations for Migrating from Public Cloud to Private, On-Prem + Hybrid

If you moved to the cloud hoping for cost savings and scalability only to find that your cloud costs are ballooning, your cloud performance isn’t up to snuff, or you’re always struggling to align compliance regulations with your cloud deployment, it might be time to look into cloud repatriation as an alternative to public cloud infrastructure.

DORA myth debunked: You ARE ready for the metrics

Even if your software development team doesn't deploy that frequently, you can still benefit from tracking DORA metrics, because they help teams focus on improving their software delivery performance. Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact, all in a way that both managers and developers love.

Restricted unprivileged user namespaces are coming to Ubuntu 23.10

Ubuntu Desktop firmly places security at the forefront, and adheres to the principles of security by default. This approach caters to both everyday users and organisations with specific compliance requirements. As such, Ubuntu ensures that its recommended security configurations are equally robust, easy to understand and readily accessible as part of the default user experience.

9 VS Code Extensions that Use Artificial Intelligence

Over the last several months, AI has been everywhere in the technology space and far beyond. Since it directly affects the tech ecosystem, however, it’s no surprise that developers have harnessed artificial intelligence to create tools that boost productivity and enhance workflows. Artificial intelligence is essentially a computer’s ability to perform tasks at the same level as (and often beyond) intelligent beings.

systemd journal logs: A Game-Changer for DevOps and Developers

“Why bother with it? I let it run in the background and focus on more important DevOps work.” - a random DevOps engineer on Reddit's r/devops. In an era where technology is evolving at breakneck speed, it's easy to overlook the tools that are right under our noses. One such underutilized powerhouse is the systemd journal. For many, it's a mere tool to check the status of systemd service units or to tail the most recent events (journalctl -f).

Use Grafana to Monitor Flask Apps With Graphite

Monitoring the performance and health of web applications is paramount for ensuring a seamless user experience. Flask offers developers the flexibility to build dynamic applications. However, as applications grow in complexity, so does the need for efficient monitoring solutions. This is where Grafana and Graphite come into play.

OpenShift monitoring: Five crucial elements to look out for

Most IT firms build their empire on Kubernetes, for its amazing flexibility and super scalability. Red Hat OpenShift Container Platform (formerly OpenShift Enterprise) is a hybrid cloud application platform powered by Kubernetes, which initially only operated on-premises, and has been in service for more than nine years.

Unified Incident Management: Merits of Combined On-Call and Incident Response | Squadcast

In this session, we explore the crucial aspects of effective on-call management and incident response in product organizations. Squadcast combines On-Call and Incident Response into a single platform using automation capabilities for enhanced reliability, continuous learning, and better productivity.

AI is not intelligence: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Bugs in NASA's codebase: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Office 365 Services Overview

Microsoft Office 365, often simply referred to as Office 365, represents a significant shift in the way we approach work in a digital environment. It’s a subscription service that ensures you always have the most up-to-date modern productivity tools from Microsoft. This suite of services and applications goes beyond the traditional software suite to offer a wide range of tools and solutions to modern-day challenges faced by organizations and individuals alike.

Top 15 Azure Cost Management Tools in 2023

Azure, Microsoft’s cloud platform, has become an essential part of modern businesses, offering a vast array of services and resources. However, effective cost management in Azure is crucial to avoid unexpected expenses and optimize spending. While Azure provides its native tools for cost management, several third-party solutions offer advanced features and capabilities to help you make the most of your Azure resources. This blog will explore the top 15 Azure cost management tools.

What is a smartNIC and how is the technology shaping modern data centres?

Data centres are going through a transformation. Gradually, we will see a new type of equipment attached to servers in almost every data centre: smartNICs are here. They will be the enablers of converged data centres where common infrastructure tasks are offloaded from a host server to attached network interface cards (NIC).

The Top Cloud Cost News From September 2023

When it comes to September’s cloud cost news, cloud-based organizations will have to take a little bit of sour news along with the sweet: Cloud costs continue to rise with inflation, and IBM cloud service users will soon see price hikes, but AWS and Azure have both recently released updates that may help you keep your costs under control. Read on to see how these changes may affect your company in the coming months.

Choosing the Right Career Path in Tech: Software Engineering vs. Site Reliability Engineering (SRE)

The tech industry is booming, and there are many different career paths. But two of the most popular and in-demand roles are Software Engineering and Site Reliability Engineering (SRE). Site Reliability Engineering (SRE) blends elements of software engineering with IT operations, focusing on reliability. On the other hand, Software Engineering (SWE) involves designing, developing, testing, and deploying software applications.

How to reduce Azure costs? | The ultimate tool for Azure cost optimization

Are you struggling to manage and understand your Azure spending? Many businesses face the same struggle in reducing and optimizing their Azure spending. This video discusses the challenges associated with Azure cost optimization and shows how Serverless360 can help you save money on your Azure bill.

Simplifying Azure for business and support users

Due to Azure's technical complexity, support and business users often face challenges in using the Azure portal. Serverless360 steps in to mitigate these challenges by providing a user-friendly experience for both IT operations and support users, bridging the common challenges in managing Azure-based solutions. In this video, Michael Stephenson contrasts the operational and business process views offered by Serverless360 with Azure Portal's deployment-focused perspective.

Using Helm and Terraform for Codefresh GitOps Installations

Last year we launched the Codefresh delivery platform powered by Argo. After the initial launch we started collecting feedback from all companies that tried it (as well as existing customers) and cataloged all feature requests and implementation ideas. The main goal is always to iterate quickly and address the most common issues in the most efficient way possible.

Atlassian Intelligence features for Bitbucket Cloud are now in beta!

We're excited to announce that Atlassian Intelligence features are now in beta and are available to all workspace admins to activate in their workspace settings. Generative AI in the editor lets you generate, transform, and summarize content while you're writing Pull Request descriptions or comments in the Bitbucket Cloud code review experience.

Tutorial: Monitoring MySQL Server Performance with Prometheus and sql_exporter

Databases in one form or another are an almost inseparable part of modern applications. A popular one among them is MySQL, on which this article will focus. But how do you monitor MySQL? This article gives an introduction to the topic.

Prometheus Dashboards

Prometheus is a very popular open-source monitoring and alerting toolkit originally built in 2012. Its main focus is to provide valid insight into system performance by providing a way for certain variables of that system to be monitored. Prometheus displays the performance of these variables as a graph to allow its users to see their system’s performance at a glance.

Multi-Service Progressive Delivery with Argo Rollouts

In the previous article of the series, we explained how to use ConfigMap generators in order to use Progressive Delivery for your configuration (and not just the container images). In this post, we will also cover another popular question: how to use Argo Rollouts with multiple services. Argo Rollouts is a Kubernetes controller that allows you to perform advanced deployment methods in a Kubernetes cluster. By default, it only supports a single service/application.

Faster, Stronger, Better: How Top Telcos Map Innovative Actions to Real Results

The telecommunications industry today is focused on delivering advanced, reliable connectivity and the highest possible performance to consumers, all while getting ahead of the cutthroat competition. Succeeding in these key areas while meeting customer expectations comes with its fair share of challenges for communications service providers (CSPs).

Breaking Out of Analysis Paralysis: Streamlining Cloud Migration with Tidal Accelerator

In Episode 1, we delve into the critical topic of overcoming “Analysis Paralysis” during cloud migration. We will explore how Tidal Accelerator can help organizations break free from the shackles of over-analysis and guide them towards successful cloud migration.

7 Best Azure Service Bus Monitoring Tools in 2023

Azure Service Bus is a cloud messaging service that transfers information between services running in both the cloud and on-premises. So, it becomes essential to ensure the performance and availability of Service Bus as it might be used in applications and integrations for transferring business-critical messages. To help you with that, we have listed and compared the top Azure Service Bus monitoring tools with their features.

Securing open source software dependencies in the public cloud

I recently recorded a Lightboard presentation on securing open source software dependencies in the public cloud. This blog summarises, and expands upon, some of the key elements from that presentation. I think about this topic through two lenses: software supply chains and updating software dependencies while maintaining stability.

Internal Developer Platform vs Internal Developer Portal: Solving for a Central System of Record, and Action

What support do developers need at large enterprises to be productive? We often fall into the trap of evaluating coders on output, maybe even innate talent. We think that the best way to build secure and efficient software is to hire 10X developers, and get out of their way. But even if the individuals have massive intellectual firepower, operational work grows like entropy in the system.

Build Your Own Network with Linux and Wireguard

Last Christmas, I bought my wife “Explain the cloud like I am 10” after she told me many times that it was hard for her to relate to what I am doing in my daily work at Qovery. While so far I have been the sole reader to enjoy the book, I was wondering while reading it whether there were any resources that explain how to build all that. Most topics are software oriented. So, in this article, I am going to explain how to build your own cloud network 🎊

Building Strong Linux Security and Compliance: CIS Benchmarks and More

What makes Linux security unique? What special considerations does Linux have across security standards like those set by The Center for Internet Security (CIS)? Every OS has its own unique considerations, and Linux is no different. We’ll also explore how Puppet can fit within your broader Linux security plan to help make hardening Linux that much easier.

Is this key finding from DORA Report 2023 holding back your team?

What's a key finding from the 2023 State of DevOps report? Nathen Harvey shared with us that teams' change review time is holding back their software delivery performance. You can use the DORA metrics to alleviate this bottleneck. Give Sleuth a try and see how we give teams actionable insights on how to improve, no-code automations to instantly ship improvements, and metrics to measure their impact — all in a way that both managers and developers love.

How to detect and prevent memory leaks in Kubernetes applications

In our last blog, we talked about the importance of setting memory requests when deploying applications to Kubernetes. We explained how memory requests let you specify how much memory (RAM for short) Kubernetes should reserve for a pod before deploying it. However, this only helps your pod get deployed. What happens when your pod is running and gradually consumes more RAM over time?

The Machine Learning Magic Suite: Anomaly Detection

Cloud computing and AI/machine learning (ML) are two powerful technologies that are even more impactful when used together. Cloud computing provides the infrastructure and resources needed to support AI/ML applications, while AI/ML enhances cloud computing by providing intelligent automation and decision-making capabilities.

Join the ITOps AI Revolution: Actionable Insights with VMware Tanzu Insights

Many organizations struggle with managing thousands of services and applications. A typical environment consists of a combination of modern cloud applications, on-premises workloads, and workloads that are in the process of being moved to the cloud. IT and operations teams can easily be overwhelmed by the large volume of data and activity that is generated across these systems.

Writing code with empathy: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

The job of a backend dev: Build good ACs: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Losing customers because of bad software: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

What you do in practice is what you do in a game: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Bugs in NASA's codebase and importance of QA in engineering: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Auto Optimize Your Observability with a Time-Based Collection Strategy

Observability has become one of the largest line items in the IT budget, second only to cloud costs. A main reason for this is teams are often stuck collecting significantly more data than they need. This is where Circonus Passport helps. Rather than filter data after it’s collected like current observability data pipeline management tools, Passport is used to filter data before it’s collected.

Our Favorite Grafana Dashboards

Grafana is an open-source visualization and analytics tool that lets you query, graph, and alert on your time series metrics no matter where they are stored. Grafana dashboards provide telling insight into your organization. All data from Grafana dashboards can be queried and presented with different types of panels, ranging from time-series graphs and single-stat displays to histograms, heat maps, and many more.

Monitoring your infrastructure with StatsD and Graphite

Collecting metrics about your servers, applications, and traffic is a critical part of an application development project. There are many things that can go wrong in production systems, and collecting and organizing data can help you pinpoint bottlenecks and problems in your infrastructure. In this article, we will discuss Graphite and StatsD, and how they can help form the basis of monitoring infrastructure.

Testing a Spring Boot API with SpringBootTest and CircleCI

When it comes to building and delivering modern web applications, the importance of continuous integration cannot be overemphasized. With the rapid pace of software development, ensuring that every change in your codebase is thoroughly tested and seamlessly integrated into your project is essential for maintaining a robust and dependable application.

Alerting, Incident Management and the SDLC | Better Incidents Podcast Ep. 8

In this episode we chat with veteran cloud architect Masaru Hoshi about the challenges of alert fatigue, the importance of effective alerting systems, and fostering ownership in software teams. Masaru shares insights from his 30-year career, emphasizing the need for balance, trust, and collaboration in incident response.

Modernize as you Migrate Series: Break Out of Analysis Paralysis with a 6 R Portfolio Assessment

Unlock the secrets to streamlining cloud migration and breaking free from Analysis Paralysis with Tidal Accelerator. Discover how Tidal Accelerator empowers organizations to make informed decisions, optimize costs, and accelerate their cloud migration journey.

Global Event Rulesets: Streamlining Alert Routing Across Services

In the fast-paced world of organizations handling numerous microservices and projects, tackling the challenges that arise can be a daunting task. As many of our customers come with infrastructures that include a large number of microservices, we set out to make it easier for them to streamline alert source management. Enter Global Event Rulesets (GER). This feature is designed to redefine the way you manage alerts.

The Ultimate SaaS Unit Economics Guide: Calculating Your Unit Costs

Measuring and monitoring unit economics can help your SaaS brand make informed business and engineering decisions. But how do you get that data and what exactly are SaaS unit economics? We’ll cover exactly what SaaS unit economics are, metrics you should monitor, how to calculate your unit economics, and the tools you can use to be successful.

CloudZero Launches Automated Savings Insights, Helping Engineers Prioritize Cost Efficiency

Getting to cloud efficiency is both an art and a science. The art has long been CloudZero’s specialty: giving engineers complete visibility into the costs associated with their cloud infrastructure, putting it all in a business context, and helping them make proactive decisions about how to build and scale efficiently. In other words, the art is on the human side: giving human engineers the visibility and context to understand and control cloud costs.

Implementing Backstage 2: Using the Core Features

This article is the second installment of the “Implementing Backstage” series and focuses on how to use Backstage’s core features. Backstage has an extensible plugin architecture in active development and large community support and offers simplified tool management, workflow optimization, and time-saving features. However, to reap these benefits, you need to know how to use Backstage’s core features, including its software catalog, templates, documentation, and search.

Announcing HAProxy Enterprise 2.8 & HAProxy ALOHA 15.5

HAProxy Enterprise 2.8 and HAProxy ALOHA 15.5 are now available. Users of our enterprise-class software load balancer and hardware/virtual load balancer appliance who upgrade to the latest versions will benefit from all the features announced in the community version, HAProxy 2.8, plus some features that enhance the flexibility of our enterprise security options to meet even more use cases.

What is a SharePoint Site Collection?

SharePoint, born from the tech giant Microsoft, is not just another application; it’s a robust platform that’s been transforming the way businesses handle their internal processes for years. At its core, SharePoint is designed to streamline collaboration and document management. But what does that mean in layman’s terms? Imagine a vast digital library, where instead of books, you have documents, images, videos, and other digital content.

Configuration Drift: Understanding, Avoiding, Managing and Resolving in Kubernetes

If you work with Kubernetes, you know that any number of issues can pose a serious threat to the stability and security of your deployments. One that's subtly damaging is configuration drift, which occurs when the actual state of how your system is set up, its configuration, strays from the way you defined it. Configuration drift in Kubernetes can happen when people make changes manually, systems aren't synchronized properly, or monitoring falls short.

Charmed Kubeflow 1.8 Beta is here

Have you heard the news? Charmed Kubeflow 1.8 is available in Beta. Kubeflow is the foundation of Canonical MLOps. The latest release brings improved capabilities to personalise different components of the platform, including the images that can be used in Notebooks. We are looking for data scientists, machine learning engineers, creators and AI enthusiasts to take Charmed Kubeflow 1.8 Beta for a test drive and share their feedback with us.

Whose fault was it anyway? On blameless post-mortems

No one wants to be on the receiving end of the blame game—especially in the wake of a major incident. Sure, you know you were the one who made the final change that caused the incident. And hopefully, it was a small one that didn’t cause any SEV-1s. Still, the weight of knowing you caused something bad should be enough, right? Unfortunately, sometimes fingers get pointed, your name gets called, and suddenly, everyone knows that you’re the person who created more work for everyone.

Production vs Local in engineering: Piyush Verma - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

What is Zero Trust Reliability in engineering: Piyush Verma - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Stacked Pull Requests | GitKon 2022 Rewind | Tomas Reimers, Graphite

Step back into the vibrant world of GitKon 2022 with GitKon Rewind, spotlighting the captivating session, "Stacking PRs: How to be a 10x Engineer." We're thrilled to re-stream this eye-opening session, illuminating the game-changing benefits of stacked pull requests in today’s developer workflows. With Tomas Reimers steering the discourse, delve into the art of stacking—a smart branching maneuver off preceding feature branches, bypassing the traditional trunk route—unveiling a pathway to dodge development blockers with finesse.
Sponsored Post

Better CI/CD with GitHub Actions and deployment tracking

Understanding the impact of each of your deployments is crucial, especially as they become increasingly frequent. Chances are, your team is either aiming to increase shipping velocity or has already started deploying "continuously" (which is to say, multiple times a day). The biggest tech teams at the likes of Amazon and Google deploy thousands of times daily, and Atlassian has found that 75% of enterprise DevOps teams call deployment frequency their most important success criterion. And while CD comes with a host of well-established benefits, it also heightens the risk of introducing new errors and issues.

The Best Cloud Infrastructure Automation Tools

The past decade has seen drastic growth in the adoption of public cloud. One of the primary reasons for this is its cheaper infrastructure and ease of scale. With such rapid adoption of public cloud, the need for infrastructure automation also arises. This is because teams want to quickly provision infrastructure and automate tasks, cutting work that took weeks in traditional data centers down to minutes in the public cloud.

What are Prometheus Functions?

Prometheus is a platform for real-time systems and event monitoring and alerting. The Prometheus project is free, open-source, and available on GitHub. Originally developed at SoundCloud, Prometheus became a project of the Cloud Native Computing Foundation in 2016, alongside other popular frameworks such as Kubernetes. The core of the project is the Prometheus server, which acts as the system’s “brain” by collecting various metrics and storing them in a time-series database.

SaaS COGS: What To Consider In Your Cost Of Goods Sold

It can be challenging to translate complex engineering concepts to business leaders. If you frame things well, though, you can have more productive conversations that lead to greater alignment between engineering and the rest of the business. With that in mind, I’d like to share some of my recommendations for discussing cost with your leadership team — and hope it might help you have stronger cost conversations.

Configuring Python StatsD Client

Building and deploying highly scalable, distributed applications in the ever-changing landscape of software development is only half the journey. The other half is monitoring your application states and instances while recording accurate metrics. There are moments when you wish to check how many resources are being consumed, how many files a particular process is accessing, etc. These metrics provide valuable insights into our tech stack execution and management.

Zenbleed vulnerability fix for Ubuntu

On 24 July 2023, security researchers from Google’s Information Security Engineering team disclosed a hardware vulnerability affecting AMD’s Zen 2 family of microprocessors. They dubbed this vulnerability “Zenbleed” (CVE-2023-20593), evoking memories of previous vulnerabilities like Heartbleed and hinting at its possible impact.

Fostering a fearless engineering culture: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

The mistake boot in engineering: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

What's missing in engineering today?: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Engineers should have a desire to find bugs: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

The only industry not licensed to do their job - Engineering: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Software is ubiquitous and can change our mood: Piyush Verma - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

My job is an engineer = build ACs: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

The job of a backend dev: Build good ACs - Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Virtana Named in Prestigious Industry Research by Gartner

Virtana’s AI-powered platform is at the forefront of IT infrastructure management, offering a comprehensive suite of tools and services that empower IT leaders to make informed decisions on how to forecast demand and streamline operations. The rapid evolution of technology has ushered in an era of complexity and dynamism that IT leaders must navigate effectively.

Incident response and awareness acceleration: What we can learn from responders of Queenstown floods.

I was visiting Queenstown, New Zealand last week amidst the horrible floods, which quickly escalated. As an incident responder myself, I was amazed at the operations and how fast responders on the ground acted in evacuating and clearing the grounds. Over 100 people were evacuated in the middle of the night with zero casualties. A commendable job. Here are some observations I made and what we can learn as incident responders ourselves.

Accelerated Remediations: How to Maximize AIOps Investments in Network Operations

So, you’ve spent some money and you’re the proud owner of a shiny new AIOps tool that helps improve your Network Operations. Network alarms are now usable, but with all the constant monitoring, supervision, and incident management, your Network Operations Center (NOC) is still overwhelmed. It’s time to pull out another stop.

A Journey through the Blameless Resource Library

From the very beginning of Blameless, we had two vital missions. First, to offer a solution to what we saw as a mounting crisis of reliability by offering a comprehensive, easy-to-use, reliability platform. Second, to educate the companies facing this crisis on the fundamentals of incident management, cutting-edge best practices, and the cultural values that sustain learning and growth.

The WeWork-ization of software: Piyush Verma - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

The most beautiful thing about Kubernetes: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Stop using debuggers, learn a mental model of a codebase: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

Observability-OSS vs Paid vs Managed OSS with Hosted Graphite

Observability is a critical aspect of modern software development and infrastructure management. It involves the ability to gain insights into the internal workings of your systems, applications, and services through monitoring and collecting relevant data. With the increasing complexity of technology stacks and the need for real-time visibility, observability has become a fundamental requirement for businesses across various industries.

On-Prem vs. Colo vs. Cloud vs. Hybrid: How to Choose

With the ever-increasing demand for digital services, organizations face the questions of how they will keep up and where they will store and access their data and applications. The evolution of hosting options from on-premise infrastructure to colocation, cloud, and hybrid deployments has provided a diverse set of choices for organizations, and it is critical to select the option that best suits your unique needs.

Enterprise Chaos Engineering Certification Prep Session

Demonstrate your reliability expertise, increase your visibility, and advance your career with a Gremlin Enterprise Chaos Engineering certification. Chaos Engineering continues to grow in popularity and is rapidly becoming a job requirement for Engineering teams focused on reliability. In this webinar, Sr. Reliability Specialist Andre Newman goes over the mindset shifts, best practices, and key information you need to prep for your certification.

In engineering, DON'T BUILD FAST: Bill Kennedy - The Reliability Podcast

The Reliability podcast aims to speak with engineers who have worked on large, complex systems and glean through their learnings. What best practices should one imbibe? What are non-negotiable learnings to become better at a craft? What’s ‘engineering’ going to be like with the advent of AI? We answer these and more tracing personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

GitKraken Client 9.9 Release: Cloud Patches, Snooze & Pin, & more

It’s like passing a note in class, but with code! GitKraken Client 9.9 is here (and it’s a big one) with exciting features like Cloud Patches, which introduces a new way to share code, jump-start features, and fix issues without the hassle of pull requests. Plus, enhance your Focus View with Pinning & Snoozing options, and enjoy a more customizable commit graph.

SSO is now available

We now support SSO (single sign-on), offering an improved login experience for our customers. SSO can be enabled on our website. We want our customers to have a great experience when using our products, and part of that is an easy sign-in experience for users. Enabling SSO will remove the need for users to use their Redgate ID and password when signing into the customer portal and compatible products.

Designing Your Cloud With Failure In Mind

Implementing any cloud development project can be tricky and frustrating, especially when you are under pressure from time constraints, reactive approaches, or cost-saving measures. However, there are some things you can do to implement solutions in your cloud architecture for long-term scalability and risk mitigation. Rather than applying short-term fixes until the problem arises again, consider designing your cloud with failure in mind by anticipating worst-case scenarios. It might sound counterintuitive or obvious.

Top DevOps Experts offer Key Insights at swampUP

With five keynotes and 15 breakout sessions in one day, there was no shortage of important industry knowledge and key insights from this year’s JFrog swampUP DevOps and DevSecOps user conference. Presenters discussed the role of DevOps at Netflix, how Fidelity migrated to the Cloud, the trend of shifting further left than left, and more. In this post we highlight the three presentations below that challenged attendees to rethink the status quo and reassess their own DevOps and security practices.

The new principles of incident alerting: it's time to evolve

In the ever-evolving world of software engineering, the landscape is constantly shifting. New technologies emerge, best practices evolve, and how we build and run software continues to change. However, when it comes to incident alerting, it often feels like we're stuck in the past.

Digital transformation trends in oil and gas

The Oil and Gas sector, responsible in part for two industrial revolutions in the last 300 years, has been something of a laggard when it comes to adoption of new technologies. Cloud and edge penetration in Oil and Gas is late compared to other industries, mainly due to long standing concerns around security and data privacy.

Double Down on Your Backups

In August, a ransomware attack hit another company. Unfortunately, it hit a regional cloud provider in Europe this time, and we can call this a “critical hit.” So far, we know a virtual server got compromised and used as a jump host; from there, the attacker started to encrypt all volumes in the same domain. Based on pure luck or some profound reconnaissance, the same server migrated into a different data center and continued its unplanned job from there.

Release Roundup Sept 2023: Measurably improve reliability

It’s been another busy few months here at Gremlin. Overall, our team has been working on feature improvements to enable teams to measurably improve the reliability of their systems, whether that’s through broadening platform support so you can run Gremlin in more places, making it easier than ever to identify reliability risks, or improving reporting so you can manage reliability programs effectively at enterprise scale. Here’s a summary of what’s new.

What is IPAM (IP address management)

The ability to manage IP addresses within a network is crucial for effective network management, especially as networks become more complex and have to manage more demanding loads. Assigning hundreds or even thousands of IP addresses to devices that may be highly distributed or disparate is no simple task. Once devices leave the network, those IP addresses may need to be deleted, plus there’s always the risk of IP address conflict.

Kubernetes Autoscaling for Continuous Integration/Continuous Deployment

In Continuous Integration/Continuous Deployment (CI/CD), the ability to adapt swiftly to fluctuating workloads is paramount. Kubernetes, with its dynamic orchestration capabilities, offers an invaluable toolset for achieving seamless scalability. This article explores the concept of Kubernetes autoscaling and its pivotal role in optimising CI/CD pipelines.

Free Preview Environments For Open-Source Projects

We at Qovery are excited to offer our Preview Environments for free to all open-source projects. A Preview Environment is like a sandbox where developers can see how changes to the code will work before these changes are final. This is great for projects where many parts, like the backend, frontend, and databases, must talk to each other.

Azure File Storage Best Practices

Azure File Share is a cutting-edge service offered by Microsoft’s Azure platform. This robust solution allows seamless integration of serverless file sharing capabilities accessible through industry-leading protocols such as SMB, NFS, and Azure Files REST API. When effectively utilized, Azure File Share can drastically improve the file-sharing experience for cloud-based and on-premises deployments. In the realm of Dynamics 365 Business Central SaaS, it has demonstrated unparalleled benefits.