Operations | Monitoring | ITSM | DevOps | Cloud

January 2024

RCAs Within Incident Management Tools

The IT world thrives on uptime, efficiency, and seamless experiences. But amidst software and servers, glitches and disruptions threaten to bring operations to a halt. When these disruptions arrive, Incident Management takes center stage, collecting resources to restore order and minimize the chaos. Yet, simply fixing the immediate issue isn't enough. Preventing future disruptions requires delving deeper, finding the root cause, the reason that triggered the incident.

Introducing ManageEngine DDI: The key to unlocking the full potential of your critical network infrastructure

Building a future-ready network begins with integrating three core network services: DNS, DHCP, and IPAM, collectively known as DDI, which serves as the heart of network connectivity and operations.

How to get SharePoint Online Library Size

Navigating through the complexities of document management in SharePoint Online is essential in today’s digital landscape. SharePoint Online offers robust solutions for managing vast amounts of data. However, understanding and managing the size of these libraries is key to maintaining an efficient and streamlined digital workspace.

A practical approach to on-call compensation

Asking engineers to be on-call is usually a tough sell. Think about it: if someone asked you to add even more to your already packed workload, that would be a difficult proposition to say yes to. And that’s before you mention that this work typically happens late into the day and even (some) sleepless nights. Companies need to have an on-call function to keep their services and products running smoothly—it’s practically a non-negotiable at this point.

Product discovery in action | Atlassian Presents: Unleash | Atlassian

“Product discovery.” If you’re a product manager, you probably have some idea of what it is, but it’s a term that can mean different things to different people. That makes it hard to know how to get started – and whether or not you’re doing it right! The best way to learn about product discovery is to see how it works in real life for one team. In this session, you’ll learn the practices the Jira Product Discovery team used when building Jira Product Discovery.

Still running Ubuntu 18.04? What you need to know

Ubuntu 18.04 LTS, installed by millions of users and continuing to have a massive footprint on AWS, hit its End of Standard Support in May 2023. When using an unsupported version of Ubuntu LTS, your system and your end users are vulnerable to security risks. Not every company has the time or resources to undertake a migration project to a later and supported Ubuntu LTS distribution which is why many are adopting Ubuntu Pro.

Driving into 2024 - The automotive trends to look out for in the year ahead

With multiple technological innovations all converging at the same time, we are living in an exciting era for the automotive industry. From AI to 5G, and plenty in between, we can expect to see a host of groundbreaking trends emerge this year. As electric vehicles (EVs) completely disrupt the market and the OEMs’ business strategies, the customer focus is shifting away from traditional internal combustion engine (ICE) vehicles, challenging the way that cars are being built and designed.

Supercharge Your Azure Savings Strategy with Azure Dev/Test Subscription

An Azure subscription is a fundamental concept in the billing and management structure of Microsoft Azure. It serves as an agreement with Microsoft to use Azure services, where the services used are either paid for or are part of a free offer.

Enhancing Service Reliability: Uniting Rootly's Incident Management and Backstage's Software Catalog

In today's fast-paced digital landscape, ensuring the reliability of services is paramount for businesses aiming to deliver seamless user experiences. However, as the complexity of companies' environments grows, ensuring your services, infrastructure and applications are reliable and resilient to failure is challenging. It’s naive to think all services and infrastructure are operating 100% as designed.

#017 - Kubernetes for Humans Podcast with Nilesh Gule (DBS Bank)

Nilesh is a Vice President at DBS Bank, where he leads the design and implementation of scalable and innovative solutions using hybrid cloud, big data, and emerging technologies. He has over two decades of experience as a software developer, architect, and leader in various domains such as BFSI, healthcare, and retail. He has been a Microsoft MVP for Azure since 2018 and is the first Docker Captain in Singapore, recognized for his excellent contributions to technical communities.

GitKraken Workshop: Making Sense of Multi-Repo Madness

It's painful managing multiple repos. Whether you're a developer juggling between different app windows, or a manager looking to understand what's happening – GitKraken Workspaces and Insights provide an easy way to organize the chaos into something informative and actionable. (And don't worry. If you're working with a mono-repo, we'll have insights for your use case too.) In this workshop, Trevor Polidore and Kevin Bost explore ways to save time and reduce headaches when working with a large number of repos.

8 Powerful CLI Extensions on GitHub in 2024

The command line interface (CLI) has its roots embedded in the early eras of computing, back when storage was measured not in gigabytes, but in square feet of room space. Unlike today’s icon-filled screens, the CLI offers a text-only portal into the depths of a computer, using typed commands to perform operations. This simplicity masked an underlying robustness and efficiency, which has endeared it to generations of developers.

Zurich, a new low-carbon-powered public region, now available on Platform.sh

The new Platform.sh Swiss public region empowers organizations to meet sustainability, data localization, and compliance requirements in the Zurich Google Cloud region. Our new public region has been launched in response to increasing demand from Swiss businesses—and those with an international footprint doing business in Switzerland—that need to ensure their data is held locally.

Advancing Platform Engineering with Northflank and Civo

Mastering platform engineering is becoming critical for businesses aiming to streamline development processes, enhance collaboration, and scale infrastructure efficiently. Join Dinesh Majrekar (CTO at Civo) and Will Stewart (Co-Founder & CEO at Northflank) as they explore the intricacies of building robust, scalable platforms using Northflank on Civo's cloud infrastructure. Our panelists will guide you through the nuances of efficient application delivery, leveraging the synergies between Northflank and Civo for enhanced performance and scalability.

Best Programming Languages for DevOps in 2024

We're StatusPal. We help DevOps and SRE engineers effectively communicate to customers and stakeholders during incidents and maintenance with a super-charged hosted status page. Check us out—your status page can be up and running in minutes. As the DevOps and Site Reliability Engineering (SRE) fields continue to mature in 2024, the choice of programming languages has become more critical than ever.

Simple. Streamlined. Secure. Effortless user management has arrived with Teams

Have you ever found yourself dreaming of an easier way to collaborate on projects? Tired of individually managing users across multiple projects? Or perhaps you long for the day when auditing user access across your organization can be done in a flash? Well, you’re in luck!

Introduction to Charmed Spark, A cloud-native Apache Spark solution on Kubernetes

🚀 A cloud-native Apache Spark® solution on Kubernetes with 10 years of support, compliance, and security maintenance, Charmed Spark® is now available! Enterprise data engineers want Apache Spark® with the ease and long-term security commitment of Ubuntu, and Charmed Spark is the first of many Canonical open-source data solutions designed for reliability and multi-cloud operations.

Chaos To Control: Incident Management Process, Best Practices And Steps

Did you know that only 40% of companies with 100 employees or fewer have an Incident Response plan in place? Does that include you too? Even if it doesn't, this blog post is for you. Explore the Incident Management processes, best practices and steps so you can see how your current IR process compares and whether you need to revamp it.
Sponsored Post

The Pulse Of Technology: Why IT Monitoring Is Non-Negotiable In 2024

It's 2024 already, and it wouldn't be wrong to say that IT monitoring is indispensable for operational resilience. The global IT monitoring tool market was worth USD 17,150 million in 2022 and is projected to reach USD 60,302.6 million by 2031, exhibiting a CAGR of 15%. All the more reason to understand why IT monitoring is an absolute non-negotiable. So, in this blog we'll explore the significance of IT monitoring in the face of modern technological challenges.
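As a quick sanity check on the market figures quoted above, the implied compound annual growth rate can be recomputed from the two endpoints. This is a minimal sketch that assumes nine compounding years between 2022 and 2031; the function name `cagr` is illustrative and not taken from the original post.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.15 means 15%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the excerpt, in USD millions.
# Assumption: 2022 to 2031 counts as nine compounding periods.
rate = cagr(17150.0, 60302.6, 2031 - 2022)
print(f"Implied CAGR: {rate:.1%}")
```

The result lands very close to the 15% stated in the excerpt, so the three numbers are consistent with each other.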

Kubernetes Tutorial for Developers

Welcome to our hands-on tutorial on Kubernetes, the powerful open-source platform often abbreviated as k8s. In this tutorial, we are diving into the world of container orchestration, simplifying the complexities, and making Kubernetes accessible for developers. Whether you are just starting out or looking to enhance your existing skills in Kubernetes, this guide is designed to walk you through the process step-by-step.

AI on-prem: what should you know?

Organisations are reshaping their digital strategies, and AI is at the heart of these changes, with many projects now ready to run in production. Enterprises often start these AI projects on the public cloud because of the ability to minimise the hardware burden. However, as initiatives scale, organisations often look to migrate the workloads on-prem for reasons including costs, digital sovereignty or compliance requirements.

Making Informed Software Selection Decisions: MCDA and DMA Compared

When faced with complex software selection choices, navigating the numerous factors involved can be difficult. That's where Multi-Criteria Decision Analysis (MCDA) and Decision Matrix Analysis (DMA) come into play. Both methods provide structured and transparent approaches to evaluating options, but they differ in complexity and suitability for different types of decisions.
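To make the idea of a decision matrix concrete, here is a minimal sketch of a weighted-sum scoring, which is one common way to fill in such a matrix. The criteria, weights, tool names, and 1-to-5 ratings are all hypothetical and chosen only for illustration; fuller MCDA methods add steps such as normalization and sensitivity analysis that are not shown here.

```python
def weighted_scores(weights, ratings):
    """Return the weighted-sum score of each option.

    weights: criterion -> importance (here they sum to 1.0)
    ratings: option -> {criterion -> rating on a 1-5 scale}
    """
    return {
        option: sum(weights[c] * scores[c] for c in weights)
        for option, scores in ratings.items()
    }


# Hypothetical criteria and ratings, for illustration only.
weights = {"cost": 0.5, "security": 0.3, "usability": 0.2}
ratings = {
    "Tool A": {"cost": 4, "security": 3, "usability": 5},
    "Tool B": {"cost": 3, "security": 5, "usability": 3},
}

totals = weighted_scores(weights, ratings)
best = max(totals, key=totals.get)
print(totals, "->", best)
```

With these made-up numbers the cheaper option wins because cost carries half the weight; shifting weight toward security would flip the ranking, which is the kind of trade-off a structured method makes visible.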

Mastering Workflow Automation in SharePoint Online

The integration of workflow automation in SharePoint Online marks a significant evolution in business process management. With the emergence of tools like Power Automate, SharePoint Online transcends its role as a document management system, becoming a powerful platform for streamlining a variety of business operations. This advancement is not just a matter of convenience; it’s a strategic transformation, enabling organizations to optimize productivity and efficiency.

Managing the Talent Gap

As we dive into 2024, the relentless march of technological progress, combined with economic green shoots (the year the ‘UK turns a page on the difficult post-pandemic years’), should make this an exciting opportunity for Data Centres and the talent who work in them. And whilst people are clearly excited about the opportunities this presents for the industry, there remains a nervousness.

What can language models actually do well? | #GitKraken CTO at #Dockercon #shorts

What are language models good at, and where do they struggle? 🤖 While LMs are improving by the day, they still struggle with handling the "gray area" around certain tasks. But simple problems & solutions? That's where they shine. ✨

Micrometer: The Gold Standard in Observability

In the dynamic realm of observability frameworks, one name stands out as the unspoken gold standard with the ability to guide developers through the intricacies of monitoring and data collection. Let’s look at the heart of open-source technologies and uncover the essence of a tool that has become the industry's beacon of excellence.

Team DevOps effectiveness scorecard overview | Atlassian Analytics Demos | Atlassian

Watch this demo for an overview of the Team DevOps effectiveness scorecard dashboard template in Atlassian Analytics. To ensure we’re providing useful content, please let us know if this video is helpful by liking it. If you have additional feedback, feel free to share in our Community or through a support ticket.

Organization DevOps effectiveness scorecard overview | Atlassian Analytics Demos | Atlassian

Watch this demo for an overview of the Organization DevOps effectiveness scorecard dashboard template in Atlassian Analytics. To ensure we’re providing useful content, please let us know if this video is helpful by liking it. If you have additional feedback, feel free to share in our Community or through a support ticket.

Cross-team DevOps effectiveness scorecard overview | Atlassian Analytics Demos | Atlassian

Watch this demo for an overview of the Cross-team DevOps effectiveness scorecard dashboard template in Atlassian Analytics. To ensure we’re providing useful content, please let us know if this video is helpful by liking it. If you have additional feedback, feel free to share in our Community or through a support ticket.

System Reliability Metrics: A Comparative Guide to MTTR, MTBF, MTTD, and MTTF

In the ever-evolving landscape of technology, where systems and applications play a pivotal role in our daily lives, ensuring their reliability has become a critical concern for organizations. Unforeseen incidents and downtime can lead to significant financial losses, damage to reputation, and decreased customer satisfaction. In the realm of incident management and site reliability engineering (SRE), understanding and leveraging key reliability metrics is essential.
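The metrics named in the title can be illustrated with a small calculation over an incident log. This is a sketch under stated assumptions: the two incidents and the 30-day window are invented, MTTR is measured from incident start to resolution, and MTBF is taken as total uptime divided by the number of failures. Teams define these terms differently, so treat the definitions below as one possible convention.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started: datetime   # when the failure began
    detected: datetime  # when monitoring or a human noticed it
    resolved: datetime  # when service was restored


def mean(deltas):
    """Average a non-empty list of timedelta values."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: start of failure to detection."""
    return mean([i.detected - i.started for i in incidents])


def mttr(incidents):
    """Mean time to restore: start of failure to resolution (assumed definition)."""
    return mean([i.resolved - i.started for i in incidents])


def mtbf(incidents, window_start, window_end):
    """Mean time between failures: uptime in the window per failure."""
    downtime = sum((i.resolved - i.started for i in incidents), timedelta())
    return ((window_end - window_start) - downtime) / len(incidents)


# Hypothetical incident log for a 30-day observation window.
log = [
    Incident(datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 10), datetime(2024, 1, 5, 11, 0)),
    Incident(datetime(2024, 1, 20, 2, 0), datetime(2024, 1, 20, 2, 20), datetime(2024, 1, 20, 4, 0)),
]
start, end = datetime(2024, 1, 1), datetime(2024, 1, 31)

print("MTTD:", mttd(log))
print("MTTR:", mttr(log))
print("MTBF:", mtbf(log, start, end))
```

MTTF is usually reserved for non-repairable components and is computed the same way as the uptime average, so it is omitted here.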

On-Demand Webcast: Unleashing FinOps

The growing popularity of FinOps is creating an opportunity for you to level up your entire approach to IT financial and cost management by embracing FinOps as a foundational discipline that you apply to your entire technology estate. It’s not just about saving money — it’s about making smarter, data-driven decisions that fuel growth and innovation.

21+ DevOps Monitoring Tools Vital To Success

DevOps emphasizes continuous improvement, rapid delivery, and cost optimization. It does this by recommending several engineering best practices you can implement in your IT environments. DevOps also emphasizes automation to improve efficiency and engineering velocity in software delivery. In this guide, we will cover the importance of DevOps monitoring. This will include what to monitor and a list of some of the best monitoring tools for DevOps teams.

How To Maximize Free Cloud Resources Without Overstepping

Navigating the cloud computing landscape often involves a delicate balancing act between leveraging available resources and managing costs. Amazon Web Services (AWS), a leader in cloud services, offers an enticing proposition through its Free Tier, especially with services like Amazon EC2 and RDS. These offerings present a unique opportunity for IT professionals to push the boundaries of cloud computing, but they also pose the challenges of staying within budgetary constraints.

Announcing the Mattermost Trustcenter

Our mission is to make the world safer and more productive by developing and delivering secure, open source collaboration software. And that mission starts with ensuring that our customers can make informed decisions about their software choices. That’s why we’re excited to introduce the Mattermost Trustcenter.

The Debrief: Why we killed our Slackbot and bought incident.io with Michael Cullum of Bud Financial

For financial services companies, good incident management is absolutely critical—maybe more so than in other industries. So, for Michael Cullum and his team at Bud Financial, the choice to build an incident response tool felt right for them in the moment. But very quickly, Michael and the team came face-to-face with the myriad limitations that come with building your own response tooling.

What to expect from Civo in 2024: GPUs, Navigate, and ML

From the announcement of our first tech event, Civo Navigate, to building a machine learning landscape, 2023 has been an incredible year for Civo. Last year enabled us to shift our focus towards community engagement, product enhancements, and sustainability, allowing us to foster a more robust, inclusive, and environmentally conscious technology ecosystem. Let’s take a look back at the incredible work the team has been working on over the past year and what we can expect to see in 2024.

What's new with AWS for 2024

Although IT spend, including on cloud services, cooled somewhat in 2023, the second half of the year showed signs of stability, even resilience, buoyed in part by the growing interest in generative AI. For cloud market leader AWS, with 31% global share, the company’s efforts to cut costs and enhance efficiency resulted in substantial profit improvements. Like its peers, AWS also sees great potential in AI and came out with a slew of capabilities towards the end of the year.

Seven Jellyfish alternatives driving engineering efficiency and impact

Jellyfish is one of the most popular engineering management platforms, offering comprehensive insights into engineering organizations, their tasks, and operational processes. Engineering management platforms aggregate and analyze metrics from various tools and systems that enable the software delivery process and development lifecycle. Jellyfish and other engineering management platforms aim to connect key development processes and decisions to overarching business goals.

Reliability At Your Fingertips | Squadcast

Reliability Automation Platform from Squadcast! Squadcast helps global teams streamline Incident Management with a unified platform for on-call and incident response. We help teams at over 500 businesses around the world to automate tasks, get notified of critical events, and work together to resolve incidents and minimize impact to business. Key Features of Our Reliability Automation Platform.

Create a Follow-the-Sun On-Call Model

Explore the efficient setup of a Follow-the-Sun on-call model using Spike.sh. This video provides a step-by-step guide for tech professionals to implement this global, time-zone-optimized on-call strategy seamlessly. Enhance your team's responsiveness and reduce burnout with our expert tips and insights. Perfect for IT and DevOps teams aiming for 24/7 incident management without compromising on efficiency.

What is a Failover Cluster? How It Works & Applications

Seamless operations and system resilience are critical concerns in modern business IT, and failover cluster technology can play a major role in achieving these goals. This article delves deep into the core of failover clusters, exploring their functionality and applications. Whether you’re a seasoned IT professional or a curious enthusiast, read on to understand the intricacies of failover clusters, discover how they work, and see scenarios where they prove indispensable.

How Organizations Hire SREs: Laterals or Internal?

Securing reliable system operation necessitates building a formidable Site Reliability Engineering (SRE) team. However, a critical strategic decision confronts every organization: do we cultivate SRE talent internally or venture into the external talent pool? Both approaches possess distinct advantages and disadvantages, each impacting the composition, skillset, and overall effectiveness of the SRE team.

How to Optimize Your AWS Costs in 2024: Top Vendors, Tips, and More

Sigh... in the blink of an eye, AWS now has hundreds of thousands of SKUs, 180 different types of services, 17 different ways to launch a container, and dozens upon dozens of regions and availability zones. No wonder AWS billing has become so complex to understand. Is there a fix?

Empowering Productivity: Unleashing the Full Potential of Device Management Solutions

Effective device management solutions are crucial in today’s dynamic workplace, where working with several devices is the norm. As companies navigate hybrid models, remote work, and an expanding ecosystem of devices, device management solutions become necessary. In this article, we will look at the fundamentals of this technology paradigm and see how firms can increase productivity by making the most of device management solutions.

The testing pyramid: Strategic software testing for Agile teams

The testing pyramid model untangles the complexity of software testing by fitting it into an efficient hierarchical structure. By focusing on unit tests at the base, integration tests in the middle, and end-to-end tests at the top, the testing pyramid ensures that most testing efforts are spent on tests that are fast, reliable, and easy to maintain. This allows for quicker iterations, improved code quality, and more stable releases.

Integrating AI and DevOps for Software Development Teams

For a long time, the domains of Machine Learning and AI on one side, and software development on the other side, were separate kingdoms. Sometimes, they touched, and something magical would happen. But more often, things didn’t really work out. They faced challenges stemming from a lack of mutual understanding, shared language, and compatible tools. With the meteoric rise and increased accessibility of powerful generative AI and LLMs, the need for collaboration to achieve real-world engineering and customer value has never been more vital.

The Top 25 Vendor Selection Software and Tools in 2024

I’m LV, co-founder and CEO of Taloflow, a technology selection platform built for the AI productivity era. I’ve spent more than a year interviewing over 200 enterprise IT procurement professionals and IT and Engineering decision-makers and have found that the space of software selection and procurement is undergoing a sea change in expectations. Based on the interviews, I’ve determined nine steps in the software procurement lifecycle, but these are more fluid than ever.

Role of Human Oversight in AI-Driven Incident Management and SRE

In the fast-paced landscape of technology, AI-driven Incident Management and Site Reliability Engineering (SRE) have emerged as critical components in ensuring the seamless functioning of digital systems. AI algorithms are increasingly employed to detect, diagnose, and resolve incidents with unprecedented speed and efficiency, revolutionizing the traditional approaches to reliability.

Blameless CommsAssist - 3 Tips on Making Incident Communication Easy

When you’re in the thick of an incident, communication is both essential and challenging. A wide variety of stakeholders will need timely updates on the situation in order to respond effectively. At the same time, breaking away from the actual diagnostic and resolving work to send these updates can massively slow progress.

AI Explainer: Demystifying Embeddings

In a previous blog post, entitled "What's Our Vector, Victor?," I went through the basics of vector databases. That post explained how vector databases are used by large language models, and one of the concepts it included was a brief explanation of embeddings. So, let's dig in a little more on this. Embeddings, in the context of vector databases, refer to vector representations of data points or entities within the database.
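To show what "vector representations" means in practice, here is a minimal sketch that compares toy embeddings using cosine similarity, a common similarity measure for vectors. The three-dimensional vectors below are invented for illustration; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors):
    """Name of the stored item whose vector is most similar to the query item."""
    others = (name for name in vectors if name != query)
    return max(others, key=lambda n: cosine_similarity(vectors[query], vectors[n]))


# Made-up embeddings: related concepts point in similar directions.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

print(nearest("cat", embeddings))
```

A vector database performs the same kind of nearest-neighbor lookup, but over far more vectors and with indexing structures that avoid comparing the query against every stored item.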

What is Bash Scripting? Tutorial and Tips

Linux is a powerful and versatile operating system that’s widely recognized for its robustness, security, and open-source nature. It is a Unix-like OS, which means it shares many characteristics with the original Unix system developed in the 1970s. Linux has become a preferred choice in various computing environments, from personal computers and servers to embedded systems and supercomputers.

What Is Continuous Delivery and How Does It Work?

Continuous delivery (CD) is an application development practice that involves automatically preparing code changes for release to a production environment. Combined with continuous integration (CI), continuous delivery is a key aspect of modern software development. Together, these two practices are known as CI/CD. Properly implemented CI enables developers to deploy any code change to testing and production environments late in the software development lifecycle (SDLC).

10 Top Kubernetes Alternatives (And Should You Switch?)

Containers and microservices are revolutionizing how distributed applications are built, run, and optimized. They enable apps to be highly scalable. You can also isolate some areas for updates and patches without shutting down the entire application or service. Yet, managing containers and microservices at scale can be tricky. That’s where a container management platform like Kubernetes comes in – or, as you’ll see below, where the top Kubernetes alternatives shine.

The need for speed: achieving high data rates from data centre to cloud

Having decided to embrace the opportunities of the cloud, be it a full migration, or developing a hybrid infrastructure, one of the first questions a business faces is how to transfer all its data into the cloud. This question will evolve rapidly, as the business realises that not only must it continually transfer new data to the cloud, but that the amount of data is going to grow massively.

Confused by Kubernetes Multi-Tenancy? A Workshop with Dario Tranchitella - Navigate Europe 23

Dive into the world of Kubernetes multi-tenancy with Dario from Clastix in this Navigate workshop. Explore the complexities of Kubernetes environments and learn about innovative solutions like Capsule and Paralus that revolutionize how multi-tenancy is managed. The video is packed with valuable insights and practical demonstrations.

How to build DevOps automations with Kosli Actions

Kosli allows regulated organizations to scale their continuous delivery so that they can deploy changes to production at maximum speed without the risk of non-compliance. It does this by recording all of the data you need to get through regulatory events like audits. With Kosli you can record everything that happens in your software delivery process, from initial requirement all the way through to deployment to production, including events like builds, tests, scans, and code reviews.

Canonical's recipe for High Performance Computing

In essence, High Performance Computing (HPC) is quite simple. Speed and scale. In practice, the concept is quite complex and hard to achieve. It is not dissimilar to what happens when you go from a regular car to a supercar or a hypercar – the challenges and problems you encounter at 100 km/h are vastly different from those at 300 km/h. A whole new set of constraints emerges.

LLM hallucinations: How to detect and prevent them with CI

An LLM hallucination occurs when a large language model (LLM) generates a response that is either factually incorrect, nonsensical, or disconnected from the input prompt. Hallucinations are a byproduct of the probabilistic nature of language models, which generate responses based on patterns learned from vast datasets rather than factual understanding.

Azure Storage cost optimization to achieve maximum cost savings

Azure Storage Cost Optimization is a crucial aspect for organizations looking to harness the power of Azure storage while keeping expenses in check. This involves implementing strategies to minimize expenses, optimize resource utilization, and select appropriate storage types, as well as understanding the relevant features and applying best practices.

Azure Unit Economics for Crafting a Financially Sound Strategy

On a journey through the cloud landscape, Azure Unit Economics serves as a compass, guiding you through the intricacies of financial optimization in Microsoft Azure. This blog post aims to clarify the complexity of Azure Unit Economics, underscoring its critical role in optimizing resource allocation and ensuring cost-effectiveness in cloud computing.

NEW FEATURE: Rainbow Deployments with Zero Downtime! (Jan 24 2024)

In this video, Jake Warner (CEO / Founder) does a quick dive into one of our latest features: Rainbow Deployments. These deployments enable teams to maintain multiple independent versions of their applications within one single environment. Additionally, deployments make it easy to take new versions of your applications live with zero downtime. Cycle is a container orchestration and infrastructure management platform built to provide a true alternative to Kubernetes.

Monitor Heroku Add-Ons Using Hosted Graphite

Monitoring your Heroku stack helps you understand the performance of your application and infrastructure. You can identify bottlenecks, slow-performing queries, or resource-intensive processes and optimize them. Monitoring also allows you to detect issues or anomalies in real-time. By setting up alerts based on predefined thresholds, you can be notified as soon as something goes wrong, enabling you to address the issue before it affects users.

How Squadcast Helps With Flapping Alerts

Often we receive a series of alerts that get auto-resolved within a short period of time. Such alerts are called flapping or transient alerts. In this blog, we'll explore the Auto Pause Transient Alerts (APTA) feature, which detects flapping alerts and temporarily pauses incident notifications, reducing alert fatigue.
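The definition of a flapping alert given above can be sketched as a simple heuristic: flag a source when several of its alerts resolve themselves quickly within a short span. This is not how Squadcast's APTA feature is implemented; the function name `is_flapping` and the thresholds (`max_open`, `window`, `min_cycles`) are hypothetical values picked for the example.

```python
from datetime import datetime, timedelta


def is_flapping(events, max_open=timedelta(minutes=2),
                window=timedelta(minutes=30), min_cycles=3):
    """Return True if enough short-lived alerts cluster within one time window.

    events: list of (triggered_at, resolved_at) tuples for a single alert source.
    """
    # Keep only alerts that auto-resolved quickly, ordered by trigger time.
    short_lived = sorted(t for t, r in events if r - t <= max_open)
    for i, first in enumerate(short_lived):
        in_window = [t for t in short_lived[i:] if t - first <= window]
        if len(in_window) >= min_cycles:
            return True
    return False


# Hypothetical data: one source fires and clears every five minutes,
# another stays open for 45 minutes.
base = datetime(2024, 1, 24, 9, 0)
flappy = [(base + timedelta(minutes=5 * k), base + timedelta(minutes=5 * k + 1))
          for k in range(4)]
steady = [(base, base + timedelta(minutes=45))]

print(is_flapping(flappy), is_flapping(steady))
```

Once a source is flagged this way, notifications for it could be paused for a cooldown period, which is the general idea behind reducing alert fatigue from transient alerts.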

The 25 Most Crucial Software Engineering Tools In 2024

As a modern software engineer or CTO, you’re responsible for building, delivering, and maintaining high-quality software solutions at scale. Yet, software programs have grown increasingly complex over time, requiring meticulous work. The competition threatens to take your subscribers every billing cycle if you don’t constantly innovate, too. Customers want more, bigger, and better upgrades and updates.

Kubernetes Vs. Openshift

Kubernetes and OpenShift are two major players in the container management space, each offering unique advantages and challenges. This article will discuss both these platforms, comparing their strengths and weaknesses. We will also explore their ideal use cases, and how a platform like Qovery bridges the gap between them. Whether you're a proficient DevOps engineer or a developer, understanding these platforms can help you make the right choice for your container-based applications.

The Financial Services Automation Toolkit for Orchestrating Existing Automations with ITPA

The spike in pressure across the financial services industry is one that today’s organizations have to get in front of. When digital transformation, the demands for modernization, and fierce competition took hold in recent years, proactive approaches became a requirement. Companies that still react to change are likely to struggle in the current, trailblazing landscape, and they may even fail to reach business-critical goals.

SysAdmin's guide to migrating from CentOS

CentOS EOL - Are you affected? CentOS used to be community driven. Imagine an OS being tested by a global community of volunteers rather than by a testing team in a company: that is what gave CentOS unmatched stability. An OS that came with Security-Enhanced Linux (SELinux) by default and also included 10-year support meant it was the favorite of both individual developers and enterprises (even Facebook, now known as Meta, used CentOS for its data centers).

Cortex Notifications: Stay up to date while staying in flow

Notifications are designed to be annoying. Think of your phone buzzing in a quiet room: it demands your attention, lighting up your screen and making noise so you look at it. A notification is supposed to pull you away from whatever you’re working on. They can be useful, but they can also be a nuisance.

Transforming DevOps with IaC and GitOps - John Dietz & Jared Edwards - Navigate Europe 23

Join John Dietz and Jared Edwards as they navigate the transformative journey from Infrastructure as Code (IaC) to GitOps, emphasizing the pivotal role of Kubernetes. This talk offers an in-depth look at combining IaC practices with GitOps advantages for robust and efficient DevOps operations. Don't miss this workshop on tool integration, workflow management, and a live demo illustrating GitOps in action.

Step-by-step Guide for Monitoring Redis Using Telegraf and MetricFire

Monitoring Redis instances is essential for maintaining performance, reliability, and security. It allows you to detect issues early, optimize resources, and provide a seamless experience for both developers and end-users. Monitoring your database allows you to track key performance metrics such as memory usage, CPU usage, and query response times. By analyzing these metrics, you can identify performance bottlenecks, optimize queries, and ensure that Redis is operating efficiently.

Simplifying Service Dependency With Squadcast's Service Graph

Microservices are fantastic for agility and innovation, but the trade-off is complex service management and ownership. With hundreds of interconnected services, troubleshooting and Incident Response can become a potential blocker. The traditional siloed approach to service ownership and the increasing deployment makes service management more complex.

Cloud Scaling: Secrets to Stability + Security When Scaling Cloud Computing

If you’re doing cloud operations right, your cloud needs are going to change over time. Cloud scaling can help you add cloud resources when you need them and retire or recycle them when you don’t. Cloud scaling is great for meeting traffic demands, accommodating demanding workloads, and controlling the chaos of cloud ops. But for all the flexibility that cloud scaling offers, it can also introduce liabilities in your cloud infrastructure.

What To Do When A Customer (Or Segment) Is Costing Your SaaS Business Too Much

You’re a responsible SaaS company leader, so you understand the importance of tracking your cloud costs in detail. Perhaps you’ve even begun working with us at CloudZero, and you’re starting to see data and insights hit your dashboard. If so, you may have noticed — because this happens to all of us in the SaaS world at some point — that some customers cost your business far more than others. Suppose you’re also tracking your revenue per customer.

Streamlining Cloud Costs With Smart Management Strategies

Cost optimization within cloud services is not just about cutting services; it’s about investing resources wisely to achieve greater efficiency and growth. Amazon Web Services (AWS) continues to be a leader in providing solutions that help businesses manage and optimize their cloud spending. This guide aims to guide you through the complex world of AWS cost management, highlighting key indicators and tools essential for keeping your cloud expenses in check.

What is microservices architecture?

Microservices architecture is a method of developing software systems that structures an application as a collection of loosely coupled services, each focusing on a single function or business capability. Each service operates within a discrete, confined context, communicating with other services through well-defined interfaces — typically APIs.

What Is Kubernetes? What You Need To Know As A Developer

Building containerized applications opens doors to efficiency and scalability, especially for developers looking to streamline their workflows. Kubernetes, a game-changer in container orchestration, makes it easier for developers to manage these applications. This article will discuss in detail what Kubernetes is, why it matters, and how it simplifies container orchestration, paving the way for robust and flexible applications.

"Our job is not to write code." | #GitKraken CTO Eric Amodio at #Dockercon

How should devs feel about AI's rapid growth? What will happen to their jobs? 🤯 Well, GitKraken CTO Eric Amodio doesn't think developers should have an existential crisis. In a #Dockercon keynote with Justin Cormack, he explains that developers' jobs encompass way more than just writing code – it's about problem solving & critical thinking.

The Frugal Architect, Law I: Make Cost A Non-Functional Requirement

This is part one of seven in our Frugal Architect blog series. In case you weren’t as giddy as CloudZero was at re:Invent this year, we wanted to recount the seven laws outlined by Werner Vogels, Amazon’s CTO, which he’s bundled into a framework called “The Frugal Architect”. What is “The Frugal Architect”? A constitution of sorts for how engineers can build high-functioning, cost-efficient cloud software.

How Gremlin's dependency discovery feature works

Modern applications are rarely created entirely from scratch. Instead, they rely on a framework of pre-existing applications and services, each adding specific features and functionality. These dependencies empower teams to build and deploy applications more efficiently, but they bring their own set of challenges. Tracking, managing, and updating these dependencies is difficult, especially in large, complex applications where dependencies are likely managed by different teams.

The Debrief: Building AI-Related Incidents

Recently we went live with one of our biggest product launches to date: AI. And this product was unique in that it was broken up into four smaller projects. So naturally, most folks might be wondering: what were the biggest differences between these projects, and what went into actually building out each of these features? In this episode, you'll hear from Rob and Isaac, both Product Engineers who played a really critical role in building out related incidents, to get a peek behind the curtain.

Monitor Oracle managed databases with Datadog DBM

Datadog Database Monitoring (DBM), which provides host-level and query performance metrics and insights for PostgreSQL, MySQL, and SQL Server, is now available for Oracle. Oracle is one of the most common database types, and now teams that operate Oracle databases can use Datadog to monitor these resources alongside telemetry from across their environments.

Azure VM Autoscaling to enhance performance and cost efficiency

Azure VM Autoscaling is a feature provided by Microsoft Azure that lets you automatically adjust the number of Virtual Machines (VMs) in a specific scale set based on predefined criteria such as load, performance metrics, or a schedule. This post delves into the significance of autoscaling within Azure VMs, spotlighting its role in cost optimization, performance enhancement, and improved availability.

Understanding Cardinality with Levitate's Cardinality Explorer

Predicting the future is hard, especially with metrics-based monitoring systems, because metrics cardinality can snowball. This is important because it affects query performance adversely. Having visibility into what’s happening now and workflows to manage cardinality is crucial, because the answers depend on the quality of the questions a system allows you to ask. The questions one may have are:

Digital Realty and Console Connect Collaborate for Greater Interconnectivity

In today's hyper-connected world where data reigns supreme, businesses need agile, reliable, and cost-effective ways to seamlessly integrate their digital ecosystems. Understanding this critical need, Console Connect and Digital Realty are taking our collaboration to the next level to bring new levels of connectivity and choice to enterprises, carriers, and service providers.

What is Hyper-V? Key Features and Capabilities

Virtualization involves creating a virtual representation of resources, such as servers, storage, or networks, to efficiently utilize hardware and enhance resource management. This technology has revolutionized IT by enabling the creation of virtual machines (VMs) that operate independently within a shared physical environment.

What is Software-Defined Networking (SDN)?

Modern networking methodologies have evolved significantly, paving the way for innovations like Software-Defined Networking (SDN). Recent trends, such as increased demand for cloud services, data center virtualization, and the growing complexity of networks, have necessitated more dynamic and flexible network solutions. These advancements have set the stage for SDN by highlighting the limitations of traditional, hardware-centric networking.

8 Common Network Issues & How to Address Them

A stable network is synonymous with operational reliability in an era where digital transactions and communications form the backbone of most operations. Unstable networks can lead to many problems – from lost revenue to damaged reputations. Imagine a retail company experiencing network slowdowns or outages during a high-volume sales period. This could result in significant sales losses and customer dissatisfaction.

What Is Cloud Elasticity And How Does It Affect Cloud Spend?

Cloud computing provides significant benefits over on-premises computing, including the ability to expand operations without purchasing new hardware. But there’s more. With cloud computing, you can adjust compute resources to meet changing demands. For example, you can buy extra online storage for your chatbot system as you receive increasing customer inquiries over time. You’d pay for as much online storage as you use.

NGINX Access and Error Logs

Nginx, a widely used web server and reverse proxy, maintains two crucial logs that provide valuable insights into its performance and user interactions: the access log and the error log. These logs play a pivotal role in monitoring and troubleshooting web server activities. The access log records every request made to the server, capturing details such as the requested URL, client's IP address, response status code, and user agent.
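As a rough illustration of the access log fields mentioned above, here is a minimal Python sketch that parses one line in NGINX's default "combined" log format. The sample line, the `COMBINED` pattern name, and the `parse_access_line` helper are made up for this example; a server configured with a custom `log_format` would need a different pattern.

```python
import re

# Pattern for NGINX's default "combined" access log format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be filtered or aggregated.
    fields["status"] = int(fields["status"])
    fields["body_bytes_sent"] = int(fields["body_bytes_sent"])
    return fields


# Made-up sample line for illustration only.
sample = (
    '203.0.113.7 - - [10/Jan/2024:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0"'
)
entry = parse_access_line(sample)
print(entry["remote_addr"], entry["request"], entry["status"])
```

Returning `None` for non-matching lines, rather than raising, keeps a log-scanning loop simple when a file mixes formats.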

Finding relationships in your data with embeddings

With the world still working out the limits of LLMs and ever more powerful models being released each month, it’s a little hard to know where to begin. Whether it’s summarising and generating text, building a useful chat assistant, or comparing the relatedness of strings with embeddings, almost all of this now can be done via a few simple API calls. It has never been easier to incorporate these new technologies into your own product.
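To make "comparing the relatedness of strings with embeddings" concrete, here is a small Python sketch using cosine similarity, a common way to compare embedding vectors. It is not taken from the article: the three-dimensional vectors are hypothetical stand-ins for the much longer vectors an embeddings API would return, and the phrases are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors; real embeddings typically have hundreds of dimensions.
embeddings = {
    "database outage": [0.9, 0.1, 0.0],
    "postgres is down": [0.8, 0.2, 0.1],
    "team lunch menu": [0.0, 0.1, 0.9],
}

# Rank every phrase by its similarity to the query phrase, most related first.
query = embeddings["database outage"]
ranked = sorted(
    embeddings,
    key=lambda phrase: cosine_similarity(query, embeddings[phrase]),
    reverse=True,
)
print(ranked)
```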

Coming Soon: Cloudsmith Migration Toolkit

One of our core motivations in building Cloudsmith is to make software developers' lives easier. We want Cloudsmith to be one of those great products that feels intuitive and automates everything. As we’re picking up more and larger customers, we’re seeing an increased need for migration tools. We want to make it as easy as possible for teams who are stuck using JFrog Artifactory, or Sonatype Nexus, or other legacy tools to move over to the joy of SaaS artifact management using Cloudsmith.

Real Production Readiness with Internal Developer Portals

In cultures of continuous improvement, the criteria by which teams define a release's fitness for production is flexible by definition. Engineering organizations strive to balance risk and velocity, aiming for high quality releases on a cadence that doesn’t impede overall business throughput.

Analyze Your Mailchimp Campaigns Using Telegraf

Monitoring your email campaigns helps you track key performance indicators (KPIs) such as open rates, click-through rates, and conversion rates. This evaluation provides insights into the success of your email campaigns and allows you to identify areas for improvement. By analyzing metrics like open rates and click-through rates, you can gauge the level of engagement your emails are generating.

Conceptual Pillars Of Kubernetes

Kubernetes, often abbreviated as K8s, is a powerful container orchestration platform that has revolutionized the way modern applications are developed, deployed, and managed. At its core, Kubernetes relies on several conceptual pillars that form the foundation of its design and functionality. Let’s delve into these fundamental principles that underpin Kubernetes.

What's New in Serverless360: Databricks monitoring, Azure Cost group notifications, etc..

Serverless360's latest update brings a suite of enhancements. It includes Databricks management, resource availability monitoring, query insights for Azure SQL, and activity log monitoring for Azure services. The Cost Analyzer offers customized rightsizing, cost group notifications, and enhanced cost data at the Service/Meter level.

Building a GPT-style Assistant for historical incident analysis

Like most things, our AI Assistant started out as an idea. One of our data scientists, Ed, was working with our customers to improve our existing insights. But the most common theme that kept surfacing was the wide-range of use cases that our customers wanted to use insights for. Using this user feedback as our inspiration, we came up with the idea of a natural language assistant that you can use to explore your incident data.

The Debrief: incident.io, say hello to AI

This week was a particularly exciting one for us at incident.io. We launched not one, not two, but four AI-powered features to help folks get the most out of their incidents. In this episode of The Debrief, we sit down with Ed Dean, Product Analyst, and Charlie Revett, Product Engineer, to talk through all of these features and discuss how they're already making a measurable impact.

Terraform Time | Distribute PagerDuty config utilising Terraform Remote State

We'll explore how to distribute PagerDuty configuration between multiple repositories using the Terraform Remote State feature. You will be able to access the code written during this Terraform Time episode in the following GitHub repository.

Mattermost v9.4: IP filtering, bring your own key & cloud-native compliance exports in Mattermost Cloud Enterprise

Mattermost v9.4 includes several new features designed to significantly enhance digital security and compliance, including the introduction of IP filtering, bring your own key (BYOK) for data control, and cloud-native compliance export. IP filtering tightens access control, BYOK offers greater data protection through personalized encryption, and streamlined compliance reporting ensures adherence to regulatory standards.

The alert fatigue dilemma: A call for change in how we manage on-call

Once the unsung heroes of the digital realm, engineers are now caught in a cycle of perpetual interruptions thanks to alerting systems that haven't kept pace with evolving needs. A constant stream of notifications has turned on-call duty into a source of frustration, stress, and poor work-life balance. In 2021, 83% of software engineers surveyed reported feelings of burnout from high workloads, inefficient processes, and unclear goals and targets.

5 Reasons Why GitKraken Client is the Ultimate Tool for GitHub Users

GitHub is the industry standard for version control and collaboration. It’s reliable, it’s robust, and it gets the job done. But here’s the thing: industry standard doesn’t automatically translate to gold standard. So how do today’s top devs elevate their GitHub experience from good to great? They pair it with GitKraken Client. Below are the top five reasons why GitKraken Client is a game-changer for GitHub users.

Experience Omni Dev with CodeZero with Narayan Sainaney - Navigate Europe 23

Join Narayan Sainaney, co-founder and CTO of Codezero, as he unveils the groundbreaking concept of Omni-development on Kubernetes. Explore the challenges developers face with microservices and how Codezero's solution streamlines the development process, reducing complexity and enhancing productivity. Watch as Narayan demonstrates the practical application of this novel approach and discusses its impact on the future of software development.

The Benefits of Using DCIM Software for Data Center Cable Management

One of the primary benefits of DCIM software is the enhanced visibility it provides into the physical infrastructure of a data center. This software creates detailed, accurate, and up-to-date maps of all the cabling and connections within the facility. Such documentation is invaluable for troubleshooting, as it enables IT staff to quickly identify and resolve issues related to cable connections.

New CNCF Survey Highlights GitOps Adoption Trends - 91% of Respondents Are Already Onboard

Amid all the activity and excitement at KubeCon and ArgoCon 2023 – including some exciting news of our own – the CNCF released the results of a new micro survey assessing trends in GitOps adoption. There’s a lot of great data tucked inside, and it's worth the read to find out what peers across the industry are saying about GitOps usage in their organizations.

How to Monitor Your RabbitMQ Performance Using Telegraf

Monitoring RabbitMQ is essential for maintaining the health, performance, and reliability of your messaging infrastructure. It empowers you to take proactive measures, prevent downtime, and deliver a seamless messaging experience for your applications and users. Monitoring helps you keep an eye on the performance metrics of RabbitMQ, such as message rates, queue lengths, and resource utilization.

Lessons learned from building our first AI product

Since the advent of ChatGPT, companies have been racing to build AI features into their product. Previously, if you wanted AI features you needed to hire a team of specialists to build machine learning models in-house. But now that OpenAI’s models are an API call away, the investment required to build shiny AI has never been lower. We were one of those companies. Here’s our journey to building our first AI feature, and some practical advice if you’ll be doing the same.

Does Every Incident Need a Retrospective? Here's What the Experts Have to Say

Every quarter, we host a roundtable discussion centered around the challenges encountered by incident responders at the world’s leading organizations. These discussions are lightly facilitated and vendor-agnostic, with a carefully curated group of experts. Everyone brings their own unique perspective and experience to the group as we dive deep into the real-world challenges incident responders are facing today.

From Git to Deployed - Fast Platform Delivery - Colin Griffin & Dinesh Majrekar - Navigate Europe 23

Join Colin Griffin and Dinesh Majrekar in this insightful workshop. Dive into the world of application and platform engineering with Civo and Krumware's experts as they unravel the intricacies of IT Ops, developers' tools, and infrastructure management. This workshop blends theory with hands-on exercises, equipping participants with practical skills and an understanding of modern platform engineering concepts.

Continuous Compliance Content Hub

The Continuous Compliance content hub is a set of guides for DevOps teams who need to move fast while remaining in compliance for audit and security purposes. We know that the old change management processes for software releases that happened once every 6 months don’t scale for DevOps teams who want to deploy every day. This is where Continuous Compliance comes in.

Effortlessly keep EKS AMI up to date with Spot Ocean

New Kubernetes versions can introduce significant changes and security updates. When Amazon Elastic Kubernetes Service (Amazon EKS) releases a new version or a security patch, it is the user’s responsibility to be aware of updates and to perform the full update on their side. This can be a tedious and time-consuming task. Until today, Spot Ocean users needed to manually update their AMIs after updating the EKS version.

Test-driven development (TDD) explained

Test-driven development (TDD) is a software development process that involves writing tests for your code before you write the code. This approach has transformed the development methodology around testing. While the traditional waterfall model of software development was linear, with testing occurring near the end of one long timeline, TDD makes testing an ongoing, iterative process.
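The test-first order described above can be sketched in Python with the standard `unittest` module. This is only an illustrative example, not code from the article: the `slugify` function and its expected behavior are hypothetical. The tests are written first, and the implementation below them is the simplest code that makes them pass.

```python
import unittest


# Step 1 (red): the tests come first and describe the behavior we want.
class SlugifyTests(unittest.TestCase):
    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Test   Driven  "), "test-driven")


# Step 2 (green): the simplest implementation that satisfies the tests.
def slugify(title):
    # split() with no arguments drops leading, trailing, and repeated whitespace.
    return "-".join(title.lower().split())


# Run the suite programmatically and report whether every test passed.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all tests passed:", result.wasSuccessful())
```

In a real TDD loop the tests would be run once before `slugify` exists, to confirm they fail for the right reason, and a refactoring step would follow once they pass.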

Ep. 12: Let's talk serverless: A discussion with AWS hero Yan Cui

In this episode of Cloud Control, host Shon Harris engages with Yan Cui, an AWS Serverless hero and consultant, to explore the transformative world of serverless computing. Yan shares his journey from traditional infrastructure to serverless technologies, discussing the evolution of AWS and the pivotal role serverless plays in modern computing. They delve into practical applications, the impact of AI and ML on serverless environments, and the significance of community and knowledge sharing in tech.

How to Monitor PostgreSQL With Telegraf and MetricFire

Monitoring your PostgreSQL instance is essential for maintaining performance, reliability, security, and compliance. It allows you to stay ahead of potential issues, optimize resource utilization, and ensure a smooth and efficient operation of your database system. Database monitoring helps you pinpoint problematic queries, analyze execution plans, and make necessary adjustments to improve overall application responsiveness.

Monitoring-as-Code for Scaling Observability

As data volumes continue to grow and observability plays an ever-greater role in ensuring optimal website and application performance, responsibility for end-user experience is shifting left. This can create a messy situation with hundreds of R&D members, from back-end engineers and front-end teams to DevOps and SREs, all shipping data and creating their own dashboards and alerts.

How To Troubleshoot False Alerts in Netreo

Regardless of the attention given to configuring monitoring solutions, the dynamic nature of today’s modern infrastructures can impact alert functionality. Optimizing network performance in complex, hybrid infrastructures leveraging SD-WANs, real-time provisioning and other advanced features is really tough. So what should IT teams do when receiving false alerts or notifications that appear inaccurate?

DevOps Change Management Resources

The DevOps Change Management Content Hub is a set of resources for modern software teams who struggle to align their DevOps automation with their change management requirements. In our experience, cloud native teams with lots of automation struggle when they run into a compliance event like an audit, or need to achieve a security standard like SOC2 or ISO27001. How do you comply without adopting old fashioned change management practices and screwing up your DevOps?

Software Selection: Your Company's Future Depends on Getting it Right

The world of enterprise software, SaaS, and cloud platforms has gone through a Cambrian explosion in the last half-decade, as the number of solutions available nearly doubles every two to three years. Meanwhile, the locus for decision-making has been pushed deeper into the organization, and the number of constituents impacted by each decision has also increased when choosing the right software vendor for the use case.

Unleashing the power of AI and automation for effective Cloud Cost Optimization in 2024

In the current dynamic business environment, cloud computing has emerged as the fundamental driver of innovation and scalability. As companies increasingly rely on the cloud for their business initiatives, achieving cloud cost optimization remains a significant hurdle.

Collecting OpenShift container logs using Red Hat's OpenShift Logging Operator

This blog explores a possible approach to collecting and formatting OpenShift Container Platform logs and audit logs with Red Hat OpenShift Logging Operator. We recommend using Elastic® Agent for the best possible experience! We will also show how to format the logs to Elastic Common Schema (ECS) for the best experience viewing, searching, and visualizing your logs. All examples in this blog are based on OpenShift 4.14.

Q&A: What IT Automation Best Practices Should You Know Right Now? - Part 2

With a limitless load of questions on IT automation and the industry’s biggest trends, Resolve’s “Ask Me Anything (AMA)” session went about tackling them in an all-new way. We threw out the preparation, we threw out the scripts, and we asked our community to submit the questions that matter most to them and their organizations. Part of our leadership team took the hot seat and provided answers in real time, sans dress rehearsal.

8 Strategies for Reducing Alert Fatigue

Site Reliability Engineers (SREs) and DevOps teams often deal with alert fatigue. It's what happens when you get so many alerts that it's hard to keep up, making it tougher to respond quickly and adding extra stress to existing responsibilities. According to a study, 62% of participants noted that alert fatigue played a role in employee turnover, while 60% reported that it resulted in internal conflicts within their organization.

Supercharged with AI

One of the most painful parts of incident management is keeping on top of the many things that happen when you’re right in the middle of an incident. From figuring out and communicating what’s happening, to ensuring you learn from previous incidents, and even capturing the right actions – incidents are hard, but they don’t need to be this hard.

GitHub Pull Request Management with GitKraken Client

Let’s dive into the world of pull requests (PRs). They’re the bridges connecting your hard work to the bigger project, facilitating code review, collaboration, and more. But why are they so crucial, and how can tools like GitKraken Client and GitHub take their management to the next level? Keep reading to explore the unique features of both platforms, plus time-saving tips for efficient PR management.

EKS Add-ons And Integrations: Evaluating Cost Impacts

Amazon Elastic Kubernetes Service (EKS) has rapidly become the de facto solution for organizations seeking to deploy, manage, and scale containerized applications using Kubernetes. EKS simplifies the complexities associated with Kubernetes, allowing teams to focus on developing and deploying applications more efficiently. However, as organizations scale their Kubernetes environments, managing and optimizing costs can quickly become a significant concern.

The Catchpoint 2024 SRE Report - Five Key Takeaways

Only emerging into the mainstream in the 2010s, SRE is a relatively new discipline in tech. It’s been rapidly adopted by a widening variety of organizations, implementing constantly evolving practices. For the last six years, Catchpoint has been running a survey to take the temperature of the latest developments and trends. Check out the full report here, and read on to see our analysis on five key takeaways.

Easily Monitor URL and IP Availability Using Telegraf with Ping

Monitoring your domain URLs and server IPs is important for many reasons and plays a crucial role in ensuring the health, performance, and security of a network or web application. Monitoring hosted IPs within your infrastructure helps track the availability and uptime of websites and services. It also allows organizations to identify and respond quickly to downtime or outages, minimizing the impact on users.

Unlock the Secrets of Machine Learning: A Beginner's Guide with Josh Mesout - Navigate Europe 23

Dive into the world of machine learning with Josh Mesout. This video is a great starting point for beginners, offering a practical approach to understanding and applying machine learning concepts. Follow along as Josh demonstrates setting up a machine learning environment on Civo and explores a PyTorch notebook for handwriting recognition. Whether you're coding along or just watching, this session is packed with useful tips and resources for your machine learning journey. Don't forget to check out our GitHub repository for additional materials and join the conversation in the comments!

Azure Resource Monitoring: Setting Up Key Metrics Made Simple!

Setting up Azure Monitor to oversee all essential metrics and points of interest across every single Azure resource in a solution can be a challenging task. It consumes considerable time, especially when dealing with individual Azure resources, and the effort multiplies when managing numerous Azure resources. This video demonstrates how Serverless360 alleviates some of these challenges, simplifying the process of setting up effective Azure monitoring.

How to monitor Azure Automation Runbooks?

This video guides you through developing an Automation solution with Azure Automation accounts. The primary objective is to streamline the management and monitoring of these services for IT support operators, irrespective of their expertise in Azure. Mike demonstrates how you can add an automation runbook to a Business Application in Serverless360 to manage it alongside the other resources that make up your solution and how we can democratize some of the support to the IT support operators.

Identify and rectify network issues proactively with the OpManager-Jira integration

As technology evolves, it has become incredibly difficult for IT teams to work in the conventional siloed environment. Technologies such as NetDevOps and site reliability engineering (SRE) call for collaborative efforts between various IT teams. This allows them to develop and deploy products much faster, streamline and automate their operations, and proactively detect and rectify issues as they crop up.

Does Anonymous Web Hosting Really Make You Anonymous?

If you want to stay anonymous on the internet, there are many ways to do it. You can use a VPN with your server, pay with cryptocurrency, or purchase anonymous VPS hosting. Anonymous web hosting raises many questions, such as what exactly it means, how it differs from regular web hosting, and whether it really helps you stay anonymous. We answer all of these questions in this article.

Marking deployments and more in Redgate Monitor

SQL Monitor is an essential tool for DBA teams worldwide, providing real-time monitoring of SQL Server and PostgreSQL performance. With SQL Monitor, you can easily track deployments, errors, and other events on the timeline. This feature, called annotations, allows you to quickly identify the root cause of performance issues and take corrective action. SQL Monitor’s timeline is a powerful tool that helps you stay on top of your database performance and keep your systems running smoothly.

Mastering IT Alerting: A Short Guide for DevOps Engineers

$575 million was the cost of a huge IT incident that hit Equifax, one of the largest credit reporting agencies in the U.S. In September 2017, Equifax announced a data breach that impacted approximately 147 million consumers. The breach occurred due to a vulnerability in the Apache Struts web application framework, which Equifax failed to patch in time. This vulnerability allowed hackers to access the company's systems and exfiltrate sensitive data.

Debugging Go compiler performance in a large codebase

As we’ve talked about before, our app is a monolith: all our backend code lives together and gets compiled into a single binary. One of the reasons I prefer monolithic architectures is that they make it much easier to focus on shipping features without having to spend much time thinking about where code should live and how to get all the data you need together quickly. However, I’m not going to claim there aren’t disadvantages too. One of those is compile times.

Managing software in complex network environments: the Snap Store Proxy

As enterprises grapple with the evolving landscape of security threats, the need to safeguard internal networks from the broader internet is increasingly important. In environments with restricted internet access, it can be difficult to manage software updates in an easy, reliable way. When managing devices in the field, change management and compliance policies can introduce even more complexity to the update process. You can solve these challenges using snaps and the Snap Store Proxy.

What is Infrastructure as Code (IaC)?

Infrastructure as code (IaC) is the act of writing infrastructure configurations as code so they can be understood, repeated, and enforced with less manual effort. IaC is also a powerful way to convert tribal knowledge into technical knowledge. It’s a far-reaching and essential part of managing infrastructure at scale, with benefits that have expanded to platform engineering, security and compliance, network administration, and so much more.

5 Ways CloudZero Found Savings Using Its Own Platform

Naturally, our own platform forms the backbone of our cloud cost savings strategy. And before releasing any new feature, we try it out ourselves to see how well it helps us manage our own SaaS costs. If it works well for us, that means there’s already one satisfied user in the world — and there will likely be more!

"The first versions of Copilot were just...comically bad." | GitKraken's Eric Amodio at #Dockercon

At #Dockercon 2023, #GitKraken CTO Eric Amodio reflected on the early days of AI in the software development landscape, particularly his experience watching #GitHub #Copilot evolve from a clunky language model to one of the most popular #AI tools for devs.

Enhancing Data Center Efficiency and Sustainability with Power Capacity Effectiveness

PCE is a performance metric that evaluates the effective utilization of power in data centers. It measures the ratio of IT equipment power to the total power consumed, encompassing all aspects of the facility’s operations, including cooling and lighting. Unlike traditional metrics, PCE provides a comprehensive assessment of how power is used within data centers, aiming to optimize the actual power capacity available.

Supercharge FinOps Programs with Resource Guardrails

Time and time again we hear the same statements from FinOps teams with respect to what is holding back optimization of wasteful cloud resource consumption. Engineers and App Owners are interested in helping but stop short at actually taking actions to reduce that waste. There are many reasons for this main sticking point when it comes to application owners and developers taking action.

Building Controllers with Python Made Easy with Steve Giguere - Navigate Europe 23

Join Steve Giguere and Matt Johnson in this comprehensive workshop from Navigate Europe '23, focusing on Kubernetes Admission Controllers. Learn to build these controllers from scratch using Python, understand their role in Kubernetes security, and get hands-on with real-world applications.

Why Kubernetes For Developers is the Next Big Thing

Let's face it! Kubernetes is the standard for container orchestration, and it has been reshaping how we deploy and manage applications for the last 8 years. Do you even remember when you ran "docker run" on an application server? 😁 However, it's crucial to recognize that Kubernetes was never initially designed with the everyday developer in mind.

What is RMM software?

In this article, we will thoroughly address RMM Software (Remote Monitoring and Management Software) and its essential role for Managed Service Providers (MSPs). We will explain the core functions of RMM, from remote monitoring to efficient management of client devices, highlighting its key advantages such as reducing labor costs and improving productivity.

Azure App Service Pricing (2024)

Azure App Services, part of Microsoft Azure’s platform-as-a-service (PaaS) offerings, simplifies web application and API development, deployment, and scalability without managing underlying infrastructure complexities. Supporting various programming languages and frameworks, its versatility suits diverse applications. Understanding Azure App Services pricing is crucial for effective cost management and resource optimization.

Supercharge FinOps Programs with Cloud Resource Optimization

Time and time again we hear the same statements from FinOps teams with respect to what is holding back optimization of wasteful resource consumption. Engineers and App Owners are interested in helping but stop short at actually taking actions to reduce that waste. There are many reasons for this main sticking point when it comes to application owners and developers taking action.

Understanding ISO27001 Security - and why DevOps teams choose Kosli

Modern software delivery teams find themselves under constant pressure to maintain security and compliance without slowing down the speed of development. This usually means that they have to find a way of using automation to ensure robust governance processes that can adapt to evolving cyber threats and new regulatory requirements.

A Guide to Continuous Security Monitoring Tools for DevOps

DevOps has accelerated the delivery of software, but it has also made it more difficult to stay on top of compliance issues and security threats. When applications, environments and infrastructure are constantly changing it becomes increasingly difficult to maintain a handle on compliance and security. For fast-moving teams, real time security monitoring has become essential for quickly identifying risky changes so they can be remediated before they result in security failure.

Non-Abstract Large System Design (NALSD): The Ultimate Guide

Non-Abstract Large System Design (NALSD) is an approach where intricate systems are crafted with precision and purpose. It holds particular importance for Site Reliability Engineers (SREs) due to its inherent alignment with the core principles and goals of SRE practices. It improves the reliability of systems, allows for scalable architectures, optimizes performance, encourages fault tolerance, streamlines the processes of monitoring and debugging, and enables efficient incident response.

Set Resource Requests and Limits Correctly: A Kubernetes Guide

Kubernetes has revolutionized the world of container orchestration, enabling organizations to deploy and manage applications at scale with unprecedented ease and flexibility. Yet, with great power comes great responsibility, and one of the key responsibilities in the Kubernetes ecosystem is resource management. Ensuring that your applications receive the right amount of CPU and memory resources is a fundamental task that impacts the stability and performance of your entire cluster.
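As a rough illustration of the resource management task described above, the sketch below checks that a container's CPU and memory requests do not exceed its limits. Only the `requests`/`limits` layout mirrors the Kubernetes container spec; the helper names and the sample values are hypothetical, and the quantity parsing is deliberately simplified (millicore and binary memory suffixes only).

```python
# Hypothetical helpers for illustration; not part of any Kubernetes client library.

MEMORY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def cpu_to_millicores(value: str) -> int:
    """Convert a CPU quantity such as "250m" or "1.5" to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return round(float(value) * 1000)


def memory_to_bytes(value: str) -> int:
    """Convert a memory quantity such as "128Mi" to bytes (binary suffixes only)."""
    for suffix, factor in MEMORY_UNITS.items():
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * factor
    return int(value)  # no suffix: treat as a plain byte count


def requests_within_limits(resources: dict) -> bool:
    """Return True when both CPU and memory requests are <= their limits."""
    req, lim = resources["requests"], resources["limits"]
    cpu_ok = cpu_to_millicores(req["cpu"]) <= cpu_to_millicores(lim["cpu"])
    mem_ok = memory_to_bytes(req["memory"]) <= memory_to_bytes(lim["memory"])
    return cpu_ok and mem_ok


# Sample values are assumptions chosen for the example, not recommendations.
container_resources = {
    "requests": {"cpu": "250m", "memory": "128Mi"},
    "limits": {"cpu": "500m", "memory": "256Mi"},
}

if __name__ == "__main__":
    print(requests_within_limits(container_resources))
```

A check like this could run in a pre-deployment script; the right request and limit values still depend on measuring each workload.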

Data Center Liquid Cooling 101

As rack densities in data centers increase to support power-hungry applications like Artificial Intelligence and high-performance compute (HPC), data center professionals struggle with the limited cooling capacity and energy efficiency of traditional air cooling systems. In response, a potential solution has emerged in liquid cooling, a paradigm shift from traditional air-based methods that offers a more efficient and targeted approach to thermal management.

Protect Against Netscaler Vulnerability CitrixBleed

CitrixBleed, or CVE-2023-4966, is now an infamous security vulnerability affecting Citrix NetScaler that allows attackers to hijack user sessions by stealing session authentication tokens. Unfortunately, it has affected many NetScaler customers including Xfinity, which lost data for 36 million customers as a result of CitrixBleed. There is no way to protect against CitrixBleed by configuring the NetScaler WAF to detect and block it.

Cloud-native infrastructure - When the future meets the present

We’ve all heard about cloud-native applications in recent years, but what about cloud-native infrastructure? Is there any reason why the infrastructure couldn’t be cloud-native, too? Or maybe it’s already cloud-native, but you’ve never had a chance to dive deep into the stack to check it out? What does the term “cloud-native infrastructure” actually even mean? The more you think about it, the more confusing it gets.

What's in store for AI in 2024 with Patrick Debois

In this episode, Rob is joined by Patrick Debois, a seasoned industry expert and DevOps pioneer. Patrick shares his personal odyssey within the realm of DevOps, reflecting on the current state of the industry compared to his initial expectations. The conversation delves into the convergence of business analytics and technical analytics, exploring innovative approaches developers are adopting to integrate generative AI into their products.

Transform Your Customer Experience with DevOps Collaboration

Learn how end-to-end monitoring and observability enable enterprises to break down team silos, deliver industry-leading experiences for their customers, and achieve business benefits such as improved business resilience, by identifying and resolving IT risks faster before they result in customer service outages, and increased competitive standing, with DevOps and shift-left best practices to accelerate software releases.

Prompt engineering: A guide to improving LLM performance

Prompt engineering is the practice of crafting input queries or instructions to elicit more accurate and desirable outputs from large language models (LLMs). It is a crucial skill for working with artificial intelligence (AI) applications, helping developers achieve better results from language models. Prompt engineering involves strategically shaping input prompts, exploring the nuances of language, and experimenting with diverse prompts to fine-tune model output and address potential biases.

How to Build an App with Spin and Wasm with Matt Butcher & Saiyam Pathak - Navigate Europe 23

Join Saiyam Pathak and Matt Butcher for an engaging workshop. This session dives into the exciting world of WebAssembly, offering insights into its evolution, key features, and practical applications. Discover how WebAssembly stands out from previous technologies and learn about its role in modern web development through hands-on examples and discussions.

Scaling Down Kubernetes Clusters

Datadog, the observability platform used by thousands of companies, runs on dozens of self-managed Kubernetes clusters in a multi-cloud environment, adding up to tens of thousands of nodes, or hundreds of thousands of pods. This infrastructure is used by a wide variety of engineering teams at Datadog, with different feature and capacity needs.

IoT Monitoring Challenges

With the increasing prevalence of IoT devices, which are being used in a wide range of applications, from smart homes and cities to industrial and agricultural systems, monitoring their performance and health is extremely important. However, it’s essential to remember that monitoring IoT devices involves more than just tracking device-level data. In addition, monitoring data from the IoT platform or application layer is equally important.

Rancher Vs. OpenShift

In the world of Kubernetes and container management, Rancher and OpenShift have established themselves as prominent players. Both offer unique features and capabilities to streamline cloud application deployment and management. This article will explore a detailed comparison between these two powerhouses. From deployment flexibility and cloud integration to user interface and support structures, we will analyze each platform to provide a comprehensive view.

The Last Mile of Observability - Fine-Tuning Notifications for More Timely Alerts

No one wants to get an alert in the middle of the night. No one wants their Slack flooded to the point of opting out from channels. And indeed, no one wants an urgent alert to be ignored, spiraling into an outage. Getting the right alert to the right person through the right channel — with the goal of initiating immediate action — is the last mile of observability.

IoT Management with JFrog Connect (5-Minute Demo)

JFrog Connect is a modern Linux-first IoT platform designed to efficiently monitor, manage and update edge and IoT devices at scale. You can quickly register thousands of devices, organize them into logical groups, automate software updates for entire device fleets, and leverage secure tools to remotely troubleshoot devices from the comfort of your laptop. Start free at jfrog.com/connect.

How Can You Navigate the CNCF Ecosystem? Insights from Kunal Kushwaha - Navigate Europe 23

Dive into the world of cloud-native technologies with Kunal Kushwaha at Navigate Europe 2023. Kunal explores the CNCF landscape, offering practical advice on selecting tools, understanding project maturity, and staying updated with evolving technologies.

The 25+ Best Cloud Cost Management Tools In 2024

Without the right tools, managing cloud costs and knowing where your cloud spend goes is nearly impossible. Cloud-native, distributed technologies like microservices, containers, and Kubernetes can make it even more difficult to have full visibility into resource usage — and the associated costs. This cost information is also often buried in rows and columns of text on cloud providers’ bills. In addition, a lot of cloud cost management tools are clunky and inexact.

The SaaS Revenue Model: 5 Types To Consider

Most people still remember the days when software applications were distributed mostly through CD-ROMs and floppy disks. While some companies still maintain CD distribution methods, it’s safe to say that the SaaS distribution or “SaaS revenue model” has taken over. In this guide, we’ll explore what this SaaS Revenue Model is, why it’s so popular now, and some tips to help you make the most of it.

What is a sovereign cloud?

In the ever-evolving landscape of cloud computing, the concept of a sovereign cloud has recently emerged in response to data management challenges. As governments increasingly recognise the importance of safeguarding their data, ensuring compliance with local regulations, and asserting digital autonomy, sovereign cloud solutions have gained prominence. This blog will explore this concept in detail.

High Performance Computing - It's all about the bottleneck

The term High Performance Computing, HPC, evokes a lot of powerful emotions whenever mentioned. Even people who do not necessarily have vocational knowledge of hardware and software will have an inkling of understanding of what it’s all about; HPC solves problems at speed and scale that cannot be achieved with standard, traditional compute resources. But the speed and the scale introduce a range of problems of their own.

Q&A: IT Automation Best Practices for 2024, Part One

Editor’s Note: This blog is the first of a two-part series that recaps our first-ever “Ask Me Anything (AMA)” session. Part 2, to include questions 5-9, is set to publish next Tuesday. Seems like there’s an overload of burning, tough questions surrounding IT automation and orchestration, doesn’t it?

Is YAML Essential for Kubernetes? Engin Diri Explores Alternatives - Navigate Europe 2023

Join Engin Diri, an expert in Kubernetes and cloud transformation from Pulumi, as he explores the challenges of using YAML in Kubernetes and presents more efficient alternatives. In this talk, Engin discusses the limitations of YAML and showcases how generic programming languages can significantly improve the deployment process in Kubernetes. Watch as he demonstrates the practical application of tools like Pulumi and CDK8s.

What are networks?

Networks are present in numerous aspects of our daily lives. It's essential for organizations to keep track of their networks to prevent unexpected outages that may result in a drop in productivity. In this segment, we will delve into the subject of networks and their various types. If you already have a basic grasp of networks, this video will act as a refresher. However, if you're unfamiliar with networks, our objective is to provide you with a clear understanding of the concepts.

Receive zipped messages (or files) in BizTalk Server Solutions

Welcome again to another BizTalk Server to Azure Integration Services blog post. In my previous blog post, I discussed how to send zipped messages or files. Today, we will discuss the same topic but in the opposite direction, which is also a classic requirement in legacy BizTalk Server solutions: How do you receive zipped messages (files)?

SLOs with Prometheus done wrong, wrong, wrong, wrong, then right

We have Carson Anderson, Sr. DevOps Engineer at Weave HQ, talking about how they implemented SLOs using Prometheus, what went wrong, and how they fixed it. This talk was given at the "Last9 of Reliability" Discord community on 13th December. Talk Description: First things first: Yes, it really did take us 5 tries to implement our SLOs with Prometheus. While that may seem embarrassing, we are very happy to be able to share our SLO journey so that we can hopefully help you avoid the same mistakes.
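For readers new to SLOs, the arithmetic behind an error budget can be sketched in a few lines. This is a generic illustration and not the approach presented in the talk; the function name and the 30-day default window are assumptions.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines".
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1 - slo_target) * total_minutes


if __name__ == "__main__":
    # A 99.9% target over 30 days leaves roughly 43.2 minutes of budget.
    print(round(error_budget_minutes(0.999), 1))
```

In a Prometheus setup, the observed error ratio would typically come from recorded metrics and be compared against this budget.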

Easy Ways to Fix Unrecognized Database Format Error in MS Access

You may experience the "unrecognized database format" error in MS Access when trying to open the ACCDB/MDB files. It indicates the application fails to read the database file format. Due to this error, you may fail to open the database. There could be several reasons behind this error. This guide will discuss some easy and effective methods to resolve this error. It will also mention an access database recovery software that can help fix the error if corruption is the cause behind this error.

What Is A Cloud Engineer? Here's a Quick Breakdown

As you’ve probably realized by now, cloud computing isn’t the future. It is here and now. According to Gartner, global spending on public cloud services alone will surpass $725 billion by 2024. This rise in cloud computing is a great career opportunity for you. And what better way to immensely benefit from this shift than becoming a cloud engineer?

What is an Internal Developer Portal?

Imagine a central hub where your tech team finds all the tools and resources they need, organized and ready for action. This is exactly what the Internal Developer Portal offers, becoming a key factor in revolutionizing the way we build software today. In this article, we will explore how Internal Developer Portals revolutionize the development process by streamlining tasks, boosting collaboration, and customizing workflows.

Managed GCP GKE Autopilot Released in Public Beta

I am thrilled to announce a significant milestone for Qovery: the public beta release of our Google Cloud Platform (GCP) GKE Autopilot support. This marks a new chapter in our journey, following the successful integration with AWS EKS, Scaleway Kapsule, and Kubernetes. In this announcement article I'll explain what you can get out of GCP GKE Autopilot and what is coming next 👇

5 open source projects to contribute to in 2024

Contributing to open source software helps you develop new skills, gain real-world coding experience, interact with new technologies, and meet new people. But with so many open source projects to choose from — developers started some 52 million new projects on GitHub in 2022 alone — it can be difficult to figure out which repositories to contribute to. If you’re thinking about joining a new open source project in 2024, you’ve come to the right place.

How Can Kubernetes Thrive in Regulated Environments? Insights from Ryan Gutwein - Navigate Europe 23

Join Ryan Gutwein, co-founder of Ignyte Assurance Platform, as he explores the challenges and solutions for implementing Kubernetes in regulated industries. This talk provides a comprehensive look at the relationship between software development and regulatory compliance, with a focus on the Department of Defense's modern approach to data security and compliance. Discover insights into Kubernetes configurations, the DOD Iron Bank, and the future of secure cloud-native environments.

How to optimize your cloud infrastructure management

As on-premises infrastructure and workloads increasingly migrate to the cloud, you’ve undoubtedly encountered many challenges in managing complex cloud architectures. These hurdles include juggling cost-efficiency and security to maintain a seamless, high-performance infrastructure. Navigating your cloud infrastructure landscape requires thoroughly understanding its virtualized elements—servers, software, network devices, and storage.

Understanding roles in software operators

As we’ve seen throughout this series, a design pattern is a general solution that has been proven to solve a repeatedly occurring problem when designing software. In my previous blog posts, we examined the basics of the software operator design pattern, the forces impacting it, and the advantages and disadvantages of using it. But how does the software operator pattern actually work?

Succeeding with Teams Phone in 2024

Moving to Teams Phone as your primary voice system can save money and provide a great user experience, or it can “crash and burn”. In a two-part workshop, I had the opportunity to explore insights to help migrate successfully to Teams Phone with Greg Zweig of Ribbon. (Ribbon was kind enough to sponsor both workshop sessions.) This article summarizes the information we covered in the workshop.

Video analytics at the edge: How video processing benefits from edge computing

Computer vision: digital understanding of the physical world. From face recognition to fire prevention, autonomous cars to medical diagnosis, the promise of video analytics has enticed technology innovators for years. Video analytics, the processing and analysing of visual data through machine learning and artificial intelligence, is perceived as a significant opportunity for edge computing.

EP2: The Unsustainable Truth: Data Centers' Dirty Little Secret w/ Dean Nelson

In this episode of Hyper Views, we're delighted to have Dean Nelson, the founder and chairman of the Infrastructure Masons, joining us. We dive into the world of data centers, specifically focusing on power capacity effectiveness (PCE) as a key metric. We highlight the significance of measuring PCE and its impact on sustainability reporting, while aligning sustainability goals with economic performance objectives. And of course, we provide some valuable insights and advice for executives and operators on optimizing PCE and power usage effectiveness (PUE) in data centers.

Monitor Amazon EC2: key metrics for instances, regions, and more in one view

Amazon EC2 was one of the first services available on AWS, helping propel the cloud platform into the mainstream of IT. And while EC2 instances come in a wide range of sizes and flavors to address all sorts of use cases, keeping tabs on those instances isn’t always easy. That’s why we’re excited to introduce our new EC2 monitoring solution in Grafana Cloud.

Effective strategies for managing cron jobs: Best practices and tools

Cron jobs are essential for automating repetitive tasks and streamlining website and application management. Properly managing cron jobs is crucial for maintaining system efficiency and minimizing risks. In this article, we will explore the significance of cron jobs in tech environments, delve into common challenges in their management, and introduce advanced monitoring solutions like WebGazer. We will also provide best practices to ensure efficient and secure cron job management.

Building a Custom Read-only Global Role with the Rancher Kubernetes API

In 2.8, Rancher added a new field to the GlobalRoles resource (inheritedClusterRoles), which allows users to grant permissions on all downstream clusters. With the addition of this field, it is now possible to create a custom global role that grants user-configurable permissions on all current and future downstream clusters. This post will outline how to create this role using the new Rancher Kubernetes API, which is currently the best-supported method to use this new feature.

Avoiding vendor lock-in with your IDP

Commercial Internal Developer Portals (IDPs) are a valuable investment for teams that want to move quickly toward addressing initiatives surrounding software ownership, production readiness, and improving developer experience. But there's a common misconception that all commercial internal developer portals (IDPs) carry an inherent risk of “vendor lock-in” vs open-source alternatives like Backstage.

Introducing Squadcast's Intelligent Alert Grouping and Snooze Notifications

Maintaining system reliability amidst a deluge of alerts remains a formidable challenge for complex infrastructure environments. To address this critical need, Squadcast is happy to introduce Intelligent Alert Grouping - designed and developed based on in-depth discussions and feedback from our enterprise customers. This innovative solution is designed to streamline Incident Management, ensuring that Incident Response teams can focus on what truly matters.

Mastering NGINX Monitoring: Comprehensive Guide to Essential Tools

NGINX, a versatile open-source web server, reverse proxy, and load balancer, stands out for its exceptional performance and scalability. Monitoring NGINX is pivotal for maintaining its optimal functionality. By tracking and analysing performance, including real-time insights into server health, resource utilization, and user requests, administrators can proactively identify issues.

Tech trends to watch in 2024

The past year has seen a transformational change in the cloud computing landscape, spearheaded by the advent of AI technology, the ongoing economic challenges for businesses to achieve successful results, and the evolving data regulations surrounding security and privacy. To navigate these changes, and create future opportunities, here are the major trends for 2024 and beyond.

Observability trends and predictions for 2024: CI/CD observability is in. Spiking costs are out.

From AI to OTel, 2023 was a transformative year for open source observability. While the advancements we made in open source observability will be a catalyst for our continued work in 2024, there is even more innovation on the horizon. We asked seven Grafanistas to share their predictions for which observability trends are on their “In” list for 2024. Here’s what they had to say.

The role of the CI/CD pipeline in cloud computing

The Continuous Integration/Continuous Deployment (CI/CD) pipeline has evolved as a cornerstone in the fast-evolving world of software development, particularly in the field of cloud computing. This blog aims to demystify how CI/CD, a set of practices that streamline software development, enhances the agility and efficiency of cloud computing.

An Introduction to Civo Cloud - A Complete Guide with Field CTO Saiyam Pathak - Civo.com

Join Saiyam Pathak, Field CTO at Civo, in this comprehensive crash course on Civo Cloud. Dive into the world of cloud-native services as we explore Civo's offerings, including Kubernetes, managed databases, and advanced machine learning capabilities. Whether you're a beginner or an experienced developer, this guide provides valuable insights into creating and managing Kubernetes clusters, utilizing Civo's CLI and Terraform integration, and leveraging the power of GPU for machine learning projects. Get a glimpse of the future with previews of upcoming features and learn how to maximize the potential of your cloud resources with Civo.

Open Source Automation Tools: The Most Popular Options + How to Choose

When looking at the broad landscape of IT automation tools, you’ll find dozens of tools that seem like viable solutions to your automation needs. Almost all of those tools can be broken down into two categories: Open source automation tools and commercial automation tools. (Open source automation tools with a commercial offering are still considered open source, even if the commercial version has a price tag.)

What is Kustomize?

In the dynamic realm of container orchestration, Kubernetes stands tall as the go-to platform for managing and deploying containerized applications. However, as the complexity of applications and infrastructure grows, so does the challenge of efficiently managing configuration files. Enter Kustomize, a powerful tool designed to simplify and streamline Kubernetes configuration management.

BizTalk Server to Azure Integration Services: Send zipped messages (or files)

Welcome again to another BizTalk Server to Azure Integration Services blog post. In my previous blog post, I discussed how you can migrate one-way BizTalk Server routing solutions. Today, we will address another classic requirement in legacy BizTalk Server solutions.

Team Komodor Does Klustered with David Flannagan (AKA Rawkode)

An elite DevOps team from Komodor takes on the Klustered challenge; can they fix a maliciously broken Kubernetes cluster using only the Komodor platform? Let’s find out! Watch Komodor’s Co-Founding CTO, Itiel Shwartz, and two engineers – Guy Menahem and Nir Shtein leverage the Continuous Kubernetes Reliability Platform that they’ve built to showcase how fast, effortless, and even fun, troubleshooting can be!

Kubernetes Networking: Understanding Services and Ingress

Within the dynamic landscape of container orchestration, Kubernetes stands as a transformative force, reshaping the landscape of deploying and managing containerized applications. At the core of Kubernetes' capabilities lies its sophisticated networking model, a resilient framework that facilitates seamless communication between microservices and orchestrates external access to applications. Among the foundational elements shaping this networking landscape are Kubernetes Services and Ingress.

How Squadcast's Workflows Enhance Incident Management Automation?

One of the daily challenges for Incident Response teams is the pressure to resolve incidents swiftly and effectively. However, manual processes often hinder this objective, leading to delays, oversight, and potential miscommunication. In this blog, we’ll learn the practical aspects of workflow automation in Incident Management using Squadcast, exploring how it streamlines processes, eliminates manual tasks, and enhances overall efficiency.

How to Calculate and Minimize Downtime Costs

Downtime is an unwelcome reality. But, beyond the immediate disruption, outages carry a significant financial burden, impacting revenue, customer satisfaction, and brand reputation. For SREs and IT professionals, understanding the cost of downtime is crucial to mitigating its impact and building a more resilient infrastructure.
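A first-order estimate of the direct cost of an outage can be sketched as below. This is a simplified model under stated assumptions, not the article's method: it covers only lost revenue and responder labor, and leaves out the harder-to-quantify effects on customer satisfaction and brand reputation. The function and parameter names are hypothetical.

```python
def downtime_cost(
    minutes_down: float,
    revenue_per_hour: float,
    staff_cost_per_hour: float = 0.0,
    responders: int = 0,
) -> float:
    """Estimate direct outage cost as lost revenue plus responder labor."""
    hours = minutes_down / 60
    lost_revenue = hours * revenue_per_hour
    response_labor = hours * staff_cost_per_hour * responders
    return lost_revenue + response_labor


if __name__ == "__main__":
    # Illustrative inputs only: a 90-minute outage, 10,000 per hour in revenue,
    # and 4 responders at 80 per hour.
    print(downtime_cost(90, 10_000, staff_cost_per_hour=80, responders=4))
```

Plugging in real figures for revenue and staffing gives a baseline that can be compared against the cost of resilience work.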

Popular Kubernetes Distributions You Should Know About

In the realm of modern application deployment, orchestrating containers through Kubernetes is essential for achieving scalability and operational efficiency. This blog deals with diverse Kubernetes distribution platforms, each offering tailored solutions for organizations navigating the intricacies of containerized application management.

Webinar: "Is it Done Yet?" - Defining Production Readiness with Internal Developer Portals

While we know software projects are never truly “done,” developers will, nevertheless, often face a long list of tasks needed to achieve a certain level of “doneness.” But to what end? And when do they end? When is done—done enough? In this fireside chat-style webinar, Justin Reock - Head of DevRel for Cortex - alongside Alina Anderson - Principal Technical Program Manager at Outreach - will explore an evolved approach to determining production readiness.

AI in 2024 - What does the future hold?

2023 was an epic year for artificial intelligence. In a year when industry raced faster than academia in machine learning, the state of the art for AI evolved to include increasingly larger amounts of data, and bringing to bear sufficient computing resources to support new use cases remained a challenge for many organisations. With the rise of AI, concerns were not far behind.

Azure Not For You? Here Are 10 Azure Alternatives

Microsoft Azure offers over 150 cloud products and services. In addition to Infrastructure-as-a-Service (IaaS), Software-as-a-Service (SaaS), and Platform-as-a-Service (PaaS), Azure Cloud also supports multiple use cases. Yet, Azure can be complex, expensive, and a lot to figure out — let alone optimize for your specific cloud computing needs. Maybe you’ve looked into the cloud service provider.

Rancher Live: What's the buzz with Cilium?

The Cilium community has had some truly buzzworthy accomplishments (pun intended!) in the past year - from hosting the first ever CiliumCon in Amsterdam to becoming a CNCF graduated project! In this first episode of Rancher Live for 2024, we will be joined by the community pollinator for Isovalent, Bill Mulligan. Together, we will be diving into the how-tos of creating "hive"-ly orchestrated container workloads that are as sweet as honey!

Announcing the Rancher Kubernetes API

It is our pleasure to introduce the first officially supported API with Rancher v2.8: the Rancher Kubernetes API, or RK-API for short. Since the introduction of Rancher v2.0, a publicly supported API has been one of our most requested features. The Rancher APIs, which you may recognize as v3 (Norman) or v1 (Steve), have never been officially supported and can only be automated using our Terraform Provider.

Simplifying MongoDB Operations

In the dynamic realm of database technology, MongoDB has emerged as the leader among document databases. A growing number of enterprises are deploying MongoDB on diverse infrastructures, such as the public and private cloud. This approach is unlocking the advantages of containerisation, virtualisation, and orchestration for MongoDB instances. The benefits are compelling, but achieving them is a highly complex undertaking.

Enlightning - Be a Security Hero with Kubescape as Your Sidekick

Security is something we all need, and something we would all love to forget about. 🙂 Kubescape is a CNCF project that helps to secure your clusters easily and quickly. It helps you validate the configuration of your control plane, workloads, and RBAC; and it finds vulnerabilities, cutting the noise very effectively. In general, Kubescape assists with hardening configurations like network policies and seccomp profiles.

Revolution in Development: Mark Allen's Code Zero & Kubernetes Top Tips - Navigate Europe 23

Mark Allen, a developer evangelist from DevCycle, shares his expertise and insights on revolutionizing local development environments. Diving into his journey from Docker Compose to Code Zero and Kubernetes, Allen unfolds the challenges and solutions in creating efficient, easy-to-use development setups.

Boost Your Software Deliveries with Docker and Kubernetes

Speed and efficiency in software delivery are paramount. The ability to swiftly deploy, manage, and scale applications can make a significant difference in staying ahead in the competitive tech industry. Enter Docker and Kubernetes, two revolutionary technologies that have transformed the way we develop, deploy, and manage software.

Shifting Left with FinOps: A Developer-Centric Approach to Managing Azure Costs

In this episode of "Azure on Air," discover why a developer-centric approach is essential in optimizing Azure costs. Host Lex engages with Michiel, an Azure cost optimization specialist, unraveling the significance of developers in driving efficiency and accountability. Also, learn how shifting left with FinOps allows developers to take ownership of production costs, accelerates innovation, and leads to substantial Azure savings.

What sets OpManager apart as reliable virtual server management software?

Network usage within organizations can occasionally surge due to several factors. For instance, increased traffic might stem from activities such as large-scale promotional campaigns, sudden software updates or patches, escalated remote work demands, or even unforeseen security incidents. An article published by the Financial Express claims e-commerce order volumes spiked by 23% in 2023 during Black Friday sales on the Unicommerce platform.

Taking the Gold Rush to the regions

2023 was the year of Artificial Intelligence (AI). 2024 will build on the incredible momentum of the likes of ChatGPT, Google Bard, Microsoft CoPilot, and others, delivering applications and services that apply AI to every industry imaginable. A recent analysis piece from Schroders makes the point well: “The mass adoption of generative Artificial Intelligence (AI) …has sparked interest akin to the Californian Gold Rush.”

Harmony in Chaos: Uniting Team Autonomy with End-to-End Observability for Business Success

Imagine a symphony where every musician plays their part flawlessly, but without a conductor to guide the orchestra, the result is just a discordant mess. Now apply that image to the modern IT landscape, where development and operations teams work with remarkable autonomy, each expertly playing their part. Agile methodologies and DevOps practices have empowered teams to build and manage their services independently, resulting in an environment that accelerates innovation and development.

Securing Software Development: Marino Wijay's Expert Insights - Navigate Europe 23

Join Marino Wijay as he delves into the critical realm of security and compliance in software development. Faced with the challenge of a solo presentation, Marino showcases a live demonstration of a Kubernetes cluster, emphasizing the importance of integrating security early in the software lifecycle. He explores tools like Kubescape for security scanning and discusses strategies to address cloud-native security challenges.

Time for Some More Meaningful Data Center Metrics?

Our CEO, Jad Jebara, joins the Digitalisation World podcast to provide insights into how the data center industry, prompted by the requirement for meaningful environmental reporting, can work towards a truly sustainable future by focusing on the metrics that matter. See first-hand how modern DCIM software is being used to manage hybrid IT environments. Schedule a free one-on-one demo of Hyperview today.

Rack PDU Management Trends for 2024

As data centers grow more complex and power-hungry, rack PDUs are an increasingly important component of data center power circuits. Modern intelligent rack PDUs have many advanced features and work seamlessly with Data Center Infrastructure Management (DCIM) software to provide a complete solution for monitoring and managing data center infrastructure. Let’s delve into the key trends shaping rack PDU management in 2024 and beyond.

Modernizing Financial Services with Automated, Proactive Threat Management

There’s a rising and intensifying pressure on financial services institutions that aligns with the demand for modernization, down to the core. It comes from frameworks and laws such as Service Organization Control Type 2 (SOC 2) and the General Data Protection Regulation (GDPR), which enforce the need to establish and maintain cybersecurity policies.

Empowerment without Clarity Is Chaos

The companies we work with at Tanzu by Broadcom are constantly looking for better, faster ways of developing and releasing quality software. But digital transformation means fundamentally changing the way you do business, a process that can be derailed by any number of obstacles. In his recent video series, my colleague Michael Coté identifies 14 reasons why it’s hard to change development practices in large organizations.

Client Testimonial - Carhartt

In this featured video, Earl Williams, a Systems Engineer at Carhartt and a longtime Galileo customer, shares a story about how Galileo helped them address a persistent issue within a critical application at their distribution center. Despite increasing CPU and memory resources as instructed by the application vendor, a problem persisted for Carhartt.

Quick #Git Tip: Diffs #shorts

Diffs compare two data sets – like files or commits – and show the changes between them. They're especially handy when reviewing your repo, helping you see at a glance what's been added or removed. 👀 Additions light up in green with a plus sign, while deletions are marked in red with a minus sign. This makes it easier to review changes, decide on merges, and even copy code snippets! 👍
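As a rough illustration of the plus/minus convention described above, here is a minimal sketch using Python's standard `difflib` module rather than Git itself. The sample lines and the `old`/`new` labels are made up for the example; Git's own diff output carries extra headers that this sketch does not reproduce.

```python
import difflib

# Hypothetical "before" and "after" versions of a small file, one entry per line.
old_lines = ["alpha", "beta", "gamma"]
new_lines = ["alpha", "beta2", "gamma", "delta"]

# unified_diff yields header lines, then context lines (leading space),
# deletions (leading "-") and additions (leading "+").
diff_lines = list(
    difflib.unified_diff(
        old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
    )
)

for line in diff_lines:
    print(line)
```

Lines starting with `+` correspond to the green additions and lines starting with `-` to the red deletions that a Git client would highlight.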

The Cost Benefits Of Using Scaling Within An EKS Cluster

The promise of cloud computing has always been about flexibility and cost-effectiveness. Yet many organizations find themselves trapped in a cycle of unpredictable costs and underutilized/overutilized resources. The culprit? A lack of understanding about the power of scaling within platforms like Amazon’s Elastic Kubernetes Service (EKS).

What's New in Docker 2023? Discover New Docker Features with Francesco Ciulla! - Navigate Europe 23

Join Francesco Ciulla, Developer Advocate at Daily.dev, as he explores the latest advancements in Docker in 2023 at Navigate Europe 23. Discover three innovative features: Docker Init for streamlined project setup, Docker Scout for enhanced security through vulnerability detection, and the new Docker Compose File Watch for efficient project management. Dive into Docker's 10-year evolution and the new frontiers in container technology.

Top 5 Azure Monitoring tools in 2024

Many organizations migrate their workloads to the cloud or begin leveraging what the cloud offers. However, to keep their businesses up and running during this process, organizations still need to integrate their cloud systems, like Dynamics365, Salesforce, and ServiceNow, with Azure Integration Services (AIS) and potentially with on-premises systems. One crucial aspect of such integrations is keeping them healthy and available, which requires monitoring and diagnostics.

Self-Hosted or SaaS, JFrog Has You Covered

Freedom of choice can make choosing harder. Since a JFrog cloud (SaaS) account provides the same functionality as a self-hosted JFrog Software Supply Chain Platform, how will you decide which is right for you? Choice without tradeoffs is one of the key ways JFrog enables you to be cloud-nimble, and run the mission-critical heart of your DevSecOps process wherever you need it to be: in any cloud, public or private, as SaaS, BYOL, or on-prem. So how will you decide?

Architecting For Cost In AWS: Design Patterns And Best Practices

Cost optimization in cloud environments is no longer a luxury — it’s a necessity. As businesses increasingly migrate to AWS, they are met with the difficult task of ensuring robust performance and scalability, all while keeping an eagle eye on costs. How does one achieve this delicate balance? How can organizations ensure that their AWS architectures are both high-performing and cost-efficient?

OpenStack with Sunbeam for small-scale private cloud infrastructure

Whenever it comes to a small-scale private cloud infrastructure project roll-out, organisations usually face a serious dilemma. The implementation process often seems complex due to a lack of knowledge, tricky migrations and an immediate need from management to run various extensions, such as Kubernetes, on top.

Intro to GitKraken CLI

There's power in grouping your repos! With the GitKraken CLI, you gain access to GitKraken Workspaces which allow you to group repos and perform multi-repo actions like fetch, pull, and push. Synchronize on PRs & Issues from platforms like GitHub, GitLab, and Bitbucket, and integrate seamlessly with GitKraken Client & GitLens in VS Code for instant Git visualization.

Want to take the stress out of multi-repo management? | GitKraken Workspaces #shorts

In Kevin's recent YouTube video (check him out @kitokeboo), he dives into his secret weapon for multi repo organization – GitKraken Workspaces! See everything you care about in a consolidated view so you don't have to jump between dozens of tabs and tools. 😵‍💫 GitKraken Workspaces simplifies your workflow by bringing all your essential repos into one, easy-to-navigate space. 🌟 Check out Kevin's full video to learn how he uses Workspaces for his open-source projects! 🧑‍💻

Harnessing the Power of Metrics: Four Essential Use Cases for Pod Metrics

In the dynamic world of containerized applications, effective monitoring and optimization are crucial to ensure the efficient operation of Kubernetes clusters. Metrics give you valuable insights into the performance and resource utilization of pods, which are the fundamental units of deployment in Kubernetes. By harnessing the power of pod metrics, organizations can unlock numerous benefits, ranging from cost optimization to capacity planning and ensuring application availability.

Boosting Kubernetes Stability by Managing the Human Factor

As technology takes the driver’s seat in our lives, Kubernetes is taking center stage in IT operations. Google first introduced Kubernetes in 2014 to handle high-demand workloads. Today, it has become the go-to choice for cloud-native environments. Kubernetes’ primary purpose is to simplify the management of distributed systems and offer a smooth interface for handling containerized applications no matter where they’re deployed.

Striking the Balance: Tips for Enhancing Access Control and Enforcing Governance in Kubernetes

Kubernetes, with its robust, flexible, and extensible architecture, has rapidly become the standard for managing containerized applications at scale. However, Kubernetes presents its own unique set of access control and security challenges. Given its distributed and dynamic nature, Kubernetes necessitates a different model than traditional monolithic apps.

Docker File Best Practices For DevOps Engineer

Containerization has become a cornerstone of modern software development and deployment. Docker, a leading containerization platform, has revolutionized the way applications are built, shipped, and deployed. As a DevOps engineer, mastering Docker and understanding best practices for Dockerfile creation is essential for efficient and scalable containerized workflows. Let’s delve into some crucial best practices to optimize your Dockerfiles.

New Feature: Deploy Your Helm Charts With Ease

Today, I'm thrilled to announce a game-changing feature that's set to revolutionize how Platform Engineers and Developers interact with Kubernetes: the introduction of first-class support for Helm Charts within Qovery 🥳. This latest update, now generally available, empowers our users to deploy any Helm Chart with unparalleled ease, marking a significant leap forward in our mission to streamline application deployment for everyone.

How to Fix the "DNS Server Not Responding" Error?

With the vast amount of data that is transmitted through the internet, it is essential to have a reliable connection. However, sometimes even the most stable connection can experience issues, one of which is the "DNS Server Not Responding" error. This error occurs when your device is unable to establish a connection with the DNS server, thereby depriving you of access to the internet.