Autoscaling Made Easy with Rancher Cluster API

Kubernetes has revolutionized application deployment and management. However, manually adjusting cluster sizes to meet fluctuating workloads, without constantly under- or over-provisioning resources, quickly drains platform teams’ time and energy. While traditional cloud provider autoscaling tools are functional, they often fall short when it comes to truly dynamic, Kubernetes-aware scaling, especially in a world with diverse infrastructure.

Bring high-performance observability to secure Kubernetes environments with Datadog's new CSI driver

In Kubernetes environments, applications often communicate with the Datadog Agent to send telemetry data such as custom metrics via DogStatsD or traces through Datadog APM. How this communication takes place depends on the communication mode set on the Datadog Cluster Agent's Admission Controller. With the sockets option, communication takes place through local inter-process communication via Unix domain sockets (UDS), whereas the service and default hostip options rely on network communication.

Kubernetes Is Powerful-But It's Slowing You Down. Here's How to Fix It.

Ask any SRE what slows them down in a Kubernetes incident, and the answer is usually too much information in too many different places. Kubernetes has changed the way we run software. It’s given us incredible flexibility, scalability, and power. But in the years I’ve worked in cloud operations and platform engineering, I’ve also seen how that power comes at a price: complexity.

Rancher Live: What is Developer Advocacy?

Join us for an engaging Rancher live stream hosted by Orlin Vasilev, as we dive into the world of Developer Advocacy—what it really means, why it matters, and how it's evolving in the cloud-native space. Orlin will be joined by two powerhouse guests in the field: Jorge Castro – a community strategist and long-time open source advocate, known for his work with Kubernetes and cloud-native ecosystems. Jorge brings deep insights from years of building developer communities and bridging the gap between engineers and users.

The Second Wave of Private Cloud

Over the past decade, the public cloud became the default way to run software. Its flexibility, on-demand pricing, and global reach made it the obvious choice for many teams. Startups could move fast, and enterprises could avoid long procurement cycles and complex hardware management. As teams gain more experience with cloud infrastructure, unintended consequences start to rear their costly heads. Bills grow quickly and are difficult to predict.

The Digital Sovereignty Revolution: Why Big Tech Is Losing UK Trust

In a world where data is power, UK businesses are demanding control. Civo’s latest whitepaper, The Digital Sovereignty Revolution, uncovers the top challenges UK businesses face in securing true data sovereignty, from eroding trust in Big Tech to geopolitical instability. Our latest research, based on a survey of 1,000+ UK IT decision-makers, reveals the extent to which sovereignty is reshaping the UK's tech sector. From the risks of relying on US-based providers to the benefits of multi-cloud strategies, the whitepaper explores key trends and what they mean for UK businesses.

How we're killing YAML fatigue with our new K8s integration process

Kubernetes has rapidly grown in adoption, with more than 84% of surveyed users evaluating or actively using Kubernetes in some way. It has become the go-to container orchestration platform. As we grow the Coralogix platform, we continuously go back and improve flows that we believe will have a high impact on our user base.

Top 5 Kubernetes Network Issues You Can Catch Early with Calico Whisker

Kubernetes networking is deceptively simple on the surface, until it breaks, silently leaks data, or opens the door to a full-cluster compromise. As modern workloads become more distributed and ephemeral, traditional logging and metrics just can’t keep up with the complexity of cloud-native traffic flows.

Getting started with the relaxAI API: Sovereign, cost-effective AI

Earlier this year, we launched relaxAI, an AI assistant designed with one paramount focus: your privacy. We’re now excited to announce that the relaxAI API is in General Availability (GA), offering an OpenAI interface. This gives UK organizations up to 90% cost savings versus leading providers while ensuring data never leaves UK jurisdiction.

Are Egress Fees Holding Your AI Business Back?

For AI companies, the landscape of cloud computing has always been a balancing act between innovation, costs, and compliance. That’s where Civo comes in. With a full cloud offering that includes GPUs, but without the usual headaches, Civo provides a rare combination: true data sovereignty and zero data egress charges. Let’s break down why these two features should be non-negotiables for your AI infrastructure.

Is your cloud data truly sovereign? The CLOUD Act & FISA 702 reality check

As UK public sector bodies, financial institutions, and enterprises accelerate cloud adoption, a pivotal question emerges: Who truly controls your data, and under which laws? With data breaches and regulatory scrutiny intensifying, storing data and workloads in a host country alone doesn't guarantee sovereignty. U.S.

Understanding GPUs for AI success: Insights from our panel discussion

This blog is based on the webinar “Panel Discussion: Understanding the importance of GPUs for AI success”. You can watch the full recording by clicking here! Last week, we hosted a panel discussion on the importance of GPUs for AI success that featured Kunal Kushwaha (Field CTO), Ben Norris (AI Engineer), and Kendall Miller (Strategic Business Development).

Grafana Cloud updates: deeper insights in Kubernetes Monitoring, Adaptive Metrics updates, and more

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack: Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics. In case you missed them, here’s our monthly round-up of the latest and greatest updates in Grafana Cloud. You can also check out our What’s new in Grafana Cloud documentation to explore all the latest features. Not a Grafana Cloud user yet?

Vendor lock-in and the fight for UK digital sovereignty

To read more on the findings from this research, visit the Digital Sovereignty Revolution whitepaper by clicking here. For years, global hyperscalers have been the backbone of cloud infrastructure for UK businesses. Their scale, reach, and performance made them the default choice. But as geopolitical uncertainty grows and concerns around data governance deepen, the cracks in this model are beginning to show.

Less Overhead, More Impact: The Cycle Approach

Every company is now a software company. While the industry gets caught up in buzzwords and complexity, the core question remains: How can my organization reduce costs without creating long-term problems, and without giving up security or speed? The Cycle platform was built to answer this. It offers a lower total cost of ownership, simplifies operations at scale through automation and standards, and is secure by default without slowing down development.

Accelerate Your Deployment Frequency: Strategies to Remove Bottlenecks

Is slow deployment hindering your mid-size organization? This guide tackles common deployment bottlenecks like manual processes and inconsistent environments head-on. Discover actionable strategies for faster, safer releases, including CI/CD automation, Infrastructure-as-Code (IaC), GitOps, and cultivating a strong DevOps culture.

Evaluating Serverless Vs. Containers And How To Choose

Containers and serverless computing are two of the most popular methods for deploying applications. With the rise of microservices and modern DevOps, teams need faster, leaner ways to build and release software. However, selecting the wrong architecture can slow down delivery, increase cloud costs, or lock you into tools that don’t scale with your business. Both methods have their advantages and disadvantages.

Set Up ClickHouse with Docker Compose

ClickHouse is built for high-performance OLAP workloads, capable of scanning billions of rows in seconds. If your analytical queries are bottlenecked on PostgreSQL or MySQL, or you're burning too much on Elasticsearch infrastructure, ClickHouse offers a faster and more cost-efficient alternative. This blog walks through setting up ClickHouse locally with Docker Compose and scaling toward a production-grade cluster with monitoring in place.

Kubernetes Clusters Break in the Weirdest Ways

If you’ve ever spent hours chasing a weird issue in your Kubernetes cluster, you’re in good company. Reddit’s r/kubernetes is full of hilarious and painful stories about clusters going off the rails for reasons no monitoring dashboard ever predicted. And while it’s easy to laugh after the fact, each of these moments highlights just how important observability is, because these kinds of problems don’t show up on your radar until it’s too late.

Navigating Change: How VMware's VCSP Partner Reduction Affects Your Business

VMware customers are once again facing major change. Under Broadcom’s ownership, VMware is closing its Broadcom Advantage Partner Program for VMware Cloud Service Provider (VCSP) partners on October 31, 2025. As part of this change, VCSP Partner contracts will not be renewed after this date. For many, this shift is more than an internal reshuffle.

Kubernetes Observability with OpenTelemetry | A Complete Setup Guide

Kubernetes provides a wealth of telemetry data, from container metrics and application traces to cluster events and logs. OpenTelemetry offers a vendor-neutral, end-to-end solution for collecting and exporting this telemetry in a standardised format.

Panel Discussion: Understanding the importance of GPUs for AI success

Are you curious about the role of GPUs in AI and how they can accelerate your projects? Join Kunal Kushwaha (Field CTO), Ben Norris (AI Engineer), and Kendall Miller (Strategic Business Development) in this upcoming panel discussion as they dive into the world of GPUs and their significance in AI.

What's Next for Cloud in India? Help Shape the Future with Our Cost Survey

As the cloud computing industry continues to evolve in India, it's becoming increasingly important for organizations to understand the complexities and challenges associated with it. Last year, we released a whitepaper on the cost of cloud that gathered insights from over 500 industry professionals. While this helped us understand more about the rising cloud costs, complex billing models, and vendor lock-in within the UK, it was unclear how this differed for the Indian market.

5 Ways to Accelerate Product Delivery Without Managing Infrastructure

Is slow product delivery holding you back? This article explores how traditional infrastructure management creates significant bottlenecks, from time-consuming provisioning to inconsistent environments. Discover 5 strategies to streamline your delivery without managing infrastructure, including fully managed services, on-demand ephemeral environments, GitOps, self-service deployment platforms, and intelligent container orchestration.

Kubernetes Monitoring backend 2.2: better cluster observability through new alert and recording rules

We’re excited to announce that version 2.2.0 of the backend for our Kubernetes Monitoring solution in Grafana Cloud is now available. The app’s backend is supported by kubernetes-mixin, an open source Prometheus Monitoring Mixin, and this latest version features significant improvements to alert rules and recording rules that will enhance your cluster observability and monitoring experience. There’s a lot to tell you about, so let’s dive in.

The Rise of Tech Events in India: A New Era for Cloud-Native Computing

As India emerges as a significant player in the global public cloud landscape, with its public cloud services market projected to reach $25.5 billion by 2028 at a CAGR of 24.3% for 2023-28, the country is witnessing a surge in tech events. This growth is mirrored in the live events market, which is experiencing a 15% YoY growth, fostering a stronger community and facilitating the exchange of ideas and innovation in the public cloud sector.

Vertical Pod Autoscaling: How It Compares to Pepperdata Capacity Optimizer

Vertical Pod Autoscaling (VPA) is a component within Kubernetes designed to automatically resize the CPU and memory requests of pods based on their observed, historical usage patterns. While Pepperdata Capacity Optimizer and VPA both change the resource requests of pods in response to changing application resource requirements, there are several key differences.

Docker Layer Caching: Speed Up CI/CD Builds

Docker layer caching (DLC) is a powerful technique that can significantly accelerate your CI/CD pipelines. By reusing unchanged image layers across builds, DLC not only cuts down on build times but also reduces cloud costs and boosts developer productivity. In this article, we’ll break down how Docker layer caching works, how to implement it effectively, and how to combine it with ephemeral environments for maximum impact.
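As a rough illustration of why reusing unchanged layers saves time, here is a minimal Python sketch of chained cache keys. It is a simplified model, not Docker's actual implementation: the `layer_keys` and `plan_build` helpers and the `[digest=...]` suffixes standing in for file checksums are assumptions made for this example.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per build step; each key chains in its parent's key."""
    keys, parent = [], ""
    for instruction in instructions:
        parent = hashlib.sha256(f"{parent}\n{instruction}".encode()).hexdigest()
        keys.append(parent)
    return keys


def plan_build(instructions, cached_keys):
    """Mark each step 'hit' if its chained key is already cached, else 'miss'."""
    return ["hit" if key in cached_keys else "miss" for key in layer_keys(instructions)]


# Hypothetical build steps; the digest suffix stands in for a checksum of copied files.
previous = [
    "FROM python:3.12-slim",
    "COPY requirements.txt . [digest=aaa]",
    "RUN pip install -r requirements.txt",
    "COPY src/ . [digest=bbb]",
]
cache = set(layer_keys(previous))

# Only the application source changed, so only the last step is rebuilt.
source_changed = previous[:3] + ["COPY src/ . [digest=ccc]"]
print(plan_build(source_changed, cache))

# A dependency change invalidates that step and every step after it.
deps_changed = [previous[0], "COPY requirements.txt . [digest=zzz]"] + previous[2:]
print(plan_build(deps_changed, cache))
```

Because each key includes its parent's key, a change early in the build invalidates everything after it, which is why rarely changing steps are usually placed first.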

A New Era of Cooperation: How the UK-India Free Trade Deal Can Benefit India's Digital Economy

India’s digital economy is on a historic growth trajectory. According to the Digital Infrastructure Providers Association (DIPA), it’s projected to reach $1 trillion by the end of 2025, driven by rising internet penetration, data consumption, and cloud adoption.

Kubernetes Monitoring 101: 25 Tools And Must-Know Tips

The Kubernetes platform is the standard for orchestrating containerized applications. It’s ideal for large applications running on distributed instances. However, monitoring Kubernetes infrastructure can be notoriously challenging. This guide will cover Kubernetes monitoring in more detail, including what metrics to track to improve visibility and control over your K8s containers, apps, microservices, etc.

Why is the Trust in Big Tech Fracturing? Next Steps for the Cloud Industry

To read more on the findings from this research, visit The Digital Sovereignty Revolution whitepaper by clicking here. For years, US tech giants have enjoyed near-unquestioned dominance over the global cloud market. However, this dominance is being challenged as trust in Big Tech begins to fracture. According to our latest whitepaper, "The Digital Sovereignty Revolution", the foundation of trust in Big Tech's infrastructure and services is cracking.

Cloud Cost Optimization Strategies: How Mid-Size Organisations Can Reduce Cloud Infra Costs

Learn how mid-size companies can dramatically cut cloud infrastructure costs using practical strategies like compute rightsizing, serverless, storage tiering, and automated scaling. This guide also explores how Qovery simplifies and automates cost optimization for growing teams - no full DevOps team required.

How to Get Logs from Docker Containers

When a container misbehaves, logs are the first place to look. Whether you're debugging a crash, tracking API errors, or verifying app behavior, docker logs gives you direct access to what's happening inside. This blog covers the full workflow: how to retrieve logs, filter them by time or service, and set up logging for production environments.

Navigating the Complexities of Data Sovereignty: A Guide for UK Businesses

To read the full findings from this research, visit The Digital Sovereignty Revolution whitepaper by clicking here. As the digital landscape continues to evolve, one question is becoming increasingly pressing: are you in control of your digital future? With growing concerns around data sovereignty and the impact of geopolitical risks on cloud strategies, it's time to assess your organization's digital infrastructure.

How Replicas Work in Kubernetes

Replicas in Kubernetes control how many copies of your pods run simultaneously. They're the foundation of scaling, availability, and recovery in your cluster. When you're running a stateless API or a background worker, understanding how replicas work directly impacts your application's reliability and performance. This blog walks through replica management, from basic concepts to production monitoring patterns that help you maintain healthy, scalable applications.
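The idea of keeping a fixed number of pod copies running can be sketched as a single reconciliation step. This is a simplified Python illustration, not the Kubernetes controller code: `reconcile` is a hypothetical helper, and the real controller uses its own ranking to choose which pods to remove.

```python
def reconcile(desired: int, running: list[str]) -> tuple[int, list[str]]:
    """Compare desired and actual pod counts.

    Returns (number of pods to create, names of pods to delete).
    """
    missing = desired - len(running)
    if missing >= 0:
        return missing, []
    # Too many pods: in this sketch, simply drop the surplus from the end of the list.
    return 0, running[desired:]


# Scale up: one pod is running, three are wanted.
print(reconcile(3, ["api-1"]))

# Scale down: three pods are running, one is wanted.
print(reconcile(1, ["api-1", "api-2", "api-3"]))
```

Running this step repeatedly is what gives replicas their recovery behaviour in the sketch: if a pod disappears, the next pass reports one pod to create.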

Boost Your AI Projects with GPUs: Live Expert Insights Webinar

Are you ready to supercharge your AI initiatives? Join our live webinar on July 16, 2025, at 05:00 PM, where Kunal Kushwaha, Ben Norris, and Kendall Miller will dive into the world of GPUs and their critical role in AI. Get ready to explore the latest insights and trends in GPU technology. This webinar is perfect for AI enthusiasts, startup founders, and engineers looking to stay ahead of the curve.

How to troubleshoot Kubernetes issues using Events | Site24x7 Kubernetes Monitoring

Troubleshooting Kubernetes just got easier. In this video, we walk you through how to use Kubernetes Events in Site24x7 to quickly detect, analyze, and resolve issues like CrashLoopBackOff, ImagePullBackOff, Evicted pods, and more without the guesswork. With Site24x7 Kubernetes Monitoring, you get full observability, right down to every critical event in your cluster.

Building Stronger Tech Communities: Foundations, Success Stories, and Future Directions

Discover the secrets to building thriving tech communities in Austin. Join industry experts Cherie Werner (FIESTA), Kasey Randall (Good Code), Cyndi Schultz (Capital Factory), Laura Santamaria (Red Hat), and Emily Gupton (SKG Texas) as they share their insights on community building, personal success stories, and the future of tech communities. Recorded at Civo Navigate Austin 2025, this panel discussion offers a unique perspective on the challenges and opportunities of building strong, inclusive, and supportive communities.

Docker Status Unhealthy: What It Means and How to Fix It

If your container shows Status: unhealthy, Docker's health check is failing. The container is still running, but something inside, usually your app, isn’t responding as expected. This doesn’t always mean a crash. It just means Docker can’t verify the app is working. Here’s how to debug the issue and restore the container to a healthy state.
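To make the difference between "running" and "unhealthy" concrete, here is a small Python sketch of how a health status can be derived from a sequence of probe results. It is a simplified model under stated assumptions: `health_status` is a hypothetical helper, the default of 3 mirrors the documented default for the HEALTHCHECK `--retries` option, and the start period and timeouts are ignored.

```python
def health_status(results, retries=3):
    """Derive a status from probe results (True = probe passed, False = probe failed).

    The status flips to 'unhealthy' only after `retries` consecutive failures,
    and a single passing probe resets it to 'healthy'.
    """
    status, failures = "starting", 0
    for passed in results:
        if passed:
            status, failures = "healthy", 0
        else:
            failures += 1
            if failures >= retries:
                status = "unhealthy"
    return status


# Two failed probes are not enough to change the status with retries=3.
print(health_status([True, False, False]))

# A third consecutive failure marks the container unhealthy.
print(health_status([True, False, False, False]))
```

In this model, an occasional failed probe does not change the status; only a sustained run of failures does, which matches the point that "unhealthy" signals a failing check rather than a crashed container.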

Best Ways to Find Troublesome Containers and Virtual Machines Using Cycle's Portal

The best problems are the ones you never have to deal with. That's why smart teams catch issues early on, before they impact production. Cycle gives great visibility to spot troublesome workloads, control resource usage, and take action before things go sideways.

Scaled Kubernetes Resource Management Requires Cross-Team Collaboration

As organizations scale their Kubernetes infrastructure, one truth becomes clear: no single team can optimize it alone. Efficiency, resilience, and cost-effectiveness in Kubernetes environments depend on the collective effort of multiple personas, each bringing essential knowledge and responsibility. But it’s not just about division of labor. It’s about active collaboration across roles to unlock the full potential of the platform.

Automating Kubernetes Resource Optimization: Strategies for Efficient, Scalable Workloads

Kubernetes gives you the amazing power to deploy and manage containerized applications. But this power comes with a trade-off. Instead of letting you focus only on writing code and delivering features, Kubernetes also shifts the burden of resource optimization (that is, cost control, performance, and scalability) directly onto your shoulders. The answer to these challenges is automation. Automated optimization takes the guesswork out of resource allocation.

How to Run Elasticsearch on Kubernetes

Elasticsearch stands as one of the most robust open-source search engines available today. Built on Apache Lucene, it handles complex search operations, real-time analytics, and large-scale data processing with impressive speed and accuracy. Kubernetes has transformed how we deploy and manage containerized applications. This orchestration platform automates deployment, scaling, and operations of application containers across clusters of hosts.

Is AI About to Create Its Own Language? Here's What You Need to Know!

This panel brings together experts Josh Mesout (Civo), Nobel Chowdary Mandepudi (Arm), Jimil Patel (Intuit), Numa Dhamani (iVerify), and James Gress (Accenture) to discuss the cutting edge of AI and machine learning. They explore when AI might develop its own language beyond human syntax, the evolving landscape of ML frameworks such as MLIR, Mojo, and JAX, and the challenges involved in bridging the gap from AI research to production while optimizing models for deployment.

A Detailed Look at Calico Cloud Free Tier

As Kubernetes environments grow in scale and complexity, platform teams face increasing pressure to secure workloads without slowing down application delivery. But managing and enforcing network policies in Kubernetes is notoriously difficult—especially when visibility into pod-to-pod communication is limited or nonexistent. Teams are often forced to rely on manual traffic inspection, standalone logs, or trial-and-error policy changes, increasing the risk of misconfiguration and service disruption.

Top Kubernetes Monitoring Tools in 2025, And Why Alerting Is Critical for DevOps and SRE Teams

What are the best Kubernetes monitoring tools in 2025? And how can you ensure alerts actually drive action when something goes wrong? Kubernetes monitoring is critical for keeping your containerized applications healthy, but alerting is often overlooked. This blog compares popular tools like Prometheus and Datadog and explains why intelligent alerting solutions like OnPage are essential for effective incident response.

Introducing the Coralogix Operator for Kubernetes

As organizations begin to scale their observability strategy, point-and-click methods of management become increasingly unworkable. This is why Coralogix has now fully released the Coralogix Operator for Kubernetes. Kubernetes operators are control loops that allow users to declare their desired state in their Kubernetes clusters, and the operator is responsible for resolving this state.

What Impacts GKE Pricing? A Guide To Kubernetes Spending

Google Cloud released Google Kubernetes Engine (GKE) as a commercial version of native Kubernetes (K8s). GKE promises a user-friendly, reliable, and cost-effective service. Yet calculating GKE costs can be daunting, including understanding what you’re paying for and maximizing your return on investment. In this GKE pricing guide, we’ll discuss how GKE pricing works, what it costs, and more.

Pepperdata Helps Karpenter Work Better

Running Kubernetes on AWS? You're probably using Karpenter, the open-source autoscaler that dynamically provisions new instances as your EKS workloads grow. Karpenter launches rightsized instances in real time in response to pending pods, based on available instance types and the resources applications need. It also terminates underutilized nodes to reduce costs.

Logging in Docker Swarm: Visibility Across Distributed Services

Docker Swarm's logging model shifts from individual container logs to service-level aggregation. The docker service logs command batch-retrieves logs present at the time of execution, pulling data from all containers that belong to a service across your cluster. This approach gives you a unified view of distributed applications, but it comes with its own patterns and considerations for effective observability.