
How to test Istio and other service meshes

Part of the Gremlin Office Hours series: A monthly deep dive with Gremlin experts. Service meshes bring applications together, but not always reliably. Even the most well-configured Istio deployment can have unexpected reliability risks that aren’t apparent until you’re already in production. Latency, single points of failure, poorly defined APIs—these problems can grow beyond a single service and impact the user experience for your entire application.

It's not just about fixing problems, it's about detecting them before they escalate.

IT teams can’t solve what they can’t see. Undetected issues impacting end users lead to lost revenue, brand reputation damage, and frustrated customers. That’s why proactive monitoring is critical. By simulating end-user experiences, you catch small issues before they snowball into major incidents—saving time, money, and operational headaches.

What Is a Network Assessment, and What Is a Network Audit?

These days, networks are larger and more complex than ever. It’s all too easy to fall short when managing performance, security, and compliance. That’s where network assessments and network audits can help. Both give you a more comprehensive understanding of your network and its current strengths, weaknesses, and threats, so you can identify and resolve issues quickly.

Top 3 tools for DORA metrics reporting: SquaredUp vs Power BI vs Jira

What makes a high-performing software engineering team successful? The DevOps Research and Assessment (DORA) team took up this question around 2015 and created a set of metrics that provide a reliable, data-driven way to measure and improve software delivery performance.

Meta-monitoring Loki (Loki Community Call May 2025)

In this Loki Community Call, we talk about the need for meta-monitoring Loki: why Loki needs to be monitored, what to watch out for, and how to do it. We talk about different ways to get information from Loki that allow you to make it reliable, consistent, and performant, including a Helm chart to deploy a meta-monitoring stack on Kubernetes. We discuss the Loki mixin for Grafana and how to use it to visualize data about Loki. On the call are Jay Clifford, Nicole van der Hoeven, and Dylan Guedes from Grafana Labs.

Cloud quotas: How to make cloud management easy

In the past, a cloud architect's pain point was usually a choice between two competing options. To tackle this confusion, major cloud service providers (CSPs) introduced quotas, each of the three major public CSPs using its own terminology for them. Understanding and managing cloud quotas, also known as service quotas, is the main ingredient of a well-oiled cloud setup and significantly impacts cloud operations.

Empowering Engineers To Act On Cloud Costs - Right From Where They Work

At CloudZero, our mission has always been clear: power efficient innovation in the cloud by connecting engineering decisions with business outcomes. But here’s the truth — we’re not just solving for visibility. We’re solving for action. In order to turn theoretical savings into actual waste elimination and tangible outcomes, you have to enable engineers to act on cost insights within the tools and workflows they already use. CloudZero’s new Jira integration does just that.

How Operational Resilience Can Help Build and Maintain Trust

In today’s business landscape, trust and reputation are the foundation upon which organizations are built. A single service outage or poor customer experience can severely damage both revenue and brand reputation. When customers or businesses encounter obstacles with their preferred vendor, they often turn to competitors – and these temporary shifts frequently become permanent changes in loyalty.

RITA Demo

Watch a demo of RITA, the first AI Agent powered by Resolve Actions. RITA brings smart, real-time assistance to IT ticket handling, where delays often occur. From auto-populating ticket fields and translating user inputs to generating new knowledge articles, RITA enhances the efficiency and accuracy of every support interaction.
+ Increases first call resolution rates
+ Scales your service desk capacity while retaining quality