
gRPC Golang Example: Using gRPC with Golang | Speedscale

In this tutorial, you will learn how to work with the gRPC Golang library for microservice communication by creating a simple note-taking application. You will generate a gRPC client that is highly efficient and has a service implementation that handles a diverse range of request and response types. APIs and service-to-service communication are what make modern microservice architecture possible.

How LinkedIn Stopped Relying on Users to Report Bugs

When making changes to your production services, it’s important to have a plan for how to detect problems and roll back changes. How many rollout plans would include: “if it breaks, don’t worry, the users will tell us!” But if your monitoring coverage of production services isn’t complete, you’re implicitly relying on your users to tell you when something breaks.

The Ultimate Guide to REST API Testing: Best Practices and Tools

APIs might not always be in the spotlight, but they’re the unsung heroes of just about every modern software project. APIs make it possible for different apps, services, and platforms to talk to each other seamlessly. When an API falters, so does the entire system. That’s where REST API testing comes in. It’s not just a box to check off—it’s the key to making sure your services are dependable, secure, and prepared to handle the unexpected.

How to Load Test Kubernetes

Performance tests, end-to-end tests, integration tests: there are many different types of tests you can run on your infrastructure. One of the most overlooked kinds is load testing. Failure to include load tests in your supply chain can be detrimental, as you will be missing out on a number of benefits. Load testing Kubernetes has some big advantages.
Sponsored Post

Testing Kubernetes Ingress with Production Traffic

Kubernetes is an incredibly powerful solution, but testing the Kubernetes Ingress resources themselves can prove to be quite tricky. This can lead to significant frustration for developers - bugs can pop up in production that weren't caught during testing, workflows that make sense on paper might fail in practice, and so forth.

A Guide to Optimizing Kubernetes Clusters with Karpenter

With the promise of auto-provisioning and self-healing, Kubernetes environments can be an attractive option for hosting your application platform. However, with increasing budget restrictions, competing cloud providers and offerings, and the need to do more with less, engineers are looking to get a handle on their resource utilization.

LLM Testing in 2025: Methods and Strategies

Large Language Models, or LLMs, have become a near-ubiquitous technology in recent years. Promising the ability to generate human-like content with simple and direct prompts, LLMs have been integrated across a diverse array of systems, purposes, and functions, including content generation, image identification and curation, and even heuristics-based performance testing for APIs and other software components.

Troubleshooting CORS Errors in Offsite API Calls

You may have wrestled with a web application attempting to call an offsite web service, such as an OpenTelemetry Collector, and gotten an odd error with the word CORS in it. Or maybe you got a generic thrown error from your fetch statement that states "Error: Failed to fetch", and you wondered, “What’s the problem, and how can I fix it?” These kinds of errors are called CORS errors, and they can be a bit confusing.

Using gRPC with Python Best Practices Guide

Microservices are now the architecture of choice for many developers when crafting cloud-native applications. A microservices application is a collection of loosely coupled services that communicate with each other, enhancing collaboration, maintainability, scalability, and deployment. There are several options for enabling this communication between microservices. When it comes to Python, gRPC and REST are two extremely popular directions to go.

What is a Memory Leak?

Memory leaks happen when a program fails to release memory it no longer needs, and can be a big issue for developers and system administrators alike, as the gradual depletion of available memory often makes for complex troubleshooting and debugging. Given how the consequences of a memory leak can range from decreased system performance to outright crashes, it’s crucial to isolate the root cause of the leak quickly and efficiently.

Understanding gRPC: A Modern Approach to High-Performance APIs

With systems more interconnected than ever, the ability to communicate quickly and efficiently has become crucial. This is where gRPC, an open-source framework by Google, comes in to transform the way APIs are designed and utilized. In this blog, we will explore what gRPC is, how it works, how it differs from existing protocols like REST, and the best practices for realizing its full potential.

How good is GitHub Copilot at generating Playwright code?

People keep asking us here at Checkly if and how AI can help create solid and maintainable Playwright tests. To answer all these questions, we started by looking at ChatGPT and Claude to conclude that AI tools have the potential to help with test generation but that "normal AI consumer tools" aren't code-focused enough. High-quality results require too complex prompts to be a maintainable solution.

What is API Monitoring? Importance, Tools & Strategies

API Monitoring is the process of continuously observing and testing APIs to ensure they perform as expected, maintain uptime, and deliver the desired functionality. This includes tracking metrics such as API availability, uptime, latency, and response times. Whether you’re dealing with a REST API, a web API, or a microservices architecture, it’s important to understand that monitoring is essential for detecting issues before they impact end-users.

New API endpoint to add comments to error groups

This enhancement is part of Raygun’s 12 Days of Christmas 2024. Over the next few weeks, we’ll share daily updates on bug fixes and feature improvements inspired by feedback from you, our customers. These are the small but impactful changes you’ve asked for, designed to make Raygun faster and easier to use. Check back tomorrow for the next update and see how we’re leveling up your experience one day at a time! Our special thanks to Gwilym from the U.K.

Traffic-Driven Testing: Shift Right Testing

In the process of developing software, designing and performing testing is a critical aspect of ensuring high software reliability, improving software quality, and deploying strong fit and function. The shift-right testing approach moves testing to later in your production cycle as a way of doing this with more accurate user data and post-production testing practices. Also known as “testing in production,” with shift-right, you test software after it has been deployed.

Bridging Old and New: Designing APIs That Connect Legacy Systems with Modern Microservices

Nowadays, choosing between legacy systems that have served the company for years and modern systems is a complex decision. However, instead of making this choice, integrating them would be more financially and operationally beneficial. It would reduce costs and accelerate time-to-market without discarding crucial business logic. We invited Sumit Saha, a Software Engineer from a Big Tech company, to shed more light on this.

Kubernetes vs Docker: 7 Key Differences

It’s impossible to learn about containerization without hearing about Docker and Kubernetes. These two tools together dominate the world of containers, both being the de facto standard in what they each do. When you’re first getting started learning about containers, it can be quite a challenge to figure out the differences between these two tools.

New API endpoints for deployments

This enhancement is part of Raygun’s 12 Days of Christmas 2024. Over the next few weeks, we’ll share daily updates on bug fixes and feature improvements inspired by feedback from you, our customers. These are the small but impactful changes you’ve asked for, designed to make Raygun faster and easier to use. Check back tomorrow for the next update and see how we’re leveling up your experience one day at a time! Our special thanks to Andrew from the U.K.

What is Resilience Testing: The Ultimate Guide

Today’s complex, dynamic applications demand rigorous resilience testing. A common hurdle is accurately mimicking real user behavior. This post discusses a possible solution: production traffic replication (PTR), a technique that captures actual user interactions to enhance chaos testing, and the principle of intentionally introducing failures to evaluate application recovery.

How I reduced an API call from >5 seconds to under 100ms

Given that 100% of the databases I have interacted with in my professional career have been SQL databases, my data-based mental model (please enjoy my pun) has always defaulted to a relational one. However, when spinning up a tiny side project in 2020 (a bot to provide interactivity to my Twitch stream), my data-storing requirements didn’t call for a relational model at the time, so I chose a NoSQL solution: MongoDB.

What are Preview Environments?

Local preview environments are transforming how developers test and validate code changes before merging them into the main codebase. Acting as temporary cloud environments, they provide a production-like setting where new features and bug fixes can be tested in isolation, catching issues early and streamlining the development code review process. These environments are crucial for enhancing development velocity, especially in CI/CD workflows used by DevOps engineers and QA teams.

New API endpoint to delete all source maps

This enhancement is part of Raygun’s 12 Days of Christmas 2024. Over the next few weeks, we’ll share daily updates on bug fixes and feature improvements inspired by feedback from you, our customers. These are the small but impactful changes you’ve asked for, designed to make Raygun faster and easier to use. Check back tomorrow for the next update and see how we’re leveling up your experience one day at a time! Our special thanks to Kelvin from the U.K.

Easiest Way to Monitor Your API Endpoints Using Telegraf

Monitoring the health of your API endpoints is crucial to keeping your applications running smoothly and ensuring users have a reliable experience. Keeping an eye on 4XX and 5XX status codes can help you spot issues like client errors, misconfigurations, or server problems before they get out of hand. Plus, setting up alerts for when these errors spike allows you to react quickly, fix problems, and maintain a high-quality service that your users can count on.

Prometheus Blackbox Exporter vs Kuberhealthy for K8s monitoring

We all implement tools to monitor our nodes and keep our entire cluster up and running. But how often do updates, failures, or errors mean that users suffer outages, even though our status boards look green? As Kubernetes has enabled more complex microservice architecture, the gap between the state of the dashboard, and the health of services for the user, has grown wider.

API update: Manage source maps

We’re thrilled to announce the latest endpoints for the Raygun API - Source maps. This new release allows developers to efficiently add or remove their source maps, with increased flexibility and control over their Raygun platform. The Raygun API now gives you multiple endpoints to manage your JavaScript source maps, making error tracking for your web apps easier than ever.