The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Incorporating Crash Data Into Your Workflow

Crash reporting can alter the balance between support and development, and swing it in your favor! Sure, “literally” is an overused term. But I think it’s fair to say that LITERALLY nobody likes it when software crashes. Users don’t like it when the software they love using breaks. They’re left wondering what happened and how long it will be until there’s a fix.

8 Ways To Love Your Computer More

We consider ourselves tech-savvy. If you are reading this blog, you probably do too. And like us, you probably expect that computers are just supposed to work for you. However, in my experience, this close relationship we share with our machines just expands the realm of possible annoyances we encounter. Here are some easy ways to improve your relationship with the most important piece of equipment in your working life.

A Cool Milestone for Monitoring as Code: Checkly Recognized a THIRD Time by Gartner!

Hello, Checkly community and Monitoring as Code (MaC) aficionados! We have some exhilarating news that we can't wait to share. Our mascot is sporting sunglasses today because Checkly has been named in Gartner®'s 2023 Cool Vendors in Monitoring and Observability: Where Awareness Meets Understanding report!

Five Things Your APM Platform Should Do for Your Container Application Deployments

One of the chief complexities in running large-scale containerized applications is the need for continuous system and application monitoring. Containers are very different from traditional VMs and the three-tier applications that run on them. Monitoring needs to ensure that SLAs promised to the business are being met, and it must also forecast usage trends while identifying problem areas such as bugs, capacity challenges, slowing performance, and any potential downtime.

Dynamic Sampling by Example

Last week, Rachel published a guide describing the advantages of dynamic sampling. In it, we discussed varying sample rates to achieve a target collection rate overall, and having different sample rates for distinct kinds of keys. We also teased the idea of combining the two techniques to preserve the most important events and traces for debugging without drowning them out in a sea of noise.
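
As a rough illustration of combining the two techniques mentioned above, the sketch below derives a separate 1-in-N sample rate for each key so that the total kept per window lands near a target. The equal-share policy and the names `per_key_sample_rates` and `should_keep` are assumptions made for this example, not the method from the guide.

```python
import random


def per_key_sample_rates(counts, target_total):
    """Return a 1-in-N sample rate per key.

    counts: events seen per key in the previous window.
    target_total: events we would like to keep per window overall.
    Assumption: every key gets an equal share of the target.
    """
    if not counts:
        return {}
    share = target_total / len(counts)
    # A rate of 1 means "keep everything"; rare keys bottom out there.
    return {key: max(1, round(count / share)) for key, count in counts.items()}


def should_keep(key, rates, rng=random):
    """Keep an event with probability 1/rate; unknown keys are always kept."""
    rate = rates.get(key, 1)
    return rng.random() < 1.0 / rate


# Example: frequent successes are sampled heavily, rare errors are all kept.
rates = per_key_sample_rates({"200": 9000, "500": 10}, target_total=100)
```

With these inputs, the common `"200"` key ends up with a high sample rate while the rare `"500"` key keeps a rate of 1, so the events most useful for debugging are not drowned out.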

Why Your Lambda Functions May Be Doomed To Fail

AWS Lambda has a cool feature that can be both a blessing and a nightmare for a serverless application, depending on whether it’s properly handled by our code: the retry behavior. A retry occurs when an invocation of a Lambda function results in an error and the AWS Lambda platform automatically invokes the function again, with the same event payload. Before we get deeper, make sure you are familiar with the AWS documentation on the subject.

Best Practices for Monitoring Your Azure Environment

Adoption of cloud services and Azure services in particular has exploded in the last few years – over 60% of enterprises now use Azure. As Azure users deploy ever more sophisticated application architectures, it becomes even more important to have a logging and monitoring system that can handle the complexity. The ELK stack is the most popular tool for this, but comes with its own challenges.

Firefox add-on outage: Yet another reminder for companies to enforce PKI life cycle automation

More often than we’d like to admit, we tend to underestimate the impact of every moving part within an organization, especially those that seem small or insignificant. And usually, it’s not until we’re facing the fallout of neglecting that seemingly insignificant factor that we realize what a mistake we’ve made.