
IT Incidents vs. Alerts

IT incidents are events that disrupt, or cause a deviation from, the regular operating standards of a computer system or network. They can be caused by various factors, including hardware or software failures, human error, or even deliberate external (cybersecurity) attacks. An incident often begins with short delays or services cutting out, for example when a website or server is down, or when access to data or databases takes too long.

The Three Pillars of Observability: Metrics, Logs and Traces

Metrics, logs and traces are often referred to as the three pillars of “observability”. The term observability comes from control theory, where it refers to how well the state of a system can be inferred from the system’s external outputs. Applied to IT, observability is how well the current state of an application can be assessed based on the data it generates. Applications and the IT components they use provide outputs in the form of metrics, events, logs and traces (MELT).
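To make the three pillars concrete, here is a minimal sketch of one request producing a metric, a log line and a trace span. It uses only the Python standard library and an in-memory `telemetry` dictionary. The helper names (`count`, `log`, `span`, `handle_request`) are hypothetical and are not taken from any observability product or SDK.

```python
import time
import uuid
from contextlib import contextmanager

# In-memory stand-in for a telemetry backend (an assumption for illustration).
telemetry = {"metrics": {}, "logs": [], "spans": []}


def count(name, value=1):
    """Metric: a numeric aggregate, here a simple counter."""
    telemetry["metrics"][name] = telemetry["metrics"].get(name, 0) + value


def log(level, message, **fields):
    """Log: a discrete record with free-form context fields."""
    telemetry["logs"].append({"level": level, "message": message, **fields})


@contextmanager
def span(name, trace_id):
    """Trace span: a timed unit of work tied to one request by trace_id."""
    start = time.perf_counter()
    try:
        yield
    finally:
        telemetry["spans"].append(
            {
                "trace_id": trace_id,
                "name": name,
                "duration_s": time.perf_counter() - start,
            }
        )


def handle_request(path):
    """Hypothetical request handler that emits all three signal types."""
    trace_id = uuid.uuid4().hex
    with span("handle_request", trace_id):
        count("requests_total")
        log("INFO", "request handled", path=path, trace_id=trace_id)
    return trace_id
```

Sharing the `trace_id` between the log record and the span is what lets the signals be correlated afterwards. A real deployment would export them to a backend rather than keep them in a dictionary.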

Optimize your CI/CD Pipeline with Coralogix Tagging

Continuous Integration/Continuous Delivery (CI/CD) has become the de facto standard for engineering teams seeking to keep pace with the demands of the modern economy. At Coralogix, we operate some of the most advanced build and deploy pipelines in the world. We’ve baked that knowledge into our platform with a CI/CD Observability feature called Coralogix Tagging.

Rest Assured, Cribl's Improved Webhook Can Now Write to Microsoft Sentinel

As of version 4.0.4, we are excited to announce that Cribl’s webhook can write to any destination or API that requires OAuth, including Microsoft Sentinel. Cribl has long supported OAuth in many destinations through native integrations, but with the enhanced Webhook we can now write to any destination that requires OAuth authentication.

A Guide to OpenTelemetry for .NET Engineers

Hey, .NET engineers! Today, we’ll explore the world of OpenTelemetry, focusing on how it can benefit your .NET applications. We’ll talk about the strengths and weaknesses of OpenTelemetry, walk you through the setup process, discuss the basics, and share some best practices. Plus, we’ll touch on topics like auto-instrumentation, metrics, and more. So, let’s dive in!

Achieving Great Dynamic Sampling with Refinery

Refinery, Honeycomb’s tail-based dynamic sampling proxy, often makes sampling feel like magic. This applies especially to dynamic sampling, because it ensures that interesting and unique traffic is kept, while tossing out nearly-identical “boring” traffic. But like any sufficiently advanced technology, it can feel a bit counterintuitive to wield correctly, at first. On Honeycomb’s Customer Architect team, we’re often asked to assist customers with their Refinery clusters.
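The idea of keeping unusual traffic while thinning near-identical traffic can be illustrated with a toy sampler. This is a sketch under stated assumptions, not Refinery’s algorithm or configuration: the function `dynamic_sample`, its parameters `keep_first` and `rate`, and the event fields are all hypothetical, and the counting scheme is deliberately deterministic so the behaviour is easy to follow.

```python
from collections import Counter


def dynamic_sample(events, key_fn, keep_first=2, rate=10):
    """Toy keyed sampler (illustrative only).

    Keeps the first `keep_first` events seen for each key, then keeps
    1 in `rate` of the events that follow for that key. Returns
    (event, sample_rate) pairs, where sample_rate is how many original
    events the kept event stands in for.
    """
    seen = Counter()
    kept = []
    for event in events:
        key = key_fn(event)
        seen[key] += 1
        n = seen[key]
        if n <= keep_first:
            # Rare or newly seen keys are kept in full.
            kept.append((event, 1))
        elif (n - keep_first) % rate == 0:
            # Frequent keys are thinned to 1 in `rate`.
            kept.append((event, rate))
    return kept


# Hypothetical traffic: 100 near-identical successes and one error.
events = [{"status": 200, "route": "/home"} for _ in range(100)]
events.append({"status": 500, "route": "/checkout"})

kept = dynamic_sample(events, key_fn=lambda e: (e["status"], e["route"]))
```

With this input, the single error is always kept because its key is rare, while the 100 identical successes are reduced to a handful. Recording the sample rate alongside each kept event is what allows approximate totals to be reconstructed later.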

Automated Employee Onboarding: The Gamechanger for New Hires and IT Teams

Too many IT tickets, not enough time. That’s just one problem that comes with a poor employee onboarding experience, and it’s one that causes deal-breaking difficulties for new employees, and just as importantly, IT leaders in the IT service management (ITSM) department. Ninety-three percent of employers said that a good onboarding experience is critical for retention of new employees, according to market share data from Finances Online. The total cost of voluntary turnover in 2020?

Reduce MTTR and Take Automation to a New Level with PagerDuty Global Event Orchestration

PagerDuty’s Global Event Orchestration is now generally available. Global Event Orchestration’s powerful decision engine enriches events, controls their routing, and triggers self-healing actions based on event data. Teams can use this functionality across any or all services within PagerDuty. This feature is a continued investment in Event Orchestration, demonstrating PagerDuty’s commitment to providing customers with best-in-class automation capabilities.