Top 10 best practices of Cloud SIEM

Nowadays, it’s not uncommon to see enterprise IT leaders in a situation that seems like a catch-22. They are often expected to help make data-driven decisions that improve productivity and profitability. Yet they are preoccupied with what they consider their core responsibilities: applying best practices to safeguard the IT infrastructure and expediting investigations when incidents occur.

How Our Customers Get 80% Response Rates From Employee Surveys

You can’t really blame most people for thinking employee surveys are a lost cause. Survey fatigue is a real thing, and every organization has its unique red tape when it comes to data collection and HR policy. Even if you manage to convince your employees to respond to that well-crafted questionnaire, how do you separate the signal from the noise and operationalize this information for ITOps?

Five BYOD challenges IT teams face and ways to mitigate them

BYOD stands for bring your own device, whereby your organization lets employees use personal devices for day-to-day work. Sounds simple, right? With corporate devices, the enterprise has complete freedom to choose users’ device types and platforms; BYOD is a different case altogether. In BYOD environments, employees often use devices made by different OEMs and running multiple OS versions.

How to Create an Azure Monitor Alert

Azure Monitor gathers performance metrics from your various Azure resources and allows you to explore those metrics through visualizations. It also allows you to manually create alerts that will notify you when a metric crosses a predefined threshold. In this blog post, we’ll cover how to create an alert in Azure Monitor.

Metrics Documentation with the metrics2docs Tool

Metrictank exposes many metrics to aid with operating the software in production. As the metrictank team (the primary on-call team for metrictank at Grafana Labs) grows and onboards new people, and more customers deploy the software on their premises, we need to solve a few problems regarding the metrics exposed by metrictank.

Sentry Integration Platform: Optimizing Incident Management with Amixr

It’s hard (if not impossible) to imagine production infrastructure without incidents. And service reliability can be highly dependent on how quickly and efficiently engineers are able to tackle these incidents. Reliability engineers are often faced with four questions... Sometimes the answers to these questions are surprising.

Turbocharge QA with Pre-Production Monitoring

Traditionally, Quality Assurance (QA) has been a very manual process. Our QA teams do an amazing job running through test plans, finding critical bugs, and logging reports. But it can be a lot of work to run through the tests again and again, dig into the errors for the contextual information developers need, and prepare the reports those developers rely on to find and fix errors in the codebase quickly.

Understanding common library implementation

As Falco grows in popularity, many new users get exposed to it on a daily basis. As should be expected, most of these users are not aware of the architecture underneath Falco. What components play a role in powering it? How do these components relate to each other? I thought it would be fun to write a blog post that answers these questions, and to write it from a historical perspective.