Operations | Monitoring | ITSM | DevOps | Cloud

Latest posts

Optimizing Team Strengths for Effective Operations

Most people think great network engineers are defined by technical expertise. This episode challenges that idea. Troy McDonald shows that the real differentiator isn’t just technical skill but the ability to translate complexity into clarity. From military operations to enterprise networks, one lesson keeps showing up.

Unlock telemetry value with a well-planned data lake

Your SIEM only holds a slice of your telemetry. Your data lake holds the rest. We'll show you how to use that to your advantage for investigations, threat hunting, and reporting. Why your data lake beats your SIEM for investigations: your SIEM keeps a short window of expensive, filtered data, while your data lake keeps everything. When something goes wrong, that difference matters more than you think. Threat hunting without the handcuffs: hunting across months of data in a SIEM is painful and costly. We'll show you how a well-planned lake makes broad, deep searches practical and affordable.

Mini Shai-Hulud Explained: How the TanStack and RubyGems Supply Chain Attacks Worked | Harness Blog

Shai-Hulud is back, this time lighter, faster, and more automated than before. This new wave, dubbed Mini Shai-Hulud, has affected a number of packages from tanstack, uipath, opensearch-project, and mistralai, among others, over the past few weeks. The latest series of major compromises came on 19 May 2026 and hit the organizations openclaw-cn and antv. The full post links to an extensive list of affected packages.

Why Artifact Repository Sprawl Slows Down Software Delivery | Harness Blog

Three weeks into a platform modernization project, this question landed in my inbox: "Why does our deployment pipeline take 40 minutes instead of four?" This is artifact repository sprawl in practice, and it does more than slow pipelines. It fragments your security posture, your compliance evidence, and your ability to answer basic questions like "what's actually running in production right now?"

Reduce CI Costs Without Slowing Down Development | Harness Blog

Continuous integration (CI) costs can escalate quickly as engineering teams scale. While most organizations focus on cloud bills, the true cost of CI includes slow build times, developer wait time, inefficient test execution, and overprovisioned infrastructure. CI cost optimization is the practice of reducing the total cost of CI pipelines by improving build efficiency, minimizing compute usage, and eliminating unnecessary work without slowing down development.

Teach Your AI Coding Agent to Instrument, Monitor, and Troubleshoot Infrastructure with netdata/skills

There’s a growing ecosystem of AI coding agents: Claude Code, Cursor, Copilot, Codex, Gemini CLI, Windsurf, and others. They’re good at writing code, but they don’t inherently know how to instrument that code for observability, configure monitoring infrastructure, or troubleshoot production systems using real telemetry data. That knowledge lives in documentation, runbooks, and the heads of your senior SREs.