
Groq vs. GPUs: The future of AI inference in 2026

Jonathan Ross founded Groq, the AI chip startup, back in 2016; the company went on to enter a non-exclusive licensing agreement with NVIDIA for its inference technology as part of a $20 billion deal. The name ‘Groq’ is often confused with Grok, the generative AI chatbot that X (formerly Twitter) launched in 2023. As demand for real-time AI continues to grow, inference has become one of the most important and expensive parts of the machine learning lifecycle.

Finding performance bottlenecks with Pyroscope and Alloy: An example using TON blockchain

Performance optimization often feels like searching for a needle in a haystack. You know your code is slow, but where exactly is the bottleneck? This is where continuous profiling comes in. In this blog post, we’ll explore how continuous profiling with Alloy and Pyroscope can transform the way you approach performance optimization.
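One common way to feed profiles into this kind of pipeline is to expose Go's built-in pprof endpoints and let Alloy scrape them. The minimal sketch below assumes a hypothetical Go service and the conventional localhost:6060 debug port; it is not the instrumentation from the TON example itself.

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
    )

    func main() {
        // Serve pprof on a dedicated port so a collector such as Grafana Alloy
        // can continuously scrape CPU and memory profiles from this process.
        go func() {
            log.Println(http.ListenAndServe("localhost:6060", nil))
        }()

        // ... the rest of the application (e.g. blockchain indexing) runs here ...
        select {}
    }

From there, Alloy is typically configured to scrape that endpoint and forward the collected profiles to a Pyroscope server, where flame graphs make the hot paths visible.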

Build Numbers That Actually Make Sense: Branch-Scoped Sequence IDs in Harness CI

You're tagging Docker images with build numbers. One build is your latest production release on main. A developer pushes a hotfix to release-v2.1, and that run takes the next number. Another developer merges to develop and takes the number after that. A week later someone asks: "What build number are we on for production?" You check the registry and see only a handful of non-consecutive numbers on main. The numbers in between? Scattered across feature branches that may never ship. Your build numbers have stopped telling a useful story.
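To make the idea concrete, here is a minimal, CI-agnostic sketch of a branch-scoped sequence: an atomic counter keyed by branch name. The use of Redis, the key format, and the image name are illustrative assumptions, not how Harness implements its sequence IDs.

    package main

    import (
        "context"
        "fmt"

        "github.com/redis/go-redis/v9"
    )

    // nextBuildNumber returns a monotonically increasing build number that is
    // scoped to a single branch by atomically incrementing a per-branch counter.
    func nextBuildNumber(ctx context.Context, rdb *redis.Client, branch string) (int64, error) {
        // One counter per branch, e.g. "build-seq:main", "build-seq:release-v2.1".
        return rdb.Incr(ctx, "build-seq:"+branch).Result()
    }

    func main() {
        ctx := context.Background()
        rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

        n, err := nextBuildNumber(ctx, rdb, "main")
        if err != nil {
            panic(err)
        }
        // Tag the image with a branch-scoped number, e.g. myapp:main-17.
        fmt.Printf("myapp:main-%d\n", n)
    }

With per-branch counters, main, develop, and each release branch count independently, so the highest number on main is always your latest production build.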

Telegraf Enterprise Beta is Now Available: Centralized Control for Telegraf at Scale

Telegraf is incredibly good at what it does: collecting metrics, logs, and events from just about anywhere and sending them wherever you need. But once Telegraf becomes part of your production telemetry pipeline, spread across environments, teams, regions, and edge locations, the hard part isn’t installing agents; it’s operating them. Configs drift. “Temporary” overrides linger. Rolling out changes across hundreds (or thousands) of agents becomes a careful, manual process.

One CLI, Two Audiences: How We Built for Agents and Humans

Half of the Checkly CLI users are already coding agents. This is not a prediction: it's what the data shows today. Since February, more and more agents have been using the CLI to manage and configure their Checkly monitoring setups. Right now, we're at 50% human and 50% agentic CLI users. And we predict that by the end of 2026, it won't be humans using the CLI; the agents will have taken over. The terminal has become the primary interface for AI agents doing real work in the Checkly ecosystem.

Checkly and the Agentic Software Layer

On November 24th, the Opus 4.5 release upended the entire tech industry. This was the moment when agents became capable. Capable enough to write solid staff-level code. Capable enough to reason about alerts, investigate root causes much faster than most engineers, and set up the reliability layer in a fraction of the time. For me, this feels like an iPhone moment on steroids; the adoption of AI is accelerating much faster than any adoption curve I’ve seen over the past few decades.

How to Reduce MTTR with AI

The quick download: AI reduces MTTR by helping teams detect issues sooner, pinpoint root causes faster, and resolve incidents with less manual effort. IT downtime costs organizations an average of $9,000 per minute. AI-powered observability can cut incident resolution time by up to 70%. Here’s what it takes to get there. Every minute an incident goes unresolved, the meter is running.

Introducing Bits AI Dev Agent for Code Security

As organizations adopt AI-assisted development and increase their release velocity, they are not only generating more code but also finding more vulnerabilities from static analysis. The traditional remediation workflow of manually triaging issues, creating tickets, and opening individual pull requests (PRs) cannot keep pace. Fixing tens of thousands of vulnerabilities one by one is not a viable remediation strategy.

Datadog achieves ISO 42001 certification for responsible AI

As AI-powered products and services become central to how organizations operate, the need for responsible AI governance has never been greater. Customers, partners, and regulators are seeking assurance that AI systems are built, managed, and monitored responsibly and effectively. Datadog is committed to the responsible use of AI, both in how we build our products and in how we help customers observe their AI workloads.

Monitor Nutanix clusters, hosts, and VMs with Datadog

Nutanix is a hyperconverged infrastructure (HCI) platform that combines compute, storage, and virtualization into a single software-defined stack. By collapsing traditional infrastructure tiers into one platform, Nutanix simplifies provisioning and operations for virtualized workloads. Clusters are managed through Prism Central, which provides visibility into health, performance, capacity, and operational activity across hosts and VMs.