Sponsored Post

What Do You Use for AI Agent Infrastructure? The Complete Guide to Building Production-Ready Agent Systems

The question "what do you use for AI agent infrastructure?" has become one of the most searched queries in the DevOps and platform engineering space. And for good reason: the global AI agent market is projected to grow from $5.1 billion in 2024 to $47.1 billion by 2030, representing a compound annual growth rate of nearly 45%. With 85% of enterprises expected to implement AI agents by the end of 2025, getting the infrastructure right has never been more critical.

AIEnhancer AI room design: See Your Space Clearly Before You Redesign

Most interior projects don't fail because of bad taste; they fail because people can't fully see the outcome early enough. A vague idea lingers, doubts creep in, and decisions stall. AIEnhancer was built to shorten that uncertain phase, turning ordinary room photos into convincing visual directions that help ideas settle into something concrete and usable.

How Agentic AI is Redefining Network Operations

For much of the past decade, many of the most ambitious ideas in artificial intelligence lived primarily in research papers, labs, and long-term roadmaps. Agentic AI was no exception. The concept of AI systems capable of reasoning, planning, and acting autonomously was widely discussed but largely theoretical. But earlier this month, Gartner released its report The Future of NetOps Is Agentic, reflecting a growing consensus that this has changed. What was once conceptual is now becoming operational.

Context engineering: The missing layer for trusted AI in financial services

Financial services AI demands more than models and prompts. Context engineering provides real-time, governed, and explainable intelligence with Elastic serving as the foundational context layer. Artificial intelligence in financial services is no longer constrained by model capability. The real bottleneck is context.

Track OpenAI Spend: Explain Where Your OpenAI Budget Goes

The inevitable happened. A while back, Gartner projected that in 2026, 30–50% of all new SaaS product features would use LLM inference. That meant OpenAI-style costs would become a standard part of SaaS COGS. Today, OpenAI has become one of the most operationally significant line items for SaaS companies. But for many teams, this creates an uncomfortable gap. Engineering sees OpenAI as a fast path to innovation.

Building with the InfluxDB 3 MCP Server & Claude

InfluxDB 3 Model Context Protocol (MCP) server lets you manage and query InfluxDB 3 (Core, Enterprise, Dedicated, Serverless, Clustered) using natural language through popular LLM tools like Claude Desktop, ChatGPT Desktop, and other MCP-compatible agents. The setup is straightforward. In this article, we will focus on setting up InfluxDB 3 Enterprise using Docker with Claude Desktop.

PIM Systems in the Age of AI: Real Benefits for Businesses

Modern companies and brands compete across multiple channels: websites, marketplaces, social media, and apps, while customers expect accurate, detailed, and personalized product information instantly. Managing product data manually is no longer sustainable. Product Information Management (PIM) systems, once reserved for large companies, are now essential for businesses of all sizes. The global PIM market reached $14.4 billion in 2024 and is expected to grow to $33.4 billion by 2033 (IMARC Group). This growth reflects the urgent need for centralized product data management.

Refactor Safely with AI: Using MCP and Traffic Replay to Validate Code Changes

So as software engineers using AI coding assistants, we’re quickly learning of a new anti-pattern: Hallucinated Success. You give your agent (e.g. Claude via terminal or various IDE code assistants) the command “refactor the billing controller.” The agent happily complies, churning out nice clean code. The agent even goes so far as to write a new unit test suite that passes at 100%. You integrate it. Your test suites pass. Your production code breaks. Why?

Harness AI January 2026 Updates: Human-Aware SRE and Smarter API and Application Security

Harness AI is starting 2026 by doubling down on what it does best: applying intelligent automation to the hardest “after code” problems, incidents, security, and test setup, with three new AI-powered capabilities. These updates continue the same theme as December: move faster, keep control, and let AI handle more of the tedious, error-prone work in your delivery and security pipelines.

Webinar Recap: What It Really Takes To Make AI Profitable

Right now, 48% of organizations say they’re being asked to measure or report on AI-related costs. The problem is that they’re still figuring out how to do it. That was a very telling stat from a recent CloudZero webinar on AI and profitability, and it speaks loudly to the reality that many organizations are still struggling to get a grasp on AI spend, which our data shows to be rising sharply as a part of total spend in recent months.

IT as the Proving Ground for AI: Driving Enterprise Innovation

The Enterprise AI Survey conducted by Digitate in collaboration with Sapio Research revealed that IT operations have emerged as the primary proving ground for artificial intelligence in the enterprise. With 78% of organizations already deploying AI in IT, 65% identifying ITOps as the biggest AI beneficiary, and adoption outpacing every other function, IT leads enterprise AI maturity.

How Qovery uses Qovery to speed up its AI project

Discover how Qovery leverages its own platform to accelerate AI development. Learn how an AI specialist deployed a complex stack, including LLMs, Qdrant, and KEDA, in just one day without needing deep DevOps or Kubernetes expertise. See how the "dogfooding" approach fuels innovation for our DevOps Copilot.

Datadog acquires Propolis

Generative AI enables teams to write and ship code faster than ever. But current methods for testing and quality assurance have not evolved to match the new pace and scale of deployments. Manual and deterministic testing paths quickly become obsolete when new features are released, and they fundamentally can’t test AI outputs, leaving a massive untested surface area. To keep up, teams need new testing methods that can define what goals users have, and ensure that their outcomes match.

Why Context, Not Prompts, Determines AI Agent Performance

Prompt engineering improves single responses, but agent performance is determined by how execution context is captured, replayed, and constrained over time. For the past few years, enterprises have obsessed over prompts, with entire roles emerging around their design and an ecosystem of tooling and templates following close behind. This focus delivered early gains because it allowed teams to rapidly improve outputs without modifying the surrounding system. Over time, those gains flattened.

The Hidden Cost of 30% AI-Generated Code #speedscale #aicoding #devops #technews #ai

AI now writes 30% of Big Tech’s code, but the resulting surge in defects is crashing platforms like AWS and GitHub. Manual testing can no longer keep up with this velocity; it's time to deploy AI Quality Agents to save our systems. Is AI speed worth the decline in code quality, or are we headed for a breaking point? Let me know if you’ve noticed more bugs in your workflow lately. Video collab with @ScottMooreConsultingLLC.

Scaling AI Reliability: Real world lessons from Mistral AI

How does one of the world's leading AI companies keep its infrastructure reliable while shipping new models constantly? In this webinar, Devon Mizelle, Senior SRE at Mistral AI, shares the real story. Devon walks through how Mistral built an automated system that generates synthetic checks for every model the moment it goes live—no manual configuration, no forgotten monitors, no inconsistent alerting. Using monitoring as code, his team eliminated the toil of maintaining hundreds of checks across a rapidly evolving model ecosystem.

How Cisco Revolutionized Platform Engineering with Komodor's Agentic AI

In the world of cloud-native infrastructure, complexity is the silent killer of innovation. For Cisco Outshift, the company’s incubation engine, managing a sprawling environment of AWS EKS clusters and edge-based MicroK8s workloads created a classic bottleneck: the Platform Engineering team was drowning in toil. Facing SRE burnout and the limits of human scaling, Cisco embarked on an ambitious journey to evolve its internal operations from standard DevOps to Agentic AI.

Boost your test coverage with CircleCI Chunk AI agent

Test coverage is one of those metrics everyone agrees matters until it’s time to actually write the tests. Between shipping features, fixing bugs, and handling production issues, writing comprehensive tests for edge cases and error paths often falls to the bottom of the backlog. The result is coverage gaps that accumulate technical debt and leave your codebase vulnerable to regressions. As AI-powered development tools reshape how we write code, the volume and velocity of changes is accelerating.

Fix bugs faster with CircleCI's Chunk AI agent

Bugs hide in plain sight. A date validator that rejects February 29th on leap years. An edge case that slips through code review. A flaky test that passes locally but fails in CI. These issues erode trust in your codebase and waste hours of debugging time. In the era of AI-assisted development, code is being written faster than ever. But speed creates risk.

Intelligent Voice Agents: Transforming Operational Efficiency in Business Communication

Businesses today operate in an environment where customers expect immediate, personalised responses across all communication channels. Traditional phone systems and support teams often struggle to meet these demands without incurring significant costs or operational strain. As a result, intelligent voice agents - AI-powered systems designed to conduct natural conversations - are emerging as a strategic solution to drive operational efficiency and enhance customer engagement.

Why the 2026 Google Ads Specialist is Now an "AI Strategist"

The days of simply "picking the right keywords" are officially behind us. As we navigate 2026, the Google Ads interface has transformed from a tool of manual levers into a sophisticated AI powerhouse. For businesses, this means the role of a Google Ads Specialist has fundamentally shifted. They are no longer just "account managers"; they are data architects and AI co-pilots. If you're wondering why your old PPC strategies aren't hitting the same ROAS (Return on Ad Spend) targets, it's time to look at how the role of the specialist has evolved.
Sponsored Post

Meet AlmaIQ: The AI Concierge Simplifying Employee Support

Almaden has exciting news to make life easier for enterprise employees: AlmaIQ. Unlike other virtual assistants that are complex to set up and maintain, AlmaIQ is simple. Acting like a "concierge" or personal assistant, it answers questions, from computer issues to corporate processes, instantly and without complication in the user's native language.

AI is not intelligent. It's obedient.

Tech companies and brands love calling AI “intelligent.” But is it really? AI doesn’t decide what matters. Humans do. We decide what’s important, then feed prompts, data, and instructions into AI models so they work the way they do. At the end of the day, AI is obedient to human intelligence, not the other way around. And it’s on us to use it in ways that actually matter, instead of dismissing it or freaking out that it’s going to replace humans.

Seer: debug with AI at every stage of development

When we launched Seer, our AI debugging agent, we built it on a core belief: production context is essential for understanding the complex failure modes of real-world software. Seer uses the detailed telemetry that Sentry collects (errors, spans, logs, metrics, and more) to accurately root cause and fix bugs. Because this telemetry is trace-connected, Seer can deterministically traverse all the data relevant to a problem rather than relying exclusively on imprecise time-range searches.

How to Reduce Service Desk Workload with AI and Automation

For many IT directors, the service desk feels permanently stretched. It’s a math problem that is forever in motion. Every quarter brings new apps, new devices, new access rules, and new ways for small issues to become daily interruptions. Even when tooling improves, the queue still grows because the work expands with the environment. The pressure shows up in familiar places, like rising ticket counts, tighter SLAs, and a large backlog of projects that need help.

The 2026 IT Leader's Priority Shift: Why AI, Resilience, and Visibility Now Outrank Everything Else

IT leaders are replacing traditional focuses with three things that now outrank everything else: AI readiness, operational resilience, and unified visibility. You can’t add another priority to the list. There’s no space left. Your team is already stretched managing hybrid infrastructure, responding to incidents, juggling tool sprawl, and delivering on AI promises while keeping costs under control.

Getting Started with Seer - Sentry's AI Debugging Agent

Seer is Sentry's AI Debugging agent that has access to all the context that Sentry pulls together from your applications. Sometimes it shows up predicting bugs before they ship to prod. Sometimes it's catching issues in prod and bringing you the fix. Seer pulls from distributed traces, logs, profiles, stack traces, errors, and your codebase, and helps you find the broken parts of your application and fix them faster.

AI Is WAY More Expensive Than You Think... | SolarWinds TechPod #105

Artificial intelligence isn’t just about innovation and efficiency — it comes with hidden costs. From massive data centers and rising energy consumption to layoffs, governance, and long-term business impact, the real price of AI is often ignored. Companies rush to adopt AI, but are they calculating the true cost for the environment and their bottom line?

Optimize your CI/CD pipeline with CircleCI Chunk AI agent

A slow CI/CD pipeline costs more than just time. Developers context-switch while waiting for builds, feedback loops stretch longer, and compute costs add up with every inefficient run. Most teams know their pipelines could be faster, but optimizing configurations requires deep knowledge of caching strategies, parallelism, and resource allocation. The challenge compounds with AI-assisted development. As AI coding assistants help teams ship code faster, pipelines run more frequently.

Refactor your codebase with CircleCI Chunk AI agent

d function there, and before long you’re navigating a codebase full of inconsistent patterns, repeated logic, and code that’s harder to maintain than it should be. Refactoring is essential, but finding the time to clean up code while shipping features is a constant challenge. The rise of AI-assisted development has accelerated this tension. AI coding assistants help teams ship features faster, but they don’t always produce consistent code.

Can We Still Trust the Code? #speedscale #qualityassurance #digitaltwin #trust #devops

The "Velocity Gap" is real. AI tools like Claude and GitHub Copilot are pumping out code faster than ever, but there’s a catch: engineers don't trust it yet. We’re moving away from the old days of "clicking around" in a test environment, but how do we verify code at the speed of light? Ken breaks down why the future of QA isn't just "testing," it’s simulation. Video collab with @ScottMooreConsultingLLC. Learn more: speedscale.com.

The 4 pillars of AI in 2026: Agents, cost, observability & sovereignty

AI is no longer just about "one-shot" prompts. In this session from our "From Idea to Agent" webinar, Ben Norris (AI Engineer at Civo) breaks down the four key priorities dominating the enterprise space in 2026. From the 130x explosion in token usage to the "vibe-coding" revolution, learn why businesses are turning away from US hyperscalers in favor of democratized, secure, and UK-sovereign AI infrastructure. We explore how autonomous agents are solving multi-step problems and why "Chain of Thought" reasoning is unlocking AI for heavily regulated industries like finance and healthcare.

Agentic IT operations, powered by BigPanda

BigPanda delivers the next evolution in AIOps solutions, featuring agentic automation for ITOps and ITSM teams, all in a single platform. Agentic IT operations from BigPanda keep the digital world running by transforming reactive, manual IT processes into proactive, intelligent automation. Our platform uses AI to detect, respond to, and prevent IT incidents at machine speed.

Actionable Network Device Monitoring with Automated Anomaly Detection and AI Troubleshooting

Network device monitoring is often a mess of polling, graphs, and alerts that don't lead to answers. In this webinar, we'll show how to monitor routers, switches, and firewalls in a way that quickly surfaces what matters: interface health, errors, drops, saturation, latency signals, and performance regressions—without drowning in noise. You'll learn how Netdata turns raw SNMP metrics into high-signal insights using automated anomaly detection and AI-assisted troubleshooting, so your team can move from 'something is wrong' to 'here's the root cause' faster.

GenAI Observability in Grafana Cloud: End-to-End Agent Debugging (Demo)

From Observability for GenAI Applications (Grafana OpenTelemetry Community Call): we drill into traces to see which agents called which tools, where errors occurred, how long each LLM call took, and how costs and tokens are distributed. The walkthrough also covers using AI assistance to summarize long traces and identify optimization opportunities in real time.

AI SRE in Practice: Resolving Node Termination Events at Scale

When a node terminates unexpectedly in a Kubernetes cluster, the immediate symptoms are obvious. Workloads restart elsewhere, services experience partial outages, and alerts fire across multiple systems. The harder question is why it happened and how to prevent it from recurring. This scenario walks through a node termination event where the entire node pool was affected, requiring investigation across infrastructure layers to identify root cause and implement lasting remediation.

AI Hosting: The Colocation vs. Cloud Dilemma for Your Next Project

Organisations running AI workloads, like banks training fraud detection models, hospitals testing diagnostic tools, or manufacturers using predictive analytics, all face the same problem: hosting them is costly and resource-intensive. They require dedicated GPUs running non-stop, vast amounts of data moving in and out, and far more power and cooling than a typical IT system.

AI in Production Is Growing Faster Than We Can Trust it

Enterprise software has moved past the generative AI testing phase. Businesses with millions of daily users or workloads are no longer just prototyping LLMs in a vacuum. They’re directly wiring agentic efficiency into product interfaces and infrastructure to stay competitive. This wave is often compared to the spread of microservices in the past, but we aren’t just adding new dependencies and complexity.

Engineering reliable AI agents: The prompt structure guide

The difference between an AI assistant that "almost" works and one that consistently delivers high-value results is rarely a matter of raw model capability. Instead, the bottleneck is typically the quality and structure of the instructions provided. For DevOps and SRE teams building automated workflows, "magical prompt tricks" are no substitute for a repeatable, engineered structure.

The Invisible Million Dollars and How AI Prevents Revenue Leakage

We have spent the last decade engineering our organizations for velocity. We optimized for "Land and Expand." We celebrated bookings. We built commercial architectures designed to intake revenue faster than we could operationalize it. In that era, operational friction was accepted as the cost of doing business. That era is over. The mandate has shifted from growth at all costs to efficient growth.

Feature Friday: Personalized Context in One Click. New Cortex MCP My Workspace

Stop digging through tabs and explaining your role to your tools. In this Feature Friday, we’re unveiling 'My Workspace' for the Cortex MCP, designed to give you instant, personalized identity the moment you start your day. Before today, you had to manually fetch data from Jira, GitHub, and internal docs just to figure out your priorities. Now, you can simply ask, "What should I work on this week?" and get a parallel, high-speed pull of your entire ecosystem.

AI Is Bigger Than LLMs: Why Network Teams Need to Think Beyond Chatbots and Agents

AI in network operations is more than chatbots and agents. LLMs make AI easier to use, but the real value comes from the underlying system of telemetry, data pipelines, analytics, ML models, domain knowledge, and workflows that help engineers reason, predict, and act. When designed thoughtfully, AI doesn’t replace engineers. Instead, it augments their expertise and reduces cognitive load across complex network operations.

From Trough to Traction: 10 Real-World Lessons in Cloud and AI Efficiency

When CloudZero CTO Erik Peterson joined the SourceForge podcast in January 2026, he didn’t just talk about cloud costs. He reframed them as a launchpad for innovation, survival, and competitive advantage. Whether he was describing the “trough of lost innovation,” the “freemium tax,” or why efficiency is the next frontier of engineering culture, Erik’s expert insights go beyond FinOps hygiene.

Agentic AI Essentials: Adoption Pitfalls and How to Avoid Them

In the last article in this series, we explored how IT professionals and leaders can cut through the hype surrounding agentic AI and gain a deeper understanding of what the technology actually offers. Now, we turn to the practical side: how to integrate it effectively. Let’s explore the challenges and outline strategies that organizations of all sizes can use to adopt agentic AI with confidence.

Why Your Hotel's Review Responses Matter More Than You Think for Guest Loyalty

Price wars? Those are yesterday's battles. Location advantages? Sure, they help. But here's what really determines whether guests come back to your hotel: trust. And trust doesn't live on your homepage; it lives in your review section. Every time someone takes fifteen minutes out of their day to write about their stay, your reply (or radio silence) tells them exactly who you are as a brand.

[Webinar] Building Quality-Driven Agentic AI in Noisy Big Data Environments

Watch as Itiel Shwartz, Komodor CTO and Co-Founder, shares hard-won lessons from developing an AI agent that processes millions of K8s events daily to deliver autonomous troubleshooting that reached 95%+ accuracy in benchmarking. This webinar covers building production-ready systems that maintain reliability when 90% of your data is noise, and how Komodor developed its AI SRE agent.

An introduction to GPU time-slicing

GPUs are no longer a niche component. Gamers know them for immersive graphics, workstation users rely on them for balanced performance, and in the age of AI, GPUs have become one of the most in-demand resources in modern infrastructure. They are also expensive. That reality creates two immediate constraints, for individuals and enterprises alike: GPU-backed instances should be provisioned deliberately, and once provisioned, they should be used efficiently.

AI Anomaly Detection: Catch AI Cost Surprises Before They Kill Margins

Consider this: traditional cloud cost monitoring was like checking your fuel gauge once a month — after the trip was already over. That model worked when infrastructure scaled slowly. You provisioned resources predictably and paid for stable, linear usage. AI breaks that model. Today, AI costs behave like a high-performance engine with a hypersensitive throttle. A small input, like a prompt change or a single power user, can dramatically increase your fuel burn in seconds.

Measuring Claude Code ROI and Adoption in Honeycomb

At Honeycomb, we’ve been using Claude Code across our engineering team for a while. Anecdotally, I had a sense of who the power users were, and I had seen some examples of complex usage. But I wanted to be able to confidently answer questions, like: Claude Code supports OpenTelemetry out of the box, which means sending telemetry to Honeycomb takes just a few minutes of configuration.

ChatOps that actually works: Grafana Cloud, Slack, and AI-powered observability

Context switching isn’t just inefficient—under pressure, it’s exhausting. It slows decision-making, increases the risk of mistakes, and makes even experienced engineers feel like they’re always a step behind the system they’re responsible for. At Grafana Labs, we want to build tools that meet you where you are. That's why we embedded Grafana Assistant, our context-aware AI assistant, directly in Grafana Cloud.

How to Troubleshoot BGP Faster with Kentik AI Advisor

A BGP session goes down because a transit provider exceeded the maximum prefix limit. How do you find the root cause — fast? In this 10-minute demo, we walk through two approaches using Kentik AI Advisor. First, we troubleshoot step by step using natural language: asking AI Advisor to identify the affected interface, check for interface flapping, and review syslog messages until we find the maximum-prefix violation. Then we show how custom network context and natural language runbooks let AI Advisor do the entire investigation autonomously — following the same four steps a senior engineer would.

MCP: Why AI Needs Git Intelligence

GitKraken CTO Eric Amodio breaks down the Model Context Protocol (MCP) and explains why Git intelligence is critical for AI agents at GitKon 2025. In this session, Eric covers: what MCP is and why every major AI company adopted it; why AI needs Git history, not just file system access; how GitKraken MCP removes Git pain safely; the future of agentic developer workflows; and how Commit Composer uses AI to organize commits without losing data.

GitKraken Insights | Engineering Intelligence in Minutes

Most software intelligence tools take months to implement, cost a fortune, and end up collecting dust. GitKraken Insights is different. It helps engineering leaders measure what matters: AI impact, code quality, delivery performance, and developer experience, all in one place. It’s the latest evolution of the GitKraken DevEx platform, trusted by over 40 million developers. Insights connects data from across your GitKraken tools to give you a complete picture of engineering health and value. We're talking DORA metrics, pull request metrics, and AI impact.

What is ServiceNow's AI Control Tower?

What happens when AI agents stop being scattered and start being steered? Customer service queues shrink, teams get time back for high-value work, and everyone finally works off the same data. That’s the power of the ServiceNow AI Control Tower—all your AI, all under control. No more fragmentation. No more busywork. Just visibility, control, and workflows that scale across the entire business.

Observability for GenAI Applications (Grafana OpenTelemetry Community Call)

In this episode, we’re diving into observability for Generative AI apps. AI helps us write code and monitor applications in production, but how do we observe the AI itself? And how do we make sense of complex, non-deterministic AI systems? We’re joined by two great guests: Ishan Jain, working on GenAI observability, and Luccas Quadros, working on Grafana Assistant. Together, they bring both platform-level insights and real-world perspectives.

From idea to agent: Building AI workflows with relaxAI and n8n

Join us for this live online webinar as we explore how to design, build, and deploy practical AI agents using n8n’s workflow automation platform powered by relaxAI’s UK sovereign infrastructure. Our speaker, Ben Norris, AI Engineer at Civo, will guide you through the real-world process of creating intelligent agents that automate tasks across tools and services, all without deep coding expertise.

How to Use PostgreSQL AI for Query Writing and Optimization

PostgreSQL AI is gaining attention as SQL complexity increases in production environments. It addresses a common problem: extended queries that accumulate joins, nested logic, and edge cases. Without AI assistance, these queries are often harder to write and review, driving 20–40% of developer time into debugging. In practice, these challenges affect PostgreSQL users in different ways.

How GitKraken's AI-Powered Commit Composer Eliminates Git Cleanup Headaches

As developers, we’ve all been there: a frantic coding session, a few hasty commits, and suddenly our Git history looks like a patchwork quilt of “fix,” “oops,” and “stuff.” While git rebase -i is a powerful tool for cleaning up, it’s also a source of anxiety for many, often leading to more headaches than it solves. What if you could achieve a pristine, meaningful commit history without the fear of breaking things or hours spent squashing and rewriting?

Why AI Automation for ITOps Needs Context Graphs

AI automation in ITOps fails because execution loses decision context, and context graphs turn incident history into durable execution memory that systems can actually reuse. AI automation for ITOps fails because it remembers what it did, but not why. Fixing an issue depends on what was tried last time, what failed, what worked, which exceptions were approved, and under what conditions. That information rarely lives in the system.
Sponsored Post

Digital Twins Gone Wild: My Unexpected AI Doppelgänger

I recently tried using AI to create a digital twin of myself. I uploaded a photo, expecting a futuristic, slightly improved version of me... and what did I get in return? A picture of Kim Jong Un. Clearly, AI has a sense of humor, or a very different definition of "twin." Forget Arnold Schwarzenegger and Danny DeVito. Digital Twins 2: Now Starring My AI Doppelgänger. From Speedscale's perspective, a digital twin is built from real production traffic, continuously updated, and executable in your test and CI/CD environments.

Announcing the Harness Human-Aware Change Agent

AI that understands human insight and connects it to the changes that drive real incidents. At Harness, our story has always been about change — helping teams ship faster, deploy safer, and control the blast radius of every modification to production. Deployments, feature flags, pipelines, and governance are all expressions of how organizations evolve their software. Today, the pace of change is accelerating.

AI In 2026: Autonomous, Invisible, Expensive

With all we’ve seen from AI in the last several years, it can be easy to forget that it’s still in its very early days. As torrid as its evolution has been thus far, it will only intensify. As SVP of Engineering at a B2B SaaS company, I’ve had a front-row seat for much of this evolution. Here are three ways I see AI heading in 2026.

How AI amplifies your entire engineering culture

Anyone who has ever attempted to learn the guitar knows the lure of buying high-end gear. Surely, an expensive guitar and a best-in-class amplifier will hide the fact that you only know a few chords and maybe the lead line to that one song you keep hearing on the radio. What most players find out, however, is that spending thousands of dollars on gear doesn't change the fact that you're not that good yet.

How is the next wave of AI impacting the Indian cloud scene?

Gartner has predicted that 2026 will see a 10.6% increase in India’s total IT spend from 2025 (2025: USD 159 billion vs 2026: USD 176.3 billion), with data centres, cloud infrastructure, and AI-enabled technologies driving this growth. This isn’t just a budget increase; it’s a fundamental shift in where innovation happens, who owns the infrastructure, and how we translate AI potential into scalable impact.

Building & Enforcing an AI Policy

Just like "cloud" in years past, the term "AI" has permeated just about every tech space and product. And although AI has plenty of business benefits, there are also plenty of risks in using AI, especially when it comes to sensitive business information. End users may not understand (or care about) the negative impacts of AI, which is where IT comes in. In this stream, you'll learn how to build an AI policy that works and how to enforce it so that users actually follow it!

AI SRE Update: Your Feedback Shaped Our Latest Release

A note from Lauren Nagel, Mezmo's VP of Product: At Mezmo, we believe the best observability tools aren't just built for users, they're built with them. Since the launch of Mezmo's AI SRE agent, we've listened and learned from our customers. The feedback and insights have been invaluable in helping our teams refine and enhance the experience. Today, we're excited to share our latest release, packed with improvements and powerful new capabilities that make our AI SRE even faster and more intuitive.

Building AI-Ready Database Operations: A Deep Dive into Maturity and Actionable Habits

Now, we will explore how this foundational strength translates into organizational maturity and lay out the durable operating habits required to bridge the gap between reactive firefighting and strategic performance engineering.

What is a Scam Checker and How Can It Protect You Online?

Here's something that should worry you: online scams are evolving faster than ever before. We're not talking about clumsy Nigerian prince emails anymore. Last year, Americans handed over billions to digital con artists, and those figures? They're accelerating at an alarming rate. The truly frustrating part is that most people only discover they've been victimized after the damage is done.

AI Impact on software engineering (as I see it)

When I first started using AI (Cursor, to be more specific) for coding, I was very impressed to see how it could generate such high-quality code, and I understand why it's now one of the most widely used tools for software engineers. As I continued to use them more regularly, I realized they are far from perfect. Their effectiveness depends heavily on how they are used and the context in which they are applied.

Lightrun Runtime Context MCP | Lightrun

In this video, Lightrun's Moshe Sambol walks you through the power of Lightrun MCP and Runtime Context. A game-changer for AI-assisted development. This integration lets developers debug live issues, inspect real-world variables, and verify fixes across environments, all without leaving the IDE. With Lightrun MCP, you can: Capture live transaction state directly from Staging and Production. Identify root causes using real runtime values, not just static code. Verify fixes instantly without redeploying or context switching.

Observability with AI? Honeycomb with AI!

Since Honeycomb started, it has had a weakness: too many choices. Every field, custom or standard, hundreds of them, all are free to group, filter, and visualize in dozens of ways. Which ones are interesting? Honeycomb exists to help people understand custom software. It doesn’t pretend to know what matters in your application. That’s an interpretive task, not programmatic. Hey, computers can do interpretation now!

The next wave of AI: Open source, robotics & the future of India's tech powerhouse

As we kick off 2026, the tech landscape is being reshaped by the very breakthroughs discussed at Civo Navigate India 2025. This panel, featuring Josh Mesout, Murthy Chitlur, Chirotpal Das and Anjali Batra, laid the groundwork for the AI-driven world we are operating in today. From the rise of agentic AI and small language models to the massive shift toward open-source parity, these experts didn't just discuss trends; they provided the blueprint for building resilient, sovereign, and scalable AI infrastructure in India.

4 Ways AI Chat Helps Operations Teams Work Smarter and Faster

Operational teams live in constant motion. Systems change, incidents escalate, and information is spread across tools that don't speak the same language. The real bottleneck isn't lack of data. It's clarity. People spend more time searching, rewriting, summarizing, and coordinating than they do actually solving problems.

AI SRE in Practice: Diagnosing Configuration Drift in Deployment Failures

Deployments fail for dozens of reasons. Most of them are obvious from the error messages or pod events. But when a deployment rolls out successfully according to Kubernetes but your application starts experiencing latency spikes and error rate increases, the investigation becomes significantly harder. This scenario walks through a configuration drift incident where the deployment appeared healthy but available replicas were constantly flapping, creating cascading reliability issues.

Modern Image Workflows Need Speed - This Is Where AIEnhancer's Watermark Remover Fits

You know that annoying moment when you notice a watermark in a corner of a photo? It's small, maybe almost invisible, but it keeps distracting you. Sometimes you try erasing it manually, and it just... doesn't look right. AIEnhancer steps in here, helping clean images without making them feel over-edited or artificial. It's kind of like having someone do the tedious part for you, but faster.

How to Build Media Operations That Survive Full AI Automation

By the end of 2026, you will upload a product image and a budget to Meta, and its AI will generate the creatives, pick the audience, allocate spend across surfaces, and optimize in real time. Google’s Performance Max already automates bidding, asset selection, and cross‑channel allocation across Search, Shopping, YouTube, Display, and more.

Building reliable dashboard agents with Datadog LLM Observability

This article is part of our series on how Datadog’s engineering teams use LLM Observability to iterate, evaluate, and ship AI-powered agents. In this first story, the Graphing AI team shares how they instrumented their widget- and dashboard-generation agents with LLM Observability to detect regressions and debug failures faster. Visibility into how large language model (LLM) applications behave in real time is essential for building reliable AI-driven systems at Datadog.

Why agentic AI is the future of IT change management

Every enterprise depends on continuous changes to its IT environment. New code releases, infrastructure updates, configuration changes, and security patches are all crucial to support continuous innovation. These same changes are also a leading source of operational risk and one of the most common causes of failures at the network, infrastructure, and software layers, resulting in outages.

How AI OCR Is Reshaping Automated Data Extraction in Large-Scale Business Operations

Businesses handle massive amounts of data every day. This data comes from invoices, bills, contracts, applications, and many other documents, most of which are distributed as scanned copies and images. As a result, when organizations rely on manual data entry to process it, the work is slow and error-prone. To avoid these issues, organizations are now turning to AI-OCR solutions for better data extraction and increased operational efficiency.

AI in Contact Centers: Capabilities, Limits, and the Missing Decision Layer

AI in contact centers refers to the use of artificial intelligence technologies to automate customer interactions, support agents in real time, analyze conversations, and improve operational efficiency. In practice, this includes chatbots, virtual agents, intelligent routing, agent assist tools, sentiment analysis, and automated quality assurance systems designed to increase speed, consistency, and scale.

What is Runtime Context? A Practical Definition for the AI Era

TLDR: Runtime Context is live, execution-level access to a running production system. It lets engineers and AI agents ask precise questions of running code and get answers immediately, without redeploying or interrupting users. This is the new baseline for reliability.

Agentless First, Agents When Needed: A Hybrid Approach to Security Telemetry

Security data collection has become a first-class architectural concern for modern SOCs. Once collection is treated as a dedicated layer, separate from analytics and detection, the next question becomes practical: how should telemetry be collected in a way that aligns with this architecture? In the previous article, we examined why this shift occurred. Here, we focus on how different collection models (agent-based, agentless, and hybrid) fit into modern security data collection architectures.

The Operational Cost of Shadow AI: Securing Data Integrity in Modern Workflows

In the current hyper-accelerated digital landscape, operational efficiency is the bedrock of corporate scaling. However, a silent threat, the "Authenticity Gap," is quietly eroding the reliability of enterprise data as unvetted Generative AI permeates modern workflows. For operations managers, this is a Level 1 silent risk that compounds into significant wealth erosion and project delays if left unmanaged.

Scaling Autonomous Operations with Agentic AI demo with Resolve

What does autonomous IT actually look like? This clip shows it in action. In this moment from our Scaling Autonomous Operations with Agentic AI webinar, RITA meets users where they work. Inside Slack. No portals. No tickets. Just answers. Watch RITA pull personalized knowledge in real time, synced directly from systems like SharePoint. Updates publish once and are instantly available everywhere. Then the real power kicks in.

When AI Speeds Up Change, Knowing First Becomes the Constraint

In a recent post, I argued that AI doesn’t fix weak engineering processes; rather it amplifies them. Strong review practices, clear ownership, and solid fundamentals still matter just as much when code is AI-assisted as when it’s not. That post sparked a follow-up question in the comments that’s worth sitting with: With AI speeding things up, how do teams realise something’s gone wrong before users do? It’s the right question to ask next.

Why AI-driven automation in incident response is viable now

This article explains why AI-driven automation in incident response is feasible now. Teams can finally safely delegate repetitive and time-critical response tasks to AI Agents, which operate with contextual awareness and human oversight. The result is faster response, higher service uptime, and less alert noise – without losing control.

Your Cloud Economics Pulse For January 2026

Welcome to January’s Cloud Economics Pulse, CloudZero’s monthly look at cloud spend as AI moves from vibe to prod. And this related news flash: AI spend keeps hitting new highs. In last month’s Pulse, we explored the compounding effect of AI becoming part of everyday cloud operations. This month, we see that pattern harden into year-end results.

4 foundations you need to scale AI in engineering

As a baseline, engineering leaders need their teams to adopt AI tools to speed up velocity and ship faster. Most organizations have already rolled out AI coding assistants or are evaluating them, but there's a really big difference between buying a tool and successfully scaling it across an engineering organization. If you layer AI on top of a chaotic codebase or a disorganized service catalog, you accelerate the creation of legacy code.

Breaking the Iron Triangle: How AI-powered investigations change the economics of uptime

In engineering, there's a concept known as the Iron Triangle. With three sides (cost, quality, time), it's a framework intended to help you prioritize different aspects of project management. Want fast, high-quality features? It'll cost you. Need to keep costs down while maintaining quality? That'll take time. And if you're trying to move fast and cheap? Well, good luck with quality. For years, this has been the brutal reality of running services on the web.

The Technical Architecture Behind Automated Video Generation Systems

I spent several weeks last year reverse-engineering how automated content pipelines actually work. Not because I wanted to build one necessarily. But because the proliferation of AI-generated video content raised questions I could not answer without understanding the underlying systems. How do these pipelines function? What are their actual capabilities and limitations? Where does technology stand today?

Top Realistic AI Image Generators for Practical Business Use

The gap between AI image generation demos and actual business deployment remains wider than most vendors acknowledge. Marketing materials showcase stunning outputs. Operational reality involves inconsistent results, workflow friction and outputs that require significant human correction before they reach production. For operations leaders evaluating these tools, the question is not which generator produces the most impressive single image. The question is which tool delivers reliable, realistic outputs at scale without disrupting existing workflows or requiring specialized technical expertise.

Is GPTHumanizer AI Legit? An Honest Hands-On Review (2026)

You write a draft blog with ChatGPT. You're happy with it. Then a detector slams you in the face with a "Likely AI Generated" label. But the worst part? It doesn't have to be bad content. Sometimes it's just... too smooth. Too consistent. Too ordinary. And too difficult to attract attention from readers. This market is now jam-packed with AI humanizers that all make basically the same promise: "make your writing more natural, make your writing more readable, make your writing sound more human."

How we built an AI SRE agent that investigates like a team of engineers

We built Bits AI SRE to help engineers investigate and solve production incidents, one of the most difficult aspects of operating distributed systems today. As environments grow more dynamic and complex, resolving issues becomes more challenging. Failures now span more services, involve noisier signals, and encompass larger volumes of telemetry data, making it hard for on-call engineers to find root causes quickly. Today, Bits AI SRE is already helping teams decrease time to resolution by up to 95%.

Automate flaky test fixes with the Bits AI Dev Agent and Test Optimization

Flaky tests are a significant source of inefficiency that impacts many engineering teams. Along with failing your build, they interrupt your entire development flow, generate excessive CI/CD noise, and, critically, compromise developer trust in the test suite itself. Datadog Test Optimization enables you to manage test suites at scale by pinpointing the flakiest tests, analyzing their history across hundreds of runs, and automatically surfacing the root cause.

How To Calculate Your OpenAI Cost Per API Call (And Why It Matters Now)

OpenAI doesn’t bill per feature, per customer, or per transaction. It bills per token, across multiple models, with usage patterns that can change by the hour. As a result, two API calls that support the same feature can have very different costs. Without a clear way to translate token-level pricing into something product, engineering, and finance teams can reason about, AI spend becomes difficult to forecast and harder to control.
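
As a rough illustration of the per-token arithmetic described above, the sketch below computes the cost of a single call from its input and output token counts. The model names and per-million-token prices are hypothetical placeholders, not actual OpenAI rates; substitute the figures from the current pricing page before relying on the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Price of one model in USD per 1 million tokens."""
    input_per_million: float
    output_per_million: float


# Hypothetical prices for illustration only; these are NOT real OpenAI rates.
PRICES = {
    "small-model": ModelPrice(input_per_million=0.50, output_per_million=1.50),
    "large-model": ModelPrice(input_per_million=5.00, output_per_million=15.00),
}


def cost_per_call(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one API call given its token counts."""
    price = PRICES[model]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / 1_000_000


# Two calls backing the same feature can differ widely in cost,
# depending on the model chosen and how many tokens each call uses.
cheap = cost_per_call("small-model", input_tokens=1200, output_tokens=300)
costly = cost_per_call("large-model", input_tokens=6000, output_tokens=1500)
print(f"small-model call: ${cheap:.5f}")
print(f"large-model call: ${costly:.5f}")
```

Summing `cost_per_call` over the calls attributed to a feature or customer is one way to translate token-level pricing into a unit that product and finance teams can reason about.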

Supercharge your LLM Using Production Data Context

Are your LLM coding agents (like Cursor or Claude Code) hallucinating fixes because they don't know what's actually happening in production? In this video, Matt from Speedscale shows you how to bridge the gap between your local IDE and live production traffic using the Model Context Protocol (MCP). Most observability tools just give you telemetry. Speedscale’s MCP server gives your agent the "inner workings" of actual API calls and payloads, so it can check its assumptions against reality. No more "vibe-coding" and hoping it works; let your agent find the 500 errors and rate limits for you.

The 54% Improvement Playbook: How Top Performers Integrate GenAI into ITSM

Don't just read the report—learn how to replicate its most impressive results. In our 2025 State of ITSM Report, a select group of top-performing organizations achieved a staggering 54.3% reduction in resolution time by strategically integrating GenAI. This live session moves beyond the data to share their playbook. We'll provide a step-by-step guide on how to pair GenAI with foundational ITSM practices and demonstrate how to weave these tools into your team's daily workflows to achieve maximum efficiency.

Agentic AI Essentials: Examining the Hype Around Agentic AI

In the first article of our Agentic AI Essentials series, we’ll establish what makes agentic AI distinct. We’ll look at the process of tool calling and examine how agentic systems convert intelligence into action. We’ll also explore the human fears, pressures, and ambitions that fuel the hype around agentic systems. By sorting the signal from the noise, IT decision-makers can take the first step toward making sound decisions around agentic AI adoption.

Operational Risk Management in High-Stakes Decision Environments

In high-stakes environments, every choice carries weight. Whether it is a complex financial process, a real-time cybersecurity response, or a tightly regulated operational workflow, small missteps can rapidly evolve into major failures. Organizations increasingly rely on integrated risk management strategies that blend human judgment with technology. The goal is simple: reduce uncertainty before it becomes costly. But the path to that goal is rarely straightforward.

Let Your LLM Debug Using Production Recordings

Modern LLM coding agents are great at reading code, but they still make assumptions. When something breaks in production, those assumptions can slow you down—especially when the real issue lives in live traffic, API responses, or database behavior. In this post, I’ll walk through how to connect an MCP server to your LLM coding assistant so it can pull real production data on demand, validate its assumptions, and help you debug faster.

AI SRE in Practice: Resolving GPU Hardware Failures in Seconds

When a pod fails during a TensorFlow training job, the investigation usually starts with the obvious questions. The answers rarely come quickly, especially when the failure involves GPU hardware that most engineers don’t troubleshoot regularly. This scenario walks through an actual GPU hardware failure and shows how AI-augmented investigation changes both the time to resolution and the expertise required to handle it.

Cloud Strategy for 2026: the Year of Repatriation, Resilience, and Regional Rebalancing

This year is set to be a pivotal year for cloud strategy, with repatriation gaining momentum due to shifting legislative, geopolitical, and technological pressures. This trend has accelerated, with a growing focus on data sovereignty. These challenges have set the stage for 2026 to be the year of repatriation, resilience, and regional rebalancing. Here, Rob Coupland, Chief Executive Officer at Pulsant, offers his insights.

AI coding assistants are only as good as the context you give them

AI coding assistants have quickly become part of everyday development. Teams now rely on them to explain unfamiliar code, suggest configuration files, debug errors, and accelerate delivery across the stack. But as these tools move from experimentation into real production workflows, a consistent pattern is emerging: AI breaks down at the platform boundary.

Beyond the Blue Link: UX Patterns for Google's AI Overviews, AI Mode & Answer Engines

The blue link is dying—but not in the way we expected. When Google’s AI Overviews began appearing at the top of the search results page, the SEO community panicked. Publishers watched click-through rates plummet. The Pew Research Center confirmed their fears: searchers who encounter an AI summary are half as likely to click on traditional search results (8% vs. 15%).

Vibe coding tools observability with VictoriaMetrics Stack and OpenTelemetry

AI-powered coding assistants have transformed how developers write software. Tools like Claude Code, OpenAI Codex, Gemini CLI, Qwen Code, and OpenCode have introduced what many call “vibe coding” — a new paradigm where users describe their intent and AI agents handle the implementation details. But as these tools become integral to development workflows, a critical question emerges: how do we understand what’s happening under the hood?

Lightrun MCP: Your AI Assistant Now Debugs and Validates Production Code

Intermittent production bugs are hard to debug and rarely reproduce locally. Teams fall into a loop of adding logs, and every rollback slows them down. In this demo, R&D team leads Maor Yaffe and Or Golan show how an AI assistant can verify production issues using real runtime data, without redeploying. By connecting Cursor to Lightrun MCP, the agent inspects live production behavior, collects real variable values, and confirms the root cause with evidence instead of assumptions.

What the Latest Google "AI Mode" Means for Users Who Care about Privacy and Better Experiences

When Google introduced its AI highlights above the main search results, we thought that was all the company would push to prove its determination to turn traditional Google Search, praised by businesses for expansive SEO opportunities, into an AI-powered experience. But if you live in the U.S. and have recently paid attention to the Google homepage, there's a new button called "AI Mode." Well, it turns out the company is still working hard not to lose its dominance to competitors.

Top tips: RAG isn't the problem, context is. Here are 3 fixes.

Top Tips is a weekly column where we highlight what’s trending in the tech world and list ways to explore these trends. This week, we’ll be talking about how we can improve our retrieval-augmented generation (RAG) systems using contextual engineering. Prompt engineering has gained a lot of attention in the past year, and it’s finally time to move on to a better experience that transforms the way AI results are provided to us.

Make Your Engineering Processes Resilient. Not Your Opinions About AI

Why strong reviews, accountability, and monitoring matter more in an AI-assisted world. Artificial intelligence has become the latest fault line in software development. For some teams, it’s an obvious productivity multiplier. For others, it’s viewed with suspicion: a source of low-quality code, unreviewable pull requests, and latent production risk. One concern we hear frequently goes something like this: It’s an understandable fear; and also the wrong conclusion.

When is it ok or not ok to trust AI SRE with your production reliability?

There’s a moment every engineer knows. An AI suggests a fix, it looks reasonable, maybe even obvious, but production is on the line and you hesitate before clicking execute. There’s a big difference between an AI that can recommend an action and one you’re willing to let take that action. All it takes is one bad call, one kubectl command that makes things worse, and suddenly every automated suggestion is a potential liability instead of a help.

Context is King: Why Network AI Needs Domain Knowledge to Work

Generic AI fails in network operations because it lacks the “institutional knowledge” of your specific environment and business priorities. Learn how Kentik’s Custom Network Context encodes your unique operational reality into AI Advisor, turning a generic chatbot into a context-aware teammate.

Context Engineering: How Dev Teams 10x Productivity with AI

Context engineering isn't just an AI buzzword. It's how high-performing dev teams are transforming productivity at scale. Chris Geoghegan, VP of Product at Zapier, breaks down why individual AI gains don't compound and what your team needs to do instead. In this GitKon session, learn how to.

Michael Burry Warns of Artificially Inflated Earnings

On November 10, 2025, Michael Burry, the investor famous for predicting the 2008 subprime mortgage crisis and featured in the film "The Big Short," posted on X, accusing American big tech giants of inflating their earnings. The criticism centers on a widespread accounting practice among companies that have invested in AI: the artificial extension of the useful life of IT equipment, primarily Nvidia GPUs, to mitigate the impact of depreciation on corporate balance sheets.

How companies are using Civo GPUs to accelerate AI innovation without runaway costs

Accessing high-performance GPUs shouldn’t feel like a bottleneck. Yet, as AI adoption accelerates, many teams are discovering that hyperscaler offerings often come with a hidden price: long wait times, opaque billing, and layers of unnecessary complexity. At Civo, we’ve seen a different way. Our GPUs enable companies to move faster while keeping infrastructure overhead and costs firmly under control.

How to Ensure AI-Generated Code is Reliable with Runtime Context

TLDR: AI coding assistants have sped up code delivery, but created a validation gap. Historic telemetry and static analysis cannot predict the behavior of unfamiliar, high-volume code. Lightrun’s Runtime Context MCP closes that gap, allowing AI assistants to verify behavior before it breaks, and resolve issues in real time.

Beyond the Hype: Building a Future-Proof Foundation for the AI-Native Enterprise

We are witnessing a fundamental transformation in how software is built. The industry has moved beyond the experimental phase of Machine Learning Operations and entered a complex new reality: the era of the AI Software Supply Chain. The adoption metrics confirm this shift is irreversible. Google reports that 90% of tech workers are now using AI as part of their daily work. Similarly, McKinsey data reveals that 88% of organizations use AI in at least one business function.

Build custom apps in seconds with conversational AI in App Builder

Using a drag-and-drop interface, engineering teams can create apps that support troubleshooting, improve day-to-day operations, and offer self-service access without leaving Datadog. With the new conversational AI feature, teams can turn an idea into a working app in seconds. Watch the video to see how it works.

Poisoning the Well: The Invisible Danger in Your AI Supply Chain

Welcome to the AI research bites. This series of short and informative talks showcases cutting-edge research work from ServiceNow AI Research team. The AI Research Bites are open to all, especially those interested in keeping up with the fast-paced AI research community.

Finetuning Gemma 3 on private data with Unsloth and CircleCI

Fine-tuning Large Language Models (LLMs) on private, domain-specific data can unlock significant value for your specific use case. When done correctly, you can create AI apps that understand your organization’s unique context. These apps can speak your brand’s voice and deliver remarkably accurate results that general models cannot match. However, finetuning is not always the right solution. Many teams rush into this complex technique without exploring simpler alternatives first.

Agentic AI Essentials: Your Guide to the Future of Automation

To mark the launch, we’re publishing Agentic AI Essentials, a four-part series to help organizations navigate the reality of agentic AI adoption. Across the series, we’ll look at the questions that matter most: what’s real versus hype, how to avoid adoption pitfalls, how to measure ROI, and how roles will evolve once agents are onboarded. Here’s a sneak peek at what’s in store.

How agentic IT operations transform IT Service Management (ITSM)

Enterprise ITOps leaders are realizing that legacy incident management processes are collapsing under the weight of today’s sprawling, hybrid-cloud enterprise environments. The fastest path from reactive firefighting to proactive, automated control is an agentic AI-powered incident assistant that can understand context, coordinate people, and take intelligent action at machine speed. Enterprise IT doesn’t look anything like it did even five years ago.

5 Observability & AI Trends Making Way for an Autonomous IT Reality in 2026

IT operations are changing faster than most people realize, making autonomous IT a 2026 reality, not a distant vision. Your team monitors tens of thousands of metrics, ingests terabytes of logs, and generates thousands of alerts daily. And somehow, you still find out about outages from customers before you see them in your tools. That gap between having visibility and actually understanding what’s happening has become the central problem.

AWS re:Invent 2025 AI-First Incident Management in Slack

Jacky Leybman from PagerDuty and Kaninie Knight from Slack share how their integration streamlines incident response and real-time collaboration. This session highlights practical workflows and measurable gains – such as faster triage and lower MTTR – achieved by connecting on-call operations directly in Slack.

Ep 24: Governing AI in the age of agentic systems and Model Context Protocol

On this episode of Masters of Data, we unpack David's new white paper on AI governance for agentic systems. He explains model context protocol (MCP) as "APIs for agents", how AI systems talk and execute tasks. The catch? Autonomous agents are insider threats that move fast and cause serious damage. David introduces the Model Control Plane (MoCop), a twelve-pillar framework designed to prevent your AI from going rogue. We cover his roadmap for security leaders to build real controls and telemetry. His advice: treat agents like interns with root access. Get ahead of this before your agents do.

Automating BGP Troubleshooting with Kentik AI Advisor

In this demo, we use Kentik AI Advisor to troubleshoot a real-world BGP misconfiguration that brings down a peering session with a transit provider. You’ll see how AI Advisor works both as a dedicated page and as an in-portal overlay, using natural language to identify the affected interface, correlate SNMP and syslog data, and pinpoint a maximum-prefix issue as the root cause. Then we accelerate and standardize the workflow with custom network context and AI-powered runbooks, so every engineer can troubleshoot BGP alerts like an expert.

Leveraging AI Crypto Trading Platforms for Smarter Investment Strategies

The world of cryptocurrency has experienced explosive growth over the past decade, transforming from a niche digital asset market into a global financial phenomenon. With this rapid expansion comes a new set of challenges for investors, including high market volatility, an overwhelming number of trading options, and the constant demand for real-time data analysis. Traditional trading strategies often struggle to keep up, leading to missed opportunities and heightened risks. To address these challenges, investors are increasingly turning to technology-driven solutions, most notably, AI crypto trading platforms.

How Agentic AI for ITOps Unlocks Value at Scale

Here’s a paradox for the AI era: organizations are obsessed with the promise of AI as the key to unlocking productivity and enterprise transformation, and IT teams are all-in on the advantages AI and automation offer — yet those same organizations are the ones holding that transformation back. While the majority of IT workers advocate for AI adoption, operational, cultural and budgetary barriers stand in the way of enterprises implementing AI at scale.

The Context Engineering Framework: 3 Shifts for AI-Powered Dev Teams

You’ve probably used AI earlier today. Maybe you asked it to debug a function, generate a test case, or explain a legacy codebase you just inherited. But here’s the thing: you didn’t just type a question and get an answer. You explained your problem, shared background context, pasted code snippets, clarified what you meant, then refined the output until it was actually useful. In other words, you were context engineering.

From Zero Tickets to High-ROI: AI + DEX in 2026 (w/ Samuele Gantner and Vedant Sampath)

Kicking off 2026, Tim and Tom welcome Nexthink Chief Product Officer Samuele Gantner and first-time guest CTO Vedant Sampath for a candid “three pillars” deep-dive on enterprise AI. They explore how AI is reshaping product and engineering: new tooling, new development cycles, and the shift from deterministic software to probabilistic agents—plus the critical role of evals, benchmarks, guardrails, and performance. Then they unpack Nexthink’s three-pillar framework.

From Market Noise to Clear Strategy: How AI Is Changing Business Intelligence

Modern businesses are drowning in data. Every click, transaction, customer interaction, and campaign generates information. Yet having more data does not automatically lead to better decisions. In fact, many organizations struggle because they are surrounded by insights but lack clarity. Reports contradict each other, dashboards multiply, and teams spend more time interpreting data than acting on it. This gap between data and direction is where artificial intelligence is reshaping business intelligence.

From Promise to Practice: What Real AI SRE Can Actually Do When Production Breaks

We’ve written before about the advantages of training an AI SRE on real telemetry data rather than generic Kubernetes documentation. We’ve explained why RAG augmentation based on actual high-scale workload patterns produces better results than LLMs trained on generic scenarios or forum threads. The theory makes sense, the architecture is sound, and the approach is defensible.

From Observability to Visibility: Why Tech Teams Should Treat Photos Like Production Assets

Modern operations is obsessed with one word: visibility. We instrument services, centralize logs, trace requests, and tune alerts because what we cannot see, we cannot reliably improve. The same pattern shows up outside the stack, in a place most teams ignore until it hurts: how people show up online. If you work in DevOps, SRE, ITSM, platform engineering, or cloud, you already know the downstream cost of "good enough." A slightly messy dashboard becomes a slow incident response. A vague runbook becomes tribal knowledge. A weak alert strategy becomes pager fatigue.

Peeking Under the Hood of Claude Code

Everyone is talking about Claude Code, but few people understand the machinery running in the background. Today, we’re opening up the terminal to see how Anthropic’s coding agent manages state, runs tests, and fixes its own bugs. From the Model Context Protocol (MCP) to its unique React-based terminal UI, find out what makes Claude Code the most "senior"-feeling AI assistant on the market.

Is Claude Code Spying for OpenAI?

While analyzing network traffic, we found huge amounts of telemetry, including chat snippets, being sent to statsig.anthropic.com. The irony? Statsig was recently acquired by OpenAI. In this video, we use proxymock to intercept the traffic and show you exactly what’s being sent from your terminal to Anthropic (and technically, OpenAI’s infrastructure).