
The Next Enterprise AI Challenge: The Multi-Model Workplace

For the last two years, enterprise AI strategy has largely focused on one thing: adoption. Organizations encouraged employees to experiment with ChatGPT, Claude, Copilot, Gemini, and dozens of emerging AI tools in the hope that productivity gains would naturally follow. CIOs approved pilots, departments launched AI task forces, and leaders pushed teams to integrate AI into everyday work as quickly as possible. But the enterprise AI conversation is beginning to change.

How Datadog uses AI to build internal software delivery tools and improve system performance

At Datadog, we want our developers to become better at using AI tools, with the end goal of building quality software faster and generating real value. This includes not only the products and features that our customers use, but also the internal tools that help keep our workflows running smoothly behind the scenes.

Accelerate investigations with AI in Datadog Incident Response

Engineering teams spend much of their incident response time investigating the problem and coordinating the response. Both tasks become harder when telemetry data lives in one place, deployment history is stored in another, and conversations unfold across chat channels and incident bridges. Responders often spend the first part of an incident rebuilding context before they can begin testing hypotheses and working toward resolution.

Don't 'control' your AI spend. Understand it and be intentional.

There’s a good interview making the rounds. BizTech sat down with IBM’s James Stevenson to talk about how financial institutions can get a handle on cloud and AI costs. The advice is solid: get visibility, kill idle resources, tighten governance, tag everything. And pull finance and engineering into the same room. I don’t disagree with it. But I read the whole piece and noticed where the gravity pulls: control costs, reduce waste, bring down spend. The headline says it (‘Q&A.

GLM-5.2 Review (2026): Zhipu AI's Open-Weight Coding Model, Honestly Assessed

Zhipu AI (now operating internationally as Z.ai) shipped GLM-5.2 in mid-June 2026, and the claim that grabbed attention was blunt: an open-weight model that beats GPT-5.5 on several long-horizon coding benchmarks for roughly one-sixth of the cost. It's an MoE model with 753 billion total parameters released under the permissive MIT license, which means you can self-host it or call it through a managed endpoint.

How One AI-Localized String Broke Our Build and Cost Me $6,000 (And What I Do Differently Now)

The string that broke our last release was four words long. It passed review, went green in the build, and shipped to our German locale with a corrupted placeholder that turned the checkout button into a runtime error. Customers there could not complete an order for most of a Saturday before a screenshot reached me. The broken button cost us roughly $6,000 in lost orders that weekend; the fix itself took ten minutes. What I do differently now started with understanding why it happened.

Making Testing Smarter: How AI in Testing Automation Supports Continuous Change

Selecting a freight forwarder in 2026 is no longer just about getting goods from point A to point B. You now need a partner that can handle customs clearance, protect delivery timelines, provide transparent shipment updates, and help you understand how sustainable your supply chain is. That matters when supply disruptions, customer expectations, and environmental impact reporting all sit with a single operations team.

The Three Pillars Were Built for Humans

It was 2am and I was paying for the privilege. Something was on fire in production, and I’d done the modern thing: I pointed an AI agent at it. It ingested the dashboards. It read the logs. It walked the traces. Then it handed me back a beautifully formatted paragraph that said, in effect, “latency is elevated on the checkout path.” I knew that. The page told me that.

6 Ways to Use the Hyperping MCP Server

When something goes down, the last thing you want is to alt-tab between a monitoring dashboard, your on-call tool, and three Slack threads to figure out what is happening and who owns it. That context is usually all there. It is just scattered. The Hyperping MCP server fixes that by putting your monitoring data inside the AI tools you already work in. Your agent can read monitor state, outage timelines, SLAs, and on-call schedules, and answer the questions you would normally chase across tabs.