
How AI Is Powering the Next Era of IT Operations

AI is redefining the future of IT. In this Nexus Live 2025 keynote, ScienceLogic CEO and Founder Dave Link shares the vision behind Skylar AI, why the industry is shifting toward autonomous operations, and how organizations can move faster, smarter, and more proactively than ever before.

LLM Cost Monitoring with OpenTelemetry

Teams running LLM applications in production face a cost problem that traditional APM tools were never designed to solve. CPU and memory costs are relatively predictable — a web service processing 1,000 requests per second costs roughly the same week over week. LLM API costs are not. A single user session can cost $0.01 or $5 depending on prompt length, model choice, conversation history, and how many retries happen inside your chain.
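The article presumably instruments this with OpenTelemetry metrics; the sketch below shows only the underlying cost arithmetic that produces that $0.01-to-$5 spread. The model names and per-token prices are made-up placeholders, not real rates.

```python
# Illustrative sketch (not from the article): why per-session LLM cost
# varies so widely. Pricing constants are placeholders, not real rates.

# Hypothetical USD prices per 1K tokens, keyed by model.
PRICES = {
    "small-model": {"prompt": 0.0005, "completion": 0.0015},
    "large-model": {"prompt": 0.01,   "completion": 0.03},
}

def call_cost(model, prompt_tokens, completion_tokens):
    """Cost of a single LLM API call, in USD."""
    p = PRICES[model]
    return (prompt_tokens / 1000) * p["prompt"] \
         + (completion_tokens / 1000) * p["completion"]

def session_cost(calls):
    """Total cost of a session: every retry and every growing-history
    turn is its own billed call, so costs compound quickly."""
    return sum(call_cost(m, pt, ct) for m, pt, ct in calls)

# A short session on a cheap model vs. a long, retry-heavy session on a
# large model: the same "one user conversation" differs by orders of
# magnitude in cost.
cheap = session_cost([("small-model", 500, 200)])
expensive = session_cost([("large-model", 8000, 1000)] * 10)  # 10 turns
print(f"cheap session:     ${cheap:.4f}")
print(f"expensive session: ${expensive:.2f}")
```

Recording each call's cost as a histogram metric with a model attribute is what lets an OpenTelemetry backend slice spend by model and flag outlier sessions.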

How to Catch AI Code Mistakes Before They Reach Production

AI can write code fast, but it makes mistakes humans often don't. In this session, Ole Lensmar, CTO of Testkube, breaks down the real quality risks of AI-generated code and how engineering teams can build guardrails before those bugs hit production. What you'll learn: the common mistakes LLMs make (and which ones are unique to AI). Whether you're a developer leaning on AI to ship faster or a QA lead trying to keep up with the pace of AI-generated code, this talk gives you a practical framework for staying ahead of quality issues.

Practical AI-Enabled Observability for Agents and LLMs

You’re told to “go build agents” without clear guidance on what that actually means, how to do it well, or how to know if it is working. You are not a data scientist. You are a software engineer. In this talk, Datadog AI product leader Shri Subramanian breaks down what changes when you move from building applications to building AI agents, and why familiar approaches like traditional testing and linear delivery fall short. We will explore how agent development shifts the focus from code alone to data, prompts, and evaluation, and why functional reliability matters just as much as operational reliability.

The Next Phase of Agentic AI

The Enterprise AI Survey, conducted by Digitate in collaboration with Sapio Research, finds that enterprise automation and AI adoption have evolved significantly. The initial waves focused primarily on improving accuracy and efficiency and on reducing costs. Now the next phase, agentic AI, marks a shift from mere automation to dynamic collaboration.

Operating agentic AI with Amazon Bedrock AgentCore and Datadog LLM Observability: Lessons from NTT DATA

This guest blog post is by Tohn Furutani, SRE Engineer at NTT DATA. Over the past year, the conversation around generative AI has shifted from single-shot use cases—such as summarization, Q&A, and chat interfaces—to agentic AI systems that can make decisions based on context, plan multistep actions, invoke tools, and adapt as conditions change.

AI agent observability: The developer's guide to agent monitoring

Most "agent observability best practices" content reads like a compliance checklist from 2019 with "AI" pasted over "microservices." Implement comprehensive logging. Establish evaluation metrics. Create governance frameworks. Not a single line of code. No mention of what happens when your agent silently picks the wrong tool on turn 3 and you need to figure out why.
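In the spirit of the complaint above, here is a minimal sketch of the kind of code such guides omit: recording which tool an agent selects on each turn so a wrong pick on turn 3 is reconstructable after the fact. The agent loop, tool names, and routing logic are all hypothetical, not from the guide.

```python
# Illustrative sketch: an append-only trace of the agent's tool choices,
# so a silently wrong selection is visible when you debug the run later.
# All names here are hypothetical.
import json

class ToolTrace:
    """Collects one (turn, tool, reason) record per agent turn."""
    def __init__(self):
        self.events = []

    def record(self, turn, tool, reason):
        self.events.append({"turn": turn, "tool": tool, "reason": reason})

    def dump(self):
        """Serialize the trace, e.g. to attach to a log line or span."""
        return json.dumps(self.events, indent=2)

def pick_tool(query):
    # Stand-in for the agent's real routing logic.
    return "web_search" if "latest" in query else "calculator"

trace = ToolTrace()
for turn, query in enumerate(["2 + 2?", "latest release?", "3 * 7?"], start=1):
    tool = pick_tool(query)
    trace.record(turn, tool, reason=f"matched query: {query!r}")

# When the agent misbehaves, you replay the trace instead of guessing.
print(trace.dump())
```

In production the same records would typically land on an OpenTelemetry span or structured log line rather than an in-memory list, but the point stands: per-turn tool decisions must be captured somewhere queryable.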

Kosli and Adaptavist Partner to Automate Governance for AI-Driven Software Delivery

Today, Kosli and Adaptavist announce a strategic partnership to help regulated enterprises automate governance for AI-driven software delivery, making it automated, continuous, and evidence-driven rather than a manual checkpoint that sits apart from DevOps and CI/CD. Adaptavist brings deep enterprise DevOps transformation expertise: assessment and strategy, DevSecOps integration, developer experience, and implementation across Atlassian, GitLab, and AWS.

Deterministic by Design: How Harness Grounds AI Agents in Structured Data | Harness Blog

When AI agents operate across a multi-module platform like Harness (from CI/CD to DevSecOps to FinOps), the number one goal is to give you answers that are correct, consistent, and grounded in real data. Getting there requires a deliberate architectural choice: when a question can be answered from structured platform data, the agent should use a schema-driven Knowledge Graph rather than raw API calls via MCP. The principle is simple: if the data is modeled, retrieval should be deterministic.
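The principle "if the data is modeled, retrieval should be deterministic" can be sketched as a simple router: modeled fields always resolve from the structured store, and only unmodeled questions fall back to the agent. The schema fields, graph contents, and function names below are hypothetical, not Harness's actual implementation.

```python
# Illustrative sketch (names are hypothetical, not Harness's code):
# route a question to a deterministic Knowledge Graph lookup when the
# field is covered by the schema; fall back to the LLM agent otherwise.

# Fields the structured schema is known to model.
SCHEMA_FIELDS = {"pipeline.status", "deploy.environment", "cost.monthly"}

# Toy "knowledge graph": modeled fields resolved from platform data.
KNOWLEDGE_GRAPH = {
    "pipeline.status": "passing",
    "deploy.environment": "prod-us-east",
    "cost.monthly": "$12,400",
}

def answer(field: str) -> tuple[str, str]:
    """Return (source, value). Modeled fields always take the
    deterministic path; only unmodeled ones reach the agent."""
    if field in SCHEMA_FIELDS:
        return ("knowledge_graph", KNOWLEDGE_GRAPH[field])
    return ("llm_fallback", f"<agent reasons about {field!r}>")

print(answer("pipeline.status"))   # deterministic: same answer every time
print(answer("incident.summary"))  # unmodeled: falls back to the agent
```

The design choice this illustrates: determinism comes from making the schema, not the model, the arbiter of which retrieval path a question takes.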