Instrument LangGraph agents with Datadog: a practical guide

AI agents tend to function as black boxes, which makes it difficult to trace agent workflows end-to-end and characterize their performance. By tracing full agent runs with LLM Observability, Datadog AI Agent Monitoring enables you to visualize workflows with flame graphs and quickly spot sources of failures and latency.
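To make the idea of tracing a full agent run concrete, here is a minimal sketch of nested span tracing using only the Python standard library. It is not Datadog's or LangGraph's API: the `Span`, `Tracer`, `slowest_leaf`, and `run_agent` names, and the three step names, are hypothetical and chosen only for illustration. The nesting of spans is what a flame graph draws, and the per-span durations are what let you find the slow step.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    """One timed unit of work; children form the tree a flame graph renders."""
    name: str
    start: float = 0.0
    end: float = 0.0
    error: Optional[str] = None
    children: List["Span"] = field(default_factory=list)

    @property
    def duration(self) -> float:
        return self.end - self.start


class Tracer:
    """Hypothetical in-memory tracer (an assumption, not a Datadog class)."""

    def __init__(self) -> None:
        self.roots: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        current = Span(name=name, start=time.perf_counter())
        # Attach to the enclosing span if there is one, else record as a root.
        parent = self._stack[-1] if self._stack else None
        (parent.children if parent else self.roots).append(current)
        self._stack.append(current)
        try:
            yield current
        except Exception as exc:
            # Record the failure on the span, then let it propagate.
            current.error = repr(exc)
            raise
        finally:
            current.end = time.perf_counter()
            self._stack.pop()


def slowest_leaf(span: Span) -> Span:
    """Return the leaf span with the longest duration under `span`."""
    if not span.children:
        return span
    return max((slowest_leaf(c) for c in span.children), key=lambda s: s.duration)


def run_agent(tracer: Tracer) -> None:
    """Stand-in for an agent run; sleeps simulate planning, a tool call, a reply."""
    with tracer.span("agent_run"):
        with tracer.span("plan"):
            time.sleep(0.005)
        with tracer.span("tool_call"):
            time.sleep(0.1)
        with tracer.span("respond"):
            time.sleep(0.005)
```

Calling `run_agent(Tracer())` and then passing the root span to `slowest_leaf` points at the simulated tool call, which is the kind of latency source a flame graph makes visible at a glance.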

The importance of taking the initiative (a chat with Chris Yates) | The Simple Talk Podcast

Taking the initiative. Prioritizing relationships. Doing the work nobody else wants to do. These are just some of the elements that contributed to Chris Yates’ rise from a developer to a DBA and, eventually, a Senior Vice President. As he explains to Steve Jones, “you are the CEO of your own brand.” Also in the episode: discover Chris’ thoughts on AI, the importance of community, and the one thing he’d now do differently if he were to start from scratch.

Safe Database Change at Scale with Flyway Enterprise | The Tony and Tonie show Ep45

AI-assisted coding may speed up delivery, but it can also increase the risk around database changes. Here’s how Flyway helps teams stay in control. Tony and Tonie discuss how Flyway Enterprise helps teams build control into the database change process: immediate change visibility, continuous risk reduction, and secure, traceable deployment from commit to production.

The Compliance Gap in Test Data Management

Compliance Without Compromise: Test Data Management That Finally Fits. You know you shouldn't have sensitive production data in test environments. But every time you look at fixing it, the options feel impossible: enterprise tools that cost six figures and take months to implement, or DIY scripts that sort of work until they don't. So, it stays on the backlog.

The options within Test Data Management - Enterprise, DIY or Redgate

Compliance Without Compromise: Test Data Management That Finally Fits. You know you shouldn't have sensitive production data in test environments. But every time you look at fixing it, the options feel impossible: enterprise tools that cost six figures and take months to implement, or DIY scripts that sort of work until they don't. So, it stays on the backlog.

Where to find lost engineering time in your delivery pipeline

If your infrastructure is configured outside version control through dashboards, scripts, or manual steps, environment drift is the expected outcome. Most teams have lived this scenario. A feature works in staging but breaks in production. Two hours later, someone finds a configuration setting that was changed in staging three weeks ago and never documented.
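As a rough illustration of how undocumented drift like this can be surfaced, the sketch below compares two flat configuration snapshots and reports every key whose value differs. The `config_drift` function, the `MISSING` marker, and the example settings are all hypothetical; real environments would need to export their settings into a comparable form first.

```python
from typing import Any, Dict, Tuple

# Marker used when a key exists in only one environment (an illustrative choice).
MISSING = "<missing>"


def config_drift(
    expected: Dict[str, Any], actual: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Map each drifted key to its (expected, actual) pair of values."""
    drift: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(expected) | set(actual)):
        left = expected.get(key, MISSING)
        right = actual.get(key, MISSING)
        if left != right:
            drift[key] = (left, right)
    return drift


# Hypothetical snapshots of two environments.
staging = {"request_timeout_s": 30, "cache_enabled": True}
production = {"request_timeout_s": 30, "cache_enabled": False, "debug": True}

for key, (want, got) in config_drift(staging, production).items():
    print(f"{key}: staging={want!r} production={got!r}")
```

Running a comparison like this on a schedule, or in the deployment pipeline, turns a two-hour search into a one-line report, though keeping the configuration in version control remains the more durable fix.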

Root Cause Analysis: How Engineering Teams Fix Production Issues Faster?

When a production incident strikes (a sudden latency spike, a cascading API failure, a service returning 500s at scale), every minute of downtime has a cost. Root cause analysis (RCA) is the process that turns that chaos into a clear answer: what actually broke, and why. Not the symptom that triggered the alert, but the underlying cause.