Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Observabilty for complex systems and related technologies.

Why database observability is key to successful cloud data platform adoption

Data is the lifeblood of businesses the world over, from the smallest startup to the largest enterprise. Making sure that it’s available when you need it, secured for authorized use, and recoverable from faults is vital to operating data platforms, no matter where your business is on its cloud journey. This can only be achieved by putting the right data into the hands of the right people, in a timely way, to make the right decisions about how to manage that platform effectively.

Monitoring Backstage with OpenTelemetry:Closing the observability blind spot

‘One small step for a man, but a huge leap for developers’ — me, when I realised how to observe my Backstage with OpenTelemetry. Backstage is often the “portal” through which we manage all our other systems, but who watches the watcher? Recently, we gave a KubeCon Talk, highlighting that monitoring Backstage itself is critical. When Backstage isn’t observable, it becomes a blind spot in your infrastructure.

Syslog Implementation: Servers, Integration and Best Practices

Syslog is a fundamental protocol for collecting messages and event data from various devices and applications across a network. Think of it as a universal language that allows your servers, routers, firewalls, and software to send their operational insights to a central logging point. Born from Unix systems, Syslog has evolved to become the industry standard, forming the backbone of effective log management and providing a unified view of your infrastructure's activity.

Kubernetes observability: How to enrich logs with GeoIP using the Kubernetes Monitoring Helm Chart

When your Kubernetes app suddenly has traffic spikes in a distant country, it can be difficult to determine why. Let’s say, for example, we have an e-commerce app that started to receive an unusual surge of visitors from Australia — something we never anticipated. We search for answers in our logs, but without geographic context, we don’t have the full insights we need.

Detect hallucinations in your RAG LLM applications with Datadog LLM Observability

Hallucinations occur when a large language model (LLM) confidently generates information that is false or unsupported. These responses can spread misinformation that jeopardizes safety, causes reputational damage, and erodes user trust. Augmented generation techniques, such as retrieval-augmented generation (RAG), aim to reduce hallucinations by providing LLMs with relevant context from verified sources and prompting the LLMs to cite these sources in their responses.

Simplifying Observability: Streamlining Telemetry with a Centralized Pipeline

Modern applications generate a deluge of telemetry data—logs, metrics, and traces—that hold the key to understanding system performance and reliability. However, managing this data effectively is a growing challenge for DevOps teams. Raw telemetry can overwhelm teams with complexity and noise even when collected via robust standards like OpenTelemetry.