The latest News and Information on Log Management, Log Analytics and related technologies.

Leveling up your observability practice - Part 2

Lessons from the front lines: Challenges in your observability maturity journey. In our previous blog post, we explored the observability maturity spectrum, revealing that while only 7% of organizations consider themselves experts, the majority (43%) are actively working to improve their practices. We saw how mature organizations achieve better outcomes, from faster root cause analysis to reduced user-reported incidents.

Agentic RAG on Dell AI Factory with NVIDIA and Elasticsearch Vector Database

We are excited to collaborate with Dell on the white paper, Agentic RAG on Dell AI Factory with NVIDIA. The white paper is a design reference document for developers, outlining strategies and solution components for implementing agentic retrieval-augmented generation (RAG) applications. It serves as a design point for organizations across industries, healthcare in particular, that want to combine agentic RAG decision-making with AI-driven data retrieval.

Adding AI to Observability 2.0 for Dynamic Observability

The original premise of observability was to ensure system health, identify issues, and resolve those issues efficiently. As I recently outlined, the legacy approach (sometimes called Observability 1.0 now) relied heavily on metrics and tracing because logs were seen as too noisy or challenging. But, as most forward thinkers have identified now, logs are exactly the telemetry type that we need the most.

Are you ready for the next outage? How to prepare for any crisis

We live in an “always on” world, so unplanned outages are more than just inconvenient. They can result in lost revenue, damaged reputations, and, more importantly, frustrated customers. While preventing every outage is impossible, the most resilient teams prepare with a solid plan, a “technical go bag,” so to speak: a collection of tools, plans, and resources ready to activate at the first sign of trouble.

Future-proofing operations with generative AI

NOBODY PANIC! The Elastic AI assistant’s got you! Transform problem identification and resolution, and eliminate manual data chasing across silos with an interactive assistant that delivers context-aware information for SREs. About Elastic: Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale. Elastic’s solutions for search, observability, and security are built on the Elastic Search AI Platform, the development platform used by thousands of companies, including more than 50% of the Fortune 500.

Collecting Windows telemetry with Elastic: An introduction to the ETW Filebeat input

In the world of security, being able to use system telemetry of Windows hosts opens new possibilities for monitoring, troubleshooting, and securing IT environments. Recognizing this, Elastic has introduced new capabilities focused on Event Tracing for Windows (ETW) — a powerful Windows-native mechanism for capturing a vast array of system and application events. With these new additions, Elastic users can capture, analyze, and visualize Windows telemetry using the Elastic Search AI Platform.

Leveling up your observability practice - Part 1

Lessons from the front lines: Moving to observability maturity. What separates the observability experts from the novices? It's a question that's been on my mind lately, especially after diving into our recent 2024 State of Observability Survey of over 500 practitioners. In my past roles as a DevOps engineer and a site reliability engineer (SRE), I've seen firsthand how a mature observability practice can be the difference between sleepless nights and smooth sailing.

Mastering Tail Sampling for OpenTelemetry: Cost-Effective Strategies with Cribl

Recently, I have seen a trend of enterprises moving toward OpenTelemetry (OTel) for application tracing. Tail sampling, in particular, has emerged as a preferred approach to gain actionable insights while balancing data volume and cost. OpenTelemetry offers developers and practitioners the ability to instrument their code with open-source tools, moving away from vendor-provided tools for application instrumentation.

Splunk's Path Towards Achieving FedRAMP Moderate Authorization for Splunk Observability

Splunk continues to partner with government agencies on their digital transformation journeys to help deliver their missions and provide faster and more intelligent services. We are committed to the success and support of the security requirements of our public sector customers, and I am thrilled to share the latest strategic investments Splunk is making to expand our FedRAMP program to include Splunk Observability Cloud for government customers.