The video discusses the advantages of using OpenTelemetry Collector over Sumo Logic's installed collectors for data collection and observability. It highlights key features and serves as a guide for organizations considering the transition to OpenTelemetry.
Chaos Engineering and reliability testing give you visibility into the actual reliability of your services by simulating real-world failure conditions. But what if you could dig into the testing and results data using AI to quickly uncover new insights? That’s the logic behind the Gremlin MCP Server. Released as part of Reliability Intelligence, the Gremlin MCP Server allows you to bring your LLM of choice to explore your Gremlin data and find opportunities to get more out of Gremlin.
Note: A version of this post originally appeared on the OpenTelemetry blog. Victoria Nduka is a user experience designer and open source contributor making her way into the cloud native space. She writes about design, accessibility, and open source with the same curiosity she brings to her work. On May 29, I wrapped up my mentorship with Prometheus through the Linux Foundation mentorship program.
A few months ago, I built an MCP server for Toronto’s Open Data portal so an agent could fetch datasets relevant to a user’s question. I threw the first version together, skimmed the code, and everything looked fine. Then I asked Claude: “What are all the traffic-related data sources for the city of Toronto?” The tool call fired. I got relevant results. And then I hit an error: “Conversation is too long, please start a new conversation.” I had only asked one question.
At a high level, incident remediation is a part of the incident response process. An incident response plan manages the incident lifecycle across planning, detection, investigation, and recovery. Meanwhile, incident remediation focuses on identifying root causes and implementing measures to prevent future occurrences.
SUSE Rancher Prime delivers SUSE secure Application Collection and a complete Kubernetes platform, turning an industry crisis into your strategic upgrade.
Kentik’s Phil Gervasi shows how modern data centers—especially those powering AI workloads—can spot and fix problems before they impact performance or budgets. See how Kentik’s Data Explorer helps you identify disruptive flows, reclaim wasted network capacity, and turn insights into real-time alerts. With monitor-only mode and integrations with systems like PagerDuty and ServiceNow, your network becomes its own early warning system—driving uptime, cost savings, and better AI performance.
Elasticsearch mappings turn logs from unstructured text into usable data. In this video, we explain what mappings are, how they define fields like text, number, and date, and why they matter. With the right mappings, Elasticsearch can filter error codes, sort by response time, and group results by browser, region, or version.
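As a rough illustration of the idea, the sketch below builds a mapping body for a hypothetical web access log index. The field names (`message`, `status_code`, `response_time_ms`, `browser`, `region`, `version`) are assumptions chosen to match the examples in the blurb, not taken from the video, and the helper `fields_of_type` is purely illustrative. The block only constructs and inspects the mapping; it does not contact a cluster.

```python
import json

# Hypothetical mapping for web access logs; all field names are illustrative.
log_mapping = {
    "mappings": {
        "properties": {
            "message": {"type": "text"},            # analyzed for full-text search
            "status_code": {"type": "integer"},     # numeric, so it can be filtered by range
            "response_time_ms": {"type": "float"},  # numeric, so it can be sorted
            "browser": {"type": "keyword"},         # exact values, suitable for grouping
            "region": {"type": "keyword"},
            "version": {"type": "keyword"},
            "@timestamp": {"type": "date"},
        }
    }
}


def fields_of_type(mapping, field_type):
    """Return the sorted names of fields declared with the given type."""
    props = mapping["mappings"]["properties"]
    return sorted(name for name, spec in props.items() if spec["type"] == field_type)


if __name__ == "__main__":
    # This JSON is the kind of body sent when creating an index with explicit mappings.
    print(json.dumps(log_mapping, indent=2))
    print("Groupable fields:", fields_of_type(log_mapping, "keyword"))
```

In this sketch, `keyword` is used for values that are filtered or grouped exactly, while `text` is reserved for the free-form log message that benefits from full-text search.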
In the seventh episode of Masterclass 2025, learn the various aspects of an efficient asset management practice and map dependencies between IT infrastructure components to evaluate potential risks. We will go over how to manage hardware and software assets, discover assets using various methods, loan assets temporarily to contract employees, and auto-assign assets to users.