
August 2023

Operational Intelligence: 6 Steps To Get Started

The ability to make decisions quickly can mean the difference between success and stagnation. Of course, quick decisions aren’t necessarily the right decisions. The right decisions are the best informed, and the best way to get informed is through data. That’s what operational intelligence is all about. In this article, we’re diving into all things operational intelligence (OI), including key benefits, goals and how to get started.

Incident Management Today: Benefits, 6-Step Process & Best Practices

Disruptive cybersecurity incidents grow more commonplace by the day. Even if nothing is directly hacked, these incidents can harm your systems and networks. Navigating cybersecurity incidents is a constant challenge, and the best way to stay ahead of the game is with effective incident management.

Dashboard Studio: How to Configure Show/Hide and Token Eval in Dashboard Studio

You may be familiar with manipulating tokens via `eval` or `condition`, or with showing and hiding panels via `depends`, in Classic (SimpleXML) dashboards, and be wondering how to do the same in Dashboard Studio. In this blog post, we'll break down how to accomplish these use cases in Dashboard Studio, using the same examples that were shown at .conf23.
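
As a refresher, here is a minimal sketch of the Classic (SimpleXML) pattern being ported; the input, token and index names are hypothetical, and the Dashboard Studio JSON equivalents are what the post itself walks through:

```
<form>
  <fieldset>
    <!-- Dropdown sets or unsets a token based on the selection -->
    <input type="dropdown" token="env">
      <label>Environment</label>
      <choice value="prod">Production</choice>
      <choice value="dev">Development</choice>
      <change>
        <condition value="prod">
          <set token="show_prod_panel">true</set>
        </condition>
        <condition>
          <unset token="show_prod_panel"></unset>
        </condition>
      </change>
    </input>
  </fieldset>
  <row>
    <!-- Panel renders only while show_prod_panel is set -->
    <panel depends="$show_prod_panel$">
      <title>Production Health</title>
      <chart>
        <search>
          <query>index=web env=prod | timechart count</query>
        </search>
      </chart>
    </panel>
  </row>
</form>
```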

Log Data 101: What It Is & Why It Matters

If you mention 'log data' at a crowded business event, you'll quickly be able to tell who's in IT and who isn't. For the average person, log data is about as thrilling as a dental appointment or reconciling a years-old bank account. At the mere mention of log data, their eyes glaze over as they search for an escape from the conversation. Conversely, IT professionals' eyes light up and they become animated when the topic of log data arises.

Distributed Systems Explained

Distributed systems might be complicated… luckily, the concept is easy to understand! A distributed system is simply any environment where multiple computers or devices work on a variety of tasks, with components spread across a network. Components within distributed systems split up the work, coordinating their efforts to complete a given job more efficiently than a single device could on its own.

Mean Time to Repair (MTTR): Definition, Tips and Challenges

The availability and reliability of any IT service ultimately govern end-user experience and service performance, both of which have significant business impact. These two concepts — availability and reliability — are particularly relevant in the era of cloud computing, where software drives business operations, but that software is often managed and delivered as a service by third-party vendors.
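
To make the relationship concrete, availability is commonly expressed in terms of mean time between failures (MTBF) and mean time to repair (MTTR):

```
Availability = MTBF / (MTBF + MTTR)

Example: MTBF = 500 hours, MTTR = 2 hours
Availability = 500 / 502 ≈ 99.6%
```

Shrinking MTTR is therefore one of the most direct levers a team has for improving availability.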

Big Data Analytics: Challenges, Benefits and Best Tools to Use

Imagine yourself with a folder containing millions of gigabytes of data. If you were asked to process it with an Excel spreadsheet, you wouldn’t need to be a data expert to know that’s impossible. We refer to that amount of data as “big data”. Big data requires advanced techniques, tools, and methods beyond what regular data analytics entails, which is where big data analytics comes in.

Splunk and the Four Golden Signals

Last October, Splunk Observability Evangelist Jeremy Hicks wrote a great piece here about the Four Golden Signals of monitoring. Jeremy's blog comes from the perspective of monitoring distributed cloud services with Splunk Observability Cloud, but the Four Golden Signals apply just as readily to monitoring traditional on-premises services and IT infrastructure.
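
As a reminder, the four signals are latency, traffic, errors and saturation. A minimal SPL sketch might compute the first three from web access logs like this (the index and field names are hypothetical; saturation usually comes from infrastructure metrics instead):

```
index=web sourcetype=access_combined
| timechart span=5m
    avg(response_time) AS latency,
    count AS traffic,
    count(eval(status >= 500)) AS errors
```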

Code, Coffee, and Unity: How a Unified Approach to Observability and Security Empowers ITOps and Engineering Teams

In today's fast-paced and ever-changing digital landscape, maintaining digital resilience has become a critical aspect of business success. It is no longer just a technical challenge but a crucial business imperative. But when observability teams work in their own silos, with tools, processes and policies disconnected from the security teams, it becomes more challenging for companies to achieve digital resilience.

IT Orchestration vs. Automation: What's the Difference?

As modern IT systems grow more elaborate, encompassing hardware and software across hybrid environments, managing these systems often grows beyond what an IT team can handle. Automation is one great way to help. But it's important to know that not all automation is the same: chatbots are probably not the solution your team is looking for to handle these incredibly complex systems.

What Is ITOps? IT Operations Defined

IT operations, or ITOps, refers to the processes and services that an organization's IT staff delivers to its internal or external clients. Every organization that uses computers has a way of meeting the IT needs of its employees or clients, whether or not it calls that ITOps. In a typical enterprise environment, however, ITOps is a distinct group within the IT department. The IT operations team plays a critical role in accomplishing business goals.

Developing the Splunk App for Anomaly Detection

Anomaly detection is one of the most common problems that Splunk users are interested in solving via machine learning. This is highly intuitive, as one of the main reasons our Splunk customers are ingesting, indexing, and searching their systems’ logs and metrics is to find problems in their systems, either before, during, or after the problem takes place. In particular, one of the types of anomaly detection that our customers are interested in is time series anomaly detection.
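
As a simple illustration of the idea (not the app's own algorithm), a rolling z-score in SPL can flag points in a time series that deviate sharply from recent history; the index, field and window size here are hypothetical:

```
index=app_metrics
| timechart span=5m avg(response_time) AS rt
| streamstats window=24 avg(rt) AS mean stdev(rt) AS sd
| eval zscore = (rt - mean) / sd
| where sd > 0 AND abs(zscore) > 3
```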

Introducing the Splunk App for Behavioral Profiling

Splunk is the platform for a million use cases, used to investigate operational data across security, observability, fraud, business intelligence and many other domains. But in my time at Splunk, I've come to realize that all of our customers face challenges that stem from the same core problem: within exploding data volumes, finding the anomalously behaving entities that most threaten the resilience of their organization.

Unveiling Splunk UBA 5.3: Power and Precision in One Package

In the face of an ever-evolving cybersecurity landscape, Splunk never rests. Today, we're ecstatic to share the release of Splunk User Behavior Analytics (UBA) 5.3, delivering power and precision in one package, and pushing the boundaries of what's possible in user and entity behavior analytics.

Performance Testing: Types, Tools & Best Practices

To maximize the performance and value of your software apps, networks and systems, it's critical to eliminate performance bottlenecks. Performance testing reveals those bottlenecks so organizations can fix them, ensuring the best experience for end users. This article explains what performance testing is, why it matters, and the various types of performance testing.

Cloud Analytics 101: Uses, Benefits and Platforms

Cloud analytics is the process of storing and analyzing data in the cloud and using it to extract actionable business insights. Simply one shade of data analytics, cloud analytics applies algorithms to large data collections to identify patterns, predict future outcomes and produce other information useful to business decision-makers.

From Disruptions to Resilience: The Role of Splunk Observability in Business Continuity

In today's market, companies undergoing digital transformation require secure and reliable systems to meet customer demands, handle macroeconomic uncertainty and navigate new disruptions. Digital resilience is key to providing an uninterrupted customer experience and adapting to new operating models. Companies that prioritize digital resilience can proactively prevent major issues, absorb shocks to digital systems and accelerate transformations.

Enhancements To Ingest Actions Improve Usability and Expand Searchability Wherever Your Data Lives

Splunk is happy to announce improvements to Ingest Actions in Splunk Enterprise 9.1 and the most recent Splunk Cloud Platform releases, enhancing its performance and usability. We've seen amazing growth in the usage of Ingest Actions over the last 12 months and remain committed to prioritizing customer requests to better serve cost-saving, auditing, compliance, security and role-based access control (RBAC) use cases.

Follow Splunk Down a Guided Path to Resilience

The dynamic digital landscape brings risk and uncertainty for businesses in all industries. Cyber criminals use the advantages of time, money, and significant advances in technology to develop new tactics and techniques that exploit overlooked vulnerabilities. Critical signals — like failures, errors, or outages — go unnoticed, leading to downtime and costing organizations hundreds of thousands of dollars.

ML-Powered Assistance for Adaptive Thresholding in ITSI

Adaptive thresholding in Splunk IT Service Intelligence (ITSI) is a useful capability for key performance indicator (KPI) monitoring. It allows thresholds to be updated at a regular interval depending on how the values of KPIs change over time. Adaptive thresholding has many parameters through which users can customize its behavior, including time policies, algorithms and thresholds.
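
To make the idea concrete, here is a rough SPL sketch of the kind of time-aware thresholding that ITSI automates. This is not ITSI's implementation, and the index, KPI and field names are hypothetical; it simply derives separate warning and critical levels for each hour of the day from historical KPI values:

```
index=kpi_history kpi="cpu_load_percent"
| eval hour = strftime(_time, "%H")
| stats perc90(value) AS warning_threshold,
        perc99(value) AS critical_threshold
  BY hour
```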

IT Event Correlation: Software, Techniques and Benefits

IT event correlation is the process of analyzing IT infrastructure events and identifying relationships between them to detect problems and uncover their root cause. Using an event correlation tool can help organizations monitor their systems and applications more effectively while improving their uptime and performance.
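
As a minimal illustration, the SPL below groups related error events by host within a short time window so that bursts stand out as candidate incidents; the index, field and severity values are hypothetical:

```
index=infra severity IN ("ERROR", "CRITICAL")
| transaction host maxspan=5m
| where eventcount > 3
| table _time, host, eventcount, duration
```

Dedicated event correlation tools go much further, but even this simple grouping turns a stream of isolated events into something closer to incidents.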

Modeling and Unifying DevOps Data

“How can we turn our DevOps data into useful DevSecOps data? There is so much of it! It can come from anywhere! It’s in all sorts of different formats!” While these statements are all true, there are some similarities in different parts of the DevOps lifecycle that can be used to make sense of and unify all of that data. How can we bring order to this data chaos? The same way scientists study complex phenomena — by making a conceptual model of the data.
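
One way to start building such a model is to normalize events from different tools into shared fields at search time. In this hypothetical SPL sketch, build and commit events from two different sources are mapped onto a common stage field so they can be analyzed together:

```
(index=ci sourcetype="jenkins:build") OR (index=scm sourcetype="git:push")
| eval stage = case(
    sourcetype == "jenkins:build", "build",
    sourcetype == "git:push", "commit"
  )
| timechart span=1h count BY stage
```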

Dark Data: Discovery, Uses, and Benefits of Hidden Data

Dark data is all of the unused, unknown and untapped data across an organization. This data is generated as a result of users’ daily interactions online with countless devices and systems — everything from machine data to server log files to unstructured data derived from social media. Organizations may consider this data too old to provide value, incomplete or redundant, or limited by a format that can’t be accessed with available tools.

Data Lakes Explored: Benefits, Challenges, and Best Practices

A data lake is a data repository for terabytes or petabytes of raw data stored in its original format. The data can originate from a variety of data sources: IoT and sensor data, a simple file, or a binary large object (BLOB) such as a video, audio, image or multimedia file. Any manipulation of the data — to put it into a data pipeline and make it usable — is done when the data is extracted from the data lake.
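
This read-at-extraction approach is often called schema-on-read, and Splunk searches work the same way: fields can be pulled out of raw data at search time, as in this sketch against a hypothetical log format:

```
index=lake_raw
| rex field=_raw "user=(?<user>\w+)\s+action=(?<action>\w+)"
| stats count BY user, action
```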

Pipeline Efficiency: Best Practices for Optimizing your Data Pipeline

Data pipelines are the foundational support for any business analytics project. They are growing more critical than ever as companies lean on data insights to drive their business: 54% of enterprises said data was vital to their future business strategies. Data pipelines play a crucial role, performing the calculations and transformations used by analysts, data scientists and business intelligence teams.