Glassbeam versus Splunk = holistic versus instant gratification

Tuesday, September 22, 2015

In the late 1960s and early 1970s, psychologist Walter Mischel, who at the time was a professor at Stanford University, led a series of studies on delayed gratification. The results of these widely cited studies became known as “The Stanford Marshmallow Experiment”. An excerpt from his study is quoted here: “Delayed gratification, or deferred gratification, is the ability to resist the temptation for an immediate reward and wait for a later reward. Generally, delayed gratification is associated with resisting a smaller but more immediate reward in order to receive a larger or more enduring reward later. A growing body of literature has linked the ability to delay gratification to a host of other positive outcomes, including academic success, physical health, psychological health, and social competence.”

When we are asked how Glassbeam compares with Splunk, our reply is in a similar vein to Mischel's findings. It really boils down to the difference between Data Search and Data Analysis. There are many varieties of search-capable software, the best known being Google Search, which quickly processes search queries against the very large data domain of the World Wide Web. There are also less well-known search applications that are geared more specifically to the needs of the corporate enterprise and to generating big data analytics. Splunk, with its indexed data search, falls into this category.

Each of these applications employs its own form of proprietary indexing to make data easily retrievable. The indexing operation breaks the data down into uniquely distinguishable elements that are stored separately from the data in a file known as an index. These indexed elements can be events, and they can also be time-stamped. Once a data file is indexed, it can later be searched using simple or compound search arguments combined with Boolean and other logical operators.
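To make the indexing idea concrete, here is a minimal sketch of an inverted index with Boolean AND and OR lookups. It is an illustration only, not how any particular product builds its proprietary index; the function names and the sample log lines are invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split a log line into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(lines):
    """Map each token to the set of line numbers that contain it."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in tokenize(line):
            index[token].add(line_no)
    return index


def search_and(index, *terms):
    """Return sorted line numbers that contain every term (Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


def search_or(index, *terms):
    """Return sorted line numbers that contain any term (Boolean OR)."""
    return sorted(set().union(*(index.get(term.lower(), set()) for term in terms)))


# Hypothetical log lines, invented for illustration.
LOG_LINES = [
    "2015-09-22 10:01:07 ERROR disk sda1 latency high",
    "2015-09-22 10:01:09 INFO fan speed normal",
    "2015-09-22 10:02:15 ERROR fan speed low",
]

INDEX = build_index(LOG_LINES)
```

Note that every lookup starts from a term the user already has in mind, such as "error" or "fan". The index stores the elements separately from the data and answers quickly, but it cannot suggest what to ask.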

However, this is not true Data Analysis

Indexed search requires knowing what one wishes to search for. Data analysis takes a giant step beyond that by also requiring one to know the context of the data in terms of what one wishes to analyze. Metadata, which is data describing other data, is the most popular way of applying context to the data being analyzed. Applying context to incoming data is an important first step towards rigorous data analytics.

Glassbeam invented SCALAR™, a hyper-scale platform designed to process data streams and files of unstructured and semi-structured content. SCALAR™ employs Glassbeam’s three domain-specific languages (DSLs) that work together to transform raw data into a structured format that can then be used to produce descriptive, predictive, and prescriptive analytics.

The second step in preparing for data analysis is parsing with Glassbeam’s SPL programming language. SPL parses the data that now has context applied to it. Analyzing this contextual log file is the first step of the SPL development process. Defining the objects contained within the operational data is key to coding SPL; the objects defined are then related to other objects in SPL.

Together, Context and SPL are used to transform raw data into analytical dashboards, each of which represents one Use Case among many. These dashboards present the resulting analytics in whichever type of view (text or graphical) is most easily comprehended by the user.

Instant Gratification or Long-term Satisfaction?

Having armed our reader with an overview of “Data Search” and “Data Analysis”, we can now return to address the question suggested by the title of this paper and determine which of these two approaches (Data Search and Data Analysis) best represents long-term satisfaction.

An indexed search yields quick results (and therefore instant gratification) for the user, who enters simple or complex queries to find data elements of importance. However, because search does not apply context to the data it indexes (e.g. by applying metadata), its results are limited to what one already knows: the user must know what he or she is searching for before any search can be executed.

On the other hand, the data analysis returned by Glassbeam’s SCALAR platform has context via the metadata that was applied to the raw data. The contextual data is then parsed into a single document store (that includes all files ingested) with Namespaces that define the starting and ending boundaries of the relevant data objects to be analyzed.
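The idea of namespaces that mark where a data object starts and ends can be sketched in a few lines of Python. This is not SPL and does not reflect SCALAR's internals; the begin/end marker format, the `parse_namespaces` function, and the sample bundle are all assumptions made up for illustration.

```python
import re

# Hypothetical section markers; real log bundles define their own boundaries.
BEGIN = re.compile(r"^--- BEGIN (\w+) ---$")
END = re.compile(r"^--- END (\w+) ---$")


def parse_namespaces(lines, metadata):
    """Group lines by the namespace whose begin/end markers enclose them."""
    store = {"metadata": dict(metadata), "namespaces": {}}
    current = None
    for line in lines:
        begin = BEGIN.match(line)
        end = END.match(line)
        if begin:
            current = begin.group(1)
            store["namespaces"].setdefault(current, [])
        elif end:
            # Close the section only if the marker names the open namespace.
            if end.group(1) == current:
                current = None
        elif current is not None:
            store["namespaces"][current].append(line)
        # Lines outside any namespace are ignored in this sketch.
    return store


# Invented sample bundle and metadata.
SAMPLE = [
    "--- BEGIN disk ---",
    "sda1 latency_ms=48",
    "sdb1 latency_ms=3",
    "--- END disk ---",
    "stray line outside any namespace",
    "--- BEGIN fan ---",
    "fan0 rpm=1200",
    "--- END fan ---",
]

STORE = parse_namespaces(SAMPLE, {"device": "array-01", "model": "X100"})
```

The resulting store keeps the metadata alongside each bounded group of lines, so later analysis can address "the disk section of device array-01" rather than a bare string match.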

The clear benefit of transforming data for analysis over indexing data for search is that transforming data does not require the user to know what he or she is looking for before performing an analysis. Instead, the contextual parsed data output by SCALAR enables the user to process the data analytically against what is known about the data (e.g. via metadata) rather than having to know specific elements within the data. Without context, searching data doesn’t permit any real analytical processing of the data.
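A small sketch shows what processing the data "against what is known about the data" can look like once records carry context. The records, field names, and the `error_rate_by` helper below are hypothetical and are not output from SCALAR; they only illustrate grouping by a metadata field without naming any specific value in advance.

```python
from collections import Counter

# Hypothetical parsed records: each carries metadata (model, firmware)
# alongside the values extracted from the raw log.
RECORDS = [
    {"model": "X100", "firmware": "2.1", "component": "disk", "severity": "ERROR"},
    {"model": "X100", "firmware": "2.1", "component": "fan", "severity": "INFO"},
    {"model": "X200", "firmware": "3.0", "component": "disk", "severity": "INFO"},
    {"model": "X100", "firmware": "2.1", "component": "disk", "severity": "ERROR"},
    {"model": "X200", "firmware": "3.0", "component": "fan", "severity": "ERROR"},
]


def error_rate_by(records, key):
    """Return the share of ERROR records for each value of a metadata field."""
    totals = Counter()
    errors = Counter()
    for rec in records:
        totals[rec[key]] += 1
        if rec["severity"] == "ERROR":
            errors[rec[key]] += 1
    # Counter returns 0 for values that never produced an error.
    return {value: errors[value] / totals[value] for value in totals}


RATES_BY_MODEL = error_rate_by(RECORDS, "model")
RATES_BY_COMPONENT = error_rate_by(RECORDS, "component")
```

The caller names only a field ("model" or "component"), never a value such as "X100". Whatever values exist in the data surface in the result, which is the difference from a search that must begin with a known term.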

Many Glassbeam users were once Splunk indexed data search users or had at least experimented with Splunk prior to turning to Glassbeam and its data analysis approach to IoT Industrial Analytics. These Glassbeam users learned, first-hand, the long-term value of the upfront data science investment required for the data analysis approach. They also learned, through their experience, that the real and lasting business value comes from analytical processing rather than simple indexed searches.