Glassbeam Studio architecture

SURAJ ATREYA
Wednesday, February 17, 2016

Glassbeam Studio is one-of-a-kind software for data transformation and preparation, data visualization, deployment, and much more. It is modeled on a client-server architecture, with functionality balanced between the client and the server. The software is architected to offer a highly scalable, seamless, and functionally compelling way to transform even the most complex machine data into valuable business insights.

Metadata extraction

At a very high level, the block diagram below gives a broad overview of what happens when a user uploads a log.

The Studio engine can detect the format of an incoming log file. This auto-detection works for many commonly known log formats, such as CSV (with an optional user-supplied delimiter), syslog, JSON, generic event logs, and so on. Once the log format is detected, the next step is to detect the date format, if one exists. This matters because time-series analysis depends on correctly parsed timestamps. Using all of this meta information (largely auto-detected), a sample SPL is generated.
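To make the detection steps concrete, here is a minimal sketch in Python. It is not Studio's actual detector: the function names, the syslog pattern, the candidate delimiters, and the short list of date formats are all assumptions chosen for illustration, and a real detector would cover far more cases (quoted CSV fields, multi-line events, many more timestamp styles).

```python
import json
import re
from datetime import datetime

# Hypothetical candidates; a real detector would know many more.
CANDIDATE_DELIMITERS = [",", ";", "\t", "|"]
DATE_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S", "%m/%d/%Y %H:%M:%S"]

# Rough shape of a classic syslog line: "Feb 17 10:15:00 host ..."
SYSLOG_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s\S+\s")


def _is_json_object(line):
    """Return True if the line parses as a JSON object."""
    try:
        return isinstance(json.loads(line), dict)
    except ValueError:
        return False


def detect_log_format(sample_lines, delimiter=None):
    """Guess the format of a log from a few sample lines.

    `delimiter` mirrors the optional user-supplied CSV delimiter.
    Returns "json", "syslog", "csv", or "unknown".
    """
    lines = [ln for ln in sample_lines if ln.strip()]
    if not lines:
        return "unknown"
    if all(_is_json_object(ln) for ln in lines):
        return "json"
    if all(SYSLOG_RE.match(ln) for ln in lines):
        return "syslog"
    # Treat the sample as CSV when some delimiter appears the same
    # non-zero number of times on every line (quoted fields are ignored).
    candidates = [delimiter] if delimiter else CANDIDATE_DELIMITERS
    for cand in candidates:
        counts = {ln.count(cand) for ln in lines}
        if len(counts) == 1 and counts.pop() > 0:
            return "csv"
    return "unknown"


def detect_date_format(value):
    """Return the first candidate format that parses `value`, else None."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return fmt
        except ValueError:
            continue
    return None
```

Calling detect_log_format on a handful of sample lines and detect_date_format on the first timestamp-like field would yield the kind of metadata from which a sample SPL could then be generated.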

When the user clicks the “Generate Output” button, the log bundle and the auto-generated SPL are sent to our hyperscale SCALAR platform for processing. SCALAR first reads the SPL and then parses each file in the log bundle. It also indexes the data for search and inserts it into Cassandra for large-scale analytics and machine learning.

Glassbeam integrates with the ThingWorx Machine Learning platform for predictive scoring on the parsed log data. Data is extracted from Cassandra, formatted as the platform requires, and fed into ThingWorx Machine Learning. The output can be consumed inside Glassbeam’s Complex Event Processing engine.

Complexity

Studio is an IDE and a one-stop shop for structuring the information in an unstructured log file, maintaining SPL, and visualizing parsed output. Its front-end features complement its back-end features, so the user can work with ease while the complexity stays hidden underneath.

SCALAR is made up of several microservices, each running in its own cluster. Studio hides this from the user, who never needs to know that all the heavy lifting is happening behind the scenes. Studio is built on the philosophy of reactive programming, which means programming for resiliency, scalability, and responsiveness. It uses Akka, a toolkit for building concurrent and distributed applications, to process data concurrently. Akka makes it straightforward to handle multiple requests at the same time and to scale out as required.
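As a rough illustration of handling several requests at once, the sketch below uses a worker pool from Python's standard library. This is not Akka and not how Studio is implemented: Akka is a JVM toolkit built around actors, and the thread pool here is only a stand-in for the idea. The function names and the trivial "parsing" step (counting non-empty lines) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_request(request):
    """Hypothetical unit of work: count the non-empty lines of one log."""
    name, text = request
    return name, sum(1 for ln in text.splitlines() if ln.strip())


def handle_concurrently(requests, workers=4):
    """Process independent requests in parallel on a pool of workers.

    Raising `workers` is the single-machine analogue of scaling out.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(parse_request, requests))
```

Because each request is independent, adding workers increases throughput without changing the per-request logic, which is the same property that makes an actor-based design easy to scale out.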

As log files grow more complex, it becomes imperative to have a tool that structures them with ease. We use two constructs extensively: Namespace and Table. Together they form the bedrock of our technology and help structure log files. To handle complex log files, Studio expects some user input: the user selects the portion of the log file that makes up a particular section, which forms the Namespace, and the data within it, which forms the Table. The front end provides visual aids that help the user navigate and model complex logs.
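One way to picture the two constructs is as a pair of small data structures built from the line ranges a user selects. The sketch below is purely illustrative and assumes a whitespace-separated table whose first line is a header; the class fields, the build_namespace helper, and the sample log are hypothetical and do not reflect SPL's actual definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Tabular data found inside a section of the log."""
    name: str
    columns: list
    rows: list = field(default_factory=list)


@dataclass
class Namespace:
    """A named section of the log that groups related tables."""
    name: str
    tables: dict = field(default_factory=dict)


def build_namespace(name, log_lines, section, table_name, data_range):
    """Build a Namespace from user-selected line ranges.

    `section` and `data_range` are (start, end) pairs of 0-based,
    inclusive line indices; the first line of `data_range` is the header.
    """
    sec_start, sec_end = section
    start, end = data_range
    if not (sec_start <= start <= end <= sec_end):
        raise ValueError("data range must lie inside the section")
    header, *body = log_lines[start:end + 1]
    columns = header.split()
    rows = [dict(zip(columns, ln.split())) for ln in body if ln.strip()]
    namespace = Namespace(name)
    namespace.tables[table_name] = Table(table_name, columns, rows)
    return namespace


log = [
    "=== interfaces ===",
    "name status",
    "eth0 up",
    "eth1 down",
    "=== end ===",
]
ns = build_namespace("interfaces", log, (0, 4), "iface_status", (1, 3))
```

Here the whole five-line block plays the role of the selected section and lines 1 to 3 play the role of the selected data, so ns holds one table with the columns name and status.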

In upcoming versions of Studio, we plan to eliminate most of the user input by automatically inferring the log file structure with machine learning. Since we already have a large collection of varied log structures, we are building a machine learning model that learns from them. With this knowledge, we can help a user who arrives with a different log structure while asking for very little input. The model learns continuously, every time a user enters or corrects the format of a log file. With this model, Studio will suggest the likely sections to the user with high accuracy. The user will only have to review the sections, make a few corrections, and should be ready in a matter of minutes.

We are building a library of ontologies for different domains: Wireless Networking, Medical Devices, Telecom, and so on. With a host of ontologies to draw on, our machine learning model will extract the correct properties of the log file for a given domain.

We are pretty excited about Glassbeam Studio and the technology stack we have chosen. With the advent of the Internet of Things, Glassbeam Studio brings data preparation, transformation, visualization, and machine learning together in one stack, without the need to build a custom solution.

Piqued your interest? To learn more, simply sign up for a demo.
