In this section, we look at how SPL (Semiotic Parsing Language), Glassbeam’s DSL, helps parse multi-structured machine logs.
SPL allows a log file to be treated as a hierarchical document consisting of multiple segments (or sections). Each hierarchical segment is called a namespace. This makes it possible to zero in on the exact section from which specific elements should be parsed, localizing the scope of each extract.
To define the boundary of a namespace, SPL provides the BEGINS WITH and ENDS WITH clauses. The developer defines where a namespace begins with a regular expression that uniquely identifies the start of the section being defined as a namespace.
Note the corresponding DEFINE NAMESPACE statements in the SPL file: each has a BEGINS WITH clause that is keyed to the beginning of its section.
A namespace may also have an ENDS WITH clause. If that clause is absent, the start of another namespace “closes” the previous one. Anything that falls inside a namespace but is not parsed is considered UNPARSED and is made available separately.
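The implicit-close behaviour described above can be illustrated outside SPL. The following is a minimal Python sketch, not Glassbeam's implementation: the `NAMESPACES` mapping and the `split_into_namespaces` helper are hypothetical names, and the section markers are borrowed from the sample logs in this post.

```python
import re

# Hypothetical mapping of namespace name -> "begins with" pattern.
NAMESPACES = {
    "ex1.sysInfo": re.compile(r"=== SYSTEM INFORMATION ==="),
    "ex1.volumeInfo": re.compile(r"=== VOLUME INFORMATION ==="),
}


def split_into_namespaces(lines):
    """Group log lines under the namespace whose begin pattern matched last."""
    sections = {}
    current = None
    for line in lines:
        for name, begin in NAMESPACES.items():
            if begin.search(line):
                # A new begin marker implicitly closes the previous namespace.
                current = name
                sections.setdefault(current, [])
                break
        else:
            # Not a begin marker: the line belongs to the open namespace, if any.
            if current is not None:
                sections[current].append(line)
    return sections


log = [
    "=== SYSTEM INFORMATION ===",
    "OS Release: 1.1.0",
    "=== VOLUME INFORMATION ===",
    "/vol0 1000 11 989 rdg001",
]
sections = split_into_namespaces(log)
```

Each section's lines can then be handed to a section-specific parser, which is the scoping benefit that namespaces are meant to provide.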
The table is the logical entity where parsed data is held temporarily during parsing. It is where columns and their data types (string, int, real, etc.) are defined, along with descriptive labels for the UI and indexing rules.
The DEFINE TABLE directive defines the ICON, a parsing methodology, and the columns where the parsed data is stored. Note that the TABLE is an easy-to-understand representation of parsed data, even though SCALAR does not have tables in the database sense.
ICONs provide a simple way to parse supported log formats without the use of complex regular expressions. SPL defines multiple types of ICONs for specific log formats, and the platform allows more such ICONs to be created. Each ICON handles one type of log format.
Supported ICONs: NVPair, Align Basic, List Basic, Syslog, CSV, XML, JSON
Column functions apply transformations to the columns of the table being parsed by SPL. Some examples of column functions are:
- Transformation functions like colsplit, coljoin, colcase, colcopy
- Computational functions like colcalc
- A dedicated function for adding global variables (context): addcontext
Supported column functions:
COLFILL: Fills an empty column with the value from the previous row
COLDROP: Drops the specified column from the table
COLJOIN: Joins two or more columns or literals and assigns the result to a column
COLREP: Replaces the regular expression match in a column with the specified string
COLSPLIT: Splits the column into pieces specified by the backreferences of a regular expression
COLCOPY: Copies the data from one column to one or more columns
COLCASE: Conditionally assigns new values to the result column
COLCALC: Performs various transformations on operational data. For example:
- SDF2EPOCH (Pattern)
- The values obtained through backreferences can be used in the table
- Values assigned as global variables can be added to columns in a table
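To make the behaviour of a few of these functions concrete, here is a rough Python analogue operating on rows stored as dictionaries. It is only a sketch based on the one-line descriptions above: the function names, signatures, and the UTC assumption in `date_to_epoch` are illustrative choices, not SPL's actual semantics.

```python
import calendar
import re
import time


def colfill(rows, col):
    """Fill empty values in `col` with the value from the previous row."""
    previous = None
    for row in rows:
        if row.get(col) in (None, ""):
            row[col] = previous
        else:
            previous = row[col]
    return rows


def colsplit(rows, src, pattern, targets):
    """Split `src` into `targets` using the capture groups of `pattern`."""
    regex = re.compile(pattern)
    for row in rows:
        match = regex.search(row.get(src) or "")
        if match:
            for target, value in zip(targets, match.groups()):
                row[target] = value
    return rows


def coljoin(rows, target, parts):
    """Concatenate columns and literals; a part that is not a column name is a literal."""
    for row in rows:
        row[target] = "".join(str(row[p]) if p in row else p for p in parts)
    return rows


def date_to_epoch(text, fmt="%a %b %d %H:%M:%S %Y"):
    """Convert a timestamp string to epoch seconds, assuming the log time is UTC."""
    return calendar.timegm(time.strptime(text, fmt))


rows = [
    {"group": "rdg001", "volume": "/vol0"},
    {"group": "", "volume": "/vol1"},
]
colfill(rows, "group")
colsplit(rows, "group", r"([a-z]+)(\d+)", ["grp_prefix", "grp_id"])
coljoin(rows, "label", ["group", ":", "volume"])
epoch = date_to_epoch("Thu Aug 21 23:59:59 2008")
```

The date conversion is loosely analogous to what a pattern-driven function such as SDF2EPOCH would do; the format string here uses Python's `strptime` codes rather than whatever pattern syntax SPL expects.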
Variable Name/Value Pair
=== SYSTEM INFORMATION ===
Log Date: Thu Aug 21 23:59:59 2008
OS Release: 1.1.0
Serial Number: 037DF674
Domain Name: medisoft.com
SPL code sample
DEFINE NAMESPACE ex1 DESCRIPTION 'Module 1'
DEFINE NAMESPACE ex1.sysInfo DESCRIPTION 'System Information'
BEGINS WITH /=== SYSTEM INFORMATION ===/
DEFINE TABLE Sys_Info NAMESPACE ex1.sysInfo DESCRIPTION 'System Information'
COLUMN sys_log_date [s(64):n] <label = 'Date'> AS 'Log Date'
COLUMN sys_os_release [s(64):n] <label = 'OS Release'> AS 'OS Release'
COLUMN sys_serial_no [s(64):n] <label = 'Serial'> AS 'Serial Number'
COLUMN sys_model [s(64):n] <label = 'Model'> AS 'Model'
COLUMN sys_host_name [s(64):n] <label = 'Host'> AS 'Hostname'
COLUMN sys_domain_name [s(64):n] <label = 'Domain'> AS 'Domain Name'
COLUMN sys_company [s(64):n] <label = 'Company'>
COLCOPY (sys_domain_name, sys_company)
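For readers who do not know SPL, the effect of this table definition on the sample log can be approximated in Python. This is a sketch under assumptions: `parse_nvpairs` is a hypothetical helper, the `COLUMNS` mapping mirrors the AS clauses above, and the final assignment assumes that COLCOPY copies from its first argument to its second.

```python
SAMPLE = """\
=== SYSTEM INFORMATION ===
Log Date: Thu Aug 21 23:59:59 2008
OS Release: 1.1.0
Serial Number: 037DF674
Domain Name: medisoft.com
"""

# Source label -> column name, mirroring the AS '...' clauses in the SPL sample.
COLUMNS = {
    "Log Date": "sys_log_date",
    "OS Release": "sys_os_release",
    "Serial Number": "sys_serial_no",
    "Model": "sys_model",
    "Hostname": "sys_host_name",
    "Domain Name": "sys_domain_name",
}


def parse_nvpairs(text):
    """Parse 'Name: value' lines into a single row keyed by column name."""
    row = {column: None for column in COLUMNS.values()}
    for line in text.splitlines():
        # Split on the first colon only, so time values such as 23:59:59 stay intact.
        name, sep, value = line.partition(":")
        column = COLUMNS.get(name.strip())
        if sep and column:
            row[column] = value.strip()
    return row


row = parse_nvpairs(SAMPLE)
# Assumed equivalent of COLCOPY (sys_domain_name, sys_company).
row["sys_company"] = row["sys_domain_name"]
```

Columns with no matching line in the log, such as `sys_model`, simply stay empty in this sketch.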
=== VOLUME INFORMATION ===
Volume Size(GB) Used(GB) Avail(GB) Raid Group
------ ------- -------- --------- -----------
/vol0 1000 11 989 rdg001
/vol1 1000 108 892 rdg002
/vol2 1000 95 905 rdg003
/vol3 1000 31 969 rdg004
SPL code sample
DEFINE NAMESPACE ex1.volumeInfo DESCRIPTION 'Volume Information'
BEGINS WITH /=== VOLUME INFORMATION ===/
# Above namespace defines the beginning of the section for Volume information
DEFINE TABLE Volume_Info NAMESPACE ex1.volumeInfo DESCRIPTION 'Volume Information'
COLUMN vol_name [s(32):n] <label = 'Volume'> AS 'Volume' [L]
COLUMN vol_size [s(32):n] <label = 'Size', units = 'GB'> AS 'Size(GB)' [L]
COLUMN vol_used [s(32):n] <label = 'Used', units = 'GB'> AS 'Used(GB)' [L]
COLUMN vol_available [s(32):n] <label = 'Available', units = 'GB'> AS 'Avail(GB)' [L]
COLUMN vol_raid_grp [s(32):n] <label = 'RAID Group'> AS 'Raid Group' [L]
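The aligned volume listing can be turned into rows in a similar way. The Python sketch below is an approximation, not how SPL's alignment-based parsing works internally: it assumes whitespace-separated fields, maps them to columns by position, and uses a hypothetical `parse_aligned` helper with column names taken from the table definition above.

```python
SAMPLE = """\
=== VOLUME INFORMATION ===
Volume Size(GB) Used(GB) Avail(GB) Raid Group
------ ------- -------- --------- -----------
/vol0 1000 11 989 rdg001
/vol1 1000 108 892 rdg002
"""

# Column names in positional order, taken from the Volume_Info table definition.
COLUMNS = ["vol_name", "vol_size", "vol_used", "vol_available", "vol_raid_grp"]


def parse_aligned(text):
    """Parse the rows below the dashed ruler line into dictionaries."""
    lines = text.splitlines()
    # Data rows start on the line after the dashed ruler.
    start = next(i for i, line in enumerate(lines) if line.startswith("---")) + 1
    rows = []
    for line in lines[start:]:
        fields = line.split()
        # Skip blank or malformed lines that do not have one field per column.
        if len(fields) == len(COLUMNS):
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


rows = parse_aligned(SAMPLE)
```

Values stay as strings here, matching the `s(32)` string type declared for every column in the SPL sample.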
In essence, SPL in combination with Glassbeam’s processing platform makes it possible to build visually appealing dashboards from data obtained from machine logs.
In fact, once the data is structured, it can also be used to train machine learning models, which can then be applied in real time to newly arriving data.
If you would like to catch up on Part 1 of this series, please click here.