Recap of 2 wonderful conferences

Friday, May 8, 2015

It’s been an exhausting, yet exciting, past week at Glassbeam. We attended, sponsored, networked, and learned a lot at 2 large industry events – the LiveWorx conference in Boston and the TSW Conference in Santa Clara.

LiveWorx, as many of you know, was a conference organized by PTC/ThingWorx and one of the biggest IoT events of 2015. Luminaries from the industry including Jim Heppelmann, Steve Wozniak, Professor Michael Porter and many others articulated their vision of the white-hot IoT ecosystem. As a sponsor, we were right in the midst of all the excitement – rubbing shoulders with customers, prospects, partners, industry analysts and numerous other key players. In conjunction with the conference, Harbor Research released an INDUSTRY REPORT that discusses the benefits of our partnership with ThingWorx, and we also issued a PRESS RELEASE about our first joint win with ThingWorx (at a medical device company).

Perhaps the biggest takeaway from the event was the sense of just how ubiquitous IoT is going to be in our lives moving forward, and how ‘things’ and products simply cannot afford to be unconnected anymore. Companies whose products don’t form an intelligent, closed feedback loop with their manufacturers are going to be easily outmaneuvered and outperformed by their smarter, IoT-savvy peers. Platforms that connect devices and machines and transport data securely are going to be important ecosystems of the future, and analytics will continue to play a central role in gleaning business intelligence.

TSW was right next door at the Santa Clara Convention Center. Here, we showcased our pedigree in the Support horizontal along with exciting new use cases like defining Quality of Experience (QoE) using machine learning capabilities (read the related PRESS RELEASE). We launched a NEW WHITE PAPER, authored along with noted industry analyst John Ragsdale, that discusses how proactive support and other key features are critical to boosting productivity and lowering costs at support organizations.

Our booth had a steady stream of interested visitors – many of them wanted to view a new Video we recently launched that explains how we help centralize tribal knowledge, along with other key benefits for Support teams. Again, there was tons of interest in use cases at the intersection of IoT and Support Analytics, and in how these are fostering entirely new economics and business models in support centers across verticals.

These predictive and prescriptive scenarios allow customer service teams to dramatically reduce the mean time to resolution (MTTR), enhancing customer satisfaction, elevating product reputation and increasing customer service efficiencies.

The concept of predictive and prescriptive analysis touches on another issue: MTTR requires a redefinition. If teams are able to address issues before they become problems, then “resolution” takes on a whole new meaning. Tune in to my next blog as we delve deeper into this discussion.

  1. Number and length of procedures conducted by hospitals over days, weeks, quarters, etc.
  2. Usage Reports that identify key button presses and duration of presses by different doctors, hospitals and surgery centers
  3. Fault metrics by type of fault across the install base

Acting on these ANALYTICS, the company was able to refine its product roadmap to align more closely with customer needs – a tactic that is now paying great dividends in the form of increased customer satisfaction.

For companies in the business of manufacturing these complex machines, safety has been, and always will be, paramount. More importantly, servicing these machines with zero downtime is fast becoming a key business priority. Let’s see how log analytics of machine-generated data can act as a service differentiator for these companies.

Forward-looking companies are introducing diagnostics as a service, with proactive maintenance tasks as the key focus – the overall goal being remote monitoring. Machine data from the logs is transmitted wirelessly to a platform ready to ingest the incoming data stream. Log processing includes determining whether the data is time-based or non-time-based. Once this is determined, the logs are processed using a parsing language to derive valuable operational intelligence.

What can this remote monitoring technology do?

This new service runs securely over the internet to ensure smooth running of thousands of escalators that are deployed at remote sites. This service ingests operational data including configuration change logs, micro-controller logs, error logs, and such. In turn, the logs are processed to report potential issues back to the operators even before they arise.

How does this service do that?

Logs from the escalators across the sites are automatically collected, classified, normalized, and aggregated into a log vault. A historical analysis is conducted on each new data set that comes in. Within minutes, powered by a purpose-built machine log parsing language (SPL), any log file from the escalator is read and decoded to extract real-time intelligence. Dashboards collate the intelligence into reports that help operators take preventive actions, ranging from predicting a potential fault to scheduling firmware upgrades.
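In spirit, the normalize-then-aggregate step can be sketched in a few lines of Python. This stands in for SPL, not an actual Glassbeam parser, and the log format and error codes here are hypothetical:

```python
import re
from collections import defaultdict

# Hypothetical log line format, e.g. "2015-05-08 10:32:11 ERROR E102 motor overheat"
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>\w+)\s+(?P<code>\S+)\s+(?P<msg>.*)"
)

def normalize(raw_lines):
    """Parse raw escalator log lines into structured records (classify + normalize)."""
    records = []
    for line in raw_lines:
        m = LINE_RE.match(line)
        if m:
            records.append(m.groupdict())
    return records

def aggregate(records):
    """Count error codes across the fleet -- the kind of rollup a dashboard shows."""
    counts = defaultdict(int)
    for r in records:
        if r["level"] == "ERROR":
            counts[r["code"]] += 1
    return dict(counts)

logs = [
    "2015-05-08 10:32:11 ERROR E102 motor overheat",
    "2015-05-08 10:32:15 INFO  S001 step sensor ok",
    "2015-05-08 10:33:02 ERROR E102 motor overheat",
]
print(aggregate(normalize(logs)))  # {'E102': 2}
```

A real pipeline would also handle multi-line records and non-time-based sections, but the collect → classify → normalize → aggregate shape is the same.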

Behind the scenes

The GLASSBEAM PLATFORM, built on SCALAR, indexes the logs as time- or non-time-based data. They are then stored in in-memory databases, making the platform capable of processing hundreds of thousands of queries in parallel against petabytes of machine data. Because these databases are essentially distributed collections of objects that can be cached in memory across cluster nodes, they are easy to manipulate through various parallel operators. As a result, dynamic queries can be run and reports extracted in real time.

Sitting on top of the analytics engine is the Rules and Alerts capability, where business rules can be defined and alerts configured whenever an event deviates from a standard behavioral pattern. Further, real-time data streaming is possible because our stack is now INTEGRATED WITH APACHE SPARK.
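A minimal version of such a deviation rule can be sketched in Python with a simple z-score check. The threshold and readings are illustrative assumptions, not how the actual Rules engine is implemented:

```python
from statistics import mean, stdev

def deviation_alert(history, current, threshold=3.0):
    """Flag a reading that deviates more than `threshold` standard
    deviations from the historical baseline (a simple z-score rule)."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return False  # no variation in the baseline; nothing to compare against
    return abs(current - mu) > threshold * sigma

# Hypothetical vibration readings from a healthy escalator motor
baseline = [10.0, 10.2, 9.9, 10.1, 10.0]
print(deviation_alert(baseline, 25.0))   # True  -- a spike worth alerting on
print(deviation_alert(baseline, 10.05))  # False -- within normal behavior
```

A production rule would work on streaming windows rather than a fixed list, but the "compare the event against learned normal behavior" pattern is the core idea.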

The service is scalable and repeatable at every site. It is deployed over the cloud, making it possible for both IT and business stakeholders to harness real-time value from the same machine data source instantly, at nearly zero overhead cost.

A drop in hydraulic pressure can be caused by several factors, and the fault may not always point to the hydraulic systems. To isolate this issue, field operators will need to bring down the turbine, send out field personnel with test equipment, and check with the hydraulic system’s supplier for a solution. This could mean days, if not weeks, before a resolution is found.

End-to-end operational analytics

With all these data sets, wind farms can gain real-time visibility into operations and component-level behavior through analysis of the streaming data captured by electro-mechanical heavy equipment, sensors, and controllers.

Wind farms can identify specific units and/or components that may need further inspection in the future, and can eliminate unplanned activities between scheduled maintenance processes.

Let’s ask ourselves – wouldn’t it be great if:

  • Specific units and/or components that may need further inspection triggered alerts, with potential maintenance suggested from an asset knowledgebase?
  • Wind farms were equipped with a solution to predict the supplier’s (in this case, the hydraulic system’s) performance over the life cycle of all installed turbines?
  • Wind farms were in a position to procure only the most efficient hydraulic system configurations – those that impact production the least?
  • We could quantify, on a daily basis, how a component’s performance impacts financial performance?

Brand new approach

This highly fragmented data requires complex analysis and correlation across the different types of datasets.

The aim is to augment intuitive decision making with statistically based technical support and operating procedures. Combine that with visualization so richly detailed that it provides a full 360-degree view of all the assets in a wind farm.

Glassbeam’s hyper-scale IoT platform SCALAR can ingest structured, unstructured, or multi-structured data from any type of machine and prepare it for analysis. Combine that with our machine learning/predictive analytics layer built on top of Apache Spark, and you have a fast, scalable analytics solution for processing large-scale IoT machine data at your fingertips. With our rich set of APIs and an easy-to-use drag-and-drop dashboard builder, creating custom solutions has never been easier.

The future prism

An IoT analytics platform for the Wind Energy industry can perform the following tasks:

  • Trigger related events when behavior deviates from the outcomes of domain-specific machine learning models
  • Grab part-level performance metrics and predict trends based on pre-determined baseline data
  • Set up watch lists for configuration changes and correlate that with the supplier’s part performance statistics
  • Measure the status of the components’ attributes and visually correlate over the asset base and their lifecycle
  • Support out-of-the-box reporting to procure replacement parts or schedule maintenance activities based on prescriptive analytics
  • Provide cognitive intelligence to front-line operators, analysts, and business planners for guidance and advice, ahead of impending machine/component failures
  • Ensure maintenance contract negotiations are based on the prescriptive data of the lifespan of a part/system
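To make the trend-prediction item above concrete, here is a minimal sketch: an ordinary least-squares fit extrapolating when a degrading metric crosses a failure threshold. The metric, numbers, and threshold are hypothetical, and a real platform would use far richer baseline models:

```python
def fit_trend(xs, ys):
    """Ordinary least-squares slope and intercept for a metric over time."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def days_until(threshold, xs, ys):
    """Extrapolate when a degrading metric (e.g. a hydraulic pressure margin)
    will cross a failure threshold; None if the trend is not degrading."""
    slope, intercept = fit_trend(xs, ys)
    if slope >= 0:
        return None
    return (threshold - intercept) / slope

# Hypothetical pressure-margin readings on days 0..3: losing 2 units/day
days, margin = [0, 1, 2, 3], [100, 98, 96, 94]
print(days_until(80, days, margin))  # 10.0 -- schedule maintenance before day 10
```

Alerts like the ones described above would fire when `days_until` drops inside the next scheduled-maintenance window.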

Building a wind energy operation analytics platform requires a flexible and adaptable technology framework and Glassbeam’s next generation platform is ideally suited for this purpose. Our RESOURCES PAGE has numerous case studies, white papers and analyst reports that articulate this capability in more detail.

Related reading

But here is the catch: these information-rich datasets are unstructured, complex, and come in a variety of formats. Take the example of CT scanner equipment: error logs typically reveal a lot of useful pointers for a field engineer resolving device issues. Apart from the logs, there are mechanical parts, indicators such as the LED lights that signal something is wrong, and electronic parts such as resistors, transmitters, etc. that need to be checked to ensure they are functioning as intended.

What if field technicians could know beforehand that a particular CT scanner is going to go down?

Business leaders are looking to avoid multi-year lock-in for deployments. Instead, the expectation is a solution that is lean and adaptive, yet able to handle complexity, inter-connectedness, and diversity in data sources. As a result, finding ways to discover insights from assets that already exist in any equipment across the medical institution’s install base is crucial to effective decision making.

Convergence is the secret sauce

It isn’t about confirming the information and decisions we already know about the device or have considered in the past; it is about gleaning predictive insights like “Hey, your CT scanner in the Entomology lab in building Z is likely to fail, because I can see that the part no. 1EX300980 resistor is getting overloaded”.

This knowledge about the machines, combined with intervention using “prescriptive” log analytics, can reduce device failures significantly. How?

Enter, Predictive Analytics

We recently made available a state-of-the-art predictive analytics solution on our SCALAR PLATFORM. This new-age engine is ready to take on hundreds of thousands of data sources and surface information that can be ingested easily.

Here’s what medical institutions should expect in the future: an intelligent hospital infrastructure (IHI). This infrastructure will feature state-of-the-art IoT analytics on telehealth devices, offer remote patient monitoring through intelligent sensors, and include technology pathways to discover equipment stability in real time by combining learning from every potential data source. Further, this infrastructure will offer deep, institution-wide visibility and capabilities for converged device management.

Big data analytics means better hospital infrastructure and, by association, better health. To learn more about our offering in this vertical, please visit our MEDICAL DEVICE PAGE.

Related blogs and resources:

Having very large partitions is bad because many queries can then hit the same resource, and the memory required to hold that resource goes up; very large partitions may cause out-of-memory errors. Very small partitions are also bad because they increase disk reads and the network overhead of moving data around. An ideal partition is one that serves a query with one reasonably sized disk read. Assuming a query that reads six months’ worth of data and generates aggregated values, query performance can be improved in the following ways:

The flipside to this is that the entire data of a partition needs to be moved from the node to the client. The bottleneck is the size of partition, which is not in our control and may be quite large.
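One common way to keep partition size bounded in a time-series store like Cassandra is to bucket rows by a coarse time unit, so a six-month query fans out into a handful of predictable reads. A sketch of that idea, with a hypothetical schema that is not necessarily our actual data model:

```python
from datetime import datetime

def partition_key(device_id, ts):
    """Bucket rows by device and month so no single partition grows unbounded."""
    return (device_id, ts.strftime("%Y-%m"))

def partitions_for_range(device_id, start, end):
    """A six-month query fans out to one bounded read per monthly bucket."""
    keys = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        keys.append((device_id, f"{y:04d}-{m:02d}"))
        m += 1
        if m > 12:
            y, m = y + 1, 1
    return keys

print(partitions_for_range("dev1", datetime(2015, 1, 1), datetime(2015, 6, 30)))
```

The bucket granularity (month, week, day) is the knob that trades partition size against the number of reads per query.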

Flex your query muscles: Take small bites with pagination

Data access layers can send large amounts of data to the client in several small pages, instead of as one humongous chunk!

With the right ingredients (aggregation logic, client-side compatibility), each page can be processed independently. This way, large amounts of data are not retained at any point during processing, and the partition size limitation goes away. However, moving the data to the client remains the bottleneck.
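The page-at-a-time aggregation pattern looks roughly like this in Python (a generic sketch, not our actual data access layer):

```python
def fetch_pages(rows, page_size):
    """Yield query results in small pages instead of one huge chunk."""
    for i in range(0, len(rows), page_size):
        yield rows[i:i + page_size]

def paged_sum(rows, page_size=100):
    """Aggregate page by page, so at most one page is held in memory at a time."""
    total = 0
    for page in fetch_pages(rows, page_size):
        total += sum(page)
    return total

print(paged_sum(list(range(1000)), 100))  # 499500 -- same answer, bounded memory
```

With a real driver the pages come off the wire rather than from a list, but the aggregation loop is identical.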

To address this, we leverage Spark, a MapReduce-style framework and a great way to reduce the need for data movement to the client. By running the business logic on the node that holds the data, only a small amount of aggregated data needs to be moved to the coordinator; aggregation is now done on the node itself.

Spark, with built-in features for aggregations, reads one partition at a time, making it a great case for distributed processing. Combine this with the pagination provided by modern data access layers, and we get the best of both worlds.
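The map-then-merge shape of this computation can be sketched without Spark itself. Here, plain Python stands in for per-node executors, and the log codes are made up:

```python
from functools import reduce

# Two hypothetical partitions, one per node, each holding local log records
partitions = [
    [("E102", 1), ("E207", 1)],  # node A's records
    [("E102", 1), ("E102", 1)],  # node B's records
]

def map_partition(records):
    """Aggregate locally on each node: only small partial sums leave the node."""
    counts = {}
    for code, n in records:
        counts[code] = counts.get(code, 0) + n
    return counts

def merge(a, b):
    """The coordinator merges the small per-node partials."""
    for k, v in b.items():
        a[k] = a.get(k, 0) + v
    return a

result = reduce(merge, (map_partition(p) for p in partitions), {})
print(result)  # {'E102': 3, 'E207': 1}
```

Spark's `aggregateByKey`-style operators follow this same pattern: per-partition aggregation first, then a merge of the much smaller partial results.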

By supporting a rich set of query processing standards and building on powerful distributed frameworks, Glassbeam’s Big Data platform offers optimal partitioning, data distribution, and query processing. In all, the data we handle with our distributed Cassandra cluster can be effectively infinite, limited only by the laws of physics.

The presentation can be viewed ONLINE (Thanks DataStax). Incidentally, we’ve launched a new Website page that has this presentation as well as many other VIDEOS. Be sure to check it out.

Just when things were going right…

WatchService API’s MODIFY event raised a new issue

For a lot of practical purposes, WATCHSERVICE’S ENTRY_MODIFY EVENT is sufficient for getting a file event. But just getting a file event isn’t enough: the event should fire only once the file has been completely copied to the watched path. Otherwise, the file may be sent to parsing while it is still being copied from the customer end, which is unacceptable and will lead to race conditions. This scenario is especially relevant when files are sent via FTP/SFTP/SCP: the files are copied in chunks, and the WatchService API will treat each chunk as a file, triggering as many ENTRY_MODIFY events as there are chunks. This is a drawback of the WatchService API – it cannot let client code know when the file has been completely copied.
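A common stdlib-only workaround for this problem – sketched here in Python, and not how our Watcher module actually works, since Watcher uses native inotify events – is to treat a file as complete once its size has been stable for a quiet period:

```python
import os
import time

def wait_until_stable(path, quiet_seconds=1.0, poll=0.2, timeout=30.0):
    """Treat a file as fully copied once its size stops changing for
    `quiet_seconds`. A heuristic debounce for when no CLOSE_WRITE-style
    event is available; it can still misfire on a stalled transfer."""
    deadline = time.monotonic() + timeout
    last_size, stable_since = -1, None
    while time.monotonic() < deadline:
        size = os.path.getsize(path)
        if size == last_size:
            if stable_since is None:
                stable_since = time.monotonic()
            elif time.monotonic() - stable_since >= quiet_seconds:
                return True  # size unchanged for the whole quiet window
        else:
            last_size, stable_since = size, None  # still growing; reset the clock
        time.sleep(poll)
    return False  # never settled within the timeout
```

The heuristic is why an explicit end-of-write event like inotify's CLOSE_WRITE is so much more reliable: it removes the guesswork entirely.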

The reason cited in some forums for this seeming limitation is that the Oracle JDK aims to achieve portability by letting go of some platform-specific dependencies. Thus, events such as CLOSE_WRITE, which clearly marks the end of a file write even when the underlying protocol is FTP/SFTP/SCP, are not supported. Since CLOSE_WRITE was a vital event for us, we decided to let go of the JDK WatchService API altogether. Instead, we adapted an open source project aimed specifically at this problem, which exposes all of the native inotify events, including CLOSE_WRITE.

We have open sourced this “Watcher” module at HTTPS://GITHUB.COM/GLASSBEAM/WATCHER under the GPL v3 license.

In conclusion, we realized that although APIs such as WatchService are designed to be portable, they don’t suffice for many scenarios. The API should have offered options to enable some of the inotify events, so the same API could be used at the user’s discretion. Akka helped us scale and react to events asynchronously, and we gained tremendously by not spinning up threads ourselves.