Learn-teach-repeat: Apache Solr workshop @ Glassbeam

Tuesday, June 24, 2014

At Glassbeam, we use many different open source technologies. Our stack includes Apache Cassandra, Apache Solr, Apache Spark, Scala, Play and H2, to name a few. We try to give back and share with the community, and we call this our learn-teach-repeat initiative. As part of this initiative, we host frequent meetups and training sessions to share the lessons we have learned while using these technologies to build a scalable platform for machine data search and analytics.

One of our meetup groups is the Bangalore Baby Apache Solr Group. We started this group to bring more people into the search world and to help them get started with Apache Solr. Apache Solr is an enterprise search platform that provides most of the features required to build a scalable, feature-rich search application. We have built Glassbeam’s Explorer, an application that can be used to search and explore terabytes of raw data, on top of Apache Solr.

While it’s easy to add basic search to an application with a few GBs of data, scaling search requires a lot of work. Very few training courses are available in Bangalore (or in India, for that matter) to help people understand the complete Solr stack. This initiative aims to get people onto the search track quickly, and then to keep solving complex problems together by sharing and learning at our meetups.

We recently hosted our second workshop on Solr at Glassbeam’s office in Bangalore. We got an overwhelming response to the workshop (unfortunately, we had to put a lot of people on the wait-list because of space constraints). Attendees ranged from junior developers and architects to product managers. In the hands-on session, we built the backend for the following demo application: HTTP://SAUMITRA.ME/SOLRDEMO/. Delivering the workshop around a full-fledged UI application helped participants understand the practical application of various Solr features. We covered the following topics in the workshop:

  1. Search Engines 101
  2. What is Solr? Use cases and architecture
  3. Solr schema, config, tokenizers and filters
  4. Indexing data:
    • From disk using SolrJ
    • Importing from a database (MySQL) with the DataImportHandler
  5. Querying Solr (filters, faceting, highlighting, sorting, grouping, boosting, range, function and fuzzy queries)
  6. Using the ‘More Like This’ component to show similar docs
  7. Adding the ‘Auto Suggest’ component to auto-complete user queries
  8. Using the ‘Clustering’ component to cluster similar results
  9. SolrCloud
    • Architecture
    • Setting up a multi-node cluster with ZooKeeper
    • Creating a distributed index
    • Collections API
  10. Solr Admin UI
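
To give a flavour of the querying topic (item 5), here is a minimal sketch of how a search request combining a filter, faceting, highlighting and sorting might be assembled. This is not the code from the workshop. The parameter names (`q`, `fq`, `facet`, `facet.field`, `hl`, `hl.fl`, `sort`, `rows`, `wt`) are standard Solr query parameters, but the host, the collection name `demo`, the field names `title` and `category`, and the helper `build_select_url` are assumptions made up for illustration.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, text, category=None, rows=10):
    """Build a URL for Solr's /select handler (hypothetical helper)."""
    params = [
        ("q", text),                  # main query
        ("rows", rows),               # number of results to return
        ("wt", "json"),               # response format
        ("facet", "true"),            # turn faceting on
        ("facet.field", "category"),  # assumed field name
        ("hl", "true"),               # turn highlighting on
        ("hl.fl", "title"),           # assumed field name
        ("sort", "score desc"),       # most relevant first
    ]
    if category is not None:
        # Filter query: narrows the result set without affecting scoring.
        # Note: this sketch does not escape quotes inside `category`.
        params.append(("fq", f'category:"{category}"'))
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Assumed local Solr instance and collection name.
url = build_select_url(
    "http://localhost:8983/solr", "demo", "title:solr", category="books"
)
print(url)
```

The function only builds the URL and does not contact a server, so it can be tried without a running Solr instance. Sending the request and reading the JSON response is left out of the sketch.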

Just like the first workshop, the session was very well received, and we were glad to hear positive feedback from participants. We would like to thank Shalin Mangar, Noble Paul and Varun Thacker from Lucidworks for helping with the workshops. These sessions are a great way to connect with like-minded people, share use cases and gain new insights about the latest and hottest technologies in the big data ecosystem. We are super excited by the response so far and look forward to organizing more of these. It is not only about teaching what you know; it is also about learning what others know!

Join the meetup groups below and follow us on Twitter @GLASSBEAM to stay up to date on the events under our learn-teach-repeat initiative:


Below are the slides from the workshop