Perl and the Elasticsearch percolator

Intro

For our most recent hack day, I worked with my colleagues Antonio Barone and Nelio Nunes to implement an alert-me-when function for our site. In production, we use Solr at the moment as the search engine to serve our frontend. Unfortunately, implementing alerting functionality with it means you have to go down a do-it-yourself route.

So, for our hack, we used Elasticsearch, which has a really nice built-in feature called Percolator that is exactly what we needed. Borrowing the percolation definition from the Search::Elasticsearch::Client::Direct CPAN module:

“Percolation is search inverted: instead of finding docs which match a particular query, it finds queries which match a particular document.”

Since we wanted to present the final result with some real data, we needed to switch the frontend to use Elasticsearch instead of Solr, hence we needed to import data into Elasticsearch from Solr. That was easily achieved using River, a plugin for Elasticsearch to import Solr data. (The details of the migration are off-topic for this post.)

Percolator queries are ordinary documents with a specific format, so they can be decorated with any extra information and queried like the documents in any other index. The query documents are stored in an arbitrary index under the reserved type name .percolator.
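To make the "decorated with extra information" point concrete, here is a minimal sketch of what registering such a query document might look like. The index name products, the title and email fields, and the helper percolator_registration are all hypothetical, chosen only for illustration; the sketch just builds the arguments one would pass to the index method of a Search::Elasticsearch client, and prints them instead of talking to a cluster.

```perl
use strict;
use warnings;
use JSON::PP;

# Build the arguments for registering a percolator query.
# The index name "products" and the fields "title" and "email" are
# hypothetical; they are not taken from the real application.
sub percolator_registration {
    my (%args) = @_;
    return (
        index => 'products',
        type  => '.percolator',    # reserved type name for stored queries
        id    => $args{id},
        body  => {
            # The query that incoming documents will be matched against.
            query => { match => { title => $args{terms} } },
            # Extra information stored alongside the query.
            email => $args{email},
        },
    );
}

my %request = percolator_registration(
    id    => 'alert-1',
    terms => 'red dress',
    email => 'customer@example.com',
);

# With a Search::Elasticsearch client in $es, this would be sent as:
#   $es->index(%request);
print JSON::PP->new->canonical->pretty->encode( \%request );
```

Because the email lives in the same document as the query, a plain search against the .percolator type can later retrieve it like any other field.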

Some code

Here are some snippets of code from our implementation; the code is part of a Dancer app.

  1. Check whether the search returns results; if it does not, store the query
  2. Check whether the new product matches any stored query and, if so, send the email
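As a rough illustration of the two steps above (a sketch under stated assumptions, not the code from our app), the logic might look like the following. The helper names, the products index and the product type are hypothetical. The sketch assumes a search response in which hits.total is a plain number and a percolate response that carries a matches list of _index/_id pairs; canned responses stand in for a live cluster so the example runs on its own.

```perl
use strict;
use warnings;

# Step 1: an empty result set means the query is worth storing as an alert.
# Assumes a search response where hits.total is a plain number.
sub should_store_query {
    my ($search_response) = @_;
    return ( $search_response->{hits}{total} // 0 ) == 0 ? 1 : 0;
}

# Step 2a: arguments for percolating a new product document.
# Index and type names are hypothetical.
sub percolate_request {
    my ($product) = @_;
    return (
        index => 'products',
        type  => 'product',
        body  => { doc => $product },
    );
}

# Step 2b: collect the ids of the stored queries that matched.
sub matching_query_ids {
    my ($percolate_response) = @_;
    return map { $_->{_id} } @{ $percolate_response->{matches} // [] };
}

# Canned responses stand in for a live cluster.
my $empty_search = { hits => { total => 0, hits => [] } };
my $percolated   = {
    total   => 1,
    matches => [ { _index => 'products', _id => 'alert-1' } ],
};

my %request = percolate_request( { title => 'red dress' } );
# With a Search::Elasticsearch client in $es, the live call would be:
#   $es->percolate(%request);

print "store query\n" if should_store_query($empty_search);
print "notify: $_\n" for matching_query_ids($percolated);
```

In a real app, each matching id would then be used to look up the stored query document and the email address attached to it before sending the notification.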

Conclusions

  • Elasticsearch offers scalable alert-me-when functionality out of the box with the Percolator feature; doing the same in Solr would have required writing it ourselves.
  • The Perl module provides a good interface to Elasticsearch, but there are still some rough edges around the Percolator interface. Mainly, it appears to be a read-only interface, so adding queries to the percolator index requires the standard interface. This ends up exposing some Elasticsearch internals that would be nice to mask behind the percolator interface.
This entry was posted in Hack Days, Perl, Search by Fabio Ponciroli.

About Fabio Ponciroli

Fabio is a Senior Software Engineer at Yoox-Net-A-Porter Group, where he works in one of the backend teams mainly responsible for the catalog API used by the company's different e-commerce sites. He has extensive experience working with Perl, NodeJS, Scala and related ecosystems. He is originally from Milan, Italy, where he got his master's degree in Telecommunication Engineering at the Polytechnic of Milan. He spent a number of years working in Milan as a consultant in the telecommunication industry. During this time he worked for various companies including Vodafone, H3G and FastWeb. In 2007 Fabio moved to London and has worked at different companies, from start-ups to corporates. Github: barbasa
