Adopting Elasticsearch to drive ecommerce search, recommendations and personalisation
Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. It is a JSON document store built upon the Apache Lucene search engine and can be used with the language and platform of your choice.
Microsoft SQL Server Full Text Search is no longer being actively developed, whereas Elasticsearch has a whole community of developers behind it delivering regular updates, enhancements and support. As a result, Elasticsearch is increasingly being adopted as the industry standard for building fast, full-text search functionality. It is well suited to ecommerce sites and can now be configured as the search provider for tradeit.
Why choose Elasticsearch?
One of the main advantages of Elasticsearch is speed, particularly when faced with large data sets such as a large number of products in an ecommerce store. Slow response times deliver a poor user experience and cause higher bounce rates, but Elasticsearch can achieve fast search responses because it searches an index instead of the text directly. Much of the work is done up front when content is indexed, and frequently used filters and results can be cached, so when something is actually searched for, it is found (or not) very quickly. Elasticsearch can return sub-second responses on huge datasets, giving it near real-time performance.
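To illustrate why searching an index is faster than scanning text, here is a minimal sketch of an inverted index in plain Python. It is a toy model only, not how Elasticsearch or Lucene is implemented, and the product data and function names are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, term):
    """Look the term up in the index rather than scanning every document."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical product names, for illustration only.
products = {
    1: "Red running shoes",
    2: "Blue running jacket",
    3: "Red rain jacket",
}
index = build_inverted_index(products)
```

The cost of tokenising is paid once at index time, so each search becomes a single lookup regardless of how many products there are.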
Elasticsearch is built to scale. Distributing the processing load across multiple nodes allows Elasticsearch to be easily scaled across servers and to balance the load between the nodes in a cluster. Elasticsearch will run fine on a single machine but can be scaled across hundreds of servers and contain petabytes of information. Growing the number of nodes in a cluster is almost entirely automatic and pain free, so scaling is easy.
Enhanced Queries & Relevancy
Elasticsearch uses JSON as the serialisation format for documents and is supported by various programming languages. This allows you to construct complex queries and fine tune them to help deliver the relevant results you want from a search. It provides a way of ranking and grouping those results, and provides aggregations which can explore trends and patterns in the data.
Elasticsearch can support all commonly-used data types including Text, Numbers, and Dates. It also supports more complex types such as objects, geo data types, nested types, arrays and many others.
Stability & Reliability
Running from a cluster of dedicated servers not only provides scalability to whatever size you need but also means that Elasticsearch is very robust. Should there be any issues with a server, there is an automatic failover, ensuring your search is continually operational in the event of an issue. Data is automatically replicated to prevent any loss in case of server failure.
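As a rough sketch of how replication is requested, the index settings below ask for one replica copy of every primary shard. The shard and replica counts are illustrative assumptions, not tradeit's actual configuration.

```python
def replicated_index_settings(shards=3, replicas=1):
    """Build index settings that keep `replicas` extra copies of each shard."""
    return {
        "settings": {
            "number_of_shards": shards,      # how the index is split across nodes
            "number_of_replicas": replicas,  # copies kept on other nodes for failover
        }
    }


settings = replicated_index_settings()
```

With at least one replica, a copy of each shard lives on a different node, so losing a single server does not lose data.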
Functional enhancements of Elasticsearch
Alongside the high-level benefits of using Elasticsearch, there are a number of functional benefits too. Let's examine some of the new functionality available with Elasticsearch in tradeit, and what is being delivered in the near future.
Keyword matches now cater for fuzziness on search passes allowing for spelling mistakes or mistyping, so if the user enters the term incorrectly the same results would be returned.
Keyword matches are also extended to include language inflections so the same set of products would be returned whichever of the related words was included in the search. For example searching for 'swimming', 'swimmer' or 'swimmers' would return the same set of results.
Keyword matching will also remove 'stop words', so if words like 'to', 'the', 'i' or 'and' were included in the search term, those words would be removed and only the other words matched. For example, searching for 'the product' or 'product' would return exactly the same results.
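Inflection handling and stop word removal are typically configured through a custom analyzer. The sketch below builds index settings using Elasticsearch's built-in `stop` and `stemmer` token filters. The analyzer name `product_text`, the filter names and the `name` field are assumptions for illustration, and the exact words that collapse together depend on the stemmer chosen.

```python
def product_index_settings():
    """Index settings with an analyzer that drops stop words and stems tokens."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Built-in English stop word list.
                    "english_stop": {"type": "stop", "stopwords": "_english_"},
                    # Reduces inflected words towards a common stem.
                    "english_stemmer": {"type": "stemmer", "language": "english"},
                },
                "analyzer": {
                    "product_text": {  # hypothetical analyzer name
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "english_stop", "english_stemmer"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "name": {"type": "text", "analyzer": "product_text"},
            }
        },
    }


settings = product_index_settings()
```

Because the same analyzer runs at index time and at search time, the search term and the product text are normalised in the same way before they are compared.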
Exact Phrase Matching
By using Elasticsearch, exact matches are boosted to ensure they rank above fuzzy matches. The amount they are boosted is configurable via tradeit.
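One common way to express this in the query DSL is a `bool` query in which a fuzzy `match` must succeed and a boosted `match_phrase` adds extra score when the exact phrase is present. This is a sketch of that pattern, not necessarily the query tradeit generates. The field name and the default boost value are assumptions.

```python
def phrase_boosted_query(term, field="name", phrase_boost=5.0):
    """Fuzzy match is required; an exact phrase match earns an extra boost."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {field: {"query": term, "fuzziness": "AUTO"}}}
                ],
                "should": [
                    {"match_phrase": {field: {"query": term, "boost": phrase_boost}}}
                ],
            }
        }
    }


query = phrase_boosted_query("hair brush", phrase_boost=10.0)
```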
Partial Matches Within Keywords
Elasticsearch has the ability to match search terms anywhere within keywords.
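Partial matching is often achieved by indexing n-grams, which are short overlapping fragments of each keyword. The sketch below shows one possible configuration using the built-in `ngram` token filter, plus a small helper showing what the fragments look like. The analyzer, filter and field names are assumptions, and the gram sizes are kept one apart because the `index.max_ngram_diff` setting defaults to 1.

```python
def partial_match_settings(min_gram=3, max_gram=4):
    """Index n-grams of each keyword; analyse search terms without n-grams."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "keyword_ngrams": {
                        "type": "ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    "partial_index": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "keyword_ngrams"],
                    },
                    "partial_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    },
                },
            }
        },
        "mappings": {
            "properties": {
                "keywords": {  # hypothetical field name
                    "type": "text",
                    "analyzer": "partial_index",
                    "search_analyzer": "partial_search",
                }
            }
        },
    }


def ngrams(token, n):
    """Return every contiguous fragment of length n, to show what gets indexed."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]


settings = partial_match_settings()
```

For example, `ngrams("brush", 3)` yields the fragments `bru`, `rus` and `ush`, so a search for 'rus' can find a keyword containing 'brush'.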
Elasticsearch enables the level of fuzziness to be made configurable for each search pass including turning it off, setting it to be automatic (where it determines the number of characters that can be changed depending on the length of the search term - this is the current standard), or specifying the number of characters that can be changed.
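The three options described above map onto the `fuzziness` parameter of a `match` query. A minimal sketch follows, assuming a hypothetical helper name and using `None` to mean fuzziness is turned off for a pass.

```python
def fuzzy_match(field, term, fuzziness="AUTO"):
    """Build a match query for one search pass.

    fuzziness: None to turn it off, "AUTO" to scale with term length,
    or an integer edit distance of 0, 1 or 2.
    """
    if fuzziness is None:
        return {"match": {field: {"query": term}}}
    if fuzziness != "AUTO" and fuzziness not in (0, 1, 2):
        raise ValueError("fuzziness must be None, 'AUTO', 0, 1 or 2")
    return {"match": {field: {"query": term, "fuzziness": fuzziness}}}


auto_pass = fuzzy_match("name", "hairbrsh")
fixed_pass = fuzzy_match("name", "hairbrsh", fuzziness=1)
exact_pass = fuzzy_match("name", "hairbrush", fuzziness=None)
```

Elasticsearch allows at most two edits per term, which is why the helper rejects larger numbers.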
Boost Product By Metric
Using the Elasticsearch rank feature query, merchants can boost products within listing and search results based on a metric. This is configurable by creating a set of rules (same as component rules and conditions). When any of those rules are matched, the merchant can choose to boost the results by a chosen metric.
For example, 'for all customers, boost in stock products', or 'When viewing the monthly offers category, boost products with the highest sales price'.
NOTE: Metrics calculated at runtime would not be able to boost listing and search results.
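As an illustration of the rank feature query, the sketch below adds a `rank_feature` clause as an optional booster alongside the main text match. The field name `sales_rank` is hypothetical and would need to be mapped with the `rank_feature` field type. Such values are stored when the product is indexed, which fits with the note above about runtime metrics.

```python
def boost_by_metric(term, metric_field="sales_rank", boost=2.0):
    """Match on product name, then lift products with a high metric value."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"name": term}}],
                "should": [
                    # metric_field must be mapped as type "rank_feature".
                    {"rank_feature": {"field": metric_field, "boost": boost}}
                ],
            }
        }
    }


query = boost_by_metric("laptop")
```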
In Elasticsearch you can define a list of synonyms, which can be managed via tradeit's admin. As synonyms differ in relevance across industry sectors and product sets, there is no set of default synonyms that is applicable to all. Merchants would need to provide their own list of synonyms relevant to their particular data set. For example, a search for 'Laptop' would return results for all 'Laptops' as well as 'Macbooks', 'Notebooks' and 'Netbooks' if the correct synonyms are set up.
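A merchant's synonym list could be turned into analyzer settings along the following lines. This sketch uses the built-in `synonym_graph` token filter. The filter and analyzer names are assumptions, and how tradeit's admin stores the list is not shown.

```python
def synonym_settings(groups):
    """Convert groups of equivalent words into synonym_graph filter rules."""
    rules = [", ".join(word.lower() for word in group) for group in groups]
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "product_synonyms": {"type": "synonym_graph", "synonyms": rules}
                },
                "analyzer": {
                    "product_search": {  # hypothetical search-time analyzer
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "product_synonyms"],
                    }
                },
            }
        }
    }


settings = synonym_settings([["Laptop", "Macbook", "Notebook", "Netbook"]])
```

Each comma-separated rule tells Elasticsearch to treat the listed words as interchangeable when the search term is analysed.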
Aggregated Multiple Search Passes
Elasticsearch can support multiple search passes in one, so all search passes could be executed in one request to Elasticsearch and it will provide a set of results for each search pass.
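One way to send several passes in a single request is the multi search API, which takes a newline-delimited body of header and query pairs. The sketch below only builds that body. The index name and the two example passes are assumptions, and sending the request is left to whichever HTTP client is in use.

```python
import json


def msearch_body(index, passes):
    """Build a newline-delimited body with one header and one query per pass."""
    lines = []
    for query in passes:
        lines.append(json.dumps({"index": index}))  # header line for this pass
        lines.append(json.dumps({"query": query}))  # the search for this pass
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = msearch_body(
    "products",
    [
        {"match_phrase": {"name": "hair brush"}},
        {"match": {"name": {"query": "hair brush", "fuzziness": "AUTO"}}},
    ],
)
```

The response contains one result set per pass, in the same order as the passes were supplied.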
Multiple Sort Fields
Elasticsearch allows merchants to configure two-dimensional sorting which enables them to combine options like A-Z and in stock, so users will see A-Z of all those items that are in stock first, followed by A-Z of all those items that aren't in stock.
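A sort of this kind is expressed as an ordered list of sort fields. The sketch below assumes a boolean `in_stock` field and a `name.keyword` sub-field, both hypothetical, and includes a small local equivalent to show the resulting order.

```python
def in_stock_then_name_sort():
    """Sort clause: in-stock items first, then A-Z by name within each group."""
    return [
        {"in_stock": {"order": "desc"}},     # true sorts ahead of false
        {"name.keyword": {"order": "asc"}},  # A-Z within each group
    ]


def sort_locally(products):
    """Plain-Python equivalent of the sort clause above, for illustration."""
    return sorted(products, key=lambda p: (not p["in_stock"], p["name"]))


# Made-up sample data.
sample = [
    {"name": "Anorak", "in_stock": False},
    {"name": "Cap", "in_stock": True},
    {"name": "Boots", "in_stock": True},
]
ordered_names = [p["name"] for p in sort_locally(sample)]
```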
In Stock Sort
An additional sort option field of 'In Stock' has been enabled to allow merchants to display items that are in stock ahead of those that aren't when they apply that sort option.
Boost Products By Product Group
Any products in boosted product groups will always appear at the top of the search results and product listing when the configured sort option is selected. Ideal for pushing groups like new products, in season products or particular brands.
Weighted Search Passes
When aggregated search passes are enabled, you can weight the importance of each pass to promote more exact matches and to promote matches on fields that are deemed more important (e.g. product name).
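Weighting can be sketched as a `bool` query whose `should` clauses each carry their own boost. The field names and weights below are illustrative assumptions rather than tradeit defaults.

```python
def weighted_passes(term, passes):
    """Combine passes so that higher-weighted fields contribute more score.

    passes: iterable of (field, weight) pairs.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {field: {"query": term, "boost": weight}}}
                    for field, weight in passes
                ],
                "minimum_should_match": 1,  # at least one pass must match
            }
        }
    }


query = weighted_passes("hair brush", [("name", 3.0), ("description", 1.0)])
```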
'More Like This' Product Metric
Using Elasticsearch's 'More Like This' query we can power a new metric that automatically lists products similar to the product being viewed. The simplest way consists of asking for other products that are similar to the one provided using tf-idf (term frequency-inverse document frequency), a numerical statistic that reflects how important a word is to a document. The higher the tf-idf, the more 'alike' it is to the product being viewed. In very simplistic terms, let's say we have a product that is a hair brush: we may ask for all other products with 'hair brush' in their 'product name' and 'description' fields, limited to the 10 closest matches, to be returned.
This significantly reduces the configuration required by the merchant to manually define products that are similar to each other, but the flexibility of Elasticsearch means there are numerous selectable parameters which can help merchants hone the results of their users' searches too, all of which can be controlled by the admin system in tradeit.
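The hair brush example could look roughly like the `more_like_this` query below, which asks for products similar to an already indexed product. The index name, product id, field names and parameter values are assumptions chosen for illustration.

```python
def more_like_this(index, product_id, fields=("name", "description"), size=10):
    """Ask for the `size` products most similar to an indexed product."""
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": [{"_index": index, "_id": product_id}],
                # Tunable parameters; the values here are only examples.
                "min_term_freq": 1,
                "min_doc_freq": 1,
                "max_query_terms": 12,
            }
        },
    }


query = more_like_this("products", "hair-brush-001")
```

Parameters such as `min_term_freq`, `min_doc_freq` and `max_query_terms` are examples of the selectable options that control which terms are used to judge similarity.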
Recommended Search Terms
Using SQL, suggested search terms appear in the search fly-out where the customer-entered search term partially matches a previously used search term that returned results. This can be useful for well-established sites that have built up months or years of user search data, but for new sites with little or no data it doesn't really assist the user.
By using Elasticsearch instead, fuzzy suggestions based on indexed products, rather than previous search terms, can be returned meaning results are delivered immediately and with more logic than relying on previous user searches.
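One way fuzzy suggestions from indexed products can be requested is with the completion suggester, sketched below. It assumes a hypothetical `suggest` field mapped with the `completion` type and is not necessarily the mechanism tradeit uses.

```python
def fuzzy_suggestions(text, field="suggest", size=5):
    """Request up to `size` suggestions that tolerate small typing errors."""
    return {
        "suggest": {
            "product_suggest": {  # arbitrary name for this suggestion block
                "prefix": text,
                "completion": {
                    "field": field,  # must be mapped as type "completion"
                    "size": size,
                    "fuzzy": {"fuzziness": "AUTO"},
                },
            }
        }
    }


request = fuzzy_suggestions("lapto")
```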
Suggested categories appear in the search fly-out where the category name matches the search term but this is greatly improved via Elasticsearch through the introduction of fuzziness, synonyms, and analyzers.
Speak to us to learn more about how we can configure Elasticsearch as the search provider for tradeit on your ecommerce site.