Performing two searches in Solr for every search on the website is not optimal, yet we still need to identify the fields before performing the search. An easier way to do this is to incorporate the dictionary into the product catalog index itself.
For this, we will have to create fields in our index matching the dictionary key fields. Then, during indexing, we need to populate the key fields with words that match the product. Let us take an example to understand this. In our case, let us say that we are dealing with three fields in our dictionary: clothes_type, clothes_color, and brand. We would create three new fields in our product index: key_clothes_type, key_clothes_color, and key_brand. These fields would contain product-specific information that matches our dictionary.
For the product wrangler jeans, the information in these fields would be:
key_clothes_type : jeans
key_clothes_color : blue
key_brand : wrangler
For the next product, skinny fit black jeans, the information would be:
key_clothes_type : jeans
key_clothes_color : black
key_brand :
Here key_brand would be empty, as none of the words in skinny fit black jeans is identified as a brand in our dictionary.
The search would now apply higher boosts to these key fields, giving more importance to dictionary matches. A sample query would be:
q=blue%20jeans&qf=text%20cat^2%20name^2%20brand^2%20clothes_type^2%20clothes_color^2%20clothes_occassion^2%20key_clothes_type^4%20key_clothes_color^4%20key_brand^4&pf=text%20cat^3%20name^3%20brand^3%20clothes_type^3%20clothes_color^3%20clothes_occassion^3%20key_clothes_type^5%20key_clothes_color^5%20key_brand^5&fl=name,brand,clothes_type,clothes_color,score
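A query string this long is easy to get wrong by hand, because every field in qf and pf has to be separated by an encoded space rather than by &. The following is a minimal Python sketch of one way to assemble it. The helper names (boosted, build_params, build_query_string) are hypothetical, the field lists and boosts are copied from the sample query, and it is assumed that the request handler is already configured for a dismax-style parser that understands qf and pf.

```python
from urllib.parse import urlencode, quote

# Field names taken from the sample query; "text" carries no boost there.
BASE_FIELDS = ["cat", "name", "brand", "clothes_type",
               "clothes_color", "clothes_occassion"]
KEY_FIELDS = ["key_clothes_type", "key_clothes_color", "key_brand"]


def boosted(fields, boost):
    """Turn ['name', 'brand'] into ['name^2', 'brand^2'] for a given boost."""
    return [f"{field}^{boost}" for field in fields]


def build_params(user_query):
    """Build the request parameters, boosting key fields above base fields."""
    qf = ["text"] + boosted(BASE_FIELDS, 2) + boosted(KEY_FIELDS, 4)
    pf = ["text"] + boosted(BASE_FIELDS, 3) + boosted(KEY_FIELDS, 5)
    return {
        "q": user_query,
        "qf": " ".join(qf),  # fields are space-separated inside one parameter
        "pf": " ".join(pf),
        "fl": "name,brand,clothes_type,clothes_color,score",
    }


def build_query_string(user_query):
    """Encode spaces as %20 and leave ^ and , readable, as in the sample."""
    return urlencode(build_params(user_query), quote_via=quote, safe="^,")


if __name__ == "__main__":
    print(build_query_string("blue jeans"))
```

Keeping the boosts in one place also makes it simple to experiment with how strongly the key fields should outweigh the regular fields.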
This will give us results comparable to the somewhat clean implementation we described earlier, and it would be much faster as we are doing a single search instead of two. The only difficulty lies in creating the index: we have to figure out which words to populate the key fields with, which is the intersection between the field values of the product and the dictionary values for that field.
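The intersection described above can be sketched in a few lines of Python run before the documents are sent to Solr. This is only an illustration under stated assumptions: the DICTIONARY contents and the key_fields_for helper are hypothetical, products are assumed to be plain dicts with clothes_type, clothes_color, and brand values, and matching is done on lowercase whitespace-separated tokens, so multi-word dictionary values would need extra handling.

```python
# Hypothetical dictionary: field name -> set of known values (lowercase).
DICTIONARY = {
    "clothes_type": {"jeans", "shirt", "trousers"},
    "clothes_color": {"blue", "black", "red"},
    "brand": {"wrangler", "levis"},
}


def key_fields_for(product):
    """Return the key_* fields for one product.

    Each key field holds the values that occur both in the product's own
    field and in the dictionary for that field (a set intersection).
    """
    keys = {}
    for field, known_values in DICTIONARY.items():
        tokens = set(product.get(field, "").lower().split())
        keys["key_" + field] = sorted(tokens & known_values)
    return keys


if __name__ == "__main__":
    wrangler = {"name": "wrangler jeans", "clothes_type": "jeans",
                "clothes_color": "blue", "brand": "wrangler"}
    skinny = {"name": "skinny fit black jeans", "clothes_type": "jeans",
              "clothes_color": "black", "brand": ""}
    print(key_fields_for(wrangler))
    print(key_fields_for(skinny))
```

For the two example products, the second one ends up with an empty key_brand list, matching the case where no brand from the dictionary applies.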