You will have come across e-commerce sites that show facets in a hierarchy. Let's check how a hierarchy is handled on such a site. A search for "shoes" provides the following hierarchy:
Department Shoes -> Men -> Outdoor -> Hiking & Trekking -> Hiking Boots
How is this hierarchy built into Solr and how do searches happen on it?
In earlier versions of Solr, this used to be handled by a tokenizer known as solr.PathHierarchyTokenizerFactory. Each document would contain the complete path or hierarchy leading to the document, and searches would show multiple facets for a single document.
For example, the shoes hierarchy we saw earlier can be indexed as:
doc #1 : /dept_shoes/men/outdoor/hiking_trekking/hiking_boots
doc #2 : /dept_shoes/men/work/formals/
The PathHierarchyTokenizerFactory class will break this field into the following tokens:
doc #1 : /dept_shoes, /dept_shoes/men, /dept_shoes/men/outdoor, /dept_shoes/men/outdoor/hiking_trekking, /dept_shoes/men/outdoor/hiking_trekking/hiking_boots
doc #2 : /dept_shoes, /dept_shoes/men, /dept_shoes/men/work, /dept_shoes/men/work/formals
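To see which prefixes end up in the index, the tokenization above can be imitated in a few lines of Python. This is only a sketch of the behaviour described here, not Solr's own implementation. The function name path_hierarchy_tokens is hypothetical, and ignoring a trailing delimiter is an assumption made to match the token list shown for doc #2.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of a delimited path, shortest first."""
    # Assumption: empty segments (from a leading or trailing delimiter) are ignored.
    segments = [s for s in path.split(delimiter) if s]
    tokens = []
    prefix = ""
    for segment in segments:
        # Grow the prefix by one level and record it as a token.
        prefix = prefix + delimiter + segment
        tokens.append(prefix)
    return tokens


if __name__ == "__main__":
    for doc in ("/dept_shoes/men/outdoor/hiking_trekking/hiking_boots",
                "/dept_shoes/men/work/formals/"):
        print(path_hierarchy_tokens(doc))
```

Because every ancestor path becomes a token of its own, a single document is counted under each level of its hierarchy when we facet on this field.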
The initial query would contain the facet.field value as hierarchy:
facet.field=hierarchy&facet.mincount=1
The facet.prefix parameter can be used to drill down into the query:
facet.prefix=/dept_shoes/men/outdoor
This will list the sub-facets under outdoor men's shoes.
Here we have to take care of creating the hierarchy during indexing, which can be a tedious task.
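On the query side, the facet parameters for each drill-down step can be assembled programmatically. The following is a minimal sketch that assumes the path field is named hierarchy, as in the preceding query. The helper prefix_facet_params is a hypothetical name, and it only builds the facet portion of the query string.

```python
from urllib.parse import urlencode


def prefix_facet_params(field, prefix=None, mincount=1):
    """Build the facet part of a Solr query string for a path-style field."""
    params = [
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", str(mincount)),
    ]
    if prefix:
        # facet.prefix limits the returned facet values to those starting with it.
        params.append(("facet.prefix", prefix))
    # urlencode percent-encodes the slashes in the prefix for use in a URL.
    return urlencode(params)


if __name__ == "__main__":
    print(prefix_facet_params("hierarchy"))
    print(prefix_facet_params("hierarchy", "/dept_shoes/men/outdoor"))
```

Each time the user clicks on a facet value, the selected path is passed back as the new prefix, which narrows the facet list to that subtree.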
A better way to handle this in Solr 4.x is to use pivot facets. Pivot facets are implemented by splitting the hierarchical information, or breadcrumbs, across multiple fields, with one field for each level of the hierarchy. For the earlier example, the fields for pivot faceting would be:
doc #1 => hr_l0:dept_shoes, hr_l1:men, hr_l2:outdoor, hr_l3:hiking_trekking, hr_l4:hiking_boots
doc #2 => hr_l0:dept_shoes, hr_l1:men, hr_l2:work, hr_l3:formals
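If the source data already holds the hierarchy as a single path, the per-level fields can be derived from it before indexing. Here is a small sketch under that assumption. The function name path_to_level_fields is hypothetical, and the hr_l prefix simply follows the field names used above.

```python
def path_to_level_fields(path, prefix="hr_l", delimiter="/"):
    """Split a delimited path into one field per hierarchy level."""
    # Assumption: empty segments (leading or trailing delimiter) are ignored.
    segments = [s for s in path.split(delimiter) if s]
    # Level 0 is the top of the hierarchy, so enumerate from zero.
    return {f"{prefix}{level}": value for level, value in enumerate(segments)}


if __name__ == "__main__":
    print(path_to_level_fields("/dept_shoes/men/outdoor/hiking_trekking/hiking_boots"))
    print(path_to_level_fields("/dept_shoes/men/work/formals/"))
```

Documents with a shallower hierarchy, such as doc #2, just end up with fewer hr_l fields.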
The initial query for creating the facet pivots would be:
facet.pivot=hr_l0,hr_l1,hr_l2,hr_l3,hr_l4
To implement this in our index, we will need to add a dynamic field of type string to our schema.xml file:
<dynamicField name="hr_*" type="string" indexed="true" stored="true" />
Let us index the data_shoes.csv file and then run pivot faceting and see the results:
java -Dtype=text/csv -jar solr/example/exampledocs/post.jar data_shoes.csv
q=shoes&qf=text%20cat^2%20name^2%20brand^2%20clothes_type^2%20clothes_color^2%20clothes_occassion^2&pf=text%20cat^3%20name^3%20brand^3%20clothes_type^3%20clothes_color^3%20clothes_occassion^3&fl=*,score&facet=true&facet.pivot=hr_l0,hr_l1,hr_l2,hr_l3,hr_l4
On running pivot faceting with this query, we should get the following output:
To drill down, all we need to do is add the respective field and value as a filter query in our Solr search query.
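As a rough illustration of this drill-down, the filter queries can be appended to the pivot request one level at a time. The sketch below uses a hypothetical helper named drill_down_params. It assumes the selected values are simple tokens such as men or outdoor, which need no escaping of Solr query syntax.

```python
from urllib.parse import urlencode

PIVOT_FIELDS = ["hr_l0", "hr_l1", "hr_l2", "hr_l3", "hr_l4"]


def drill_down_params(selected, pivot_fields=PIVOT_FIELDS):
    """Build facet.pivot parameters plus one fq per selected hierarchy level."""
    params = [("facet", "true"), ("facet.pivot", ",".join(pivot_fields))]
    for field, value in selected:
        # Each selected level becomes its own filter query, field:value.
        params.append(("fq", f"{field}:{value}"))
    return urlencode(params)


if __name__ == "__main__":
    # The user has clicked on dept_shoes, then men.
    print(drill_down_params([("hr_l0", "dept_shoes"), ("hr_l1", "men")]))
```

Keeping each level in a separate fq parameter means a level can be removed again, when the user navigates back up the hierarchy, without touching the rest of the query.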