Book: Apache Solr Search Patterns

Faceting with hierarchical taxonomy

You have probably come across e-commerce sites that show facets in a hierarchy. Let's take a look at www.amazon.com and check how the hierarchy is handled there. A search for "shoes" provides the following hierarchy:

Department Shoes -> Men -> Outdoor -> Hiking & Trekking -> Hiking Boots
Hierarchical facets on www.amazon.com

How is this hierarchy built into Solr and how do searches happen on it?

In earlier versions of Solr, this used to be handled by a tokenizer known as solr.PathHierarchyTokenizerFactory. Each document would contain the complete path or hierarchy leading to the document, and searches would show multiple facets for a single document.

For example, the shoes hierarchy we saw earlier can be indexed as:

doc #1 : /dept_shoes/men/outdoor/hiking_trekking/hiking_boots
doc #2 : /dept_shoes/men/work/formals/

The PathHierarchyTokenizerFactory class splits this field into one token per level of the path:

doc #1 : /dept_shoes, /dept_shoes/men, /dept_shoes/men/outdoor, /dept_shoes/men/outdoor/hiking_trekking, /dept_shoes/men/outdoor/hiking_trekking/hiking_boots
doc #2 : /dept_shoes, /dept_shoes/men, /dept_shoes/men/work, /dept_shoes/men/work/formals
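The token lists above can be reproduced with a few lines of Python. This is only a sketch that mimics the listing shown here, not Solr's actual tokenizer code. The function name path_hierarchy_tokens is made up for illustration.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of a delimited path, shortest first.

    Hypothetical helper that imitates the token listing in the text;
    empty segments (leading or trailing delimiters) are ignored.
    """
    parts = [p for p in path.split(delimiter) if p]
    return [delimiter + delimiter.join(parts[: i + 1]) for i in range(len(parts))]


tokens = path_hierarchy_tokens("/dept_shoes/men/work/formals/")
for token in tokens:
    print(token)
```

Because every ancestor path becomes its own token, a single document contributes to the facet count of each level above it.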

The initial query sets facet.field to the hierarchy field:

facet.field=hierarchy&facet.mincount=1

The facet.prefix parameter can be used to drill down into the query:

facet.prefix=/dept_shoes/men/outdoor

This lists the sub-facets under men's outdoor shoes.
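As a rough illustration, the facet parameters for the initial request and for a drill-down request could be assembled as follows. The helper name hierarchy_facet_params and the use of q=shoes are assumptions for this sketch. The field name hierarchy comes from the example above.

```python
from urllib.parse import urlencode


def hierarchy_facet_params(prefix=None, field="hierarchy"):
    """Build a URL-encoded query string for faceting on a path field.

    Hypothetical helper: pass prefix to restrict the facet values to
    one branch of the hierarchy via facet.prefix.
    """
    params = [
        ("q", "shoes"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.mincount", "1"),
    ]
    if prefix:
        params.append(("facet.prefix", prefix))
    return urlencode(params)


initial_query = hierarchy_facet_params()
drill_query = hierarchy_facet_params("/dept_shoes/men/outdoor")
print(initial_query)
print(drill_query)
```

Note that urlencode percent-encodes the slashes in the prefix, which Solr decodes back to the original path when it reads the request.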

With this approach, we have to build the full hierarchy path for every document during indexing, which can be a tedious task.

A better way to handle this in Solr 4.x is to use pivot facets. Pivot facets are implemented by splitting the hierarchical information, or breadcrumbs, across multiple fields, with one field for each level of the hierarchy. For the earlier example, the fields for pivot faceting would be:

doc #1 => hr_l0:dept_shoes, hr_l1:men, hr_l2:outdoor, hr_l3:hiking_trekking, hr_l4:hiking_boots
doc #2 => hr_l0:dept_shoes, hr_l1:men, hr_l2:work, hr_l3:formals
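If the source data already carries a path such as the one used in the tokenizer example, the per-level fields could be derived from it before indexing. The following is a minimal sketch under that assumption. The function name hierarchy_fields is hypothetical, and the hr_l prefix follows the field names shown above.

```python
def hierarchy_fields(path, prefix="hr_l", delimiter="/"):
    """Split a delimited path into one field per hierarchy level.

    Hypothetical helper: level 0 is the top of the hierarchy, and
    empty segments from leading or trailing delimiters are skipped.
    """
    levels = [p for p in path.split(delimiter) if p]
    return {f"{prefix}{i}": level for i, level in enumerate(levels)}


doc1 = hierarchy_fields("/dept_shoes/men/outdoor/hiking_trekking/hiking_boots")
doc2 = hierarchy_fields("/dept_shoes/men/work/formals/")
print(doc1)
print(doc2)
```

A shallower path simply produces fewer fields, so doc #2 has no hr_l4 value.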

The initial query for creating the facet pivots would be:

facet.pivot=hr_l0,hr_l1,hr_l2,hr_l3,hr_l4

To implement this in our index, we will need to add a dynamic field of type string to our schema.xml file:

<dynamicField name="hr_*"  type="string"  indexed="true"  stored="true" />

Let us index the data_shoes.csv file and then run pivot faceting and see the results:

java -Dtype=text/csv -jar solr/example/exampledocs/post.jar data_shoes.csv

q=shoes&qf=text%20cat^2%20name^2%20brand^2%20clothes_type^2%20clothes_color^2%20clothes_occassion^2&pf=text%20cat^3%20name^3%20brand^3%20clothes_type^3%20clothes_color^3%20clothes_occassion^3&fl=*,score&facet=true&facet.pivot=hr_l0,hr_l1,hr_l2,hr_l3,hr_l4

Running pivot faceting with this query should give the following output:

Pivot faceting output
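Solr returns pivot facets as nested entries, each with a field, a value, a count, and an optional pivot list holding the next level. The sketch below walks such a structure and flattens it into breadcrumb paths. The sample data and its counts are made-up placeholders for illustration, not the actual output for data_shoes.csv, and flatten_pivot is a hypothetical helper name.

```python
def flatten_pivot(nodes, trail=()):
    """Yield (breadcrumb, count) for every pivot node, depth first."""
    for node in nodes:
        path = trail + (node["value"],)
        yield "/".join(path), node["count"]
        # Leaf entries have no "pivot" key, so fall back to an empty list.
        yield from flatten_pivot(node.get("pivot", []), path)


# Placeholder data shaped like one facet_pivot entry; counts are invented.
sample = [
    {"field": "hr_l0", "value": "dept_shoes", "count": 2, "pivot": [
        {"field": "hr_l1", "value": "men", "count": 2, "pivot": [
            {"field": "hr_l2", "value": "outdoor", "count": 1},
            {"field": "hr_l2", "value": "work", "count": 1},
        ]},
    ]},
]

rows = list(flatten_pivot(sample))
for breadcrumb, count in rows:
    print(breadcrumb, count)
```

In a real response, the list passed to flatten_pivot would be read from facet_counts.facet_pivot under the key that matches the facet.pivot parameter.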

To drill down, all we need to do is add the selected field and value as a filter query (fq) to our Solr search query, for example fq=hr_l1:men.
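A drill-down request could be put together along these lines. This is a sketch only: drill_down_params is a hypothetical helper, q=shoes is taken from the earlier query, and the values are assumed to be simple tokens that need no escaping in the Solr query syntax.

```python
from urllib.parse import urlencode

PIVOT_FIELDS = ["hr_l0", "hr_l1", "hr_l2", "hr_l3", "hr_l4"]


def drill_down_params(selected, pivot_fields=PIVOT_FIELDS):
    """Build a query string with one fq per selected (field, value) pair.

    Hypothetical helper: each selected facet narrows the result set,
    while facet.pivot keeps returning the hierarchy for what remains.
    """
    params = [
        ("q", "shoes"),
        ("facet", "true"),
        ("facet.pivot", ",".join(pivot_fields)),
    ]
    for field, value in selected:
        params.append(("fq", f"{field}:{value}"))
    return urlencode(params)


query = drill_down_params([("hr_l0", "dept_shoes"), ("hr_l1", "men")])
print(query)
```

Sending each level as a separate fq lets Solr cache the filters independently, so moving one level deeper reuses the filters already applied.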
