The major problem that flash sale sites face is a sudden, large burst of traffic. People are generally notified in advance about the time of the sale, so at that exact moment a large number of customers hit the site to purchase the products on sale. As a result, we see a sudden spike in traffic during a flash sale and low traffic when no sale is happening.
Another problem is that, as soon as a product is sold out, it should be moved to the bottom of the search results. We have already seen how this situation can be handled in the previous section. However, this requires very frequent updates to the Solr index: ideally, as soon as a sale happens, the inventory status should be updated in the index. This is a general problem, but with flash sale sites it becomes more acute, because when the sale opens there is a rush for certain products. Moreover, the site can lose customers if inventory is not properly tracked and reported during the flash sale.
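As a sketch of what such an inventory update might look like, the example below builds a payload in Solr's atomic update format, which changes a single field without reindexing the whole document. The in_stock field name and the product id are assumptions for illustration; they are not part of the schema discussed here:

```python
import json

def inventory_update(product_id, in_stock):
    """Build a Solr atomic-update document that changes only the stock
    flag, leaving the rest of the indexed document intact."""
    return [{
        "id": product_id,               # uniqueKey of the product document
        "in_stock": {"set": in_stock},  # atomic 'set' replaces just this field
    }]

# The payload would be POSTed to the core's /update handler and become
# searchable at the next soft commit.
payload = json.dumps(inventory_update("SKU-1234", False))
print(payload)
```

Because only the changed field is sent, such updates stay cheap even when a popular product's stock count is updated many times per second during the sale.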
Thus, when we combine both scenarios, we have a site with a sudden spike in traffic, and we also need to keep the inventory status updated in the index to prevent over-selling of products. Solr's near real-time (NRT) indexing helps a lot in this scenario. A soft commit at regular intervals can be used to reflect the changes in the index. To implement NRT in our index, we need to take care of two things in our solrconfig.xml file.
A soft commit is much faster since it only makes index changes visible and does not fsync index files or write a new index descriptor to disk.
We need to ensure that the directoryFactory directive used for creating the Solr index specifies the NRTCachingDirectoryFactory class:
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
We need to specify the time interval for soft commits. This is handled via the autoSoftCommit directive:
<autoSoftCommit>
  <maxTime>30000</maxTime>
</autoSoftCommit>
This indicates that, every 30 seconds (30,000 milliseconds), changes in the Solr index become searchable, irrespective of whether they have been written to disk or not. The autoCommit directive specifies how often the index is written to disk (a hard commit). It is important to note that soft-committed changes are not durable by themselves: each update is first recorded in the transaction log, and if the server fails before a hard commit, the transaction log is replayed when the Solr server restarts to recover those changes. If no hard commit has been made for a long time, the transaction log can accumulate a large number of documents, which can result in a considerable Solr server startup time.
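A typical arrangement, sketched below, pairs frequent soft commits with less frequent hard commits that do not open a new searcher. The intervals shown are illustrative, not prescriptive; they should be tuned to the site's traffic and durability requirements:

```xml
<autoCommit>
  <maxTime>300000</maxTime>          <!-- hard commit every 5 minutes: flushes to disk, truncates the transaction log -->
  <openSearcher>false</openSearcher> <!-- visibility is handled by soft commits, so no new searcher here -->
</autoCommit>
<autoSoftCommit>
  <maxTime>30000</maxTime>           <!-- soft commit every 30 seconds: makes changes searchable -->
</autoSoftCommit>
```

Keeping openSearcher set to false means the expensive hard commit never pays the cost of warming a new searcher; that cost is paid only by the cheap, frequent soft commits.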
The third problem is that a product on sale should be searchable only after the exact sale time of the product. For example, if a product is supposed to go on sale at 2 pm, it should be in the index before 2 pm but should become searchable only from 2 pm onwards.
In order to handle this scenario, we need to include time-sensitive data in our search index. To do this, we need to add two additional fields in the Solr index that define the start and end time of the sale for that particular product:
<field name="sale_start" type="date" indexed="true" stored="true"/>
<field name="sale_end" type="date" indexed="true" stored="true"/>
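Solr date fields expect full ISO-8601 timestamps in UTC. The sketch below shows how a product document with these two fields might be prepared on the client side; the product id and the particular dates are illustrative assumptions:

```python
from datetime import datetime, timezone

def solr_date(dt):
    """Format a datetime as the ISO-8601 UTC string Solr date fields expect."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# A product whose flash sale runs from 2 pm to 4 pm UTC on a given day.
doc = {
    "id": "SKU-1234",  # hypothetical product id
    "sale_start": solr_date(datetime(2015, 6, 1, 14, 0, tzinfo=timezone.utc)),
    "sale_end": solr_date(datetime(2015, 6, 1, 16, 0, tzinfo=timezone.utc)),
}
print(doc["sale_start"])
```

Converting to UTC before formatting matters: Solr stores and compares all dates in UTC, so sending local times without conversion would shift the sale window.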
Once this information is stored in the index, all we need to do is add another filter query to restrict the search results to products whose sale window includes the current time. The filter query will be:
fq=+sale_start:[* TO NOW] +sale_end:[NOW TO *]
However, this filter query is very inefficient because NOW is evaluated afresh every time the query is run, so the filter can never be served from the filter query cache. A better way is to round the time down to the nearest hour or minute, so that the filter query can be cached for that duration. Thus, our filter query becomes:
fq=+sale_start:[* TO NOW/HOUR] +sale_end:[NOW/HOUR+1HOUR TO *]
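The effect of the rounding can be illustrated client-side. The sketch below mirrors the NOW/HOUR date math in Python: within any given hour, every request produces an identical filter query, which is what allows Solr to serve it from the filter cache. The timestamps used are illustrative:

```python
from datetime import datetime, timedelta, timezone

def flash_sale_fq(now):
    """Build the time-window filter query with the timestamp truncated to
    the hour, mirroring Solr's NOW/HOUR date math."""
    hour = now.replace(minute=0, second=0, microsecond=0)  # NOW/HOUR
    nxt = hour + timedelta(hours=1)                        # NOW/HOUR+1HOUR
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (f"+sale_start:[* TO {hour.strftime(fmt)}] "
            f"+sale_end:[{nxt.strftime(fmt)} TO *]")

# Two requests 50 minutes apart within the same hour yield the same filter.
a = flash_sale_fq(datetime(2015, 6, 1, 14, 5, tzinfo=timezone.utc))
b = flash_sale_fq(datetime(2015, 6, 1, 14, 55, tzinfo=timezone.utc))
print(a == b)
```

Note that rounding sale_end up to the next hour boundary is conservative: a product may drop out of the results up to an hour before its actual end time, but an expired sale is never shown.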
Now, as soon as the end time for the sale passes, the product automatically drops out of the search results, to within the granularity of the rounding.