Now that we have studied the system architecture of an ad distribution network and its components, let us look at the performance requirements of an ad distribution system. We saw that there are multiple ways in which an ad publisher generates revenue from an ad network. CTR is the most preferred way of measuring the performance of an ad, and hence that of the ad network.
CTR stands for click-through rate. It is defined as the number of clicks made on an advertisement divided by the total number of times the advertisement was served (impressions).
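As a quick illustration of the definition, the ratio can be computed as in the following minimal sketch. The function name and the sample numbers are hypothetical, and returning 0.0 when no impressions have been served is an assumption made here for convenience.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a fraction: clicks divided by impressions served."""
    if impressions <= 0:
        # Assumption: with no impressions served, report 0.0 instead of failing.
        return 0.0
    return clicks / impressions


# Hypothetical numbers: 25 clicks over 1,000 impressions.
ctr = click_through_rate(25, 1000)
print(f"CTR = {ctr:.2%}")  # prints "CTR = 2.50%"
```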
In order to deliver a good CTR, the ad being displayed needs to be close to the context of the page currently being viewed by the user. In order to derive the context, we need to run a search with the title and metadata on the page and identify the ads related to that page. Let us create a sample Solr schema for an ad distribution network.
The schema for a listing ad can contain the following fields:
<field name="adid" type="lowercase" indexed="true" stored="true" required="true" omitNorms="true"/>
<field name="keywords" type="wslc" indexed="true" stored="true"/>
<field name="category" type="wslc" indexed="true" stored="true"/>
<field name="position" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="size" type="string" indexed="true" stored="true"/>
The adid field is a unique ID associated with an advertisement. The keywords field holds the keywords related to that adid; they are whitespace tokenized and stored in lowercase. category is another field that specifies the category of ads. It can be used to categorize an ad to be displayed on an e-commerce website, a blog, or some other specific website. position specifies the position in which the ad is to be displayed, and size specifies the size of the ad in pixels.
Our query for fetching an ad will be based solely on the keywords. The JavaScript client will pass the category, position, and size parameters related to an ad on the website. It will also pass the page title and the metadata on the page. Our search for an ad will be based on the value in the keywords field: we will perform an OR match between the content on the page (title and metadata) and the keywords for each advertisement.
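The OR match described above could be assembled along the following lines. This is a minimal sketch, not the book's implementation: the helper names, the sample title and metadata, and the choice to quote every term are assumptions. Splitting on whitespace and lowercasing mirrors how the keywords field is described.

```python
from urllib.parse import urlencode


def quote_term(term: str) -> str:
    """Wrap a term in double quotes, escaping backslashes and quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def keyword_query(title: str, metadata: str) -> str:
    """Build an OR query on the keywords field from page title and metadata."""
    terms = []
    for token in f"{title} {metadata}".lower().split():
        if token not in terms:  # drop duplicates, keep first-seen order
            terms.append(token)
    if not terms:
        raise ValueError("page title and metadata contain no terms")
    return "keywords:(" + " OR ".join(quote_term(t) for t in terms) + ")"


# Hypothetical page content passed by the JavaScript client.
q = keyword_query("Cotton T-Shirts Sale", "summer cotton clothing")
print(q)
# Query string for a select request; only the adid field is returned.
print(urlencode({"q": q, "fl": "adid"}))
```

Quoting each term keeps characters such as the hyphen in t-shirts from being interpreted as query syntax.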
A simple filter query on the category, position, and size parameters should return the ads that can be displayed on the page. In addition, there are certain parameters that we discussed earlier, such as the campaign, the start and stop dates, and the number of impressions or the budget related to the ad. All these parameters also need to be added to the schema and queried when the ad is fetched:
<field name="startdt" type="date" indexed="true" stored="false" />
<field name="enddt" type="date" indexed="true" stored="false" />
<field name="campainid" type="int" indexed="true" stored="true"/>
<field name="impressions" type="int" indexed="true" stored="true"/>
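The filtering on these fields might be expressed as Solr filter queries (fq parameters), as in the sketch below. It assumes that the impressions field holds the number of impressions still to be served and that an ad is eligible when the current time lies between startdt and enddt. The function name and the sample values (blog, top, 300x250) are hypothetical.

```python
from urllib.parse import urlencode


def ad_filter_queries(category: str, position: str, size: str) -> list:
    """Return fq clauses restricting ads to those eligible for this page."""
    def quoted(value: str) -> str:
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return [
        f"category:{quoted(category)}",
        f"position:{quoted(position)}",
        f"size:{quoted(size)}",
        "startdt:[* TO NOW]",    # campaign has already started
        "enddt:[NOW TO *]",      # campaign has not yet ended
        "impressions:[1 TO *]",  # assumption: field stores remaining impressions
    ]


fqs = ad_filter_queries("blog", "top", "300x250")
# Repeated fq parameters are sent as separate key/value pairs.
params = urlencode([("q", 'keywords:("t-shirts" OR "sale")')] + [("fq", fq) for fq in fqs])
print(params)
```

Keeping each condition in its own fq lets Solr cache the filters independently of the keyword query.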
Note that if we store the number of impressions to be served in the Solr index, we will have to update the index every time an impression is served. This increases both reads and writes on the Solr index and requires the system to operate closer to real time.
Changes in impressions should be immediately reflected in the Solr index. Only then will the Solr query be able to fetch the ads that need to be displayed. If an ad has served its budgeted impressions, it should not be served further. This can happen only if the impressions served are updated immediately in the ad index. The NRT indexing feature based on soft commits in Solr and SolrCloud (discussed in the chapter Solr in E-commerce) can be used to achieve this.
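One possible way to record a served impression is a Solr atomic update that decrements the impressions field, sent with a soft commit so that the change becomes searchable quickly. The sketch below only builds the request body; the core name ads, the function name, and the treatment of impressions as a remaining count are assumptions. Note also that atomic updates require adid to be the unique key and the other fields to be stored or to have docValues, which is not the case for startdt and enddt as declared above, so those would have to be stored or the whole document re-indexed instead.

```python
import json


def impression_update(adid: str, served: int = 1) -> str:
    """Build a JSON atomic-update body that decrements remaining impressions."""
    doc = {
        "adid": adid,
        # "inc" with a negative value subtracts from the current field value.
        "impressions": {"inc": -served},
    }
    return json.dumps([doc])


# Hypothetical core name; softCommit=true requests a soft commit with the update.
UPDATE_PATH = "/solr/ads/update?softCommit=true"

body = impression_update("ad-42")
print("POST", UPDATE_PATH)
print(body)
```

Issuing a soft commit on every impression can be expensive under load, so batching updates or relying on an automatic soft commit interval may be preferable in practice.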
A targeted ad is based on a user's browsing history and profile information. How can we get the user's browsing history or profile information? We need to drop cookies into the user's browser when he or she visits a certain site. The merchant who provides ads to the ad publisher also drops cookies into the user's browser. Let us look at an example scenario to understand this system.
Suppose a user is browsing for certain products, say t-shirts. Each of the user's actions results in the addition or update of cookies in his or her browser. Therefore, if the user views some t-shirts, there will be a cookie in the browser containing information on that product. The advertising system provides the merchant with a piece of code that is used to drop the cookie. The merchant may also register with the ad network and ask for certain CTR / CPM plans. Such an ad distribution system would have tie-ups with ad publishers or websites where the ads are to be displayed. The ad system would provide a piece of JavaScript code to the ad publisher to be added to the page on which the ad is to be displayed.
Suppose the user now goes to some other website, say a content and news website. Given that the ad distribution system has a tie-up with this website, there is JavaScript code on its home page that fetches ads from our ad distribution system. This JavaScript code will read all the relevant cookies in the user's browser and pass them to the ad distribution system. The ad distribution system now knows that the user was earlier viewing certain t-shirts on the merchant's website. Using this information, ads offering the t-shirts seen earlier by the user are displayed in the user's browser.
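On the ad distribution side, the cookie data passed along with the ad request has to be decoded before it can drive a search. The following is a minimal sketch, assuming the data arrives as a Cookie-style header string. The cookie names adnet_merchant and adnet_viewed, the comma-separated value format, and the sample IDs are all hypothetical.

```python
from http.cookies import SimpleCookie


def read_ad_cookies(cookie_header: str) -> dict:
    """Extract the merchant and viewed product IDs from a Cookie header."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    merchant = jar["adnet_merchant"].value if "adnet_merchant" in jar else None
    viewed = []
    if "adnet_viewed" in jar:
        # Assumption: product IDs are stored as a comma-separated list.
        viewed = [p for p in jar["adnet_viewed"].value.split(",") if p]
    return {"merchant": merchant, "viewed": viewed}


header = "adnet_merchant=shop-a; adnet_viewed=tshirt-101,tshirt-205"
print(read_ad_cookies(header))
```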
It is also possible to capture the user's profile information, such as age, sex, online shopping preferences, and location, and use it to display targeted ads to the user. These types of targeted ads have gained a lot of popularity as they are close to the user's interest.
A sample Solr schema for the ad network for targeted ads would contain the following fields in addition to the fields meant for listing ads:
<field name="merchant" type="string" indexed="true" stored="true" />
<field name="pincode" type="int" indexed="true" stored="true" />
These fields specify the merchant and the user location as part of the user profile.
When searching for an ad to be displayed to a particular user, the search query also incorporates cookie information regarding the merchant and the user. The products to be displayed for a particular merchant are picked up from the user's cookie.
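A targeted search of this kind might combine the cookie-derived values with the merchant and pincode fields, as sketched below. The function name and sample values are hypothetical, and matching the viewed products against the keywords field is an assumption about how products map to ads.

```python
from urllib.parse import urlencode


def targeted_ad_params(merchant: str, pincode: int, viewed_keywords: list) -> list:
    """Build select parameters for ads targeted at one merchant and location."""
    def quoted(value: str) -> str:
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    if not viewed_keywords:
        raise ValueError("no viewed products to match against")
    q = "keywords:(" + " OR ".join(quoted(k.lower()) for k in viewed_keywords) + ")"
    return [
        ("q", q),
        ("fq", f"merchant:{quoted(merchant)}"),  # merchant taken from the cookie
        ("fq", f"pincode:{int(pincode)}"),       # user location from the profile
    ]


# Hypothetical values decoded from the user's cookies and profile.
params = targeted_ad_params("shop-a", 560001, ["t-shirts", "polo"])
print(urlencode(params))
```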