Modular Prediction of Rare Events to Deliver Enhanced ROI

We at RevX strive to deliver good ROI for our advertising partners. Most of our customers measure ROI in terms of eCPA (effective cost per acquisition). RevX technology intelligently buys media on programmatic exchanges, and prediction models are central to achieving the twin goals of delivering good ROI for customers and keeping the business on healthy margins.

The Difficult Part: Predicting Purchase Events

One potential approach to delivering good ROI is to build a prediction model for the purchase event and use it directly to bid, with bid value = CPA Goal x Probability of Purchase (without optimizing for click payouts). However, purchase events are very rare and hence difficult to predict. Typically, only about 1 in 10,000 impressions eventually leads to a purchase, so there are not enough training samples for the positive class.
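To make the arithmetic concrete, here is a minimal sketch of that bid formula. The function name and the example CPA goal are illustrative assumptions, not RevX's actual code or numbers; only the formula and the 1-in-10,000 rate come from the text above.

```python
def purchase_based_bid(cpa_goal: float, p_purchase: float) -> float:
    """Bid for one impression: CPA goal times predicted purchase probability."""
    if not 0.0 <= p_purchase <= 1.0:
        raise ValueError("p_purchase must be a probability in [0, 1]")
    return cpa_goal * p_purchase


# Hypothetical CPA goal of 20.00 with the 1-in-10,000 purchase rate mentioned above.
bid_per_impression = purchase_based_bid(cpa_goal=20.0, p_purchase=1 / 10_000)

# Media is usually priced per thousand impressions (CPM), so scale accordingly.
bid_cpm = bid_per_impression * 1000
```

With probabilities this small, a modest absolute error in the predicted purchase probability becomes a large relative error in the bid, which is why the quality of the purchase model matters so much here.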

We experiment internally with many approaches and algorithms to tackle this challenging problem. One such approach is oversampling the rare class and then readjusting the predictions by the oversampling ratio. However, in our experience, oversampling has not proved to be a powerful technique for predicting rare events such as a purchase or sale.
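For readers unfamiliar with the readjustment step, the sketch below shows one common way to do it. It assumes positives were replicated by a known factor (called `ratio` here, a hypothetical name), which multiplies the odds seen in training by that factor; dividing the odds back out recovers a probability on the original scale. This is a generic correction, not necessarily the exact one used at RevX.

```python
def correct_for_oversampling(p_model: float, ratio: float) -> float:
    """Map a probability learned on oversampled data back to the original base rate.

    p_model: probability predicted by the model trained on oversampled data.
    ratio:   factor by which positive examples were replicated (assumed known).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Oversampling positives by `ratio` multiplies the odds by `ratio`,
    # so the corrected odds are odds / ratio, rewritten here as a probability.
    return p_model / (p_model + ratio * (1.0 - p_model))


# Hypothetical example: a model trained with positives oversampled 100x
# predicts 1% on some impression; the corrected estimate is far smaller.
corrected = correct_for_oversampling(p_model=0.01, ratio=100.0)
```

The correction fixes the scale of the predictions, but it cannot add information: the model has still seen the same small set of distinct positive examples.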


Powerful Models: Blessing or Curse?

It is tempting to add more data signals in the hope that a complex model can “powerfully” learn rare events, but this may make things worse. The “curse of dimensionality” causes more issues, especially for a rare-event problem: to learn reasonably well, the model requires huge amounts of training data. We experimented with a single model that used a large set of attributes to predict the purchase event, but such models fell short of expectations in both quality and accuracy of prediction.

Eventually, we adopted a modular approach in which we divided the problem into two parts: one model predicts click probability, and a second model assesses the quality of the click in terms of conversion/ROI.

RevX Approach: Powerful CTR Model + Simpler Conversion Model

We use a fairly powerful click prediction model that predicts clicks with good AUC (a measure of model quality) and good over-prediction scores. These models are fairly stable and accurate, and they leverage various user, publisher, and campaign attributes. However, for campaigns that measure ROI in terms of cost per transaction, click-based prediction has a problem: it does not take the quality of clicks into account. In the mobile app world, many apps and ad formats give high CTR but do not result in conversions. A click prediction model will bid high for such high-CTR inventory, and the campaign will end up buying impressions at high cost, resulting in higher CPA and lower ROI.


RevX solves this problem by deploying a simpler model for ROI optimization. We do not use this model to predict conversion rates with high precision. Rather, we use it only to filter out bad inventory, that is, inventory with high CTR and a low conversion rate. This also tackles the click fraud problem to a large extent; a click has no value if it cannot drive a transaction and ROI for customers.

Having a simpler model with only a few important signals eases the curse of dimensionality, especially when predicting a rare event like a purchase. Moreover, shifting the role of the conversion model from bid pricing to bid filtering significantly reduces the prediction burden on the difficult-to-learn model.
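The division of labor between the two models can be sketched roughly as follows. Everything here is an illustrative assumption: the names, the thresholds, and the use of a plain conversion-rate threshold as a stand-in for the simpler conversion model. The point is only that the click model sets the price while the conversion signal acts as a yes/no filter.

```python
from dataclasses import dataclass


@dataclass
class InventoryStats:
    """Observed history for one piece of inventory (hypothetical structure)."""
    clicks: int
    conversions: int


def is_bad_inventory(stats: InventoryStats,
                     min_clicks: int = 500,
                     min_cvr: float = 0.001) -> bool:
    """Flag inventory that has enough clicks to judge but converts too rarely."""
    if stats.clicks < min_clicks:
        return False  # too little evidence to filter; let the click model decide
    return stats.conversions / stats.clicks < min_cvr


def modular_bid(p_click: float, value_per_click: float,
                stats: InventoryStats) -> float:
    """Price with the click model; use the conversion signal only as a filter."""
    if is_bad_inventory(stats):
        return 0.0  # do not bid on high-CTR, low-conversion inventory
    return p_click * value_per_click


# Same predicted CTR, very different conversion history (made-up numbers).
print(modular_bid(0.05, 0.40, InventoryStats(clicks=10_000, conversions=1)))
print(modular_bid(0.05, 0.40, InventoryStats(clicks=10_000, conversions=50)))
```

Because the filter only needs to separate clearly bad inventory from the rest, it can tolerate a much coarser conversion estimate than a model whose output is multiplied directly into the bid.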

At RevX, we are constantly exploring and experimenting with more data science techniques to make digital advertising more intelligent and effective.

Sandip Acharyya