Paper Title
Visual Encoding and Debiasing for CTR Prediction
Paper Authors
Paper Abstract
Extracting expressive visual features is crucial for accurate Click-Through Rate (CTR) prediction in visual search advertising systems. Current commercial systems use off-the-shelf visual encoders to facilitate fast online serving. However, the extracted visual features are coarse-grained and/or biased. In this paper, we present a visual encoding framework for CTR prediction to overcome these problems. The framework is based on contrastive learning, which pulls positive pairs closer and pushes negative pairs apart in the visual feature space. To obtain fine-grained visual features, we present contrastive learning supervised by click-through data to fine-tune the visual encoder. To reduce sample selection bias, we first train the visual encoder offline by leveraging both unbiased self-supervision and click supervision signals. Second, we incorporate a debiasing network in the online CTR predictor to adjust the visual features by contrasting high-impression items with selected items that have lower impressions. We deploy the framework in the visual sponsored search system at Alibaba. Offline experiments on billion-scale datasets and online experiments demonstrate that the proposed framework makes accurate and unbiased predictions.
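To illustrate the "pull positives closer, push negatives apart" objective the abstract describes, here is a minimal sketch of an InfoNCE-style contrastive loss. This is an assumption for illustration, not the paper's exact loss: the function name `info_nce_loss`, the cosine-similarity choice, and the temperature value are all hypothetical, with clicked items standing in for positives and non-clicked items for negatives.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for one anchor embedding (illustrative sketch).

    anchor:    (d,) embedding of the query item
    positive:  (d,) embedding of a clicked (positive) item
    negatives: (n, d) embeddings of non-clicked (negative) items
    """
    # Temperature-scaled cosine similarity between two vectors
    def sim(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) * temperature)

    pos_sim = sim(anchor, positive)
    neg_sims = np.array([sim(anchor, n) for n in negatives])

    # Softmax cross-entropy over [positive, negatives]: minimizing this
    # pulls the positive pair closer and pushes negative pairs apart.
    logits = np.concatenate([[pos_sim], neg_sims])
    return -pos_sim + np.log(np.exp(logits).sum())
```

A well-aligned positive yields a lower loss than a mismatched one, which is the gradient signal that fine-tunes the visual encoder on click data.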