Paper Title
OptEmbed: Learning Optimal Embedding Table for Click-through Rate Prediction
Paper Authors
Paper Abstract
Learning the embedding table plays a fundamental role in click-through rate (CTR) prediction, from the perspective of both model performance and memory usage. The embedding table is a two-dimensional tensor whose axes indicate the number of feature values and the embedding dimension, respectively. To learn an efficient and effective embedding table, recent works either assign various embedding dimensions to feature fields, reduce the number of embeddings, or mask the embedding table parameters. However, none of these existing works can obtain an optimal embedding table. On the one hand, assigning various embedding dimensions still requires a large amount of memory due to the vast number of features in the dataset. On the other hand, decreasing the number of embeddings usually causes performance degradation, which is intolerable in CTR prediction. Finally, pruning embedding parameters leads to a sparse embedding table, which is hard to deploy. To this end, we propose an optimal embedding table learning framework, OptEmbed, which provides a practical and general method to find an optimal embedding table for various base CTR models. Specifically, we propose pruning redundant embeddings according to the corresponding features' importance via learnable pruning thresholds. Furthermore, we regard each assignment of embedding dimensions to fields as a single candidate architecture. To efficiently search for the optimal embedding dimensions, we design a uniform embedding-dimension sampling scheme that trains all candidate architectures equally, meaning that architecture-related parameters and learnable thresholds are trained simultaneously in one supernet. We then propose an evolutionary search method based on the supernet to find the optimal embedding dimension for each field. Experiments on public datasets show that OptEmbed can learn a compact embedding table that further improves model performance.
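The two mechanisms the abstract describes — pruning embedding rows by feature importance against a threshold, and sampling one embedding dimension per field to form a candidate architecture — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the L1-norm importance measure, and the use of a fixed scalar threshold (OptEmbed learns its thresholds jointly with the model) are all assumptions made for the sketch.

```python
import numpy as np

def prune_rows(table: np.ndarray, threshold: float) -> np.ndarray:
    """Zero out embedding rows whose importance falls below a threshold.

    Importance is taken here as the row's L1 norm (an assumption);
    in OptEmbed the threshold is a learnable parameter, but a fixed
    scalar suffices for illustration.
    """
    importance = np.abs(table).sum(axis=1)   # one score per feature value
    keep = importance > threshold
    return table * keep[:, None]             # broadcast the row mask

def sample_dim_mask(num_fields: int, max_dim: int, rng: np.random.Generator):
    """Uniformly sample one embedding dimension per field and return a
    binary column mask: one candidate architecture for supernet training."""
    dims = rng.integers(1, max_dim + 1, size=num_fields)  # dim in [1, max_dim]
    cols = np.arange(max_dim)
    mask = (cols[None, :] < dims[:, None]).astype(float)  # keep first d columns
    return mask, dims

# Tiny example: a 4-feature, 3-dimensional embedding table.
table = np.array([
    [0.5, -0.4, 0.3],    # importance 1.2  -> kept
    [0.01, 0.02, 0.0],   # importance 0.03 -> pruned
    [1.0, 1.0, 1.0],     # importance 3.0  -> kept
    [0.1, -0.1, 0.05],   # importance 0.25 -> pruned
])
pruned = prune_rows(table, threshold=0.5)

rng = np.random.default_rng(0)
mask, dims = sample_dim_mask(num_fields=5, max_dim=8, rng=rng)
```

During supernet training, a fresh dimension mask would be sampled per step so that every candidate architecture is trained equally; the evolutionary search then evaluates candidates against the shared supernet weights.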