Paper Title
Infinite Recommendation Networks: A Data-Centric Approach
Paper Authors
Paper Abstract
We leverage the Neural Tangent Kernel (NTK) and its equivalence to training infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers. The outcome is a highly expressive yet simple recommendation model with a single hyper-parameter and a closed-form solution. Leveraging $\infty$-AE's simplicity, we also develop Distill-CF for synthesizing tiny, high-fidelity data summaries which distill the most important knowledge from the extremely large and sparse user-item interaction matrix for efficient and accurate subsequent data usage such as model training, inference, and architecture search. This takes a data-centric approach to recommendation, where we aim to improve the quality of logged user-feedback data for subsequent modeling, independent of the learning algorithm. We particularly utilize the concept of differentiable Gumbel-sampling to handle the inherent data heterogeneity, sparsity, and semi-structuredness, while remaining scalable to datasets with hundreds of millions of user-item interactions. Both of our proposed approaches significantly outperform their respective state-of-the-art, and when used together, we observe 96-105% of $\infty$-AE's performance on the full dataset with as little as 0.1% of the original dataset size, leading us to explore the counter-intuitive question: Is more data what you need for better recommendation?
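The closed-form solution referred to above is, in essence, kernel regression with the NTK of the autoencoder architecture. Below is a minimal, hypothetical sketch (not the authors' released code) using JAX and the neural-tangents library; the two-layer ReLU architecture, the function name `infty_ae_fit_predict`, and the single ridge regularizer `lam` are illustrative assumptions.

```python
# Hypothetical sketch of a closed-form infinite-width autoencoder solve via the NTK.
# Assumptions: neural-tangents for the kernel, a ReLU bottleneck, and
# ridge-style kernel regression with a single hyper-parameter `lam`.
import jax.numpy as jnp
from neural_tangents import stax

# Infinitely-wide autoencoder; the width argument of Dense is
# irrelevant in the infinite-width limit used by kernel_fn.
_, _, kernel_fn = stax.serial(stax.Dense(1), stax.Relu(), stax.Dense(1))

def infty_ae_fit_predict(X, lam):
    """X: (num_users, num_items) binary interaction matrix.
    Returns reconstructed scores for every user-item pair."""
    K = kernel_fn(X, X, 'ntk')                                  # user-user NTK Gram matrix
    alpha = jnp.linalg.solve(K + lam * jnp.eye(K.shape[0]), X)  # closed-form solve
    return K @ alpha                                            # kernel-regression reconstruction
```

Similarly, the "differentiable Gumbel-sampling" used by Distill-CF builds on the standard Gumbel-softmax trick, which lets gradients flow through a relaxed discrete sampling step. A generic version of that trick, not the paper's exact formulation, looks like:

```python
import jax
import jax.numpy as jnp

def gumbel_softmax(key, logits, tau=0.5):
    """Approximately one-hot, differentiable sample over candidates;
    `logits` and the temperature `tau` are illustrative parameters."""
    u = jax.random.uniform(key, logits.shape, minval=1e-20, maxval=1.0)
    g = -jnp.log(-jnp.log(u))                      # Gumbel(0, 1) noise
    return jax.nn.softmax((logits + g) / tau, axis=-1)
```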