Paper Title

Long-Tailed Learning Requires Feature Learning

Authors

Thomas Laurent, James H. von Brecht, Xavier Bresson

Abstract

We propose a simple data model inspired from natural data such as text or images, and use it to study the importance of learning features in order to achieve good generalization. Our data model follows a long-tailed distribution in the sense that some rare subcategories have few representatives in the training set. In this context we provide evidence that a learner succeeds if and only if it identifies the correct features, and moreover derive non-asymptotic generalization error bounds that precisely quantify the penalty that one must pay for not learning features.
