Paper Title

Exposing Outlier Exposure: What Can Be Learned From Few, One, and Zero Outlier Images

Authors

Liznerski, Philipp, Ruff, Lukas, Vandermeulen, Robert A., Franks, Billy Joe, Müller, Klaus-Robert, Kloft, Marius

Abstract

Due to the intractability of characterizing everything that looks unlike the normal data, anomaly detection (AD) is traditionally treated as an unsupervised problem utilizing only normal samples. However, it has recently been found that unsupervised image AD can be drastically improved through the utilization of huge corpora of random images to represent anomalousness, a technique known as Outlier Exposure. In this paper we show that specialized AD learning methods seem unnecessary for state-of-the-art performance, and furthermore one can achieve strong performance with just a small collection of Outlier Exposure data, contradicting common assumptions in the field of AD. We find that standard classifiers and semi-supervised one-class methods trained to discern between normal samples and relatively few random natural images are able to outperform the current state of the art on an established AD benchmark with ImageNet. Further experiments reveal that even one well-chosen outlier sample is sufficient to achieve decent performance on this benchmark (79.3% AUC). We investigate this phenomenon and find that one-class methods are more robust to the choice of training outliers, indicating that there are scenarios where these are still more useful than standard classifiers. Additionally, we include experiments that delineate the scenarios where our results hold. Lastly, no training samples are necessary when one uses the representations learned by CLIP, a recent foundation model, which achieves state-of-the-art AD results on CIFAR-10 and ImageNet in a zero-shot setting.
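To make the zero-shot scheme the abstract alludes to concrete, the following is a minimal sketch (not the authors' exact pipeline): an image is scored by comparing its embedding against a text prompt for the normal class versus a generic contrastive prompt, with a softmax over the two similarities yielding an anomaly score. The prompt wordings, the temperature, and the mock 4-dimensional embeddings standing in for real CLIP features are all assumptions made for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between each row of `a` and each row of `b`."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def zero_shot_anomaly_score(image_emb, normal_text_emb, other_text_emb,
                            temperature=0.01):
    """Anomaly score = softmax mass on the generic 'something' prompt.

    image_emb:        (n, d) image embeddings
    normal_text_emb:  (d,)   embedding of a prompt describing the normal class
    other_text_emb:   (d,)   embedding of a generic contrastive prompt
    """
    prompts = np.stack([normal_text_emb, other_text_emb])   # (2, d)
    logits = cosine_sim(image_emb, prompts) / temperature   # (n, 2)
    logits -= logits.max(axis=1, keepdims=True)             # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[:, 1]  # probability mass on the "anomalous" prompt

# Toy demo with mock embeddings in place of real CLIP features.
rng = np.random.default_rng(0)
normal_prompt = np.array([1.0, 0.0, 0.0, 0.0])  # e.g. "a photo of a <class>"
other_prompt = np.array([0.0, 1.0, 0.0, 0.0])   # e.g. "a photo of something"
normal_images = normal_prompt + 0.1 * rng.normal(size=(5, 4))
anomalies = other_prompt + 0.1 * rng.normal(size=(5, 4))

scores_normal = zero_shot_anomaly_score(normal_images, normal_prompt, other_prompt)
scores_anom = zero_shot_anomaly_score(anomalies, normal_prompt, other_prompt)
```

In this toy setup the normal images receive low scores and the synthetic anomalies high ones; with real CLIP features the same ranking principle yields the zero-shot AUCs reported above, with no training samples required.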
