Paper Title

Training OOD Detectors in their Natural Habitats

Paper Authors

Julian Katz-Samuels, Julia Nakhleh, Robert Nowak, Yixuan Li

Paper Abstract

Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild. Recent methods use auxiliary outlier data to regularize the model for improved OOD detection. However, these approaches make a strong distributional assumption that the auxiliary outlier data is completely separable from the in-distribution (ID) data. In this paper, we propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples. Such wild data is abundant and arises freely upon deploying a machine learning classifier in their natural habitats. Our key idea is to formulate a constrained optimization problem and to show how to tractably solve it. Our learning objective maximizes the OOD detection rate, subject to constraints on the classification error of ID data and on the OOD error rate of ID examples. We extensively evaluate our approach on common OOD detection tasks and demonstrate superior performance.
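
The constrained learning objective described in the abstract can be sketched in symbols as follows. This is an illustrative formalization, not necessarily the paper's exact notation: the detector g_θ, classifier f_θ, sample sizes n and m, and tolerances α and τ are all names assumed here for the sake of the example.

```latex
% Illustrative notation (assumed, not taken from the abstract):
%   g_\theta : OOD detector with outputs in {in, out}
%   f_\theta : classifier over the ID classes
%   (x_j, y_j), j = 1..n : labeled ID samples
%   \tilde{x}_i, i = 1..m : unlabeled wild samples (a mixture of ID and OOD)
%   \alpha, \tau : tolerances on the two ID error rates
\begin{aligned}
\min_{\theta}\quad
  & \frac{1}{m}\sum_{i=1}^{m} \mathbb{1}\{\, g_\theta(\tilde{x}_i) = \text{in} \,\} \\
\text{s.t.}\quad
  & \frac{1}{n}\sum_{j=1}^{n} \mathbb{1}\{\, g_\theta(x_j) = \text{out} \,\} \le \alpha, \\
  & \frac{1}{n}\sum_{j=1}^{n} \mathbb{1}\{\, f_\theta(x_j) \ne y_j \,\} \le \tau .
\end{aligned}
```

Under this reading, minimizing the fraction of wild samples labeled as ID serves as a proxy for maximizing the OOD detection rate: the first constraint caps how often true ID examples are flagged as OOD, so the detector can only lower the objective by flagging the OOD portion of the wild mixture. The second constraint keeps the ID classification error bounded.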
