Paper Title
Second-Order Asymptotically Optimal Outlier Hypothesis Testing
Paper Authors
Paper Abstract
We revisit the outlier hypothesis testing framework of Li \emph{et al.} (TIT 2014) and derive fundamental limits for optimal tests under the generalized Neyman-Pearson criterion. In outlier hypothesis testing, one is given multiple observed sequences, most of which are generated i.i.d. from a nominal distribution. The task is to identify the set of outlying sequences, which are generated from anomalous distributions. Both the nominal and anomalous distributions are \emph{unknown}. We study the tradeoff among the probabilities of misclassification error, false alarm, and false reject for tests that satisfy weak conditions on the rate at which these error probabilities decrease as a function of the sequence length. Specifically, we propose a threshold-based test that ensures exponential decay of the misclassification error and false alarm probabilities. We study two constraints on the false reject probability: either it is a non-vanishing constant, or it decays exponentially. In both cases, we characterize bounds on the false reject probability as a function of the threshold for each pair of nominal and anomalous distributions, and we demonstrate the optimality of our test under the generalized Neyman-Pearson criterion. We first consider the case of at most one outlying sequence and then generalize our results to multiple outlying sequences, where the number of outlying sequences is unknown and each outlying sequence can follow a different anomalous distribution.
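The abstract does not spell out the test statistic, so the following is only a minimal sketch of what a threshold-based test for at most one outlier could look like. It assumes a finite alphabet and a scoring function in the style of the universal outlier testing literature: for each candidate outlier, sum the KL divergences from the empirical distribution of every other sequence to the average empirical distribution of those other sequences. The function names, the threshold `lam`, and the example distributions are all hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def empirical_pmf(seq, alphabet):
    """Empirical distribution (type) of a sequence over a finite alphabet."""
    counts = Counter(seq)
    n = len(seq)
    return [counts[a] / n for a in alphabet]


def kl(p, q):
    """KL divergence D(p || q) in nats; terms with p(a) = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def scores(seqs, alphabet):
    """Assumed scoring function: for candidate outlier i, sum over j != i of
    D(T_j || average of T_k for k != i). A small score means the remaining
    sequences look like they share a single distribution."""
    types = [empirical_pmf(s, alphabet) for s in seqs]
    out = []
    for i in range(len(types)):
        rest = types[:i] + types[i + 1:]
        avg = [sum(col) / len(rest) for col in zip(*rest)]
        out.append(sum(kl(t, avg) for t in rest))
    return out


def threshold_test(seqs, alphabet, lam):
    """Return the index of the declared outlier, or None for 'no outlier'.
    The candidate with the smallest score is declared only if the
    second-smallest score exceeds the (hypothetical) threshold lam."""
    s = scores(seqs, alphabet)
    order = sorted(range(len(s)), key=s.__getitem__)
    best, runner_up = order[0], order[1]
    return best if s[runner_up] > lam else None


def sample(pmf, n, rng):
    """Draw n i.i.d. symbols from {0, ..., len(pmf) - 1} according to pmf."""
    return rng.choices(range(len(pmf)), weights=pmf, k=n)


if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = [0, 1, 2]
    nominal = [0.5, 0.3, 0.2]    # illustrative values only
    anomalous = [0.1, 0.2, 0.7]  # illustrative values only
    seqs = [sample(nominal, 2000, rng) for _ in range(5)]
    seqs[2] = sample(anomalous, 2000, rng)
    print(threshold_test(seqs, alphabet, lam=0.05))
```

In this sketch, raising `lam` makes the test more reluctant to declare an outlier, which mirrors the tradeoff the abstract describes: fewer false alarms and misclassifications at the cost of more false rejects. The test itself never sees `nominal` or `anomalous`; they are used only to generate data, consistent with both distributions being unknown.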