论文标题
最佳更改点检测和本地化
Optimal Change-Point Detection and Localization
论文作者
论文摘要
给定$ \ mathbb {r}^n $中的times系列$ {\ bf y} $,具有零件的均值和独立组件,更改点检测和变更点本地化的双问题分别构成了时间的存在,以检测平均变化点的平均值和估计这些变更点位置的时间。在这项工作中,我们严格地表征了问题的最佳速率,并发现了从全球测试问题到局部估计问题的相变现象。引入更改点的能量的适当定义,我们首先在单个更改点设置中建立最佳检测阈值为$ \ sqrt {2 \ log \ log \ log \ log \ log(n)} $。当能量刚好在检测阈值之上时,将更改点定位的问题纯粹是参数:它仅取决于平均值的差异,而不再取决于更改点的位置。有趣的是,对于大多数变更点的位置,可以在能量水平较小的情况下检测和定位它们。在多个更改点设置中,我们建立了能量检测阈值,并类似地表明,特定更改点的最佳定位误差纯粹是参数。在此过程中,还建立了所有变更点位置的向量的Hausdorff和$ L_1 $估计损失的最佳最佳费率。引入了两个实现这些最佳速率的程序。第一个是最小二乘的估计器,其新的多尺度罚款有利于分布良好的变化点。第二个是两步的多尺度后处理过程,其计算复杂性可以低至$ o(n \ log(n))$。值得注意的是,这两个程序可能存在许多可能的低能,因此无法检测到的更改点,即使存在这些滋扰参数,也仍然能够检测和定位高能量更改点。
Given a times series ${\bf Y}$ in $\mathbb{R}^n$, with a piece-wise contant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is $\sqrt{2\log\log(n)}$. When the energy is just above the detection threshold, then the problem of localizing the change-point becomes purely parametric: it only depends on the difference in means and not on the position of the change-point anymore. Interestingly, for most change-point positions, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight optimal rates for Hausdorff and $l_1$ estimation losses of the vector of all change-points positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as $O(n\log(n))$. Notably, these two procedures accommodate with the presence of possibly many low-energy and therefore undetectable change-points and are still able to detect and localize high-energy change-points even with the presence of those nuisance parameters.