最佳更改点检测和本地化

论文标题

最佳更改点检测和本地化

Optimal Change-Point Detection and Localization

论文作者

Verzelen, Nicolas, Fromont, Magalie, Lerasle, Matthieu, Reynaud-Bouret, Patricia

论文摘要

给定$ \ mathbb {r}^n $中的times系列$ {\ bf y} $，具有零件的均值和独立组件，更改点检测和变更点本地化的双问题分别构成了时间的存在，以检测平均变化点的平均值和估计这些变更点位置的时间。在这项工作中，我们严格地表征了问题的最佳速率，并发现了从全球测试问题到局部估计问题的相变现象。引入更改点的能量的适当定义，我们首先在单个更改点设置中建立最佳检测阈值为$ \ sqrt {2 \ log \ log \ log \ log \ log（n）} $。当能量刚好在检测阈值之上时，将更改点定位的问题纯粹是参数：它仅取决于平均值的差异，而不再取决于更改点的位置。有趣的是，对于大多数变更点的位置，可以在能量水平较小的情况下检测和定位它们。在多个更改点设置中，我们建立了能量检测阈值，并类似地表明，特定更改点的最佳定位误差纯粹是参数。在此过程中，还建立了所有变更点位置的向量的Hausdorff和$ L_1 $估计损失的最佳最佳费率。引入了两个实现这些最佳速率的程序。第一个是最小二乘的估计器，其新的多尺度罚款有利于分布良好的变化点。第二个是两步的多尺度后处理过程，其计算复杂性可以低至$ o（n \ log（n））$。值得注意的是，这两个程序可能存在许多可能的低能，因此无法检测到的更改点，即使存在这些滋扰参数，也仍然能够检测和定位高能量更改点。

Given a times series ${\bf Y}$ in $\mathbb{R}^n$, with a piece-wise contant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is $\sqrt{2\log\log(n)}$. When the energy is just above the detection threshold, then the problem of localizing the change-point becomes purely parametric: it only depends on the difference in means and not on the position of the change-point anymore. Interestingly, for most change-point positions, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight optimal rates for Hausdorff and $l_1$ estimation losses of the vector of all change-points positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as $O(n\log(n))$. Notably, these two procedures accommodate with the presence of possibly many low-energy and therefore undetectable change-points and are still able to detect and localize high-energy change-points even with the presence of those nuisance parameters.

下载PDF全文

下载文献需遵守相关版权规定

论文标题