论文标题
基于Frank-Wolfe的算法,用于近似泰勒的M-估计器
Frank-Wolfe-based Algorithms for Approximating Tyler's M-estimator
论文作者
论文摘要
泰勒(Tyler)的M-估计器是一种众所周知的程序,用于稳健且重尾的协方差估计。泰勒本人提出了一种用于计算其估计器的迭代固定点算法,但是,它需要超级线性(按数据的大小)运行时进行运行时,这可能会大大刺激。据我们所知,我们在这项工作中提出了第一个用于计算泰勒估算器的算法。一个变体使用标准的Frank-Wolfe步骤,第二个变体还考虑\ textit {away-steps}(afw),第三个是afw(gafw)的\ textit {geodesic}版本。事实证明,AFW要求最多需要日志因子,仅在迭代中只有线性时间,而GAFW则以线性时间(最高为日志系数)运行,以$ n $(数据点数)制度。尽管基础优化问题不是凸或平滑,但所有三个变体都显示出以标准假设为标准假设的最佳解决方案。在额外的相当温和的假设下,当(归一化)数据点为I.I.D时,它具有概率1。事实证明,来自整个单元球,AFW和GAFW的连续分布的样品被证明会以线性速率收敛。重要的是,这三个变体都是无参数的,并且使用自适应步骤尺寸。
Tyler's M-estimator is a well known procedure for robust and heavy-tailed covariance estimation. Tyler himself suggested an iterative fixed-point algorithm for computing his estimator however, it requires super-linear (in the size of the data) runtime per iteration, which maybe prohibitive in large scale. In this work we propose, to the best of our knowledge, the first Frank-Wolfe-based algorithms for computing Tyler's estimator. One variant uses standard Frank-Wolfe steps, the second also considers \textit{away-steps} (AFW), and the third is a \textit{geodesic} version of AFW (GAFW). AFW provably requires, up to a log factor, only linear time per iteration, while GAFW runs in linear time (up to a log factor) in a large $n$ (number of data-points) regime. All three variants are shown to provably converge to the optimal solution with sublinear rate, under standard assumptions, despite the fact that the underlying optimization problem is not convex nor smooth. Under an additional fairly mild assumption, that holds with probability 1 when the (normalized) data-points are i.i.d. samples from a continuous distribution supported on the entire unit sphere, AFW and GAFW are proved to converge with linear rates. Importantly, all three variants are parameter-free and use adaptive step-sizes.