Paper Title
A Comparison of Hamming Errors of Representative Variable Selection Methods
Paper Authors
Paper Abstract
Lasso is a celebrated method for variable selection in linear models, but it faces challenges when the variables are moderately or strongly correlated. This motivates alternative approaches such as using a non-convex penalty, adding ridge regularization, or applying post-Lasso thresholding. In this paper, we compare Lasso with five other methods: Elastic net, SCAD, forward selection, thresholded Lasso, and forward-backward selection. We measure their performance theoretically by the expected Hamming error, assuming that the regression coefficients are drawn i.i.d. from a two-point mixture and that the Gram matrix is block-wise diagonal. By deriving the rates of convergence of the Hamming errors and the corresponding phase diagrams, we obtain useful conclusions about the pros and cons of the different methods.
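To make the setting in the abstract concrete, the following is a minimal simulation sketch, not the paper's theoretical analysis. It assumes 2x2 diagonal blocks with a common correlation `rho`, unit noise variance, a two-point mixture on {0, `tau`}, and a Hamming error that counts support mismatches; none of these specifics are stated in the abstract. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def draw_two_point_coefficients(p, eps, tau, rng):
    """Draw beta_1..beta_p i.i.d.: tau with probability eps, else 0."""
    return [tau if rng.random() < eps else 0.0 for _ in range(p)]


def block_diagonal_gram(p, rho):
    """Gram matrix built from 2x2 blocks [[1, rho], [rho, 1]]; p must be even."""
    gram = [[0.0] * p for _ in range(p)]
    for j in range(0, p, 2):
        gram[j][j] = gram[j + 1][j + 1] = 1.0
        gram[j][j + 1] = gram[j + 1][j] = rho
    return gram


def noisy_correlations(gram, beta, rho, rng):
    """Simulate X'y = G beta + X'z, with X'z ~ N(0, G) drawn block by block."""
    out = []
    for j in range(0, len(beta), 2):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        noise = (z1, rho * z1 + math.sqrt(1 - rho * rho) * z2)
        for a in range(2):
            signal = sum(gram[j + a][k] * beta[k] for k in (j, j + 1))
            out.append(signal + noise[a])
    return out


def soft_threshold(x, lam):
    """Shrink x toward zero by lam, returning zero when |x| <= lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def lasso_gram(gram, xty, lam, sweeps=50):
    """Coordinate descent for 0.5*b'Gb - b'(X'y) + lam*||b||_1."""
    p = len(xty)
    b = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            partial = xty[j] - sum(gram[j][k] * b[k] for k in range(p) if k != j)
            b[j] = soft_threshold(partial, lam) / gram[j][j]
    return b


def hard_threshold(b, t):
    """Post-Lasso thresholding: zero out entries with magnitude below t."""
    return [v if abs(v) >= t else 0.0 for v in b]


def hamming_error(beta_hat, beta):
    """Count coordinates where the estimated and true supports disagree."""
    return sum((bh != 0) != (b != 0) for bh, b in zip(beta_hat, beta))


if __name__ == "__main__":
    rng = random.Random(0)
    p, eps, tau, rho = 100, 0.1, 4.0, 0.6  # illustrative values only
    beta = draw_two_point_coefficients(p, eps, tau, rng)
    gram = block_diagonal_gram(p, rho)
    xty = noisy_correlations(gram, beta, rho, rng)
    lasso = lasso_gram(gram, xty, lam=2.0)
    thresholded = hard_threshold(lasso_gram(gram, xty, lam=1.0), t=2.0)
    print("Lasso Hamming error:", hamming_error(lasso, beta))
    print("Thresholded Lasso Hamming error:", hamming_error(thresholded, beta))
```

Averaging `hamming_error` over many random draws would give a Monte Carlo estimate of the expected Hamming error for a given tuning choice. Working directly with the Gram matrix and `X'y` avoids constructing a design matrix, which is a simplification made here for brevity.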