Paper title
Varying Coefficient Linear Discriminant Analysis for Dynamic Data
Paper authors
Abstract
Linear discriminant analysis (LDA) is an important classification tool in statistics and machine learning. This paper investigates the varying coefficient LDA model for dynamic data, in which the Bayes discriminant direction is a function of some exposure variable, to account for heterogeneity. We propose a new least-squares estimation method based on B-spline approximation. The resulting data-driven discriminant procedure is more computationally efficient than the dynamic linear programming rule \citep{jiang2020dynamic}. We also establish convergence rates for the corresponding estimation error bound and the excess misclassification risk. The estimation error in $L_2$ distance is optimal in the low-dimensional regime and near-optimal in the high-dimensional regime. Numerical experiments on both synthetic and real data corroborate the superiority of the proposed classification method.
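To make the setup in the abstract concrete, the following is a minimal sketch of a varying coefficient LDA model and its B-spline approximation. All symbols here ($X$, $Y$, $U$, $\mu_k$, $\Sigma$, $\beta$, $B_k$, $\gamma_{jk}$, $K$) are assumed notation introduced for illustration, and the two-class, equal-prior, Gaussian setting is an assumption; the abstract does not state the paper's exact formulation or its least-squares criterion.

\[
X \mid (Y = k,\ U = u) \sim N\bigl(\mu_k(u),\ \Sigma(u)\bigr), \quad k \in \{1, 2\},
\qquad
\beta(u) = \Sigma(u)^{-1}\bigl(\mu_1(u) - \mu_2(u)\bigr).
\]

Under these assumptions, the Bayes rule assigns an observation $(x, u)$ to class 1 when

\[
\Bigl(x - \tfrac{1}{2}\bigl(\mu_1(u) + \mu_2(u)\bigr)\Bigr)^{\top} \beta(u) > 0,
\]

so the discriminant direction $\beta(u)$ changes with the exposure variable $u$. A B-spline approximation replaces each coordinate function of $\beta$ by a finite basis expansion,

\[
\beta_j(u) \approx \sum_{k=1}^{K} \gamma_{jk}\, B_k(u), \qquad j = 1, \dots, p,
\]

where $B_1, \dots, B_K$ are B-spline basis functions. Estimating the functions $\beta_j$ then reduces to estimating the finite set of coefficients $\gamma_{jk}$, which is what makes a least-squares fit possible.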