Paper title
Clustering, multicollinearity, and singular vectors
Paper authors
Paper abstract
Let $A$ be a matrix with pseudoinverse $A^{\dagger}$, and set $S = I - A^{\dagger}A$. We prove that, after reordering the columns of $A$, the matrix $S$ takes a block-diagonal form in which each block corresponds to a set of linearly dependent columns. This allows us to identify redundant columns in $A$. We explore some applications in supervised and unsupervised learning, specifically feature selection, clustering, and the sensitivity of least-squares solutions.
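
The block structure described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes NumPy, a small synthetic matrix built for illustration, and a hypothetical helper `dependency_groups` that groups columns by the nonzero pattern of $S$. The tolerances `tol` and `rcond` are arbitrary choices for this example.

```python
import numpy as np


def dependency_groups(A, tol=1e-8, rcond=1e-10):
    """Hypothetical helper: compute S = I - pinv(A) @ A and group columns
    whose entries in S are linked by values larger than `tol`."""
    n = A.shape[1]
    S = np.eye(n) - np.linalg.pinv(A, rcond=rcond) @ A
    linked = np.abs(S) > tol  # adjacency pattern between columns of A

    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over the nonzero pattern of S.
        seen[start] = True
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in np.flatnonzero(linked[i]):
                j = int(j)
                if not seen[j]:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(component))
    return S, groups


# Synthetic example (an assumption for illustration, not data from the paper):
# column 2 = column 0 + column 1, column 5 = 2 * column 4, column 3 is independent.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 4))
A = np.column_stack([
    B[:, 0], B[:, 1], B[:, 0] + B[:, 1],
    B[:, 2],
    B[:, 3], 2.0 * B[:, 3],
])

S, groups = dependency_groups(A)
print(np.round(S, 3))
print(groups)
```

By construction, columns 0 to 2 and columns 4 to 5 form two linearly dependent sets, so the printed $S$ should show two nonzero diagonal blocks, while the row and column for the independent column 3 should be numerically zero. A singleton group with a zero diagonal entry in $S$ indicates a column that takes part in no linear dependency.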