Paper title
Tangent phylogenetic PCA
Paper authors
Paper abstract
Phylogenetic PCA (p-PCA) is a version of PCA for observations that are leaf nodes of a phylogenetic tree. p-PCA accounts for the fact that such observations are not independent, due to shared evolutionary history. The method works on Euclidean data, but in evolutionary biology there is a need to apply it to data on manifolds, particularly shapes. We provide a generalization of p-PCA to data lying on Riemannian manifolds, called Tangent p-PCA. Tangent p-PCA thus makes it possible to perform dimension reduction on a data set of shapes, taking into account both the non-linear structure of the shape space and the phylogenetic covariance. We show simulation results on the sphere, demonstrating well-behaved error distributions and fast convergence of estimators. Furthermore, we apply the method to a data set of mammal jaws, represented as points on a landmark manifold equipped with the LDDMM metric.
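The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading on the unit sphere, not the authors' implementation. It assumes a Brownian-motion style phylogenetic covariance matrix `C` between leaves, estimates a root point by repeating a generalized least squares (GLS) mean in the tangent space, and then runs the standard Euclidean p-PCA eigendecomposition on the log-mapped data. The function names `sphere_log`, `sphere_exp`, and `tangent_ppca`, as well as the toy tree and data, are hypothetical.

```python
import numpy as np


def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    c = np.clip(q @ p, -1.0, 1.0)
    v = q - c * p                      # component of q orthogonal to p
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return np.zeros_like(p)
    return np.arccos(c) * v / norm     # length equals the geodesic distance


def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p along v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return p
    return np.cos(norm) * p + np.sin(norm) * v / norm


def tangent_ppca(X, C, n_iter=50, tol=1e-10):
    """Sketch of p-PCA in a tangent space (assumed procedure, see lead-in).

    X: (n, d) array of unit vectors, one row per leaf.
    C: (n, n) phylogenetic covariance between leaves (assumed given).
    Returns the estimated root point, eigenvalues, and eigenvectors.
    """
    n = X.shape[0]
    C_inv = np.linalg.inv(C)
    ones = np.ones(n)
    denom = ones @ C_inv @ ones
    mu = X[0] / np.linalg.norm(X[0])   # initial guess for the root
    for _ in range(n_iter):
        V = np.array([sphere_log(mu, x) for x in X])
        a = (ones @ C_inv @ V) / denom  # GLS mean in the tangent space
        mu = sphere_exp(mu, a)          # move the base point toward it
        if np.linalg.norm(a) < tol:
            break
    # Tangent coordinates at the final base point.
    V = np.array([sphere_log(mu, x) for x in X])
    a = (ones @ C_inv @ V) / denom
    D = V - a
    # Evolutionary rate matrix, as in Euclidean p-PCA.
    R = D.T @ C_inv @ D / (n - 1)
    evals, evecs = np.linalg.eigh(R)
    order = np.argsort(evals)[::-1]     # largest eigenvalue first
    return mu, evals[order], evecs[:, order]


# Toy example: four leaves on a balanced tree ((A,B),(C,D)), total depth 1.
C = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])
X = np.array([[0.10, 0.20, 1.0],
              [0.15, 0.10, 1.0],
              [-0.20, 0.05, 1.0],
              [-0.10, -0.15, 1.0]])
X = X / np.linalg.norm(X, axis=1, keepdims=True)
mu, evals, evecs = tangent_ppca(X, C)
```

Because the tangent vectors are stored in ambient coordinates, the rate matrix here has one eigenvalue that is numerically zero, with eigenvector along `mu`; the remaining eigenvectors span the tangent plane and play the role of principal components. Swapping in another manifold would only require replacing the two map functions.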