Title

Comparison of REML methods for the study of phenome-wide genetic variation

Authors

Pavlyshyn, Damian; Johnstone, Iain M.; Sztepanacz, Jacqueline L.

Abstract

It is now well documented that genetic covariance between functionally related traits leads to an uneven distribution of genetic variation across multivariate trait combinations, and possibly to a large part of phenotype space that is inaccessible to evolution. How the size of this nearly-null genetic space translates to the broader phenome level is unknown. High-dimensional phenotype data to address these questions are now within reach; however, incorporating these data into genetic analyses remains a challenge. Multi-trait genetic analyses of more than a handful of traits are slow and often fail to converge when fit with REML. This makes it challenging to estimate the genetic covariance ($\mathbf{G}$) underlying thousands of traits, let alone study its properties. We present a previously proposed REML algorithm that is feasible for high-dimensional genetic studies in the specific setting of a balanced nested half-sib design, which is common in quantitative genetics. We show that it substantially outperforms other common approaches when the number of traits is large, and we use it to investigate the bias in estimated eigenvalues of $\mathbf{G}$ and the size of the nearly-null genetic subspace. We show that the high-dimensional biases observed are qualitatively similar to those substantiated by asymptotic approximation in the simpler setting of a sample covariance matrix based on i.i.d. vector observations, and that interpreting the estimated size of the nearly-null genetic subspace requires considerable caution in high-dimensional studies of genetic variation. Our results provide the foundation for future research characterizing the asymptotic approximation of estimated genetic eigenvalues, and a statistical null distribution for phenome-wide studies of genetic variation.
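
The simpler i.i.d. setting mentioned in the abstract can be illustrated with a small simulation. The following is only a minimal sketch and is not the REML algorithm or the half-sib design studied in the paper: it draws i.i.d. vectors whose true covariance is the identity (every true eigenvalue equals 1) and computes the eigenvalues of the sample covariance matrix. The function name `sample_cov_eigenvalues`, the sample sizes, and the seed are illustrative assumptions.

```python
import numpy as np


def sample_cov_eigenvalues(n_obs, n_traits, seed=0):
    """Eigenvalues of the sample covariance of i.i.d. N(0, I) vectors.

    Hypothetical helper for illustration only; the true covariance is the
    identity, so every true eigenvalue is exactly 1.
    """
    rng = np.random.default_rng(seed)
    # Rows are observations, columns are traits.
    x = rng.standard_normal((n_obs, n_traits))
    # Sample covariance across traits (n_traits x n_traits).
    s = np.cov(x, rowvar=False)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(s)


# Assumed sizes: the number of traits is half the number of observations.
eig = sample_cov_eigenvalues(n_obs=200, n_traits=100)
print("smallest:", eig.min(), "largest:", eig.max(), "mean:", eig.mean())
```

Although the average eigenvalue stays close to 1, the largest estimates are pushed well above 1 and the smallest well below it when the number of traits is comparable to the number of observations. A spread of this kind can make a subspace look nearly null even when the true variance is the same in every direction.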
