Efficient inference of parental origin effects using case-control mother-child genotype data

Paper title

Efficient inference of parental origin effects using case-control mother-child genotype data

Authors

Tian, Yuang; Zhang, Hong; Bureau, Alexandre; Hochner, Hagit; Chen, Jinbo

Abstract

Parental origin effects play an important role in mammal development and disorder. Case-control mother-child pair genotype data can be used to detect parental origin effects and is often convenient to collect in practice. Most existing methods for assessing parental origin effects do not incorporate any covariates, which may be required to control for confounding factors. We propose to model the parental origin effects through a logistic regression model, with predictors including maternal and child genotypes, parental origins, and covariates. The parental origins may not be fully inferred from genotypes of a target genetic marker, so we propose to use genotypes of markers tightly linked to the target marker to increase inference efficiency. A computationally robust statistical inference procedure is developed based on a modified profile likelihood in a retrospective way. A computationally feasible expectation-maximization algorithm is devised to estimate all unknown parameters involved in the modified profile likelihood. This algorithm differs from the conventional expectation-maximization algorithm in the sense that it is based on a modified instead of the original profile likelihood function. The convergence of the algorithm is established under some mild regularity conditions. This expectation-maximization algorithm also allows convenient handling of missing child genotypes. Large sample properties, including weak consistency, asymptotic normality, and asymptotic efficiency, are established for the proposed estimator under some mild regularity conditions. Finite sample properties are evaluated through extensive simulation studies and the application to a real dataset.
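The abstract notes that parental origins may not be fully inferred from genotypes at the target marker alone. A minimal sketch of why, assuming a biallelic marker with genotypes coded as minor-allele counts (0, 1, 2): the allele a child inherited from the mother is determined by the mother-child genotype pair in every Mendelian-consistent case except when both are heterozygous. The function name `maternal_allele` and the coding are hypothetical and chosen for illustration; this is not the authors' inference procedure, which additionally uses tightly linked markers and a modified profile likelihood.

```python
from typing import Optional


def maternal_allele(mother: int, child: int) -> Optional[int]:
    """Minor-allele count (0 or 1) the child inherited from the mother.

    Genotypes are minor-allele counts at one biallelic marker (illustrative coding).
    Returns None when the parental origin cannot be determined from the pair alone.
    Raises ValueError for invalid or Mendelian-inconsistent genotype pairs.
    """
    if mother not in (0, 1, 2) or child not in (0, 1, 2):
        raise ValueError("genotypes must be 0, 1, or 2")
    if abs(mother - child) == 2:
        # A homozygous mother cannot have a child homozygous for the other allele.
        raise ValueError("Mendelian-inconsistent mother-child pair")
    if child == 0:
        return 0  # both of the child's alleles are major
    if child == 2:
        return 1  # both of the child's alleles are minor
    # Child is heterozygous: origin is known only if the mother is homozygous.
    if mother == 0:
        return 0  # the minor allele must be paternal
    if mother == 2:
        return 1  # the minor allele must be maternal
    return None  # both heterozygous: origin is ambiguous at this marker


if __name__ == "__main__":
    for m in (0, 1, 2):
        for c in (0, 1, 2):
            try:
                print(m, c, maternal_allele(m, c))
            except ValueError as err:
                print(m, c, "invalid:", err)
```

The ambiguous heterozygous-heterozygous pairs are the cases where genotypes at tightly linked markers can carry additional information about which allele came from the mother.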
