Paper title
Combining heterogeneous subgroups with graph-structured variable selection priors for Cox regression
Authors
Abstract
Important objectives in cancer research are the prediction of a patient's risk based on molecular measurements such as gene expression data and the identification of new prognostic biomarkers (e.g. genes). In clinical practice, this is often challenging because patient cohorts are typically small and can be heterogeneous. In classical subgroup analysis, a separate prediction model is fitted using only the data of one specific cohort. However, this can lead to a loss of power when the sample size is small. Simple pooling of all cohorts, on the other hand, can lead to biased results, especially when the cohorts are heterogeneous. For this situation, we propose a new Bayesian approach suitable for continuous molecular measurements and survival outcome that identifies the important predictors and provides a separate risk prediction model for each cohort. It allows sharing information between cohorts to increase power by assuming a graph linking predictors within and across different cohorts. The graph helps to identify pathways of functionally related genes and genes that are simultaneously prognostic in different cohorts. Results demonstrate that our proposed approach is superior to the standard approaches in terms of prediction performance and increased power in variable selection when the sample size is small.
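To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows a Cox partial log-likelihood that would be evaluated separately for each cohort, and an unnormalized graph-structured log prior over binary inclusion indicators. The abstract does not state the form of the prior; the Markov random field form used here, the function names, and the hyperparameters `a` and `b` are all assumptions chosen for illustration.

```python
import numpy as np

def cox_partial_loglik(beta, X, time, event):
    """Cox partial log-likelihood for one cohort (assumes no tied event times)."""
    eta = X @ beta                      # linear predictor per subject
    order = np.argsort(-time)           # sort subjects by descending time
    eta_sorted = eta[order]
    event_sorted = event[order]
    # Running log-sum-exp: entry i covers all subjects with time >= time_i,
    # i.e. the risk set of subject i.
    log_risk = np.logaddexp.accumulate(eta_sorted)
    return float(np.sum((eta_sorted - log_risk)[event_sorted == 1]))

def mrf_log_prior(gamma, G, a=-2.0, b=0.5):
    """Unnormalized log prior a * sum(gamma) + b * gamma' G gamma (assumed form).

    gamma stacks the 0/1 inclusion indicators of all cohorts; G is a symmetric
    adjacency matrix linking predictors within and across cohorts.
    """
    gamma = np.asarray(gamma, dtype=float)
    return float(a * gamma.sum() + b * (gamma @ G @ gamma))

# Hypothetical toy setup: 2 cohorts x 2 genes, indicators stacked as
# [cohort1_gene1, cohort1_gene2, cohort2_gene1, cohort2_gene2].
G = np.zeros((4, 4))
G[0, 2] = G[2, 0] = 1.0                 # gene 1 is linked across the two cohorts

linked = mrf_log_prior([1, 0, 1, 0], G)     # same gene selected in both cohorts
unlinked = mrf_log_prior([1, 0, 0, 1], G)   # different genes selected
```

Under this assumed prior, selecting the same gene in both cohorts receives a higher prior weight than selecting two unlinked genes, which is one way information could be shared between cohorts while each cohort keeps its own coefficient vector and likelihood.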