Latent Similarity Identifies Important Functional Connections for Phenotype Prediction

Title

Latent Similarity Identifies Important Functional Connections for Phenotype Prediction

Authors

Anton Orlichenko, Gang Qu, Gemeng Zhang, Binish Patel, Tony W. Wilson, Julia M. Stephen, Vince D. Calhoun, Yu-Ping Wang

Abstract

Objective: Endophenotypes such as brain age and fluid intelligence are important biomarkers of disease status. However, brain imaging studies to identify these biomarkers often encounter limited numbers of subjects and high dimensional imaging features, hindering reproducibility. Therefore, we develop an interpretable, multivariate classification/regression algorithm, called Latent Similarity (LatSim), suitable for small sample size, high feature dimension datasets. Methods: LatSim combines metric learning with a kernel similarity function and softmax aggregation to identify task-related similarities between subjects. Inter-subject similarity is utilized to improve performance on three prediction tasks using multi-paradigm fMRI data. A greedy selection algorithm, made possible by LatSim's computational efficiency, is developed as an interpretability method. Results: LatSim achieved significantly higher predictive accuracy at small sample sizes on the Philadelphia Neurodevelopmental Cohort (PNC) dataset. Connections identified by LatSim gave superior discriminative power compared to those identified by other methods. We identified 4 functional brain networks enriched in connections for predicting brain age, sex, and intelligence. Conclusion: We find that most information for a predictive task comes from only a few (1-5) connections. Additionally, we find that the default mode network is over-represented in the top connections of all predictive tasks. Significance: We propose a novel algorithm for small sample, high feature dimension datasets and use it to identify connections in task fMRI data. Our work should lead to new insights in both algorithm design and neuroscience research. Code and demo are available at https://github.com/aorliche/LatentSimilarity/.
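
The Methods sentence above (metric learning, a kernel similarity function, and softmax aggregation) can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the authors' implementation, which is in the linked repository. It assumes a linear projection into a latent space, an inner-product kernel, and prediction as a softmax-weighted average of training responses. The function names, the `temperature` parameter, and the random data are hypothetical, and the projection is drawn at random here instead of being learned.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)

def latent_similarity_predict(x_train, y_train, x_query, projection, temperature=1.0):
    """Predict each query response as a similarity-weighted average of training responses.

    Assumption: similarity is an inner product between linearly projected
    features. The abstract does not specify the kernel used by LatSim.
    """
    z_train = x_train @ projection                   # (n_train, k) latent embeddings
    z_query = x_query @ projection                   # (n_query, k) latent embeddings
    similarity = z_query @ z_train.T / temperature   # (n_query, n_train) kernel values
    weights = softmax(similarity, axis=1)            # each row sums to 1
    return weights @ y_train, weights

# Synthetic stand-in for a small-sample, high-dimension dataset (not real fMRI data).
rng = np.random.default_rng(0)
n_train, n_query, n_features, n_latent = 40, 5, 200, 2
x_train = rng.normal(size=(n_train, n_features))
x_query = rng.normal(size=(n_query, n_features))
y_train = rng.normal(size=n_train)

# In a metric-learning setting the projection would be fit by gradient descent
# on a prediction loss; a random matrix is used here only to show the data flow.
projection = rng.normal(size=(n_features, n_latent)) / np.sqrt(n_features)

predictions, weights = latent_similarity_predict(x_train, y_train, x_query, projection)
print(predictions.shape, weights.shape)
```

Because every prediction is a convex combination of training responses, it always lies within the range of the training responses, and the weight matrix shows directly which training subjects drive each prediction.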
