Paper title
Survival Seq2Seq: A Survival Model based on Sequence to Sequence Architecture
Paper authors
Abstract
This paper introduces a novel non-parametric deep model for estimating time-to-event (survival analysis) in the presence of censored data and competing risks. The model is based on the sequence-to-sequence (Seq2Seq) architecture, so we name it Survival Seq2Seq. The first recurrent neural network (RNN) layer of the encoder is made up of Gated Recurrent Unit with Decay (GRU-D) cells. These cells can effectively impute not-missing-at-random values in longitudinal datasets with very high missing rates, such as electronic health records (EHRs). The decoder of Survival Seq2Seq generates a probability distribution function (PDF) for each competing risk without assuming any prior distribution for the risks. By taking advantage of RNN cells, the decoder is able to generate smooth and virtually spike-free PDFs, which is beyond the capability of existing non-parametric deep models for survival analysis. Training results on synthetic and medical datasets show that Survival Seq2Seq surpasses other existing deep survival models in terms of prediction accuracy and the quality of the generated PDFs.
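To make the imputation idea concrete, the following is a minimal sketch of the input-decay rule used by GRU-D cells in general, not the authors' implementation. A missing value is replaced by a blend of the last observed value and the feature mean, and the blend drifts toward the mean as the time since the last observation grows. The function names, the decay parameters `w` and `b` (learned in a real model, fixed here), and the sample data are all assumptions made for illustration.

```python
import math


def grud_input_decay(x, mask, delta, x_last, x_mean, w, b):
    """Return the value fed to the cell for one feature at one time step.

    x      : observed value (ignored when mask == 0)
    mask   : 1 if the feature is observed at this step, 0 if missing
    delta  : time elapsed since the feature was last observed
    x_last : last observed value of the feature
    x_mean : empirical mean of the feature
    w, b   : decay parameters (hypothetical fixed values in this sketch)
    """
    # Decay factor in (0, 1]: close to 1 right after an observation,
    # approaching 0 as the gap grows.
    gamma = math.exp(-max(0.0, w * delta + b))
    decayed = gamma * x_last + (1.0 - gamma) * x_mean
    return mask * x + (1 - mask) * decayed


def impute_series(values, times, x_mean, w=0.5, b=0.0):
    """Apply the decay rule along one irregularly sampled series.

    Missing entries in `values` are given as None.
    """
    imputed = []
    x_last, t_last = x_mean, times[0]  # fall back to the mean before any observation
    for value, t in zip(values, times):
        mask = 0 if value is None else 1
        x = value if mask else 0.0
        imputed.append(grud_input_decay(x, mask, t - t_last, x_last, x_mean, w, b))
        if mask:
            x_last, t_last = value, t
    return imputed


# Hypothetical heart-rate readings at hours 0, 1 and 10; the last two are missing.
series = impute_series([80.0, None, None], [0.0, 1.0, 10.0], x_mean=70.0)
print(series)
```

With these made-up inputs, the value imputed one hour after the observation stays close to the last reading, while the value imputed ten hours later is almost back at the mean. In the full GRU-D cell a similar decay is also applied to the hidden state, which this sketch omits.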