A Variational Autoencoder for Heterogeneous Temporal and Longitudinal Data

Paper title

A Variational Autoencoder for Heterogeneous Temporal and Longitudinal Data

Authors

Mine Öğretir, Siddharth Ramchandran, Dimitrios Papatheodorou, Harri Lähdesmäki

Abstract

The variational autoencoder (VAE) is a popular deep latent variable model used to analyse high-dimensional datasets by learning a low-dimensional latent representation of the data. It simultaneously learns a generative model and an inference network to perform approximate posterior inference. Recently proposed extensions to VAEs that can handle temporal and longitudinal data have applications in healthcare, behavioural modelling, and predictive maintenance. However, these extensions do not account for heterogeneous data (i.e., data comprising both continuous and discrete attributes), which is common in many real-life applications. In this work, we propose the heterogeneous longitudinal VAE (HL-VAE), which extends the existing temporal and longitudinal VAEs to heterogeneous data. HL-VAE provides efficient inference for high-dimensional datasets and includes likelihood models for continuous, count, categorical, and ordinal data while accounting for missing observations. We demonstrate our model's efficacy on simulated as well as clinical datasets, and show that our proposed model achieves competitive performance in missing value imputation and predictive accuracy.
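To make the idea of per-attribute likelihood models with missing observations concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian likelihood for continuous attributes, a Poisson likelihood for counts, and a categorical likelihood for class labels; the abstract does not specify these forms, and the ordinal case is omitted. The function names, the `kinds` labels, and the parameter dictionaries are all hypothetical and would, in a VAE, be produced by a decoder network.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution (assumed continuous likelihood)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def poisson_logpmf(k, rate):
    """Log probability of a Poisson count (assumed count likelihood)."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def categorical_logpmf(k, probs):
    """Log probability of class index k under a categorical distribution."""
    return math.log(probs[k])


def masked_log_likelihood(values, params, kinds, observed):
    """Sum per-attribute log-likelihoods, skipping unobserved attributes.

    values:   one entry per attribute (ignored where observed is False)
    params:   one dict of likelihood parameters per attribute (hypothetical layout)
    kinds:    "continuous", "count", or "categorical" for each attribute
    observed: boolean mask; False marks a missing observation
    """
    total = 0.0
    for x, p, kind, is_observed in zip(values, params, kinds, observed):
        if not is_observed:
            continue  # a missing value contributes nothing to the objective
        if kind == "continuous":
            total += gaussian_logpdf(x, p["mean"], p["std"])
        elif kind == "count":
            total += poisson_logpmf(x, p["rate"])
        elif kind == "categorical":
            total += categorical_logpmf(x, p["probs"])
        else:
            raise ValueError(f"unknown attribute kind: {kind}")
    return total


# One record with four attributes; the last one is missing.
values = [0.0, 0, 1, None]
params = [
    {"mean": 0.0, "std": 1.0},
    {"rate": 1.0},
    {"probs": [0.25, 0.75]},
    {"mean": 0.0, "std": 1.0},
]
kinds = ["continuous", "count", "categorical", "continuous"]
observed = [True, True, True, False]

log_lik = masked_log_likelihood(values, params, kinds, observed)
```

Because the mask removes missing entries from the sum, the same decoder outputs can later be used to impute those entries, which is the kind of missing value imputation the abstract refers to.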
