Paper title
Autoencoders as Weight Initialization of Deep Classification Networks for Cancer versus Cancer Studies
Paper authors
Abstract
Cancer is still one of the most devastating diseases of our time. One way of automatically classifying tumor samples is by analyzing their derived molecular information (i.e., their gene expression signatures). In this work, we aim to distinguish three different types of cancer: thyroid, skin, and stomach. To that end, we evaluate a Denoising Autoencoder (DAE) used as weight initialization for a deep neural network. Although we address a different domain problem in this work, we adopt the same methodology as Ferreira et al. In our experiments, we assess two different approaches to training the classification model: (a) fixing the weights after pre-training the DAE, and (b) allowing fine-tuning of the entire classification network. Additionally, we apply two different strategies for embedding the DAE into the classification network: (1) importing only the encoding layers, and (2) inserting the complete autoencoder. Our best result came from combining unsupervised feature learning through a DAE, its full import into the classification network, and subsequent fine-tuning through supervised training, which achieved an F1 score of 98.04% +/- 1.09 when identifying cancerous thyroid samples.
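The abstract crosses two embedding strategies with two training approaches, which gives four experimental configurations. The sketch below enumerates them as plain layer plans, so it is easy to see which layers are imported from the DAE and which stay frozen. It is only an illustration under stated assumptions: the `Layer` record, the `build_classifier` helper, the layer widths, and the single three-class output layer are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Layer:
    name: str
    in_dim: int
    out_dim: int
    pretrained: bool  # True if the weights are copied from the pre-trained DAE
    trainable: bool   # False means the weights stay fixed during supervised training


def build_classifier(dae_dims, strategy, fine_tune, n_classes=3):
    """Return the layer plan for one configuration described in the abstract.

    dae_dims:  encoder widths from input to bottleneck (hypothetical sizes).
    strategy:  "encoder" imports only the encoding layers (strategy 1);
               "full" inserts the complete autoencoder (strategy 2).
    fine_tune: False fixes the imported weights (approach a);
               True lets the whole network be trained (approach b).
    """
    if strategy not in ("encoder", "full"):
        raise ValueError("strategy must be 'encoder' or 'full'")

    layers = []
    # Encoding layers are imported in both strategies.
    for i, (a, b) in enumerate(zip(dae_dims, dae_dims[1:])):
        layers.append(Layer(f"enc{i}", a, b, pretrained=True, trainable=fine_tune))

    # The decoder mirrors the encoder and is imported only in the "full" strategy.
    if strategy == "full":
        mirrored = dae_dims[::-1]
        for i, (a, b) in enumerate(zip(mirrored, mirrored[1:])):
            layers.append(Layer(f"dec{i}", a, b, pretrained=True, trainable=fine_tune))

    # Assumption: one randomly initialized output layer, always trained,
    # with one unit per cancer type (thyroid, skin, stomach).
    layers.append(Layer("head", layers[-1].out_dim, n_classes,
                        pretrained=False, trainable=True))
    return layers


if __name__ == "__main__":
    dims = [1000, 64, 16]  # hypothetical: 1000 input genes, 16-unit bottleneck
    for strategy, fine_tune in product(("encoder", "full"), (False, True)):
        plan = build_classifier(dims, strategy, fine_tune)
        frozen = [layer.name for layer in plan if not layer.trainable]
        print(strategy, "fine_tune" if fine_tune else "fixed",
              [layer.name for layer in plan], "frozen:", frozen)
```

Under these assumptions, the best result reported in the abstract corresponds to `build_classifier(dims, "full", True)`: the complete autoencoder is imported and every layer remains trainable during supervised training.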