Empirical Study of Overfitting in Deep FNN Prediction Models for Breast Cancer Metastasis

Paper Title

Empirical Study of Overfitting in Deep FNN Prediction Models for Breast Cancer Metastasis

Authors

Xu, Chuhan; Coen-Pirani, Pablo; Jiang, Xia

Abstract

Overfitting occurs when a model fits a specific data set so closely that its generalization weakens, which may ultimately reduce its accuracy in predicting future data. In this research, we used an EHR dataset concerning breast cancer metastasis to study overfitting in deep feedforward neural network (FNN) prediction models. We included 11 hyperparameters of the deep FNN models and took an empirical approach to study how each of them affects both prediction performance and overfitting when given a large range of values. We also studied how some interesting pairs of hyperparameters interact to influence model performance and overfitting. The 11 hyperparameters we studied are activation function, weight initializer, number of hidden layers, learning rate, momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show that most of the single hyperparameters are either negatively or positively correlated with model prediction performance and overfitting. In particular, we found that overfitting overall tends to correlate negatively with learning rate, decay, batch size, and L2, but positively with momentum, epochs, and L1. According to our results, learning rate, decay, and batch size may have a more significant impact on both overfitting and prediction performance than most of the other hyperparameters, including L1, L2, and dropout rate, which were designed to minimize overfitting. We also found some interesting interacting pairs of hyperparameters, such as learning rate and momentum, learning rate and decay, and batch size and epochs. Keywords: deep learning, overfitting, prediction, grid search, feedforward neural networks, breast cancer metastasis.
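
The grid-search style of analysis described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code. The value ranges in `HYPOTHETICAL_GRID`, the helper names, and the choice of a train-minus-test score gap as the overfitting measure are all assumptions; the abstract does not specify how overfitting was quantified.

```python
from itertools import product
from math import sqrt

# Hypothetical value ranges for three of the 11 hyperparameters (assumed, not from the paper).
HYPOTHETICAL_GRID = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 64, 256],
    "epochs": [50, 200],
}

def iter_settings(grid):
    """Yield one dict per combination of hyperparameter values (exhaustive grid)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def overfitting_gap(train_score, test_score):
    """Assumed overfitting measure: training score minus held-out score."""
    return train_score - test_score

def pearson(xs, ys):
    """Pearson correlation between a hyperparameter's values and the observed gaps."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Number of models that would need training for this small grid.
n_settings = len(list(iter_settings(HYPOTHETICAL_GRID)))
```

In use, each setting would be trained and scored on both the training and held-out data, and `pearson` would then be applied to one hyperparameter's values against the resulting gaps. A negative coefficient would correspond to the kind of negative correlation the abstract reports for learning rate, decay, batch size, and L2.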
