Paper Title
Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
Paper Authors
Paper Abstract
Sparse deep learning aims to address the challenge of the huge storage consumption of deep neural networks and to recover the sparse structure of target functions. Although tremendous empirical successes have been achieved, most sparse deep learning algorithms lack theoretical support. On the other hand, another line of work has proposed theoretical frameworks that are computationally infeasible. In this paper, we train sparse deep neural networks with a fully Bayesian treatment under spike-and-slab priors, and develop a set of computationally efficient variational inferences via continuous relaxation of the Bernoulli distribution. The variational posterior contraction rate is provided, which justifies the consistency of the proposed variational Bayes method. Notably, our empirical results demonstrate that this variational procedure provides uncertainty quantification in terms of the Bayesian predictive distribution and is also capable of accomplishing consistent variable selection by training a sparse multi-layer neural network.
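The continuous relaxation mentioned in the abstract is commonly realized with the binary Concrete (Gumbel-Softmax) relaxation, which replaces the hard Bernoulli inclusion indicator of a spike-and-slab prior with a differentiable soft gate so that variational parameters can be learned by gradient descent. Below is a minimal NumPy sketch of that idea; the function name `relaxed_bernoulli_sample` and all parameter values are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def relaxed_bernoulli_sample(logit, temperature, rng):
    """Draw from the binary Concrete (relaxed Bernoulli) distribution.

    As temperature -> 0 the sample approaches a hard Bernoulli draw with
    inclusion probability sigmoid(logit); for temperature > 0 the sample
    is a differentiable function of `logit`, enabling reparameterized
    gradients for the spike-and-slab inclusion indicator.
    """
    u = rng.uniform(low=1e-8, high=1.0 - 1e-8)       # uniform noise, clipped
    logistic_noise = np.log(u) - np.log(1.0 - u)     # Logistic(0, 1) sample
    return 1.0 / (1.0 + np.exp(-(logit + logistic_noise) / temperature))

rng = np.random.default_rng(0)
# The relaxed gate z multiplies a slab (Gaussian) weight draw, so z -> 0
# recovers the spike at zero and z -> 1 keeps the weight active.
logit, temperature = 1.5, 0.5                        # hypothetical values
z = relaxed_bernoulli_sample(logit, temperature, rng)
slab_weight = rng.normal(loc=0.0, scale=1.0)         # reparameterized slab draw
sparse_weight = z * slab_weight
print(f"gate z = {z:.3f}, effective weight = {sparse_weight:.3f}")
```

In a variational Bayes setting along these lines, one such gate and slab pair would be attached to each network weight, the temperature annealed during training, and weights whose learned inclusion probability falls below a threshold pruned, yielding the sparse multi-layer network used for variable selection.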