Paper Title

MP-GELU Bayesian Neural Networks: Moment Propagation by GELU Nonlinearity

Authors

Yuki Hirayama, Sinya Takamaeda-Yamazaki

Abstract

Bayesian neural networks (BNNs) have been an important framework in the study of uncertainty quantification. Deterministic variational inference, one of the inference methods, utilizes moment propagation to compute the predictive distributions and objective functions. Unfortunately, deriving the moments requires a computationally expensive Taylor expansion in nonlinear functions, such as a rectified linear unit (ReLU) or a sigmoid function. Therefore, a new nonlinear function that realizes faster moment propagation than conventional functions is required. In this paper, we propose a novel nonlinear function named moment-propagating Gaussian error linear unit (MP-GELU) that enables the fast derivation of first and second moments in BNNs. MP-GELU enables the analytical computation of moments by applying nonlinearity to the input statistics, thereby reducing the computationally expensive calculations required for nonlinear functions. In empirical experiments on regression tasks, we observed that the proposed MP-GELU provides higher prediction accuracy and better quality of uncertainty with faster execution than ReLU-based BNNs.
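To make the "moment propagation" setting concrete: in deterministic variational inference, each layer passes forward the mean and variance of its (approximately Gaussian) activations, so every nonlinearity needs a map from input moments to output moments. The sketch below is not the paper's MP-GELU; it is a minimal illustration of such a moment map for ReLU under a Gaussian-input assumption, using the standard closed-form expressions in terms of the normal CDF Φ and PDF φ.

```python
import math

def relu_moments(mu, sigma):
    """Mean and variance of ReLU(x) for x ~ N(mu, sigma^2).

    Uses the standard closed forms:
      E[ReLU(x)]   = mu * Phi(a) + sigma * phi(a)
      E[ReLU(x)^2] = (mu^2 + sigma^2) * Phi(a) + mu * sigma * phi(a)
    with a = mu / sigma, Phi the standard normal CDF, phi its PDF.
    """
    a = mu / sigma
    Phi = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))          # normal CDF
    phi = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)   # normal PDF
    m1 = mu * Phi + sigma * phi                               # first moment
    m2 = (mu * mu + sigma * sigma) * Phi + mu * sigma * phi   # second moment
    return m1, m2 - m1 * m1                                   # mean, variance

# Example: a centered unit-variance input gives
# mean = 1/sqrt(2*pi) ~ 0.399 and variance = 0.5 - 1/(2*pi) ~ 0.341.
mean, var = relu_moments(0.0, 1.0)
```

The paper's contribution is a GELU-style nonlinearity designed so that this moment map is cheap to evaluate analytically, avoiding the per-layer Taylor-expansion cost incurred by conventional choices.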
