Sequential Bayesian Neural Subnetwork Ensembles

Paper Title

Sequential Bayesian Neural Subnetwork Ensembles

Authors

Sanket Jantre, Shrijita Bhattacharya, Nathan M. Urban, Byung-Jun Yoon, Tapabrata Maiti, Prasanna Balaprakash, Sandeep Madireddy

Abstract

Deep ensembles have emerged as a powerful technique for improving predictive performance and enhancing model robustness across various applications by leveraging model diversity. However, traditional deep ensemble methods are often computationally expensive and rely on deterministic models, which may limit their flexibility. Additionally, while sparse subnetworks of dense models have shown promise in matching the performance of their dense counterparts and even enhancing robustness, existing methods for inducing sparsity typically incur training costs comparable to those of training a single dense model, as they either gradually prune the network during training or apply thresholding post-training. In light of these challenges, we propose an approach for sequential ensembling of dynamic Bayesian neural subnetworks that consistently maintains reduced model complexity throughout the training process while generating diverse ensembles in a single forward pass. Our approach involves an initial exploration phase to identify high-performing regions within the parameter space, followed by multiple exploitation phases that take advantage of the compactness of the sparse model. These exploitation phases quickly converge to different minima in the energy landscape, corresponding to high-performing subnetworks that together form a diverse and robust ensemble. We empirically demonstrate that our proposed approach outperforms traditional dense and sparse deterministic and Bayesian ensemble models in terms of prediction accuracy, uncertainty estimation, out-of-distribution detection, and adversarial robustness.
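
The explore-then-exploit schedule described in the abstract can be illustrated with a small, heavily simplified sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the Bayesian subnetworks are replaced by a deterministic linear model with a fixed binary mask, the function names (`sgd`, `sequential_ensemble`, `ensemble_predict`) and all hyperparameters are hypothetical, and each exploitation phase is assumed to start from a perturbed copy of the previous member. Because the toy problem is convex, the members differ only through their short exploitation runs, so the sketch shows the control flow rather than convergence to genuinely distinct minima.

```python
import random


def predict(w, mask, x):
    # Masked linear model: only weights with mask == 1 contribute.
    return sum(wi * mi * xi for wi, mi, xi in zip(w, mask, x))


def sgd(w, mask, data, lr, epochs):
    # Plain SGD on squared error; masked-out weights are never updated,
    # so the model stays sparse for the whole run.
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, mask, x) - y
            w = [wi - lr * err * xi * mi for wi, xi, mi in zip(w, x, mask)]
    return w


def sequential_ensemble(data, mask, n_members=3, noise=0.1, seed=0):
    rng = random.Random(seed)
    # Exploration phase (assumed): larger learning rate, longer schedule.
    w = sgd([0.0] * len(mask), mask, data, lr=0.05, epochs=50)
    members = []
    for _ in range(n_members):
        # Exploitation phase (assumed): perturb the current solution,
        # then train briefly with a small learning rate and keep the result.
        start = [wi + rng.gauss(0.0, noise) * mi for wi, mi in zip(w, mask)]
        w = sgd(start, mask, data, lr=0.01, epochs=10)
        members.append(w)
    return members


def ensemble_predict(members, mask, x):
    # Ensemble prediction: average of the members' predictions.
    return sum(predict(w, mask, x) for w in members) / len(members)


# Toy data: y = 2*x0 - x2; the second feature is irrelevant and masked out.
data_rng = random.Random(1)
data = []
for _ in range(40):
    x = [data_rng.uniform(-1.0, 1.0) for _ in range(3)]
    data.append((x, 2.0 * x[0] - 1.0 * x[2]))

mask = [1, 0, 1]
members = sequential_ensemble(data, mask)
pred = ensemble_predict(members, mask, [1.0, 5.0, 1.0])
print(len(members), pred)
```

In this sketch the mask is fixed up front, which mirrors only the idea of keeping model complexity reduced throughout training; the dynamic sparsity and the Bayesian treatment named in the abstract are not modeled here.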
