Paper Title
Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity
Paper Authors
Paper Abstract
Interpretable machine learning has demonstrated impressive performance while preserving explainability. In particular, neural additive models (NAMs) bring interpretability to black-box deep learning and achieve state-of-the-art accuracy within the large family of generalized additive models. To empower NAMs with feature selection and improve generalization, we propose the sparse neural additive model (SNAM), which employs group sparsity regularization (e.g. Group LASSO), where each feature is learned by a sub-network whose trainable parameters are clustered into a group. We study the theoretical properties of SNAM with novel techniques that tackle the non-parametric truth, thus extending beyond classical sparse linear models such as the LASSO, which apply only to parametric truths. Specifically, we show that SNAM trained with subgradient or proximal gradient descent provably converges to zero training loss as $t\to\infty$, and that the estimation error of SNAM vanishes asymptotically as $n\to\infty$. We also prove that SNAM, like the LASSO, can achieve exact support recovery, i.e. perfect feature selection, under appropriate regularization. Moreover, we show that SNAM generalizes well and preserves `identifiability', recovering each feature's effect. We validate our theory through extensive experiments and further demonstrate the good accuracy and efficiency of SNAM.
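To make the abstract's construction concrete, here is a minimal NumPy sketch of the SNAM idea: one small sub-network per feature, an additive prediction, a group-lasso penalty with one group per sub-network, and the group soft-thresholding operator used in proximal gradient descent. The hidden size, initialization, and function names are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of a Sparse Neural Additive Model (SNAM), assuming a
# one-hidden-layer ReLU sub-network per feature (illustrative choice).
import numpy as np

rng = np.random.default_rng(0)
n, p, h = 8, 3, 4  # samples, features, hidden units per sub-network

# Each sub-network's trainable parameters form one group for Group LASSO.
params = [
    {"W1": rng.normal(size=(1, h)), "b1": np.zeros(h),
     "W2": rng.normal(size=(h, 1))}
    for _ in range(p)
]

def snam_forward(X, params):
    """Additive prediction: sum over features j of f_j(X[:, j])."""
    out = np.zeros(X.shape[0])
    for j, pj in enumerate(params):
        hid = np.maximum(X[:, [j]] @ pj["W1"] + pj["b1"], 0.0)  # ReLU
        out += (hid @ pj["W2"]).ravel()
    return out

def group_lasso_penalty(params, lam=0.1):
    """lam * sum_j ||theta_j||_2, one group per feature's sub-network.
    Unlike plain L1, this zeros out whole sub-networks at once,
    which is what yields feature selection."""
    total = 0.0
    for pj in params:
        flat = np.concatenate([v.ravel() for v in pj.values()])
        total += np.linalg.norm(flat)
    return lam * total

def group_prox(theta, t):
    """Proximal operator of t * ||theta||_2 (group soft-thresholding):
    shrinks the whole group toward zero, killing it if its norm <= t."""
    nrm = np.linalg.norm(theta)
    if nrm <= t:
        return np.zeros_like(theta)
    return (1.0 - t / nrm) * theta

X = rng.normal(size=(n, p))
pred = snam_forward(X, params)
pen = group_lasso_penalty(params)
```

In a proximal gradient step, one would take a gradient step on the data-fit loss and then apply `group_prox` to each sub-network's flattened parameter vector; groups whose norm falls below the threshold are set exactly to zero, which is the mechanism behind the exact support recovery discussed in the abstract.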