Paper Title

A Use of Even Activation Functions in Neural Networks

Paper Authors

Fuchang Gao, Boyu Zhang

Paper Abstract

Despite broad interest in applying deep learning techniques to scientific discovery, learning interpretable formulas that accurately describe scientific data is very challenging because of the vast landscape of possible functions and the "black box" nature of deep neural networks. The key to success is to effectively integrate existing knowledge or hypotheses about the underlying structure of the data into the architecture of deep learning models to guide machine learning. Currently, such integration is commonly done through customization of the loss functions. Here we propose an alternative approach to integrate existing knowledge or hypotheses of data structure by constructing custom activation functions that reflect this structure. Specifically, we study a common case when the multivariate target function $f$ to be learned from the data is partially exchangeable, \emph{i.e.} $f(u,v,w)=f(v,u,w)$ for $u,v\in \mathbb{R}^d$. For instance, these conditions are satisfied for the classification of images that is invariant under left-right flipping. Through theoretical proof and experimental verification, we show that using an even activation function in one of the fully connected layers improves neural network performance. In our experimental 9-dimensional regression problems, replacing one of the non-symmetric activation functions with the designated "Seagull" activation function $\log(1+x^2)$ results in substantial improvement in network performance. Surprisingly, even activation functions are seldom used in neural networks. Our results suggest that customized activation functions have great potential in neural networks.
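For illustration, here is a minimal PyTorch sketch of the "Seagull" activation $\log(1+x^2)$ substituted for a standard activation in one fully connected layer of a small network for 9-dimensional regression. This is our own assumption of an implementation; the class names, layer widths, and placement of the even activation are hypothetical choices, not the exact architecture reported in the paper.

import torch
import torch.nn as nn

class Seagull(nn.Module):
    # Even "Seagull" activation: log(1 + x^2).
    # torch.log1p(x * x) is a numerically stable way to compute log(1 + x^2).
    def forward(self, x):
        return torch.log1p(x * x)

class SmallRegressor(nn.Module):
    # Hypothetical MLP for a 9-dimensional regression task; the Seagull
    # activation replaces the usual non-symmetric activation in one
    # fully connected layer, while the other layers keep ReLU.
    def __init__(self, in_dim=9, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            Seagull(),           # even activation in one fully connected layer
            nn.Linear(hidden, hidden),
            nn.ReLU(),           # remaining layers use a standard activation
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

# Quick shape check on random data.
model = SmallRegressor()
y = model(torch.randn(8, 9))
print(y.shape)  # torch.Size([8, 1])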
