Paper Title

Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection

Paper Authors

Edward Raff, William Fleshman, Richard Zak, Hyrum S. Anderson, Bobby Filar, Mark McLean

Paper Abstract

Recent works within machine learning have been tackling inputs of ever-increasing size, with cybersecurity presenting sequence classification problems of particularly extreme lengths. In the case of Windows executable malware detection, inputs may exceed $100$ MB, which corresponds to a time series with $T=100,000,000$ steps. To date, the closest approach to handling such a task is MalConv, a convolutional neural network capable of processing up to $T=2,000,000$ steps. The $\mathcal{O}(T)$ memory of CNNs has prevented their further application to malware. In this work, we develop a new approach to temporal max pooling that makes the required memory invariant to the sequence length $T$. This makes MalConv $116\times$ more memory efficient and up to $25.8\times$ faster to train on its original dataset, while removing its input length restrictions. We re-invest these gains into improving the MalConv architecture by developing a new Global Channel Gating design, giving us an attention mechanism capable of learning feature interactions across 100 million time steps in an efficient manner, a capability the original MalConv CNN lacked. Our implementation can be found at https://github.com/NeuromorphicComputationResearchProgram/MalConv2
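To make the length-invariant pooling idea concrete, below is a minimal PyTorch sketch of one way a two-pass "argmax, then recompute" scheme can keep memory independent of $T$: a gradient-free chunked scan locates each channel's winning position, and the convolution is then re-run with autograd enabled only on those winning receptive fields. The function name, `chunk_size`, and the stride-1/no-padding simplifications are illustrative assumptions, not the repository's actual API; see the linked MalConv2 repo for the authors' implementation.

```python
import torch
import torch.nn as nn

def temporal_max_pool_fixed_memory(x, conv, chunk_size=65536):
    """Global temporal max pooling with memory independent of T (sketch).

    x:    (B, C_in, T) input sequence, e.g. embedded bytes
    conv: an nn.Conv1d; stride 1 and no padding assumed for clarity
    """
    B, _, T = x.shape
    k = conv.kernel_size[0]
    assert conv.stride[0] == 1, "sketch assumes stride 1"
    best_val, best_pos = None, None

    # Pass 1: gradient-free argmax scan; only one chunk is live at a time,
    # so peak memory depends on chunk_size rather than T.
    with torch.no_grad():
        for start in range(0, T - k + 1, chunk_size):
            end = min(start + chunk_size + k - 1, T)  # overlap k-1 steps so no window is missed
            v, p = conv(x[:, :, start:end]).max(dim=2)  # (B, C_out) values and positions
            p = p + start                               # map back to global positions
            if best_val is None:
                best_val, best_pos = v, p
            else:
                better = v > best_val
                best_val = torch.where(better, v, best_val)
                best_pos = torch.where(better, p, best_pos)

    # Pass 2: recompute only the winning receptive fields with autograd on,
    # so gradients touch a constant number of positions per channel.
    pooled = []
    for b in range(B):
        feats = []
        for c in range(best_pos.shape[1]):
            s = int(best_pos[b, c])
            feats.append(conv(x[b:b + 1, :, s:s + k])[0, c, 0])
        pooled.append(torch.stack(feats))
    return torch.stack(pooled)  # (B, C_out) pooled feature vector
```

Because max pooling is idempotent, the k-1-step overlap between chunks costs nothing in correctness, and the Python loops in pass 2 touch only one window per channel. In the same spirit, here is a hedged sketch of what a global channel-gating layer could look like: a cheap whole-sequence summary produces per-channel sigmoid gates that modulate local features, letting far-apart regions influence each other without $\mathcal{O}(T^2)$ attention. The layer sizes and exact structure are assumptions for illustration; the paper's Global Channel Gating design may differ.

```python
class GlobalChannelGating(nn.Module):
    """Sketch: gate local feature channels with a global context signal."""

    def __init__(self, channels, hidden=64):
        super().__init__()
        self.summarize = nn.Linear(channels, hidden)  # compress global summary
        self.to_gates = nn.Linear(hidden, channels)   # one gate per channel

    def forward(self, x):               # x: (B, C, L) local feature maps
        ctx = x.mean(dim=2)             # (B, C) global context via mean pooling
        gates = torch.sigmoid(self.to_gates(torch.relu(self.summarize(ctx))))
        return x * gates.unsqueeze(-1)  # gate every channel at every position
```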
