Paper Title

Fast and Efficient Malware Detection with Joint Static and Dynamic Features Through Transfer Learning

Authors

Ngo, Mao V.; Truong-Huu, Tram; Rabadi, Dima; Loo, Jia Yi; Teo, Sin G.

Abstract

In malware detection, dynamic analysis extracts the runtime behavior of malware samples in a controlled environment, whereas static analysis extracts features using reverse engineering tools. The former is challenged by the anti-virtualization and evasive behavior of malware samples, the latter by code obfuscation. To tackle these drawbacks, prior works proposed developing detection models that aggregate dynamic and static features, thus leveraging the advantages of both approaches. However, simply concatenating dynamic and static features leads to imbalanced contributions to the performance of malware detection models, because the feature vectors have heterogeneous dimensions. Moreover, dynamic analysis is time-consuming and requires a secure environment, leading to detection delays and high costs for maintaining the analysis infrastructure. In this paper, we first introduce a method of constructing aggregated features by concatenating latent features learned through deep learning with equally contributing dimensions. We then develop a knowledge distillation technique that transfers the knowledge a teacher model learns from the aggregated features to a student model trained only on static features, and we use the trained student model to detect new malware samples. We carry out extensive experiments on a dataset of 86709 samples comprising both benign and malware samples. The experimental results show that the teacher model trained on aggregated features constructed by our method outperforms state-of-the-art models, improving detection accuracy by up to 2.38%. The distilled student model not only achieves performance as high as that of the teacher model (97.81% accuracy) but also significantly reduces the detection time (from 70046.6 ms to 194.9 ms) without requiring dynamic analysis.
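
The abstract does not spell out the distillation objective, so the following is only a minimal sketch of the general idea, assuming the common soft-target formulation of knowledge distillation: a KL-divergence term between temperature-softened teacher and student outputs, combined with a cross-entropy term on the ground-truth label. The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not taken from the paper. Here the teacher logits would come from a model fed aggregated static and dynamic features, while the student sees static features only.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Soft-target distillation loss for one sample (assumed formulation).

    student_logits: outputs of the student (static features only).
    teacher_logits: outputs of the teacher (aggregated features).
    label: ground-truth class index, e.g. 0 = benign, 1 = malware.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # Soft term: KL(teacher || student) on temperature-softened outputs.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Hard term: ordinary cross-entropy against the true label (T = 1).
    ce = -log_softmax(student_logits)[label]

    # T^2 keeps the soft term's gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce
```

In this sketch, a student that reproduces the teacher's logits incurs no soft-term penalty, so only the label cross-entropy remains. At detection time only the student is evaluated, which is why no dynamic analysis is needed for new samples.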
