Paper Title

Towards a Fair Comparison and Realistic Evaluation Framework of Android Malware Detectors based on Static Analysis and Machine Learning

Authors

Borja Molina-Coronado, Usue Mori, Alexander Mendiburu, Jose Miguel-Alonso

Abstract

As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. In this sense, many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impressive detection performances. However, the lack of reproducibility and the absence of a standard evaluation framework make these proposals difficult to compare. In this paper, we perform an analysis of 10 influential research works on Android malware detection using a common evaluation framework. We have identified five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models and their performances. In particular, we analyze the effect of (1) the presence of duplicated samples, (2) label (goodware/greyware/malware) attribution, (3) class imbalance, (4) the presence of apps that use evasion techniques, and (5) the evolution of apps. Based on this extensive experimentation, we conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results. Our findings also highlight that it is imperative to generate realistic experimental scenarios, taking into account the aforementioned factors, to foster the rise of better ML-based Android malware detection solutions.
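
To make three of these factors concrete (duplicated samples, class imbalance, and the evolution of apps), the following is a minimal sketch of how an evaluation split might account for them. It is not the authors' framework: the `App` record, its fields, the exact-match duplicate criterion, and the goodware-to-malware ratio are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class App:
    # Hypothetical record; field names are illustrative, not from the paper.
    sha256: str                 # identifier of the APK
    features: Tuple[int, ...]   # static feature vector
    label: int                  # 1 = malware, 0 = goodware
    year: int                   # year the app was first seen


def deduplicate(apps: List[App]) -> List[App]:
    """Keep the first app for each distinct feature vector.

    Assumption: two apps count as duplicates when their feature
    vectors are identical.
    """
    seen = set()
    unique = []
    for app in apps:
        if app.features not in seen:
            seen.add(app.features)
            unique.append(app)
    return unique


def temporal_split(apps: List[App], cutoff_year: int) -> Tuple[List[App], List[App]]:
    """Train on apps seen before cutoff_year, test on the rest,
    so the detector is never evaluated on apps older than its training data."""
    train = [a for a in apps if a.year < cutoff_year]
    test = [a for a in apps if a.year >= cutoff_year]
    return train, test


def downsample_malware(apps: List[App], goodware_per_malware: int, seed: int = 0) -> List[App]:
    """Randomly drop malware until there are at least
    goodware_per_malware goodware apps for each malware app."""
    rng = random.Random(seed)
    goodware = [a for a in apps if a.label == 0]
    malware = [a for a in apps if a.label == 1]
    max_malware = len(goodware) // goodware_per_malware
    if len(malware) > max_malware:
        malware = rng.sample(malware, max_malware)
    return goodware + malware
```

Applied in order (deduplicate, split by time, then rebalance the test set), these steps give a scenario in which repeated samples cannot leak across the split and malware is the minority class. The right duplicate criterion and class ratio depend on the dataset and are left as parameters here.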
