Paper Title

Android Malware Clustering using Community Detection on Android Packages Similarity Network

Authors

ElMouatez Billah Karbab, Mourad Debbabi, Abdelouahid Derhab, Djedjiga Mouheb

Abstract

The number of malicious Android applications (apps) targeting app repositories grows daily, and their volume is overwhelming the fingerprinting process. To address this issue, we propose an enhanced Cypider framework, a set of techniques and tools for the systematic detection of mobile malware that builds a scalable, obfuscation-resilient similarity network infrastructure of malicious apps. Our approach is based on our proposed concept of a malicious community, in which malicious instances that share common features are most likely part of the same malware family. Using this concept, we assume that multiple similar Android apps with different authors are most likely malicious. Specifically, Cypider leverages this assumption to detect variants of known malware families and zero-day malicious apps. Cypider applies community detection algorithms to the similarity network to extract sub-graphs that are considered suspicious and possibly malicious communities. Furthermore, we propose a novel fingerprinting technique, the community fingerprint, based on a one-class machine learning model for each malicious community. In addition, the enhanced Cypider framework requires 650x less memory and 700x less time to build the similarity network than the original version, without affecting the framework's fingerprinting performance. We also introduce a systematic approach for locating the best threshold on different feature content vectors, which simplifies the overall detection process.
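
The pipeline described in the abstract (similarity network, then community extraction, then the "different authors" heuristic) can be illustrated with a minimal Python sketch. This is not the Cypider implementation. It assumes each app is reduced to a set of string features, uses Jaccard similarity as a hypothetical similarity measure, and substitutes plain connected components for a real community detection algorithm. The names `jaccard`, `build_similarity_network`, `suspicious_communities`, the sample apps, and the threshold value are all illustrative assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_similarity_network(apps, threshold):
    """Link every pair of apps whose similarity meets the threshold."""
    graph = {name: set() for name in apps}
    for u, v in combinations(apps, 2):
        if jaccard(apps[u]["features"], apps[v]["features"]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def connected_components(graph):
    """Simplified stand-in for community detection: group by reachability."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def suspicious_communities(apps, threshold=0.5, min_size=2):
    """Keep communities of similar apps that come from more than one author."""
    graph = build_similarity_network(apps, threshold)
    result = []
    for community in connected_components(graph):
        authors = {apps[name]["author"] for name in community}
        if len(community) >= min_size and len(authors) > 1:
            result.append(sorted(community))
    return sorted(result)


# Hypothetical sample data: feature sets and author labels are made up.
apps = {
    "app_a": {"author": "dev1", "features": {"SEND_SMS", "READ_CONTACTS", "libx.so"}},
    "app_b": {"author": "dev2", "features": {"SEND_SMS", "READ_CONTACTS", "liby.so"}},
    "app_c": {"author": "dev3", "features": {"CAMERA", "INTERNET"}},
    "app_d": {"author": "dev4", "features": {"FLASHLIGHT", "WAKE_LOCK"}},
    "app_e": {"author": "dev4", "features": {"FLASHLIGHT", "WAKE_LOCK"}},
}

print(suspicious_communities(apps))
```

In this toy data, `app_d` and `app_e` are identical but share an author, so the heuristic does not flag them, while `app_a` and `app_b` are similar and come from different authors. A fuller version would swap in a proper community detection algorithm and train a one-class model per community, as the abstract describes.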
