A Dynamic Weighted Federated Learning for Android Malware Classification

Paper Title

A Dynamic Weighted Federated Learning for Android Malware Classification

Authors

Ayushi Chaudhuri, Arijit Nandi, Buddhadeb Pradhan

Abstract

Android malware attacks are increasing daily at a tremendous volume, making Android users more vulnerable to cyber-attacks. Researchers have developed many machine learning (ML) and deep learning (DL) techniques to detect and mitigate Android malware attacks. However, technological advancement has driven a rise in the number of Android mobile devices, and these devices are geographically dispersed, resulting in distributed data. In such a scenario, traditional ML/DL techniques are infeasible because they all require the data to be kept in a central system. Given the massive proliferation of Android mobile devices, centralizing the data may pose a problem for user privacy, and it also creates overhead. In addition, traditional ML/DL-based Android malware classification techniques are not scalable. Researchers have proposed federated learning (FL) based Android malware classification systems to address privacy preservation and scalability while maintaining high classification performance. In traditional FL, Federated Averaging (FedAvg) is used to construct the global model at each round by merging the local models obtained from all clients that participated in the FL. However, conventional FedAvg has a disadvantage: because it weights all local models equally when averaging, including even one poorly performing local model in a round may result in an under-performing global model. To address this issue, our main objective in this work is to design a dynamic weighted federated averaging (DW-FedAvg) strategy in which the weight of each local model is automatically updated based on its performance at the client. DW-FedAvg is evaluated using four popular benchmark datasets used in Android malware classification research: Melgenome, Drebin, Kronodroid, and Tuandromd.
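
To make the contrast in the abstract concrete, the following is a minimal sketch of equal-weight averaging versus performance-weighted averaging of local models. The abstract does not give the exact DW-FedAvg update rule, so the weighting used here (weights proportional to a per-client score such as validation accuracy) is an assumption for illustration only. The function names `weighted_average`, `fedavg_equal`, and `performance_weighted` are hypothetical, and each model is represented as a flat list of parameters.

```python
from typing import List, Sequence


def weighted_average(models: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Merge flattened local model parameters using normalized weights."""
    if not models or len(models) != len(weights):
        raise ValueError("need exactly one weight per local model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    merged = [0.0] * len(models[0])
    for params, w in zip(models, weights):
        for i, p in enumerate(params):
            merged[i] += (w / total) * p
    return merged


def fedavg_equal(models: Sequence[Sequence[float]]) -> List[float]:
    """Equal-weight averaging, as the abstract describes conventional FedAvg."""
    return weighted_average(models, [1.0] * len(models))


def performance_weighted(models: Sequence[Sequence[float]],
                         client_scores: Sequence[float]) -> List[float]:
    """Assumed rule: weight each local model by its client-side score
    (e.g. validation accuracy). Not necessarily the paper's exact rule."""
    return weighted_average(models, client_scores)


if __name__ == "__main__":
    # Two well-behaved local models and one outlier with a score of 0.
    local_models = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
    scores = [0.9, 0.9, 0.0]
    print("equal weights:      ", fedavg_equal(local_models))
    print("performance weights:", performance_weighted(local_models, scores))
```

In this toy setup the outlier pulls the equal-weight average far from the two well-behaved models, while the score-based weights suppress it. In a real FL round the scores would be recomputed every round, which is what makes the weighting dynamic.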
