Towards a Resilient Machine Learning Classifier -- a Case Study of Ransomware Detection

Paper Title

Towards a Resilient Machine Learning Classifier -- a Case Study of Ransomware Detection

Authors

Chih-Yuan Yang, Ravi Sahita

Abstract

The damage caused by crypto-ransomware, due to encryption, is difficult to revert and causes data loss. In this paper, a machine learning (ML) classifier is built to detect, at an early stage and from program behavior, ransomware that uses cryptography (called crypto-ransomware). If signature-based detection misses a sample, a behavior-based detector can be the last line of defense to detect and contain the damage. We find that the input/output activities of ransomware and file-content entropy are unique traits for detecting crypto-ransomware. A deep-learning (DL) classifier can detect ransomware with high accuracy and a low false positive rate. We conduct adversarial research against the generated models. We use simulated ransomware programs to launch a gray-box analysis that probes the weaknesses of ML classifiers and improves model robustness. In addition to accuracy and resiliency, trustworthiness is the other key criterion for a quality detector. Making sure that the correct information is used for inference is important for a security application. The Integrated Gradient method is used to explain the deep learning model and to reveal why false negatives evade detection. Approaches to building and evaluating a real-world detector are demonstrated and discussed.
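The abstract names file-content entropy as one of the traits that distinguish crypto-ransomware. As a rough illustration only, the sketch below computes the Shannon entropy of a byte buffer in bits per byte. The function name `shannon_entropy` and the sample inputs are hypothetical and are not taken from the paper, whose actual feature extraction is not described here. Random bytes from `os.urandom` stand in for encrypted file content, on the assumption that well-encrypted data looks close to uniformly random.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical samples: repetitive English text vs. random bytes
# (random bytes are an assumed stand-in for ciphertext).
plain = b"The quick brown fox jumps over the lazy dog. " * 200
random_like = os.urandom(len(plain))

print(f"plain text : {shannon_entropy(plain):.2f} bits/byte")
print(f"random data: {shannon_entropy(random_like):.2f} bits/byte")
```

Plain text typically scores well below the 8 bits/byte maximum, while random-looking data scores close to it, which suggests why a jump in the entropy of written file content could serve as a behavioral signal. Compressed files also have high entropy, so entropy alone would not be expected to separate ransomware from benign programs.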
