Unsupervised User-Based Insider Threat Detection Using Bayesian Gaussian Mixture Models

Paper Title

Unsupervised User-Based Insider Threat Detection Using Bayesian Gaussian Mixture Models

Authors

Simon Bertrand, Nadia Tawbi, Josée Desharnais

Abstract

Insider threats are a growing concern for organizations because of the damage their members can inflict by combining privileged access with domain knowledge. Detecting such threats is nonetheless challenging, precisely because authorized personnel can easily carry out malicious actions, and because the few malicious footprints are hidden in the immense and diverse audit data that organizations produce. In this paper, we propose an unsupervised insider threat detection system based on audit data using Bayesian Gaussian Mixture Models. The proposed approach leverages a user-based model to optimize the modeling of specific behaviors, and an automatic feature extraction system based on Word2Vec for ease of use in a real-life scenario. The solution distinguishes itself by requiring neither data balancing nor training exclusively on normal instances, and by the little domain knowledge needed to implement it. Still, results indicate that the proposed method competes with state-of-the-art approaches, presenting a good recall of 88%, accuracy and true negative rate of 93%, and a false positive rate of 6.9%. For our experiments, we used the benchmark dataset CERT version 4.2.
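
To make the general idea in the abstract concrete, the following is a minimal sketch of likelihood-based anomaly scoring with a Bayesian Gaussian Mixture Model for a single user. It is not the authors' pipeline: the synthetic feature vectors stand in for the Word2Vec-derived features, the helper names `fit_user_model` and `flag_low_likelihood` are hypothetical, and both the 5% quantile threshold and the choice to score new activity against a model fitted on earlier activity are assumptions made for illustration only.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def fit_user_model(features, max_components=5, seed=0):
    """Fit one mixture model on a single user's feature vectors (hypothetical helper)."""
    model = BayesianGaussianMixture(
        n_components=max_components,  # upper bound; unused components get small weights
        covariance_type="full",
        weight_concentration_prior_type="dirichlet_process",
        max_iter=500,
        random_state=seed,
    )
    model.fit(features)
    return model


def flag_low_likelihood(model, features, threshold):
    """Return per-sample log-likelihoods and a mask of samples below the threshold."""
    scores = model.score_samples(features)
    return scores, scores < threshold


# Synthetic stand-in data (assumption): 300 "usual" activity vectors for one user
# and 3 clearly different vectors observed later.
rng = np.random.default_rng(0)
history = rng.normal(loc=0.0, scale=1.0, size=(300, 4))
new_activity = rng.normal(loc=8.0, scale=0.5, size=(3, 4))

user_model = fit_user_model(history)

# Assumed decision rule: anything less likely than the lowest 5% of the
# user's own history is flagged for review.
history_scores = user_model.score_samples(history)
threshold = np.quantile(history_scores, 0.05)

hist_scores, hist_flags = flag_low_likelihood(user_model, history, threshold)
new_scores, new_flags = flag_low_likelihood(user_model, new_activity, threshold)

print("fraction of history flagged:", hist_flags.mean())
print("new activity flagged:", new_flags)
```

Fitting one model per user means each score is relative to that user's own behavior rather than to a population-wide baseline, which is the intuition behind a user-based model. The quantile sets the share of a user's own history that gets flagged, so it governs the trade-off between recall and false positive rate.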
