Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection

Paper Title

Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection

Authors

MohammadNoor Injadat, Abdallah Moubayed, Ali Bou Nassif, Abdallah Shami

Abstract

Cyber-security has garnered significant attention due to the increased dependency of individuals and organizations on the Internet and their concern about the security and privacy of their online activities. Several machine learning (ML)-based network intrusion detection systems (NIDSs) have previously been developed to protect against malicious online behavior. This paper proposes a novel multi-stage optimized ML-based NIDS framework that reduces computational complexity while maintaining detection performance. The work studies the impact of oversampling techniques on the models' training sample size and determines the minimal suitable training sample size. Furthermore, it compares two feature selection techniques, information gain-based and correlation-based, and explores their effect on detection performance and time complexity. Moreover, different ML hyper-parameter optimization techniques are investigated to enhance the NIDS's performance. The performance of the proposed framework is evaluated using two recent intrusion detection datasets, the CICIDS 2017 and the UNSW-NB 2015 datasets. Experimental results show that the proposed model significantly reduces the required training sample size (by up to 74%) and the feature set size (by up to 50%). Moreover, hyper-parameter optimization enhances model performance, yielding detection accuracies over 99% on both datasets and outperforming recent works in the literature with 1-2% higher accuracy and a 1-2% lower false alarm rate.
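To make the information gain-based feature selection mentioned in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the toy flow records, and the `keep_fraction=0.5` default (chosen only to echo the "up to 50%" feature reduction) are assumptions made for illustration. It also assumes discrete feature values, so continuous traffic features would need to be binned first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_features(columns, labels, keep_fraction=0.5):
    """Rank feature names by information gain and keep the top fraction (at least one)."""
    ranked = sorted(
        columns,
        key=lambda name: information_gain(columns[name], labels),
        reverse=True,
    )
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


# Hypothetical toy data: four flow records with four discrete features each.
labels = ["attack", "attack", "benign", "benign"]
columns = {
    "flag_syn": [1, 1, 0, 0],                # separates the classes perfectly
    "proto": ["tcp", "udp", "tcp", "udp"],   # carries no class information
    "port_high": [1, 1, 1, 0],               # partially informative
    "ttl_low": [0, 0, 0, 0],                 # constant, carries no information
}

selected = select_top_features(columns, labels)
print(selected)
```

On this toy data the ranking keeps the two features that actually carry class information and drops the uninformative ones, which is the effect a feature selection stage is meant to have before model training.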
