Paper Title
Using network structure and community detection to discover important website features when distinguishing between phishing and legitimate ones
Paper Authors
Paper Abstract
In this paper, we uncover the essential website features that allow intelligent models to distinguish between phishing and legitimate sites. Phishing websites imitate the user interface of a trustworthy website and use a nearly identical address in order to persuade users to enter their private data, which attackers can later misuse. Detecting phishing websites with intelligent systems is an important goal for protecting users, companies, and other online services that use the HTTP protocol. To predict phishing sites, an intelligent model needs to identify which input features are important. In this research, we use correlation-based networks to provide a novel network-based method for finding the features that matter most in phishing detection. The networks are trained and tested on an established phishing dataset. Three different networks are built by partitioning the dataset according to its instance labels. The important features are found by discovering the hubs of these networks, and the results are presented and analysed. This is the first use of a network-based approach for feature selection, and it offers a fast and accurate way to do so.
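The hub-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that features are nodes, that two features are linked when the absolute value of their Pearson correlation reaches a chosen threshold, and that hubs are the nodes with the highest degree. The function names, the threshold value, and the use of degree as the hub measure are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def correlation_network(X, threshold=0.5):
    """Boolean adjacency matrix over features (columns of X).

    Assumption: an edge links two features when |Pearson r| >= threshold.
    """
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(X, rowvar=False)
    corr = np.nan_to_num(corr)       # constant features yield NaN; treat as 0
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)     # no self-loops
    return adj


def hub_features(X, names, threshold=0.5, top_k=3):
    """Return the top_k (feature name, degree) pairs, highest degree first.

    Assumption: a hub is simply a node with high degree.
    """
    adj = correlation_network(X, threshold)
    degree = adj.sum(axis=0)
    order = np.argsort(-degree, kind="stable")[:top_k]
    return [(names[i], int(degree[i])) for i in order]


def hubs_per_label(X, y, names, threshold=0.5, top_k=3):
    """Build one network per distinct label in y and report its hubs."""
    return {
        label.item(): hub_features(X[y == label], names, threshold, top_k)
        for label in np.unique(y)
    }
```

Calling `hubs_per_label(X, y, names)` on a feature matrix `X` with a label vector `y` returns one hub list per label, which can then be compared across the partitions. A graph library could replace the degree count with another centrality measure if a different notion of hub is preferred.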