Paper Title
Identifying Fake Profiles in LinkedIn
Paper Authors
Paper Abstract
As organizations increasingly rely on professionally oriented networks such as LinkedIn (the largest such social network) for building business connections, there is increasing value in having one's profile noticed within the network. As this value increases, so does the temptation to misuse the network for unethical purposes. Fake profiles undermine the trustworthiness of the network as a whole, and can impose significant costs in the time and effort spent building a connection based on fake information. Unfortunately, fake profiles are difficult to identify. Approaches have been proposed for some social networks; however, these generally rely on data that are not publicly available for LinkedIn profiles. In this research, we identify the minimal set of profile data necessary for identifying fake profiles in LinkedIn, and propose an appropriate data mining approach for fake profile identification. We demonstrate that, even with limited profile data, our approach can identify fake profiles with 87% accuracy and a 94% true negative rate, which is comparable to results obtained from larger data sets and more extensive profile information. Further, when compared to approaches using similar amounts and types of data, our method improves accuracy by approximately 14%.
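The two metrics quoted in the abstract can be made concrete with a minimal sketch. It uses the standard confusion-matrix definitions of accuracy and true negative rate, and it assumes fake profiles are treated as the positive class, which the abstract does not state. The function names and the counts in the usage note are hypothetical illustrations, not the paper's data or results.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all profiles that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def true_negative_rate(tn: int, fp: int) -> float:
    """Fraction of actual negatives that were classified as negative.

    Under the assumption that "fake" is the positive class, this is the
    share of legitimate profiles that were correctly left unflagged.
    """
    negatives = tn + fp
    if negatives == 0:
        raise ValueError("no actual negatives in the data")
    return tn / negatives


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    tp, tn, fp, fn = 40, 45, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
    print(f"true negative rate = {true_negative_rate(tn, fp):.2f}")
```

With these made-up counts, 85 of 100 profiles are classified correctly and 45 of the 50 actual negatives are recognized as negative. A high true negative rate alongside lower overall accuracy, as in the abstract, would indicate that most of the errors fall on the positive class.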