Providing a Link Prediction Model based on Structural and Homophily Similarity in Social Networks

Paper title

Providing a Link Prediction Model based on Structural and Homophily Similarity in Social Networks

Authors

Alireza Eshaghpour, Mostafa Salehi, Vahid Ranjbar

Abstract

In recent years, as the number of online social networks has grown, these networks have become one of the best markets for advertising and commerce, which makes studying them important. Predicting new edges in online social networks gives a better understanding of how these networks grow. Link prediction has been studied extensively in both engineering and the humanities. Researchers attribute the formation of a new relationship between two individuals to two causes: 1) proximity in the graph (structure) and 2) similar attributes of the two individuals (homophily). However, the combined effect of these two factors on the creation of new edges remains an open problem. Similarity metrics can also be divided into two categories: neighborhood-based and path-based. So far, the two theoretical approaches above (proximity and homophily) have not been combined in neighborhood-based metrics. In this paper, we first propose a way to determine the relative importance of graph proximity and attribute similarity in the formation of connections, and assign the resulting weights to proximity and homophily. We then identify the best similarity metric for each approach. Finally, the selected homophily and structural similarity metrics are combined using the obtained weights. The results were evaluated on two datasets: the Zanjan University Graduate School of Social Sciences and the Pokec online social network. The first dataset was collected for this study by means of questionnaires. Because it is one of the few Iranian datasets compiled together with its users' attributes, it can be of great value. By using both graph proximity and homophily, we were able to increase the accuracy of neighborhood-based similarity metrics.
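
The weighted combination described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which metrics were selected or what weights were obtained, so everything specific here is an assumption made for illustration: the Jaccard index stands in for the neighborhood-based structural metric, a simple attribute-match ratio stands in for the homophily metric, and `w_struct` is a placeholder weight. The function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, Set

Node = Hashable


def jaccard_structural(adj: Dict[Node, Set[Node]], u: Node, v: Node) -> float:
    """Structural similarity (assumed metric): Jaccard index of the neighbor sets."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def attribute_homophily(attrs: Dict[Node, Dict[str, str]], u: Node, v: Node) -> float:
    """Homophily similarity (assumed metric): share of common attributes with equal values."""
    au, av = attrs.get(u, {}), attrs.get(v, {})
    keys = au.keys() & av.keys()
    if not keys:
        return 0.0
    return sum(au[k] == av[k] for k in keys) / len(keys)


def combined_score(
    adj: Dict[Node, Set[Node]],
    attrs: Dict[Node, Dict[str, str]],
    u: Node,
    v: Node,
    w_struct: float = 0.5,  # placeholder weight, not a value from the paper
) -> float:
    """Weighted sum of the structural and homophily scores for the pair (u, v)."""
    structural = jaccard_structural(adj, u, v)
    homophily = attribute_homophily(attrs, u, v)
    return w_struct * structural + (1.0 - w_struct) * homophily
```

In a setup like this, candidate node pairs that are not yet connected would be ranked by `combined_score`, and the highest-scoring pairs would be predicted as new edges. How the weight itself is estimated is the paper's contribution and is not reproduced here.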
