Paper Title

Adaptive Divergence-based Non-negative Latent Factor Analysis

Authors

Ye Yuan, Guangxiao Yuan, Renfang Wang, Xin Luo

Abstract

High-Dimensional and Incomplete (HDI) data are common in industrial applications that involve complex interactions among numerous nodes. Such data are usually non-negative, reflecting the inherent non-negativity of node interactions. A Non-negative Latent Factor (NLF) model can extract intrinsic features from such data efficiently. However, existing NLF models all build their learning objectives on a static divergence metric, such as Euclidean distance or α-β-divergence, which greatly restricts their scalability in accurately representing HDI data from different domains. To address this issue, this study presents an Adaptive Divergence-based Non-negative Latent Factor (ADNLF) model built on three ideas: a) generalizing the objective function with the α-β-divergence to expand its potential for representing various HDI data; b) adopting a non-negative bridging function to connect the optimization variables with the output latent factors, so that the non-negativity constraints are always fulfilled; and c) making the divergence parameters adaptive through particle swarm optimization, thereby enabling an adaptive divergence in the learning objective to achieve high scalability. Empirical studies on four HDI datasets from real applications demonstrate that, compared with state-of-the-art NLF models, the ADNLF model achieves significantly higher estimation accuracy for the missing data of an HDI dataset, with high computational efficiency.
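
Ideas a) and b) from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It uses the standard α-β-divergence formula for the non-degenerate case (α, β and α+β all nonzero) and evaluates it only on observed entries. The softplus bridging function, the function names and the dictionary layout of the data are assumptions made for illustration; the abstract does not state which bridging function ADNLF uses. The particle swarm search over (α, β) from idea c) is omitted.

```python
import math


def ab_divergence(p, q, alpha, beta):
    """Alpha-beta divergence summed over paired positive values p[i], q[i].

    Covers only alpha != 0, beta != 0 and alpha + beta != 0; the limiting
    cases need separate formulas and are not handled in this sketch.
    """
    s = alpha + beta
    if alpha == 0 or beta == 0 or s == 0:
        raise ValueError("limiting cases are not handled in this sketch")
    total = 0.0
    for x, y in zip(p, q):
        total += (x ** alpha) * (y ** beta) - alpha / s * x ** s - beta / s * y ** s
    return -total / (alpha * beta)


def softplus(z):
    """Assumed bridging function: maps any real z to a positive value."""
    # Numerically stable form of log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def predict(x_u, y_i):
    """Estimate one entry from unconstrained variables via the bridge."""
    return sum(softplus(a) * softplus(b) for a, b in zip(x_u, y_i))


def objective(observed, X, Y, alpha, beta):
    """Divergence between observed entries and their current estimates.

    observed: dict mapping (row, column) to a positive observed value.
    X, Y: lists of unconstrained variable vectors, one per row / column.
    """
    targets = list(observed.values())
    estimates = [predict(X[u], Y[i]) for (u, i) in observed]
    return ab_divergence(targets, estimates, alpha, beta)


# Hypothetical toy data: 2 rows, 2 columns, 3 observed entries, rank 2.
observed = {(0, 0): 4.0, (0, 1): 1.0, (1, 1): 2.5}
X = [[0.3, -1.2], [0.8, 0.1]]
Y = [[1.0, 0.4], [-0.5, 0.9]]
loss = objective(observed, X, Y, alpha=1.0, beta=1.0)
```

With α = β = 1 the divergence reduces to half the squared Euclidean distance, so a Euclidean objective is one fixed point in the (α, β) plane. Because the optimization variables in `X` and `Y` are unconstrained and only their softplus images enter the estimate, the latent factors stay positive without any projection step.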
