Paper Title

One-off Negative Sequential Pattern Mining

Authors

Youxi Wu, Mingjie Chen, Yan Li, Jing Liu, Zhao Li, Jinyan Li, Xindong Wu

Abstract

Negative sequential pattern mining is an important research topic in sequential pattern mining (SPM). Unlike positive SPM, negative SPM can discover events that should have occurred but have not occurred, and it can be used for financial risk management and fraud detection. However, existing methods generally ignore the repetitions of the pattern and do not consider gap constraints, which can lead to mining results containing a large number of patterns that users are not interested in. To solve this problem, this paper discovers frequent one-off negative sequential patterns (ONPs). This problem has the following two characteristics. First, the support is calculated under the one-off condition, which means that any character in the sequence can be used at most once. Second, the gap constraint can be given by the user. To efficiently mine patterns, this paper proposes the ONP-Miner algorithm, which employs depth-first and backtracking strategies to calculate the support. Therefore, ONP-Miner can effectively avoid creating redundant nodes and parent-child relationships. Moreover, to effectively reduce the number of candidate patterns, ONP-Miner uses pattern join and pruning strategies to generate and further prune the candidate patterns, respectively. Experimental results show that ONP-Miner not only improves the mining efficiency, but also has better mining performance than the state-of-the-art algorithms. More importantly, ONP mining can find more interesting patterns in traffic volume data to predict future traffic.
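To make the one-off condition and the gap constraint concrete, the following is a minimal Python sketch of a greedy support count that uses depth-first search with backtracking. It is not the ONP-Miner algorithm from the paper. The function name `one_off_support`, the representation of a pattern as positive `items` plus an optional `forbidden` character for each gap, and the interpretation of a negative element as "this character must not appear between the two matched positions" are all assumptions made for illustration. Because the search is greedy from left to right, the count is not guaranteed to be the maximum possible.

```python
from typing import List, Optional, Sequence


def one_off_support(seq: str, items: Sequence[str],
                    forbidden: Sequence[Optional[str]],
                    min_gap: int, max_gap: int) -> int:
    """Greedily count occurrences that share no sequence position (one-off).

    items      -- positive characters of the pattern, in order (assumed form)
    forbidden  -- forbidden[i] is a character that must NOT occur strictly
                  between the matches of items[i] and items[i + 1], or None
    min_gap, max_gap -- allowed number of characters between two matches
    """
    used = [False] * len(seq)  # one-off condition: each position used at most once

    def extend(pos: int, depth: int, path: List[int]) -> bool:
        # seq[pos] already matches items[depth]; try to complete the occurrence.
        path.append(pos)
        if depth == len(items) - 1:
            return True
        lo = pos + 1 + min_gap
        hi = min(pos + 1 + max_gap, len(seq) - 1)
        for nxt in range(lo, hi + 1):
            if used[nxt] or seq[nxt] != items[depth + 1]:
                continue
            neg = forbidden[depth]
            if neg is not None and neg in seq[pos + 1:nxt]:
                continue  # the negative element occurred inside the gap
            if extend(nxt, depth + 1, path):
                return True
        path.pop()  # backtrack: this position cannot be completed
        return False

    count = 0
    for start in range(len(seq)):
        if used[start] or seq[start] != items[0]:
            continue
        path: List[int] = []
        if extend(start, 0, path):
            for p in path:
                used[p] = True  # mark positions so no later occurrence reuses them
            count += 1
    return count
```

For example, with `seq = "adcabc"`, `items = "ac"`, `forbidden = ["b"]`, and a gap of 0 to 1, only the first `a…c` pair counts, because a `b` lies between the second pair.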
