TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks

Paper title

TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks

Authors

Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni

Abstract

Time series anomalies can offer information relevant to critical situations facing various fields, from finance and aerospace to the IT, security, and medical domains. However, detecting anomalies in time series data is particularly challenging due to the vague definition of anomalies and said data's frequent lack of labels and highly complex temporal correlations. Current state-of-the-art unsupervised machine learning methods for anomaly detection suffer from scalability and portability issues, and may have high false positive rates. In this paper, we propose TadGAN, an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs). To capture the temporal correlations of time series distributions, we use LSTM Recurrent Neural Networks as base models for Generators and Critics. TadGAN is trained with cycle consistency loss to allow for effective time-series data reconstruction. We further propose several novel methods to compute reconstruction errors, as well as different approaches to combine reconstruction errors and Critic outputs to compute anomaly scores. To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one. We compare our approach to 8 baseline anomaly detection methods on 11 datasets from multiple reputable sources such as NASA, Yahoo, Numenta, Amazon, and Twitter. The results show that our approach can effectively detect anomalies and outperform baseline methods in most cases (6 out of 11). Notably, our method has the highest averaged F1 score across all the datasets. Our code is open source and is available as a benchmarking tool.
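The abstract mentions combining reconstruction errors with Critic outputs into an anomaly score but does not give the formula. The following is a minimal sketch of one plausible combination, not the authors' implementation. It assumes a point-wise absolute reconstruction error, z-score normalization of both signals, a Critic whose output is higher for more realistic samples, and a convex combination with a weight `alpha`. All function names, the `alpha` parameter, and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize a sequence to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant signal carries no information; return all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pointwise_error(original, reconstructed):
    """Absolute difference per time step (one possible reconstruction error)."""
    return [abs(x - y) for x, y in zip(original, reconstructed)]


def anomaly_scores(original, reconstructed, critic_scores, alpha=0.5):
    """Blend reconstruction error and Critic output into a single score.

    Assumption: the Critic scores realistic samples higher, so its output
    is negated to make larger values mean "more anomalous".
    `alpha` is a hypothetical weight between the two normalized signals.
    """
    z_err = zscore(pointwise_error(original, reconstructed))
    z_critic = zscore([-c for c in critic_scores])
    return [alpha * e + (1 - alpha) * c for e, c in zip(z_err, z_critic)]


# Toy data (made up): the fourth point deviates from its reconstruction
# and also receives a low Critic score.
original = [1.0, 1.1, 0.9, 5.0, 1.0]
reconstructed = [1.0, 1.0, 1.0, 1.0, 1.0]
critic = [0.9, 0.8, 0.9, 0.1, 0.9]

scores = anomaly_scores(original, reconstructed, critic)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous, scores)
```

Normalizing both signals before blending keeps either one from dominating purely because of its scale. A multiplicative combination or a different error measure would fit the same interface.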
