Generalised Interpretable Shapelets for Irregular Time Series

Paper Title

Generalised Interpretable Shapelets for Irregular Time Series

Authors

Patrick Kidger, James Morrill, Terry Lyons

Abstract

The shapelet transform is a form of feature extraction for time series, in which a time series is described by its similarity to each of a collection of 'shapelets'. However, it has previously suffered from a number of limitations, such as being limited to regularly-spaced, fully-observed time series, and having to choose between efficient training and interpretability. Here, we extend the method to continuous time, and in doing so handle the general case of irregularly-sampled, partially-observed multivariate time series. Furthermore, we show that a simple regularisation penalty may be used to train efficiently without sacrificing interpretability. The continuous-time formulation additionally allows the length of each shapelet (previously a discrete object) to be learnt in a differentiable manner. Finally, we demonstrate that the measure of similarity between time series may be generalised to a learnt pseudometric. We validate our method by demonstrating its performance and interpretability on several datasets; for example, we discover (purely from data) that the digits 5 and 6 may be distinguished by the chirality of their bottom loop, and that a kind of spectral gap exists in spoken audio classification.
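
For orientation, the classical shapelet transform that the abstract takes as its starting point can be sketched in a few lines. This is a minimal illustration of the discrete, regularly-sampled, univariate case only, not the continuous-time method the paper proposes. The function names are hypothetical, and the squared Euclidean distance is assumed here as one common choice of similarity measure.

```python
def shapelet_distance(series, shapelet):
    """Smallest squared Euclidean distance between the shapelet and
    any contiguous window of the series of the same length."""
    n, m = len(series), len(shapelet)
    if m == 0 or m > n:
        raise ValueError("shapelet must be non-empty and no longer than the series")
    return min(
        sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        for start in range(n - m + 1)
    )


def shapelet_transform(series, shapelets):
    """Describe a series by its distance to each shapelet in a collection."""
    return [shapelet_distance(series, s) for s in shapelets]


# Toy data (made up for illustration): a single bump, and two candidate shapelets.
series = [0.0, 1.0, 2.0, 1.0, 0.0]
shapelets = [[1.0, 2.0, 1.0], [0.0, 0.0]]
features = shapelet_transform(series, shapelets)
```

Each entry of `features` is small when the corresponding shapelet appears somewhere in the series, so the resulting vector can be passed to any standard classifier. Note that the sliding window assumes evenly spaced, fully observed samples, which is the restriction the abstract says the continuous-time formulation removes.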
