Paper title

A benchmark of data stream classification for human activity recognition on connected objects

Authors

Martin Khannouz, Tristan Glatard

Abstract

This paper evaluates data stream classifiers from the perspective of connected devices, focusing on the use case of human activity recognition (HAR). We measure both the classification performance and the resource consumption (runtime, memory, and power) of five common stream classification algorithms, implemented in a consistent library and applied to two real human activity datasets and three synthetic datasets. Regarding classification performance, results show an overall superiority of the Hoeffding Tree (HT), Mondrian forest (MF), and Naive Bayes (NB) classifiers over the Feedforward Neural Network (FNN) and Micro Cluster Nearest Neighbor (MCNN) classifiers on 4 datasets out of 6, including the real ones. In addition, HT and, to some extent, MCNN are the only classifiers that can recover from a concept drift. Overall, the three leading classifiers still perform substantially worse than an offline classifier on the real datasets. Regarding resource consumption, HT and MF are the most memory-intensive and have the longest runtimes; however, no difference in power consumption is found between classifiers. We conclude that stream learning for HAR on connected objects is challenged by two factors that could lead to interesting future work: high memory consumption and low F1 scores overall.
