Paper title
Smartphone Transportation Mode Recognition Using a Hierarchical Machine Learning Classifier and Pooled Features From Time and Frequency Domains
Paper authors
Paper abstract
This paper develops a novel two-layer hierarchical classifier that improves on the accuracy of traditional transportation mode classification algorithms. It also improves classification accuracy by extracting new frequency domain features. Many researchers have obtained such features from global positioning system (GPS) data; this paper excludes GPS data because using GPS can deplete the smartphone's battery and the signal may be lost in some areas. The proposed two-layer framework differs from previous classification attempts in three ways: 1) the outputs of the two layers are combined using Bayes' rule to choose the transportation mode with the largest posterior probability; 2) the newly extracted features are combined with traditionally used time domain features to create a pool of features; and 3) a different subset of the extracted features is used in each layer, depending on the modes being classified. Several machine learning techniques were used, including k-nearest neighbor, classification and regression tree, support vector machine, random forest, and a heterogeneous framework of random forest and support vector machine. The results show that the proposed framework achieves higher classification accuracy than traditional approaches. Transforming the time domain features to the frequency domain also adds new features in a new space and provides more control over the loss of information. Consequently, pooling the time domain and frequency domain features and then choosing the best subset yields higher accuracy than using either domain alone. The proposed two-layer classifier obtained a maximum classification accuracy of 97.02%.
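Two ideas in the abstract can be illustrated with a minimal Python sketch: pooling time domain and frequency domain features from one sensor window, and combining the outputs of two layers with Bayes' rule. The sketch makes several assumptions that the abstract does not confirm: the specific features (mean, standard deviation, range, spectral energy, dominant frequency), the grouping of modes in the first layer, the mode names, and the factorization P(mode) = P(group) × P(mode | group). All function names are hypothetical, and the layer probabilities are hard-coded where a real system would take them from trained classifiers.

```python
import cmath
import math


def time_features(window):
    """Mean, standard deviation, and range of one sensor window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    return {"t_mean": mean, "t_std": math.sqrt(var),
            "t_range": max(window) - min(window)}


def frequency_features(window, sample_rate_hz):
    """Spectral energy and dominant frequency from a naive O(n^2) DFT."""
    n = len(window)
    mean = sum(window) / n
    centered = [x - mean for x in window]  # remove the DC component
    mags = []
    for k in range(1, n // 2 + 1):  # positive-frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        mags.append(abs(coeff))
    peak_bin = 1 + max(range(len(mags)), key=mags.__getitem__)
    return {"f_energy": sum(m * m for m in mags) / n,
            "f_peak_hz": peak_bin * sample_rate_hz / n}


def pooled_features(window, sample_rate_hz):
    """Merge both domains into one pool; subset selection would follow."""
    return {**time_features(window),
            **frequency_features(window, sample_rate_hz)}


def combine_layers(p_group, p_mode_given_group):
    """Assumed combination: P(mode) = P(group) * P(mode | group).

    Returns the mode with the largest posterior and the full posterior.
    """
    joint = {}
    for group, pg in p_group.items():
        for mode, pm in p_mode_given_group[group].items():
            joint[mode] = pg * pm
    total = sum(joint.values())
    posterior = {mode: p / total for mode, p in joint.items()}
    return max(posterior, key=posterior.get), posterior


# Hypothetical one-second window: a 4 Hz sine sampled at 32 Hz.
fs = 32
window = [math.sin(2 * math.pi * 4 * i / fs) for i in range(fs)]
features = pooled_features(window, fs)

# Hypothetical layer outputs; real values would come from trained models.
layer1 = {"motorized": 0.7, "non_motorized": 0.3}
layer2 = {"motorized": {"car": 0.6, "bus": 0.4},
          "non_motorized": {"walk": 0.5, "run": 0.2, "bike": 0.3}}
best_mode, posterior = combine_layers(layer1, layer2)
print(features["f_peak_hz"], best_mode)
```

In this toy input the dominant frequency falls on the 4 Hz bin and the selected mode is the one whose product of layer probabilities is largest. A practical version would replace the naive DFT with an FFT routine and apply a feature selection step to the pool before training each layer.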