Paper Title

Find Another Me Across the World -- Large-scale Semantic Trajectory Analysis Using Spark

Authors

Chaoquan Cai, Dan Lin

Abstract

In today's society, location-based services are widely used and collect a huge amount of human trajectories. Analyzing the semantic meanings of these trajectories can benefit numerous real-world applications, such as product advertisement, friend recommendation, and social behavior analysis. However, existing works on semantic trajectories are mostly centralized approaches that cannot keep up with the rapidly growing trajectory collections. In this paper, we propose a novel large-scale semantic trajectory analysis algorithm built on Apache Spark. We design a new hash function along with efficient distributed algorithms that can quickly compute semantic trajectory similarities and identify communities of people with similar behavior across the world. Experimental results show that our approach is more than 30 times faster than centralized approaches, without sacrificing accuracy as other parallel approaches do.
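
To illustrate the general shape of such a hash-then-compare pipeline, the following is a minimal Scala-on-Spark sketch, not the paper's actual algorithm: trajectories are reduced to sequences of visited place categories, bucketed by a hypothetical coarse hash (the most frequently visited category), and compared only within buckets using plain Jaccard similarity. Both the hash and the similarity measure are stand-ins for the paper's own constructions; the point of the bucketing step is to avoid the quadratic all-pairs comparison that a centralized approach would face.

```scala
// A minimal sketch (assumptions throughout): bucket semantic trajectories by a
// coarse hash of their visited place categories, then compare trajectories only
// within the same bucket to find users with similar behavior.
import org.apache.spark.sql.SparkSession

object SemanticTrajectorySketch {
  // A semantic trajectory reduced to the sequence of place categories visited,
  // e.g. Seq("home", "coffee", "office", "gym"). Hypothetical data model.
  case class Trajectory(userId: String, categories: Seq[String])

  // Hypothetical coarse hash: the most frequently visited category. A real
  // scheme (such as the paper's hash function) would be designed so that
  // semantically similar trajectories collide with high probability.
  def semanticHash(t: Trajectory): String =
    if (t.categories.isEmpty) ""
    else t.categories.groupBy(identity).maxBy(_._2.size)._1

  // Jaccard similarity over category sets, a simple stand-in for the paper's
  // semantic trajectory similarity measure.
  def similarity(a: Trajectory, b: Trajectory): Double = {
    val sa = a.categories.toSet
    val sb = b.categories.toSet
    if (sa.isEmpty && sb.isEmpty) 1.0
    else sa.intersect(sb).size.toDouble / sa.union(sb).size.toDouble
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SemanticTrajectorySketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Toy data standing in for a large trajectory collection.
    val trajectories = sc.parallelize(Seq(
      Trajectory("alice", Seq("home", "coffee", "office", "coffee", "gym")),
      Trajectory("bob",   Seq("home", "coffee", "office", "coffee", "gym")),
      Trajectory("carol", Seq("home", "school", "library", "school"))
    ))

    // Group trajectories that hash to the same bucket, then compute pairwise
    // similarities only inside each bucket (avoids the full cross product).
    val similarPairs = trajectories
      .keyBy(semanticHash)
      .groupByKey()
      .flatMap { case (_, ts) =>
        val list = ts.toList
        for {
          i <- list.indices
          j <- (i + 1) until list.size
          s = similarity(list(i), list(j)) if s >= 0.8
        } yield (list(i).userId, list(j).userId, s)
      }

    // Prints pairs of users whose trajectories look semantically similar.
    similarPairs.collect().foreach(println)
    spark.stop()
  }
}
```

In this toy run, alice and bob share a bucket and a high Jaccard score, while carol falls into a different bucket and is never compared against them; community detection over the resulting similarity pairs would be a separate distributed step.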
