Paper Title

Image Analysis Enhanced Event Detection from Geo-tagged Tweet Streams

Paper Authors

Yi Han, Shanika Karunasekera, Christopher Leckie

Paper Abstract

Events detected from social media streams often include early signs of accidents, crimes or disasters. Therefore, they can be used by related parties for timely and efficient response. Although significant progress has been made on event detection from tweet streams, most existing methods have not considered the images posted in tweets, which provide richer information than the text and can potentially be a reliable indicator of whether an event has occurred. In this paper, we design an event detection algorithm that combines textual, statistical and image information, following an unsupervised machine learning approach. Specifically, the algorithm starts with semantic and statistical analyses to obtain a list of tweet clusters, each of which corresponds to an event candidate, and then performs image analysis to separate events from non-events: a convolutional autoencoder is trained for each cluster as an anomaly detector, where a part of the images are used as the training data and the remaining images are used as the test instances. Our experiments on multiple datasets verify that when an event occurs, the mean reconstruction errors of the training and test images are much closer than in the case where the candidate is a non-event cluster. Based on this finding, the algorithm rejects a candidate if the difference is larger than a threshold. Experimental results over millions of tweets demonstrate that this image analysis enhanced approach can significantly increase precision with minimal impact on recall.
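The rejection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the per-image reconstruction errors have already been produced by a convolutional autoencoder trained on that cluster, and that "difference" means the absolute difference of the two means. The function names, the threshold value and the error values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def reconstruction_gap(train_errors, test_errors):
    """Absolute difference between the mean reconstruction errors
    of a cluster's training images and its held-out test images."""
    if not train_errors or not test_errors:
        raise ValueError("both error lists must be non-empty")
    return abs(mean(train_errors) - mean(test_errors))


def keep_candidate(train_errors, test_errors, threshold):
    """Return True to keep the cluster as an event, False to reject it.

    A candidate is rejected when the gap exceeds the threshold
    (assumed decision rule; the threshold itself is a free parameter)."""
    return reconstruction_gap(train_errors, test_errors) <= threshold


# Made-up reconstruction errors, for illustration only.
event_like = keep_candidate(
    [0.10, 0.12, 0.11], [0.12, 0.13, 0.11], threshold=0.05
)
non_event_like = keep_candidate(
    [0.10, 0.12, 0.11], [0.30, 0.45, 0.38], threshold=0.05
)
```

In this sketch the first cluster is kept because its training and test errors have similar means, while the second is rejected because its test images reconstruct much worse than its training images.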
