Paper Title

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

Paper Authors

Zhenyu Hu, Zhenyu Wu, Pengcheng Pi, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu

Paper Abstract

Unmanned Aerial Vehicle (UAV) based video text spotting has been extensively used in civil and military domains. A UAV's limited battery capacity motivates us to develop an energy-efficient video text spotting solution. In this paper, we first revisit RCNN's crop & resize training strategy and empirically find that it outperforms aligned RoI sampling on a real-world video text dataset captured by UAVs. To reduce energy consumption, we further propose a multi-stage image processor that takes videos' redundancy, continuity, and mixed degradation into account. Lastly, the model is pruned and quantized before being deployed on a Raspberry Pi. Our proposed energy-efficient video text spotting solution, dubbed E^2VTS, outperforms all previous methods by achieving a competitive tradeoff between energy efficiency and performance. All our code and pre-trained models are available at https://github.com/wuzhenyusjtu/LPCVC20-VideoTextSpotting.
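
The abstract notes that the model is pruned and quantized before deployment on a Raspberry Pi. As a rough illustration only (not the authors' actual pipeline, which is in the linked repository), the sketch below shows magnitude pruning followed by dynamic int8 quantization in PyTorch on a small stand-in network; the layer sizes, 30% sparsity level, and output file name are all assumptions made for the example.

```python
# A minimal sketch, assuming a PyTorch model; the real E^2VTS detector and
# its pruning/quantization settings live in the authors' repository.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Stand-in for a text-spotting backbone (hypothetical layer sizes).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),
)

# 1) Unstructured L1 magnitude pruning of conv weights (30% sparsity, assumed).
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # fold the mask into the weights

# 2) Dynamic quantization of the linear layers to int8 for CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# 3) Export with TorchScript so the model can run on an ARM CPU such as the
#    Raspberry Pi's (file name is illustrative).
torch.jit.script(quantized).save("pruned_quantized_model.pt")
```

On CPU-only hardware like the Raspberry Pi, this kind of compression mainly trades a small accuracy drop for lower memory traffic and energy per inference, which is the tradeoff the abstract highlights.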
