Paper Title

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

Paper Authors

Zhenyu Hu, Zhenyu Wu, Pengcheng Pi, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu

Paper Abstract

Unmanned Aerial Vehicle (UAV) based video text spotting has been extensively used in civil and military domains. A UAV's limited battery capacity motivates us to develop an energy-efficient video text spotting solution. In this paper, we first revisit RCNN's crop & resize training strategy and empirically find that it outperforms aligned RoI sampling on a real-world video text dataset captured by UAVs. To reduce energy consumption, we further propose a multi-stage image processor that takes videos' redundancy, continuity, and mixed degradation into account. Lastly, the model is pruned and quantized before being deployed on a Raspberry Pi. Our proposed energy-efficient video text spotting solution, dubbed E^2VTS, outperforms all previous methods by achieving a competitive tradeoff between energy efficiency and performance. All our code and pre-trained models are available at https://github.com/wuzhenyusjtu/LPCVC20-VideoTextSpotting.
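
The abstract notes that the model is pruned and quantized before deployment on a Raspberry Pi. As a rough illustration only (not the authors' actual pipeline, which is in the linked repository), the sketch below shows magnitude pruning followed by dynamic int8 quantization in PyTorch on a small stand-in network; the layer sizes, 30% sparsity level, and output file name are all assumptions made for the example.

```python
# A minimal sketch, assuming a PyTorch model; the real E^2VTS detector and
# its pruning/quantization settings live in the authors' repository.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Stand-in for a text-spotting backbone (hypothetical layer sizes).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),
)

# 1) Unstructured L1 magnitude pruning of conv weights (30% sparsity, assumed).
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # fold the mask into the weights

# 2) Dynamic quantization of the linear layers to int8 for CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# 3) Export with TorchScript so the model can run on an ARM CPU such as the
#    Raspberry Pi's (file name is illustrative).
torch.jit.script(quantized).save("pruned_quantized_model.pt")
```

On CPU-only hardware like the Raspberry Pi, this kind of compression mainly trades a small accuracy drop for lower memory traffic and energy per inference, which is the tradeoff the abstract highlights.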
