Paper Title

AccMPEG: Optimizing Video Encoding for Video Analytics

Paper Authors

Kuntai Du, Qizheng Zhang, Anton Arapin, Haodong Wang, Zhengxu Xia, Junchen Jiang

Paper Abstract

With more videos being recorded by edge sensors (cameras) and analyzed by computer-vision deep neural nets (DNNs), a new breed of video streaming systems has emerged, with the goal of compressing and streaming videos to remote servers in real time while preserving enough information to allow highly accurate inference by the server-side DNNs. An ideal design of the video streaming system should simultaneously meet three key requirements: (1) low latency of encoding and streaming, (2) high accuracy of server-side DNNs, and (3) low compute overhead on the camera. Unfortunately, despite many recent efforts, such a video streaming system has hitherto been elusive, especially when serving advanced vision tasks such as object detection or semantic segmentation. This paper presents AccMPEG, a new video encoding and streaming system that meets all three requirements. The key is to learn how much the encoding quality at each (16x16) macroblock can influence the server-side DNN accuracy, which we call the accuracy gradient. Our insight is that these macroblock-level accuracy gradients can be inferred with sufficient precision by feeding the video frames through a cheap model. AccMPEG provides a suite of techniques that, given a new server-side DNN, can quickly create a cheap model to infer the accuracy gradient on any new frame in near real time. Our extensive evaluation of AccMPEG on two types of edge devices (an Intel Xeon Silver 4100 CPU or an NVIDIA Jetson Nano) and three vision tasks (six recent pre-trained DNNs) shows that AccMPEG (with the same camera-side compute resources) can reduce the end-to-end inference delay by 10-43% without hurting accuracy, compared to the state-of-the-art baselines.

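To make the idea in the abstract concrete, below is a minimal PyTorch sketch of the pipeline it describes: a cheap model maps a frame to a per-macroblock accuracy-gradient map, which is then turned into a per-macroblock quality (QP) assignment for the encoder. The CheapAccGradModel architecture, the 0.5 threshold, and the QP values 24/40 are illustrative assumptions for this sketch, not details taken from the paper.

# A minimal sketch (not the authors' implementation) of the idea in the abstract:
# a cheap model predicts a per-(16x16)-macroblock "accuracy gradient" map, which
# is thresholded into a quality map telling the encoder which macroblocks deserve
# high encoding quality. Architecture, threshold, and QP values are assumptions.
import torch
import torch.nn as nn

MB = 16  # macroblock size in pixels

class CheapAccGradModel(nn.Module):
    """Tiny CNN mapping an H x W frame to an (H/16) x (W/16) accuracy-gradient map."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, stride=4, padding=1),  # total downsample: 16x
        )

    def forward(self, frame):  # frame: (N, 3, H, W), H and W divisible by 16
        return torch.sigmoid(self.features(frame))  # (N, 1, H/16, W/16), values in [0, 1]

def quality_map(frame, model, threshold=0.5, high_qp=24, low_qp=40):
    """Return a per-macroblock QP map: low QP (high quality) where the predicted
    accuracy gradient is large, high QP (low quality) elsewhere."""
    with torch.no_grad():
        acc_grad = model(frame.unsqueeze(0))[0, 0]  # (H/16, W/16)
    return torch.where(acc_grad > threshold,
                       torch.full_like(acc_grad, high_qp),
                       torch.full_like(acc_grad, low_qp))

# Usage: run each frame through the cheap model on the camera, then hand the QP
# map to an encoder that supports per-macroblock quality control (e.g.,
# region-of-interest encoding); the resulting stream is what the server-side
# DNN consumes.
model = CheapAccGradModel().eval()
frame = torch.rand(3, 720, 1280)  # dummy 720p RGB frame
qp_map = quality_map(frame, model)
print(qp_map.shape)  # torch.Size([45, 80]) -> one QP value per macroblock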