End-to-End Learned Block-Based Image Compression with Block-Level Masked Convolutions and Asymptotic Closed Loop Training

Paper Title

End-to-End Learned Block-Based Image Compression with Block-Level Masked Convolutions and Asymptotic Closed Loop Training

Paper Author

Kamisli, Fatih

Abstract

Learned image compression research has achieved state-of-the-art compression performance with auto-encoder-based neural network architectures, in which the image is mapped by convolutional neural networks (CNNs) into a latent representation that is quantized and then processed by further CNNs to obtain the reconstructed image. These CNNs operate on entire input images. Traditional state-of-the-art image and video compression methods, on the other hand, process images block by block for various reasons. Very recently, work on learned image compression with block-based approaches has also appeared; these approaches use the auto-encoder architecture on large blocks of the input image and introduce additional neural networks that perform intra/spatial prediction and deblocking/post-processing functions. This paper explores an alternative learned block-based image compression approach in which neither an explicit intra prediction neural network nor an explicit deblocking neural network is used. A single auto-encoder neural network with block-level masked convolutions is used, and the block size is much smaller (8x8). With block-level masked convolutions, each block is processed using the reconstructed neighboring left and upper blocks at both the encoder and the decoder. Hence, the mutual information between adjacent blocks is exploited during compression and each block is reconstructed using its neighboring blocks, removing the need for explicit intra prediction and deblocking neural networks. Since the explored system is a closed-loop system, a special optimization procedure, the asymptotic closed loop design, is used together with standard training based on stochastic gradient descent. The experimental results indicate competitive image compression performance.
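
To make the idea of a block-level mask concrete, the following is a minimal NumPy sketch, not the paper's implementation. It only answers one question: for a convolution window centred on a given pixel, which taps fall inside blocks that are already reconstructed? The function name block_causal_mask, its parameters, the 3x3 default kernel, and the rule that "available" means a block above, to the left, or diagonally above-left of the current block are all assumptions made for illustration; the abstract states only that the left and upper reconstructed blocks are used and that blocks are 8x8.

```python
import numpy as np


def block_causal_mask(row, col, kernel=3, block=8, include_current=False):
    """Return a (kernel x kernel) boolean mask for the window centred at (row, col).

    A tap is True when the pixel it touches lies in a block assumed to be
    already reconstructed: above, to the left, or above-left of the current
    block. This availability rule is an illustrative assumption.
    """
    half = kernel // 2
    cur_br, cur_bc = row // block, col // block  # block index of the centre pixel
    mask = np.zeros((kernel, kernel), dtype=bool)
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if r < 0 or c < 0:
                continue  # outside the image: treated as unavailable
            br, bc = r // block, c // block
            same = (br == cur_br) and (bc == cur_bc)
            earlier = (br <= cur_br) and (bc <= cur_bc) and not same
            mask[dy + half, dx + half] = earlier or (include_current and same)
    return mask


# Top-left pixel of block (1, 1): only taps reaching into the upper and
# left neighbouring blocks are enabled.
print(block_causal_mask(8, 8).astype(int))
```

In this sketch the mask depends on where the pixel sits inside its block, which is what distinguishes a block-level mask from the fixed pixel-level masks used in autoregressive models. The include_current flag is a hypothetical switch for layers that may also see the current block's own samples.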
