Remote Sensing Image Scene Classification with Deep Neural Networks in JPEG 2000 Compressed Domain

Paper Title

Remote Sensing Image Scene Classification with Deep Neural Networks in JPEG 2000 Compressed Domain

Authors

Akshara Preethy Byju, Gencer Sumbul, Begüm Demir, Lorenzo Bruzzone

Abstract

To reduce storage requirements, remote sensing (RS) images are usually stored in compressed format. Existing scene classification approaches based on deep neural networks (DNNs) require the images to be fully decompressed, which is a computationally demanding task in operational applications. To address this issue, in this paper we propose a novel approach to scene classification in JPEG 2000 compressed RS images. The proposed approach consists of two main steps: i) approximation of the finer resolution sub-bands of the reversible biorthogonal wavelet filters used in JPEG 2000; and ii) characterization of the high-level semantic content of the approximated wavelet sub-bands and scene classification based on the learnt descriptors. This is achieved by taking the codestreams associated with the coarsest resolution wavelet sub-band as input and approximating the finer resolution sub-bands with a number of transposed convolutional layers. A series of convolutional layers then models the high-level semantic content of the approximated wavelet sub-bands. Thus, the proposed approach models the multiresolution paradigm of the JPEG 2000 compression algorithm in a unified, end-to-end trainable neural network. In the classification stage, the proposed approach takes only the coarsest resolution wavelet sub-bands as input, thereby reducing the time required for decoding. Experiments on two benchmark aerial image archives demonstrate that the proposed approach significantly reduces the computational time while achieving classification accuracies similar to those of traditional RS scene classification approaches, which require full image decompression.
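To make the terms "coarsest resolution sub-band" and "reversible biorthogonal wavelet filter" concrete, the following is a minimal sketch of one level of the integer 5/3 lifting transform, the reversible filter of JPEG 2000, applied to a 1-D signal. It is not the authors' network and not a JPEG 2000 codec: the function names `forward_53`, `inverse_53`, and `coarsest_subband` are hypothetical, the boundary handling is a simple symmetric extension for even-length inputs, and JPEG 2000 itself applies the transform separably to image rows and columns.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform (even-length int list)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch assumes an even-length signal"
    half = n // 2
    # Predict step: detail (high-pass) coefficients from the odd samples.
    d = []
    for i in range(half):
        left = x[2 * i]
        # Symmetric extension at the right boundary.
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        d.append(x[2 * i + 1] - (left + right) // 2)
    # Update step: approximation (low-pass) coefficients from the even samples.
    s = []
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]  # symmetric extension at the left
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)
    return s, d


def inverse_53(s, d):
    """Exactly undo forward_53 by reversing the two lifting steps."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (prev + d[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x


def coarsest_subband(x, levels):
    """Repeatedly transform the low-pass band; return it and all detail bands."""
    details = []
    s = list(x)
    for _ in range(levels):
        s, d = forward_53(s)
        details.append(d)
    return s, details


signal = [10, 12, 14, 13, 9, 7, 8, 20]
low, high = forward_53(signal)
restored = inverse_53(low, high)
coarse, detail_bands = coarsest_subband(signal, 2)
```

After two levels, `coarse` holds a quarter of the samples of `signal`. The approach described in the abstract decodes only this kind of coarsest band and lets transposed convolutional layers approximate the finer bands, instead of decoding them and running the inverse transform.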
