Title

Deep Multimodal Transfer-Learned Regression in Data-Poor Domains

Authors

Levi McClenny, Mulugeta Haile, Vahid Attari, Brian Sadler, Ulisses Braga-Neto, Raymundo Arroyave

Abstract

In many real-world applications of deep learning, estimation of a target may rely on various types of input data modes, such as audio-video, image-text, etc. This task can be further complicated by a lack of sufficient data. Here we propose a Deep Multimodal Transfer-Learned Regressor (DMTL-R) for multimodal learning of image and feature data in a deep regression architecture effective at predicting target parameters in data-poor domains. Our model is capable of fine-tuning a given set of pre-trained CNN weights on a small amount of training image data, while simultaneously conditioning on feature information from a complementary data mode during network training, yielding more accurate single-target or multi-target regression than can be achieved using the images or the features alone. We present results using phase-field simulation microstructure images with an accompanying set of physical features, using pre-trained weights from various well-known CNN architectures, which demonstrate the efficacy of the proposed multimodal approach.
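To make the described architecture concrete, below is a minimal PyTorch sketch of the kind of model the abstract outlines: a pre-trained CNN backbone is fine-tuned on the images, its pooled embedding is concatenated with the physical feature vector, and a small regression head predicts the target(s). This is an illustration under stated assumptions, not the authors' implementation: the `DMTLR` class name, the VGG16 backbone, the 512-dimensional pooled embedding, and the 128-unit head are all assumed choices (the paper reports results for several well-known CNN architectures).

```python
import torch
import torch.nn as nn
from torchvision import models


class DMTLR(nn.Module):
    """Illustrative multimodal transfer-learned regressor (assumed design).

    A pre-trained CNN embeds the image; the embedding is concatenated
    with the feature vector from the complementary data mode, and an
    MLP head regresses the single or multiple targets.
    """

    def __init__(self, n_features: int, n_targets: int):
        super().__init__()
        # Pre-trained convolutional stack, left trainable so it is
        # fine-tuned on the small image dataset. VGG16 is an assumption.
        backbone = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.cnn = backbone.features
        self.pool = nn.AdaptiveAvgPool2d(1)  # -> 512-dim image embedding
        # Regression head over the fused image + feature representation.
        self.head = nn.Sequential(
            nn.Linear(512 + n_features, 128),
            nn.ReLU(),
            nn.Linear(128, n_targets),
        )

    def forward(self, image: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        z = self.pool(self.cnn(image)).flatten(1)  # image embedding
        fused = torch.cat([z, features], dim=1)    # condition on the feature mode
        return self.head(fused)


# Example: batch of 4 images with 8 physical features and 2 regression targets.
model = DMTLR(n_features=8, n_targets=2)
y_hat = model(torch.randn(4, 3, 224, 224), torch.randn(4, 8))  # shape (4, 2)
```

Concatenating the pooled CNN embedding with the feature vector is one simple way to condition the regression on the second modality; because the backbone parameters stay trainable, the pre-trained weights are fine-tuned jointly with the fusion head.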
