Paper title
MaskIt: Masking for efficient utilization of incomplete public datasets for training deep learning models
Paper authors
Paper abstract
A major challenge in training deep learning models is the lack of high-quality, complete datasets. In this paper, we present a masking approach for training deep learning models on a publicly available but incomplete dataset. For example, the city of Hamburg, Germany, maintains a list of trees along its roads, but this dataset contains no information about trees in private homes and parks. To train a deep learning model on such a dataset, we mask the street trees and aerial images with the road network. The road network used to create the mask is downloaded from OpenStreetMap, and it marks the area where training data is available. The mask is passed to the model as one of the inputs and is also applied to the output. Our model learns to successfully predict trees only in the masked region, with 78.4% accuracy.
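The masking idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`add_mask_channel`, `apply_mask`, `masked_bce`), the array shapes, and the choice of a masked binary cross-entropy loss are all assumptions made for illustration. The abstract only states that the mask is one of the model inputs and is applied to the output.

```python
import numpy as np

def add_mask_channel(image, mask):
    """Append the mask as an extra input channel: (H, W, C) and (H, W) -> (H, W, C+1)."""
    return np.concatenate([image, mask[..., None].astype(image.dtype)], axis=-1)

def apply_mask(prediction, mask):
    """Zero out predictions outside the region where labels are available."""
    return prediction * mask

def masked_bce(prediction, target, mask, eps=1e-7):
    """Binary cross-entropy averaged over masked pixels only (assumed loss)."""
    p = np.clip(prediction, eps, 1.0 - eps)
    per_pixel = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    n_valid = mask.sum()
    return float((per_pixel * mask).sum() / max(n_valid, 1.0))

# Toy data: a 4x4 "aerial image" with 3 channels and a road mask covering columns 1-2.
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
mask = np.zeros((4, 4))
mask[:, 1:3] = 1.0
target = np.zeros((4, 4))
target[2, 1] = 1.0                 # one labelled street tree inside the mask
prediction = rng.random((4, 4))    # stand-in for a model's per-pixel tree probability

model_input = add_mask_channel(image, mask)
masked_prediction = apply_mask(prediction, mask)
loss = masked_bce(prediction, target, mask)
```

Under these assumptions, pixels outside the road mask contribute nothing to the loss, so the missing labels for trees in private gardens and parks are not treated as negative examples.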