Paper Title
Procedural Image Programs for Representation Learning
Authors
Abstract
Learning image representations using synthetic data allows training neural networks without some of the concerns associated with real images, such as privacy and bias. Existing work focuses on a handful of curated generative processes which require expert knowledge to design, making it hard to scale up. To overcome this, we propose training with a large dataset of twenty-one thousand programs, each one generating a diverse set of synthetic images. These programs are short code snippets, which are easy to modify and fast to execute using OpenGL. The proposed dataset can be used for both supervised and unsupervised representation learning, and reduces the gap between pre-training with real and procedurally generated images by 38%.