Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment

Paper Title

Understanding Aesthetics with Language: A Photo Critique Dataset for Aesthetic Assessment

Authors

Daniel Vera Nieto, Luigi Celona, Clara Fernandez-Labrador

Abstract

Computational inference of aesthetics is an ill-defined task due to its subjective nature. Many datasets have been proposed to tackle the problem by providing pairs of images and aesthetic scores based on human ratings. However, humans are better at expressing their opinion, taste, and emotions by means of language rather than summarizing them in a single number. In fact, photo critiques provide much richer information as they reveal how and why users rate the aesthetics of visual stimuli. In this regard, we propose the Reddit Photo Critique Dataset (RPCD), which contains tuples of image and photo critiques. RPCD consists of 74K images and 220K comments and is collected from a Reddit community used by hobbyists and professional photographers to improve their photography skills by leveraging constructive community feedback. The proposed dataset differs from previous aesthetics datasets mainly in three aspects, namely (i) the large scale of the dataset and the extension of the comments criticizing different aspects of the image, (ii) it contains mostly UltraHD images, and (iii) it can easily be extended to new data as it is collected through an automatic pipeline. To the best of our knowledge, in this work, we propose the first attempt to estimate the aesthetic quality of visual stimuli from the critiques. To this end, we exploit the polarity of the sentiment of criticism as an indicator of aesthetic judgment. We demonstrate how sentiment polarity correlates positively with the aesthetic judgment available for two aesthetic assessment benchmarks. Finally, we experiment with several models by using the sentiment scores as a target for ranking images. Dataset and baselines are available (https://github.com/mediatechnologycenter/aestheval).
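
The abstract's core idea, using the sentiment polarity of critiques as a proxy for aesthetic judgment and checking that it correlates with human ratings, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which sentiment model, aggregation rule, or correlation measure the authors use, so this sketch averages hypothetical per-comment polarity values per image and computes a Spearman rank correlation against hypothetical human scores. The image IDs and numbers are toy data, not taken from RPCD.

```python
from collections import defaultdict
from statistics import mean


def image_sentiment_scores(comments):
    """Average per-comment polarity (assumed to lie in [-1, 1]) per image."""
    by_image = defaultdict(list)
    for image_id, polarity in comments:
        by_image[image_id].append(polarity)
    return {image_id: mean(vals) for image_id, vals in by_image.items()}


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy data (hypothetical): (image_id, polarity of one critique).
comments = [
    ("img_a", 0.8), ("img_a", 0.4),
    ("img_b", -0.2), ("img_b", 0.0),
    ("img_c", 0.1),
]
# Hypothetical human aesthetic ratings for the same images.
human_scores = {"img_a": 7.5, "img_b": 4.0, "img_c": 5.5}

sentiment = image_sentiment_scores(comments)
ids = sorted(sentiment)
rho = spearman([sentiment[i] for i in ids], [human_scores[i] for i in ids])
print(f"Spearman correlation on toy data: {rho:.2f}")
```

A rank correlation is used here because the abstract frames the sentiment score as a target for ranking images; a different measure could be substituted without changing the overall idea.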
