Paper Title

Feedback is Needed for Retakes: An Explainable Poor Image Notification Framework for the Visually Impaired

Authors

Kazuya Ohata, Shunsuke Kitada, Hitoshi Iyatomi

Abstract

We propose a simple yet effective image captioning framework that determines the quality of an image and notifies the user of the reasons for any flaws in it. The framework first assesses image quality and then generates captions only for images judged to be of high quality. If image quality is low, the user is notified of the detected flaws and prompted to retake the image, and this cycle repeats until the input image is deemed to be of high quality. As a component of the framework, we trained and evaluated a low-quality image detection model that simultaneously learns how difficult an image is to recognize and which individual flaws it contains, and we demonstrated that our proposal can explain the reasons for flaws with a sufficient score. We also evaluated a dataset from which our framework had removed low-quality images and found improved values on all four common metrics (BLEU-4, METEOR, ROUGE-L, and CIDEr), confirming an improvement in general-purpose image captioning capability. Our framework would assist visually impaired people, who have difficulty judging image quality.
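The assess, notify, retake, and caption cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `QualityReport` and `caption_with_retakes`, the two thresholds, and the convention that a higher score means lower quality are all assumptions made for the example. The detector and the captioning model are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class QualityReport:
    """Hypothetical output of the low-quality image detector."""
    low_quality_score: float       # assumed: higher means harder to recognize
    flaw_scores: Dict[str, float]  # assumed: flaw name -> score in [0, 1]


def caption_with_retakes(
    images: Iterable[object],
    assess: Callable[[object], QualityReport],
    caption: Callable[[object], str],
    notify: Callable[[List[str]], None],
    quality_threshold: float = 0.5,  # assumed cutoff, not from the paper
    flaw_threshold: float = 0.5,     # assumed cutoff, not from the paper
) -> Optional[str]:
    """Caption the first image judged high quality.

    Each element of `images` stands for one capture attempt. For every
    low-quality attempt, the detected flaws are passed to `notify` so the
    user knows why a retake is needed. Returns None if no attempt passes.
    """
    for image in images:
        report = assess(image)
        if report.low_quality_score < quality_threshold:
            # High quality: generate the caption and stop the cycle.
            return caption(image)
        # Low quality: report the flaws whose scores exceed the cutoff.
        flaws = sorted(
            name
            for name, score in report.flaw_scores.items()
            if score >= flaw_threshold
        )
        notify(flaws)
    return None
```

In use, `assess` would wrap the trained detection model and `notify` would wrap whatever feedback channel suits the user, for example speech output. Keeping them as parameters lets the loop be exercised with simple stand-ins.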
