Paper Title
EllSeg-Gen, towards Domain Generalization for head-mounted eyetracking
Paper Authors
Paper Abstract
The study of human gaze behavior in natural contexts requires gaze estimation algorithms that are robust to a wide range of imaging conditions. However, algorithms often fail to identify features such as the iris and pupil centroid in the presence of reflective artifacts and occlusions. Previous work has shown that convolutional networks excel at extracting gaze features despite such artifacts. However, these networks often perform poorly on data unseen during training. This work follows the intuition that a convolutional network trained jointly on multiple datasets learns a generalized representation of eye parts. We compare the performance of a single model trained on multiple datasets against a pool of models each trained on an individual dataset. Results indicate that models tested on datasets whose eye images exhibit higher appearance variability benefit from multiset training. In contrast, dataset-specific models generalize better to eye images with lower appearance variability.
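To make the two training regimes in the abstract concrete, here is a minimal sketch, assuming a PyTorch-style setup. It is not the authors' EllSeg implementation: the dataset names, the random stand-in images and masks, and the tiny placeholder network are all hypothetical. It only illustrates the contrast between one model jointly trained on pooled datasets and a pool of dataset-specific models.

```python
# Minimal sketch (hypothetical, not the authors' code) of multiset vs.
# dataset-specific training for eye-part segmentation.
import torch
from torch import nn
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

def make_fake_eye_dataset(n=64, num_classes=3):
    """Stand-in for a real eye-image dataset: grayscale crops with
    per-pixel labels (e.g. background / iris / pupil)."""
    images = torch.randn(n, 1, 60, 80)
    masks = torch.randint(0, num_classes, (n, 60, 80))
    return TensorDataset(images, masks)

def make_segmenter(num_classes=3):
    # Placeholder per-pixel classifier; the actual EllSeg network is larger.
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, num_classes, 1),
    )

def train(model, loader, epochs=1):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, masks in loader:
            opt.zero_grad()
            loss = loss_fn(model(images), masks)
            loss.backward()
            opt.step()
    return model

# Hypothetical dataset names; in the paper these would be distinct
# head-mounted eye-tracking datasets with varying appearance.
datasets = {name: make_fake_eye_dataset() for name in ["setA", "setB", "setC"]}

# Multiset regime: a single model sees samples pooled from every dataset.
joint_model = train(
    make_segmenter(),
    DataLoader(ConcatDataset(list(datasets.values())), batch_size=8, shuffle=True),
)

# Dataset-specific regime: one independently trained model per dataset.
specific_models = {
    name: train(make_segmenter(), DataLoader(ds, batch_size=8, shuffle=True))
    for name, ds in datasets.items()
}
```

Under this setup, the paper's comparison amounts to evaluating `joint_model` and each entry of `specific_models` on held-out data from every dataset, with the abstract's finding that the joint model wins on high-variability test sets while the specific models win on low-variability ones.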