Master Thesis: Neural Sign Language Translation by Learning Tokenization

Title

Master Thesis: Neural Sign Language Translation by Learning Tokenization

Author

Orbay, Alptekin

Abstract

In this thesis, we propose a multitask-learning-based method to improve Neural Sign Language Translation (NSLT), which consists of two parts: a tokenization layer and Neural Machine Translation (NMT). The tokenization part determines how Sign Language (SL) videos should be represented before they are fed into the NMT part. It has not been studied thoroughly, whereas NMT research has attracted many researchers and seen enormous advances. To date, there are two main input tokenization levels: frame-level and gloss-level tokenization. Glosses are word-like intermediate representations that are unique to SLs. Therefore, we aim to develop a generic sign-level tokenization layer that is applicable to other domains without further effort. We begin by investigating current tokenization approaches and explain their weaknesses through several experiments. To provide a solution, we adapt Transfer Learning, Multitask Learning, and Unsupervised Domain Adaptation to this research to leverage additional supervision. We succeed in enabling knowledge transfer between SLs and improve translation quality by 5 points in BLEU-4 and 8 points in ROUGE scores. Secondly, we show the effects of body parts through extensive experiments across all the tokenization approaches. In addition, we adopt 3D-CNNs to improve efficiency in terms of time and space. Lastly, we discuss the advantages of sign-level tokenization over gloss-level tokenization. To sum up, our proposed method removes the need for gloss-level annotation to obtain higher scores, instead drawing additional supervision from weak supervision sources.
