Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots

Paper title

Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots

Authors

Akanksha Saran, Kush Desai, Mai Lee Chang, Rudolf Lioutikov, Andrea Thomaz, Scott Niekum

Abstract

Humans use audio signals in the form of spoken language or verbal reactions effectively when teaching new skills or tasks to other humans. While demonstrations allow humans to teach robots in a natural way, learning from trajectories alone does not leverage other available modalities including audio from human teachers. To effectively utilize audio cues accompanying human demonstrations, first it is important to understand what kind of information is present and conveyed by such cues. This work characterizes audio from human teachers demonstrating multi-step manipulation tasks to a situated Sawyer robot using three feature types: (1) duration of speech used, (2) expressiveness in speech or prosody, and (3) semantic content of speech. We analyze these features along four dimensions and find that teachers convey similar semantic concepts via spoken words for different conditions of (1) demonstration types, (2) audio usage instructions, (3) subtasks, and (4) errors during demonstrations. However, differentiating properties of speech in terms of duration and expressiveness are present along the four dimensions, highlighting that human audio carries rich information, potentially beneficial for technological advancement of robot learning from demonstration methods.
