Toward Force Estimation in Robot-Assisted Surgery using Deep Learning with Vision and Robot State

Paper Title

Toward Force Estimation in Robot-Assisted Surgery using Deep Learning with Vision and Robot State

Authors

Zonghe Chua, Anthony M. Jarc, Allison M. Okamura

Abstract

Knowledge of interaction forces during teleoperated robot-assisted surgery could be used to enable force feedback to human operators and evaluate tissue handling skill. However, direct force sensing at the end-effector is challenging because it requires biocompatible, sterilizable, and cost-effective sensors. Vision-based deep learning using convolutional neural networks is a promising approach for providing useful force estimates, though questions remain about generalization to new scenarios and real-time inference. We present a force estimation neural network that uses RGB images and robot state as inputs. Using a self-collected dataset, we compared the network to variants that included only a single input type, and evaluated how they generalized to new viewpoints, workspace positions, materials, and tools. We found that vision-based networks were sensitive to shifts in viewpoints, while state-only networks were robust to changes in workspace. The network with both state and vision inputs had the highest accuracy for an unseen tool, and was moderately robust to changes in viewpoints. Through feature removal studies, we found that using only position features produced better accuracy than using only force features as input. The network with both state and vision inputs outperformed a physics-based baseline model in accuracy. It showed comparable accuracy but faster computation times than a baseline recurrent neural network, making it better suited for real-time applications.
