Paper Title

AP-MTL: Attention Pruned Multi-task Learning Model for Real-time Instrument Detection and Segmentation in Robot-assisted Surgery

Authors

Mobarakol Islam, Vibashan VS, Hongliang Ren

Abstract

Surgical scene understanding and multi-task learning are crucial for image-guided robotic surgery. Training a real-time robotic system for the detection and segmentation of high-resolution images poses a challenging problem given limited computational resources. The resulting perception can be applied to effective real-time feedback, surgical skill assessment, and human-robot collaborative surgeries to enhance surgical outcomes. For this purpose, we develop a novel end-to-end trainable real-time Multi-Task Learning (MTL) model with a weight-shared encoder and task-aware detection and segmentation decoders. Optimizing multiple tasks toward the same convergence point is vital and presents a complex problem. Thus, we propose an asynchronous task-aware optimization (ATO) technique to calculate task-oriented gradients and train the decoders independently. Moreover, MTL models are always computationally expensive, which hinders real-time applications. To address this challenge, we introduce global attention dynamic pruning (GADP), which removes less significant and sparse parameters. We further design a skip squeeze-and-excitation (SE) module, which suppresses weak features, excites significant features, and performs dynamic spatial and channel-wise feature re-calibration. Validated on the robotic instrument segmentation dataset of the MICCAI Endoscopic Vision Challenge, our model significantly outperforms state-of-the-art segmentation and detection models, including the best-performing models in the challenge.
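The channel-wise re-calibration performed by a squeeze-and-excitation block can be sketched roughly as follows. This is a minimal illustrative sketch of the standard SE mechanism (global average pooling, a reduction/expansion bottleneck, and a sigmoid gate), not the paper's exact skip-SE module, which additionally performs spatial re-calibration; the function name, weight shapes, and reduction ratio `r` are assumptions for illustration.

```python
import numpy as np

def se_recalibrate(feat, w1, w2):
    """Channel-wise squeeze-and-excitation re-calibration (illustrative sketch).

    feat: feature map of shape (C, H, W)
    w1:   reduction weights of shape (C // r, C)
    w2:   expansion weights of shape (C, C // r)
    """
    squeeze = feat.mean(axis=(1, 2))              # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # excitation: reduction FC + ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # expansion FC + sigmoid gate -> (C,)
    return feat * scale[:, None, None]            # re-calibrate each channel by its gate

# Usage with random features and weights (hypothetical shapes: C=8, H=W=4, r=2)
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
out = se_recalibrate(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Because the sigmoid gate lies in (0, 1), each output channel is a dampened copy of its input: weakly gated channels are suppressed while strongly gated ones pass nearly unchanged, which is the "suppress weak, excite significant" behavior the abstract describes.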
