Paper Title
Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances
Paper Authors
Paper Abstract
We consider the problem of embodied visual navigation given an image-goal (ImageNav), where an agent is initialized in an unfamiliar environment and tasked with navigating to a location 'described' by an image. Unlike related navigation tasks, ImageNav does not have a standardized task definition, which makes comparison across methods difficult. Further, existing formulations have two problematic properties: (1) image-goals are sampled from random locations, which can lead to ambiguity (e.g., looking at walls), and (2) image-goals match the camera specification and embodiment of the agent; this rigidity is limiting when considering user-driven downstream applications. We present the Instance-specific ImageNav task (InstanceImageNav) to address these limitations. Specifically, the goal image is 'focused' on some particular object instance in the scene and is taken with camera parameters independent of the agent. We instantiate InstanceImageNav in the Habitat Simulator using scenes from the Habitat-Matterport3D dataset (HM3D) and release a standardized benchmark to measure community progress.