Paper Title
PrognoseNet: A Generative Probabilistic Framework for Multimodal Position Prediction given Context Information
Authors
Abstract
The ability to predict multiple possible future positions of the ego-vehicle given the surrounding context, while also estimating their probabilities, is key to safe autonomous driving. Most current state-of-the-art Deep Learning approaches are trained on trajectory data to achieve this task. However, trajectory data captured by sensor systems is highly imbalanced, since the vast majority of trajectories follow straight lines at an approximately constant velocity. This poses a major challenge for the task of predicting future positions, which is inherently a regression problem. Current state-of-the-art approaches alleviate this problem only through major preprocessing of the training data, e.g. resampling or clustering into anchors. In this paper we propose an approach that reformulates the prediction problem as a classification task, allowing powerful tools such as focal loss to be used to combat the imbalance. To this end we design a generative probabilistic model consisting of a deep neural network with a Mixture of Gaussians head. A smart choice of the latent variable allows the log-likelihood function to be reformulated as a combination of a classification problem and a much simplified regression problem. The output of our model is an estimate of the probability density function of future positions, which allows multiple possible positions to be predicted together with their probabilities. The proposed approach can easily incorporate context information and does not require any preprocessing of the data.
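The abstract does not spell out the loss, so the following is only a minimal sketch of how a Mixture of Gaussians head might split into a classification term and a regression term. It assumes that the latent variable is the index of the mixture component, that components are isotropic 2-D Gaussians, and that each target is hard-assigned to the component with the nearest mean. The function names, the assignment rule, and the `gamma` parameter are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gaussian_log_pdf(y, mean, sigma):
    """Log density of an isotropic 2-D Gaussian N(y; mean, sigma^2 I)."""
    sq = ((y - mean) ** 2).sum(axis=-1)
    return -sq / (2.0 * sigma ** 2) - 2.0 * np.log(sigma) - np.log(2.0 * np.pi)

def mixture_pdf(y, logits, means, sigmas):
    """Density p(y) = sum_k pi_k N(y; mu_k, sigma_k^2 I).

    y: (2,), logits: (K,), means: (K, 2), sigmas: (K,)
    """
    pi = softmax(logits)
    return float((pi * np.exp(gaussian_log_pdf(y, means, sigmas))).sum())

def focal_plus_regression_loss(y, logits, means, sigmas, gamma=2.0):
    """Classification term (focal loss) plus regression term (Gaussian NLL).

    Assumption: the target is assigned to the component with the nearest mean.
    With gamma = 0 the sum equals -log(pi_k * N(y; mu_k, sigma_k^2 I)),
    i.e. the negative log of the joint density of y and component k.
    """
    pi = softmax(logits)
    k = int(np.argmin(((means - y) ** 2).sum(axis=-1)))  # hypothetical assignment
    cls = -((1.0 - pi[k]) ** gamma) * np.log(pi[k])       # focal loss on component k
    reg = -gaussian_log_pdf(y, means[k], sigmas[k])       # regression on component k only
    return float(cls + reg), k

# Usage: two equally weighted components, target sitting on the first mean.
y = np.array([1.0, 0.0])
logits = np.array([0.0, 0.0])
means = np.array([[1.0, 0.0], [5.0, 5.0]])
sigmas = np.array([1.0, 1.0])
loss, k = focal_plus_regression_loss(y, logits, means, sigmas)
density = mixture_pdf(y, logits, means, sigmas)
```

In this sketch the focal factor `(1 - pi[k]) ** gamma` down-weights samples whose component is already predicted with high probability, which is the mechanism the abstract appeals to for countering the dominance of straight, constant-velocity trajectories. The regression term only has to fit one component per sample, which is one reading of the "much simplified regression problem".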