ISPRS Journal of Photogrammetry and Remote Sensing, 2018 (article)
This work investigates the estimation of dense three-dimensional motion fields, commonly
referred to as scene flow. While great progress has been made in recent years,
large displacements and adverse imaging conditions as observed in natural outdoor
environments are still very challenging for current approaches to reconstruction and
motion estimation. In this paper, we propose a unified random field model which reasons
jointly about 3D scene flow as well as the location, shape and motion of vehicles
in the observed scene. We formulate the problem as the task of decomposing the scene
into a small number of rigidly moving objects sharing the same motion parameters.
Thus, our formulation effectively introduces long-range spatial dependencies that
commonly employed local rigidity priors lack. Our inference algorithm then
estimates the association of image segments and object hypotheses together with their
three-dimensional shape and motion. We demonstrate the potential of the proposed
approach by introducing a novel, challenging scene flow benchmark which allows for a
thorough comparison against various baseline
models. In contrast to previous benchmarks, our evaluation is the first to provide
stereo and optical flow ground truth for dynamic real-world urban scenes at large scale.
Our experiments reveal that rigid motion segmentation can be utilized as an effective
regularizer for the scene flow problem, improving upon existing two-frame scene flow
methods. At the same time, our method yields plausible object segmentations without requiring an explicitly trained recognition model for a specific object class.
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3061-3070, June 2015 (inproceedings)
This paper proposes a novel model and dataset for 3D scene flow estimation with an application to autonomous driving. Taking advantage of the fact that outdoor scenes often decompose into a small number of independently moving objects, we represent each element in the scene by its rigid motion parameters and each superpixel by a 3D plane as well as an index to the corresponding object. This minimal representation increases robustness and leads to a discrete-continuous CRF where the data term decomposes into pairwise potentials between superpixels and objects. Moreover, our model intrinsically segments the scene into its constituent dynamic components. We demonstrate the performance of our model on existing benchmarks as well as a novel realistic dataset with scene flow ground truth. We obtain this dataset by annotating 400 dynamic scenes from the KITTI raw data collection using detailed 3D CAD models for all vehicles in motion. Our experiments also reveal novel challenges which cannot be handled by existing methods.
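As a rough illustration of this representation (a sketch under our own assumptions, not the authors' implementation), a superpixel lying on the plane n^T X = 1 and moving with its object's rigid motion (R, t) induces a homography between the two frames, which is all a data term needs to compare image appearance:

```python
import numpy as np

def induced_homography(K, n, R, t):
    """Homography mapping pixels on the plane n^T X = 1 (camera coordinates)
    from frame t to frame t+1 under the object motion X' = R X + t.

    K: 3x3 intrinsics; n: 3-vector plane parameters; R: 3x3 rotation; t: 3-vector.
    """
    # For X on the plane, n^T X = 1, hence X' = R X + t (n^T X) = (R + t n^T) X.
    return K @ (R + np.outer(t, n)) @ np.linalg.inv(K)

def warp(H, p):
    """Apply homography H to homogeneous pixel coordinates p (3xN)."""
    q = H @ p
    return q[:2] / q[2]
```

Evaluating a matching cost between a pixel p and warp(H, p) for all pixels of a superpixel yields exactly the kind of pairwise potential between a superpixel (its plane n) and an object (its motion R, t) described above.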
In German Conference on Pattern Recognition (GCPR), volume 9358, pages 16-28, Springer International Publishing, 2015 (inproceedings)
We propose to look at large-displacement optical flow from a discrete point of view. Motivated by the observation that sub-pixel accuracy is easily obtained given pixel-accurate optical flow, we conjecture that computing the integral part is the hardest piece of the problem. Consequently, we formulate optical flow estimation as a discrete inference problem in a conditional random field, followed by sub-pixel refinement. Naive discretization of the 2D flow space, however, is intractable due to the resulting size of the label set. In this paper, we therefore investigate three different strategies, each able to reduce computation and memory demands by several orders of magnitude. Their combination allows us to estimate large-displacement optical flow both accurately and efficiently and demonstrates the potential of discrete optimization for optical flow. We obtain state-of-the-art performance on MPI Sintel and KITTI.
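The intractability argument is simple arithmetic, and the remedy can be sketched in a few lines (our own toy illustration; the paper's three strategies are more elaborate). Assume hypothetical dense per-pixel descriptors feat0 and feat1; the idea is to score a candidate set of integer displacements and keep only the k best labels per pixel:

```python
import numpy as np

d_max = 250                          # large-displacement search range
print((2 * d_max + 1) ** 2)          # 251001 labels per pixel: naive CRF is intractable

def topk_candidates(feat0, feat1, candidates, k=300):
    """Keep the k cheapest candidate flows per pixel.

    feat0, feat1: HxWxD descriptor volumes of the two frames (assumed given);
    candidates:   iterable of (dx, dy) integer displacements to score.
    """
    H, W, _ = feat0.shape
    costs = np.empty((H, W, len(candidates)))
    for c, (dx, dy) in enumerate(candidates):
        # Align feat1[y+dy, x+dx] with feat0[y, x]; np.roll wraps at the
        # image borders, a simplification acceptable for this sketch.
        shifted = np.roll(np.roll(feat1, -dy, axis=0), -dx, axis=1)
        costs[..., c] = np.abs(feat0 - shifted).sum(-1)   # SAD matching cost
    return np.argsort(costs, axis=-1)[..., :k]            # per-pixel label sets
```

With a few hundred labels per pixel instead of a quarter million, standard discrete inference becomes feasible, and sub-pixel refinement afterwards recovers the fractional part of the flow.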
In Proc. of the ISPRS Workshop on Image Sequence Analysis (ISA), 2015 (inproceedings)
Three-dimensional reconstruction of dynamic scenes is an important prerequisite for applications like mobile robotics or autonomous driving. While much progress has been made in recent years, imaging conditions in natural outdoor environments are still very challenging for current reconstruction and recognition methods. In this paper, we propose a novel unified approach which reasons jointly about 3D scene flow as well as the pose, shape and motion of vehicles in the scene. Towards this goal, we incorporate a deformable CAD model into a slanted-plane conditional random field for scene flow estimation and enforce shape consistency between the rendered 3D models and the parameters of all superpixels in the image. The association of superpixels to objects is established by an index variable which implicitly enables model selection. We evaluate our approach on the challenging KITTI scene flow dataset in terms of object and scene flow estimation. Our results provide a proof of concept and demonstrate the usefulness of our method.
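A minimal sketch of such a shape-consistency term, under our own assumptions (a disparity map rendered from the CAD model hypothesis standing in for the 3D model, slanted-plane disparities d(u, v) = a*u + b*v + c per superpixel, and a truncated L1 penalty for robustness):

```python
import numpy as np

def shape_consistency(rendered_disp, plane, pixels, tau=3.0):
    """Penalize disagreement between the disparity rendered from a CAD model
    hypothesis and the disparity induced by a superpixel's slanted plane.

    rendered_disp: HxW disparity image rendered from the posed 3D model
    plane:         (a, b, c) with plane disparity d(u, v) = a*u + b*v + c
    pixels:        (N, 2) integer (u, v) coordinates inside the superpixel
    tau:           truncation constant
    """
    a, b, c = plane
    u, v = pixels[:, 0], pixels[:, 1]
    diff = np.abs(rendered_disp[v, u] - (a * u + b * v + c))
    return np.minimum(diff, tau).sum()    # truncated L1 over the superpixel
```

Summed over all superpixels assigned to a vehicle hypothesis, a term of this form ties the continuous plane parameters to the rendered shape, which is the consistency the abstract refers to.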