2016


Consistent Kernel Mean Estimation for Functions of Random Variables

Simon-Gabriel*, C. J., Ścibior*, A., Tolstikhin, I., Schölkopf, B.

Advances in Neural Information Processing Systems 29, pages: 1732-1740, (Editors: D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett), Curran Associates, Inc., 30th Annual Conference on Neural Information Processing Systems, December 2016, *joint first authors (conference)

ei

link (url) Project Page [BibTex]

A New Perspective and Extension of the Gaussian Filter

Wüthrich, M., Trimpe, S., Garcia Cifuentes, C., Kappler, D., Schaal, S.

The International Journal of Robotics Research, 35(14):1731-1749, December 2016 (article)

Abstract
The Gaussian Filter (GF) is one of the most widely used filtering algorithms; instances are the Extended Kalman Filter, the Unscented Kalman Filter and the Divided Difference Filter. The GF represents the belief of the current state by a Gaussian distribution, whose mean is an affine function of the measurement. We show that this representation can be too restrictive to accurately capture the dependences in systems with nonlinear observation models, and we investigate how the GF can be generalized to alleviate this problem. To this end, we view the GF as the solution to a constrained optimization problem. From this new perspective, the GF is seen as a special case of a much broader class of filters, obtained by relaxing the constraint on the form of the approximate posterior. On this basis, we outline some conditions which potential generalizations have to satisfy in order to maintain the computational efficiency of the GF. We propose one concrete generalization which corresponds to the standard GF using a pseudo measurement instead of the actual measurement. Extending an existing GF implementation in this manner is trivial. Nevertheless, we show that this small change can have a major impact on the estimation accuracy.
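The statement that the GF mean is an affine function of the measurement can be illustrated with the standard Gaussian conditioning step. This is a minimal sketch, not the authors' implementation: the function name `gf_update` and all numbers are hypothetical, and the joint moments of state and measurement are assumed to be given (in practice they would come from linearization or sigma points).

```python
import numpy as np

def gf_update(mu_x, mu_y, S_xx, S_xy, S_yy, y):
    """Standard Gaussian-filter measurement update from joint moments of (x, y).

    The returned mean is affine in the measurement y, and the returned
    covariance does not depend on y at all.
    """
    K = S_xy @ np.linalg.inv(S_yy)   # gain matrix
    mean = mu_x + K @ (y - mu_y)     # affine in y
    cov = S_xx - K @ S_xy.T          # independent of y
    return mean, cov

# Toy joint moments (illustrative values only, not from the paper).
mu_x = np.array([0.0, 1.0])
mu_y = np.array([0.5])
S_xx = np.array([[1.0, 0.2], [0.2, 2.0]])
S_xy = np.array([[0.3], [0.4]])
S_yy = np.array([[0.5]])

mean, cov = gf_update(mu_x, mu_y, S_xx, S_xy, S_yy, np.array([1.0]))
print(mean)
```

Because the covariance never sees `y`, an outlying or highly informative measurement cannot change the filter's confidence, which is one way to read the restriction the abstract describes.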

am ics

PDF DOI Project Page [BibTex]

Understanding Probabilistic Sparse Gaussian Process Approximations

Bauer, M., van der Wilk, M., Rasmussen, C. E.

Advances in Neural Information Processing Systems 29, pages: 1533-1541, (Editors: D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett), Curran Associates, Inc., 30th Annual Conference on Neural Information Processing Systems, December 2016 (conference)

ei

link (url) Project Page [BibTex]

Minimax Estimation of Maximum Mean Discrepancy with Radial Kernels

Tolstikhin, I., Sriperumbudur, B. K., Schölkopf, B.

Advances in Neural Information Processing Systems 29, pages: 1930-1938, (Editors: D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett), Curran Associates, Inc., 30th Annual Conference on Neural Information Processing Systems, December 2016 (conference)

ei

link (url) Project Page [BibTex]

Local-utopia Policy Selection for Multi-objective Reinforcement Learning

Parisi, S., Blank, A., Viernickel, T., Peters, J.

In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), pages: 1-7, IEEE, December 2016 (inproceedings)

ei

DOI [BibTex]

Lifelong Learning with Weighted Majority Votes

Pentina, A., Urner, R.

Advances in Neural Information Processing Systems 29, pages: 3612-3620, (Editors: D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett), Curran Associates, Inc., 30th Annual Conference on Neural Information Processing Systems, December 2016 (conference)

ei

link (url) Project Page [BibTex]

Active Nearest-Neighbor Learning in Metric Spaces

Kontorovich, A., Sabato, S., Urner, R.

Advances in Neural Information Processing Systems 29, pages: 856-864, (Editors: D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett), Curran Associates, Inc., 30th Annual Conference on Neural Information Processing Systems, December 2016 (conference)

ei

link (url) Project Page [BibTex]

Predictive and Self Triggering for Event-based State Estimation

Trimpe, S.

In Proceedings of the 55th IEEE Conference on Decision and Control (CDC), pages: 3098-3105, Las Vegas, NV, USA, December 2016 (inproceedings)

am ics

arXiv PDF DOI Project Page [BibTex]

Catching heuristics are optimal control policies

Belousov, B., Neumann, G., Rothkopf, C., Peters, J.

Advances in Neural Information Processing Systems 29, pages: 1426-1434, (Editors: D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett), Curran Associates, Inc., 30th Annual Conference on Neural Information Processing Systems, December 2016 (conference)

ei

link (url) Project Page [BibTex]

Incremental Imitation Learning of Context-Dependent Motor Skills

Ewerton, M., Maeda, G., Kollegger, G., Wiemeyer, J., Peters, J.

IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages: 351-358, IEEE, November 2016 (conference)

ei

DOI Project Page [BibTex]

Using Probabilistic Movement Primitives for Striking Movements

Gomez-Gonzalez, S., Neumann, G., Schölkopf, B., Peters, J.

16th IEEE-RAS International Conference on Humanoid Robots (Humanoids), pages: 502-508, November 2016 (conference)

am ei

link (url) DOI Project Page [BibTex]

Demonstration Based Trajectory Optimization for Generalizable Robot Motions

Koert, D., Maeda, G., Lioutikov, R., Neumann, G., Peters, J.

IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages: 351-358, IEEE, November 2016 (conference)

ei

DOI Project Page [BibTex]

Jointly Learning Trajectory Generation and Hitting Point Prediction in Robot Table Tennis

Huang, Y., Büchler, D., Koc, O., Schölkopf, B., Peters, J.

16th IEEE-RAS International Conference on Humanoid Robots (Humanoids), pages: 650-655, November 2016 (conference)

am ei

final link (url) DOI Project Page [BibTex]

Deep Spiking Networks for Model-based Planning in Humanoids

Tanneberg, D., Paraschos, A., Peters, J., Rueckert, E.

IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages: 656-661, IEEE, November 2016 (conference)

ei

DOI Project Page [BibTex]

Anticipative Interaction Primitives for Human-Robot Collaboration

Maeda, G., Maloo, A., Ewerton, M., Lioutikov, R., Peters, J.

AAAI Fall Symposium Series. Shared Autonomy in Research and Practice, pages: 325-330, November 2016 (conference)

ei

link (url) [BibTex]

Creating body shapes from verbal descriptions by linking similarity spaces

Hill, M. Q., Streuber, S., Hahn, C. A., Black, M. J., O’Toole, A. J.

Psychological Science, 27(11):1486-1497, November 2016 (article)

Abstract
Brief verbal descriptions of bodies (e.g. curvy, long-legged) can elicit vivid mental images. The ease with which we create these mental images belies the complexity of three-dimensional body shapes. We explored the relationship between body shapes and body descriptions and show that a small number of words can be used to generate categorically accurate representations of three-dimensional bodies. The dimensions of body shape variation that emerged in a language-based similarity space were related to major dimensions of variation computed directly from three-dimensional laser scans of 2094 bodies. This allowed us to generate three-dimensional models of people in the shape space using only their coordinates on analogous dimensions in the language-based description space. Human descriptions of photographed bodies and their corresponding models matched closely. The natural mapping between the spaces illustrates the role of language as a concise code for body shape, capturing perceptually salient global and local body features.

ps

pdf [BibTex]

Unifying distillation and privileged information

Lopez-Paz, D., Schölkopf, B., Bottou, L., Vapnik, V.

International Conference on Learning Representations (ICLR), November 2016 (conference)

ei

Arxiv Project Page [BibTex]

Learning High-Order Filters for Efficient Blind Deconvolution of Document Photographs

Xiao, L., Wang, J., Heidrich, W., Hirsch, M.

Computer Vision - ECCV 2016, Lecture Notes in Computer Science, LNCS 9907, Part III, pages: 734-749, (Editors: Bastian Leibe, Jiri Matas, Nicu Sebe and Max Welling), Springer, October 2016 (conference)

ei

DOI [BibTex]

Adaptive Training Strategies for BCIs

Sharma, D., Tanneberg, D., Grosse-Wentrup, M., Peters, J., Rueckert, E.

Cybathlon Symposium, October 2016 (conference)

ei

link (url) [BibTex]

Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies

Osa, T., Peters, J., Neumann, G.

International Symposium on Experimental Robotics (ISER), 1, pages: 160-172, Springer Proceedings in Advanced Robotics, (Editors: Dana Kulic, Yoshihiko Nakamura, Oussama Khatib and Gentiane Venture), Springer, October 2016 (conference)

ei

DOI [BibTex]

Stable Reinforcement Learning with Autoencoders for Tactile and Visual Data

van Hoof, H., Chen, N., Karl, M., van der Smagt, P., Peters, J.

Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS), pages: 3928-3934, IEEE, October 2016 (conference)

ei

DOI Project Page [BibTex]

Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

Bogo, F., Kanazawa, A., Lassner, C., Gehler, P., Romero, J., Black, M. J.

In Computer Vision – ECCV 2016, pages: 561-578, Lecture Notes in Computer Science, Springer International Publishing, 14th European Conference on Computer Vision, October 2016 (inproceedings)

Abstract
We describe the first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image. We estimate a full 3D mesh and show that 2D joints alone carry a surprising amount of information about body shape. The problem is challenging because of the complexity of the human body, articulation, occlusion, clothing, lighting, and the inherent ambiguity in inferring 3D from 2D. To solve this, we first use a recently published CNN-based method, DeepCut, to predict (bottom-up) the 2D body joint locations. We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints. We do so by minimizing an objective function that penalizes the error between the projected 3D model joints and detected 2D joints. Because SMPL captures correlations in human shape across the population, we are able to robustly fit it to very little data. We further leverage the 3D model to prevent solutions that cause interpenetration. We evaluate our method, SMPLify, on the Leeds Sports, HumanEva, and Human3.6M datasets, showing superior pose accuracy with respect to the state of the art.
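The data term described above, an error between projected 3D model joints and detected 2D joints, could be sketched as follows. This is an illustration under stated assumptions rather than the SMPLify objective itself: the pinhole camera, the per-joint confidence weights, the plain squared penalty, and the names `project` and `joint_reprojection_error` are all hypothetical choices, and the pose, shape and interpenetration terms are omitted.

```python
import numpy as np

def project(joints_3d, focal, center):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels."""
    xy = joints_3d[:, :2] / joints_3d[:, 2:3]   # perspective divide by depth
    return focal * xy + center

def joint_reprojection_error(joints_3d, joints_2d, conf, focal, center):
    """Confidence-weighted sum of squared 2D distances (assumed penalty)."""
    residual = project(joints_3d, focal, center) - joints_2d
    return float(np.sum(conf * np.sum(residual ** 2, axis=1)))

# Two toy joints at depth 2, a focal length of 100 px, principal point (50, 50).
joints_3d = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]])
center = np.array([50.0, 50.0])
detected = project(joints_3d, 100.0, center) + np.array([[0.0, 0.0], [3.0, 4.0]])
conf = np.array([1.0, 0.5])

print(joint_reprojection_error(joints_3d, detected, conf, 100.0, center))
```

A fitting procedure would minimize such a term over the body model's pose and shape parameters, with the 3D joints computed from the model rather than fixed as they are here.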

ps

pdf Video Sup Mat video Code Project Project Page [BibTex]

A New Trajectory Generation Framework in Robotic Table Tennis

Koc, O., Maeda, G., Peters, J.

Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS), pages: 3750-3756, October 2016 (conference)

am ei

link (url) DOI [BibTex]

Superpixel Convolutional Networks using Bilateral Inceptions

Gadde, R., Jampani, V., Kiefel, M., Kappler, D., Gehler, P.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, Springer, 14th European Conference on Computer Vision, October 2016 (inproceedings)

Abstract
In this paper we propose a CNN architecture for semantic image segmentation. We introduce a new “bilateral inception” module that can be inserted in existing CNN architectures and performs bilateral filtering, at multiple feature-scales, between superpixels in an image. The feature spaces for bilateral filtering and other parameters of the module are learned end-to-end using standard backpropagation techniques. The bilateral inception module addresses two issues that arise with general CNN segmentation architectures. First, this module propagates information between (super) pixels while respecting image edges, thus using the structured information of the problem for improved results. Second, the layer recovers a full resolution segmentation result from the lower resolution solution of a CNN. In the experiments, we modify several existing CNN architectures by inserting our inception modules between the last CNN (1 × 1 convolution) layers. Empirical results on three different datasets show reliable improvements not only in comparison to the baseline networks, but also in comparison to several dense-pixel prediction techniques such as CRFs, while being competitive in time.

am ps

pdf supplementary poster Project Page [BibTex]

Probabilistic Decomposition of Sequential Force Interaction Tasks into Movement Primitives

Manschitz, S., Gienger, M., Kober, J., Peters, J.

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 3920-3927, IEEE, October 2016 (conference)

ei

DOI Project Page [BibTex]

Barrista - Caffe Well-Served

Lassner, C., Kappler, D., Kiefel, M., Gehler, P.

In ACM Multimedia Open Source Software Competition, ACM OSSC16, October 2016 (inproceedings)

Abstract
The caffe framework is one of the leading deep learning toolboxes in the machine learning and computer vision community. While it offers efficiency and configurability, it falls short of a full interface to Python. With increasingly involved procedures for training deep networks and reaching depths of hundreds of layers, creating configuration files and keeping them consistent becomes an error-prone process. We introduce the barrista framework, offering full, pythonic control over caffe. It separates responsibilities and offers code to solve frequently occurring tasks for pre-processing, training and model inspection. It is compatible with all caffe versions since mid 2015 and can import and export .prototxt files. Examples are included, e.g., a deep residual network implemented in only 172 lines (for arbitrary depths), compared to 2320 lines in the official implementation for the equivalent model.

am ps

pdf link (url) DOI Project Page [BibTex]

Multi-task logistic regression in brain-computer interfaces

Fiebig, K., Jayaram, V., Peters, J., Grosse-Wentrup, M.

6th Workshop on Brain-Machine Interface Systems at IEEE International Conference on Systems, Man, and Cybernetics (SMC 2016), pages: 002307-002312, IEEE, October 2016 (conference)

ei

link (url) DOI [BibTex]

Active Tactile Object Exploration with Gaussian Processes

Yi, Z., Calandra, R., Veiga, F., van Hoof, H., Hermans, T., Zhang, Y., Peters, J.

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 4925-4930, IEEE, October 2016 (conference)

ei

DOI Project Page [BibTex]

On Version Space Compression

Ben-David, S., Urner, R.

Algorithmic Learning Theory - 27th International Conference (ALT), 9925, pages: 50-64, Lecture Notes in Computer Science, (Editors: Ortner, R., Simon, H. U., and Zilles, S.), September 2016 (conference)

ei

DOI Project Page [BibTex]

Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller

Abdolmaleki, A., Lau, N., Reis, L., Peters, J., Neumann, G.

Journal of Intelligent & Robotic Systems, 83(3-4):393-408, (Editors: Luis Almeida, Lino Marques), September 2016, Special Issue: Autonomous Robot Systems (article)

ei

DOI [BibTex]

Learning Probabilistic Features from EMG Data for Predicting Knee Abnormalities

Kohlschuetter, J., Peters, J., Rueckert, E.

XIV Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON), pages: 668-672, (Editors: Kyriacou, E., Christofides, S., and Pattichis, C. S.), September 2016 (conference)

ei

DOI [BibTex]

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

Grau-Moya, J., Leibfried, F., Genewein, T., Braun, D. A.

Machine Learning and Knowledge Discovery in Databases, pages: 475-491, Lecture Notes in Computer Science; 9852, Springer, Cham, Switzerland, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML PKDD), September 2016 (conference)

Abstract
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
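For the known-model case only, planning with a Kullback-Leibler constraint relative to a reference policy leads to a well-known "soft" backup in which the maximum over actions is replaced by a log-sum-exp weighted by the reference distribution. The sketch below shows that backup on a toy MDP. It does not reproduce the paper's generalized scheme with model uncertainty; the function name, the inverse temperature `beta`, and the toy transition and reward tables are all hypothetical.

```python
import numpy as np

def free_energy_value_iteration(P, R, prior, beta, gamma=0.9, iters=500):
    """KL-regularized value iteration with a known model.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    prior: (S, A) reference policy, beta: inverse temperature (> 0).
    """
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)                  # (S, A) action values
        m = Q.max(axis=1, keepdims=True)         # shift for numerical stability
        V = m[:, 0] + np.log(np.sum(prior * np.exp(beta * (Q - m)), axis=1)) / beta
    return V

# Toy 2-state, 2-action MDP: action a moves deterministically to state a.
GAMMA = 0.9
P = np.zeros((2, 2, 2))
P[:, 0, 0] = 1.0
P[:, 1, 1] = 1.0
R = np.array([[0.0, 1.0], [0.0, 2.0]])
prior = np.full((2, 2), 0.5)

for beta in (0.1, 1.0, 200.0):
    print(beta, free_energy_value_iteration(P, R, prior, beta, GAMMA))
```

Large `beta` approaches standard value iteration, while small `beta` approaches evaluating the reference policy itself, which matches the limit cases for a known model mentioned in the abstract.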

ei

DOI [BibTex]

Depth Estimation Through a Generative Model of Light Field Synthesis

Sajjadi, M. S. M., Köhler, R., Schölkopf, B., Hirsch, M.

Pattern Recognition - 38th German Conference (GCPR), 9796, pages: 426-438, Lecture Notes in Computer Science, (Editors: Rosenhahn, B. and Andres, B.), Springer International Publishing, September 2016 (conference)

ei

Arxiv Project link (url) DOI [BibTex]

Bidirektionale Interaktion zwischen Mensch und Roboter beim Bewegungslernen (BIMROB)

Kollegger, G., Ewerton, M., Peters, J., Wiemeyer, J.

11. Symposium der DVS Sportinformatik, September 2016 (conference)

ei

link (url) [BibTex]

A Low-cost Sensor Glove with Vibrotactile Feedback and Multiple Finger Joint and Hand Motion Sensing for Human-Robot Interaction

Weber, P., Rueckert, E., Calandra, R., Peters, J., Beckerle, P.

25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pages: 99-104, August 2016 (conference)

ei

DOI [BibTex]

Experimental and causal view on information integration in autonomous agents

Geiger, P., Hofmann, K., Schölkopf, B.

Proceedings of the 6th International Workshop on Combinations of Intelligent Methods and Applications (CIMA), pages: 21-28, (Editors: Hatzilygeroudis, I. and Palade, V.), August 2016 (conference)

ei

link (url) [BibTex]

Manifold Gaussian Processes for Regression

Calandra, R., Peters, J., Rasmussen, C. E., Deisenroth, M. P.

International Joint Conference on Neural Networks (IJCNN), pages: 3338-3345, IEEE, July 2016 (conference)

ei

DOI [BibTex]

Acquiring and Generalizing the Embodiment Mapping from Human Observations to Robot Skills

Maeda, G., Ewerton, M., Koert, D., Peters, J.

IEEE Robotics and Automation Letters, 1(2):784-791, July 2016 (article)

ei

DOI [BibTex]

Dynamic baseline stereo vision-based cooperative target tracking

Ahmad, A., Ruff, E., Bülthoff, H.

19th International Conference on Information Fusion, pages: 1728-1734, July 2016 (conference)

Abstract
In this article we present a new method for multi-robot cooperative target tracking based on dynamic baseline stereo vision. The core novelty of our approach includes a computationally light-weight scheme to compute the 3D stereo measurements that exactly satisfy the epipolar constraints and a covariance intersection (CI)-based method to fuse the 3D measurements obtained by each individual robot. Using CI we are able to systematically integrate the robot localization uncertainties as well as the uncertainties in the measurements generated by the monocular camera images from each individual robot into the resulting stereo measurements. Through an extensive set of simulation and real robot results we show the robustness and accuracy of our approach with respect to ground truth. The source code related to this article is publicly accessible on our website and the datasets are available on request.
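Covariance intersection, the fusion rule named above, combines two estimates whose cross-correlation is unknown by taking a convex combination of their information matrices. The following is a minimal sketch of the textbook rule, not the authors' code: the function name and the toy estimates are hypothetical, and the weight `w` is simply passed in (it is often chosen by minimizing the trace or determinant of the fused covariance; how the paper selects it is not stated here).

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, w):
    """Fuse two estimates (x1, P1) and (x2, P2) with weight w in [0, 1]."""
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    P = np.linalg.inv(w * I1 + (1.0 - w) * I2)        # fused covariance
    x = P @ (w * (I1 @ x1) + (1.0 - w) * (I2 @ x2))   # fused mean
    return x, P

# Two toy 2D estimates, each confident along a different axis.
x1, P1 = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
x2, P2 = np.array([1.0, 1.0]), np.diag([4.0, 1.0])

x, P = covariance_intersection(x1, P1, x2, P2, w=0.5)
print(x)
print(P)
```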

ps

DOI [BibTex]

Robust Gaussian Filtering using a Pseudo Measurement

Wüthrich, M., Garcia Cifuentes, C., Trimpe, S., Meier, F., Bohg, J., Issac, J., Schaal, S.

In Proceedings of the American Control Conference (ACC), Boston, MA, USA, July 2016 (inproceedings)

Abstract
Most widely-used state estimation algorithms, such as the Extended Kalman Filter and the Unscented Kalman Filter, belong to the family of Gaussian Filters (GF). Unfortunately, GFs fail if the measurement process is modelled by a fat-tailed distribution. This is a severe limitation, because thin-tailed measurement models, such as the analytically-convenient and therefore widely-used Gaussian distribution, are sensitive to outliers. In this paper, we show that mapping the measurements into a specific feature space enables any existing GF algorithm to work with fat-tailed measurement models. We find a feature function which is optimal under certain conditions. Simulation results show that the proposed method allows for robust filtering in both linear and nonlinear systems with measurements contaminated by fat-tailed noise.

am ics

Web link (url) DOI Project Page [BibTex]

Body Talk: Crowdshaping Realistic 3D Avatars with Words

Streuber, S., Quiros-Ramirez, M. A., Hill, M. Q., Hahn, C. A., Zuffi, S., O’Toole, A., Black, M. J.

ACM Trans. Graph. (Proc. SIGGRAPH), 35(4):54:1-54:14, July 2016 (article)

Abstract
Realistic, metrically accurate, 3D human avatars are useful for games, shopping, virtual reality, and health applications. Such avatars are not in wide use because solutions for creating them from high-end scanners, low-cost range cameras, and tailoring measurements all have limitations. Here we propose a simple solution and show that it is surprisingly accurate. We use crowdsourcing to generate attribute ratings of 3D body shapes corresponding to standard linguistic descriptions of 3D shape. We then learn a linear function relating these ratings to 3D human shape parameters. Given an image of a new body, we again turn to the crowd for ratings of the body shape. The collection of linguistic ratings of a photograph provides remarkably strong constraints on the metric 3D shape. We call the process crowdshaping and show that our Body Talk system produces shapes that are perceptually indistinguishable from bodies created from high-resolution scans and that the metric accuracy is sufficient for many tasks. This makes body “scanning” practical without a scanner, opening up new applications including database search, visualization, and extracting avatars from books.
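Learning a linear function from attribute ratings to shape parameters can be sketched as an ordinary least-squares fit. Everything concrete here is an assumption made for illustration: the data are synthetic, the bias column and the use of `np.linalg.lstsq` are one possible choice, and the names `fit_linear_map` and `predict_shape` are hypothetical rather than taken from the Body Talk system.

```python
import numpy as np

def fit_linear_map(ratings, shapes):
    """Least-squares fit of shapes ~ [ratings, 1] @ W.

    ratings: (N, R) attribute ratings, shapes: (N, D) shape parameters.
    Returns W with shape (R + 1, D); the last row is the bias.
    """
    X = np.hstack([ratings, np.ones((ratings.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, shapes, rcond=None)
    return W

def predict_shape(W, rating):
    """Map one rating vector of length R to D shape parameters."""
    return np.append(rating, 1.0) @ W

# Synthetic example: 50 bodies, 3 rated attributes, 2 shape parameters.
rng = np.random.default_rng(0)
ratings = rng.normal(size=(50, 3))
W_true = rng.normal(size=(4, 2))
shapes = np.hstack([ratings, np.ones((50, 1))]) @ W_true

W = fit_linear_map(ratings, shapes)
print(predict_shape(W, ratings[0]), shapes[0])
```

With averaged crowd ratings of a new photograph as input, `predict_shape` would return the shape parameters from which a body model could be generated.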

ps

pdf web tool video talk (ppt) [BibTex]

DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation

Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P., Schiele, B.

In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 4929-4937, IEEE, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
This paper considers the task of articulated human pose estimation of multiple people in real-world images. We propose an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity to each other. This joint formulation is in contrast to previous strategies that address the problem by first detecting people and subsequently estimating their body pose. We propose a partitioning and labeling formulation of a set of body-part hypotheses generated with CNN-based part detectors. Our formulation, an instance of an integer linear program, implicitly performs non-maximum suppression on the set of part candidates and groups them to form configurations of body parts respecting geometric and appearance constraints. Experiments on four different datasets demonstrate state-of-the-art results for both single person and multi person pose estimation.

ps

code pdf supplementary DOI Project Page [BibTex]

Video segmentation via object flow

Tsai, Y., Yang, M., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Video object segmentation is challenging due to fast moving objects, deforming shapes, and cluttered backgrounds. Optical flow can be used to propagate an object segmentation over time but, unfortunately, flow is often inaccurate, particularly around object boundaries. Such boundaries are precisely where we want our segmentation to be accurate. To obtain accurate segmentation across time, we propose an efficient algorithm that considers video segmentation and optical flow estimation simultaneously. For video segmentation, we formulate a principled, multiscale, spatio-temporal objective function that uses optical flow to propagate information between frames. For optical flow estimation, particularly at object boundaries, we compute the flow independently in the segmented regions and recompose the results. We call the process object flow and demonstrate the effectiveness of jointly optimizing optical flow and video segmentation using an iterative scheme. Experiments on the SegTrack v2 and Youtube-Objects datasets show that the proposed algorithm performs favorably against the other state-of-the-art methods.

ps

pdf [BibTex]

Patches, Planes and Probabilities: A Non-local Prior for Volumetric 3D Reconstruction

Ulusoy, A. O., Black, M. J., Geiger, A.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
In this paper, we propose a non-local structured prior for volumetric multi-view 3D reconstruction. Towards this goal, we present a novel Markov random field model based on ray potentials in which assumptions about large 3D surface patches such as planarity or Manhattan world constraints can be efficiently encoded as probabilistic priors. We further derive an inference algorithm that reasons jointly about voxels, pixels and image segments, and estimates marginal distributions of appearance, occupancy, depth, normals and planarity. Key to tractable inference is a novel hybrid representation that spans both voxel and pixel space and that integrates non-local information from 2D image segmentations in a principled way. We compare our non-local prior to commonly employed local smoothness assumptions and a variety of state-of-the-art volumetric reconstruction baselines on challenging outdoor scenes with textureless and reflective surfaces. Our experiments indicate that regularizing over larger distances has the potential to resolve ambiguities where local regularizers fail.

avg ps

YouTube pdf poster suppmat Project Page [BibTex]

The Mondrian Kernel

Balog, M., Lakshminarayanan, B., Ghahramani, Z., Roy, D. M., Teh, Y. W.

Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence (UAI), (Editors: Ihler, Alexander T. and Janzing, Dominik), June 2016 (conference)

ei

Arxiv link (url) Project Page [BibTex]

Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

Tzionas, D., Ballan, L., Srikantha, A., Aponte, P., Pollefeys, M., Gall, J.

International Journal of Computer Vision (IJCV), 118(2):172-193, June 2016 (article)

Abstract
Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even the most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error and with collision detection and physics simulation to achieve physically plausible estimates even in case of occlusions and missing visual data. Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.

ps

Website pdf link (url) DOI Project Page [BibTex]

Optical Flow with Semantic Segmentation and Localized Layers

Sevilla-Lara, L., Sun, D., Jampani, V., Black, M. J.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3889-3898, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Existing optical flow methods make generic, spatially homogeneous, assumptions about the spatial structure of the flow. In reality, optical flow varies across an image depending on object class. Simply put, different objects move differently. Here we exploit recent advances in static semantic scene segmentation to segment the image into objects of different types. We define different models of image motion in these regions depending on the type of object. For example, we model the motion on roads with homographies, vegetation with spatially smooth flow, and independently moving objects like cars and planes with affine motion plus deviations. We then pose the flow estimation problem using a novel formulation of localized layers, which addresses limitations of traditional layered models for dealing with complex scene motion. Our semantic flow method achieves the lowest error of any published monocular method in the KITTI-2015 flow benchmark and produces qualitatively better flow and segmentation than recent top methods on a wide range of natural videos.

ps

video Kitti Precomputed Data (1.6GB) pdf YouTube Sequences Code Project Page [BibTex]

Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks

Jampani, V., Kiefel, M., Gehler, P. V.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 4452-4461, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2016 (inproceedings)

Abstract
Bilateral filters have widespread use due to their edge-preserving properties. The common use case is to manually choose a parametric filter type, usually a Gaussian filter. In this paper, we will generalize the parametrization and in particular derive a gradient descent algorithm so the filter parameters can be learned from data. This derivation allows learning high dimensional linear filters that operate in sparsely populated feature spaces. We build on the permutohedral lattice construction for efficient filtering. The ability to learn more general forms of high-dimensional filters can be used in several diverse applications. First, we demonstrate the use in applications where single filter applications are desired for runtime reasons. Further, we show how this algorithm can be used to learn the pairwise potentials in densely connected conditional random fields and apply these to different image segmentation tasks. Finally, we introduce layers of bilateral filters in CNNs and propose bilateral neural networks for the use of high-dimensional sparse data. This view provides new ways to encode model structure into network architectures. A diverse set of experiments empirically validates the usage of general forms of filters.

ps

project page code CVF open-access pdf supplementary poster Project Page [BibTex]

Recovery of non-linear cause-effect relationships from linearly mixed neuroimaging data

Weichwald, S., Gretton, A., Schölkopf, B., Grosse-Wentrup, M.

Proceedings of the 6th International Workshop on Pattern Recognition in NeuroImaging (PRNI 2016), June 2016 (conference)

ei

PDF Arxiv Code DOI Project Page [BibTex]
