

2018


Learning 3D Shape Completion under Weak Supervision

Stutz, D., Geiger, A.

Arxiv, May 2018 (article)

Abstract
We address the problem of 3D shape completion from sparse and noisy point clouds, a fundamental problem in computer vision and robotics. Recent approaches are either data-driven or learning-based: Data-driven approaches rely on a shape model whose parameters are optimized to fit the observations; Learning-based approaches, in contrast, avoid the expensive optimization step by learning to directly predict complete shapes from incomplete observations in a fully-supervised setting. However, full supervision is often not available in practice. In this work, we propose a weakly-supervised learning-based approach to 3D shape completion which neither requires slow optimization nor direct supervision. While we also learn a shape prior on synthetic data, we amortize, i.e., learn, maximum likelihood fitting using deep neural networks resulting in efficient shape completion without sacrificing accuracy. On synthetic benchmarks based on ShapeNet and ModelNet as well as on real robotics data from KITTI and Kinect, we demonstrate that the proposed amortized maximum likelihood approach is able to compete with fully supervised baselines and outperforms data-driven approaches, while requiring less supervision and being significantly faster.

avg

PDF Project Page [BibTex]


Schema-related cognitive load influences performance, speech, and physiology in a dual-task setting: A continuous multi-measure approach

Wirzberger, M., Herms, R., Esmaeili Bijarsari, S., Eibl, M., Rey, G. D.

Cognitive Research: Principles and Implications, 3:46, Springer Nature, 2018 (article)

Abstract
Schema acquisition processes comprise an essential source of cognitive demands in learning situations. To shed light on related mechanisms and influencing factors, this study applied a continuous multi-measure approach for cognitive load assessment. In a dual-task setting, a sample of 123 student participants learned visually presented symbol combinations with one of two levels of complexity while memorizing auditorily presented number sequences. Learners’ cognitive load during the learning task was addressed by secondary task performance, prosodic speech parameters (pauses, articulation rate), and physiological markers (heart rate, skin conductance response). While results revealed increasing primary and secondary task performance over the trials, decreases in speech and physiological parameters indicated a reduction in the overall level of cognitive load with task progression. In addition, the robustness of the acquired schemata was confirmed by a transfer task that required participants to apply the obtained symbol combinations. Taken together, the observed pattern of evidence supports the idea of a logarithmically decreasing progression of cognitive load with increasing schema acquisition, and further hints on robust and stable transfer performance, even under enhanced transfer demands. Finally, theoretical and practical consequences consider evidence on desirable difficulties in learning as well as the potential of multimodal cognitive load detection in learning applications.

re

DOI [BibTex]



Augmented Reality Meets Computer Vision: Efficient Data Generation for Urban Driving Scenes

Alhaija, H., Mustikovela, S., Mescheder, L., Geiger, A., Rother, C.

International Journal of Computer Vision (IJCV), 2018 (article)

Abstract
The success of deep learning in computer vision is based on the availability of large annotated datasets. To lower the need for hand labeled images, virtually rendered 3D worlds have recently gained popularity. Unfortunately, creating realistic 3D content is challenging on its own and requires significant human effort. In this work, we propose an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation and object detection models. Exploiting the fact that not all aspects of the scene are equally important for this task, we propose to augment real-world imagery with virtual objects of the target category. Capturing real-world images at large scale is easy and cheap, and directly provides real background appearances without the need for creating complex 3D models of the environment. We present an efficient procedure to augment these images with virtual objects. In contrast to modeling complete 3D environments, our data augmentation approach requires only a few user interactions in combination with 3D models of the target object category. Leveraging our approach, we introduce a novel dataset of augmented urban driving scenes with 360 degree images that are used as environment maps to create realistic lighting and reflections on rendered objects. We analyze the significance of realistic object placement by comparing manual placement by humans to automatic methods based on semantic scene analysis. This allows us to create composite images which exhibit both realistic background appearance as well as a large number of complex object arrangements. Through an extensive set of experiments, we conclude the right set of parameters to produce augmented data which can maximally enhance the performance of instance segmentation models. Further, we demonstrate the utility of the proposed approach on training standard deep models for semantic instance segmentation and object detection of cars in outdoor driving scenarios. 
We test the models trained on our augmented data on the KITTI 2015 dataset, which we have annotated with pixel-accurate ground truth, and on the Cityscapes dataset. Our experiments demonstrate that the models trained on augmented imagery generalize better than those trained on fully synthetic data or models trained on limited amounts of annotated real data.

avg

pdf Project Page [BibTex]



Attention please! Enhanced attention control abilities compensate for instructional impairments in multimedia learning

Wirzberger, M., Rey, G. D.

Journal of Computers in Education, 5(2):243-257, Springer Nature, 2018 (article)

Abstract
Learners exposed to multimedia learning contexts have to deal with a variety of visual stimuli, demanding a conducive design of learning material to maintain limitations in attentional resources. Within the current study, effects and constraints arising from two selected impairing features are investigated in more detail within a computer-based learning task on factor analysis. A sample of 53 students received a combination of textual and pictorial elements that explained the topic, while impaired attention was systematically induced in a 2 × 2 factorial between-subjects design by interrupting system-notifications (with vs. without) and seductive text passages (with vs. without). Learners’ ability for controlled attention was assessed with a standardized psychological attention inventory. Approaching the results, learners receiving seductive text passages spent significantly more time on the learning material. In addition, a moderation effect of attention control abilities on the relationship between interruptions and retention performance resulted. Explanations for the obtained findings are discussed referring to mechanisms of compensation, load, and activation.

re

DOI [BibTex]



The Computational Challenges of Pursuing Multiple Goals: Network Structure of Goal Systems Predicts Human Performance

Reichman, D., Lieder, F., Bourgin, D. D., Talmon, N., Griffiths, T. L.

PsyArXiv, 2018 (article)

re

DOI [BibTex]



The moderating role of arousal on the seductive detail effect in a multimedia learning setting

Schneider, S., Wirzberger, M., Rey, G. D.

Applied Cognitive Psychology, Wiley, 2018 (article)

Abstract
Arousal has been found to increase learners' attentional resources. In contrast, seductive details (interesting but learning‐irrelevant information) are considered to distract attention away from relevant information and, thus, hinder learning. However, a possibly moderating role of arousal on the seductive detail effect has not been examined yet. In this study, arousal variations were induced via audio files of false heartbeats. In consequence, 100 participants were randomly assigned to a 2 (with or without seductive details) × 2 (lower vs. higher false heart rates) between‐subjects design. Data on learning performance, cognitive load, motivation, heartbeat frequency, and electro‐dermal activity were collected. Results show learning‐inhibiting effects for seductive details and learning‐enhancing effects for higher false heart rates. Cognitive processes mediate both effects. However, the detrimental effect of seductive details was not present when heart rate was higher. Results indicate that the seductive detail effect is moderated by a learner's state of arousal.

re

DOI [BibTex]



Learning 3D Shape Completion under Weak Supervision

Stutz, D., Geiger, A.

International Journal of Computer Vision (IJCV), 2018 (article)

Abstract
We address the problem of 3D shape completion from sparse and noisy point clouds, a fundamental problem in computer vision and robotics. Recent approaches are either data-driven or learning-based: Data-driven approaches rely on a shape model whose parameters are optimized to fit the observations; Learning-based approaches, in contrast, avoid the expensive optimization step by learning to directly predict complete shapes from incomplete observations in a fully-supervised setting. However, full supervision is often not available in practice. In this work, we propose a weakly-supervised learning-based approach to 3D shape completion which neither requires slow optimization nor direct supervision. While we also learn a shape prior on synthetic data, we amortize, i.e., learn, maximum likelihood fitting using deep neural networks resulting in efficient shape completion without sacrificing accuracy. On synthetic benchmarks based on ShapeNet and ModelNet as well as on real robotics data from KITTI and Kinect, we demonstrate that the proposed amortized maximum likelihood approach is able to compete with a fully supervised baseline and outperforms the data-driven approach of Engelmann et al., while requiring less supervision and being significantly faster.

avg

pdf Project Page [BibTex]



Object Scene Flow

Menze, M., Heipke, C., Geiger, A.

ISPRS Journal of Photogrammetry and Remote Sensing, 2018 (article)

Abstract
This work investigates the estimation of dense three-dimensional motion fields, commonly referred to as scene flow. While great progress has been made in recent years, large displacements and adverse imaging conditions as observed in natural outdoor environments are still very challenging for current approaches to reconstruction and motion estimation. In this paper, we propose a unified random field model which reasons jointly about 3D scene flow as well as the location, shape and motion of vehicles in the observed scene. We formulate the problem as the task of decomposing the scene into a small number of rigidly moving objects sharing the same motion parameters. Thus, our formulation effectively introduces long-range spatial dependencies which commonly employed local rigidity priors are lacking. Our inference algorithm then estimates the association of image segments and object hypotheses together with their three-dimensional shape and motion. We demonstrate the potential of the proposed approach by introducing a novel challenging scene flow benchmark which allows for a thorough comparison of the proposed scene flow approach with respect to various baseline models. In contrast to previous benchmarks, our evaluation is the first to provide stereo and optical flow ground truth for dynamic real-world urban scenes at large scale. Our experiments reveal that rigid motion segmentation can be utilized as an effective regularizer for the scene flow problem, improving upon existing two-frame scene flow methods. At the same time, our method yields plausible object segmentations without requiring an explicitly trained recognition model for a specific object class.

avg

Project Page [BibTex]



Rational metareasoning and the plasticity of cognitive control

Lieder, F., Shenhav, A., Musslick, S., Griffiths, T. L.

PLoS Computational Biology, 14(4):e1006043, Public Library of Science, 2018 (article)

re

Project Page [BibTex]



Beyond bounded rationality: Reverse-engineering and enhancing human intelligence

Lieder, F.

University of California, Berkeley, 2018 (phdthesis)

re

[BibTex]


Over-representation of extreme events in decision making reflects rational use of cognitive resources

Lieder, F., Griffiths, T. L., Hsu, M.

Psychological Review, 125(1):1-32, 2018 (article)

re

[BibTex]


2014


3D Traffic Scene Understanding from Movable Platforms

Geiger, A., Lauer, M., Wojek, C., Stiller, C., Urtasun, R.

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 36(5):1012-1025, IEEE, Los Alamitos, CA, May 2014 (article)

Abstract
In this paper, we present a novel probabilistic generative model for multi-object traffic scene understanding from movable platforms which reasons jointly about the 3D scene layout as well as the location and orientation of objects in the scene. In particular, the scene topology, geometry and traffic activities are inferred from short video sequences. Inspired by the impressive driving capabilities of humans, our model does not rely on GPS, lidar or map knowledge. Instead, it takes advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow and occupancy grids. For each of these cues we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that our approach successfully infers the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, experiments using different feature combinations are conducted. Furthermore, we show how by employing context derived from the proposed method we are able to improve over the state-of-the-art in terms of object detection and object orientation estimation in challenging and cluttered urban environments.

avg ps

pdf link (url) [BibTex]



Smart@load? Modeling interruption while using a Smartphone-app in alternating workload conditions

Wirzberger, M.

TU Berlin, 2014 (mastersthesis)

Abstract
Based on a time course model of interruption and resumption, the current thesis aims to inspect cognitive processes after being interrupted by product advertisements while performing a shopping task with a smartphone application. In doing so, different levels of mental workload, which are assumed to influence human performance as well as resumption strategy choice in this context, are taken into account. Within the applied research approach, cognitive modeling in the framework of the cognitive architecture ACT-R is combined with the development of a corresponding experimental design. The derived model predictions are validated with a 2x3-factorial design that includes repeated measures upon the second factor, and consists of 62 human participants. In detail, the influence of mental workload (high vs. low) and interruption (no vs. low vs. high) on various aspects of task-related performance and the applied resumption strategy is assessed. While the inspected performance parameters and resumption strategy choice usually point towards the expected direction for the model data, a converse pattern for the human data shows up in most cases. Comparing model and human data for each level of workload displays rather mixed results that are discussed afterwards. An outline of potential expansions and toeholds for future research within and beyond the mobile sector forms the completion of the thesis.

re

DOI [BibTex]


Modeling of cognitive aspects of mobile interaction

Russwinkel, N., Prezenski, S., Lindner, S., Halbrügge, M., Schulz, M., Wirzberger, M.

Cognitive Processing, 15(Suppl. 1):S22-S24, Springer Nature, 2014 (article)

re

DOI [BibTex]
