Header logo is


2019


Thumb xl screenshot from 2019 12 09 17 01 24
Generating 3D People in Scenes without People

Zhang, Y., Hassan, M., Neumann, H., Black, M. J., Tang, S.

arXiv:1912.02923, December 2019 (article)

Abstract
We present a fully-automatic system that takes a 3D scene and generates plausible 3D human bodies that are posed naturally in that 3D scene. Given a 3D scene without people, humans can easily imagine how people could interact with the scene and the objects in it. However, this is a challenging task for a computer as solving it requires (1) the generated human bodies should be semantically plausible with the 3D environment, e.g. people sitting on the sofa or cooking near the stove; (2) the generated human-scene interaction should be physically feasible in the way that the human body and scene do not interpenetrate while, at the same time, body-scene contact supports physical interactions. To that end, we make use of the surface-based 3D human model SMPL-X. We first train a conditional variational autoencoder to predict semantically plausible 3D human pose conditioned on latent scene representations, then we further refine the generated 3D bodies using scene constraints to enforce feasible physical interaction. We show that our approach is able to synthesize realistic and expressive 3D human bodies that naturally interact with 3D environment. We perform extensive experiments demonstrating that our generative framework compares favorably with existing methods, both qualitatively and quantitatively. We believe that our scene-conditioned 3D human generation pipeline will be useful for numerous applications; e.g. to generate training data for human pose estimation, in video games and in VR/AR.

ps

PDF link (url) [BibTex]

2019



Thumb xl multihumanoflow thumb
Learning Multi-Human Optical Flow

Ranjan, A., Hoffmann, D. T., Tzionas, D., Tang, S., Romero, J., Black, M. J.

International Journal of Computer Vision (IJCV), December 2019 (article)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data used by them does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single-and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

ps

Paper poster link (url) DOI [BibTex]

Paper poster link (url) DOI [BibTex]


Thumb xl celia
Decoding subcategories of human bodies from both body- and face-responsive cortical regions

Foster, C., Zhao, M., Romero, J., Black, M. J., Mohler, B. J., Bartels, A., Bülthoff, I.

NeuroImage, 202(15):116085, November 2019 (article)

Abstract
Our visual system can easily categorize objects (e.g. faces vs. bodies) and further differentiate them into subcategories (e.g. male vs. female). This ability is particularly important for objects of social significance, such as human faces and bodies. While many studies have demonstrated category selectivity to faces and bodies in the brain, how subcategories of faces and bodies are represented remains unclear. Here, we investigated how the brain encodes two prominent subcategories shared by both faces and bodies, sex and weight, and whether neural responses to these subcategories rely on low-level visual, high-level visual or semantic similarity. We recorded brain activity with fMRI while participants viewed faces and bodies that varied in sex, weight, and image size. The results showed that the sex of bodies can be decoded from both body- and face-responsive brain areas, with the former exhibiting more consistent size-invariant decoding than the latter. Body weight could also be decoded in face-responsive areas and in distributed body-responsive areas, and this decoding was also invariant to image size. The weight of faces could be decoded from the fusiform body area (FBA), and weight could be decoded across face and body stimuli in the extrastriate body area (EBA) and a distributed body-responsive area. The sex of well-controlled faces (e.g. excluding hairstyles) could not be decoded from face- or body-responsive regions. These results demonstrate that both face- and body-responsive brain regions encode information that can distinguish the sex and weight of bodies. Moreover, the neural patterns corresponding to sex and weight were invariant to image size and could sometimes generalize across face and body stimuli, suggesting that such subcategorical information is encoded with a high-level visual or semantic code.

ps

paper pdf DOI [BibTex]

paper pdf DOI [BibTex]


Thumb xl autonomous mocap cover image new
Active Perception based Formation Control for Multiple Aerial Vehicles

Tallamraju, R., Price, E., Ludwig, R., Karlapalem, K., Bülthoff, H. H., Black, M. J., Ahmad, A.

IEEE Robotics and Automation Letters, Robotics and Automation Letters, 4(4):4491-4498, IEEE, October 2019 (article)

Abstract
We present a novel robotic front-end for autonomous aerial motion-capture (mocap) in outdoor environments. In previous work, we presented an approach for cooperative detection and tracking (CDT) of a subject using multiple micro-aerial vehicles (MAVs). However, it did not ensure optimal view-point configurations of the MAVs to minimize the uncertainty in the person's cooperatively tracked 3D position estimate. In this article, we introduce an active approach for CDT. In contrast to cooperatively tracking only the 3D positions of the person, the MAVs can actively compute optimal local motion plans, resulting in optimal view-point configurations, which minimize the uncertainty in the tracked estimate. We achieve this by decoupling the goal of active tracking into a quadratic objective and non-convex constraints corresponding to angular configurations of the MAVs w.r.t. the person. We derive this decoupling using Gaussian observation model assumptions within the CDT algorithm. We preserve convexity in optimization by embedding all the non-convex constraints, including those for dynamic obstacle avoidance, as external control inputs in the MPC dynamics. Multiple real robot experiments and comparisons involving 3 MAVs in several challenging scenarios are presented.

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl 3dmm
3D Morphable Face Models - Past, Present and Future

Egger, B., Smith, W. A. P., Tewari, A., Wuhrer, S., Zollhoefer, M., Beeler, T., Bernard, F., Bolkart, T., Kortylewski, A., Romdhani, S., Theobalt, C., Blanz, V., Vetter, T.

arxiv preprint arXiv:1909.01815, September 2019 (article)

Abstract
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation,and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications.

ps

paper project page [BibTex]

paper project page [BibTex]


Thumb xl motorized device
Implementation of a 6-DOF Parallel Continuum Manipulator for Delivering Fingertip Tactile Cues

Young, E. M., Kuchenbecker, K. J.

IEEE Transactions on Haptics, 12(3):295-306, June 2019 (article)

Abstract
Existing fingertip haptic devices can deliver different subsets of tactile cues in a compact package, but we have not yet seen a wearable six-degree-of-freedom (6-DOF) display. This paper presents the Fuppeteer (short for Fingertip Puppeteer), a device that is capable of controlling the position and orientation of a flat platform, such that any combination of normal and shear force can be delivered at any location on any human fingertip. We build on our previous work of designing a parallel continuum manipulator for fingertip haptics by presenting a motorized version in which six flexible Nitinol wires are actuated via independent roller mechanisms and proportional-derivative controllers. We evaluate the settling time and end-effector vibrations observed during system responses to step inputs. After creating a six-dimensional lookup table and adjusting simulated inputs using measured Jacobians, we show that the device can make contact with all parts of the fingertip with a mean error of 1.42 mm. Finally, we present results from a human-subject study. A total of 24 users discerned 9 evenly distributed contact locations with an average accuracy of 80.5%. Translational and rotational shear cues were identified reasonably well near the center of the fingertip and more poorly around the edges.

hi

DOI [BibTex]


Thumb xl hessepami
Learning and Tracking the 3D Body Shape of Freely Moving Infants from RGB-D sequences

Hesse, N., Pujades, S., Black, M., Arens, M., Hofmann, U., Schroeder, S.

Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019 (article)

Abstract
Statistical models of the human body surface are generally learned from thousands of high-quality 3D scans in predefined poses to cover the wide variety of human body shapes and articulations. Acquisition of such data requires expensive equipment, calibration procedures, and is limited to cooperative subjects who can understand and follow instructions, such as adults. We present a method for learning a statistical 3D Skinned Multi-Infant Linear body model (SMIL) from incomplete, low-quality RGB-D sequences of freely moving infants. Quantitative experiments show that SMIL faithfully represents the RGB-D data and properly factorizes the shape and pose of the infants. To demonstrate the applicability of SMIL, we fit the model to RGB-D sequences of freely moving infants and show, with a case study, that our method captures enough motion detail for General Movements Assessment (GMA), a method used in clinical practice for early detection of neurodevelopmental disorders in infants. SMIL provides a new tool for analyzing infant shape and movement and is a step towards an automated system for GMA.

ps

pdf Journal DOI [BibTex]

pdf Journal DOI [BibTex]


Thumb xl kenny
Perceptual Effects of Inconsistency in Human Animations

Kenny, S., Mahmood, N., Honda, C., Black, M. J., Troje, N. F.

ACM Trans. Appl. Percept., 16(1):2:1-2:18, Febuary 2019 (article)

Abstract
The individual shape of the human body, including the geometry of its articulated structure and the distribution of weight over that structure, influences the kinematics of a person’s movements. How sensitive is the visual system to inconsistencies between shape and motion introduced by retargeting motion from one person onto the shape of another? We used optical motion capture to record five pairs of male performers with large differences in body weight, while they pushed, lifted, and threw objects. From these data, we estimated both the kinematics of the actions as well as the performer’s individual body shape. To obtain consistent and inconsistent stimuli, we created animated avatars by combining the shape and motion estimates from either a single performer or from different performers. Using these stimuli we conducted three experiments in an immersive virtual reality environment. First, a group of participants detected which of two stimuli was inconsistent. Performance was very low, and results were only marginally significant. Next, a second group of participants rated perceived attractiveness, eeriness, and humanness of consistent and inconsistent stimuli, but these judgements of animation characteristics were not affected by consistency of the stimuli. Finally, a third group of participants rated properties of the objects rather than of the performers. Here, we found strong influences of shape-motion inconsistency on perceived weight and thrown distance of objects. This suggests that the visual system relies on its knowledge of shape and motion and that these components are assimilated into an altered perception of the action outcome. We propose that the visual system attempts to resist inconsistent interpretations of human animations. Actions involving object manipulations present an opportunity for the visual system to reinterpret the introduced inconsistencies as a change in the dynamics of an object rather than as an unexpected combination of body shape and body motion.

ps

publisher pdf DOI [BibTex]

publisher pdf DOI [BibTex]


no image
How Does It Feel to Clap Hands with a Robot?

Fitter, N. T., Kuchenbecker, K. J.

International Journal of Social Robotics, 2019 (article) Accepted

Abstract
Future robots may need lighthearted physical interaction capabilities to connect with people in meaningful ways. To begin exploring how users perceive playful human–robot hand-to-hand interaction, we conducted a study with 20 participants. Each user played simple hand-clapping games with the Rethink Robotics Baxter Research Robot during a 1-h-long session involving 24 randomly ordered conditions that varied in facial reactivity, physical reactivity, arm stiffness, and clapping tempo. Survey data and experiment recordings demonstrate that this interaction is viable: all users successfully completed the experiment and mentioned enjoying at least one game without prompting. Hand-clapping tempo was highly salient to users, and human-like robot errors were more widely accepted than mechanical errors. Furthermore, perceptions of Baxter varied in the following statistically significant ways: facial reactivity increased the robot’s perceived pleasantness and energeticness; physical reactivity decreased pleasantness, energeticness, and dominance; higher arm stiffness increased safety and decreased dominance; and faster tempo increased energeticness and increased dominance. These findings can motivate and guide roboticists who want to design social–physical human–robot interactions.

hi

[BibTex]

[BibTex]


Thumb xl screenshot 2019 12 11 at 20.48.00
Tactile Roughness Perception of Virtual Gratings by Electrovibration

Isleyen, A., Vardar, Y., Basdogan, C.

IEEE Transactions on Haptics, 2019 (article) Accepted

hi

[BibTex]

[BibTex]


Thumb xl virtualcaliper
The Virtual Caliper: Rapid Creation of Metrically Accurate Avatars from 3D Measurements

Pujades, S., Mohler, B., Thaler, A., Tesch, J., Mahmood, N., Hesse, N., Bülthoff, H. H., Black, M. J.

IEEE Transactions on Visualization and Computer Graphics, 25, pages: 1887,1897, IEEE, 2019 (article)

Abstract
Creating metrically accurate avatars is important for many applications such as virtual clothing try-on, ergonomics, medicine, immersive social media, telepresence, and gaming. Creating avatars that precisely represent a particular individual is challenging however, due to the need for expensive 3D scanners, privacy issues with photographs or videos, and difficulty in making accurate tailoring measurements. We overcome these challenges by creating “The Virtual Caliper”, which uses VR game controllers to make simple measurements. First, we establish what body measurements users can reliably make on their own body. We find several distance measurements to be good candidates and then verify that these are linearly related to 3D body shape as represented by the SMPL body model. The Virtual Caliper enables novice users to accurately measure themselves and create an avatar with their own body shape. We evaluate the metric accuracy relative to ground truth 3D body scan data, compare the method quantitatively to other avatar creation tools, and perform extensive perceptual studies. We also provide a software application to the community that enables novices to rapidly create avatars in fewer than five minutes. Not only is our approach more rapid than existing methods, it exports a metrically accurate 3D avatar model that is rigged and skinned.

ps

Project Page IEEE Open Access IEEE Open Access PDF DOI [BibTex]

Project Page IEEE Open Access IEEE Open Access PDF DOI [BibTex]

2018


Thumb xl dip final
Deep Inertial Poser: Learning to Reconstruct Human Pose from Sparse Inertial Measurements in Real Time

Huang, Y., Kaufmann, M., Aksan, E., Black, M. J., Hilliges, O., Pons-Moll, G.

ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), 37, pages: 185:1-185:15, ACM, November 2018, Two first authors contributed equally (article)

Abstract
We demonstrate a novel deep neural network capable of reconstructing human full body pose in real-time from 6 Inertial Measurement Units (IMUs) worn on the user's body. In doing so, we address several difficult challenges. First, the problem is severely under-constrained as multiple pose parameters produce the same IMU orientations. Second, capturing IMU data in conjunction with ground-truth poses is expensive and difficult to do in many target application scenarios (e.g., outdoors). Third, modeling temporal dependencies through non-linear optimization has proven effective in prior work but makes real-time prediction infeasible. To address this important limitation, we learn the temporal pose priors using deep learning. To learn from sufficient data, we synthesize IMU data from motion capture datasets. A bi-directional RNN architecture leverages past and future information that is available at training time. At test time, we deploy the network in a sliding window fashion, retaining real time capabilities. To evaluate our method, we recorded DIP-IMU, a dataset consisting of 10 subjects wearing 17 IMUs for validation in 64 sequences with 330,000 time instants; this constitutes the largest IMU dataset publicly available. We quantitatively evaluate our approach on multiple datasets and show results from a real-time implementation. DIP-IMU and the code are available for research purposes.

ps

data code pdf preprint video DOI Project Page [BibTex]

2018


data code pdf preprint video DOI Project Page [BibTex]


Thumb xl cover
Deep Neural Network-based Cooperative Visual Tracking through Multiple Micro Aerial Vehicles

Price, E., Lawless, G., Ludwig, R., Martinovic, I., Buelthoff, H. H., Black, M. J., Ahmad, A.

IEEE Robotics and Automation Letters, Robotics and Automation Letters, 3(4):3193-3200, IEEE, October 2018, Also accepted and presented in the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). (article)

Abstract
Multi-camera tracking of humans and animals in outdoor environments is a relevant and challenging problem. Our approach to it involves a team of cooperating micro aerial vehicles (MAVs) with on-board cameras only. DNNs often fail at objects with small scale or far away from the camera, which are typical characteristics of a scenario with aerial robots. Thus, the core problem addressed in this paper is how to achieve on-board, online, continuous and accurate vision-based detections using DNNs for visual person tracking through MAVs. Our solution leverages cooperation among multiple MAVs and active selection of most informative regions of image. We demonstrate the efficiency of our approach through simulations with up to 16 robots and real robot experiments involving two aerial robots tracking a person, while maintaining an active perception-driven formation. ROS-based source code is provided for the benefit of the community.

ps

Published Version link (url) DOI [BibTex]

Published Version link (url) DOI [BibTex]


Thumb xl huggiebot
Softness, Warmth, and Responsiveness Improve Robot Hugs

Block, A. E., Kuchenbecker, K. J.

International Journal of Social Robotics, 11(1):49-64, October 2018 (article)

Abstract
Hugs are one of the first forms of contact and affection humans experience. Due to their prevalence and health benefits, roboticists are naturally interested in having robots one day hug humans as seamlessly as humans hug other humans. This project's purpose is to evaluate human responses to different robot physical characteristics and hugging behaviors. Specifically, we aim to test the hypothesis that a soft, warm, touch-sensitive PR2 humanoid robot can provide humans with satisfying hugs by matching both their hugging pressure and their hugging duration. Thirty relatively young and rather technical participants experienced and evaluated twelve hugs with the robot, divided into three randomly ordered trials that focused on physical robot characteristics (single factor, three levels) and nine randomly ordered trials with low, medium, and high hug pressure and duration (two factors, three levels each). Analysis of the results showed that people significantly prefer soft, warm hugs over hard, cold hugs. Furthermore, users prefer hugs that physically squeeze them and release immediately when they are ready for the hug to end. Taking part in the experiment also significantly increased positive user opinions of robots and robot use.

hi

link (url) DOI Project Page [BibTex]

link (url) DOI Project Page [BibTex]


Thumb xl alice
First Impressions of Personality Traits From Body Shapes

Hu, Y., Parde, C. J., Hill, M. Q., Mahmood, N., O’Toole, A. J.

Psychological Science, 29(12):1969-–1983, October 2018 (article)

Abstract
People infer the personalities of others from their facial appearance. Whether they do so from body shapes is less studied. We explored personality inferences made from body shapes. Participants rated personality traits for male and female bodies generated with a three-dimensional body model. Multivariate spaces created from these ratings indicated that people evaluate bodies on valence and agency in ways that directly contrast positive and negative traits from the Big Five domains. Body-trait stereotypes based on the trait ratings revealed a myriad of diverse body shapes that typify individual traits. Personality-trait profiles were predicted reliably from a subset of the body-shape features used to specify the three-dimensional bodies. Body features related to extraversion and conscientiousness were predicted with the highest consensus, followed by openness traits. This study provides the first comprehensive look at the range, diversity, and reliability of personality inferences that people make from body shapes.

ps

publisher site pdf DOI [BibTex]

publisher site pdf DOI [BibTex]


Thumb xl fict 05 00018 g003
Visual Perception and Evaluation of Photo-Realistic Self-Avatars From 3D Body Scans in Males and Females

Thaler, A., Piryankova, I., Stefanucci, J. K., Pujades, S., de la Rosa, S., Streuber, S., Romero, J., Black, M. J., Mohler, B. J.

Frontiers in ICT, 5, pages: 1-14, September 2018 (article)

Abstract
The creation or streaming of photo-realistic self-avatars is important for virtual reality applications that aim for perception and action to replicate real world experience. The appearance and recognition of a digital self-avatar may be especially important for applications related to telepresence, embodied virtual reality, or immersive games. We investigated gender differences in the use of visual cues (shape, texture) of a self-avatar for estimating body weight and evaluating avatar appearance. A full-body scanner was used to capture each participant's body geometry and color information and a set of 3D virtual avatars with realistic weight variations was created based on a statistical body model. Additionally, a second set of avatars was created with an average underlying body shape matched to each participant’s height and weight. In four sets of psychophysical experiments, the influence of visual cues on the accuracy of body weight estimation and the sensitivity to weight changes was assessed by manipulating body shape (own, average) and texture (own photo-realistic, checkerboard). The avatars were presented on a large-screen display, and participants responded to whether the avatar's weight corresponded to their own weight. Participants also adjusted the avatar's weight to their desired weight and evaluated the avatar's appearance with regard to similarity to their own body, uncanniness, and their willingness to accept it as a digital representation of the self. The results of the psychophysical experiments revealed no gender difference in the accuracy of estimating body weight in avatars. However, males accepted a larger weight range of the avatars as corresponding to their own. In terms of the ideal body weight, females but not males desired a thinner body. With regard to the evaluation of avatar appearance, the questionnaire responses suggest that own photo-realistic texture was more important to males for higher similarity ratings, while own body shape seemed to be more important to females. These results argue for gender-specific considerations when creating self-avatars.

ps

pdf DOI [BibTex]

pdf DOI [BibTex]


no image
Task-Driven PCA-Based Design Optimization of Wearable Cutaneous Devices

Pacchierotti, C., Young, E. M., Kuchenbecker, K. J.

IEEE Robotics and Automation Letters, 3(3):2214-2221, July 2018, Presented at ICRA 2018 (article)

Abstract
Small size and low weight are critical requirements for wearable and portable haptic interfaces, making it essential to work toward the optimization of their sensing and actuation systems. This paper presents a new approach for task-driven design optimization of fingertip cutaneous haptic devices. Given one (or more) target tactile interactions to render and a cutaneous device to optimize, we evaluate the minimum number and best configuration of the device’s actuators to minimize the estimated haptic rendering error. First, we calculate the motion needed for the original cutaneous device to render the considered target interaction. Then, we run a principal component analysis (PCA) to search for possible couplings between the original motor inputs, looking also for the best way to reconfigure them. If some couplings exist, we can re-design our cutaneous device with fewer motors, optimally configured to render the target tactile sensation. The proposed approach is quite general and can be applied to different tactile sensors and cutaneous devices. We validated it using a BioTac tactile sensor and custom plate-based 3-DoF and 6-DoF fingertip cutaneous devices, considering six representative target tactile interactions. The algorithm was able to find couplings between each device’s motor inputs, proving it to be a viable approach to optimize the design of wearable and portable cutaneous devices. Finally, we present two examples of optimized designs for our 3-DoF fingertip cutaneous device.

hi

link (url) DOI [BibTex]

link (url) DOI [BibTex]


Thumb xl mazen
Robust Physics-based Motion Retargeting with Realistic Body Shapes

Borno, M. A., Righetti, L., Black, M. J., Delp, S. L., Fiume, E., Romero, J.

Computer Graphics Forum, 37, pages: 6:1-12, July 2018 (article)

Abstract
Motion capture is often retargeted to new, and sometimes drastically different, characters. When the characters take on realistic human shapes, however, we become more sensitive to the motion looking right. This means adapting it to be consistent with the physical constraints imposed by different body shapes. We show how to take realistic 3D human shapes, approximate them using a simplified representation, and animate them so that they move realistically using physically-based retargeting. We develop a novel spacetime optimization approach that learns and robustly adapts physical controllers to new bodies and constraints. The approach automatically adapts the motion of the mocap subject to the body shape of a target subject. This motion respects the physical properties of the new body and every body shape results in a different and appropriate movement. This makes it easy to create a varied set of motions from a single mocap sequence by simply varying the characters. In an interactive environment, successful retargeting requires adapting the motion to unexpected external forces. We achieve robustness to such forces using a novel LQR-tree formulation. We show that the simulated motions look appropriate to each character’s anatomy and their actions are robust to perturbations.

mg ps

pdf video Project Page Project Page [BibTex]

pdf video Project Page Project Page [BibTex]


Thumb xl fitter18 frai imus
Teaching a Robot Bimanual Hand-Clapping Games via Wrist-Worn IMUs

Fitter, N. T., Kuchenbecker, K. J.

Frontiers in Robotics and Artificial Intelligence, 5(85), July 2018 (article)

Abstract
Colleagues often shake hands in greeting, friends connect through high fives, and children around the world rejoice in hand-clapping games. As robots become more common in everyday human life, they will have the opportunity to join in these social-physical interactions, but few current robots are intended to touch people in friendly ways. This article describes how we enabled a Baxter Research Robot to both teach and learn bimanual hand-clapping games with a human partner. Our system monitors the user's motions via a pair of inertial measurement units (IMUs) worn on the wrists. We recorded a labeled library of 10 common hand-clapping movements from 10 participants; this dataset was used to train an SVM classifier to automatically identify hand-clapping motions from previously unseen participants with a test-set classification accuracy of 97.0%. Baxter uses these sensors and this classifier to quickly identify the motions of its human gameplay partner, so that it can join in hand-clapping games. This system was evaluated by N = 24 naïve users in an experiment that involved learning sequences of eight motions from Baxter, teaching Baxter eight-motion game patterns, and completing a free interaction period. The motion classification accuracy in this less structured setting was 85.9%, primarily due to unexpected variations in motion timing. The quantitative task performance results and qualitative participant survey responses showed that learning games from Baxter was significantly easier than teaching games to Baxter, and that the teaching role caused users to consider more teamwork aspects of the gameplay. Over the course of the experiment, people felt more understood by Baxter and became more willing to follow the example of the robot. Users felt uniformly safe interacting with Baxter, and they expressed positive opinions of Baxter and reported fun interacting with the robot. Taken together, the results indicate that this robot achieved credible social-physical interaction with humans and that its ability to both lead and follow systematically changed the human partner's experience.

hi

DOI [BibTex]

DOI [BibTex]


no image
Automatically Rating Trainee Skill at a Pediatric Laparoscopic Suturing Task

Oquendo, Y. A., Riddle, E. W., Hiller, D., Blinman, T. A., Kuchenbecker, K. J.

Surgical Endoscopy, 32(4):1840-1857, April 2018 (article)

hi

DOI [BibTex]

DOI [BibTex]


Thumb xl animage2mask3
Assessing body image in anorexia nervosa using biometric self-avatars in virtual reality: Attitudinal components rather than visual body size estimation are distorted

Mölbert, S. C., Thaler, A., Mohler, B. J., Streuber, S., Romero, J., Black, M. J., Zipfel, S., Karnath, H., Giel, K. E.

Psychological Medicine, 48(4):642-653, March 2018 (article)

Abstract
Background: Body image disturbance (BID) is a core symptom of anorexia nervosa (AN), but as yet distinctive features of BID are unknown. The present study aimed at disentangling perceptual and attitudinal components of BID in AN. Methods: We investigated n=24 women with AN and n=24 controls. Based on a 3D body scan, we created realistic virtual 3D bodies (avatars) for each participant that were varied through a range of ±20% of the participants' weights. Avatars were presented in a virtual reality mirror scenario. Using different psychophysical tasks, participants identified and adjusted their actual and their desired body weight. To test for general perceptual biases in estimating body weight, a second experiment investigated perception of weight and shape matched avatars with another identity. Results: Women with AN and controls underestimated their weight, with a trend that women with AN underestimated more. The average desired body of controls had normal weight while the average desired weight of women with AN corresponded to extreme AN (DSM-5). Correlation analyses revealed that desired body weight, but not accuracy of weight estimation, was associated with eating disorder symptoms. In the second experiment, both groups estimated accurately while the most attractive body was similar to Experiment 1. Conclusions: Our results contradict the widespread assumption that patients with AN overestimate their body weight due to visual distortions. Rather, they illustrate that BID might be driven by distorted attitudes with regard to the desired body. Clinical interventions should aim at helping patients with AN to change their desired weight.

ps

doi pdf DOI Project Page [BibTex]


Thumb xl plos1
Body size estimation of self and others in females varying in BMI

Thaler, A., Geuss, M. N., Mölbert, S. C., Giel, K. E., Streuber, S., Romero, J., Black, M. J., Mohler, B. J.

PLoS ONE, 13(2), Febuary 2018 (article)

Abstract
Previous literature suggests that a disturbed ability to accurately identify own body size may contribute to overweight. Here, we investigated the influence of personal body size, indexed by body mass index (BMI), on body size estimation in a non-clinical population of females varying in BMI. We attempted to disentangle general biases in body size estimates and attitudinal influences by manipulating whether participants believed the body stimuli (personalized avatars with realistic weight variations) represented their own body or that of another person. Our results show that the accuracy of own body size estimation is predicted by personal BMI, such that participants with lower BMI underestimated their body size and participants with higher BMI overestimated their body size. Further, participants with higher BMI were less likely to notice the same percentage of weight gain than participants with lower BMI. Importantly, these results were only apparent when participants were judging a virtual body that was their own identity (Experiment 1), but not when they estimated the size of a body with another identity and the same underlying body shape (Experiment 2a). The different influences of BMI on accuracy of body size estimation and sensitivity to weight change for self and other identity suggests that effects of BMI on visual body size estimation are self-specific and not generalizable to other bodies.

ps

pdf DOI Project Page [BibTex]


no image
Immersive Low-Cost Virtual Reality Treatment for Phantom Limb Pain: Evidence from Two Cases

Ambron, E., Miller, A., Kuchenbecker, K. J., Buxbaum, L. J., Coslett, H. B.

Frontiers in Neurology, 9(67):1-7, 2018 (article)

hi

DOI Project Page [BibTex]

DOI Project Page [BibTex]


Thumb xl yanzhang clustering
Temporal Human Action Segmentation via Dynamic Clustering

Zhang, Y., Sun, H., Tang, S., Neumann, H.

arXiv preprint arXiv:1803.05790, 2018 (article)

Abstract
We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring. Our proposed algorithm is unsupervised, fast, generic to process various types of features, and applica- ble in both the online and offline settings. We perform extensive experiments of processing data streams, and show that our algorithm achieves the state-of- the-art results for both online and offline settings.

ps

link (url) [BibTex]

link (url) [BibTex]


Thumb xl motion segmentation tracking clustering teaser
Motion Segmentation & Multiple Object Tracking by Correlation Co-Clustering

Keuper, M., Tang, S., Andres, B., Brox, T., Schiele, B.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018 (article)

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]

2013


Thumb xl thumb
Branch&Rank for Efficient Object Detection

Lehmann, A., Gehler, P., VanGool, L.

International Journal of Computer Vision, Springer, December 2013 (article)

Abstract
Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often less than 100 ranking operations. This efficiency enables the use of strong and also costly classifiers like non-linear SVMs with RBF-TeX kernels. We thereby relieve an inherent limitation of branch&bound methods as bounds are often not tight enough to be effective in practice. Our approach features three key components: a ranking function that operates on sets of hypotheses and a grouping of these into different tasks. Detection efficiency results from adaptively sub-dividing the object search space into decreasingly smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC’07 dataset to demonstrate the algorithmic properties of branch&rank.

ps

pdf link (url) [BibTex]

2013


pdf link (url) [BibTex]


Thumb xl tro
Extracting Postural Synergies for Robotic Grasping

Romero, J., Feix, T., Ek, C., Kjellstrom, H., Kragic, D.

Robotics, IEEE Transactions on, 29(6):1342-1352, December 2013 (article)

ps

[BibTex]

[BibTex]


Thumb xl pic cviu13
Markov Random Field Modeling, Inference & Learning in Computer Vision & Image Understanding: A Survey

Wang, C., Komodakis, N., Paragios, N.

Computer Vision and Image Understanding (CVIU), 117(11):1610-1627, November 2013 (article)

Abstract
In this paper, we present a comprehensive survey of Markov Random Fields (MRFs) in computer vision and image understanding, with respect to the modeling, the inference and the learning. While MRFs were introduced into the computer vision field about two decades ago, they started to become a ubiquitous tool for solving visual perception problems around the turn of the millennium following the emergence of efficient inference methods. During the past decade, a variety of MRF models as well as inference and learning methods have been developed for addressing numerous low, mid and high-level vision problems. While most of the literature concerns pairwise MRFs, in recent years we have also witnessed significant progress in higher-order MRFs, which substantially enhances the expressiveness of graph-based models and expands the domain of solvable problems. This survey provides a compact and informative summary of the major literature in this research topic.

ps

Publishers site pdf [BibTex]

Publishers site pdf [BibTex]


Thumb xl ijrr
Vision meets Robotics: The KITTI Dataset

Geiger, A., Lenz, P., Stiller, C., Urtasun, R.

International Journal of Robotics Research, 32(11):1231 - 1237 , Sage Publishing, September 2013 (article)

Abstract
We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10-100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system. The scenarios are diverse, capturing real-world traffic situations and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl imgf0006
Human Pose Calculation from Optical Flow Data

Black, M., Loper, M., Romero, J., Zuffi, S.

European Patent Application EP 2843621 , August 2013 (patent)

ps

Google Patents [BibTex]

Google Patents [BibTex]


Thumb xl jmiv2012 mut
Unscented Kalman Filtering on Riemannian Manifolds

Soren Hauberg, Francois Lauze, Kim S. Pedersen

Journal of Mathematical Imaging and Vision, 46(1):103-120, Springer Netherlands, May 2013 (article)

ps

Publishers site PDF [BibTex]

Publishers site PDF [BibTex]


Thumb xl thumb hennigk2012 2
Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

Journal of Machine Learning Research, 14(1):843-865, March 2013 (article)

Abstract
Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

ei ps pn

website+code pdf link (url) [BibTex]

website+code pdf link (url) [BibTex]


Thumb xl training faces
Random Forests for Real Time 3D Face Analysis

Fanelli, G., Dantone, M., Gall, J., Fossati, A., van Gool, L.

International Journal of Computer Vision, 101(3):437-458, Springer, 2013 (article)

Abstract
We present a random forest-based framework for real time head pose estimation from depth images and extend it to localize a set of facial features in 3D. Our algorithm takes a voting approach, where each patch extracted from the depth image can directly cast a vote for the head pose or each of the facial features. Our system proves capable of handling large rotations, partial occlusions, and the noisy depth data acquired using commercial sensors. Moreover, the algorithm works on each frame independently and achieves real time performance without resorting to parallel computations on a GPU. We present extensive experiments on publicly available, challenging datasets and present a new annotated head pose database recorded using a Microsoft Kinect.

ps

data and code publisher's site pdf DOI Project Page [BibTex]

data and code publisher's site pdf DOI Project Page [BibTex]


Thumb xl humans3tracking
Markerless Motion Capture of Multiple Characters Using Multi-view Image Segmentation

Liu, Y., Gall, J., Stoll, C., Dai, Q., Seidel, H., Theobalt, C.

Transactions on Pattern Analysis and Machine Intelligence, 35(11):2720-2735, 2013 (article)

Abstract
Capturing the skeleton motion and detailed time-varying surface geometry of multiple, closely interacting peoples is a very challenging task, even in a multicamera setup, due to frequent occlusions and ambiguities in feature-to-person assignments. To address this task, we propose a framework that exploits multiview image segmentation. To this end, a probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Given the articulated template models of each person and the labeled pixels, a combined optimization scheme, which splits the skeleton pose optimization problem into a local one and a lower dimensional global one, is applied one by one to each individual, followed with surface estimation to capture detailed nonrigid deformations. We show on various sequences that our approach can capture the 3D motion of humans accurately even if they move rapidly, if they wear wide apparel, and if they are engaged in challenging multiperson motions, including dancing, wrestling, and hugging.

ps

data and video pdf DOI Project Page [BibTex]

data and video pdf DOI Project Page [BibTex]


Thumb xl perception
Viewpoint and pose in body-form adaptation

Sekunova, A., Black, M., Parkinson, L., Barton, J. J. S.

Perception, 42(2):176-186, 2013 (article)

Abstract
Faces and bodies are complex structures, perception of which can play important roles in person identification and inference of emotional state. Face representations have been explored using behavioural adaptation: in particular, studies have shown that face aftereffects show relatively broad tuning for viewpoint, consistent with origin in a high-level structural descriptor far removed from the retinal image. Our goals were to determine first, if body aftereffects also showed a degree of viewpoint invariance, and second if they also showed pose invariance, given that changes in pose create even more dramatic changes in the 2-D retinal image. We used a 3-D model of the human body to generate headless body images, whose parameters could be varied to generate different body forms, viewpoints, and poses. In the first experiment, subjects adapted to varying viewpoints of either slim or heavy bodies in a neutral stance, followed by test stimuli that were all front-facing. In the second experiment, we used the same front-facing bodies in neutral stance as test stimuli, but compared adaptation from bodies in the same neutral stance to adaptation with the same bodies in different poses. We found that body aftereffects were obtained over substantial viewpoint changes, with no significant decline in aftereffect magnitude with increasing viewpoint difference between adapting and test images. Aftereffects also showed transfer across one change in pose but not across another. We conclude that body representations may have more viewpoint invariance than faces, and demonstrate at least some transfer across pose, consistent with a high-level structural description. Keywords: aftereffect, shape, face, representation

ps

pdf from publisher abstract pdf link (url) Project Page [BibTex]

pdf from publisher abstract pdf link (url) Project Page [BibTex]


Thumb xl 2013 ivc rkek teaser
Non-parametric hand pose estimation with object context

Romero, J., Kjellström, H., Ek, C. H., Kragic, D.

Image and Vision Computing , 31(8):555 - 564, 2013 (article)

Abstract
In the spirit of recent work on contextual recognition and estimation, we present a method for estimating the pose of human hands, employing information about the shape of the object in the hand. Despite the fact that most applications of human hand tracking involve grasping and manipulation of objects, the majority of methods in the literature assume a free hand, isolated from the surrounding environment. Occlusion of the hand from grasped objects does in fact often pose a severe challenge to the estimation of hand pose. In the presented method, object occlusion is not only compensated for, it contributes to the pose estimation in a contextual fashion; this without an explicit model of object shape. Our hand tracking method is non-parametric, performing a nearest neighbor search in a large database (.. entries) of hand poses with and without grasped objects. The system that operates in real time, is robust to self occlusions, object occlusions and segmentation errors, and provides full hand pose reconstruction from monocular video. Temporal consistency in hand pose is taken into account, without explicitly tracking the hand in the high-dim pose space. Experiments show the non-parametric method to outperform other state of the art regression methods, while operating at a significantly lower computational cost than comparable model-based hand tracking methods.

ps

Publisher site pdf link (url) [BibTex]

Publisher site pdf link (url) [BibTex]