Header logo is


2013


Thumb xl thumb
Branch&Rank for Efficient Object Detection

Lehmann, A., Gehler, P., VanGool, L.

International Journal of Computer Vision, Springer, December 2013 (article)

Abstract
Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often less than 100 ranking operations. This efficiency enables the use of strong and also costly classifiers like non-linear SVMs with RBF-TeX kernels. We thereby relieve an inherent limitation of branch&bound methods as bounds are often not tight enough to be effective in practice. Our approach features three key components: a ranking function that operates on sets of hypotheses and a grouping of these into different tasks. Detection efficiency results from adaptively sub-dividing the object search space into decreasingly smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC’07 dataset to demonstrate the algorithmic properties of branch&rank.

ps

pdf link (url) [BibTex]

2013


pdf link (url) [BibTex]


Thumb xl tro
Extracting Postural Synergies for Robotic Grasping

Romero, J., Feix, T., Ek, C., Kjellstrom, H., Kragic, D.

Robotics, IEEE Transactions on, 29(6):1342-1352, December 2013 (article)

ps

[BibTex]

[BibTex]


Thumb xl pic cviu13
Markov Random Field Modeling, Inference & Learning in Computer Vision & Image Understanding: A Survey

Wang, C., Komodakis, N., Paragios, N.

Computer Vision and Image Understanding (CVIU), 117(11):1610-1627, November 2013 (article)

Abstract
In this paper, we present a comprehensive survey of Markov Random Fields (MRFs) in computer vision and image understanding, with respect to the modeling, the inference and the learning. While MRFs were introduced into the computer vision field about two decades ago, they started to become a ubiquitous tool for solving visual perception problems around the turn of the millennium following the emergence of efficient inference methods. During the past decade, a variety of MRF models as well as inference and learning methods have been developed for addressing numerous low, mid and high-level vision problems. While most of the literature concerns pairwise MRFs, in recent years we have also witnessed significant progress in higher-order MRFs, which substantially enhances the expressiveness of graph-based models and expands the domain of solvable problems. This survey provides a compact and informative summary of the major literature in this research topic.

ps

Publishers site pdf [BibTex]

Publishers site pdf [BibTex]


no image
Multi-robot cooperative spherical-object tracking in 3D space based on particle filters

Ahmad, A., Lima, P.

Robotics and Autonomous Systems, 61(10):1084-1093, October 2013 (article)

Abstract
This article presents a cooperative approach for tracking a moving spherical object in 3D space by a team of mobile robots equipped with sensors, in a highly dynamic environment. The tracker’s core is a particle filter, modified to handle, within a single unified framework, the problem of complete or partial occlusion for some of the involved mobile sensors, as well as inconsistent estimates in the global frame among sensors, due to observation errors and/or self-localization uncertainty. We present results supporting our approach by applying it to a team of real soccer robots tracking a soccer ball, including comparison with ground truth.

ps

DOI [BibTex]

DOI [BibTex]


Thumb xl ijrr
Vision meets Robotics: The KITTI Dataset

Geiger, A., Lenz, P., Stiller, C., Urtasun, R.

International Journal of Robotics Research, 32(11):1231 - 1237 , Sage Publishing, September 2013 (article)

Abstract
We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10-100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system. The scenarios are diverse, capturing real-world traffic situations and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl teaser
Visualizing dimensionality reduction of systems biology data

Lehrmann, A. M., Huber, M., Polatkan, A. C., Pritzkau, A., Nieselt, K.

Data Mining and Knowledge Discovery, 1(27):146-165, Springer, July 2013 (article)

ps

pdf SpRay [BibTex]

pdf SpRay [BibTex]


Thumb xl jmiv2012 mut
Unscented Kalman Filtering on Riemannian Manifolds

Soren Hauberg, Francois Lauze, Kim S. Pedersen

Journal of Mathematical Imaging and Vision, 46(1):103-120, Springer Netherlands, May 2013 (article)

ps

Publishers site PDF [BibTex]

Publishers site PDF [BibTex]


Thumb xl thumb hennigk2012 2
Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

Journal of Machine Learning Research, 14(1):843-865, March 2013 (article)

Abstract
Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

ei ps pn

website+code pdf link (url) [BibTex]

website+code pdf link (url) [BibTex]


Thumb xl illuminationpami13
Simultaneous Cast Shadows, Illumination and Geometry Inference Using Hypergraphs

Panagopoulos, A., Wang, C., Samaras, D., Paragios, N.

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 35(2):437-449, 2013 (article)

ps

pdf [BibTex]

pdf [BibTex]


Thumb xl training faces
Random Forests for Real Time 3D Face Analysis

Fanelli, G., Dantone, M., Gall, J., Fossati, A., van Gool, L.

International Journal of Computer Vision, 101(3):437-458, Springer, 2013 (article)

Abstract
We present a random forest-based framework for real time head pose estimation from depth images and extend it to localize a set of facial features in 3D. Our algorithm takes a voting approach, where each patch extracted from the depth image can directly cast a vote for the head pose or each of the facial features. Our system proves capable of handling large rotations, partial occlusions, and the noisy depth data acquired using commercial sensors. Moreover, the algorithm works on each frame independently and achieves real time performance without resorting to parallel computations on a GPU. We present extensive experiments on publicly available, challenging datasets and present a new annotated head pose database recorded using a Microsoft Kinect.

ps

data and code publisher's site pdf DOI Project Page [BibTex]

data and code publisher's site pdf DOI Project Page [BibTex]


Thumb xl humans3tracking
Markerless Motion Capture of Multiple Characters Using Multi-view Image Segmentation

Liu, Y., Gall, J., Stoll, C., Dai, Q., Seidel, H., Theobalt, C.

Transactions on Pattern Analysis and Machine Intelligence, 35(11):2720-2735, 2013 (article)

Abstract
Capturing the skeleton motion and detailed time-varying surface geometry of multiple, closely interacting peoples is a very challenging task, even in a multicamera setup, due to frequent occlusions and ambiguities in feature-to-person assignments. To address this task, we propose a framework that exploits multiview image segmentation. To this end, a probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Given the articulated template models of each person and the labeled pixels, a combined optimization scheme, which splits the skeleton pose optimization problem into a local one and a lower dimensional global one, is applied one by one to each individual, followed with surface estimation to capture detailed nonrigid deformations. We show on various sequences that our approach can capture the 3D motion of humans accurately even if they move rapidly, if they wear wide apparel, and if they are engaged in challenging multiperson motions, including dancing, wrestling, and hugging.

ps

data and video pdf DOI Project Page [BibTex]

data and video pdf DOI Project Page [BibTex]


Thumb xl perception
Viewpoint and pose in body-form adaptation

Sekunova, A., Black, M., Parkinson, L., Barton, J. J. S.

Perception, 42(2):176-186, 2013 (article)

Abstract
Faces and bodies are complex structures, perception of which can play important roles in person identification and inference of emotional state. Face representations have been explored using behavioural adaptation: in particular, studies have shown that face aftereffects show relatively broad tuning for viewpoint, consistent with origin in a high-level structural descriptor far removed from the retinal image. Our goals were to determine first, if body aftereffects also showed a degree of viewpoint invariance, and second if they also showed pose invariance, given that changes in pose create even more dramatic changes in the 2-D retinal image. We used a 3-D model of the human body to generate headless body images, whose parameters could be varied to generate different body forms, viewpoints, and poses. In the first experiment, subjects adapted to varying viewpoints of either slim or heavy bodies in a neutral stance, followed by test stimuli that were all front-facing. In the second experiment, we used the same front-facing bodies in neutral stance as test stimuli, but compared adaptation from bodies in the same neutral stance to adaptation with the same bodies in different poses. We found that body aftereffects were obtained over substantial viewpoint changes, with no significant decline in aftereffect magnitude with increasing viewpoint difference between adapting and test images. Aftereffects also showed transfer across one change in pose but not across another. We conclude that body representations may have more viewpoint invariance than faces, and demonstrate at least some transfer across pose, consistent with a high-level structural description. Keywords: aftereffect, shape, face, representation

ps

pdf from publisher abstract pdf link (url) Project Page [BibTex]

pdf from publisher abstract pdf link (url) Project Page [BibTex]


Thumb xl 2013 ivc rkek teaser
Non-parametric hand pose estimation with object context

Romero, J., Kjellström, H., Ek, C. H., Kragic, D.

Image and Vision Computing , 31(8):555 - 564, 2013 (article)

Abstract
In the spirit of recent work on contextual recognition and estimation, we present a method for estimating the pose of human hands, employing information about the shape of the object in the hand. Despite the fact that most applications of human hand tracking involve grasping and manipulation of objects, the majority of methods in the literature assume a free hand, isolated from the surrounding environment. Occlusion of the hand from grasped objects does in fact often pose a severe challenge to the estimation of hand pose. In the presented method, object occlusion is not only compensated for, it contributes to the pose estimation in a contextual fashion; this without an explicit model of object shape. Our hand tracking method is non-parametric, performing a nearest neighbor search in a large database (.. entries) of hand poses with and without grasped objects. The system that operates in real time, is robust to self occlusions, object occlusions and segmentation errors, and provides full hand pose reconstruction from monocular video. Temporal consistency in hand pose is taken into account, without explicitly tracking the hand in the high-dim pose space. Experiments show the non-parametric method to outperform other state of the art regression methods, while operating at a significantly lower computational cost than comparable model-based hand tracking methods.

ps

Publisher site pdf link (url) [BibTex]

Publisher site pdf link (url) [BibTex]

2012


Thumb xl eigenmaps
An SVD-Based Approach for Ghost Detection and Removal in High Dynamic Range Images

Srikantha, A., Sidibe, D., Meriaudeau, F.

International Conference on Pattern Recognition (ICPR), pages: 380-383, November 2012 (article)

ps

pdf [BibTex]

2012


pdf [BibTex]


Thumb xl posear
Coupled Action Recognition and Pose Estimation from Multiple Views

Yao, A., Gall, J., van Gool, L.

International Journal of Computer Vision, 100(1):16-37, October 2012 (article)

ps

publisher's site code pdf Project Page Project Page Project Page [BibTex]

publisher's site code pdf Project Page Project Page Project Page [BibTex]


Thumb xl representativecrop
DRAPE: DRessing Any PErson

Guan, P., Reiss, L., Hirshberg, D., Weiss, A., Black, M. J.

ACM Trans. on Graphics (Proc. SIGGRAPH), 31(4):35:1-35:10, July 2012 (article)

Abstract
We describe a complete system for animating realistic clothing on synthetic bodies of any shape and pose without manual intervention. The key component of the method is a model of clothing called DRAPE (DRessing Any PErson) that is learned from a physics-based simulation of clothing on bodies of different shapes and poses. The DRAPE model has the desirable property of "factoring" clothing deformations due to body shape from those due to pose variation. This factorization provides an approximation to the physical clothing deformation and greatly simplifies clothing synthesis. Given a parameterized model of the human body with known shape and pose parameters, we describe an algorithm that dresses the body with a garment that is customized to fit and possesses realistic wrinkles. DRAPE can be used to dress static bodies or animated sequences with a learned model of the cloth dynamics. Since the method is fully automated, it is appropriate for dressing large numbers of virtual characters of varying shape. The method is significantly more efficient than physical simulation.

ps

YouTube pdf talk Project Page Project Page [BibTex]

YouTube pdf talk Project Page Project Page [BibTex]


Thumb xl ghosthdr
Ghost Detection and Removal for High Dynamic Range Images: Recent Advances

Srikantha, A., Sidib’e, D.

Signal Processing: Image Communication, 27, pages: 650-662, July 2012 (article)

ps

pdf link (url) [BibTex]

pdf link (url) [BibTex]


Thumb xl thumb screen shot 2012 10 06 at 11.48.38 am
Visual Servoing on Unknown Objects

Gratal, X., Romero, J., Bohg, J., Kragic, D.

Mechatronics, 22(4):423-435, Elsevier, June 2012, Visual Servoing \{SI\} (article)

Abstract
We study visual servoing in a framework of detection and grasping of unknown objects. Classically, visual servoing has been used for applications where the object to be servoed on is known to the robot prior to the task execution. In addition, most of the methods concentrate on aligning the robot hand with the object without grasping it. In our work, visual servoing techniques are used as building blocks in a system capable of detecting and grasping unknown objects in natural scenes. We show how different visual servoing techniques facilitate a complete grasping cycle.

am ps

Grasping sequence video Offline calibration video Pdf DOI [BibTex]

Grasping sequence video Offline calibration video Pdf DOI [BibTex]


Thumb xl jneuroscicrop
Visual Orientation and Directional Selectivity Through Thalamic Synchrony

Stanley, G., Jin, J., Wang, Y., Desbordes, G., Wang, Q., Black, M., Alonso, J.

Journal of Neuroscience, 32(26):9073-9088, June 2012 (article)

Abstract
Thalamic neurons respond to visual scenes by generating synchronous spike trains on the timescale of 10–20 ms that are very effective at driving cortical targets. Here we demonstrate that this synchronous activity contains unexpectedly rich information about fundamental properties of visual stimuli. We report that the occurrence of synchronous firing of cat thalamic cells with highly overlapping receptive fields is strongly sensitive to the orientation and the direction of motion of the visual stimulus. We show that this stimulus selectivity is robust, remaining relatively unchanged under different contrasts and temporal frequencies (stimulus velocities). A computational analysis based on an integrate-and-fire model of the direct thalamic input to a layer 4 cortical cell reveals a strong correlation between the degree of thalamic synchrony and the nonlinear relationship between cortical membrane potential and the resultant firing rate. Together, these findings suggest a novel population code in the synchronous firing of neurons in the early visual pathway that could serve as the substrate for establishing cortical representations of the visual scene.

ps

preprint publisher's site Project Page [BibTex]

preprint publisher's site Project Page [BibTex]


Thumb xl bilinear
Bilinear Spatiotemporal Basis Models

Akhter, I., Simon, T., Khan, S., Matthews, I., Sheikh, Y.

ACM Transactions on Graphics (TOG), 31(2):17, ACM, April 2012 (article)

Abstract
A variety of dynamic objects, such as faces, bodies, and cloth, are represented in computer graphics as a collection of moving spatial landmarks. Spatiotemporal data is inherent in a number of graphics applications including animation, simulation, and object and camera tracking. The principal modes of variation in the spatial geometry of objects are typically modeled using dimensionality reduction techniques, while concurrently, trajectory representations like splines and autoregressive models are widely used to exploit the temporal regularity of deformation. In this article, we present the bilinear spatiotemporal basis as a model that simultaneously exploits spatial and temporal regularity while maintaining the ability to generalize well to new sequences. This factorization allows the use of analytical, predefined functions to represent temporal variation (e.g., B-Splines or the Discrete Cosine Transform) resulting in efficient model representation and estimation. The model can be interpreted as representing the data as a linear combination of spatiotemporal sequences consisting of shape modes oscillating over time at key frequencies. We apply the bilinear model to natural spatiotemporal phenomena, including face, body, and cloth motion data, and compare it in terms of compaction, generalization ability, predictive precision, and efficiency to existing models. We demonstrate the application of the model to a number of graphics tasks including labeling, gap-filling, denoising, and motion touch-up.

ps

pdf project page link (url) [BibTex]

pdf project page link (url) [BibTex]


Thumb xl thumb latent space2
A metric for comparing the anthropomorphic motion capability of artificial hands

Feix, T., Romero, J., Ek, C. H., Schmiedmayer, H., Kragic, D.

IEEE RAS Transactions on Robotics, TRO, pages: 974-980, 2012 (article)

ps

Publisher site Human Grasping Database Project [BibTex]

Publisher site Human Grasping Database Project [BibTex]


Thumb xl rat4
The Ankyrin 3 (ANK3) Bipolar Disorder Gene Regulates Psychiatric-related Behaviors that are Modulated by Lithium and Stress

Leussis, M., Berry-Scott, E., Saito, M., Jhuang, H., Haan, G., Alkan, O., Luce, C., Madison, J., Sklar, P., Serre, T., Root, D., Petryshen, T.

Biological Psychiatry , 2012 (article)

ps

Prepublication Article Abstract [BibTex]

Prepublication Article Abstract [BibTex]


Thumb xl imavis2012
Natural Metrics and Least-Committed Priors for Articulated Tracking

Soren Hauberg, Stefan Sommer, Kim S. Pedersen

Image and Vision Computing, 30(6-7):453-461, Elsevier, 2012 (article)

ps

Publishers site Code PDF [BibTex]

Publishers site Code PDF [BibTex]

2007


no image
Learning static Gestalt laws through dynamic experience

Ostrovsky, Y., Wulff, J., Sinha, P.

Journal of Vision, 7(9):315-315, ARVO, June 2007 (article)

Abstract
The Gestalt laws (Wertheimer 1923) are widely regarded as the rules that help us parse the world into objects. However, it is unclear as to how these laws are acquired by an infant's visual system. Classically, these “laws” have been presumed to be innate (Kellman and Spelke 1983). But, more recent work in infant development, showing the protracted time-course over which these grouping principles emerge (e.g., Johnson and Aslin 1995; Craton 1996), suggests that visual experience might play a role in their genesis. Specifically, our studies of patients with late-onset vision (Project Prakash; VSS 2006) and evidence from infant development both point to an early role of common motion cues for object grouping. Here we explore the possibility that the privileged status of motion in the developmental timeline is not happenstance, but rather serves to bootstrap the learning of static Gestalt cues. Our approach involves computational analyses of real-world motion sequences to investigate whether primitive optic flow information is correlated with static figural cues that could eventually come to serve as proxies for grouping in the form of Gestalt principles. We calculated local optic flow maps and then examined how similarity of motion across image patches co-varied with similarity of certain figural properties in static frames. Results indicate that patches with similar motion are much more likely to have similar luminance, color, and orientation as compared to patches with dissimilar motion vectors. This regularity suggests that, in principle, common motion extracted from dynamic visual experience can provide enough information to bootstrap region grouping based on luminance and color and contour continuation mechanisms in static scenes. These observations, coupled with the cited experimental studies, lend credence to the hypothesis that static Gestalt laws might be learned through a bootstrapping process based on early dynamic experience.

ps

link (url) DOI [BibTex]

2007


link (url) DOI [BibTex]


Thumb xl pedestal
Neuromotor prosthesis development

Donoghue, J., Hochberg, L., Nurmikko, A., Black, M., Simeral, J., Friehs, G.

Medicine & Health Rhode Island, 90(1):12-15, January 2007 (article)

Abstract
Article describes a neuromotor prosthesis (NMP), in development at Brown University, that records human brain signals, decodes them, and transforms them into movement commands. An NMP is described as a system consisting of a neural interface, a decoding system, and a user interface, also called an effector; a closed-loop system would be completed by a feedback signal from the effector to the brain. The interface is based on neural spiking, a source of information-rich, rapid, complex control signals from the nervous system. The NMP described, named BrainGate, consists of a match-head sized platform with 100 thread-thin electrodes implanted just into the surface of the motor cortex where commands to move the hand emanate. Neural signals are decoded by a rack of computers that displays the resultant output as the motion of a cursor on a computer monitor. While computer cursor motion represents a form of virtual device control, this same command signal could be routed to a device to command motion of paralyzed muscles or the actions of prosthetic limbs. The researchers’ overall goal is the development of a fully implantable, wireless multi-neuron sensor for broad research, neural prosthetic, and human neurodiagnostic applications.

ps

pdf [BibTex]

pdf [BibTex]


Thumb xl ijcvflow2
On the spatial statistics of optical flow

Roth, S., Black, M. J.

International Journal of Computer Vision, 74(1):33-50, 2007 (article)

Abstract
We present an analysis of the spatial and temporal statistics of "natural" optical flow fields and a novel flow algorithm that exploits their spatial statistics. Training flow fields are constructed using range images of natural scenes and 3D camera motions recovered from hand-held and car-mounted video sequences. A detailed analysis of optical flow statistics in natural scenes is presented and machine learning methods are developed to learn a Markov random field model of optical flow. The prior probability of a flow field is formulated as a Field-of-Experts model that captures the spatial statistics in overlapping patches and is trained using contrastive divergence. This new optical flow prior is compared with previous robust priors and is incorporated into a recent, accurate algorithm for dense optical flow computation. Experiments with natural and synthetic sequences illustrate how the learned optical flow prior quantitatively improves flow accuracy and how it captures the rich spatial structure found in natural scene motion.

ps

pdf preprint pdf from publisher [BibTex]

pdf preprint pdf from publisher [BibTex]


Thumb xl arrayhd
Assistive technology and robotic control using MI ensemble-based neural interface systems in humans with tetraplegia

Donoghue, J. P., Nurmikko, A., Black, M. J., Hochberg, L.

Journal of Physiology, Special Issue on Brain Computer Interfaces, 579, pages: 603-611, 2007 (article)

Abstract
This review describes the rationale, early stage development, and initial human application of neural interface systems (NISs) for humans with paralysis. NISs are emerging medical devices designed to allowpersonswith paralysis to operate assistive technologies or to reanimatemuscles based upon a command signal that is obtained directly fromthe brain. Such systems require the development of sensors to detect brain signals, decoders to transformneural activity signals into a useful command, and an interface for the user.We review initial pilot trial results of an NIS that is based on an intracortical microelectrode sensor that derives control signals from the motor cortex.We review recent findings showing, first, that neurons engaged by movement intentions persist in motor cortex years after injury or disease to the motor system, and second, that signals derived from motor cortex can be used by persons with paralysis to operate a range of devices. We suggest that, with further development, this form of NIS holds promise as a useful new neurotechnology for those with limited motor function or communication.We also discuss the additional potential for neural sensors to be used in the diagnosis and management of various neurological conditions and as a new way to learn about human brain function.

ps

pdf preprint pdf from publisher DOI [BibTex]

pdf preprint pdf from publisher DOI [BibTex]

1996


Thumb xl bildschirmfoto 2012 12 07 um 11.52.07
Estimating optical flow in segmented images using variable-order parametric models with local deformations

Black, M. J., Jepson, A.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(10):972-986, October 1996 (article)

Abstract
This paper presents a new model for estimating optical flow based on the motion of planar regions plus local deformations. The approach exploits brightness information to organize and constrain the interpretation of the motion by using segmented regions of piecewise smooth brightness to hypothesize planar regions in the scene. Parametric flow models are estimated in these regions in a two step process which first computes a coarse fit and estimates the appropriate parameterization of the motion of the region (two, six, or eight parameters). The initial fit is refined using a generalization of the standard area-based regression approaches. Since the assumption of planarity is likely to be violated, we allow local deformations from the planar assumption in the same spirit as physically-based approaches which model shape using coarse parametric models plus local deformations. This parametric+deformation model exploits the strong constraints of parametric approaches while retaining the adaptive nature of regularization approaches. Experimental results on a variety of images indicate that the parametric+deformation model produces accurate flow estimates while the incorporation of brightness segmentation provides precise localization of motion boundaries.

ps

pdf pdf from publisher [BibTex]

1996


pdf pdf from publisher [BibTex]


Thumb xl bildschirmfoto 2012 12 07 um 11.59.00
On the unification of line processes, outlier rejection, and robust statistics with applications in early vision

Black, M., Rangarajan, A.

International Journal of Computer Vision , 19(1):57-92, July 1996 (article)

Abstract
The modeling of spatial discontinuities for problems such as surface recovery, segmentation, image reconstruction, and optical flow has been intensely studied in computer vision. While “line-process” models of discontinuities have received a great deal of attention, there has been recent interest in the use of robust statistical techniques to account for discontinuities. This paper unifies the two approaches. To achieve this we generalize the notion of a “line process” to that of an analog “outlier process” and show how a problem formulated in terms of outlier processes can be viewed in terms of robust statistics. We also characterize a class of robust statistical problems for which an equivalent outlier-process formulation exists and give a straightforward method for converting a robust estimation problem into an outlier-process formulation. We show how prior assumptions about the spatial structure of outliers can be expressed as constraints on the recovered analog outlier processes and how traditional continuation methods can be extended to the explicit outlier-process formulation. These results indicate that the outlier-process approach provides a general framework which subsumes the traditional line-process approaches as well as a wide class of robust estimation problems. Examples in surface reconstruction, image segmentation, and optical flow are presented to illustrate the use of outlier processes and to show how the relationship between outlier processes and robust statistics can be exploited. An appendix provides a catalog of common robust error norms and their equivalent outlier-process formulations.

ps

pdf pdf from publisher DOI [BibTex]


Thumb xl bildschirmfoto 2012 12 07 um 12.09.01
The robust estimation of multiple motions: Parametric and piecewise-smooth flow fields

Black, M. J., Anandan, P.

Computer Vision and Image Understanding, 63(1):75-104, January 1996 (article)

Abstract
Most approaches for estimating optical flow assume that, within a finite image region, only a single motion is present. This single motion assumption is violated in common situations involving transparency, depth discontinuities, independently moving objects, shadows, and specular reflections. To robustly estimate optical flow, the single motion assumption must be relaxed. This paper presents a framework based on robust estimation that addresses violations of the brightness constancy and spatial smoothness assumptions caused by multiple motions. We show how the robust estimation framework can be applied to standard formulations of the optical flow problem thus reducing their sensitivity to violations of their underlying assumptions. The approach has been applied to three standard techniques for recovering optical flow: area-based regression, correlation, and regularization with motion discontinuities. This paper focuses on the recovery of multiple parametric motion models within a region, as well as the recovery of piecewise-smooth flow fields, and provides examples with natural and synthetic image sequences.

ps

pdf pdf from publisher [BibTex]

pdf pdf from publisher [BibTex]