During my PhD studies, I supervised 7 student projects, including 2 master's theses. This resulted in two publications.
As part of my contributions to free and open source software development, I led the Octave-Forge project from 2006 to 2011. The project consists of 70+ independent developers.
From 2012 to 2014 I organized the Intelligent Systems Colloquium talk series in Tübingen.
In general, I try to make software as available as is practical (sometimes the burden of maintaining software in public outweighs the benefits). Below is some of the software that is currently available. If you know I have software which could be useful to you, but is not available here, just send me an e-mail and I'll do my best to help you out.
Most of the software used in our papers on articulated tracking / pose estimation / whatever-you-call-it is available at http://humim.org.
The software developed for the NIPS 2012 paper on regression and dimensionality reduction using multiple metrics can be found at the project website.
Various Helper Functions
Cholesky-like decomposition of positive semidefinite matrices: when sampling from a Gaussian with covariance matrix S, the best solution is generally to perform a Cholesky decomposition S = R^T R and multiply isotropic Gaussian samples with R. Sadly, this fails if S is only positive semidefinite (i.e. singular), which is very often the case. The standard solution is to perform an eigenvalue decomposition of S, but that can be computationally quite demanding in high dimensions. I put together a simple function that solves this problem reasonably well by first trying a Cholesky decomposition and, if that fails, falling back to an LDL decomposition.
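The Cholesky-then-LDL idea can be sketched roughly as follows. This is an illustrative Python sketch, not the function referred to above: the name psd_factor is hypothetical, it returns a factor A with S = A A^T (so A plays the role of R^T), and the final eigendecomposition branch is an extra safety net assumed here for the rare case where the LDL factor is not purely diagonal.

```python
import numpy as np
from scipy.linalg import ldl


def psd_factor(S):
    """Return A such that A @ A.T is approximately S (S symmetric PSD).

    Hypothetical helper for illustration only.
    """
    S = np.asarray(S, dtype=float)
    try:
        # Fast path: succeeds when S is strictly positive definite.
        return np.linalg.cholesky(S)
    except np.linalg.LinAlgError:
        pass
    # Fallback: S = lu @ d @ lu.T, where d is block diagonal.
    lu, d, _ = ldl(S, lower=True)
    diag = np.diag(d)
    if np.allclose(d, np.diag(diag)):
        # Rounding can make tiny pivots slightly negative; clip them to zero.
        return lu * np.sqrt(np.clip(diag, 0.0, None))
    # Assumed safety net: d contains 2x2 blocks, so use an eigendecomposition.
    w, V = np.linalg.eigh(S)
    return V * np.sqrt(np.clip(w, 0.0, None))
```

With such a factor, samples with covariance S can be drawn as A @ z, where z is a vector of isotropic Gaussian samples, for example z = np.random.default_rng().standard_normal(S.shape[0]).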
New Position at DTU Compute
I am no longer a postdoc at the Max Planck Institute, so this web page is not updated any more. My current employment is at DTU Compute: new web site.
A Summary of Me
Model what you can; learn the rest. A simple statement that summarizes my approach to computer vision, machine learning and science in general. I believe that we should always try to incorporate as much known information in our models before learning unknown parameters -- even if that means we have to derive new models from scratch! In practice, my work is mostly concerned with (but not limited to) the following topics:
Statistics on manifolds;
Human motion, shape and recognition;
Time series analysis;
Monte Carlo techniques.
Most naturally occurring phenomena are complex enough to necessitate machine learning as an integral part of building models of said phenomena. However, with machine learning comes a need for large amounts of data, which may be hard to acquire, e.g. due to price and time constraints or simply because the phenomenon is rare in nature. In such cases we rely on expert knowledge to guide the learning scheme. Sadly, our tools for incorporating such knowledge are not very strong: we often resort to ad hoc regularization techniques or seek indirect sources of information such as data labels.
I believe we can do better!
Fundamentally: the more we know, the less we have to learn from data. Often experts can provide more direct pieces of information about the phenomenon, e.g. that the solution has to satisfy a certain set of constraints or that a specific distance measure is to be preferred over the one naturally implied by the vector representation of the data (assuming such a representation even exists). Sadly, most machine learning techniques are incapable of incorporating these clues in a principled way. This often forces practitioners to ignore the expert knowledge, which increases the need for data, as the machine learning technique now also has to learn what the expert already knew.
My research revolves around the idea that we should incorporate as much expert knowledge as possible and only attempt to learn what we do not already know. As expert knowledge is most often not linear, we are forced away from Euclidean models. This removes one of the most fundamental assumptions behind modern statistical tools and we need to create new ones.
My current work is centred around Riemannian geometry as I find that to be a natural and practical way of incorporating further expert knowledge. Still, there are many problems which cannot be described in this setting...
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), December 2015 (article)
In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.
In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 1378-1385, Columbus, Ohio, USA, IEEE International Conference on Computer Vision and Pattern Recognition, June 2014 (inproceedings)
We consider the intersection of two research fields: transfer learning and statistics on manifolds. In particular, we consider, for manifold-valued data, transfer learning of tangent-space models such as Gaussian distributions, PCA, regression, or classifiers. Though one would hope to simply use ordinary R^n transfer-learning ideas, the manifold structure prevents it. We overcome this by basing our method on inner-product-preserving parallel transport, a well-known tool widely used in other problems of statistics on manifolds in computer vision. At first, this straightforward idea seems to suffer from an obvious shortcoming: Transporting large datasets is prohibitively expensive, hindering scalability. Fortunately, with our approach, we never transport data. Rather, we show how the statistical models themselves can be transported, and prove that for the tangent-space models above, the transport “commutes” with learning. Consequently, our compact framework, applicable to a large class of manifolds, is not restricted by the size of either the training or test sets. We demonstrate the approach by transferring PCA and logistic-regression models of real-world data involving 3D shapes and image descriptors.
This technical report is complementary to "Model Transport: Towards Scalable Transfer Learning on Manifolds" and contains proofs, an explanation of the attached video (visualization of bases from the body shape experiments), and high-resolution images of select results of individual reconstructions from the shape experiments. It is identical to the supplemental material submitted to the Conference on Computer Vision and Pattern Recognition (CVPR 2014) in November 2013.
In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, 33, pages: 347-355, JMLR: Workshop and Conference Proceedings, (Editors: S Kaski and J Corander), Microtome Publishing, Brookline, MA, AISTATS, April 2014 (inproceedings)
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine
In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, Lecture Notes in Computer Science Vol. 8675, pages: 265-272, (Editors: P. Golland, N. Hata, C. Barillot, J. Hornegger and R. Howe), Springer, Heidelberg, MICCAI, 2014 (inproceedings)