Header logo is


2010


no image
Computationally efficient algorithms for statistical image processing: Implementation in R

Langovoy, M., Wittich, O.

(2010-053), EURANDOM, Technische Universiteit Eindhoven, December 2010 (techreport)

Abstract
In the series of our earlier papers on the subject, we proposed a novel statistical hy- pothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that allowed to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of un- known distribution. No boundary shape constraints were imposed on the objects, only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.

ei

PDF [BibTex]

2010


PDF [BibTex]


no image
Fast Convergent Algorithms for Expectation Propagation Approximate Bayesian Inference

Seeger, M., Nickisch, H.

Max Planck Institute for Biological Cybernetics, December 2010 (techreport)

Abstract
We propose a novel algorithm to solve the expectation propagation relaxation of Bayesian inference for continuous-variable graphical models. In contrast to most previous algorithms, our method is provably convergent. By marrying convergent EP ideas from (Opper&Winther 05) with covariance decoupling techniques (Wipf&Nagarajan 08, Nickisch&Seeger 09), it runs at least an order of magnitude faster than the most commonly used EP solver.

ei

Web [BibTex]

Web [BibTex]


no image
Tackling Box-Constrained Optimization via a New Projected Quasi-Newton Approach

Kim, D., Sra, S., Dhillon, I.

SIAM Journal on Scientific Computing, 32(6):3548-3563 , December 2010 (article)

Abstract
Numerous scientific applications across a variety of fields depend on box-constrained convex optimization. Box-constrained problems therefore continue to attract research interest. We address box-constrained (strictly convex) problems by deriving two new quasi-Newton algorithms. Our algorithms are positioned between the projected-gradient [J. B. Rosen, J. SIAM, 8 (1960), pp. 181–217] and projected-Newton [D. P. Bertsekas, SIAM J. Control Optim., 20 (1982), pp. 221–246] methods. We also prove their convergence under a simple Armijo step-size rule. We provide experimental results for two particular box-constrained problems: nonnegative least squares (NNLS), and nonnegative Kullback–Leibler (NNKL) minimization. For both NNLS and NNKL our algorithms perform competitively as compared to well-established methods on medium-sized problems; for larger problems our approach frequently outperforms the competition.

ei

Web DOI Project Page [BibTex]

Web DOI Project Page [BibTex]


no image
Algorithmen zum Automatischen Erlernen von Motorfähigkeiten

Peters, J., Kober, J., Schaal, S.

at - Automatisierungstechnik, 58(12):688-694, December 2010 (article)

Abstract
Robot learning methods which allow autonomous robots to adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i. e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Markerless tracking of Dynamic 3D Scans of Faces

Walder, C., Breidt, M., Bülthoff, H., Schölkopf, B., Curio, C.

In Dynamic Faces: Insights from Experiments and Computation, pages: 255-276, (Editors: Curio, C., Bülthoff, H. H. and Giese, M. A.), MIT Press, Cambridge, MA, USA, December 2010 (inbook)

ei

Web [BibTex]

Web [BibTex]


no image
PAC-Bayesian Analysis of Co-clustering and Beyond

Seldin, Y., Tishby, N.

Journal of Machine Learning Research, 11, pages: 3595-3646, December 2010 (article)

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Policy Gradient Methods

Peters, J., Bagnell, J.

In Encyclopedia of Machine Learning, pages: 774-776, (Editors: Sammut, C. and Webb, G. I.), Springer, Berlin, Germany, December 2010 (inbook)

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Gaussian Processes for Machine Learning (GPML) Toolbox

Rasmussen, C., Nickisch, H.

Journal of Machine Learning Research, 11, pages: 3011-3015, November 2010 (article)

Abstract
The GPML toolbox provides a wide range of functionality for Gaussian process (GP) inference and prediction. GPs are specified by mean and covariance functions; we offer a library of simple mean and covariance functions and mechanisms to compose more complex ones. Several likelihood functions are supported including Gaussian and heavy-tailed for regression as well as others suitable for classification. Finally, a range of inference methods is provided, including exact and variational inference, Expectation Propagation, and Laplace's method dealing with non-Gaussian likelihoods and FITC for dealing with large regression tasks.

ei

Web [BibTex]

Web [BibTex]


no image
Cryo-EM structure and rRNA model of a translating eukaryotic 80S ribosome at 5.5-Å resolution

Armache, J-P., Jarasch, A., Anger, AM., Villa, E., Becker, T., Bhushan, S., Jossinet, F., Habeck, M., Dindar, G., Franckenberg, S., Marquez, V., Mielke, T., Thomm, M., Berninghausen, O., Beatrix, B., Söding, J., Westhof, E., Wilson, DN., Beckmann, R.

Proceedings of the National Academy of Sciences of the United States of America, 107(46):19748-19753, November 2010 (article)

Abstract
Protein biosynthesis, the translation of the genetic code into polypeptides, occurs on ribonucleoprotein particles called ribosomes. Although X-ray structures of bacterial ribosomes are available, high-resolution structures of eukaryotic 80S ribosomes are lacking. Using cryoelectron microscopy and single-particle reconstruction, we have determined the structure of a translating plant (Triticum aestivum) 80S ribosome at 5.5-Å resolution. This map, together with a 6.1-Å map of a Saccharomyces cerevisiae 80S ribosome, has enabled us to model ∼98% of the rRNA. Accurate assignment of the rRNA expansion segments (ES) and variable regions has revealed unique ES–ES and r-protein–ES interactions, providing insight into the structure and evolution of the eukaryotic ribosome.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Policy gradient methods

Peters, J.

Scholarpedia, 5(11):3698, November 2010 (article)

Abstract
Policy gradient methods are a type of reinforcement learning techniques that rely upon optimizing parametrized policies with respect to the expected return (long-term cumulative reward) by gradient descent. They do not suffer from many of the problems that have been marring traditional reinforcement learning approaches such as the lack of guarantees of a value function, the intractability problem resulting from uncertain state information and the complexity arising from continuous states & actions.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Localization of eukaryote-specific ribosomal proteins in a 5.5-Å cryo-EM map of the 80S eukaryotic ribosome

Armache, J-P., Jarasch, A., Anger, AM., Villa, E., Becker, T., Bhushan, S., Jossinet, F., Habeck, M., Dindar, G., Franckenberg, S., Marquez, V., Mielke, T., Thomm, M., Berninghausen, O., Beatrix, B., Söding, J., Westhof, E., Wilson, DN., Beckmann, R.

Proceedings of the National Academy of Sciences of the United States of America, 107(46):19754-19759, November 2010 (article)

Abstract
Protein synthesis in all living organisms occurs on ribonucleoprotein particles, called ribosomes. Despite the universality of this process, eukaryotic ribosomes are significantly larger in size than their bacterial counterparts due in part to the presence of 80 r proteins rather than 54 in bacteria. Using cryoelectron microscopy reconstructions of a translating plant (Triticum aestivum) 80S ribosome at 5.5-Å resolution, together with a 6.1-Å map of a translating Saccharomyces cerevisiae 80S ribosome, we have localized and modeled 74/80 (92.5%) of the ribosomal proteins, encompassing 12 archaeal/eukaryote-specific small subunit proteins as well as the complete complement of the ribosomal proteins of the eukaryotic large subunit. Near-complete atomic models of the 80S ribosome provide insights into the structure, function, and evolution of the eukaryotic translational apparatus.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Spatio-Spectral Remote Sensing Image Classification With Graph Kernels

Camps-Valls, G., Shervashidze, N., Borgwardt, K.

IEEE Geoscience and Remote Sensing Letters, 7(4):741-745, October 2010 (article)

Abstract
This letter presents a graph kernel for spatio-spectral remote sensing image classification with support vector machines (SVMs). The method considers higher order relations in the neighborhood (beyond pairwise spatial relations) to iteratively compute a kernel matrix for SVM learning. The proposed kernel is easy to compute and constitutes a powerful alternative to existing approaches. The capabilities of the method are illustrated in several multi- and hyperspectral remote sensing images acquired over both urban and agricultural areas.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Causal Inference Using the Algorithmic Markov Condition

Janzing, D., Schölkopf, B.

IEEE Transactions on Information Theory, 56(10):5168-5194, October 2010 (article)

Abstract
Inferring the causal structure that links $n$ observables is usually based upon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when the sample size is one. We develop a theory how to generate causal graphs explaining similarities between single objects. To this end, we replace the notion of conditional stochastic independence in the causal Markov condition with the vanishing of conditional algorithmic mutual information and describe the corresponding causal inference rules. We explain why a consistent reformulation of causal inference in terms of algorithmic complexity implies a new inference principle that takes into account also the complexity of conditional probability densities, making it possible to select among Markov equivalent causal graphs. This insight provides a theoretical foundation of a heuristic principle proposed in earlier work. We also sketch some ideas on how to replace Kolmogorov complexity with decidable complexity criteria. This can be seen as an algorithmic analog of replacing the empirically undecidable question of statistical independence with practical independence tests that are based on implicit or explicit assumptions on the underlying distribution.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Recurrent Policy Gradients

Wierstra, D., Förster, A., Peters, J., Schmidhuber, J.

Logic Journal of the IGPL, 18(5):620-634, October 2010 (article)

Abstract
Reinforcement learning for partially observable Markov decision problems (POMDPs) is a challenge as it requires policies with an internal state. Traditional approaches suffer significantly from this shortcoming and usually make strong assumptions on the problem domain such as perfect system models, state-estimators and a Markovian hidden system. Recurrent neural networks (RNNs) offer a natural framework for dealing with policy learning using hidden state and require only few limiting assumptions. As they can be trained well using gradient descent, they are suited for policy gradient approaches. In this paper, we present a policy gradient method, the Recurrent Policy Gradient which constitutes a model-free reinforcement learning method. It is aimed at training limited-memory stochastic policies on problems which require long-term memories of past observations. The approach involves approximating a policy gradient for a recurrent neural network by backpropagating return-weighted characteristic eligibilities through time. Using a ‘‘Long Short-Term Memory’’ RNN architecture, we are able to outperform previous RL methods on three important benchmark tasks. Furthermore, we show that using history-dependent baselines helps reducing estimation variance significantly, thus enabling our approach to tackle more challenging, highly stochastic environments.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Discriminative frequent subgraph mining with optimality guarantees

Thoma, M., Cheng, H., Gretton, A., Han, J., Kriegel, H., Smola, A., Song, L., Yu, P., Yan, X., Borgwardt, K.

Journal of Statistical Analysis and Data Mining, 3(5):302–318, October 2010 (article)

Abstract
The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Combining active learning and reactive control for robot grasping

Kroemer, O., Detry, R., Piater, J., Peters, J.

Robotics and Autonomous Systems, 58(9):1105-1116, September 2010 (article)

Abstract
Grasping an object is a task that inherently needs to be treated in a hybrid fashion. The system must decide both where and how to grasp the object. While selecting where to grasp requires learning about the object as a whole, the execution only needs to reactively adapt to the context close to the grasp’s location. We propose a hierarchical controller that reflects the structure of these two sub-problems, and attempts to learn solutions that work for both. A hybrid architecture is employed by the controller to make use of various machine learning methods that can cope with the large amount of uncertainty inherent to the task. The controller’s upper level selects where to grasp the object using a reinforcement learner, while the lower level comprises an imitation learner and a vision-based reactive controller to determine appropriate grasping motions. The resulting system is able to quickly learn good grasps of a novel object in an unstructured environment, by executing smooth reaching motions and preshapin g the hand depending on the object’s geometry. The system was evaluated both in simulation and on a real robot.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering

Seldin, Y.

Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2010 (techreport)

Abstract
We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering a more accurate way to deal with finite sample issues. We derive a bound minimization algorithm and show that it provides good results in real-life problems and that the derived PAC-Bayesian bound is reasonably tight.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Nonparametric Regression between General Riemannian Manifolds

Steinke, F., Hein, M., Schölkopf, B.

SIAM Journal on Imaging Sciences, 3(3):527-563, September 2010 (article)

Abstract
We study nonparametric regression between Riemannian manifolds based on regularized empirical risk minimization. Regularization functionals for mappings between manifolds should respect the geometry of input and output manifold and be independent of the chosen parametrization of the manifolds. We define and analyze the three most simple regularization functionals with these properties and present a rather general scheme for solving the resulting optimization problem. As application examples we discuss interpolation on the sphere, fingerprint processing, and correspondence computations between three-dimensional surfaces. We conclude with characterizing interesting and sometimes counterintuitive implications and new open problems that are specific to learning between Riemannian manifolds and are not encountered in multivariate regression in Euclidean space.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Sparse nonnegative matrix approximation: new formulations and algorithms

Tandon, R., Sra, S.

(193), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2010 (techreport)

Abstract
We introduce several new formulations for sparse nonnegative matrix approximation. Subsequently, we solve these formulations by developing generic algorithms. Further, to help selecting a particular sparse formulation, we briefly discuss the interpretation of each formulation. Finally, preliminary experiments are presented to illustrate the behavior of our formulations and algorithms.

ei

PDF [BibTex]

PDF [BibTex]


no image
Robust nonparametric detection of objects in noisy images

Langovoy, M., Wittich, O.

(2010-049), EURANDOM, Technische Universiteit Eindhoven, September 2010 (techreport)

Abstract
We propose a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We present an algorithm that allows to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints are imposed on the object, only a weak bulk condition for the object's interior is required. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. In this paper, we develop further the mathematical formalism of our method and explore im- portant connections to the mathematical theory of percolation and statistical physics. We prove results on consistency and algorithmic complexity of our testing procedure. In addition, we address not only an asymptotic behavior of the method, but also a nite sample performance of our test.

ei

PDF [BibTex]

PDF [BibTex]


no image
Large Scale Variational Inference and Experimental Design for Sparse Generalized Linear Models

Seeger, M., Nickisch, H.

Max Planck Institute for Biological Cybernetics, August 2010 (techreport)

Abstract
Many problems of low-level computer vision and image processing, such as denoising, deconvolution, tomographic reconstruction or super-resolution, can be addressed by maximizing the posterior distribution of a sparse linear model (SLM). We show how higher-order Bayesian decision-making problems, such as optimizing image acquisition in magnetic resonance scanners, can be addressed by querying the SLM posterior covariance, unrelated to the density's mode. We propose a scalable algorithmic framework, with which SLM posteriors over full, high-resolution images can be approximated for the first time, solving a variational optimization problem which is convex iff posterior mode finding is convex. These methods successfully drive the optimization of sampling trajectories for real-world magnetic resonance imaging through Bayesian experimental design, which has not been attempted before. Our methodology provides new insight into similarities and differences between sparse reconstruction and approximate Bayesian inference, and has important implications for compressive sensing of real-world images.

ei

Web [BibTex]


no image
Hybrid PET/MRI of Intracranial Masses: Initial Experiences and Comparison to PET/CT

Boss, A., Bisdas, S., Kolb, A., Hofmann, M., Ernemann, U., Claussen, C., Pfannenberg, C., Pichler, B., Reimold, M., Stegger, L.

Journal of Nuclear Medicine, 51(8):1198-1205, August 2010 (article)

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
libDAI: A Free and Open Source C++ Library for Discrete Approximate Inference in Graphical Models

Mooij, JM.

Journal of Machine Learning Research, 11, pages: 2169-2173, August 2010 (article)

Abstract
This paper describes the software package libDAI, a free & open source C++ library that provides implementations of various exact and approximate inference methods for graphical models with discrete-valued variables. libDAI supports directed graphical models (Bayesian networks) as well as undirected ones (Markov random fields and factor graphs). It offers various approximations of the partition sum, marginal probability distributions and maximum probability states. Parameter learning is also supported. A feature comparison with other open source software packages for approximate inference is given. libDAI is licensed under the GPL v2+ license and is available at http://www.libdai.org.

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Convolutive blind source separation by efficient blind deconvolution and minimal filter distortion

Zhang, K., Chan, L.

Neurocomputing, 73(13-15):2580-2588, August 2010 (article)

Abstract
Convolutive blind source separation (BSS) usually encounters two difficulties—the filter indeterminacy in the recovered sources and the relatively high computational load. In this paper we propose an efficient method to convolutive BSS, by dealing with these two issues. It consists of two stages, namely, multichannel blind deconvolution (MBD) and learning the post-filters with the minimum filter distortion (MFD) principle. We present a computationally efficient approach to MBD in the first stage: a vector autoregression (VAR) model is first fitted to the data, admitting a closed-form solution and giving temporally independent errors; traditional independent component analysis (ICA) is then applied to these errors to produce the MBD results. In the second stage, the least linear reconstruction error (LLRE) constraint of the separation system, which was previously used to regularize the solutions to nonlinear ICA, enforces a MFD principle of the estimated mixing system for convolutive BSS. One can then easily learn the post-filters to preserve the temporal structure of the sources. We show that with this principle, each recovered source is approximately the principal component of the contributions of this source to all observations. Experimental results on both synthetic data and real room recordings show the good performance of this method.

ei

PDF PDF DOI [BibTex]


no image
Cooperative Cuts for Image Segmentation

Jegelka, S., Bilmes, J.

(UWEETR-1020-0003), University of Washington, Washington DC, USA, August 2010 (techreport)

Abstract
We propose a novel framework for graph-based cooperative regularization that uses submodular costs on graph edges. We introduce an efficient iterative algorithm to solve the resulting hard discrete optimization problem, and show that it has a guaranteed approximation factor. The edge-submodular formulation is amenable to the same extensions as standard graph cut approaches, and applicable to a range of problems. We apply this method to the image segmentation problem. Specifically, Here, we apply it to introduce a discount for homogeneous boundaries in binary image segmentation on very difficult images, precisely, long thin objects and color and grayscale images with a shading gradient. The experiments show that significant portions of previously truncated objects are now preserved.

ei

Web [BibTex]

Web [BibTex]


no image
Fast algorithms for total-variationbased optimization

Barbero, A., Sra, S.

(194), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, August 2010 (techreport)

Abstract
We derive a number of methods to solve efficiently simple optimization problems subject to a totalvariation (TV) regularization, under different norms of the TV operator and both for the case of 1-dimensional and 2-dimensional data. In spite of the non-smooth, non-separable nature of the TV terms considered, we show that a dual formulation with strong structure can be derived. Taking advantage of this structure we develop adaptions of existing algorithms from the optimization literature, resulting in efficient methods for the problem at hand. Experimental results show that for 1-dimensional data the proposed methods achieve convergence within good accuracy levels in practically linear time, both for L1 and L2 norms. For the more challenging 2-dimensional case a performance of order O(N2 log2 N) for N x N inputs is achieved when using the L2 norm. A final section suggests possible extensions and lines of further work.

ei

PDF [BibTex]

PDF [BibTex]


no image
Biased Feedback in Brain-Computer Interfaces

Barbero, A., Grosse-Wentrup, M.

Journal of NeuroEngineering and Rehabilitation, 7(34):1-4, July 2010 (article)

Abstract
Even though feedback is considered to play an important role in learning how to operate a brain-computer interface (BCI), to date no significant influence of feedback design on BCI-performance has been reported in literature. In this work, we adapt a standard motor-imagery BCI-paradigm to study how BCI-performance is affected by biasing the belief subjects have on their level of control over the BCI system. Our findings indicate that subjects already capable of operating a BCI are impeded by inaccurate feedback, while subjects normally performing on or close to chance level may actually benefit from an incorrect belief on their performance level. Our results imply that optimal feedback design in BCIs should take into account a subject‘s current skill level.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Varieties of Justification in Machine Learning

Corfield, D.

Minds and Machines, 20(2):291-301, July 2010 (article)

Abstract
Forms of justification for inductive machine learning techniques are discussed and classified into four types. This is done with a view to introduce some of these techniques and their justificatory guarantees to the attention of philosophers, and to initiate a discussion as to whether they must be treated separately or rather can be viewed consistently from within a single framework.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Dirichlet Process Gaussian Mixture Models: Choice of the Base Distribution

Görür, D., Rasmussen, C.

Journal of Computer Science and Technology, 25(4):653-664, July 2010 (article)

Abstract
In the Bayesian mixture modeling framework it is possible to infer the necessary number of components to model the data and therefore it is unnecessary to explicitly restrict the number of components. Nonparametric mixture models sidestep the problem of finding the “correct” number of mixture components by assuming infinitely many components. In this paper Dirichlet process mixture (DPM) models are cast as infinite mixture models and inference using Markov chain Monte Carlo is described. The specification of the priors on the model parameters is often guided by mathematical and practical convenience. The primary goal of this paper is to compare the choice of conjugate and non-conjugate base distributions on a particular class of DPM models which is widely used in applications, the Dirichlet process Gaussian mixture model (DPGMM). We compare computational efficiency and modeling performance of DPGMM defined using a conjugate and a conditionally conjugate base distribution. We show that better density models can result from using a wider class of priors with no or only a modest increase in computational effort.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Robust probabilistic superposition and comparison of protein structures

Mechelke, M., Habeck, M.

BMC Bioinformatics, 11(363):1-13, July 2010 (article)

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Results of the GREAT08 Challenge: An image analysis competition for cosmological lensing

Bridle, S., Balan, S., Bethge, M., Gentile, M., Harmeling, S., Heymans, C., Hirsch, M., Hosseini, R., Jarvis, M., Kirk, D., Kitching, T., Kuijken, K., Lewis, A., Paulin-Henriksson, S., Schölkopf, B., Velander, M., Voigt, L., Witherick, D., Amara, A., Bernstein, G., Courbin, F., Gill, M., Heavens, A., Mandelbaum, R., Massey, R., Moghaddam, B., Rassat, A., Refregier, A., Rhodes, J., Schrabback, T., Shawe-Taylor, J., Shmakova, M., van Waerbeke, L., Wittman, D.

Monthly Notices of the Royal Astronomical Society, 405(3):2044-2061, July 2010 (article)

Abstract
We present the results of the GREAT08 Challenge, a blind analysis challenge to infer weak gravitational lensing shear distortions from images. The primary goal was to stimulate new ideas by presenting the problem to researchers outside the shear measurement community. Six GREAT08 Team methods were presented at the launch of the Challenge and five additional groups submitted results during the 6 month competition. Participants analyzed 30 million simulated galaxies with a range in signal to noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large quantity of simulations allowed shear measurement methods to be assessed at a level of accuracy suitable for currently planned future cosmic shear observations for the first time. Different methods perform well in different parts of simulation parameter space and come close to the target level of accuracy in several of these. A number of fresh ideas have emerged as a result of the Challenge including a re-examination of the process of combining information from different galaxies, which reduces the dependence on realistic galaxy modelling. The image simulations will become increasingly sophis- ticated in future GREAT challenges, meanwhile the GREAT08 simulations remain as a benchmark for additional developments in shear measurement algorithms.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Remote Sensing Feature Selection by Kernel Dependence Estimation

Camps-Valls, G., Mooij, J., Schölkopf, B.

IEEE Geoscience and Remote Sensing Letters, 7(3):587-591, July 2010 (article)

Abstract
This letter introduces a nonlinear measure of independence between random variables for remote sensing supervised feature selection. The so-called Hilbert–Schmidt independence criterion (HSIC) is a kernel method for evaluating statistical dependence and it is based on computing the Hilbert–Schmidt norm of the cross-covariance operator of mapped samples in the corresponding Hilbert spaces. The HSIC empirical estimator is easy to compute and has good theoretical and practical properties. Rather than using this estimate for maximizing the dependence between the selected features and the class labels, we propose the more sensitive criterion of minimizing the associated HSIC p-value. Results in multispectral, hyperspectral, and SAR data feature selection for classification show the good performance of the proposed approach.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Clustering stability: an overview

von Luxburg, U.

Foundations and Trends in Machine Learning, 2(3):235-274, July 2010 (article)

Abstract
A popular method for selecting the number of clusters is based on stability arguments: one chooses the number of clusters such that the corresponding clustering results are "most stable". In recent years, a series of papers has analyzed the behavior of this method from a theoretical point of view. However, the results are very technical and difficult to interpret for non-experts. In this paper we give a high-level overview about the existing literature on clustering stability. In addition to presenting the results in a slightly informal but accessible way, we relate them to each other and discuss their different implications.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Justifying Additive Noise Model-Based Causal Discovery via Algorithmic Information Theory

Janzing, D., Steudel, B.

Open Systems and Information Dynamics, 17(2):189-212, June 2010 (article)

Abstract
A recent method for causal discovery is in many cases able to infer whether X causes Y or Y causes X for just two observed variables X and Y. It is based on the observation that there exist (non-Gaussian) joint distributions P(X,Y) for which Y may be written as a function of X up to an additive noise term that is independent of X and no such model exists from Y to X. Whenever this is the case, one prefers the causal model X → Y. Here we justify this method by showing that the causal hypothesis Y → X is unlikely because it requires a specific tuning between P(Y) and P(X|Y) to generate a distribution that admits an additive noise model from X to Y. To quantify the amount of tuning, needed we derive lower bounds on the algorithmic information shared by P(Y) and P(X|Y). This way, our justification is consistent with recent approaches for using algorithmic information theory for causal reasoning. We extend this principle to the case where P(X,Y) almost admits an additive noise model. Our results suggest that the above conclusion is more reliable if the complexity of P(Y) is high.

ei

PDF Web DOI [BibTex]


no image
Sparse Spectrum Gaussian Process Regression

Lázaro-Gredilla, M., Quiñonero-Candela, J., Rasmussen, CE., Figueiras-Vidal, AR.

Journal of Machine Learning Research, 11, pages: 1865-1881, June 2010 (article)

Abstract
We present a new sparse Gaussian Process (GP) model for regression. The key novel idea is to sparsify the spectral representation of the GP. This leads to a simple, practical algorithm for regression tasks. We compare the achievable trade-offs between predictive accuracy and computational requirements, and show that these are typically superior to existing state-of-the-art sparse approximations. We discuss both the weight space and function space representations, and note that the new construction implies priors over functions which are always stationary, and can approximate any covariance function in this class.

ei

PDF [BibTex]

PDF [BibTex]


no image
Dynamic Dissimilarity Measure for Support-Based Clustering

Lee, D., Lee, J.

IEEE Transactions on Knowledge and Data Engineering, 22(6):900-905, June 2010 (article)

Abstract
Clustering methods utilizing support estimates of a data distribution have recently attracted much attention because of their ability to generate cluster boundaries of arbitrary shape and to deal with outliers efficiently. In this paper, we propose a novel dissimilarity measure based on a dynamical system associated with support estimating functions. Theoretical foundations of the proposed measure are developed and applied to construct a clustering method that can effectively partition the whole data space. Simulation results demonstrate that clustering based on the proposed dissimilarity measure is robust to the choice of kernel parameters and able to control the number of clusters efficiently.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Unsupervised Object Discovery: A Comparison

Tuytelaars, T., Lampert, CH., Blaschko, MB., Buntine, W.

International Journal of Computer Vision, 88(2):284-302, June 2010 (article)

Abstract
The goal of this paper is to evaluate and compare models and methods for learning to recognize basic entities in images in an unsupervised setting. In other words, we want to discover the objects present in the images by analyzing unlabeled data and searching for re-occurring patterns. We experiment with various baseline methods, methods based on latent variable models, as well as spectral clustering methods. The results are presented and compared both on subsets of Caltech256 and MSRC2, data sets that are larger and more challenging and that include more object classes than what has previously been reported in the literature. A rigorous framework for evaluating unsupervised object discovery methods is proposed.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
How to Explain Individual Classification Decisions

Baehrens, D., Schroeter, T., Harmeling, S., Kawanabe, M., Hansen, K., Müller, K.

Journal of Machine Learning Research, 11, pages: 1803-1831, June 2010 (article)

Abstract
After building a classifier with modern tools of machine learning we typically have a black box at hand that is able to predict well for unseen data. Thus, we get an answer to the question what is the most likely label of a given unseen data point. However, most methods will provide no answer why the model predicted a particular label for a single instance and what features were most influential for that particular instance. The only method that is currently able to provide such explanations are decision trees. This paper proposes a procedure which (based on a set of assumptions) allows to explain the decisions of any classification method.

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Single-Image Super-Resolution Using Sparse Regression and Natural Image Prior

Kim, K., Kwon, Y.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6):1127-1133, June 2010 (article)

Abstract
This paper proposes a framework for single-image super-resolution. The underlying idea is to learn a map from input low-resolution images to target high-resolution images based on example pairs of input and output images. Kernel ridge regression (KRR) is adopted for this purpose. To reduce the time complexity of training and testing for KRR, a sparse solution is found by combining the ideas of kernel matching pursuit and gradient descent. As a regularized solution, KRR leads to a better generalization than simply storing the examples as has been done in existing example-based algorithms and results in much less noisy images. However, this may introduce blurring and ringing artifacts around major edges as sharp changes are penalized severely. A prior model of a generic image class which takes into account the discontinuity property of images is adopted to resolve this problem. Comparison with existing algorithms shows the effectiveness of the proposed method.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Imitation and Reinforcement Learning

Kober, J., Peters, J.

IEEE Robotics and Automation Magazine, 17(2):55-62, June 2010 (article)

Abstract
In this article, we present both novel learning algorithms and experiments using the dynamical system MPs. As such, we describe this MP representation in a way that it is straightforward to reproduce. We review an appropriate imitation learning method, i.e., locally weighted regression, and show how this method can be used both for initializing RL tasks as well as for modifying the start-up phase in a rhythmic task. We also show our current best-suited RL algorithm for this framework, i.e., PoWER. We present two complex motor tasks, i.e., ball-in-a-cup and ball paddling, learned on a real, physical Barrett WAM, using the methods presented in this article. Of particular interest is the ball-paddling application, as it requires a combination of both rhythmic and discrete dynamical systems MPs during the start-up phase to achieve a particular task.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Gaussian Mixture Modeling with Gaussian Process Latent Variable Models

Nickisch, H., Rasmussen, C.

Max Planck Institute for Biological Cybernetics, June 2010 (techreport)

Abstract
Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields bad density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets.

ei

Web [BibTex]

Web [BibTex]


no image
Diffusion Tensor Imaging in a Human PET/MR Hybrid System

Boss, A., Kolb, A., Hofmann, M., Bisdas, S., Nägele, T., Ernemann, U., Stegger, L., Rossi, C., Schlemmer, H., Pfannenberg, C., Reimold, M., Claussen, C., Pichler, B., Klose, U.

Investigative Radiology, 45(5):270-274, May 2010 (article)

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies

Stegle, O., Parts, L., Durbin, R., Winn, JM.

PLoS Computational Biology, 6(5):1-11, May 2010 (article)

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
A Robust Bayesian Two-Sample Test for Detecting Intervals of Differential Gene Expression in Microarray Time Series

Stegle, O., Denby, KJ., Cooke, EJ., Wild, DL., Ghahramani, Z., Borgwardt, KM.

Journal of Computational Biology, 17(3):355-367, May 2010 (article)

Abstract
Understanding the regulatory mechanisms that are responsible for an organism‘s response to environmental change is an important issue in molecular biology. A first and important step towards this goal is to detect genes whose expression levels are affected by altered external conditions. A range of methods to test for differential gene expression, both in static as well as in time-course experiments, have been proposed. While these tests answer the question whether a gene is differentially expressed, they do not explicitly address the question when a gene is differentially expressed, although this information may provide insights into the course and causal structure of regulatory programs. In this article, we propose a two-sample test for identifying intervals of differential gene expression in microarray time series. Our approach is based on Gaussian process regression, can deal with arbitrary numbers of replicates, and is robust with respect to outliers. We apply our algorithm to study the response of Arabidopsis thaliana genes to an infection by a fungal pathogen using a microarray time series dataset covering 30,336 gene probes at 24 observed time points. In classification experiments, our test compares favorably with existing methods and provides additional insights into time-dependent differential expression.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Statistical Tests for Detecting Differential RNA-Transcript Expression from Read Counts

Stegle, O., Drewe, P., Bohnert, R., Borgwardt, K., Rätsch, G.

Nature Precedings, 2010, pages: 1-11, May 2010 (article)

Abstract
As a fruit of the current revolution in sequencing technology, transcriptomes can now be analyzed at an unprecedented level of detail. These advances have been exploited for detecting differential expressed genes across biological samples and for quantifying the abundances of various RNA transcripts within one gene. However, explicit strategies for detecting the hidden differential abundances of RNA transcripts in biological samples have not been defined. In this work, we present two novel statistical tests to address this issue: a "gene structure sensitive" Poisson test for detecting differential expression when the transcript structure of the gene is known, and a kernel-based test called Maximum Mean Discrepancy when it is unknown. We analyzed the proposed approaches on simulated read data for two artificial samples as well as on factual reads generated by the Illumina Genome Analyzer for two C. elegans samples. Our analysis shows that the Poisson test identifies genes with differential transcript expression considerably better that previously proposed RNA transcript quantification approaches for this task. The MMD test is able to detect a large fraction (75%) of such differential cases without the knowledge of the annotated transcripts. It is therefore well-suited to analyze RNA-Seq experiments when the genome annotations are incomplete or not available, where other approaches have to fail.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Generalized Proximity and Projection with Norms and Mixed-norms

Sra, S.

(192), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, May 2010 (techreport)

Abstract
We discuss generalized proximity operators (GPO) and their associated generalized projection problems. On inputs of size n, we show how to efficiently apply GPOs and generalized projections for separable norms and distance-like functions to accuracy e in O(n log(1/e)) time. We also derive projection algorithms that run theoretically in O(n log n log(1/e)) time but can for suitable parameter ranges empirically outperform the O(n log(1/e)) projection method. The proximity and projection tasks are either separable, and solved directly, or are reduced to a single root-finding step. We highlight that as a byproduct, our analysis also yields an O(n log(1/e)) (weakly linear-time) procedure for Euclidean projections onto the l1;1-norm ball; previously only an O(n log n) method was known. We provide empirical evaluation to illustrate the performance of our methods, noting that for the l1;1-norm projection, our implementation is more than two orders of magnitude faster than the previously known method.

ei

PDF [BibTex]

PDF [BibTex]


no image
Estimation of a Structural Vector Autoregression Model Using Non-Gaussianity

Hyvärinen, A., Zhang, K., Shimizu, S., Hoyer, P.

Journal of Machine Learning Research, 11, pages: 1709-1731, May 2010 (article)

Abstract
Analysis of causal effects between continuous-valued variables typically uses either autoregressive models or structural equation models with instantaneous effects. Estimation of Gaussian, linear structural equation models poses serious identifiability problems, which is why it was recently proposed to use non-Gaussian models. Here, we show how to combine the non-Gaussian instantaneous model with autoregressive models. This is effectively what is called a structural vector autoregression (SVAR) model, and thus our work contributes to the long-standing problem of how to estimate SVAR‘s. We show that such a non-Gaussian model is identifiable without prior knowledge of network structure. We propose computationally efficient methods for estimating the model, as well as methods to assess the significance of the causal influences. The model is successfully applied on financial and brain imaging data.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Parameter-exploring policy gradients

Sehnke, F., Osendorfer, C., Rückstiess, T., Graves, A., Peters, J., Schmidhuber, J.

Neural Networks, 21(4):551-559, May 2010 (article)

Abstract
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in parameter space, which leads to lower variance gradient estimates than obtained by regular policy gradient methods. We show that for several complex control tasks, including robust standing with a humanoid robot, this method outperforms well-known algorithms from the fields of standard policy gradients, finite difference methods and population based heuristics. We also show that the improvement is largest when the parameter samples are drawn symmetrically. Lastly we analyse the importance of the individual components of our method by incrementally incorporating them into the other algorithms, and measuring the gain in performance after each step.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Temporal Kernel CCA and its Application in Multimodal Neuronal Data Analysis

Biessmann, F., Meinecke, F., Gretton, A., Rauch, A., Rainer, G., Logothetis, N., Müller, K.

Machine Learning, 79(1-2):5-27, May 2010 (article)

Abstract
Data recorded from multiple sources sometimes exhibit non-instantaneous couplings. For simple data sets, cross-correlograms may reveal the coupling dynamics. But when dealing with high-dimensional multivariate data there is no such measure as the cross-correlogram. We propose a simple algorithm based on Kernel Canonical Correlation Analysis (kCCA) that computes a multivariate temporal filter which links one data modality to another one. The filters can be used to compute a multivariate extension of the cross-correlogram, the canonical correlogram, between data sources that have different dimensionalities and temporal resolutions. The canonical correlogram reflects the coupling dynamics between the two sources. The temporal filter reveals which features in the data give rise to these couplings and when they do so. We present results from simulations and neuroscientific experiments showing that tkCCA yields easily interpretable temporal filters and correlograms. In the experiments, we simultaneously performed electrode recordings and functional magnetic resonance imaging (fMRI) in primary visual cortex of the non-human primate. While electrode recordings reflect brain activity directly, fMRI provides only an indirect view of neural activity via the Blood Oxygen Level Dependent (BOLD) response. Thus it is crucial for our understanding and the interpretation of fMRI signals in general to relate them to direct measures of neural activity acquired with electrodes. The results computed by tkCCA confirm recent models of the hemodynamic response to neural activity and allow for a more detailed analysis of neurovascular coupling dynamics.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]