Header logo is


2009


no image
Detecting Objects in Large Image Collections and Videos by Efficient Subimage Retrieval

Lampert, CH.

In ICCV 2009, pages: 987-994, IEEE Computer Society, Piscataway, NJ, USA, Twelfth IEEE International Conference on Computer Vision, October 2009 (inproceedings)

Abstract
We study the task of detecting the occurrence of objects in large image collections or in videos, a problem that combines aspects of content based image retrieval and object localization. While most previous approaches are either limited to special kinds of queries, or do not scale to large image sets, we propose a new method, efficient subimage retrieval (ESR), which is at the same time very flexible and very efficient. Relying on a two-layered branch-and-bound setup, ESR performs object-based image retrieval in sets of 100,000 or more images within seconds. An extensive evaluation on several datasets shows that ESR is not only very fast, but it also achieves detection accuracies that are on par with or superior to previously published methods for object-based image retrieval.

ei

PDF Web DOI [BibTex]

2009


PDF Web DOI [BibTex]


no image
Inferring textual entailment with a probabilistically sound calculus

Harmeling, S.

Natural Language Engineering, 15(4):459-477, October 2009 (article)

Abstract
We introduce a system for textual entailment that is based on a probabilistic model of entailment. The model is defined using a calculus of transformations on dependency trees, which is characterized by the fact that derivations in that calculus preserve the truth only with a certain probability. The calculus is successfully evaluated on the datasets of the PASCAL Challenge on Recognizing Textual Entailment.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Modeling and Visualizing Uncertainty in Gene Expression Clusters using Dirichlet Process Mixtures

Rasmussen, CE., de la Cruz, BJ., Ghahramani, Z., Wild, DL.

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(4):615-628, October 2009 (article)

Abstract
Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data, little attention has been paid to uncertainty in the results obtained. Dirichlet process mixture models provide a non-parametric Bayesian alternative to the bootstrap approach to modeling uncertainty in gene expression clustering. Most previously published applications of Bayesian model based clustering methods have been to short time series data. In this paper we present a case study of the application of non-parametric Bayesian clustering methods to the clustering of high-dimensional non-time series gene expression data using full Gaussian covariances. We use the probability that two genes belong to the same cluster in a Dirichlet process mixture model as a measure of the similarity of these gene expression profiles. Conversely, this probability can be used to define a dissimilarity measure, which, for the purposes of visualization, can be input to one of the standard linkage algorithms used for hierarchical clustering. Biologically plausible results are obtained from the Rosetta compendium of expression profiles which extend previously published cluster analyses of this data.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
A new non-monotonic algorithm for PET image reconstruction

Sra, S., Kim, D., Dhillon, I., Schölkopf, B.

In IEEE - Nuclear Science Symposium Conference Record (NSS/MIC), 2009, pages: 2500-2502, (Editors: B Yu), IEEE, Piscataway, NJ, USA, IEEE Nuclear Science Symposium and Medical Imaging Conference, October 2009 (inproceedings)

Abstract
Maximizing some form of Poisson likelihood (either with or without penalization) is central to image reconstruction algorithms in emission tomography. In this paper we introduce NMML, a non-monotonic algorithm for maximum likelihood PET image reconstruction. NMML offers a simple and flexible procedure that also easily incorporates standard convex regular-ization for doing penalized likelihood estimation. A vast number image reconstruction algorithms have been developed for PET, and new ones continue to be designed. Among these, methods based on the expectation maximization (EM) and ordered-subsets (OS) framework seem to have enjoyed the greatest popularity. Our method NMML differs fundamentally from methods based on EM: i) it does not depend on the concept of optimization transfer (or surrogate functions); and ii) it is a rapidly converging nonmonotonic descent procedure. The greatest strengths of NMML, however, are its simplicity, efficiency, and scalability, which make it especially attractive for tomograph ic reconstruction. We provide a theoretical analysis NMML, and empirically observe it to outperform standard EM based methods, sometimes by orders of magnitude. NMML seamlessly allows integreation of penalties (regularizers) in the likelihood. This ability can prove to be crucial, especially because with the rapidly rising importance of combined PET/MR scanners, one will want to include more “prior” knowledge into the reconstruction.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Approximation Algorithms for Tensor Clustering

Jegelka, S., Sra, S., Banerjee, A.

In Algorithmic Learning Theory: 20th International Conference, pages: 368-383, (Editors: Gavalda, R. , G. Lugosi, T. Zeugmann, S. Zilles), Springer, Berlin, Germany, ALT, October 2009 (inproceedings)

Abstract
We present the first (to our knowledge) approximation algo- rithm for tensor clustering—a powerful generalization to basic 1D clustering. Tensors are increasingly common in modern applications dealing with complex heterogeneous data and clustering them is a fundamental tool for data analysis and pattern discovery. Akin to their 1D cousins, common tensor clustering formulations are NP-hard to optimize. But, unlike the 1D case no approximation algorithms seem to be known. We address this imbalance and build on recent co-clustering work to derive a tensor clustering algorithm with approximation guarantees, allowing metrics and divergences (e.g., Bregman) as objective functions. Therewith, we answer two open questions by Anagnostopoulos et al. (2008). Our analysis yields a constant approximation factor independent of data size; a worst-case example shows this factor to be tight for Euclidean co-clustering. However, empirically the approximation factor is observed to be conservative, so our method can also be used in practice.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Active learning using mean shift optimization for robot grasping

Kroemer, O., Detry, R., Piater, J., Peters, J.

In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), pages: 2610-2615, IEEE Service Center, Piscataway, NJ, USA, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2009 (inproceedings)

Abstract
When children learn to grasp a new object, they often know several possible grasping points from observing a parent‘s demonstration and subsequently learn better grasps by trial and error. From a machine learning point of view, this process is an active learning approach. In this paper, we present a new robot learning framework for reproducing this ability in robot grasping. For doing so, we chose a straightforward approach: first, the robot observes a few good grasps by demonstration and learns a value function for these grasps using Gaussian process regression. Subsequently, it chooses grasps which are optimal with respect to this value function using a mean-shift optimization approach, and tries them out on the real system. Upon every completed trial, the value function is updated, and in the following trials it is more likely to choose even better grasping points. This method exhibits fast learning due to the data-efficiency of Gaussian process regression framework and the fact th at t he mean-shift method provides maxima of this cost function. Experiments were repeatedly carried out successfully on a real robot system. After less than sixty trials, our system has adapted its grasping policy to consistently exhibit successful grasps.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Sparse online model learning for robot control with support vector regression

Nguyen-Tuong, D., Schölkopf, B., Peters, J.

In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), pages: 3121-3126, IEEE Service Center, Piscataway, NJ, USA, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2009 (inproceedings)

Abstract
The increasing complexity of modern robots makes it prohibitively hard to accurately model such systems as required by many applications. In such cases, machine learning methods offer a promising alternative for approximating such models using measured data. To date, high computational demands have largely restricted machine learning techniques to mostly offline applications. However, making the robots adaptive to changes in the dynamics and to cope with unexplored areas of the state space requires online learning. In this paper, we propose an approximation of the support vector regression (SVR) by sparsification based on the linear independency of training data. As a result, we obtain a method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques, such as nu-SVR, Gaussian process regression (GPR) and locally weighted projection regression (LWPR).

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Thermodynamic efficiency of information and heat flow

Allahverdyan, A., Janzing, D., Mahler, G.

Journal of Statistical Mechanics: Theory and Experiment, 2009(09):P09011, September 2009 (article)

Abstract
A basic task of information processing is information transfer (flow). P0 Here we study a pair of Brownian particles each coupled to a thermal bath at temperatures T1 and T2 . The information flow in such a system is defined via the time-shifted mutual information. The information flow nullifies at equilibrium, and its efficiency is defined as the ratio of the flow to the total entropy production in the system. For a stationary state the information flows from higher to lower temperatures, and its efficiency is bounded from above by (max[T1 , T2 ])/(|T1 − T2 |). This upper bound is imposed by the second law and it quantifies the thermodynamic cost for information flow in the present class of systems. It can be reached in the adiabatic situation, where the particles have widely different characteristic times. The efficiency of heat flow—defined as the heat flow over the total amount of dissipated heat—is limited from above by the same factor. There is a complementarity between heat and information flow: the set-up which is most efficient for the former is the least efficient for the latter and vice versa. The above bound for the efficiency can be (transiently) overcome in certain non-stationary situations, but the efficiency is still limited from above. We study yet another measure of information processing (transfer entropy) proposed in the literature. Though this measure does not require any thermodynamic cost, the information flow and transfer entropy are shown to be intimately related for stationary states.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Does Cognitive Science Need Kernels?

Jäkel, F., Schölkopf, B., Wichmann, F.

Trends in Cognitive Sciences, 13(9):381-388, September 2009 (article)

Abstract
Kernel methods are among the most successful tools in machine learning and are used in challenging data analysis problems in many disciplines. Here we provide examples where kernel methods have proven to be powerful tools for analyzing behavioral data, especially for identifying features in categorization experiments. We also demonstrate that kernel methods relate to perceptrons and exemplar models of categorization. Hence, we argue that kernel methods have neural and psychological plausibility, and theoretical results concerning their behavior are therefore potentially relevant for human category learning. In particular, we believe kernel methods have the potential to provide explanations ranging from the implementational via the algorithmic to the computational level.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Implicit Wiener Series Analysis of Epileptic Seizure Recordings

Barbero, A., Franz, M., Drongelen, W., Dorronsoro, J., Schölkopf, B., Grosse-Wentrup, M.

In EMBC 2009, pages: 5304-5307, (Editors: Y Kim and B He and G Worrell and X Pan), IEEE Service Center, Piscataway, NJ, USA, 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, September 2009 (inproceedings)

Abstract
Implicit Wiener series are a powerful tool to build Volterra representations of time series with any degree of nonlinearity. A natural question is then whether higher order representations yield more useful models. In this work we shall study this question for ECoG data channel relationships in epileptic seizure recordings, considering whether quadratic representations yield more accurate classifiers than linear ones. To do so we first show how to derive statistical information on the Volterra coefficient distribution and how to construct seizure classification patterns over that information. As our results illustrate, a quadratic model seems to provide no advantages over a linear one. Nevertheless, we shall also show that the interpretability of the implicit Wiener series provides insights into the inter-channel relationships of the recordings.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Incorporating Prior Knowledge on Class Probabilities into Local Similarity Measures for Intermodality Image Registration

Hofmann, M., Schölkopf, B., Bezrukov, I., Cahill, N.

In Proceedings of the MICCAI 2009 Workshop on Probabilistic Models for Medical Image Analysis , pages: 220-231, (Editors: W Wells and S Joshi and K Pohl), PMMIA, September 2009 (inproceedings)

Abstract
We present a methodology for incorporating prior knowledge on class probabilities into the registration process. By using knowledge from the imaging modality, pre-segmentations, and/or probabilistic atlases, we construct vectors of class probabilities for each image voxel. By defining new image similarity measures for distribution-valued images, we show how the class probability images can be nonrigidly registered in a variational framework. An experiment on nonrigid registration of MR and CT full-body scans illustrates that the proposed technique outperforms standard mutual information (MI) and normalized mutual information (NMI) based registration techniques when measured in terms of target registration error (TRE) of manually labeled fiducials.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Markerless 3D Face Tracking (DAGM 2009)

Walder, C., Breidt, M., Bülthoff, H., Schölkopf, B., Curio, C.

In Pattern Recognition, Lecture Notes in Computer Science, Vol. 5748 , pages: 41-50, (Editors: J Denzler and G Notni and H Süsse), Springer, Berlin, Germany, 31st Symposium of the German Association for Pattern Recognition (DAGM), September 2009 (inproceedings)

Abstract
We present a novel algorithm for the markerless tracking of deforming surfaces such as faces. We acquire a sequence of 3D scans along with color images at 40Hz. The data is then represented by implicit surface and color functions, using a novel partition-of-unity type method of efficiently combining local regressors using nearest neighbor searches. Both these functions act on the 4D space of 3D plus time, and use temporal information to handle the noise in individual scans. After interactive registration of a template mesh to the first frame, it is then automatically deformed to track the scanned surface, using the variation of both shape and color as features in a dynamic energy minimization problem. Our prototype system yields high-quality animated 3D models in correspondence, at a rate of approximately twenty seconds per timestep. Tracking results for faces and other objects are presented.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Robot Learning

Peters, J., Morimoto, J., Tedrake, R., Roy, N.

IEEE Robotics and Automation Magazine, 16(3):19-20, September 2009 (article)

Abstract
Creating autonomous robots that can learn to act in unpredictable environments has been a long-standing goal of robotics, artificial intelligence, and the cognitive sciences. In contrast, current commercially available industrial and service robots mostly execute fixed tasks and exhibit little adaptability. To bridge this gap, machine learning offers a myriad set of methods, some of which have already been applied with great success to robotics problems. As a result, there is an increasing interest in machine learning and statistics within the robotics community. At the same time, there has been a growth in the learning community in using robots as motivating applications for new algorithms and formalisms. Considerable evidence of this exists in the use of learning in high-profile competitions such as RoboCup and the Defense Advanced Research Projects Agency (DARPA) challenges, and the growing number of research programs funded by governments around the world.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Object Localization with Global and Local Context Kernels

Blaschko, M., Lampert, C.

In British Machine Vision Conference 2009, pages: 1-11, BMVC, September 2009 (inproceedings)

Abstract
Recent research has shown that the use of contextual cues significantly improves performance in sliding window type localization systems. In this work, we propose a method that incorporates both global and local context information through appropriately defined kernel functions. In particular, we make use of a weighted combination of kernels defined over local spatial regions, as well as a global context kernel. The relative importance of the context contributions is learned automatically, and the resulting discriminant function is of a form such that localization at test time can be solved efficiently using a branch and bound optimization scheme. By specifying context directly with a kernel learning approach, we achieve high localization accuracy with a simple and efficient representation. This is in contrast to other systems that incorporate context for which expensive inference needs to be done at test time. We show experimentally on the PASCAL VOC datasets that the inclusion of context can significantly improve localization performance, provided the relative contributions of context cues are learned appropriately.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Efficient Sample Reuse in EM-Based Policy Search

Hachiya, H., Peters, J., Sugiyama, M.

In 16th European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, pages: 469-484, (Editors: Buntine, W. , M. Grobelnik, D. Mladenic, J. Shawe-Taylor), Springer, Berlin, Germany, ECML PKDD, September 2009 (inproceedings)

Abstract
Direct policy search is a promising reinforcement learning framework in particular for controlling in continuous, high-dimensional systems such as anthropomorphic robots. Policy search often requires a large number of samples for obtaining a stable policy update estimator due to its high flexibility. However, this is prohibitive when the sampling cost is expensive. In this paper, we extend a EM-based policy search method so that previously collected samples can be efficiently reused. The usefulness of the proposed method, called Reward-weighted Regression with sample Reuse, is demonstrated through a robot learning experiment.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Active Structured Learning for High-Speed Object Detection

Lampert, C., Peters, J.

In DAGM 2009, pages: 221-231, (Editors: Denzler, J. , G. Notni, H. Süsse), Springer, Berlin, Germany, 31st Annual Symposium of the German Association for Pattern Recognition, September 2009 (inproceedings)

Abstract
High-speed smooth and accurate visual tracking of objects in arbitrary, unstructured environments is essential for robotics and human motion analysis. However, building a system that can adapt to arbitrary objects and a wide range of lighting conditions is a challenging problem, especially if hard real-time constraints apply like in robotics scenarios. In this work, we introduce a method for learning a discriminative object tracking system based on the recent structured regression framework for object localization. Using a kernel function that allows fast evaluation on the GPU, the resulting system can process video streams at speed of 100 frames per second or more. Consecutive frames in high speed video sequences are typically very redundant, and for training an object detection system, it is sufficient to have training labels from only a subset of all images. We propose an active learning method that select training examples in a data-driven way, thereby minimizing the required number of training labeling. Experiments on realistic data show that the active learning is superior to previously used methods for dataset subsampling for this task.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Kernel Methods in Computer Vision

Lampert, CH.

Foundations and Trends in Computer Graphics and Vision, 4(3):193-285, September 2009 (article)

Abstract
Over the last years, kernel methods have established themselves as powerful tools for computer vision researchers as well as for practitioners. In this tutorial, we give an introduction to kernel methods in computer vision from a geometric perspective, introducing not only the ubiquitous support vector machines, but also less known techniques for regression, dimensionality reduction, outlier detection and clustering. Additionally, we give an outlook on very recent, non-classical techniques for the prediction of structure data, for the estimation of statistical dependency and for learning the kernel function itself. All methods are illustrated with examples of successful application from the recent computer vision research literature.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Generalized Clustering via Kernel Embeddings

Jegelka, S., Gretton, A., Schölkopf, B., Sriperumbudur, B., von Luxburg, U.

In KI 2009: AI and Automation, Lecture Notes in Computer Science, Vol. 5803, pages: 144-152, (Editors: B Mertsching and M Hund and Z Aziz), Springer, Berlin, Germany, 32nd Annual Conference on Artificial Intelligence (KI), September 2009 (inproceedings)

Abstract
We generalize traditional goals of clustering towards distinguishing components in a non-parametric mixture model. The clusters are not necessarily based on point locations, but on higher order criteria. This framework can be implemented by embedding probability distributions in a Hilbert space. The corresponding clustering objective is very general and relates to a range of common clustering concepts.

ei

PDF PDF Web DOI [BibTex]

PDF PDF Web DOI [BibTex]


no image
Discovering Temporal Patterns of Differential Gene Expression in Microarray Time Series

Stegle, O., Denby, KJ., Wild, DL., McHattie, S., Mead, A., Ghahramani, Z., Borgwardt, KM.

In Proceedings of the German Conference on Bioinformatics 2009 (GCB 2009), pages: 133-142, (Editors: Grosse, I. , S. Neumann, S. Posch, F. Schreiber, P. F. Stadler), Gesellschaft für Informatik, Bonn, Germany, German Conference on Bioinformatics (GCB '09), September 2009 (inproceedings)

Abstract
A wealth of time series of microarray measurements have become available over recent years. Several two-sample tests for detecting differential gene expression in these time series have been defined, but they can only answer the question whether a gene is differentially expressed across the whole time series, not in which intervals it is differentially expressed. In this article, we propose a Gaussian process based approach for studying these dynamics of differential gene expression. In experiments on Arabidopsis thaliana gene expression levels, our novel technique helps us to uncover that the family of WRKY transcription factors appears to be involved in the early response to infection by a fungal pathogen.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Fast Kernel-Based Independent Component Analysis

Shen, H., Jegelka, S., Gretton, A.

IEEE Transactions on Signal Processing, 57(9):3498-3511, September 2009 (article)

Abstract
Recent approaches to independent component analysis (ICA) have used kernel independence measures to obtain highly accurate solutions, particularly where classical methods experience difficulty (for instance, sources with near-zero kurtosis). FastKICA (fast HSIC-based kernel ICA) is a new optimization method for one such kernel independence measure, the Hilbert-Schmidt Independence Criterion (HSIC). The high computational efficiency of this approach is achieved by combining geometric optimization techniques, specifically an approximate Newton-like method on the orthogonal group, with accurate estimates of the gradient and Hessian based on an incomplete Cholesky decomposition. In contrast to other efficient kernel-based ICA algorithms, FastKICA is applicable to any twice differentiable kernel function. Experimental results for problems with large numbers of sources and observations indicate that FastKICA provides more accurate solutions at a given cost than gradient descent on HSIC. Comparing with other recently published ICA methods, FastKICA is competitive in terms of accuracy, relatively insensitive to local minima when initialized far from independence, and more robust towards outliers. An analysis of the local convergence properties of FastKICA is provided.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Guest editorial: Special issue on robot learning, Part B

Peters, J., Ng, A.

Autonomous Robots, 27(2):91-92, August 2009 (article)

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Policy Search for Motor Primitives

Peters, J., Kober, J.

KI - Zeitschrift K{\"u}nstliche Intelligenz, 23(3):38-40, August 2009 (article)

Abstract
Many motor skills in humanoid robotics can be learned using parametrized motor primitives from demonstrations. However, most interesting motor learning problems require self-improvement often beyond the reach of current reinforcement learning methods due to the high dimensionality of the state-space. We develop an EM-inspired algorithm applicable to complex motor learning tasks. We compare this algorithm to several well-known parametrized policy search methods and show that it outperforms them. We apply it to motor learning problems and show that it can learn the complex Ball-in-a-Cup task using a real Barrett WAM robot arm.

ei

Web [BibTex]

Web [BibTex]


no image
A neurophysiologically plausible population code model for human contrast discrimination

Goris, R., Wichmann, F., Henning, G.

Journal of Vision, 9(7):1-22, July 2009 (article)

Abstract
The pedestal effect is the improvement in the detectability of a sinusoidal grating in the presence of another grating of the same orientation, spatial frequency, and phase—usually called the pedestal. Recent evidence has demonstrated that the pedestal effect is differently modified by spectrally flat and notch-filtered noise: The pedestal effect is reduced in flat noise but virtually disappears in the presence of notched noise (G. B. Henning & F. A. Wichmann, 2007). Here we consider a network consisting of units whose contrast response functions resemble those of the cortical cells believed to underlie human pattern vision and demonstrate that, when the outputs of multiple units are combined by simple weighted summation—a heuristic decision rule that resembles optimal information combination and produces a contrast-dependent weighting profile—the network produces contrast-discrimination data consistent with psychophysical observations: The pedestal effect is present without noise, reduced in broadband noise, but almost disappears in notched noise. These findings follow naturally from the normalization model of simple cells in primary visual cortex, followed by response-based pooling, and suggest that in processing even low-contrast sinusoidal gratings, the visual system may combine information across neurons tuned to different spatial frequencies and orientations.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Falsificationism and Statistical Learning Theory: Comparing the Popper and Vapnik-Chervonenkis Dimensions

Corfield, D., Schölkopf, B., Vapnik, V.

Journal for General Philosophy of Science, 40(1):51-58, July 2009 (article)

Abstract
We compare Karl Popper’s ideas concerning the falsifiability of a theory with similar notions from the part of statistical learning theory known as VC-theory. Popper’s notion of the dimension of a theory is contrasted with the apparently very similar VC-dimension. Having located some divergences, we discuss how best to view Popper’s work from the perspective of statistical learning theory, either as a precursor or as aiming to capture a different learning activity.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Guest editorial: Special issue on robot learning, Part A

Peters, J., Ng, A.

Autonomous Robots, 27(1):1-2, July 2009 (article)

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
A Geometric Approach to Confidence Sets for Ratios: Fieller’s Theorem, Generalizations, and Bootstrap

von Luxburg, U., Franz, V.

Statistica Sinica, 19(3):1095-1117, July 2009 (article)

Abstract
We present a geometric method to determine confidence sets for the ratio E(Y)/E(X) of the means of random variables X and Y. This method reduces the problem of constructing confidence sets for the ratio of two random variables to the problem of constructing confidence sets for the means of one-dimensional random variables. It is valid in a large variety of circumstances. In the case of normally distributed random variables, the so constructed confidence sets coincide with the standard Fieller confidence sets. Generalizations of our construction lead to definitions of exact and conservative confidence sets for very general classes of distributions, provided the joint expectation of (X,Y) exists and the linear combinations of the form aX + bY are well-behaved. Finally, our geometric method allows to derive a very simple bootstrap approach for constructing conservative confidence sets for ratios which perform favorably in certain situations, in particular in the asymmetric heavy-tailed regime.

ei

PDF PDF Web [BibTex]


no image
Learning Motor Primitives for Robotics

Kober, J., Peters, J., Oztop, E.

Advanced Telecommunications Research Center ATR, June 2009 (talk)

Abstract
The acquisition and self-improvement of novel motor skills is among the most important problems in robotics. Motor primitives offer one of the most promising frameworks for the application of machine learning techniques in this context. Employing the Dynamic Systems Motor primitives originally introduced by Ijspeert et al. (2003), appropriate learning algorithms for a concerted approach of both imitation and reinforcement learning are presented. Using these algorithms new motor skills, i.e., Ball-in-a-Cup, Ball-Paddling and Dart-Throwing, are learned.

ei

[BibTex]

[BibTex]


no image
Varieties of Justification in Machine Learning

Corfield, D.

In Proceedings of Multiplicity and Unification in Statistics and Probability, pages: 1-10, Multiplicity and Unification in Statistics and Probability, June 2009 (inproceedings)

Abstract
The field of machine learning has flourished over the past couple of decades. With huge amounts of data available, efficient algorithms can learn to extrapolate from their training sets to become very accurate classifiers. For example, it is straightforward now to develop classifiers which achieve accuracies of around 99% on databases of handwritten digits. Now these algorithms have been devised by theorists who arrive at the problem of machine learning with a range of different philosophical outlooks on the subject of inductive reasoning. This has led to a wide range of theoretical rationales for their work. In this talk I shall classify the different forms of justification for inductive machine learning into four kinds, and make some comparisons between them. With little by way of theoretical knowledge to aid in the learning tasks, while the relevance of these justificatory approaches for the inductive reasoning of the natural sciences is questionable, certain issues surrounding the presuppositions of inductive reasoning are brought sharply into focus. In particular, Frequentist, Bayesian and MDL outlooks can be compared.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Effects of Stimulus Type and of Error-Correcting Code Design on BCI Speller Performance

Hill, J., Farquhar, J., Martens, S., Biessmann, F., Schölkopf, B.

In Advances in neural information processing systems 21, pages: 665-672, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
From an information-theoretic perspective, a noisy transmission system such as a visual Brain-Computer Interface (BCI) speller could benefit from the use of errorcorrecting codes. However, optimizing the code solely according to the maximal minimum-Hamming-distance criterion tends to lead to an overall increase in target frequency of target stimuli, and hence a significantly reduced average target-to-target interval (TTI), leading to difficulties in classifying the individual event-related potentials (ERPs) due to overlap and refractory effects. Clearly any change to the stimulus setup must also respect the possible psychophysiological consequences. Here we report new EEG data from experiments in which we explore stimulus types and codebooks in a within-subject design, finding an interaction between the two factors. Our data demonstrate that the traditional, rowcolumn code has particular spatial properties that lead to better performance than one would expect from its TTIs and Hamming-distances alone, but nonetheless error-correcting codes can improve performance provided the right stimulus type is used.

ei

PDF PDF Web [BibTex]

PDF PDF Web [BibTex]


no image
Influence of graph construction on graph-based clustering measures

Maier, M., von Luxburg, U., Hein, M.

In Advances in neural information processing systems 21, pages: 1025-1032, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
Graph clustering methods such as spectral clustering are defined for general weighted graphs. In machine learning, however, data often is not given in form of a graph, but in terms of similarity (or distance) values between points. In this case, first a neighborhood graph is constructed using the similarities between the points and then a graph clustering algorithm is applied to this graph. In this paper we investigate the influence of the construction of the similarity graph on the clustering results. We first study the convergence of graph clustering criteria such as the normalized cut (Ncut) as the sample size tends to infinity. We find that the limit expressions are different for different types of graph, for example the r-neighborhood graph or the k-nearest neighbor graph. In plain words: Ncut on a kNN graph does something systematically different than Ncut on an r-neighborhood graph! This finding shows that graph clustering criteria cannot be studied independently of the kind of graph they are applied to. We also provide examples which show that these differences can be observed for toy and real data already for rather small sample sizes.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Local Gaussian Process Regression for Real Time Online Model Learning and Control

Nguyen-Tuong, D., Seeger, M., Peters, J.

In Advances in neural information processing systems 21, pages: 1193-1200, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
Learning in real-time applications, e.g., online approximation of the inverse dynamics model for model-based robot control, requires fast online regression techniques. Inspired by local learning, we propose a method to speed up standard Gaussian Process regression (GPR) with local GP models (LGP). The training data is partitioned in local regions, for each an individual GP model is trained. The prediction for a query point is performed by weighted estimation using nearby local models. Unlike other GP approximations, such as mixtures of experts, we use a distance based measure for partitioning of the data and weighted prediction. The proposed method achieves online learning and prediction in real-time. Comparisons with other nonparametric regression methods show that LGP has higher accuracy than LWPR and close to the performance of standard GPR and nu-SVR.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Detecting the Direction of Causal Time Series

Peters, J., Janzing, D., Gretton, A., Schölkopf, B.

In Proceedings of the 26th International Conference on Machine Learning, pages: 801-808, (Editors: A Danyluk and L Bottou and ML Littman), ACM Press, New York, NY, USA, ICML, June 2009 (inproceedings)

Abstract
We propose a method that detects the true direction of time series, by fitting an autoregressive moving average model to the data. Whenever the noise is independent of the previous samples for one ordering of the observations, but dependent for the opposite ordering, we infer the former direction to be the true one. We prove that our method works in the population case as long as the noise of the process is not normally distributed (for the latter case, the direction is not identificable). A new and important implication of our result is that it confirms a fundamental conjecture in causal reasoning - if after regression the noise is independent of signal for one direction and dependent for the other, then the former represents the true causal direction - in the case of time series. We test our approach on two types of data: simulated data sets conforming to our modeling assumptions, and real world EEG time series. Our method makes a decision for a significant fraction of both data sets, and these decisions are mostly correct. For real world data, our approach outperforms alternative solutions to the problem of time direction recovery.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer

Lampert, C.

IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), June 2009 (talk)

ei

Web [BibTex]

Web [BibTex]


no image
Learning Taxonomies by Dependence Maximization

Blaschko, M., Gretton, A.

In Advances in neural information processing systems 21, pages: 153-160, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
We introduce a family of unsupervised algorithms, numerical taxonomy clustering, to simultaneously cluster data, and to learn a taxonomy that encodes the relationship between the clusters. The algorithms work by maximizing the dependence between the taxonomy and the original data. The resulting taxonomy is a more informative visualization of complex data than simple clustering; in addition, taking into account the relations between different clusters is shown to substantially improve the quality of the clustering, when compared with state-ofthe-art algorithms in the literature (both spectral clustering and a previous dependence maximization approach). We demonstrate our algorithm on image and text data.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Learning object-specific grasp affordance densities

Detry, R., Baseski, E., Popovic, M., Touati, Y., Krüger, N., Kroemer, O., Peters, J., Piater, J.

In 8th IEEE International Conference on Development and Learning, pages: 1-7, IEEE Service Center, Piscataway, NJ, USA, ICDL, June 2009 (inproceedings)

Abstract
This paper addresses the issue of learning and representing object grasp affordances, i.e. object-gripper relative configurations that lead to successful grasps. The purpose of grasp affordances is to organize and store the whole knowledge that an agent has about the grasping of an object, in order to facilitate reasoning on grasping solutions and their achievability. The affordance representation consists in a continuous probability density function defined on the 6D gripper pose space-3D position and orientation-, within an object-relative reference frame. Grasp affordances are initially learned from various sources, e.g. from imitation or from visual cues, leading to grasp hypothesis densities. Grasp densities are attached to a learned 3D visual object model, and pose estimation of the visual model allows a robotic agent to execute samples from a grasp hypothesis density under various object poses. Grasp outcomes are used to learn grasp empirical densities, i.e. grasps that have been confirmed through experience. We show the result of learning grasp hypothesis densities from both imitation and visual cues, and present grasp empirical densities learned from physical experience by a robot.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Understanding Brain Connectivity Patterns during Motor Imagery for Brain-Computer Interfacing

Grosse-Wentrup, M.

In Advances in neural information processing systems 21, pages: 561-568, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
EEG connectivity measures could provide a new type of feature space for inferring a subject‘s intention in Brain-Computer Interfaces (BCIs). However, very little is known on EEG connectivity patterns for BCIs. In this study, EEG connectivity during motor imagery (MI) of the left and right is investigated in a broad frequency range across the whole scalp by combining Beamforming with Transfer Entropy and taking into account possible volume conduction effects. Observed connectivity patterns indicate that modulation intentionally induced by MI is strongest in the gamma-band, i.e., above 35 Hz. Furthermore, modulation between MI and rest is found to be more pronounced than between MI of different hands. This is in contrast to results on MI obtained with bandpower features, and might provide an explanation for the so far only moderate success of connectivity features in BCIs. It is concluded that future studies on connectivity based BCIs should focus on high frequency bands and con side r ex peri mental paradigms that maximally vary cognitive demands between conditions.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Nonlinear causal discovery with additive noise models

Hoyer, P., Janzing, D., Mooij, J., Peters, J., Schölkopf, B.

In Advances in neural information processing systems 21, pages: 689-696, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
The discovery of causal relationships between a set of observed variables is a fundamental problem in science. For continuous-valued data linear acyclic causal models are often used because these models are well understood and there are well-known methods to fit them to data. In reality, of course, many causal relationships are more or less nonlinear, raising some doubts as to the applicability and usefulness of purely linear methods. In this contribution we show that in fact the basic linear framework can be generalized to nonlinear models with additive noise. In this extended framework, nonlinearities in the data-generating process are in fact a blessing rather than a curse, as they typically provide information on the underlying causal system and allow more aspects of the true data-generating mechanisms to be identified. In addition to theoretical results we show simulations and some simple real data experiments illustrating the identification power provided by nonlinearities.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Bounds on marginal probability distributions

Mooij, JM., Kappen, B.

In Advances in neural information processing systems 21, pages: 1105-1112, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
We propose a novel bound on single-variable marginal probability distributions in factor graphs with discrete variables. The bound is obtained by propagating local bounds (convex sets of probability distributions) over a subtree of the factor graph, rooted in the variable of interest. By construction, the method not only bounds the exact marginal probability distribution of a variable, but also its approximate Belief Propagation marginal ("belief"). Thus, apart from providing a practical means to calculate bounds on marginals, our contribution also lies in providing a better understanding of the error made by Belief Propagation. We show that our bound outperforms the state-of-the-art on some inference problems arising in medical diagnosis.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Convex variational Bayesian inference for large scale generalized linear models

Nickisch, H., Seeger, M.

In ICML 2009, pages: 761-768, (Editors: Danyluk, A. , L. Bottou, M. Littman), ACM Press, New York, NY, USA, 26th International Conference on Machine Learning, June 2009 (inproceedings)

Abstract
We show how variational Bayesian inference can be implemented for very large generalized linear models. Our relaxation is proven to be a convex problem for any log-concave model. We provide a generic double loop algorithm for solving this relaxation on models with arbitrary super-Gaussian potentials. By iteratively decoupling the criterion, most of the work can be done by solving large linear systems, rendering our algorithm orders of magnitude faster than previously proposed solvers for the same problem. We evaluate our method on problems of Bayesian active learning for large binary classification models, and show how to address settings with many candidates and sequential inclusion steps.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis

Schweikert, G., Widmer, C., Schölkopf, B., Rätsch, G.

In Advances in neural information processing systems 21, pages: 1433-1440, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
We study the problem of domain transfer for a supervised classification task in mRNA splicing. We consider a number of recent domain transfer methods from machine learning, including some that are novel, and evaluate them on genomic sequence data from model organisms of varying evolutionary distance. We find that in cases where the organisms are not closely related, the use of domain adaptation methods can help improve classification performance.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Diffeomorphic Dimensionality Reduction

Walder, C., Schölkopf, B.

In Advances in neural information processing systems 21, pages: 1713-1720, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
This paper introduces a new approach to constructing meaningful lower dimensional representations of sets of data points. We argue that constraining the mapping between the high and low dimensional spaces to be a diffeomorphism is a natural way of ensuring that pairwise distances are approximately preserved. Accordingly we develop an algorithm which diffeomorphically maps the data near to a lower dimensional subspace and then projects onto that subspace. The problem of solving for the mapping is transformed into one of solving for an Eulerian flow field which we compute using ideas from kernel methods. We demonstrate the efficacy of our approach on various real world data sets.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Combining appearance and motion for human action classification in videos

Dhillon, P., Nowozin, S., Lampert, C.

In 1st International Workshop on Visual Scene Understanding, pages: 22-29, IEEE Service Center, Piscataway, NJ, USA, ViSU, June 2009 (inproceedings)

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Let the Kernel Figure it Out: Principled Learning of Pre-processing for Kernel Classifiers

Gehler, P., Nowozin, S.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages: 2836-2843, IEEE Service Center, Piscataway, NJ, USA, CVPR, June 2009 (inproceedings)

Abstract
Most modern computer vision systems for high-level tasks, such as image classification, object recognition and segmentation, are based on learning algorithms that are able to separate discriminative information from noise. In practice, however, the typical system consists of a long pipeline of pre-processing steps, such as extraction of different kinds of features, various kinds of normalizations, feature selection, and quantization into aggregated representations such as histograms. Along this pipeline, there are many parameters to set and choices to make, and their effect on the overall system performance is a-priori unclear. In this work, we shorten the pipeline in a principled way. We move pre-processing steps into the learning system by means of kernel parameters, letting the learning algorithm decide upon suitable parameter values. Learning to optimize the pre-processing choices becomes learning the kernel parameters. We realize this paradigm by extending the recent Multiple Kernel Learning formulation from the finite case of having a fixed number of kernels which can be combined to the general infinite case where each possible parameter setting induces an associated kernel. We evaluate the new paradigm extensively on image classification and object classification tasks. We show that it is possible to learn optimal discriminative codebooks and optimal spatial pyramid schemes, consistently outperforming all previous state-of-the-art approaches.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Identifying confounders using additive noise models

Janzing, D., Peters, J., Mooij, J., Schölkopf, B.

In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pages: 249-257, (Editors: J Bilmes and AY Ng), AUAI Press, Corvallis, OR, USA, UAI, June 2009 (inproceedings)

Abstract
We propose a method for inferring the existence of a latent common cause ("confounder") of two observed random variables. The method assumes that the two effects of the confounder are (possibly nonlinear) functions of the confounder plus independent, additive noise. We discuss under which conditions the model is identifiable (up to an arbitrary reparameterization of the confounder) from the joint distribution of the effects. We state and prove a theoretical result that provides evidence for the conjecture that the model is generically identifiable under suitable technical conditions. In addition, we propose a practical method to estimate the confounder from a finite i.i.d. sample of the effects and illustrate that the method works well on both simulated and real-world data.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Regression by dependence minimization and its application to causal inference in additive noise models

Mooij, J., Janzing, D., Peters, J., Schölkopf, B.

In Proceedings of the 26th International Conference on Machine Learning, pages: 745-752, (Editors: A Danyluk and L Bottou and M Littman), ACM Press, New York, NY, USA, ICML, June 2009 (inproceedings)

Abstract
Motivated by causal inference problems, we propose a novel method for regression that minimizes the statistical dependence between regressors and residuals. The key advantage of this approach to regression is that it does not assume a particular distribution of the noise, i.e., it is non-parametric with respect to the noise distribution. We argue that the proposed regression method is well suited to the task of causal inference in additive noise models. A practical disadvantage is that the resulting optimization problem is generally non-convex and can be difficult to solve. Nevertheless, we report good results on one of the tasks of the NIPS 2008 Causality Challenge, where the goal is to distinguish causes from effects in pairs of statistically dependent variables. In addition, we propose an algorithm for efficiently inferring causal models from observational data for more than two variables. The required number of regressions and independence tests is quadratic in the number of variables, which is a significant improvement over the simple method that tests all possible DAGs.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Fitted Q-iteration by Advantage Weighted Regression

Neumann, G., Peters, J.

In Advances in neural information processing systems 21, pages: 1177-1184, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
Recently, fitted Q-iteration (FQI) based methods have become more popular due to their increased sample efficiency, a more stable learning process and the higher quality of the resulting policy. However, these methods remain hard to use for continuous action spaces which frequently occur in real-world tasks, e.g., in robotics and other technical applications. The greedy action selection commonly used for the policy improvement step is particularly problematic as it is expensive for continuous actions, can cause an unstable learning process, introduces an optimization bias and results in highly non-smooth policies unsuitable for real-world systems. In this paper, we show that by using a soft-greedy action selection the policy improvement step used in FQI can be simplified to an inexpensive advantage-weighted regression. With this result, we are able to derive a new, computationally efficient FQI algorithm which can even deal with high dimensional action spaces.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Global Connectivity Potentials for Random Field Models

Nowozin, S., Lampert, C.

In CVPR 2009, pages: 818-825, IEEE Service Center, Piscataway, NJ, USA, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2009 (inproceedings)

Abstract
Markov random field (MRF, CRF) models are popular in computer vision. However, in order to be computationally tractable they are limited to incorporate only local interactions and cannot model global properties, such as connectedness, which is a potentially useful high-level prior for object segmentation. In this work, we overcome this limitation by deriving a potential function that enforces the output labeling to be connected and that can naturally be used in the framework of recent MAP-MRF LP relaxations. Using techniques from polyhedral combinatorics, we show that a provably tight approximation to the MAP solution of the resulting MRF can still be found efficiently by solving a sequence of max-flow problems. The efficiency of the inference procedure also allows us to learn the parameters of a MRF with global connectivity potentials by means of a cutting plane algorithm. We experimentally evaluate our algorithm on both synthetic data and on the challenging segmentation task of the PASCAL VOC 2008 data set. We show that in both cases the addition of a connectedness prior significantly reduces the segmentation error.

ei

PDF PDF Web DOI [BibTex]

PDF PDF Web DOI [BibTex]


no image
Bayesian Experimental Design of Magnetic Resonance Imaging Sequences

Seeger, M., Nickisch, H., Pohmann, R., Schölkopf, B.

In Advances in neural information processing systems 21, pages: 1441-1448, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
We show how improved sequences for magnetic resonance imaging can be found through automated optimization of Bayesian design scores. Combining recent advances in approximate Bayesian inference and natural image statistics with high-performance numerical computation, we propose the first scalable Bayesian experimental design framework for this problem of high relevance to clinical and brain research. Our solution requires approximate inference for dense, non-Gaussian models on a scale seldom addressed before. We propose a novel scalable variational inference algorithm, and show how powerful methods of numerical mathematics can be modified to compute primitives in our framework. Our approach is evaluated on a realistic setup with raw data from a 3T MR scanner.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Characteristic Kernels on Groups and Semigroups

Fukumizu, K., Sriperumbudur, B., Gretton, A., Schölkopf, B.

In Advances in neural information processing systems 21, pages: 473-480, (Editors: D Koller and D Schuurmans and Y Bengio and L Bottou), Curran, Red Hook, NY, USA, 22nd Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
Embeddings of random variables in reproducing kernel Hilbert spaces (RKHSs) may be used to conduct statistical inference based on higher order moments. For sufficiently rich (characteristic) RKHSs, each probability distribution has a unique embedding, allowing all statistical properties of the distribution to be taken into consideration. Necessary and sufficient conditions for an RKHS to be characteristic exist for Rn. In the present work, conditions are established for an RKHS to be characteristic on groups and semigroups. Illustrative examples are provided, including characteristic kernels on periodic domains, rotation matrices, and Rn+.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Policy Search for Motor Primitives in Robotics

Kober, J., Peters, J.

In Advances in neural information processing systems 21, pages: 849-856, (Editors: Koller, D. , D. Schuurmans, Y. Bengio, L. Bottou), Curran, Red Hook, NY, USA, Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS), June 2009 (inproceedings)

Abstract
Many motor skills in humanoid robotics can be learned using parametrized motor primitives as done in imitation learning. However, most interesting motor learning problems are high-dimensional reinforcement learning problems often beyond the reach of current methods. In this paper, we extend previous work on policy learning from the immediate reward case to episodic reinforcement learning. We show that this results into a general, common framework also connected to policy gradient methods and yielding a novel algorithm for policy learning by assuming a form of exploration that is particularly well-suited for dynamic motor primitives. The resulting algorithm is an EM-inspired algorithm applicable in complex motor learning tasks. We compare this algorithm to alternative parametrized policy search methods and show that it outperforms previous methods. We apply it in the context of motor learning and show that it can learn a complex Ball-in-a-Cup task using a real Barrett WAM robot arm.

ei

PDF Web [BibTex]

PDF Web [BibTex]