2010


Dirichlet Process Gaussian Mixture Models: Choice of the Base Distribution

Görür, D., Rasmussen, C.

Journal of Computer Science and Technology, 25(4):653-664, July 2010 (article)

Abstract
In the Bayesian mixture modeling framework it is possible to infer the necessary number of components to model the data and therefore it is unnecessary to explicitly restrict the number of components. Nonparametric mixture models sidestep the problem of finding the “correct” number of mixture components by assuming infinitely many components. In this paper Dirichlet process mixture (DPM) models are cast as infinite mixture models and inference using Markov chain Monte Carlo is described. The specification of the priors on the model parameters is often guided by mathematical and practical convenience. The primary goal of this paper is to compare the choice of conjugate and non-conjugate base distributions on a particular class of DPM models which is widely used in applications, the Dirichlet process Gaussian mixture model (DPGMM). We compare computational efficiency and modeling performance of DPGMM defined using a conjugate and a conditionally conjugate base distribution. We show that better density models can result from using a wider class of priors with no or only a modest increase in computational effort.

ei

PDF PDF DOI [BibTex]

Robust probabilistic superposition and comparison of protein structures

Mechelke, M., Habeck, M.

BMC Bioinformatics, 11(363):1-13, July 2010 (article)

ei

PDF DOI [BibTex]

Inferring deterministic causal relations

Daniusis, P., Janzing, D., Mooij, J., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.

In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pages: 143-150, (Editors: P Grünwald and P Spirtes), AUAI Press, Corvallis, OR, USA, UAI, July 2010 (inproceedings)

Abstract
We consider two variables that are related to each other by an invertible function. While it has previously been shown that the dependence structure of the noise can provide hints to determine which of the two variables is the cause, we presently show that even in the deterministic (noise-free) case, there are asymmetries that can be exploited for causal inference. Our method is based on the idea that if the function and the probability density of the cause are chosen independently, then the distribution of the effect will, in a certain sense, depend on the function. We provide a theoretical analysis of this method, showing that it also works in the low noise regime, and link it to information geometry. We report strong empirical results on various real-world data sets from different domains.

ei

PDF Web [BibTex]

Recent trends in classification of remote sensing data: active and semisupervised machine learning paradigms

Bruzzone, L., Persello, C.

In IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pages: 3720-3723, IEEE, Piscataway, NJ, USA, July 2010 (inproceedings)

Abstract
This paper addresses the recent trends in machine learning methods for the automatic classification of remote sensing (RS) images. In particular, we focus on two new paradigms: semisupervised and active learning. These two paradigms allow one to address classification problems in the critical conditions where the available labeled training samples are limited. These operational conditions are very common in RS problems, due to the high cost and time associated with the collection of labeled samples. Semisupervised and active learning techniques allow one to enrich the initial training set information and to improve classification accuracy by exploiting unlabeled samples or requiring additional labeling phases from the user, respectively. The two aforementioned strategies are theoretically and experimentally analyzed considering SVM-based techniques in order to highlight advantages and disadvantages of both strategies.

ei

Web DOI [BibTex]

Results of the GREAT08 Challenge: An image analysis competition for cosmological lensing

Bridle, S., Balan, S., Bethge, M., Gentile, M., Harmeling, S., Heymans, C., Hirsch, M., Hosseini, R., Jarvis, M., Kirk, D., Kitching, T., Kuijken, K., Lewis, A., Paulin-Henriksson, S., Schölkopf, B., Velander, M., Voigt, L., Witherick, D., Amara, A., Bernstein, G., Courbin, F., Gill, M., Heavens, A., Mandelbaum, R., Massey, R., Moghaddam, B., Rassat, A., Refregier, A., Rhodes, J., Schrabback, T., Shawe-Taylor, J., Shmakova, M., van Waerbeke, L., Wittman, D.

Monthly Notices of the Royal Astronomical Society, 405(3):2044-2061, July 2010 (article)

Abstract
We present the results of the GREAT08 Challenge, a blind analysis challenge to infer weak gravitational lensing shear distortions from images. The primary goal was to stimulate new ideas by presenting the problem to researchers outside the shear measurement community. Six GREAT08 Team methods were presented at the launch of the Challenge and five additional groups submitted results during the 6 month competition. Participants analyzed 30 million simulated galaxies with a range in signal to noise ratio, point-spread function ellipticity, galaxy size, and galaxy type. The large quantity of simulations allowed shear measurement methods to be assessed at a level of accuracy suitable for currently planned future cosmic shear observations for the first time. Different methods perform well in different parts of simulation parameter space and come close to the target level of accuracy in several of these. A number of fresh ideas have emerged as a result of the Challenge including a re-examination of the process of combining information from different galaxies, which reduces the dependence on realistic galaxy modelling. The image simulations will become increasingly sophisticated in future GREAT challenges; meanwhile the GREAT08 simulations remain as a benchmark for additional developments in shear measurement algorithms.

ei

Web DOI [BibTex]

Source Separation and Higher-Order Causal Analysis of MEG and EEG

Zhang, K., Hyvärinen, A.

In Uncertainty in Artificial Intelligence: Proceedings of the Twenty-Sixth Conference (UAI 2010), pages: 709-716, (Editors: Grünwald, P. , P. Spirtes), AUAI Press, Corvallis, OR, USA, 26th Conference on Uncertainty in Artificial Intelligence (UAI), July 2010 (inproceedings)

Abstract
Separation of the sources and analysis of their connectivity have been an important topic in EEG/MEG analysis. To solve this problem in an automatic manner, we propose a two-layer model, in which the sources are conditionally uncorrelated from each other, but not independent; the dependence is caused by the causality in their time-varying variances (envelopes). The model is identified in two steps. We first propose a new source separation technique which takes into account the autocorrelations (which may be time-varying) and time-varying variances of the sources. The causality in the envelopes is then discovered by exploiting a special kind of multivariate GARCH (generalized autoregressive conditional heteroscedasticity) model. The resulting causal diagram gives the effective connectivity between the separated sources; in our experimental results on MEG data, sources with similar functions are grouped together, with negative influences between groups, and the groups are connected via some interesting sources.

ei

PDF Web [BibTex]

Invariant Gaussian Process Latent Variable Models and Application in Causal Discovery

Zhang, K., Schölkopf, B., Janzing, D.

In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pages: 717-724, (Editors: P Grünwald and P Spirtes), AUAI Press, Corvallis, OR, USA, UAI, July 2010 (inproceedings)

Abstract
In nonlinear latent variable models or dynamic models, if we consider the latent variables as confounders (common causes), the noise dependencies imply further relations between the observed variables. Such models are then closely related to causal discovery in the presence of nonlinear confounders, which is a challenging problem. However, generally in such models the observation noise is assumed to be independent across data dimensions, and consequently the noise dependencies are ignored. In this paper we focus on the Gaussian process latent variable model (GPLVM), from which we develop an extended model called invariant GPLVM (IGPLVM), which can adapt to arbitrary noise covariances. With the Gaussian process prior put on a particular transformation of the latent nonlinear functions, instead of the original ones, the algorithm for IGPLVM involves almost the same computational loads as that for the original GPLVM. Besides its potential application in causal discovery, IGPLVM has the advantage that its estimated latent nonlinear manifold is invariant to any nonsingular linear transformation of the data. Experimental results on both synthetic and real-world data show its encouraging performance in nonlinear manifold learning and causal discovery.

ei

PDF Web [BibTex]

Remote Sensing Feature Selection by Kernel Dependence Estimation

Camps-Valls, G., Mooij, J., Schölkopf, B.

IEEE Geoscience and Remote Sensing Letters, 7(3):587-591, July 2010 (article)

Abstract
This letter introduces a nonlinear measure of independence between random variables for remote sensing supervised feature selection. The so-called Hilbert–Schmidt independence criterion (HSIC) is a kernel method for evaluating statistical dependence and it is based on computing the Hilbert–Schmidt norm of the cross-covariance operator of mapped samples in the corresponding Hilbert spaces. The HSIC empirical estimator is easy to compute and has good theoretical and practical properties. Rather than using this estimate for maximizing the dependence between the selected features and the class labels, we propose the more sensitive criterion of minimizing the associated HSIC p-value. Results in multispectral, hyperspectral, and SAR data feature selection for classification show the good performance of the proposed approach.
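
As a rough illustration of the empirical HSIC estimator mentioned in this abstract, the sketch below computes the biased estimate trace(KHLH)/n² for scalar samples. The Gaussian kernel, the bandwidth `sigma`, the 1/n² normalization, and all function names are assumptions made for this example and are not taken from the letter; the p-value criterion the authors propose is not implemented here.

```python
import math


def rbf_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel on scalar samples (assumed kernel)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs] for a in xs]


def center(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + tot for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2 (normalization is an assumption)."""
    n = len(xs)
    Kc = center(rbf_gram(xs, sigma))
    L = rbf_gram(ys, sigma)
    # trace(H K H L) equals the elementwise sum of (H K H) * L for symmetric matrices.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / n ** 2
```

In a feature selection setting one would, for instance, call `hsic(feature_values, labels)` once per candidate feature; a feature that is constant across samples yields a value of zero up to rounding.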

ei

PDF Web DOI [BibTex]

Clustering stability: an overview

von Luxburg, U.

Foundations and Trends in Machine Learning, 2(3):235-274, July 2010 (article)

Abstract
A popular method for selecting the number of clusters is based on stability arguments: one chooses the number of clusters such that the corresponding clustering results are "most stable". In recent years, a series of papers has analyzed the behavior of this method from a theoretical point of view. However, the results are very technical and difficult to interpret for non-experts. In this paper we give a high-level overview about the existing literature on clustering stability. In addition to presenting the results in a slightly informal but accessible way, we relate them to each other and discuss their different implications.

ei

PDF DOI [BibTex]

Multi-Label Learning by Exploiting Label Dependency

Zhang, M., Zhang, K.

In Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2010), pages: 999-1008, (Editors: Rao, B. , B. Krishnapuram, A. Tomkins, Q. Yang), ACM Press, New York, NY, USA, 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), July 2010 (inproceedings)

Abstract
In multi-label learning, each training example is associated with a set of labels and the task is to predict the proper label set for the unseen example. Due to the tremendous (exponential) number of possible label sets, the task of learning from multi-label examples is rather challenging. Therefore, the key to successful multi-label learning is how to effectively exploit correlations between different labels to facilitate the learning process. In this paper, we propose to use a Bayesian network structure to efficiently encode the conditional dependencies of the labels as well as the feature set, with the feature set as the common parent of all labels. To make it practical, we give an approximate yet efficient procedure to find such a network structure. With the help of this network, multi-label learning is decomposed into a series of single-label classification problems, where a classifier is constructed for each label by incorporating its parental labels as additional features. Label sets of unseen examples are predicted recursively according to the label ordering given by the network. Extensive experiments on a broad range of data sets validate the effectiveness of our approach against other well-established methods.

ei

PDF Web DOI [BibTex]

Efficient Filter Flow for Space-Variant Multiframe Blind Deconvolution

Hirsch, M., Sra, S., Schölkopf, B., Harmeling, S.

In Proceedings of the 23rd IEEE Conference on Computer Vision and Pattern Recognition, pages: 607-614, IEEE, Piscataway, NJ, USA, CVPR, June 2010 (inproceedings)

Abstract
Motivated by the goal of facilitating space-variant blind deconvolution, we present a class of linear transformations that are expressive enough for space-variant filters, but at the same time especially designed for efficient matrix-vector multiplications. Successful results on astronomical imaging through atmospheric turbulence and on noisy magnetic resonance images of constantly moving objects demonstrate the practical significance of our approach.

ei

PDF Web DOI [BibTex]

Grasping with Vision Descriptors and Motor Primitives

Kroemer, O., Detry, R., Piater, J., Peters, J.

In Proceedings of the 7th International Conference on Informatics in Control, Automation and Robotics (ICINCO 2010), pages: 47-54, (Editors: Filipe, J. , J. Andrade-Cetto, J.-L. Ferrier), SciTePress , Lisboa, Portugal, 7th International Conference on Informatics in Control, Automation and Robotics (ICINCO), June 2010 (inproceedings)

Abstract
Grasping is one of the most important abilities needed for future service robots. Given the task of picking up an object from between clutter, traditional robotics approaches would determine a suitable grasping point and then use a movement planner to reach the goal. The planner would require precise and accurate information about the environment and long computation times, both of which may not always be available. Therefore, methods for executing grasps are required, which perform well with information gathered from only standard stereo vision, and make only a few necessary assumptions about the task environment. We propose techniques that reactively modify the robot’s learned motor primitives based on information derived from Early Cognitive Vision descriptors. The proposed techniques employ non-parametric potential fields centered on the Early Cognitive Vision descriptors to allow for curving hand trajectories around objects, and finger motions that adapt to the object’s local geometry. The methods were tested on a real robot and found to allow for easier imitation learning of human movements and give a considerable improvement to the robot’s performance in grasping tasks.

ei

PDF Web [BibTex]

An efficient divide-and-conquer cascade for nonlinear object detection

Lampert, CH.

In Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010), pages: 1022-1029, IEEE, Piscataway, NJ, USA, Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010 (inproceedings)

Abstract
We introduce a method to accelerate the evaluation of object detection cascades with the help of a divide-and-conquer procedure in the space of candidate regions. Compared to the exhaustive procedure that thus far is the state-of-the-art for cascade evaluation, the proposed method requires fewer evaluations of the classifier functions, thereby speeding up the search. Furthermore, we show how the recently developed efficient subwindow search (ESS) procedure [11] can be integrated into the last stage of our method. This allows us to use our method to act not only as a faster procedure for cascade evaluation, but also as a tool to perform efficient branch-and-bound object detection with nonlinear quality functions, in particular kernelized support vector machines. Experiments on the PASCAL VOC 2006 dataset show an acceleration of more than 50% by our method compared to standard cascade evaluation.

ei

PDF Web DOI [BibTex]

Non-parametric estimation of integral probability metrics

Sriperumbudur, B., Fukumizu, K., Gretton, A., Schölkopf, B., Lanckriet, G.

In Proceedings of the IEEE International Symposium on Information Theory (ISIT 2010), pages: 1428-1432, IEEE, Piscataway, NJ, USA, IEEE International Symposium on Information Theory (ISIT), June 2010 (inproceedings)

Abstract
In this paper, we develop and analyze a nonparametric method for estimating the class of integral probability metrics (IPMs), examples of which include the Wasserstein distance, Dudley metric, and maximum mean discrepancy (MMD). We show that these distances can be estimated efficiently by solving a linear program in the case of Wasserstein distance and Dudley metric, while MMD is computable in a closed form. All these estimators are shown to be strongly consistent and their convergence rates are analyzed. Based on these results, we show that IPMs are simple to estimate and the estimators exhibit good convergence behavior compared to φ-divergence estimators.
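
To make the closed-form remark about MMD concrete, here is a minimal sketch of a plug-in estimate of the squared MMD between two scalar samples: the mean kernel value within each sample minus twice the mean kernel value across samples. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions for illustration only; the paper's own estimators and their analysis are not reproduced here.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel on scalars (assumed choice of kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, sigma=1.0):
    """Plug-in (biased) estimate of MMD^2: E k(x,x') + E k(y,y') - 2 E k(x,y)."""
    return (
        mean_kernel(xs, xs, sigma)
        + mean_kernel(ys, ys, sigma)
        - 2.0 * mean_kernel(xs, ys, sigma)
    )
```

Identical samples give an estimate of zero, while well-separated samples give a clearly positive value; no optimization problem has to be solved, in contrast to the linear programs mentioned for the Wasserstein distance and Dudley metric.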

ei

PDF Web DOI [BibTex]

Causal Markov condition for submodular information measures

Steudel, B., Janzing, D., Schölkopf, B.

In Proceedings of the 23rd Annual Conference on Learning Theory, pages: 464-476, (Editors: AT Kalai and M Mohri), OmniPress, Madison, WI, USA, COLT, June 2010 (inproceedings)

Abstract
The causal Markov condition (CMC) is a postulate that links observations to causality. It describes the conditional independences among the observations that are entailed by a causal hypothesis in terms of a directed acyclic graph. In the conventional setting, the observations are random variables and the independence is a statistical one, i.e., the information content of observations is measured in terms of Shannon entropy. We formulate a generalized CMC for any kind of observations on which independence is defined via an arbitrary submodular information measure. Recently, this has been discussed for observations in terms of binary strings where information is understood in the sense of Kolmogorov complexity. Our approach enables us to find computable alternatives to Kolmogorov complexity, e.g., the length of a text after applying existing data compression schemes. We show that our CMC is justified if one restricts the attention to a class of causal mechanisms that is adapted to the respective information measure. Our justification is similar to deriving the statistical CMC from functional models of causality, where every variable is a deterministic function of its observed causes and an unobserved noise term. Our experiments on real data demonstrate the performance of compression based causal inference.
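
The idea of using compressed length as a computable stand-in for Kolmogorov complexity can be sketched as follows. This is only an illustration under stated assumptions: the choice of `zlib` as the compressor, the function names, and the formula C(x) + C(y) − C(xy) as a crude estimate of shared information are choices made for this example, not the construction or experiments from the paper.

```python
import random
import zlib


def compressed_length(data: bytes) -> int:
    """Length in bytes after zlib compression (assumed proxy for information content)."""
    return len(zlib.compress(data, 9))


def shared_information(x: bytes, y: bytes) -> int:
    """Crude estimate of information shared by x and y: C(x) + C(y) - C(xy)."""
    return compressed_length(x) + compressed_length(y) - compressed_length(x + y)


# Hypothetical demo data: two independent pseudo-random byte strings.
rng = random.Random(0)
x = bytes(rng.randrange(256) for _ in range(500))
y = bytes(rng.randrange(256) for _ in range(500))

print(shared_information(x, x))  # large: the second copy is almost free to encode
print(shared_information(x, y))  # small: unrelated strings share little
```

Under this proxy, two observations count as approximately independent when their estimated shared information is close to zero.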

ei

PDF Web [BibTex]

UDP Communication channel design of master-slave robot system

Hong, A., Cho, JH., Wang, H., Lee, DY.

In 2010 KSME Conference, pages: 231-232, June 2010 (inproceedings)

ei

[BibTex]

Justifying Additive Noise Model-Based Causal Discovery via Algorithmic Information Theory

Janzing, D., Steudel, B.

Open Systems and Information Dynamics, 17(2):189-212, June 2010 (article)

Abstract
A recent method for causal discovery is in many cases able to infer whether X causes Y or Y causes X for just two observed variables X and Y. It is based on the observation that there exist (non-Gaussian) joint distributions P(X,Y) for which Y may be written as a function of X up to an additive noise term that is independent of X and no such model exists from Y to X. Whenever this is the case, one prefers the causal model X → Y. Here we justify this method by showing that the causal hypothesis Y → X is unlikely because it requires a specific tuning between P(Y) and P(X|Y) to generate a distribution that admits an additive noise model from X to Y. To quantify the amount of tuning needed, we derive lower bounds on the algorithmic information shared by P(Y) and P(X|Y). This way, our justification is consistent with recent approaches for using algorithmic information theory for causal reasoning. We extend this principle to the case where P(X,Y) almost admits an additive noise model. Our results suggest that the above conclusion is more reliable if the complexity of P(Y) is high.

ei

PDF Web DOI [BibTex]


Telling cause from effect based on high-dimensional observations

Janzing, D., Hoyer, P., Schölkopf, B.

In Proceedings of the 27th International Conference on Machine Learning, pages: 479-486, (Editors: J Fürnkranz and T Joachims), International Machine Learning Society, Madison, WI, USA, ICML, June 2010 (inproceedings)

Abstract
We describe a method for inferring linear causal relations among multi-dimensional variables. The idea is to use an asymmetry between the distributions of cause and effect that occurs if the covariance matrix of the cause and the structure matrix mapping the cause to the effect are independently chosen. The method applies to both stochastic and deterministic causal relations, provided that the dimensionality is sufficiently high (in some experiments, 5 was enough). It is applicable to Gaussian as well as non-Gaussian data.

ei

PDF Web [BibTex]

Dynamic Dissimilarity Measure for Support-Based Clustering

Lee, D., Lee, J.

IEEE Transactions on Knowledge and Data Engineering, 22(6):900-905, June 2010 (article)

Abstract
Clustering methods utilizing support estimates of a data distribution have recently attracted much attention because of their ability to generate cluster boundaries of arbitrary shape and to deal with outliers efficiently. In this paper, we propose a novel dissimilarity measure based on a dynamical system associated with support estimating functions. Theoretical foundations of the proposed measure are developed and applied to construct a clustering method that can effectively partition the whole data space. Simulation results demonstrate that clustering based on the proposed dissimilarity measure is robust to the choice of kernel parameters and able to control the number of clusters efficiently.

ei

Web DOI [BibTex]

Sparse Spectrum Gaussian Process Regression

Lázaro-Gredilla, M., Quiñonero-Candela, J., Rasmussen, CE., Figueiras-Vidal, AR.

Journal of Machine Learning Research, 11, pages: 1865-1881, June 2010 (article)

Abstract
We present a new sparse Gaussian Process (GP) model for regression. The key novel idea is to sparsify the spectral representation of the GP. This leads to a simple, practical algorithm for regression tasks. We compare the achievable trade-offs between predictive accuracy and computational requirements, and show that these are typically superior to existing state-of-the-art sparse approximations. We discuss both the weight space and function space representations, and note that the new construction implies priors over functions which are always stationary, and can approximate any covariance function in this class.

ei

PDF [BibTex]

A scalable trust-region algorithm with application to mixed-norm regression

Kim, D., Sra, S., Dhillon, I.

In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pages: 519-526, (Editors: Fürnkranz, J. , T. Joachims), International Machine Learning Society, Madison, WI, USA, 27th International Conference on Machine Learning (ICML), June 2010 (inproceedings)

Abstract
We present a new algorithm for minimizing a convex loss-function subject to regularization. Our framework applies to numerous problems in machine learning and statistics; notably, for sparsity-promoting regularizers such as ℓ1 or ℓ1,∞ norms, it enables efficient computation of sparse solutions. Our approach is based on the trust-region framework with nonsmooth objectives, which allows us to build on known results to provide convergence analysis. We avoid the computational overheads associated with the conventional Hessian approximation used by trust-region methods by instead using a simple separable quadratic approximation. This approximation also enables use of proximity operators for tackling nonsmooth regularizers. We illustrate the versatility of our resulting algorithm by specializing it to three mixed-norm regression problems: group lasso [36], group logistic regression [21], and multi-task lasso [19]. We experiment with both synthetic and real-world large-scale data; our method is seen to be competitive, robust, and scalable.

ei

PDF Web [BibTex]

The Influence of the Image Basis on Modeling and Steganalysis Performance

Schwamberger, V., Le, P., Schölkopf, B., Franz, M.

In Information Hiding, pages: 133-144, (Editors: R Böhme and PWL Fong and R Safavi-Naini), Springer, Berlin, Germany, 12th international Workshop (IH), June 2010 (inproceedings)

Abstract
We compare two image bases with respect to their capabilities for image modeling and steganalysis. The first basis consists of wavelets, the second is a Laplacian pyramid. Both bases are used to decompose the image into subbands where the local dependency structure is modeled with a linear Bayesian estimator. Similar to existing approaches, the image model is used to predict coefficient values from their neighborhoods, and the final classification step uses statistical descriptors of the residual. Our findings are counter-intuitive at first sight: although Laplacian pyramids have better image modeling capabilities than wavelets, steganalysis based on wavelets is much more successful. We present a number of experiments that suggest possible explanations for this result.

ei

PDF Web DOI [BibTex]

Unsupervised Object Discovery: A Comparison

Tuytelaars, T., Lampert, CH., Blaschko, MB., Buntine, W.

International Journal of Computer Vision, 88(2):284-302, June 2010 (article)

Abstract
The goal of this paper is to evaluate and compare models and methods for learning to recognize basic entities in images in an unsupervised setting. In other words, we want to discover the objects present in the images by analyzing unlabeled data and searching for re-occurring patterns. We experiment with various baseline methods, methods based on latent variable models, as well as spectral clustering methods. The results are presented and compared both on subsets of Caltech256 and MSRC2, data sets that are larger and more challenging and that include more object classes than what has previously been reported in the literature. A rigorous framework for evaluating unsupervised object discovery methods is proposed.

ei

PDF DOI [BibTex]

A PAC-Bayesian Analysis of Co-clustering, Graph Clustering, and Pairwise Clustering

Seldin, Y.

In ICML 2010 Workshop on Social Analytics: Learning from human interactions, pages: 1-5, ICML Workshop on Social Analytics: Learning from human interactions, June 2010 (inproceedings)

Abstract
We briefly review the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008, 2009, 2010), which provided generalization guarantees and regularization terms absent in the preceding formulations of this problem and achieved state-of-the-art prediction results in the MovieLens collaborative filtering task. Inspired by this analysis we formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. Following the lines of (Seldin and Tishby, 2010) we derive PAC-Bayesian generalization bounds for graph clustering. The bounds show that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering a more accurate way to deal with finite sample issues.

ei

PDF Web [BibTex]

How to Explain Individual Classification Decisions

Baehrens, D., Schroeter, T., Harmeling, S., Kawanabe, M., Hansen, K., Müller, K.

Journal of Machine Learning Research, 11, pages: 1803-1831, June 2010 (article)

Abstract
After building a classifier with modern tools of machine learning we typically have a black box at hand that is able to predict well for unseen data. Thus, we get an answer to the question of what the most likely label of a given unseen data point is. However, most methods will provide no answer as to why the model predicted a particular label for a single instance and what features were most influential for that particular instance. The only method currently able to provide such explanations is the decision tree. This paper proposes a procedure which (based on a set of assumptions) allows one to explain the decisions of any classification method.

ei

PDF PDF [BibTex]

Single-Image Super-Resolution Using Sparse Regression and Natural Image Prior

Kim, K., Kwon, Y.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6):1127-1133, June 2010 (article)

Abstract
This paper proposes a framework for single-image super-resolution. The underlying idea is to learn a map from input low-resolution images to target high-resolution images based on example pairs of input and output images. Kernel ridge regression (KRR) is adopted for this purpose. To reduce the time complexity of training and testing for KRR, a sparse solution is found by combining the ideas of kernel matching pursuit and gradient descent. As a regularized solution, KRR leads to a better generalization than simply storing the examples as has been done in existing example-based algorithms and results in much less noisy images. However, this may introduce blurring and ringing artifacts around major edges as sharp changes are penalized severely. A prior model of a generic image class which takes into account the discontinuity property of images is adopted to resolve this problem. Comparison with existing algorithms shows the effectiveness of the proposed method.

ei

Web DOI [BibTex]

Imitation and Reinforcement Learning

Kober, J., Peters, J.

IEEE Robotics and Automation Magazine, 17(2):55-62, June 2010 (article)

Abstract
In this article, we present both novel learning algorithms and experiments using the dynamical system MPs. As such, we describe this MP representation in a way that it is straightforward to reproduce. We review an appropriate imitation learning method, i.e., locally weighted regression, and show how this method can be used both for initializing RL tasks as well as for modifying the start-up phase in a rhythmic task. We also show our current best-suited RL algorithm for this framework, i.e., PoWER. We present two complex motor tasks, i.e., ball-in-a-cup and ball paddling, learned on a real, physical Barrett WAM, using the methods presented in this article. Of particular interest is the ball-paddling application, as it requires a combination of both rhythmic and discrete dynamical systems MPs during the start-up phase to achieve a particular task.

ei

PDF Web DOI [BibTex]

Reinforcement learning of motor skills in high dimensions: A path integral approach

Theodorou, E., Buchli, J., Schaal, S.

In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages: 2397-2403, May 2010, clmc (inproceedings)

Abstract
Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far due to the computational difficulties that reinforcement learning encounters in high dimensional continuous state-action spaces. In this paper, we derive a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals. While solidly grounded in optimal control theory and estimation theory, the update equations for learning are surprisingly simple and have no danger of numerical instabilities as neither matrix inversions nor gradient learning rates are required. Empirical evaluations demonstrate significant performance improvements over gradient-based policy learning and scalability to high-dimensional control problems. Finally, a learning experiment on a robot dog illustrates the functionality of our algorithm in a real-world scenario. We believe that our new algorithm, Policy Improvement with Path Integrals (PI²), offers currently one of the most efficient, numerically robust, and easy to implement algorithms for RL in robotics.

am

link (url) [BibTex]

Inverse dynamics control of floating base systems using orthogonal decomposition

Mistry, M., Buchli, J., Schaal, S.

In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages: 3406-3412, May 2010, clmc (inproceedings)

Abstract
Model-based control methods can be used to enable fast, dexterous, and compliant motion of robots without sacrificing control accuracy. However, implementing such techniques on floating base robots, e.g., humanoids and legged systems, is non-trivial due to under-actuation, dynamically changing constraints from the environment, and potentially closed loop kinematics. In this paper, we show how to compute the analytically correct inverse dynamics torques for model-based control of sufficiently constrained floating base rigid-body systems, such as humanoid robots with one or two feet in contact with the environment. While our previous inverse dynamics approach relied on an estimation of contact forces to compute an approximate inverse dynamics solution, here we present an analytically correct solution by using an orthogonal decomposition to project the robot dynamics onto a reduced dimensional space, independent of contact forces. We demonstrate the feasibility and robustness of our approach on a simulated floating base bipedal humanoid robot and an actual robot dog locomoting over rough terrain.

am

link (url) [BibTex]

Fast, robust quadruped locomotion over challenging terrain

Kalakrishnan, M., Buchli, J., Pastor, P., Mistry, M., Schaal, S.

In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages: 2665-2670, May 2010, clmc (inproceedings)

Abstract
We present a control architecture for fast quadruped locomotion over rough terrain. We approach the problem by decomposing it into many sub-systems, in which we apply state-of-the-art learning, planning, optimization and control techniques to achieve robust, fast locomotion. Unique features of our control strategy include: (1) a system that learns optimal foothold choices from expert demonstration using terrain templates, (2) a body trajectory optimizer based on the Zero-Moment Point (ZMP) stability criterion, and (3) a floating-base inverse dynamics controller that, in conjunction with force control, allows for robust, compliant locomotion over unperceived obstacles. We evaluate the performance of our controller by testing it on the LittleDog quadruped robot, over a wide variety of rough terrain of varying difficulty levels. We demonstrate the generalization ability of this controller by presenting test results from an independent external test team on terrains that have never been shown to us.

am

link (url) [BibTex]

Apprenticeship learning via soft local homomorphisms

Boularias, A., Chaib-Draa, B.

In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010), pages: 2971-2976, IEEE, Piscataway, NJ, USA, 2010 IEEE International Conference on Robotics and Automation (ICRA), May 2010 (inproceedings)

Abstract
We consider the problem of apprenticeship learning when the expert's demonstration covers only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient solution to this problem based on the assumption that the expert acts optimally in a Markov Decision Process (MDP). However, past work on IRL requires an accurate estimate of the frequency of encountering each feature of the states when the robot follows the expert's policy. Given that the complete policy of the expert is unknown, the feature frequencies can only be empirically estimated from the demonstrated trajectories. In this paper, we propose to use a transfer method, known as soft homomorphism, in order to generalize the expert's policy to unvisited regions of the state space. The generalized policy can be used either as the robot's final policy, or to calculate the feature frequencies within an IRL algorithm. Empirical results show that our approach is able to learn good policies from a small number of demonstrations.

ei

PDF Web DOI [BibTex]

Diffusion Tensor Imaging in a Human PET/MR Hybrid System

Boss, A., Kolb, A., Hofmann, M., Bisdas, S., Nägele, T., Ernemann, U., Stegger, L., Rossi, C., Schlemmer, H., Pfannenberg, C., Reimold, M., Claussen, C., Pichler, B., Klose, U.

Investigative Radiology, 45(5):270-274, May 2010 (article)

ei

Web DOI [BibTex]

A Bayesian Framework to Account for Complex Non-Genetic Factors in Gene Expression Levels Greatly Increases Power in eQTL Studies

Stegle, O., Parts, L., Durbin, R., Winn, JM.

PLoS Computational Biology, 6(5):1-11, May 2010 (article)

ei

PDF Web DOI [BibTex]

Estimation of a Structural Vector Autoregression Model Using Non-Gaussianity

Hyvärinen, A., Zhang, K., Shimizu, S., Hoyer, P.

Journal of Machine Learning Research, 11, pages: 1709-1731, May 2010 (article)

Abstract
Analysis of causal effects between continuous-valued variables typically uses either autoregressive models or structural equation models with instantaneous effects. Estimation of Gaussian, linear structural equation models poses serious identifiability problems, which is why it was recently proposed to use non-Gaussian models. Here, we show how to combine the non-Gaussian instantaneous model with autoregressive models. This is effectively what is called a structural vector autoregression (SVAR) model, and thus our work contributes to the long-standing problem of how to estimate SVARs. We show that such a non-Gaussian model is identifiable without prior knowledge of network structure. We propose computationally efficient methods for estimating the model, as well as methods to assess the significance of the causal influences. The model is successfully applied on financial and brain imaging data.

ei

PDF Web [BibTex]

A Robust Bayesian Two-Sample Test for Detecting Intervals of Differential Gene Expression in Microarray Time Series

Stegle, O., Denby, KJ., Cooke, EJ., Wild, DL., Ghahramani, Z., Borgwardt, KM.

Journal of Computational Biology, 17(3):355-367, May 2010 (article)

Abstract
Understanding the regulatory mechanisms that are responsible for an organism's response to environmental change is an important issue in molecular biology. A first and important step towards this goal is to detect genes whose expression levels are affected by altered external conditions. A range of methods to test for differential gene expression, both in static as well as in time-course experiments, have been proposed. While these tests answer the question of whether a gene is differentially expressed, they do not explicitly address the question of when a gene is differentially expressed, although this information may provide insights into the course and causal structure of regulatory programs. In this article, we propose a two-sample test for identifying intervals of differential gene expression in microarray time series. Our approach is based on Gaussian process regression, can deal with arbitrary numbers of replicates, and is robust with respect to outliers. We apply our algorithm to study the response of Arabidopsis thaliana genes to an infection by a fungal pathogen using a microarray time series dataset covering 30,336 gene probes at 24 observed time points. In classification experiments, our test compares favorably with existing methods and provides additional insights into time-dependent differential expression.

ei

Web DOI [BibTex]

Statistical Tests for Detecting Differential RNA-Transcript Expression from Read Counts

Stegle, O., Drewe, P., Bohnert, R., Borgwardt, K., Rätsch, G.

Nature Precedings, 2010, pages: 1-11, May 2010 (article)

Abstract
As a fruit of the current revolution in sequencing technology, transcriptomes can now be analyzed at an unprecedented level of detail. These advances have been exploited for detecting differentially expressed genes across biological samples and for quantifying the abundances of various RNA transcripts within one gene. However, explicit strategies for detecting the hidden differential abundances of RNA transcripts in biological samples have not been defined. In this work, we present two novel statistical tests to address this issue: a "gene structure sensitive" Poisson test for detecting differential expression when the transcript structure of the gene is known, and a kernel-based test called Maximum Mean Discrepancy when it is unknown. We analyzed the proposed approaches on simulated read data for two artificial samples as well as on factual reads generated by the Illumina Genome Analyzer for two C. elegans samples. Our analysis shows that the Poisson test identifies genes with differential transcript expression considerably better than previously proposed RNA transcript quantification approaches for this task. The MMD test is able to detect a large fraction (75%) of such differential cases without the knowledge of the annotated transcripts. It is therefore well-suited to analyze RNA-Seq experiments when the genome annotations are incomplete or not available, where other approaches have to fail.
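To make the kernel-based statistic named above more concrete, here is a minimal sketch of a biased empirical estimate of the squared Maximum Mean Discrepancy between two samples. The Gaussian kernel, its width, the toy "read profile" vectors and all function names are assumptions for illustration only. The abstract does not specify how reads are featurised or how significance is assessed, so neither is attempted here.

```python
import math

def rbf(a, b, width=1.0):
    """Gaussian kernel between two equal-length feature vectors."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2.0 * width ** 2))

def mean_kernel(xs, ys, width):
    """Average kernel value over all pairs drawn from xs and ys."""
    return sum(rbf(x, y, width) for x in xs for y in ys) / (len(xs) * len(ys))

def mmd_squared(sample_p, sample_q, width=1.0):
    """Biased estimate of MMD^2: E k(p,p') + E k(q,q') - 2 E k(p,q)."""
    return (mean_kernel(sample_p, sample_p, width)
            + mean_kernel(sample_q, sample_q, width)
            - 2.0 * mean_kernel(sample_p, sample_q, width))

# Hypothetical per-replicate profiles: fraction of reads falling in two regions of a gene.
condition_a = [[0.5, 0.5], [0.6, 0.4], [0.4, 0.6]]
condition_b = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]]
statistic = mmd_squared(condition_a, condition_b)
```

The statistic is zero when both samples coincide and grows as the two sets of profiles separate. Turning it into a test would additionally require a null distribution, for example from permuting sample labels, which this sketch leaves out.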

ei

PDF DOI [BibTex]

Using Model Knowledge for Learning Inverse Dynamics

Nguyen-Tuong, D., Peters, J.

In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010), pages: 2677-2682, IEEE, Piscataway, NJ, USA, 2010 IEEE International Conference on Robotics and Automation (ICRA), May 2010 (inproceedings)

Abstract
In recent years, learning models from data has become an increasingly interesting tool for robotics, as it allows straightforward and accurate model approximation. However, in most robot learning approaches, the model is learned from scratch disregarding all prior knowledge about the system. For many complex robot systems, available prior knowledge from advanced physics-based modeling techniques can entail valuable information for model learning that may result in faster learning speed, higher accuracy and better generalization. In this paper, we investigate how parametric physical models (e.g., obtained from rigid body dynamics) can be used to improve the learning performance, and, especially, how semiparametric regression methods can be applied in this context. We present two possible semiparametric regression approaches, where the knowledge of the physical model can either become part of the mean function or of the kernel in a nonparametric Gaussian process regression. We compare the learning performance of these methods first on sampled data and, subsequently, apply the obtained inverse dynamics models in tracking control on a real Barrett WAM. The results show that the semiparametric models learned with rigid body dynamics as prior outperform the standard rigid body dynamics models on real data while generalizing better for unknown parts of the state space.

ei

PDF Web DOI [BibTex]

Coherent Inference on Optimal Play in Game Trees

Hennig, P., Stern, D., Graepel, T.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 326-333, (Editors: Teh, Y.W., M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Round-based games are an instance of discrete planning problems. Some of the best contemporary game tree search algorithms use random roll-outs as data. Relying on a good policy, they learn on-policy values by propagating information upwards in the tree, but not between sibling nodes. Here, we present a generative model and a corresponding approximate message passing scheme for inference on the optimal, off-policy value of nodes in smooth AND/OR trees, given random roll-outs. The crucial insight is that the distribution of values in game trees is not completely arbitrary. We define a generative model of the on-policy values using a latent score for each state, representing the value under the random roll-out policy. Inference on the values under the optimal policy separates into an inductive, pre-data step and a deductive, post-data part. Both can be solved approximately with Expectation Propagation, allowing off-policy value inference for any node in the (exponentially big) tree in linear time.

ei pn

PDF Web [BibTex]

Incremental Sparsification for Real-time Online Model Learning

Nguyen-Tuong, D., Peters, J.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 557-564, (Editors: Teh, Y.W., M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Online model learning in real-time is required by many applications such as in robot tracking control. It poses a difficult problem, as fast and incremental online regression with large data sets is the essential component which cannot be achieved by straightforward usage of off-the-shelf machine learning methods (such as Gaussian process regression or support vector regression). In this paper, we propose a framework for online, incremental sparsification with a fixed budget designed for large scale real-time model learning. The proposed approach combines a sparsification method based on an independence measure with a large scale database. In combination with an incremental learning approach such as sequential support vector regression, we obtain a regression method which is applicable in real-time online learning. It exhibits competitive learning accuracy when compared with standard regression techniques. Implementation on a real robot emphasizes the applicability of the proposed approach in real-time online model learning for real world systems.

ei

PDF Web [BibTex]

Parameter-exploring policy gradients

Sehnke, F., Osendorfer, C., Rückstiess, T., Graves, A., Peters, J., Schmidhuber, J.

Neural Networks, 21(4):551-559, May 2010 (article)

Abstract
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in parameter space, which leads to lower variance gradient estimates than obtained by regular policy gradient methods. We show that for several complex control tasks, including robust standing with a humanoid robot, this method outperforms well-known algorithms from the fields of standard policy gradients, finite difference methods and population based heuristics. We also show that the improvement is largest when the parameter samples are drawn symmetrically. Lastly we analyse the importance of the individual components of our method by incrementally incorporating them into the other algorithms, and measuring the gain in performance after each step.
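The idea of sampling directly in parameter space with symmetric perturbations can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: the quadratic toy return, the fixed exploration width `sigma`, the step size and all names are hypothetical, and any adaptation of the exploration distribution is omitted.

```python
import random

def episode_return(params, target=(1.0, -2.0)):
    """Toy episodic return (hypothetical): higher is better, maximal at `target`."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def symmetric_parameter_search(num_updates=500, step=0.2, sigma=0.5, seed=0):
    """Update the mean of a Gaussian over policy parameters from symmetric sample pairs."""
    rng = random.Random(seed)
    mean = [0.0, 0.0]
    for _ in range(num_updates):
        # Draw one perturbation and evaluate it in both directions around the mean.
        noise = [rng.gauss(0.0, sigma) for _ in mean]
        plus = episode_return([m + e for m, e in zip(mean, noise)])
        minus = episode_return([m - e for m, e in zip(mean, noise)])
        # Move the mean along the perturbation, scaled by the difference in returns.
        mean = [m + step * e * (plus - minus) / 2.0 for m, e in zip(mean, noise)]
    return mean

learned_mean = symmetric_parameter_search()
```

Because each pair shares one perturbation, the return difference acts as its own baseline, which is one intuition for why symmetric samples can reduce the variance of the update.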

ei

PDF PDF DOI [BibTex]

Multitask Learning for Brain-Computer Interfaces

Alamgir, M., Grosse-Wentrup, M., Altun, Y.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 17-24, (Editors: Teh, Y.W., M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Brain-computer interfaces (BCIs) are limited in their applicability in everyday settings by the current necessity to record subject-specific calibration data prior to actual use of the BCI for communication. In this paper, we utilize the framework of multitask learning to construct a BCI that can be used without any subject-specific calibration process. We discuss how this out-of-the-box BCI can be further improved in a computationally efficient manner as subject-specific data becomes available. The feasibility of the approach is demonstrated on two sets of experimental EEG data recorded during a standard two-class motor imagery paradigm from a total of 19 healthy subjects. Specifically, we show that satisfactory classification results can be achieved with zero training data, and combining prior recordings with subject-specific calibration data substantially outperforms using subject-specific data only. Our results further show that transfer between recordings under slightly different experimental setups is feasible.

ei

PDF Web [BibTex]

Identifying Cause and Effect on Discrete Data using Additive Noise Models

Peters, J., Janzing, D., Schölkopf, B.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 597-604, (Editors: YW Teh and M Titterington), JMLR, Cambridge, MA, USA, 13th International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. Whenever the joint distribution $P(X, Y)$ admits such a model in one direction, e.g. $Y = f(X) + N$ with $N \perp\!\!\!\perp X$, it does not admit the reversed model $X = g(Y) + \tilde{N}$ with $\tilde{N} \perp\!\!\!\perp Y$, as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. We show that this algorithm works both on synthetic and real data sets.

ei

PDF Web [BibTex]

Temporal Kernel CCA and its Application in Multimodal Neuronal Data Analysis

Biessmann, F., Meinecke, F., Gretton, A., Rauch, A., Rainer, G., Logothetis, N., Müller, K.

Machine Learning, 79(1-2):5-27, May 2010 (article)

Abstract
Data recorded from multiple sources sometimes exhibit non-instantaneous couplings. For simple data sets, cross-correlograms may reveal the coupling dynamics. But when dealing with high-dimensional multivariate data there is no such measure as the cross-correlogram. We propose a simple algorithm based on Kernel Canonical Correlation Analysis (kCCA) that computes a multivariate temporal filter which links one data modality to another one. The filters can be used to compute a multivariate extension of the cross-correlogram, the canonical correlogram, between data sources that have different dimensionalities and temporal resolutions. The canonical correlogram reflects the coupling dynamics between the two sources. The temporal filter reveals which features in the data give rise to these couplings and when they do so. We present results from simulations and neuroscientific experiments showing that tkCCA yields easily interpretable temporal filters and correlograms. In the experiments, we simultaneously performed electrode recordings and functional magnetic resonance imaging (fMRI) in primary visual cortex of the non-human primate. While electrode recordings reflect brain activity directly, fMRI provides only an indirect view of neural activity via the Blood Oxygen Level Dependent (BOLD) response. Thus it is crucial for our understanding and the interpretation of fMRI signals in general to relate them to direct measures of neural activity acquired with electrodes. The results computed by tkCCA confirm recent models of the hemodynamic response to neural activity and allow for a more detailed analysis of neurovascular coupling dynamics.

ei

PDF PDF DOI [BibTex]

Estimating predictive stimulus features from psychophysical data: The decision image technique applied to human faces

Macke, J., Wichmann, F.

Journal of Vision, 10(5:22):1-24, May 2010 (article)

Abstract
One major challenge in the sensory sciences is to identify the stimulus features on which sensory systems base their computations, and which are predictive of a behavioral decision: they are a prerequisite for computational models of perception. We describe a technique (decision images) for extracting predictive stimulus features using logistic regression. A decision image not only defines a region of interest within a stimulus but is a quantitative template which defines a direction in stimulus space. Decision images thus enable the development of predictive models, as well as the generation of optimized stimuli for subsequent psychophysical investigations. Here we describe our method and apply it to data from a human face classification experiment. We show that decision images are able to predict human responses not only in terms of overall percent correct but also in terms of the probabilities with which individual faces are (mis-) classified by individual observers. We show that the most predictive dimension for gender categorization is neither aligned with the axis defined by the two class-means, nor with the first principal component of all faces, two hypotheses frequently entertained in the literature. Our method can be applied to a wide range of binary classification tasks in vision or other psychophysical contexts.
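A decision image in the sense described above can be thought of as the weight vector of a logistic regression from stimulus pixels to an observer's binary responses. The sketch below shows that idea on a tiny made-up data set. The four-pixel stimuli, the L2 penalty, the plain gradient ascent and all names are assumptions, not details taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_decision_image(stimuli, responses, steps=2000, rate=0.5, penalty=0.01):
    """Fit logistic regression by gradient ascent; the weights form the template."""
    n = len(stimuli)
    n_pixels = len(stimuli[0])
    weights = [0.0] * n_pixels
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_pixels
        grad_b = 0.0
        for pixels, label in zip(stimuli, responses):
            activation = bias + sum(w * p for w, p in zip(weights, pixels))
            error = label - sigmoid(activation)  # gradient of the log-likelihood
            grad_b += error
            for k, p in enumerate(pixels):
                grad_w[k] += error * p
        # Averaged gradient step with an L2 penalty on the weights only.
        weights = [w + rate * (g / n - penalty * w) for w, g in zip(weights, grad_w)]
        bias += rate * grad_b / n
    return weights, bias

# Hypothetical data: the simulated observer responds 1 exactly when pixel 0 is on.
stimuli = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0],
           [0, 0, 0, 1], [1, 0, 0, 1], [0, 1, 0, 1]]
responses = [1, 1, 0, 0, 1, 0]
template, offset = fit_decision_image(stimuli, responses)
```

In this toy case the largest entry of `template` falls on the pixel that drives the responses. The signed weights give a direction in stimulus space rather than only a mask of relevant pixels.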

ei

Web DOI [BibTex]


Semi-supervised Learning via Generalized Maximum Entropy

Erkan, A., Altun, Y.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 209-216, (Editors: Teh, Y.W., M. Titterington), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Various supervised inference methods can be analyzed as convex duals of the generalized maximum entropy (MaxEnt) framework. Generalized MaxEnt aims to find a distribution that maximizes an entropy function while respecting prior information represented as potential functions in miscellaneous forms of constraints and/or penalties. We extend this framework to semi-supervised learning by incorporating unlabeled data via modifications to these potential functions reflecting structural assumptions on the data geometry. The proposed approach leads to a family of discriminative semi-supervised algorithms, that are convex, scalable, inherently multi-class, easy to implement, and that can be kernelized naturally. Experimental evaluation of special cases shows the competitiveness of our methodology.

ei

PDF Web [BibTex]

A New Algorithm for Improving the Resolution of Cryo-EM Density Maps

Hirsch, M., Schölkopf, B., Habeck, M.

In Research in Computational Molecular Biology, Lecture Notes in Bioinformatics, Vol. 6044, pages: 174-188, (Editors: B Berger), Springer, Berlin, Germany, 14th International Conference on Research in Computational Molecular Biology (RECOMB), May 2010 (inproceedings)

Abstract
Cryo-electron microscopy (cryo-EM) plays an increasingly prominent role in structure elucidation of macromolecular assemblies. Advances in experimental instrumentation and computational power have spawned numerous cryo-EM studies of large biomolecular complexes resulting in the reconstruction of three-dimensional density maps at intermediate and low resolution. In this resolution range, identification and interpretation of structural elements and modeling of biomolecular structure with atomic detail becomes problematic. In this paper, we present a novel algorithm that enhances the resolution of intermediate- and low-resolution density maps. Our underlying assumption is to model the low-resolution density map as a blurred and possibly noise-corrupted version of an unknown high-resolution map that we seek to recover by deconvolution. By exploiting the nonnegativity of both the high-resolution map and blur kernel we derive multiplicative updates reminiscent of those used in nonnegative matrix factorization. Our framework allows for easy incorporation of additional prior knowledge such as smoothness and sparseness, on both the sharpened density map and the blur kernel. A probabilistic formulation enables us to derive updates for the hyperparameters, therefore our approach has no parameter that needs adjustment. We apply the algorithm to simulated three-dimensional electron microscopic data. We show that our method provides better resolved density maps when compared with B-factor sharpening, especially in the presence of noise. Moreover, our method can use additional information provided by homologous structures, which helps to improve the resolution even further.
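The flavour of a multiplicative, nonnegativity-preserving deconvolution update can be illustrated in one dimension. The sketch below uses the standard least-squares multiplicative update known from nonnegative matrix factorization with a fixed, known, symmetric blur kernel. This is a deliberate simplification and an assumption: it does not estimate the blur kernel or any hyperparameters, it uses no priors, and the signal, kernel and function names are invented for illustration.

```python
def blur(signal, kernel):
    """Apply a symmetric, odd-length blur kernel with zero padding at the borders."""
    half = len(kernel) // 2
    blurred = []
    for i in range(len(signal)):
        total = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                total += weight * signal[idx]
        blurred.append(total)
    return blurred

def residual(estimate, observed, kernel):
    """Squared error between the re-blurred estimate and the observation."""
    return sum((b - o) ** 2 for b, o in zip(blur(estimate, kernel), observed))

def multiplicative_deconvolve(observed, kernel, iterations=200, eps=1e-12):
    """Iterate x <- x * (A^T y) / (A^T A x); A^T equals A for a symmetric kernel."""
    estimate = [1.0] * len(observed)
    numerator = blur(observed, kernel)
    for _ in range(iterations):
        denominator = blur(blur(estimate, kernel), kernel)
        estimate = [x * n / (d + eps)
                    for x, n, d in zip(estimate, numerator, denominator)]
    return estimate

kernel = [0.25, 0.5, 0.25]                               # hypothetical blur kernel
true_map = [0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]      # hypothetical sharp signal
observed = blur(true_map, kernel)
restored = multiplicative_deconvolve(observed, kernel)
```

Since every factor in the update is nonnegative, an estimate that starts positive can never become negative, so the constraint is enforced without any projection step.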

ei

Web DOI [BibTex]

Movement Templates for Learning of Hitting and Batting

Kober, J., Mülling, K., Krömer, O., Lampert, C., Schölkopf, B., Peters, J.

In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA 2010), pages: 853-858, IEEE, Piscataway, NJ, USA, 2010 IEEE International Conference on Robotics and Automation (ICRA), May 2010 (inproceedings)

ei

PDF Web DOI [BibTex]

Animal detection in natural scenes: Critical features revisited

Wichmann, F., Drewes, J., Rosas, P., Gegenfurtner, K.

Journal of Vision, 10(4):1-27, April 2010 (article)

Abstract
S. J. Thorpe, D. Fize, and C. Marlot (1996) showed how rapidly observers can detect animals in images of natural scenes, but it is still unclear which image features support this rapid detection. A. B. Torralba and A. Oliva (2003) suggested that a simple image statistic based on the power spectrum allows the absence or presence of objects in natural scenes to be predicted. We tested whether human observers make use of power spectral differences between image categories when detecting animals in natural scenes. In Experiments 1 and 2 we found performance to be essentially independent of the power spectrum. Computational analysis revealed that the ease of classification correlates with the proposed spectral cue without being caused by it. This result is consistent with the hypothesis that in commercial stock photo databases a majority of animal images are pre-segmented from the background by the photographers and this pre-segmentation causes the power spectral differences between image categories and may, furthermore, help rapid animal detection. Data from a third experiment are consistent with this hypothesis. Together, our results make it exceedingly unlikely that human observers make use of power spectral differences between animal- and no-animal images during rapid animal detection. In addition, our results point to potential confounds in the commercially available “natural image” databases whose statistics may be less natural than commonly presumed.

ei

Web DOI [BibTex]

A generative model approach for decoding in the visual event-related potential-based brain-computer interface speller

Martens, SMM., Leiva, JM.

Journal of Neural Engineering, 7(2):1-10, April 2010 (article)

Abstract
There is a strong tendency towards discriminative approaches in brain-computer interface (BCI) research. We argue that generative model-based approaches are worth pursuing and propose a simple generative model for the visual ERP-based BCI speller which incorporates prior knowledge about the brain signals. We show that the proposed generative method needs less training data to reach a given letter prediction performance than the state-of-the-art discriminative approaches.

ei

PDF PDF DOI [BibTex]

Hilbert Space Embeddings and Metrics on Probability Measures

Sriperumbudur, B., Gretton, A., Fukumizu, K., Schölkopf, B., Lanckriet, G.

Journal of Machine Learning Research, 11, pages: 1517-1561, April 2010 (article)

ei

PDF [BibTex]
