2009


Inferring textual entailment with a probabilistically sound calculus

Harmeling, S.

Natural Language Engineering, 15(4):459-477, October 2009 (article)

Abstract
We introduce a system for textual entailment that is based on a probabilistic model of entailment. The model is defined using a calculus of transformations on dependency trees, which is characterized by the fact that derivations in that calculus preserve the truth only with a certain probability. The calculus is successfully evaluated on the datasets of the PASCAL Challenge on Recognizing Textual Entailment.

ei

PDF Web DOI [BibTex]

Modeling and Visualizing Uncertainty in Gene Expression Clusters using Dirichlet Process Mixtures

Rasmussen, CE., de la Cruz, BJ., Ghahramani, Z., Wild, DL.

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 6(4):615-628, October 2009 (article)

Abstract
Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data, little attention has been paid to uncertainty in the results obtained. Dirichlet process mixture models provide a non-parametric Bayesian alternative to the bootstrap approach to modeling uncertainty in gene expression clustering. Most previously published applications of Bayesian model based clustering methods have been to short time series data. In this paper we present a case study of the application of non-parametric Bayesian clustering methods to the clustering of high-dimensional non-time series gene expression data using full Gaussian covariances. We use the probability that two genes belong to the same cluster in a Dirichlet process mixture model as a measure of the similarity of these gene expression profiles. Conversely, this probability can be used to define a dissimilarity measure, which, for the purposes of visualization, can be input to one of the standard linkage algorithms used for hierarchical clustering. Biologically plausible results are obtained from the Rosetta compendium of expression profiles which extend previously published cluster analyses of this data.
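The similarity measure described in this abstract can be illustrated with a minimal sketch. It assumes that cluster assignments have already been sampled from the posterior of a Dirichlet process mixture; the `samples` values below are made up for illustration and are not data from the paper. The co-clustering probability of two genes is estimated as the fraction of samples in which they share a label, and one minus that probability gives a dissimilarity that could be passed to a standard linkage algorithm.

```python
def coclustering_probability(samples):
    """Estimate P(gene i and gene j share a cluster) from sampled labelings.

    `samples` is a list of labelings, one per posterior sample; each labeling
    is a list with one cluster label per gene.
    """
    n = len(samples[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in samples:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / len(samples) for c in row] for row in counts]


def dissimilarity(prob):
    """Turn co-clustering probabilities into dissimilarities (1 - p)."""
    return [[1.0 - p for p in row] for row in prob]


# Hypothetical posterior samples for four genes (illustrative values only).
samples = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 2, 2],
]
P = coclustering_probability(samples)
D = dissimilarity(P)
print(P[0][1], D[0][1])  # genes 0 and 1 share a cluster in 3 of 4 samples
```

Label values are only compared within a single sample, so the arbitrary renumbering of clusters between samples does not affect the estimate.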

ei

PDF Web DOI [BibTex]

Clinical PET/MRI-System and Its Applications with MRI Based Attenuation Correction

Kolb, A., Hofmann, M., Sossi, V., Wehrl, H., Sauter, A., Schmid, A., Schlemmer, H., Claussen, C., Pichler, B.

IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC 2009), 2009, pages: 1, October 2009 (poster)

Abstract
Clinical PET/MRI is an emerging hybrid imaging modality. In addition to providing a unique possibility for multifunctional imaging with temporally and spatially matched data, it also provides anatomical information that can be used for attenuation correction with no radiation exposure to the subjects. An advantage of combined over sequential PET and MR imaging is the reduction of total scan time. Here we present our initial experience with a hybrid brain PET/MRI system. Due to the ethical approval, patient scans could only be performed after a diagnostic PET/CT. We estimate that in approximately 50% of the cases PET/MRI was of superior diagnostic value compared to PET/CT and was able to provide additional information, such as DTI, spectroscopy and Time Of Flight (TOF) angiography. Here we present 3 patient cases: in oncology, a retropharyngeal carcinoma; in neurooncology, a relapsing meningioma; and in neurology, a pharyngeal carcinoma in addition to an infarction of the right hemisphere. For quantitative PET imaging, attenuation correction is obligatory. In the current PET/MRI setup we used our MRI-based atlas method for calculating the mu-map for attenuation correction. MR-based attenuation correction accuracy was quantitatively compared to CT-based PET attenuation correction. Extensive studies to assess potential mutual interferences between PET and MR imaging modalities as well as NEMA measurements have been performed. The first patient studies as well as the phantom tests clearly demonstrated the overall good imaging performance of this first human PET/MRI system. Ongoing work concentrates on advanced normalization and reconstruction methods incorporating count-rate based algorithms.

ei

Web [BibTex]

Kernel Learning Approaches for Image Classification

Gehler, PV.

Biologische Kybernetik, Universität des Saarlandes, Saarbrücken, Germany, October 2009 (phdthesis)

Abstract
This thesis extends the use of kernel learning techniques to specific problems of image classification. Kernel learning is a paradigm in the field of machine learning that generalizes the use of inner products to compute similarities between arbitrary objects. In image classification one aims to separate images based on their visual content. We address two important problems that arise in this context: learning with weak label information and combination of heterogeneous data sources. The contributions we report on are not unique to image classification, and apply to a more general class of problems. We study the problem of learning with label ambiguity in the multiple instance learning framework. We discuss several different image classification scenarios that arise in this context and argue that the standard multiple instance learning requires a more detailed disambiguation. Finally we review kernel learning approaches proposed for this problem and derive a more efficient algorithm to solve them. The multiple kernel learning framework is an approach to automatically select kernel parameters. We extend it to its infinite limit and present an algorithm to solve the resulting problem. This result is then applied in two directions. We show how to learn kernels that adapt to the special structure of images. Finally we compare different ways of combining image features for object classification and present significant improvements compared to previous methods.

ei

PDF [BibTex]

Thermodynamic efficiency of information and heat flow

Allahverdyan, A., Janzing, D., Mahler, G.

Journal of Statistical Mechanics: Theory and Experiment, 2009(09):P09011, September 2009 (article)

Abstract
A basic task of information processing is information transfer (flow). Here we study a pair of Brownian particles each coupled to a thermal bath at temperatures T1 and T2. The information flow in such a system is defined via the time-shifted mutual information. The information flow nullifies at equilibrium, and its efficiency is defined as the ratio of the flow to the total entropy production in the system. For a stationary state the information flows from higher to lower temperatures, and its efficiency is bounded from above by max[T1, T2]/|T1 − T2|. This upper bound is imposed by the second law and it quantifies the thermodynamic cost for information flow in the present class of systems. It can be reached in the adiabatic situation, where the particles have widely different characteristic times. The efficiency of heat flow, defined as the heat flow over the total amount of dissipated heat, is limited from above by the same factor. There is a complementarity between heat and information flow: the set-up which is most efficient for the former is the least efficient for the latter and vice versa. The above bound for the efficiency can be (transiently) overcome in certain non-stationary situations, but the efficiency is still limited from above. We study yet another measure of information processing (transfer entropy) proposed in the literature. Though this measure does not require any thermodynamic cost, the information flow and transfer entropy are shown to be intimately related for stationary states.
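As a small worked illustration of the stationary-state bound quoted in this abstract, the sketch below evaluates max[T1, T2]/|T1 − T2| for given bath temperatures. The function name `efficiency_bound` and the example temperatures are assumptions made here for illustration; they do not come from the paper.

```python
def efficiency_bound(t1, t2):
    """Upper bound max(T1, T2) / |T1 - T2| on the stationary efficiency.

    Temperatures must be positive and measured on the same absolute scale.
    """
    if t1 <= 0 or t2 <= 0:
        raise ValueError("temperatures must be positive")
    if t1 == t2:
        # The expression is undefined when both baths have the same temperature.
        raise ValueError("bound is undefined for equal temperatures")
    return max(t1, t2) / abs(t1 - t2)


print(efficiency_bound(300.0, 200.0))  # 300 / 100 = 3.0
print(efficiency_bound(1.0, 2.0))      # 2 / 1 = 2.0
```

The expression is symmetric in the two temperatures and always exceeds 1, growing without limit as the temperature difference shrinks.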

ei

PDF DOI [BibTex]

Does Cognitive Science Need Kernels?

Jäkel, F., Schölkopf, B., Wichmann, F.

Trends in Cognitive Sciences, 13(9):381-388, September 2009 (article)

Abstract
Kernel methods are among the most successful tools in machine learning and are used in challenging data analysis problems in many disciplines. Here we provide examples where kernel methods have proven to be powerful tools for analyzing behavioral data, especially for identifying features in categorization experiments. We also demonstrate that kernel methods relate to perceptrons and exemplar models of categorization. Hence, we argue that kernel methods have neural and psychological plausibility, and theoretical results concerning their behavior are therefore potentially relevant for human category learning. In particular, we believe kernel methods have the potential to provide explanations ranging from the implementational via the algorithmic to the computational level.

ei

PDF Web DOI [BibTex]

A flowering-time gene network model for association analysis in Arabidopsis thaliana

Klotzbücher, K., Kobayashi, Y., Shervashidze, N., Borgwardt, K., Weigel, D.

2009(39):95-96, German Conference on Bioinformatics (GCB '09), September 2009 (poster)

Abstract
In our project we want to determine a set of single nucleotide polymorphisms (SNPs), which have a major effect on the flowering time of Arabidopsis thaliana. Instead of performing a genome-wide association study on all SNPs in the genome of Arabidopsis thaliana, we examine the subset of SNPs from the flowering-time gene network model. We are interested in how the results of the association study vary when using only the ascertained subset of SNPs from the flowering network model, and when additionally using the information encoded by the structure of the network model. The network model is compiled from the literature by manual analysis and contains genes which have been found to affect the flowering time of Arabidopsis thaliana [Far+08; KW07]. The genes in this model are annotated with the SNPs that are located in these genes, or in near proximity to them. In a baseline comparison between the subset of SNPs from the graph and the set of all SNPs, we omit the structural information and calculate the correlation between the individual SNPs and the flowering time phenotype by use of statistical methods. Through this we can determine the subset of SNPs with the highest correlation to the flowering time. In order to further refine this subset, we include the additional information provided by the network structure by conducting a graph-based feature pre-selection. In the further course of this project we want to validate and examine the resulting set of SNPs and their corresponding genes with experimental methods.

ei

PDF Web [BibTex]

Robot Learning

Peters, J., Morimoto, J., Tedrake, R., Roy, N.

IEEE Robotics and Automation Magazine, 16(3):19-20, September 2009 (article)

Abstract
Creating autonomous robots that can learn to act in unpredictable environments has been a long-standing goal of robotics, artificial intelligence, and the cognitive sciences. In contrast, current commercially available industrial and service robots mostly execute fixed tasks and exhibit little adaptability. To bridge this gap, machine learning offers a myriad of methods, some of which have already been applied with great success to robotics problems. As a result, there is an increasing interest in machine learning and statistics within the robotics community. At the same time, there has been a growth in the learning community in using robots as motivating applications for new algorithms and formalisms. Considerable evidence of this exists in the use of learning in high-profile competitions such as RoboCup and the Defense Advanced Research Projects Agency (DARPA) challenges, and the growing number of research programs funded by governments around the world.

ei

PDF Web DOI [BibTex]

Kernel Methods in Computer Vision

Lampert, CH.

Foundations and Trends in Computer Graphics and Vision, 4(3):193-285, September 2009 (article)

Abstract
Over recent years, kernel methods have established themselves as powerful tools for computer vision researchers as well as for practitioners. In this tutorial, we give an introduction to kernel methods in computer vision from a geometric perspective, introducing not only the ubiquitous support vector machines, but also lesser-known techniques for regression, dimensionality reduction, outlier detection and clustering. Additionally, we give an outlook on very recent, non-classical techniques for the prediction of structured data, for the estimation of statistical dependency and for learning the kernel function itself. All methods are illustrated with examples of successful application from the recent computer vision research literature.

ei

Web DOI [BibTex]

Initial Data from a first PET/MRI-System and its Applications in Clinical Studies Using MRI Based Attenuation Correction

Kolb, A., Hofmann, M., Sossi, V., Wehrl, H., Sauter, A., Schmid, A., Judenhofer, M., Schlemmer, H., Claussen, C., Pichler, B.

2009 World Molecular Imaging Congress, 2009, pages: 1200, September 2009 (poster)

ei

Web [BibTex]

A High-Speed Object Tracker from Off-the-Shelf Components

Lampert, C., Peters, J.

First IEEE Workshop on Computer Vision for Humanoid Robots in Real Environments at ICCV 2009, 1, pages: 1, September 2009 (poster)

Abstract
We introduce RTblob, an open-source real-time vision system for 3D object detection that achieves over 200 Hz tracking speed with only off-the-shelf hardware components. It allows fast and accurate tracking of colored objects in 3D without expensive and often custom-built hardware, instead making use of the PC graphics cards for the necessary image processing operations.

ei

PDF Web [BibTex]

Fast Kernel-Based Independent Component Analysis

Shen, H., Jegelka, S., Gretton, A.

IEEE Transactions on Signal Processing, 57(9):3498-3511, September 2009 (article)

Abstract
Recent approaches to independent component analysis (ICA) have used kernel independence measures to obtain highly accurate solutions, particularly where classical methods experience difficulty (for instance, sources with near-zero kurtosis). FastKICA (fast HSIC-based kernel ICA) is a new optimization method for one such kernel independence measure, the Hilbert-Schmidt Independence Criterion (HSIC). The high computational efficiency of this approach is achieved by combining geometric optimization techniques, specifically an approximate Newton-like method on the orthogonal group, with accurate estimates of the gradient and Hessian based on an incomplete Cholesky decomposition. In contrast to other efficient kernel-based ICA algorithms, FastKICA is applicable to any twice differentiable kernel function. Experimental results for problems with large numbers of sources and observations indicate that FastKICA provides more accurate solutions at a given cost than gradient descent on HSIC. Comparing with other recently published ICA methods, FastKICA is competitive in terms of accuracy, relatively insensitive to local minima when initialized far from independence, and more robust towards outliers. An analysis of the local convergence properties of FastKICA is provided.

ei

PDF Web DOI [BibTex]

Guest editorial: Special issue on robot learning, Part B

Peters, J., Ng, A.

Autonomous Robots, 27(2):91-92, August 2009 (article)

ei

PDF PDF DOI [BibTex]

Estimating Critical Stimulus Features from Psychophysical Data: The Decision-Image Technique Applied to Human Faces

Macke, J., Wichmann, F.

Journal of Vision, 9(8):31, 9th Annual Meeting of the Vision Sciences Society (VSS), August 2009 (poster)

Abstract
One of the main challenges in the sensory sciences is to identify the stimulus features on which the sensory systems base their computations: they are a prerequisite for computational models of perception. We describe a technique (decision-images) for extracting critical stimulus features based on logistic regression. Rather than embedding the stimuli in noise, as is done in classification image analysis, we want to infer the important features directly from physically heterogeneous stimuli. A decision-image not only defines the critical region-of-interest within a stimulus but is a quantitative template which defines a direction in stimulus space. Decision-images thus enable the development of predictive models, as well as the generation of optimized stimuli for subsequent psychophysical investigations. Here we describe our method and apply it to data from a human face discrimination experiment. We show that decision-images are able to predict human responses not only in terms of overall percent correct but are able to predict, for individual observers, the probabilities with which individual faces are (mis)classified. We then test the predictions of the models using optimized stimuli. Finally, we discuss possible generalizations of the approach and its relationships with other models.

ei

Web DOI [BibTex]

Policy Search for Motor Primitives

Peters, J., Kober, J.

KI - Zeitschrift Künstliche Intelligenz, 23(3):38-40, August 2009 (article)

Abstract
Many motor skills in humanoid robotics can be learned using parametrized motor primitives from demonstrations. However, most interesting motor learning problems require self-improvement often beyond the reach of current reinforcement learning methods due to the high dimensionality of the state-space. We develop an EM-inspired algorithm applicable to complex motor learning tasks. We compare this algorithm to several well-known parametrized policy search methods and show that it outperforms them. We apply it to motor learning problems and show that it can learn the complex Ball-in-a-Cup task using a real Barrett WAM robot arm.

ei

Web [BibTex]

A neurophysiologically plausible population code model for human contrast discrimination

Goris, R., Wichmann, F., Henning, G.

Journal of Vision, 9(7):1-22, July 2009 (article)

Abstract
The pedestal effect is the improvement in the detectability of a sinusoidal grating in the presence of another grating of the same orientation, spatial frequency, and phase—usually called the pedestal. Recent evidence has demonstrated that the pedestal effect is differently modified by spectrally flat and notch-filtered noise: The pedestal effect is reduced in flat noise but virtually disappears in the presence of notched noise (G. B. Henning & F. A. Wichmann, 2007). Here we consider a network consisting of units whose contrast response functions resemble those of the cortical cells believed to underlie human pattern vision and demonstrate that, when the outputs of multiple units are combined by simple weighted summation—a heuristic decision rule that resembles optimal information combination and produces a contrast-dependent weighting profile—the network produces contrast-discrimination data consistent with psychophysical observations: The pedestal effect is present without noise, reduced in broadband noise, but almost disappears in notched noise. These findings follow naturally from the normalization model of simple cells in primary visual cortex, followed by response-based pooling, and suggest that in processing even low-contrast sinusoidal gratings, the visual system may combine information across neurons tuned to different spatial frequencies and orientations.

ei

Web DOI [BibTex]

Falsificationism and Statistical Learning Theory: Comparing the Popper and Vapnik-Chervonenkis Dimensions

Corfield, D., Schölkopf, B., Vapnik, V.

Journal for General Philosophy of Science, 40(1):51-58, July 2009 (article)

Abstract
We compare Karl Popper’s ideas concerning the falsifiability of a theory with similar notions from the part of statistical learning theory known as VC-theory. Popper’s notion of the dimension of a theory is contrasted with the apparently very similar VC-dimension. Having located some divergences, we discuss how best to view Popper’s work from the perspective of statistical learning theory, either as a precursor or as aiming to capture a different learning activity.

ei

PDF DOI [BibTex]

Semi-supervised Analysis of Human fMRI Data

Shelton, JA., Blaschko, MB., Lampert, CH., Bartels, A.

Berlin Brain Computer Interface Workshop on Advances in Neurotechnology, 2009, pages: 1, July 2009 (poster)

Abstract
Kernel Canonical Correlation Analysis (KCCA) is a general technique for subspace learning that incorporates principal components analysis (PCA) and Fisher linear discriminant analysis (LDA) as special cases. By finding directions that maximize correlation, CCA learns representations tied more closely to the underlying process generating the data and can ignore high-variance noise directions. However, for data where acquisition in a given modality is expensive or otherwise limited, CCA may suffer from small sample effects. We propose to use semi-supervised Laplacian regularization to utilize data that are present in only one modality. This approach is able to find highly correlated directions that also lie along the data manifold, resulting in a more robust estimate of correlated subspaces. Functional magnetic resonance imaging (fMRI) acquired data are naturally amenable to subspace techniques as data are well aligned. fMRI data of the human brain are a particularly interesting candidate. In this study we implemented various supervised and semi-supervised versions of CCA on human fMRI data, with regression to single and multivariate labels (corresponding to video content subjects viewed during the image acquisition). In each variate condition, the semi-supervised variants of CCA performed better than the supervised variants, including a supervised variant with Laplacian regularization. We additionally analyze the weights learned by the regression in order to infer brain regions that are important to different types of visual processing.

ei

PDF Web [BibTex]

Guest editorial: Special issue on robot learning, Part A

Peters, J., Ng, A.

Autonomous Robots, 27(1):1-2, July 2009 (article)

ei

PDF PDF DOI [BibTex]

A Geometric Approach to Confidence Sets for Ratios: Fieller’s Theorem, Generalizations, and Bootstrap

von Luxburg, U., Franz, V.

Statistica Sinica, 19(3):1095-1117, July 2009 (article)

Abstract
We present a geometric method to determine confidence sets for the ratio E(Y)/E(X) of the means of random variables X and Y. This method reduces the problem of constructing confidence sets for the ratio of two random variables to the problem of constructing confidence sets for the means of one-dimensional random variables. It is valid in a large variety of circumstances. In the case of normally distributed random variables, the confidence sets constructed in this way coincide with the standard Fieller confidence sets. Generalizations of our construction lead to definitions of exact and conservative confidence sets for very general classes of distributions, provided the joint expectation of (X,Y) exists and the linear combinations of the form aX + bY are well-behaved. Finally, our geometric method allows us to derive a very simple bootstrap approach for constructing conservative confidence sets for ratios which perform favorably in certain situations, in particular in the asymmetric heavy-tailed regime.

ei

PDF PDF Web [BibTex]


Text Clustering with Mixture of von Mises-Fisher Distributions

Sra, S., Banerjee, A., Ghosh, J., Dhillon, I.

In Text mining: classification, clustering, and applications, pages: 121-161, Chapman & Hall/CRC data mining and knowledge discovery series, (Editors: Srivastava, A. N. and Sahami, M.), CRC Press, Boca Raton, FL, USA, June 2009 (inbook)

ei

Web DOI [BibTex]

Center-surround patterns emerge as optimal predictors for human saccade targets

Kienzle, W., Franz, M., Schölkopf, B., Wichmann, F.

Journal of Vision, 9(5:7):1-15, May 2009 (article)

Abstract
The human visual system is foveated, that is, outside the central visual field resolution and acuity drop rapidly. Nonetheless much of a visual scene is perceived after only a few saccadic eye movements, suggesting an effective strategy for selecting saccade targets. It has been known for some time that local image structure at saccade targets influences the selection process. However, the question of what the most relevant visual features are is still under debate. Here we show that center-surround patterns emerge as the optimal solution for predicting saccade targets from their local image structure. The resulting model, a one-layer feed-forward network, is surprisingly simple compared to previously suggested models which assume much more complex computations such as multi-scale processing and multiple feature channels. Nevertheless, our model is equally predictive. Furthermore, our findings are consistent with neurophysiological hardware in the superior colliculus. Bottom-up visual saliency may thus not be computed cortically as has been thought previously.

ei

PDF DOI [BibTex]


Data Mining for Biologists

Tsuda, K.

In Biological Data Mining in Protein Interaction Networks, pages: 14-27, (Editors: Li, X. and Ng, S.-K.), Medical Information Science Reference, Hershey, PA, USA, May 2009 (inbook)

Abstract
In this tutorial chapter, we review basics about frequent pattern mining algorithms, including itemset mining, association rule mining and graph mining. These algorithms can find frequently appearing substructures in discrete data. They can discover structural motifs, for example, from mutation data, protein structures and chemical compounds. As they have been primarily used for business data, biological applications are not so common yet, but their potential impact would be large. Recent advances in computers including multicore machines and ever increasing memory capacity support the application of such methods to larger datasets. We explain technical aspects of the algorithms, but do not go into details. Current biological applications are summarized and possible future directions are given.

ei

Web [BibTex]

Influence of Different Assignment Conditions on the Determination of Symmetric Homo-dimeric Structures with ARIA

Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, TE., Nilges, M.

Proteins, 75(3):569-585, May 2009 (article)

Abstract
The ambiguous restraint for iterative assignment (ARIA) approach for NMR structure calculation is evaluated for symmetric homodimeric proteins by assessing the effect of several data analysis and assignment methods on the structure quality. In particular, we study the effects of network anchoring and spin-diffusion correction. The spin-diffusion correction improves the protein structure quality systematically, whereas network anchoring enhances the assignment efficiency by speeding up the convergence and coping with highly ambiguous data. For some homodimeric folds, network anchoring has been proved essential for unraveling both chain and proton assignment ambiguities.

ei

Web DOI [BibTex]

Beamforming in Noninvasive Brain-Computer Interfaces

Grosse-Wentrup, M., Liefhold, C., Gramann, K., Buss, M.

IEEE Transactions on Biomedical Engineering, 56(4):1209-1219, April 2009 (article)

Abstract
Spatial filtering (SF) constitutes an integral part of building EEG-based brain–computer interfaces (BCIs). Algorithms frequently used for SF, such as common spatial patterns (CSPs) and independent component analysis, require labeled training data for identifying filters that provide information on a subject's intention, which renders these algorithms susceptible to overfitting on artifactual EEG components. In this study, beamforming is employed to construct spatial filters that extract EEG sources originating within predefined regions of interest within the brain. In this way, neurophysiological knowledge on which brain regions are relevant for a certain experimental paradigm can be utilized to construct unsupervised spatial filters that are robust against artifactual EEG components. Beamforming is experimentally compared with CSP and Laplacian spatial filtering (LP) in a two-class motor-imagery paradigm. It is demonstrated that beamforming outperforms CSP and LP on noisy datasets, while CSP and beamforming perform almost equally well on datasets with few artifactual trials. It is concluded that beamforming constitutes an alternative method for SF that might be particularly useful for BCIs used in clinical settings, i.e., in an environment where artifact-free datasets are difficult to obtain.

ei

PDF Web DOI [BibTex]

Constructing Sparse Kernel Machines Using Attractors

Lee, D., Jung, K., Lee, J.

IEEE Transactions on Neural Networks, 20(4):721-729, April 2009 (article)

Abstract
In this brief, a novel method that constructs a sparse kernel machine is proposed. The proposed method generates attractors as sparse solutions from a built-in kernel machine via a dynamical system framework. By readjusting the corresponding coefficients and bias terms, a sparse kernel machine that approximates a conventional kernel machine is constructed. The simulation results show that the constructed sparse kernel machine improves the efficiency of the testing phase while maintaining comparable test error.

ei

Web DOI [BibTex]

Optimal construction of k-nearest-neighbor graphs for identifying noisy clusters

Maier, M., Hein, M., von Luxburg, U.

Theoretical Computer Science, 410(19):1749-1764, April 2009 (article)

Abstract
We study clustering algorithms based on neighborhood graphs on a random sample of data points. The question we ask is how such a graph should be constructed in order to obtain optimal clustering results. Which type of neighborhood graph should one choose, mutual k-nearest-neighbor or symmetric k-nearest-neighbor? What is the optimal parameter k? In our setting, clusters are defined as connected components of the t-level set of the underlying probability distribution. Clusters are said to be identified in the neighborhood graph if connected components in the graph correspond to the true underlying clusters. Using techniques from random geometric graph theory, we prove bounds on the probability that clusters are identified successfully, both in a noise-free and in a noisy setting. Those bounds lead to several conclusions. First, k has to be chosen surprisingly high (of the order n rather than of the order log n) to maximize the probability of cluster identification. Secondly, the major difference between the mutual and the symmetric k-nearest-neighbor graph occurs when one attempts to detect the most significant cluster only.
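The two graph types compared in this abstract can be sketched as follows, using their standard definitions: the symmetric graph connects two points if either is among the other's k nearest neighbors, while the mutual graph requires both. The one-dimensional points and the function names are assumptions chosen for illustration, and ties in distance are broken by index.

```python
def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors."""
    n = len(points)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Sort by distance to point i, breaking ties by index.
        others.sort(key=lambda j: (abs(points[j] - points[i]), j))
        result.append(set(others[:k]))
    return result


def symmetric_knn_edges(points, k):
    """Edge {i, j} if i is among j's neighbors OR j is among i's."""
    nb = knn_sets(points, k)
    n = len(points)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if j in nb[i] or i in nb[j]}


def mutual_knn_edges(points, k):
    """Edge {i, j} only if i is among j's neighbors AND j is among i's."""
    nb = knn_sets(points, k)
    n = len(points)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if j in nb[i] and i in nb[j]}


# Illustrative 1-D sample (hypothetical values).
points = [0.0, 1.0, 2.5, 10.0]
sym = symmetric_knn_edges(points, 1)
mut = mutual_knn_edges(points, 1)
print(sorted(sym))  # [(0, 1), (1, 2), (2, 3)]
print(sorted(mut))  # [(0, 1)]
```

The mutual graph is always a subgraph of the symmetric one, so for the same k it tends to split the sample into more connected components.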

ei

PDF PDF DOI [BibTex]


Optimization of k-Space Trajectories by Bayesian Experimental Design

Seeger, M., Nickisch, H., Pohmann, R., Schölkopf, B.

17(2627), 17th Annual Meeting of the International Society for Magnetic Resonance in Medicine (ISMRM), April 2009 (poster)

Abstract
MR image reconstruction from undersampled k-space can be improved by nonlinear denoising estimators since they incorporate statistical prior knowledge about image sparsity. Reconstruction quality depends crucially on the undersampling design (k-space trajectory), in a manner complicated by the nonlinear and signal-dependent characteristics of these methods. We propose an algorithm to assess and optimize k-space trajectories for sparse MRI reconstruction, based on Bayesian experimental design, which is scaled up to full MR images by a novel variational relaxation to iteratively reweighted FFT or gridding computations. Designs are built sequentially by adding phase encodes predicted to be most informative, given the combination of previous measurements with image prior information.

ei

PDF Web [BibTex]

Overlap and refractory effects in a Brain-Computer Interface speller based on the visual P300 Event-Related Potential

Martens, S., Hill, N., Farquhar, J., Schölkopf, B.

Journal of Neural Engineering, 6(2):1-9, April 2009 (article)

Abstract
We reveal the presence of refractory and overlap effects in the event-related potentials in visual P300 speller datasets, and we show their negative impact on the performance of the system. This finding has important implications for how to encode the letters that can be selected for communication. However, we show that such effects are dependent on stimulus parameters: an alternative stimulus type based on apparent motion suffers less from the refractory effects and leads to an improved letter prediction performance.

ei

PDF DOI [BibTex]


MR-Based Attenuation Correction for PET/MR

Hofmann, M., Steinke, F., Bezrukov, I., Kolb, A., Aschoff, P., Lichy, M., Erb, M., Nägele, T., Brady, M., Schölkopf, B., Pichler, B.

17(260), 17th Annual Meeting of the International Society for Magnetic Resonance in Medicine (ISMRM), April 2009 (poster)

Abstract
There has recently been a growing interest in combining PET and MR. Attenuation correction (AC), which accounts for radiation attenuation properties of the tissue, is mandatory for quantitative PET. In the case of PET/MR the attenuation map needs to be determined from the MR image. This is intrinsically difficult as MR intensities are not related to the electron density information of the attenuation map. Using ultra-short echo (UTE) acquisition, atlas registration and machine learning, we present methods that allow prediction of the attenuation map based on the MR image both for brain and whole body imaging.

ei

PDF Web [BibTex]

Kernel Methods in Computer Vision: Object Localization, Clustering, and Taxonomy Discovery

Blaschko, MB.

Biologische Kybernetik, Technische Universität Berlin, Berlin, Germany, March 2009 (phdthesis)

ei

PDF PDF [BibTex]

Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions

Bubeck, S., von Luxburg, U.

Journal of Machine Learning Research, 10, pages: 657-698, March 2009 (article)

Abstract
Clustering is often formulated as a discrete optimization problem. The objective is to find, among all partitions of the data set, the best one according to some quality measure. However, in the statistical setting where we assume that the finite data set has been sampled from some underlying space, the goal is not to find the best partition of the given sample, but to approximate the true partition of the underlying space. We argue that the discrete optimization approach usually does not achieve this goal, and instead can lead to inconsistency. We construct examples which provably have this behavior. As in the case of supervised learning, the cure is to restrict the size of the function classes under consideration. For appropriate “small” function classes we can prove very general consistency theorems for clustering optimization schemes. As one particular algorithm for clustering with a restricted function space we introduce “nearest neighbor clustering”. Similar to the k-nearest neighbor classifier in supervised learning, this algorithm can be seen as a general baseline algorithm to minimize arbitrary clustering objective functions. We prove that it is statistically consistent for all commonly used clustering objective functions.

ei

PDF Web [BibTex]


Protein Functional Class Prediction With a Combined Graph

Shin, H., Tsuda, K., Schölkopf, B.

Expert Systems with Applications, 36(2):3284-3292, March 2009 (article)

Abstract
In bioinformatics, there exist multiple descriptions of graphs for the same set of genes or proteins. For instance, in yeast systems, graph edges can represent different relationships such as protein–protein interactions, genetic interactions, or co-participation in a protein complex. Relying on similarities between nodes, each graph can be used independently for prediction of protein function. However, since different graphs contain partly independent and partly complementary information about the problem at hand, one can enhance the total information extracted by combining all graphs. In this paper, we propose a method for integrating multiple graphs within a framework of semi-supervised learning. The method alternates between minimizing the objective function with respect to network output and with respect to combining weights. We apply the method to the task of protein functional class prediction in yeast. The proposed method performs significantly better than the same algorithm trained on any single graph.

ei

Web DOI [BibTex]

Gaussian Process Dynamic Programming

Deisenroth, M., Rasmussen, C., Peters, J.

Neurocomputing, 72(7-9):1508-1524, March 2009 (article)

Abstract
Reinforcement learning (RL) and optimal control of systems with continuous states and actions require approximation techniques in most interesting cases. In this article, we introduce Gaussian process dynamic programming (GPDP), an approximate value-function based RL algorithm. We consider both a classic optimal control problem, where problem-specific prior knowledge is available, and a classic RL problem, where only very general priors can be used. For the classic optimal control problem, GPDP models the unknown value functions with Gaussian processes and generalizes dynamic programming to continuous-valued states and actions. For the RL problem, GPDP starts from a given initial state and explores the state space using Bayesian active learning. To design a fast learner, available data has to be used efficiently. Hence, we propose to learn probabilistic models of the a priori unknown transition dynamics and the value functions on the fly. In both cases, we successfully apply the resulting continuous-valued controllers to the under-actuated pendulum swing up and analyze the performance of the suggested algorithms. It turns out that GPDP uses data very efficiently and can be applied to problems where classic dynamic programming would be cumbersome.

ei

PDF PDF DOI [BibTex]

Towards quantitative PET/MRI: a review of MR-based attenuation correction techniques

Hofmann, M., Pichler, B., Schölkopf, B., Beyer, T.

European Journal of Nuclear Medicine and Molecular Imaging, 36(Supplement 1):93-104, March 2009 (article)

Abstract
Introduction: Positron emission tomography (PET) is a fully quantitative technology for imaging metabolic pathways and dynamic processes in vivo. Attenuation correction of raw PET data is a prerequisite for quantification and is typically based on separate transmission measurements. In PET/CT, however, attenuation correction is performed routinely based on the available CT transmission data. Objective: Recently, combined PET/magnetic resonance (MR) has been proposed as a viable alternative to PET/CT. Current concepts of PET/MRI do not include CT-like transmission sources and, therefore, alternative methods of PET attenuation correction must be found. This article reviews existing approaches to MR-based attenuation correction (MR-AC). Most groups have proposed MR-AC algorithms for brain PET studies and more recently also for torso PET/MR imaging. Most MR-AC strategies require the use of complementary MR and transmission images, or morphology templates generated from transmission images. We review and discuss these algorithms and point out challenges for using MR-AC in clinical routine. Discussion: MR-AC is work-in-progress with potentially promising results from a template-based approach applicable to both brain and torso imaging. While efforts are ongoing in making clinically viable MR-AC fully automatic, further studies are required to realize the potential benefits of MR-based motion compensation and partial volume correction of the PET data.

ei

PDF DOI [BibTex]

Generating Spike Trains with Specified Correlation Coefficients

Macke, J., Berens, P., Ecker, A., Tolias, A., Bethge, M.

Neural Computation, 21(2):397-423, February 2009 (article)

Abstract
Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, for the realistic simulation and analysis of neural systems, it is essential to have efficient methods for generating artificial spike trains with specified correlation structure. Here we show how correlated binary spike trains can be simulated by means of a latent multivariate gaussian model. Sampling from the model is computationally very efficient and, in particular, feasible even for large populations of neurons. The entropy of the model is close to the theoretical maximum for a wide range of parameters. In addition, this framework naturally extends to correlations over time and offers an elegant way to model correlated neural spike counts with arbitrary marginal distributions.
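As a rough illustration of the latent Gaussian idea, the sketch below covers only a special case chosen because it has a closed form: two neurons, each spiking with probability 0.5 per time bin. It assumes that a spike is emitted whenever the latent Gaussian variable is positive; this thresholding rule, the function names, and the parameter values are assumptions of the sketch and are not taken from the abstract. In this special case a latent correlation rho yields a binary correlation of (2/pi) * arcsin(rho), so the latent correlation needed for a target value c is sin(pi * c / 2).

```python
import math
import random


def latent_correlation(target_corr):
    """Latent Gaussian correlation giving `target_corr` between the binary trains.

    Valid only for the zero-threshold case (spike probability 0.5 per bin).
    """
    return math.sin(math.pi * target_corr / 2.0)


def sample_spike_pairs(target_corr, n_bins, seed=0):
    """Sample n_bins pairs of binary spikes by thresholding correlated Gaussians."""
    rng = random.Random(seed)
    rho = latent_correlation(target_corr)
    spikes = []
    for _ in range(n_bins):
        z1 = rng.gauss(0.0, 1.0)
        # Construct z2 so that corr(z1, z2) equals rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        spikes.append((int(z1 > 0.0), int(z2 > 0.0)))
    return spikes


def correlation(pairs):
    """Empirical correlation coefficient of two binary sequences."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    # For a binary variable with mean m, the variance is m * (1 - m).
    return cov / math.sqrt(mean_a * (1.0 - mean_a) * mean_b * (1.0 - mean_b))


pairs = sample_spike_pairs(target_corr=0.3, n_bins=20000, seed=0)
empirical = correlation(pairs)
```

With enough bins the empirical correlation should lie close to the requested 0.3. Other firing rates or larger populations would require solving for the latent covariance numerically, which this sketch does not attempt.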

ei

PDF Web DOI [BibTex]

Automatic detection of preclinical neurodegeneration: Presymptomatic Huntington disease

Klöppel, S., Chu, C., Tan, G., Draganski, B., Johnson, H., Paulsen, J., Kienzle, W., Tabrizi, S., Ashburner, J., Frackowiak, R.

Neurology, 72(5):426-431, February 2009 (article)

Abstract
Background: Treatment of neurodegenerative diseases is likely to be most beneficial in the very early, possibly preclinical stages of degeneration. We explored the usefulness of fully automatic structural MRI classification methods for detecting subtle degenerative change. The availability of a definitive genetic test for Huntington disease (HD) provides an excellent metric for judging the performance of such methods in gene mutation carriers who are free of symptoms. Methods: Using the gray matter segment of MRI scans, this study explored the usefulness of a multivariate support vector machine to automatically identify presymptomatic HD gene mutation carriers (PSCs) in the absence of any a priori information. A multicenter data set of 96 PSCs and 95 age- and sex-matched controls was studied. The PSC group was subclassified into three groups based on time from predicted clinical onset, an estimate that is a function of DNA mutation size and age. Results: Subjects with at least a 33% chance of developing unequivocal signs of HD in 5 years were correctly assigned to the PSC group 69% of the time. Accuracy improved to 83% when regions affected by the disease were selected a priori for analysis. Performance was at chance when the probability of developing symptoms in 5 years was less than 10%. Conclusions: Presymptomatic Huntington disease gene mutation carriers close to estimated diagnostic onset were successfully separated from controls on the basis of single anatomic scans, without additional a priori information. Prior information is required to allow separation when degenerative changes are either subtle or variable.

ei

Web [BibTex]

Enumeration of condition-dependent dense modules in protein interaction networks

Georgii, E., Dietmann, S., Uno, T., Pagel, P., Tsuda, K.

Bioinformatics, 25(7):933-940, February 2009 (article)

Abstract
Motivation: Modern systems biology aims at understanding how the different molecular components of a biological cell interact. Often, cellular functions are performed by complexes consisting of many different proteins. The composition of these complexes may change according to the cellular environment, and one protein may be involved in several different processes. The automatic discovery of functional complexes from protein interaction data is challenging. While previous approaches use approximations to extract dense modules, our approach exactly solves the problem of dense module enumeration. Furthermore, constraints from additional information sources such as gene expression and phenotype data can be integrated, so we can systematically mine for dense modules with interesting profiles. Results: Given a weighted protein interaction network, our method discovers all protein sets that satisfy a user-defined minimum density threshold. We employ a reverse search strategy, which allows us to exploit the density criterion in an efficient way. Our experiments show that the novel approach is feasible and produces biologically meaningful results. In comparative validation studies using yeast data, the method achieved the best overall prediction performance with respect to confirmed complexes. Moreover, by enhancing the yeast network with phenotypic and phylogenetic profiles and the human network with tissue-specific expression data, we identified condition-dependent complex variants.

ei

Web DOI [BibTex]

Prototype Classification: Insights from Machine Learning

Graf, A., Bousquet, O., Rätsch, G., Schölkopf, B.

Neural Computation, 21(1):272-300, January 2009 (article)

Abstract
We shed light on the discrimination between patterns belonging to two different classes by casting this decoding problem into a generalized prototype framework. The discrimination process is then separated into two stages: a projection stage that reduces the dimensionality of the data by projecting it on a line and a threshold stage where the distributions of the projected patterns of both classes are separated. For this, we extend the popular mean-of-class prototype classification using algorithms from machine learning that satisfy a set of invariance properties. We report a simple yet general approach to express different types of linear classification algorithms in an identical and easy-to-visualize formal framework using generalized prototypes where these prototypes are used to express the normal vector and offset of the hyperplane. We investigate nonmargin classifiers such as the classical prototype classifier, the Fisher classifier, and the relevance vector machine. We then study hard and soft margin classifiers such as the support vector machine and a boosted version of the prototype classifier. Subsequently, we relate mean-of-class prototype classification to other classification algorithms by showing that the prototype classifier is a limit of any soft margin classifier and that boosting a prototype classifier yields the support vector machine. While giving novel insights into classification per se by presenting a common and unified formalism, our generalized prototype framework also provides an efficient visualization and a principled comparison of machine learning classification.
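The classical mean-of-class prototype classifier mentioned in the abstract can be sketched in a few lines of Python. This is an illustrative sketch and not the paper's generalized framework: the function names and toy data are assumptions. The two class means serve as prototypes, their difference gives the normal vector of the separating hyperplane (the projection stage), and the offset places the threshold midway between the prototypes (the threshold stage).

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def fit_prototype_classifier(positives, negatives):
    """Return (w, b) with w = p_pos - p_neg and b = -(|p_pos|^2 - |p_neg|^2) / 2."""
    p_pos, p_neg = mean(positives), mean(negatives)
    w = [a - c for a, c in zip(p_pos, p_neg)]
    b = -0.5 * (sum(a * a for a in p_pos) - sum(c * c for c in p_neg))
    return w, b


def project(w, x):
    """Projection stage: reduce x to a scalar along the direction w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def classify(w, b, x):
    """Threshold stage: +1 if x is closer to the positive prototype, else -1."""
    return 1 if project(w, x) + b > 0 else -1


# Hypothetical toy data.
positives = [(2, 2), (3, 3)]
negatives = [(-1, 0), (0, -1)]
w, b = fit_prototype_classifier(positives, negatives)
label_a = classify(w, b, (2, 1))
label_b = classify(w, b, (0, 0))
```

The sign of `project(w, x) + b` equals the sign of the difference between the squared distances from x to the negative and to the positive prototype, so the rule assigns each point to the nearer class mean.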

ei

PDF Web DOI [BibTex]

The DICS repository: module-assisted analysis of disease-related gene lists

Dietmann, S., Georgii, E., Antonov, A., Tsuda, K., Mewes, H.

Bioinformatics, 25(6):830-831, January 2009 (article)

Abstract
The DICS database is a dynamic web repository of computationally predicted functional modules from the human protein–protein interaction network. It provides references to the CORUM, DrugBank, KEGG and Reactome pathway databases. DICS can be accessed for retrieving sets of overlapping modules and protein complexes that are significantly enriched in a gene list, thereby providing valuable information about the functional context.

ei

Web DOI [BibTex]

Large Margin Methods for Part of Speech Tagging

Altun, Y.

In Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods, pages: 141-160, (Editors: Keshet, J. and Bengio, S.), Wiley, Hoboken, NJ, USA, January 2009 (inbook)

ei

Web [BibTex]

Motor Control and Learning in Table Tennis

Mülling, K.

Eberhard Karls Universität Tübingen, Germany, 2009 (diplomathesis)

ei

[BibTex]

Hierarchical Clustering and Density Estimation Based on k-nearest-neighbor graphs

Drewe, P.

Eberhard Karls Universität Tübingen, Germany, 2009 (diplomathesis)

ei

[BibTex]

mGene: accurate SVM-based gene finding with an application to nematode genomes

Schweikert, G., Zien, A., Zeller, G., Behr, J., Dieterich, C., Ong, C., Philips, P., De Bona, F., Hartmann, L., Bohlen, A., Krüger, N., Sonnenburg, S., Rätsch, G.

Genome Research, 19(11):2133-43, 2009 (article)

Abstract
We present a highly accurate gene-prediction system for eukaryotic genomes, called mGene. It combines in an unprecedented manner the flexibility of generalized hidden Markov models (gHMMs) with the predictive power of modern machine learning methods, such as Support Vector Machines (SVMs). Its excellent performance was proved in an objective competition based on the genome of the nematode Caenorhabditis elegans. Considering the average of sensitivity and specificity, the developmental version of mGene exhibited the best prediction performance on nucleotide, exon, and transcript level for ab initio and multiple-genome gene-prediction tasks. The fully developed version shows superior performance in 10 out of 12 evaluation criteria compared with the other participating gene finders, including Fgenesh++ and Augustus. An in-depth analysis of mGene's genome-wide predictions revealed that approximately 2200 predicted genes were not contained in the current genome annotation. Testing a subset of 57 of these genes by RT-PCR and sequencing, we confirmed expression for 24 (42%) of them. mGene missed 300 annotated genes, out of which 205 were unconfirmed. RT-PCR testing of 24 of these genes resulted in a success rate of merely 8%. These findings suggest that even the gene catalog of a well-studied organism such as C. elegans can be substantially improved by mGene's predictions. We also provide gene predictions for the four nematodes C. briggsae, C. brenneri, C. japonica, and C. remanei. Comparing the resulting proteomes among these organisms and to the known protein universe, we identified many species-specific gene inventions. In a quality assessment of several available annotations for these genomes, we find that mGene's predictions are most accurate.

ei

DOI [BibTex]

Learning with Structured Data: Applications to Computer Vision

Nowozin, S.

Technische Universität Berlin, Germany, 2009 (phdthesis)

ei

PDF [BibTex]

Structure and activity of the N-terminal substrate recognition domains in proteasomal ATPases

Djuranovic, S., Hartmann, MD., Habeck, M., Ursinus, A., Zwickl, P., Martin, J., Lupas, AN., Zeth, K.

Molecular Cell, 34(5):580-590, 2009 (article)

Abstract
The proteasome forms the core of the protein quality control system in archaea and eukaryotes and also occurs in one bacterial lineage, the Actinobacteria. Access to its proteolytic compartment is controlled by AAA ATPases, whose N-terminal domains (N domains) are thought to mediate substrate recognition. The N domains of an archaeal proteasomal ATPase, Archaeoglobus fulgidus PAN, and of its actinobacterial homolog, Rhodococcus erythropolis ARC, form hexameric rings, whose subunits consist of an N-terminal coiled coil and a C-terminal OB domain. In ARC-N, the OB domains are duplicated and form separate rings. PAN-N and ARC-N can act as chaperones, preventing the aggregation of heterologous proteins in vitro, and this activity is preserved in various chimeras, even when these include coiled coils and OB domains from unrelated proteins. The structures suggest a molecular mechanism for substrate processing based on concerted radial motions of the coiled coils relative to the OB rings.

ei

DOI [BibTex]

Discussion of: Brownian Distance Covariance

Gretton, A., Fukumizu, K., Sriperumbudur, B.

The Annals of Applied Statistics, 3(4):1285-1294, 2009 (article)

ei

[BibTex]

Covariate shift and local learning by distribution matching

Gretton, A., Smola, A., Huang, J., Schmittfull, M., Borgwardt, K., Schölkopf, B.

In Dataset Shift in Machine Learning, pages: 131-160, (Editors: Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A. and Lawrence, N. D.), MIT Press, Cambridge, MA, USA, 2009 (inbook)

Abstract
Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high dimensional feature space (specifically, a reproducing kernel Hilbert space). This approach does not require distribution estimation. Instead, the sample weights are obtained by a simple quadratic programming procedure. We provide a uniform convergence bound on the distance between the reweighted training feature mean and the test feature mean, a transductive bound on the expected loss of an algorithm trained on the reweighted data, and a connection to single class SVMs. While our method is designed to deal with the case of simple covariate shift (in the sense of Chapter ??), we have also found benefits for sample selection bias on the labels. Our correction procedure yields its greatest and most consistent advantages when the learning algorithm returns a classifier/regressor that is "simpler" than the data might suggest.

ei

PDF Web [BibTex]

From Differential Equations to Differential Geometry: Aspects of Regularisation in Machine Learning

Steinke, F.

Universität des Saarlandes, Saarbrücken, Germany, 2009 (phdthesis)

ei

PDF [BibTex]