2008

Natural Actor-Critic

Peters, J., Schaal, S.

Neurocomputing, 71(7-9):1180-1190, March 2008 (article)

Abstract
In this paper, we suggest a novel reinforcement learning architecture, the Natural Actor-Critic. The actor updates are achieved using stochastic policy gradients employing Amari’s natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of the coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke’s Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.

ei

PDF PDF DOI [BibTex]

Inferring Spike Trains From Local Field Potentials

Rasch, M., Gretton, A., Murayama, Y., Maass, W., Logothetis, N.

Journal of Neurophysiology, 99(3):1461-1476, March 2008 (article)

Abstract
We investigated whether it is possible to infer spike trains solely on the basis of the underlying local field potentials (LFPs). Using support vector machines and linear regression models, we found that in the primary visual cortex (V1) of monkeys, spikes can indeed be inferred from LFPs, at least with moderate success. Although there is a considerable degree of variation across electrodes, the low-frequency structure in spike trains (in the 100-ms range) can be inferred with reasonable accuracy, whereas exact spike positions are not reliably predicted. Two kinds of features of the LFP are exploited for prediction: the frequency power of bands in the high gamma-range (40–90 Hz) and information contained in low-frequency oscillations (<10 Hz), where both phase and power modulations are informative. Information analysis revealed that both features code (mainly) independent aspects of the spike-to-LFP relationship, with the low-frequency LFP phase coding for temporally clustered spiking activity. Although both features and prediction quality are similar during seminatural movie stimuli and spontaneous activity, prediction performance during spontaneous activity degrades much more slowly with increasing electrode distance. The general trend of data obtained with anesthetized animals is qualitatively mirrored in that of a more limited data set recorded in V1 of non-anesthetized monkeys. In contrast to the cortical field potentials, thalamic LFPs (e.g., LFPs derived from recordings in the dorsal lateral geniculate nucleus) hold no useful information for predicting spiking activity.

ei

Web DOI [BibTex]

ISD: A Software Package for Bayesian NMR Structure Calculation

Rieping, W., Nilges, M., Habeck, M.

Bioinformatics, 24(8):1104-1105, February 2008 (article)

Abstract
SUMMARY: The conventional approach to calculating biomolecular structures from nuclear magnetic resonance (NMR) data is often viewed as subjective due to its dependence on rules of thumb for deriving geometric constraints and suitable values for theory parameters from noisy experimental data. As a result, it can be difficult to judge the precision of an NMR structure in an objective manner. The Inferential Structure Determination (ISD) framework, which has been introduced recently, addresses this problem by using Bayesian inference to derive a probability distribution that represents both the unknown structure and its uncertainty. It also determines additional unknowns, such as theory parameters, that normally need be chosen empirically. Here we give an overview of the ISD software package, which implements this methodology. AVAILABILITY: The program is available at http://www.bioc.cam.ac.uk/isd

ei

Web DOI [BibTex]

Probabilistic Structure Calculation

Nilges, M., Habeck, M., Rieping, W.

Comptes Rendus Chimie, 11(4-5):356-369, February 2008 (article)

Abstract
Molecular structures are usually calculated from experimental data with some method of energy minimisation or non-linear optimisation. Key aims of a structure calculation are to estimate the coordinate uncertainty, and to provide a meaningful measure of the quality of the fit to the data. We discuss approaches to optimally combine prior information and experimental data and the connection to probability theory. We analyse the appropriate statistics for NOEs and NOE-derived distances, and the related question of restraint potentials. Finally, we will discuss approaches to determine the appropriate weight on the experimental evidence and to obtain in this way an estimate of the data quality from the structure calculation. Whereas objective estimates of coordinates and their uncertainties can only be obtained by a full Bayesian treatment of the problem, standard structure calculation methods continue to play an important role. To obtain the full benefit of these methods, they should be founded on a rigorous Bayesian analysis.

ei

Web DOI [BibTex]

Fast Projection-based Methods for the Least Squares Nonnegative Matrix Approximation Problem

Kim, D., Sra, S., Dhillon, I.

Statistical Analysis and Data Mining, 1(1):38-51, February 2008 (article)

Abstract
Nonnegative matrix approximation (NNMA) is a popular matrix decomposition technique that has proven to be useful across a diverse variety of fields with applications ranging from document analysis and image processing to bioinformatics and signal processing. Over the years, several algorithms for NNMA have been proposed, e.g. Lee and Seung’s multiplicative updates, alternating least squares (ALS), and gradient descent-based procedures. However, most of these procedures suffer from either slow convergence, numerical instability, or at worst, serious theoretical drawbacks. In this paper, we develop a new and improved algorithmic framework for the least-squares NNMA problem, which is not only theoretically well-founded, but also overcomes many deficiencies of other methods. Our framework readily admits powerful optimization techniques and as concrete realizations we present implementations based on the Newton, BFGS and conjugate gradient methods. Our algorithms provide numerical results superior to both Lee and Seung’s method as well as to the alternating least squares heuristic, which was reported to work well in some situations but has no theoretical guarantees[1]. Our approach extends naturally to include regularization and box-constraints without sacrificing convergence guarantees. We present experimental results on both synthetic and real-world datasets that demonstrate the superiority of our methods, both in terms of better approximations as well as computational efficiency.

ei

Web DOI [BibTex]

A Unifying Probabilistic Framework for Analyzing Residual Dipolar Couplings

Habeck, M., Nilges, M., Rieping, W.

Journal of Biomolecular NMR, 40(2):135-144, February 2008 (article)

Abstract
Residual dipolar couplings provide complementary information to the nuclear Overhauser effect measurements that are traditionally used in biomolecular structure determination by NMR. In a de novo structure determination, however, lack of knowledge about the degree and orientation of molecular alignment complicates the analysis of dipolar coupling data. We present a probabilistic framework for analyzing residual dipolar couplings and demonstrate that it is possible to estimate the atomic coordinates, the complete molecular alignment tensor, and the error of the couplings simultaneously. As a by-product, we also obtain estimates of the uncertainty in the coordinates and the alignment tensor. We show that our approach encompasses existing methods for determining the alignment tensor as special cases, including least squares estimation, histogram fitting, and elimination of an explicit alignment tensor in the restraint energy.

ei

PDF DOI [BibTex]

Contour-propagation Algorithms for Semi-automated Reconstruction of Neural Processes

Macke, J., Maack, N., Gupta, R., Denk, W., Schölkopf, B., Borst, A.

Journal of Neuroscience Methods, 167(2):349-357, January 2008 (article)

Abstract
A new technique, “Serial Block Face Scanning Electron Microscopy” (SBFSEM), allows for automatic sectioning and imaging of biological tissue with a scanning electron microscope. Image stacks generated with this technology have a resolution sufficient to distinguish different cellular compartments, including synaptic structures, which should make it possible to obtain detailed anatomical knowledge of complete neuronal circuits. Such an image stack contains several thousands of images and is recorded with a minimal voxel size of 10-20 nm in the x- and y-directions and 30 nm in the z-direction. Consequently, a tissue block of 1 mm³ (the approximate volume of the Calliphora vicina brain) will produce several hundred terabytes of data. Therefore, highly automated 3D reconstruction algorithms are needed. As a first step in this direction we have developed semiautomated segmentation algorithms for a precise contour tracing of cell membranes. These algorithms were embedded into an easy-to-operate user interface, which allows direct 3D observation of the extracted objects during the segmentation of image stacks. Compared to purely manual tracing, processing time is greatly accelerated.

ei

PDF Web DOI [BibTex]

A Quantum-Statistical-Mechanical Extension of Gaussian Mixture Model

Tanaka, K., Tsuda, K.

Journal of Physics: Conference Series, 95(012023):1-9, January 2008 (article)

Abstract
We propose an extension of Gaussian mixture models from the statistical-mechanical point of view. Conventional Gaussian mixture models are formulated to divide all points in the given data into a number of classes. We introduce quantum states constructed by superposing the conventional classes in linear combinations. Our extension provides a new algorithm for classifying data by means of linear response formulas from statistical mechanics.

ei

PDF PDF DOI [BibTex]

Learning to control in operational space

Peters, J., Schaal, S.

International Journal of Robotics Research, 27, pages: 197-212, 2008, clmc (article)

Abstract
One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for operational space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm.

am ei

link (url) DOI [BibTex]

2007

Reaction graph kernels for discovering missing enzymes in the plant secondary metabolism

Saigo, H., Hattori, M., Tsuda, K.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
Secondary metabolic pathways in plants are important for finding druggable candidate enzymes. However, there are many enzymes whose functions are still undiscovered, especially in organism-specific metabolic pathways. We propose reaction graph kernels for automatically assigning EC numbers to unknown enzymatic reactions in a metabolic network. Experiments were carried out on the KEGG/REACTION database, and our method successfully predicted the first three digits of the EC number with 83% accuracy. We also exhaustively predicted missing enzymatic functions in the plant secondary metabolism pathways and evaluated the biochemical validity of our results.

ei

Web [BibTex]

Positional Oligomer Importance Matrices

Sonnenburg, S., Zien, A., Philips, P., Rätsch, G.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences, above all of DNA and proteins. In many cases, the most accurate classifiers are obtained by training SVMs with complex sequence kernels, for instance for transcription starts or splice sites. However, an often criticized downside of SVMs with complex kernels is that it is very hard for humans to understand the learned decision rules and to derive biological insights from them. To close this gap, we introduce the concept of positional oligomer importance matrices (POIMs) and develop an efficient algorithm for their computation. We demonstrate how they overcome the limitations of sequence logos, and how they can be used to find relevant motifs for different biological phenomena in a straightforward way. Note that the concept of POIMs is not limited to interpreting SVMs, but is applicable to general k-mer based scoring systems.

ei

Web [BibTex]

Machine Learning Algorithms for Polymorphism Detection

Schweikert, G., Zeller, G., Weigel, D., Schölkopf, B., Rätsch, G.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

ei

Web [BibTex]

A Tutorial on Spectral Clustering

von Luxburg, U.

Statistics and Computing, 17(4):395-416, December 2007 (article)

Abstract
In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance spectral clustering appears slightly mysterious, and it is not obvious why it works at all and what it really does. The goal of this tutorial is to give some intuition on those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
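As a rough illustration of the graph Laplacians mentioned in the abstract, the sketch below builds the unnormalized Laplacian L = D - W for a small weighted graph and checks two of its standard properties. The toy graph and all function names are assumptions made for this example and are not taken from the tutorial itself.

```python
def unnormalized_laplacian(W):
    """Return L = D - W for a symmetric weight matrix W (list of lists)."""
    n = len(W)
    degrees = [sum(row) for row in W]
    return [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def quadratic_form(L, f):
    """Compute f^T L f."""
    return sum(f_i * y_i for f_i, y_i in zip(f, matvec(L, f)))


# Hypothetical toy graph: vertices {0, 1, 2} form a triangle, {3, 4} share an edge.
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
L = unnormalized_laplacian(W)

# The indicator vector of a connected component is mapped to zero by L.
component_indicator = [1.0, 1.0, 1.0, 0.0, 0.0]
print(matvec(L, component_indicator))

# f^T L f equals half the sum of w_ij * (f_i - f_j)^2 over all ordered pairs.
f = [1.0, -1.0, 0.0, 0.0, 0.0]
print(quadratic_form(L, f))
```

Because indicator vectors of connected components lie in the null space of L, the eigenvectors belonging to eigenvalue zero carry the cluster structure of a disconnected graph, which is the intuition spectral clustering methods build on.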

ei

PDF PDF DOI [BibTex]

An Automated Combination of Kernels for Predicting Protein Subcellular Localization

Zien, A., Ong, C.

NIPS Workshop on Machine Learning in Computational Biology, December 2007 (talk)

Abstract
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions. We propose a new class of protein sequence kernels which considers all motifs including motifs with gaps. This class of kernels allows the inclusion of pairwise amino acid distances into their computation. We utilize an extension of the multiclass support vector machine (SVM) method which directly solves protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. To automatically search over families of possible amino acid motifs, we optimize over multiple kernels at the same time. We compare our automated approach to four other predictors on three different datasets, and show that we perform better than the current state of the art. Furthermore, our method provides some insights as to which features are most useful for determining subcellular localization, which are in agreement with biological reasoning.

ei

Web [BibTex]

A Tutorial on Kernel Methods for Categorization

Jäkel, F., Schölkopf, B., Wichmann, F.

Journal of Mathematical Psychology, 51(6):343-358, December 2007 (article)

Abstract
The abilities to learn and to categorize are fundamental for cognitive systems, be it animals or machines, and therefore have attracted attention from engineers and psychologists alike. Modern machine learning methods and psychological models of categorization are remarkably similar, partly because these two fields share a common history in artificial neural networks and reinforcement learning. However, machine learning is now an independent and mature field that has moved beyond psychologically or neurally inspired algorithms towards providing foundations for a theory of learning that is rooted in statistics and functional analysis. Much of this research is potentially interesting for psychological theories of learning and categorization but also hardly accessible for psychologists. Here, we provide a tutorial introduction to a popular class of machine learning tools, called kernel methods. These methods are closely related to perceptrons, radial-basis-function neural networks and exemplar theories of categorization. Recent theoretical advances in machine learning are closely tied to the idea that the similarity of patterns can be encapsulated in a positive definite kernel. Such a positive definite kernel can define a reproducing kernel Hilbert space which allows one to use powerful tools from functional analysis for the analysis of learning algorithms. We give basic explanations of some key concepts (the so-called kernel trick, the representer theorem and regularization), which may open up the possibility that insights from machine learning can feed back into psychology.

ei

PDF Web DOI [BibTex]

Accurate Splice site Prediction Using Support Vector Machines

Sonnenburg, S., Schweikert, G., Philips, P., Behr, J., Rätsch, G.

BMC Bioinformatics, 8(Supplement 10):1-16, December 2007 (article)

Abstract
Background: For splice site recognition, one has to solve two classification problems: discriminating true from decoy splice sites for both acceptor and donor sites. Gene finding systems typically rely on Markov Chains to solve these tasks. Results: In this work we consider Support Vector Machines for splice site recognition. We employ the so-called weighted degree kernel which turns out well suited for this task, as we will illustrate in several experiments where we compare its prediction accuracy with that of recently proposed systems. We apply our method to the genome-wide recognition of splice sites in Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, Danio rerio, and Homo sapiens. Our performance estimates indicate that splice sites can be recognized very accurately in these genomes and that our method outperforms many other methods including Markov Chains, GeneSplicer and SpliceMachine. We provide genome-wide predictions of splice sites and a stand-alone prediction tool ready to be used for incorporation in a gene finder. Availability: Data, splits, additional information on the model selection, the whole genome predictions, as well as the stand-alone prediction tool are available for download at http://www.fml.mpg.de/raetsch/projects/splice.

ei

PDF DOI [BibTex]

Challenges in Brain-Computer Interface Development: Induction, Measurement, Decoding, Integration

Hill, NJ.

Invited keynote talk at the launch of BrainGain, the Dutch BCI research consortium, November 2007 (talk)

Abstract
I’ll present a perspective on Brain-Computer Interface development from Tübingen. Some of the benefits promised by BCI technology lie in the near foreseeable future, and some further away. Our motivation is to make BCI technology feasible for the people who could benefit from what it has to offer soon: namely, people in the "completely locked-in" state. I’ll mention some of the challenges of working with this user group, and explain the specific directions they have motivated us to take in developing experimental methods, algorithms, and software.

ei

[BibTex]

Towards compliant humanoids: an experimental assessment of suitable task space position/orientation controllers

Nakanishi, J., Mistry, M., Peters, J., Schaal, S.

In IROS 2007, 2007, pages: 2520-2527, (Editors: Grant, E., T. C. Henderson), IEEE Service Center, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems, November 2007 (inproceedings)

Abstract
Compliant control will be a prerequisite for humanoid robotics if these robots are supposed to work safely and robustly in human and/or dynamic environments. One view of compliant control is that a robot should control a minimal number of degrees-of-freedom (DOFs) directly, i.e., those relevant DOFs for the task, and keep the remaining DOFs maximally compliant, usually in the null space of the task. This view naturally leads to task space control. However, surprisingly few implementations of task space control can be found in actual humanoid robots. This paper makes a first step towards assessing the usefulness of task space controllers for humanoids by investigating which choices of controllers are available and what inherent control characteristics they have; this treatment will concern position and orientation control, where the latter is based on a quaternion formulation. Empirical evaluations on an anthropomorphic Sarcos master arm illustrate the robustness of the different controllers as well as the ease of implementing and tuning them. Our extensive empirical results demonstrate that simpler task space controllers, e.g., classical resolved motion rate control or resolved acceleration control can be quite advantageous in face of inevitable modeling errors in model-based control, and that well chosen formulations are easy to implement and quite robust, such that they are useful for humanoids.

ei

PDF Web DOI [BibTex]

Performance Stabilization and Improvement in Graph-based Semi-supervised Learning with Ensemble Method and Graph Sharpening

Choi, I., Shin, H.

In Korean Data Mining Society Conference, pages: 257-262, Korean Data Mining Society, Seoul, Korea, Korean Data Mining Society Conference, November 2007 (inproceedings)

ei

PDF [BibTex]

Policy Learning for Robotics

Peters, J.

14th International Conference on Neural Information Processing (ICONIP), November 2007 (talk)

ei

Web [BibTex]

A unifying framework for robot control with redundant DOFs

Peters, J., Mistry, M., Udwadia, F., Nakanishi, J., Schaal, S.

Autonomous Robots, 24(1):1-12, October 2007 (article)

Abstract
Recently, Udwadia (Proc. R. Soc. Lond. A 2003:1783–1800, 2003) suggested deriving tracking controllers for mechanical systems with redundant degrees-of-freedom (DOFs) using a generalization of Gauss’ principle of least constraint. This method allows reformulating control problems as a special class of optimal controllers. In this paper, we take this line of reasoning one step further and demonstrate that several well-known and also novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. The suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, with or without external constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics.

ei

PDF PDF DOI [BibTex]

The Need for Open Source Software in Machine Learning

Sonnenburg, S., Braun, M., Ong, C., Bengio, S., Bottou, L., Holmes, G., LeCun, Y., Müller, K., Pereira, F., Rasmussen, C., Rätsch, G., Schölkopf, B., Smola, A., Vincent, P., Weston, J., Williamson, R.

Journal of Machine Learning Research, 8, pages: 2443-2466, October 2007 (article)

Abstract
Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for diverse applications. However, the true potential of these methods is not realized, since existing implementations are not openly shared, resulting in software with low usability, and weak interoperability. We argue that this situation can be significantly improved by increasing incentives for researchers to publish their software under an open source model. Additionally, we outline the problems authors are faced with when trying to publish algorithmic implementations of machine learning methods. We believe that a resource of peer reviewed software accompanied by short articles would be highly valuable to both the machine learning and the general scientific community.

ei

PDF Web [BibTex]

Hilbert Space Representations of Probability Distributions

Gretton, A.

2nd Workshop on Machine Learning and Optimization at the ISM, October 2007 (talk)

Abstract
Many problems in unsupervised learning require the analysis of features of probability distributions. At the most fundamental level, we might wish to determine whether two distributions are the same, based on samples from each - this is known as the two-sample or homogeneity problem. We use kernel methods to address this problem, by mapping probability distributions to elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently rich RKHS, these representations are unique: thus comparing feature space representations allows us to compare distributions without ambiguity. Applications include testing whether cancer subtypes are distinguishable on the basis of DNA microarray data, and whether low frequency oscillations measured at an electrode in the cortex have a different distribution during a neural spike. A more difficult problem is to discover whether two random variables drawn from a joint distribution are independent. It turns out that any dependence between pairs of random variables can be encoded in a cross-covariance operator between appropriate RKHS representations of the variables, and we may test independence by looking at a norm of the operator. We demonstrate this independence test by establishing dependence between an English text and its French translation, as opposed to French text on the same topic but otherwise unrelated. Finally, we show that this operator norm is itself a difference in feature means.

ei

PDF Web [BibTex]

Discriminative Subsequence Mining for Action Classification

Nowozin, S., BakIr, G., Tsuda, K.

In ICCV 2007, pages: 1919-1923, IEEE Computer Society, Los Alamitos, CA, USA, 11th IEEE International Conference on Computer Vision, October 2007 (inproceedings)

Abstract
Recent approaches to action classification in videos have used sparse spatio-temporal words encoding local appearance around interesting movements. Most of these approaches use a histogram representation, discarding the temporal order among features. But this ordering information can contain important information about the action itself, e.g. consider the sport disciplines of hurdle race and long jump, where the global temporal order of motions (running, jumping) is important to discriminate between the two. In this work we propose to use a sequential representation which retains this temporal order. Further, we introduce Discriminative Subsequence Mining to find optimal discriminative subsequence patterns. In combination with the LPBoost classifier, this amounts to simultaneously learning a classification function and performing feature selection in the space of all possible feature sequences. The resulting classifier linearly combines a small number of interpretable decision functions, each checking for the presence of a single discriminative pattern. The classifier is benchmarked on the KTH action classification data set and outperforms the best known results in the literature.

ei

PDF Web DOI [BibTex]

Regression with Intervals

Kashima, H., Yamazaki, K., Saigo, H., Inokuchi, A.

International Workshop on Data-Mining and Statistical Science (DMSS2007), October 2007, JSAI Incentive Award. Talk was given by Hisashi Kashima. (talk)

ei

Web [BibTex]

Some observations on the masking effects of Mach bands

Curnow, T., Cowie, DA., Henning, GB., Hill, NJ.

Journal of the Optical Society of America A, 24(10):3233-3241, October 2007 (article)

Abstract
There are 8 cycle/deg ripples or oscillations in performance as a function of location near Mach bands in experiments measuring Mach bands’ masking effects on random polarity signal bars. The oscillations with increments are 180 degrees out of phase with those for decrements. The oscillations, much larger than the measurement error, appear to relate to the weighting function of the spatial-frequency-tuned channel detecting the broadband signals. The ripples disappear with step maskers and become much smaller at durations below 25 ms, implying either that the site of masking has changed or that the weighting function and hence spatial-frequency tuning is slow to develop.

ei

PDF Web DOI [BibTex]

A Hilbert Space Embedding for Distributions

Smola, A., Gretton, A., Song, L., Schölkopf, B.

In Algorithmic Learning Theory, Lecture Notes in Computer Science 4754, pages: 13-31, (Editors: M Hutter and RA Servedio and E Takimoto), Springer, Berlin, Germany, 18th International Conference on Algorithmic Learning Theory (ALT), October 2007 (inproceedings)

Abstract
We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space. Applications of this technique can be found in two-sample tests, which are used for determining whether two sets of observations arise from the same distribution, covariate shift correction, local learning, measures of independence, and density estimation.
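The idea of comparing distributions through their embeddings can be sketched with a simple sample-based estimate of the squared distance between two kernel mean embeddings. The Gaussian kernel, the bandwidth of 1.0, the biased estimator and the synthetic 1-D samples below are all assumptions chosen for illustration; the abstract does not prescribe them.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel on real numbers; the bandwidth is an assumed value."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared distance between the two mean embeddings."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


# Synthetic data: two samples from the same Gaussian and one from a shifted Gaussian.
rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(200)]

print(mmd_squared(same_a, same_b))   # small: both samples share one distribution
print(mmd_squared(same_a, shifted))  # clearly larger: the distributions differ
```

No density is estimated at any point; only pairwise kernel evaluations between samples are needed, which is the practical appeal of the embedding view. A full two-sample test would additionally need a threshold for the statistic, which this sketch leaves out.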

ei

PDF PDF DOI [BibTex]

Cluster Identification in Nearest-Neighbor Graphs

Maier, M., Hein, M., von Luxburg, U.

In ALT 2007, pages: 196-210, (Editors: Hutter, M., R. A. Servedio, E. Takimoto), Springer, Berlin, Germany, 18th International Conference on Algorithmic Learning Theory, October 2007 (inproceedings)

Abstract
Assume we are given a sample of points from some underlying distribution which contains several distinct clusters. Our goal is to construct a neighborhood graph on the sample points such that clusters are "identified": that is, the subgraph induced by points from the same cluster is connected, while subgraphs corresponding to different clusters are not connected to each other. We derive bounds on the probability that cluster identification is successful, and use them to predict "optimal" values of k for the mutual and symmetric k-nearest-neighbor graphs. We point out different properties of the mutual and symmetric nearest-neighbor graphs related to the cluster identification problem.
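
The two graph types named in this abstract can be illustrated with a minimal sketch. It assumes the usual definitions (mutual: an edge only if each point is among the other's k nearest neighbors; symmetric: an edge if at least one is) and uses hypothetical helper names and synthetic data that are not taken from the paper. It does not reproduce the paper's probability bounds or its choice of k.

```python
import numpy as np

def knn_indices(points, k):
    """Return an (n, k) array holding the indices of each point's k nearest neighbors."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    return np.argsort(dists, axis=1)[:, :k]

def knn_graphs(points, k):
    """Build boolean adjacency matrices of the mutual and symmetric kNN graphs."""
    n = len(points)
    directed = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), k)
    directed[rows, knn_indices(points, k).ravel()] = True
    mutual = directed & directed.T     # both points list each other
    symmetric = directed | directed.T  # at least one point lists the other
    return mutual, symmetric

def connected_components(adjacency):
    """Label connected components with an iterative depth-first search."""
    n = len(adjacency)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in np.flatnonzero(adjacency[node]):
                if labels[neighbor] < 0:
                    labels[neighbor] = current
                    stack.append(neighbor)
        current += 1
    return labels

# Synthetic example (assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(30, 2))
cluster_b = rng.normal(loc=(10.0, 10.0), scale=0.3, size=(30, 2))
points = np.vstack([cluster_a, cluster_b])

mutual, symmetric = knn_graphs(points, k=5)
mutual_labels = connected_components(mutual)
symmetric_labels = connected_components(symmetric)
```

Because the mutual graph is always a subgraph of the symmetric one, it can only have the same number of connected components or more. Clusters are "identified" in the abstract's sense when each cluster forms exactly one component.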

ei

PDF PDF DOI [BibTex]

Inducing Metric Violations in Human Similarity Judgements

Laub, J., Macke, J., Müller, K., Wichmann, F.

In Advances in Neural Information Processing Systems 19, pages: 777-784, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
Attempting to model human categorization and similarity judgements is a very interesting but also an exceedingly difficult challenge. Some of the difficulty arises because of conflicting evidence as to whether human categorization and similarity judgements should or should not be modelled as operating on a mental representation that is essentially metric. Intuitively, this has a strong appeal as it would allow (dis)similarity to be represented geometrically as distance in some internal space. Here we show how a single stimulus, carefully constructed in a psychophysical experiment, introduces l2 violations in what used to be an internal similarity space that could be adequately modelled as Euclidean. We term this one influential data point a conflictual judgement. We present an algorithm for analysing such data and identifying the crucial point. Thus there may not be a strict dichotomy between either a metric or a non-metric internal space but rather degrees to which potentially large subsets of stimuli are represented metrically with a small subset causing a global violation of metricity.

ei

PDF Web [BibTex]

Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods

Seeger, M.

In Advances in Neural Information Processing Systems 19, pages: 1233-1240, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We propose a highly efficient framework for kernel multi-class models with a large and structured set of classes. Kernel parameters are learned automatically by maximizing the cross-validation log likelihood, and predictive probabilities are estimated. We demonstrate our approach on large scale text classification tasks with hierarchical class structure, achieving state-of-the-art results in an order of magnitude less time than previous work.

ei

PDF Web [BibTex]

A Local Learning Approach for Clustering

Wu, M., Schölkopf, B.

In Advances in Neural Information Processing Systems 19, pages: 1529-1536, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We present a local learning approach for clustering. The basic idea is that a good clustering result should have the property that the cluster label of each data point can be well predicted based on its neighboring data and their cluster labels, using current supervised learning methods. An optimization problem is formulated such that its solution has the above property. Relaxation and eigen-decomposition are applied to solve this optimization problem. We also briefly investigate the parameter selection issue and provide a simple parameter selection method for the proposed algorithm. Experimental results are provided to validate the effectiveness of the proposed approach.

ei

PDF Web [BibTex]

MR-Based PET Attenuation Correction: Method and Validation

Hofmann, M., Steinke, F., Scheel, V., Brady, M., Schölkopf, B., Pichler, B.

Joint Molecular Imaging Conference, September 2007 (talk)

Abstract
PET/MR combines the high soft tissue contrast of Magnetic Resonance Imaging (MRI) and the functional information of Positron Emission Tomography (PET). For quantitative PET information, correction of tissue photon attenuation is mandatory. Usually in conventional PET, the attenuation map is obtained from a transmission scan, which uses a rotating source, or from the CT scan in case of combined PET/CT. In the case of a PET/MR scanner, there is insufficient space for the rotating source and ideally one would want to calculate the attenuation map from the MR image instead. Since MR images provide information about proton density of the different tissue types, it is not trivial to use this data for PET attenuation correction. We present a method for predicting the PET attenuation map from a given MR image, using a combination of atlas-registration and recognition of local patterns. Using "leave one out cross validation" we show on a database of 16 MR-CT image pairs that our method reliably allows estimating the CT image from the MR image. Subsequently, as in PET/CT, the PET attenuation map can be predicted from the CT image. On an additional dataset of MR/CT/PET triplets we quantitatively validate that our approach allows PET quantification with an error that is smaller than what would be clinically significant. We demonstrate our approach on T1-weighted human brain scans. However, the presented methods are more general and current research focuses on applying the established methods to human whole body PET/MRI applications.

ei

PDF Web [BibTex]

Mining complex genotypic features for predicting HIV-1 drug resistance

Saigo, H., Uno, T., Tsuda, K.

Bioinformatics, 23(18):2455-2462, September 2007 (article)

Abstract
Human immunodeficiency virus type 1 (HIV-1) evolves in the human body, and its exposure to a drug often causes mutations that enhance the resistance against the drug. To design an effective pharmacotherapy for an individual patient, it is important to accurately predict the drug resistance based on genotype data. Notably, the resistance is not just the simple sum of the effects of all mutations. Structural biological studies suggest that the association of mutations is crucial: even if mutations A or B alone do not affect the resistance, a significant change might happen when the two mutations occur together. Linear regression methods cannot take the associations into account, while decision tree methods can reveal only limited associations. Kernel methods and neural networks implicitly use all possible associations for prediction, but cannot select salient associations explicitly. Our method, itemset boosting, performs linear regression in the complete space of power sets of mutations. It implements a forward feature selection procedure where, in each iteration, one mutation combination is found by an efficient branch-and-bound search. This method uses all possible combinations, and salient associations are explicitly shown. In experiments, our method worked particularly well for predicting the resistance of nucleotide reverse transcriptase inhibitors (NRTIs). Furthermore, it successfully recovered many mutation associations known in biological literature.

ei

Web DOI [BibTex]

Branch and Bound for Semi-Supervised Support Vector Machines

Chapelle, O., Sindhwani, V., Keerthi, S.

In Advances in Neural Information Processing Systems 19, pages: 217-224, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
Semi-supervised SVMs (S3VMs) attempt to learn low-density separators by maximizing the margin over labeled and unlabeled examples. The associated optimization problem is non-convex. To examine the full potential of S3VMs modulo local minima problems in current implementations, we apply branch and bound techniques for obtaining exact, globally optimal solutions. Empirical evidence suggests that the globally optimal solution can return excellent generalization performance in situations where other implementations fail completely. While our current implementation is only applicable to small datasets, we discuss variants that can potentially lead to practically useful algorithms.

ei

PDF Web [BibTex]

A Kernel Method for the Two-Sample-Problem

Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.

In Advances in Neural Information Processing Systems 19, pages: 513-520, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a reproducing kernel Hilbert space (RKHS). The first test is based on a large deviation bound for the test statistic, while the second is based on the asymptotic distribution of this statistic. The test statistic can be computed in $O(m^2)$ time. We apply our approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where our test performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.
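
The test statistic described here, the distance between the two sample means after mapping into an RKHS, can be sketched as follows. This is a minimal illustration that assumes a Gaussian kernel with a hand-picked bandwidth and the biased (V-statistic) estimate of the squared distance. The function names and the synthetic data are hypothetical, and neither of the two acceptance thresholds from the abstract is implemented.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared RKHS distance between the sample means.

    Each kernel matrix has O(m^2) entries, matching the cost quoted in the abstract.
    """
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

# Synthetic example (assumption): same distribution vs. a shifted one.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))
y_same = rng.normal(0.0, 1.0, size=(200, 2))
y_shifted = rng.normal(3.0, 1.0, size=(200, 2))

near = mmd2_biased(x, y_same)
far = mmd2_biased(x, y_shifted)
```

The statistic stays close to zero when both samples come from the same distribution and grows when they differ. Turning it into a test still requires a threshold, which the paper derives from a large deviation bound or from the asymptotic distribution.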

ei

PDF Web [BibTex]

An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models

Keerthi, S., Sindhwani, V., Chapelle, O.

In Advances in Neural Information Processing Systems 19, pages: 673-680, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We consider the task of tuning hyperparameters in SVM models based on minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be very efficiently done; often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.

ei

PDF Web [BibTex]

Learning Dense 3D Correspondence

Steinke, F., Schölkopf, B., Blanz, V.

In Advances in Neural Information Processing Systems 19, pages: 1313-1320, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
Establishing correspondence between distinct objects is an important and nontrivial task: correctness of the correspondence hinges on properties which are difficult to capture in an a priori criterion. While previous work has used a priori criteria which in some cases led to very good results, the present paper explores whether it is possible to learn a combination of features that, for a given training set of aligned human heads, characterizes the notion of correct correspondence. By optimizing this criterion, we are then able to compute correspondence and morphs for novel heads.

ei

PDF Web [BibTex]

Optimal Dominant Motion Estimation using Adaptive Search of Transformation Space

Ulges, A., Lampert, CH., Keysers, D., Breuel, TM.

In DAGM 2007, pages: 204-215, (Editors: Hamprecht, F. A., C. Schnörr, B. Jähne), Springer, Berlin, Germany, 29th Annual Symposium of the German Association for Pattern Recognition, September 2007 (inproceedings)

Abstract
The extraction of a parametric global motion from a motion field is a task with several applications in video processing. We present two probabilistic formulations of the problem and carry out optimization using the RAST algorithm, a geometric matching method novel to motion estimation in video. RAST uses an exhaustive and adaptive search of transformation space and thus gives -- in contrast to local sampling optimization techniques used in the past -- a globally optimal solution. Among other applications, our framework can thus be used as a source of ground truth for benchmarking motion estimation algorithms. Our main contributions are: first, the novel combination of a state-of-the-art MAP criterion for dominant motion estimation with a search procedure that guarantees global optimality. Second, experimental results that illustrate the superior performance of our approach on synthetic flow fields as well as real-world video streams. Third, a significant speedup of the search achieved by extending the model with an additional smoothness prior.

ei

PDF Web DOI [BibTex]

Output Grouping using Dirichlet Mixtures of Linear Gaussian State-Space Models

Chiappa, S., Barber, D.

In ISPA 2007, pages: 446-451, IEEE Computer Society, Los Alamitos, CA, USA, 5th International Symposium on Image and Signal Processing and Analysis, September 2007 (inproceedings)

Abstract
We consider a model to cluster the components of a vector time-series. The task is to assign each component of the vector time-series to a single cluster, basing this assignment on the simultaneous dynamical similarity of the component to other components in the cluster. This is in contrast to the more familiar task of clustering a set of time-series based on global measures of their similarity. The model is based on a Dirichlet Mixture of Linear Gaussian State-Space models (LGSSMs), in which each LGSSM is treated with a prior to encourage the simplest explanation. The resulting model is approximated using a ‘collapsed’ variational Bayes implementation.

ei

PDF Web DOI [BibTex]

Manifold Denoising

Hein, M., Maier, M.

In Advances in Neural Information Processing Systems 19, pages: 561-568, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We consider the problem of denoising a noisily sampled submanifold $M$ in $R^d$, where the submanifold $M$ is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with non-trivial high-dimensional noise. Moreover, using the denoising algorithm as a pre-processing method, we can improve the results of a semi-supervised learning algorithm.
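
As a rough illustration of a graph-based diffusion process acting on a point sample, the sketch below applies one implicit diffusion step on a fixed k-nearest-neighbor graph. Everything specific in it is an assumption made for illustration and is not taken from the paper: the unnormalized Laplacian, the Gaussian edge weights, the step size, the single step on a fixed graph, and all function names.

```python
import numpy as np

def knn_weight_matrix(points, k):
    """Symmetric kNN graph with Gaussian edge weights (illustrative choice)."""
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)  # exclude self-neighbors
    neighbors = np.argsort(sq_dists, axis=1)[:, :k]
    # Scale: mean squared distance to the k-th neighbor.
    scale = sq_dists[np.arange(n), neighbors[:, -1]].mean()
    rows = np.repeat(np.arange(n), k)
    cols = neighbors.ravel()
    weights = np.zeros((n, n))
    weights[rows, cols] = np.exp(-sq_dists[rows, cols] / scale)
    return np.maximum(weights, weights.T)

def diffusion_step(points, k=10, step=0.5):
    """One implicit Euler step: solve (I + step * L) X_new = X."""
    weights = knn_weight_matrix(points, k)
    laplacian = np.diag(weights.sum(axis=1)) - weights
    system = np.eye(len(points)) + step * laplacian
    return np.linalg.solve(system, points), laplacian

def dirichlet_energy(points, laplacian):
    """Weighted sum of squared differences across graph edges."""
    return float(np.trace(points.T @ laplacian @ points))

# Synthetic example (assumption): noisy samples from the unit circle.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
clean = np.column_stack([np.cos(angles), np.sin(angles)])
noisy = clean + rng.normal(scale=0.1, size=clean.shape)

denoised, laplacian = diffusion_step(noisy, k=10, step=0.5)
```

The implicit step never increases the Dirichlet energy on the graph it was computed from, so neighboring points are pulled toward each other. A practical procedure would need to decide how many steps to run and whether to rebuild the graph between steps.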

ei

PDF Web [BibTex]

How to Find Interesting Locations in Video: A Spatiotemporal Interest Point Detector Learned from Human Eye movements

Kienzle, W., Schölkopf, B., Wichmann, F., Franz, M.

In Pattern Recognition, pages: 405-414, (Editors: FA Hamprecht and C Schnörr and B Jähne), Springer, Berlin, Germany, 29th Annual Symposium of the German Association for Pattern Recognition (DAGM), September 2007 (inproceedings)

Abstract
Interest point detection in still images is a well-studied topic in computer vision. In the spatiotemporal domain, however, it is still unclear which features indicate useful interest points. In this paper we approach the problem by learning a detector from examples: we record eye movements of human subjects watching video sequences and train a neural network to predict which locations are likely to become eye movement targets. We show that our detector outperforms current spatiotemporal interest point architectures on a standard classification dataset.

ei

PDF Web DOI [BibTex]

Bayesian Inference for Sparse Generalized Linear Models

Seeger, M., Gerwinn, S., Bethge, M.

In ECML 2007, pages: 298-309, Lecture Notes in Computer Science ; 4701, (Editors: Kok, J. N., J. Koronacki, R. Lopez de Mantaras, S. Matwin, D. Mladenic, A. Skowron), Springer, Berlin, Germany, 18th European Conference on Machine Learning, September 2007 (inproceedings)

Abstract
We present a framework for efficient, accurate approximate Bayesian inference in generalized linear models (GLMs), based on the expectation propagation (EP) technique. The parameters can be endowed with a factorizing prior distribution, encoding properties such as sparsity or non-negativity. The central role of posterior log-concavity in Bayesian GLMs is emphasized and related to stability issues in EP. In particular, we use our technique to infer the parameters of a point process model for neuronal spiking data from multiple electrodes, demonstrating significantly superior predictive performance when a sparsity assumption is enforced via a Laplace prior distribution.

ei

PDF DOI [BibTex]

Implicit Surfaces with Globally Regularised and Compactly Supported Basis Functions

Walder, C., Schölkopf, B., Chapelle, O.

In Advances in Neural Information Processing Systems 19, pages: 273-280, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We consider the problem of constructing a function whose zero set is to represent a surface, given sample points with surface normal vectors. We contribute a novel means of regularising multi-scale compactly supported basis functions that leads to the desirable properties previously only associated with fully supported bases, and show its equivalence to a Gaussian process with modified covariance function. We also provide a regularisation framework for simpler and more direct treatment of surface normals, along with a corresponding generalisation of the representer theorem. We demonstrate the techniques on 3D problems of up to 14 million data points, as well as 4D time series data.

ei

PDF Web [BibTex]

Bayesian methods for NMR structure determination

Habeck, M.

29th Annual Discussion Meeting: Magnetic Resonance in Biophysical Chemistry, September 2007 (talk)

ei

Web [BibTex]

Real-Time Fetal Heart Monitoring in Biomagnetic Measurements Using Adaptive Real-Time ICA

Waldert, S., Bensch, M., Bogdan, M., Rosenstiel, W., Schölkopf, B., Lowery, C., Eswaran, H., Preissl, H.

IEEE Transactions on Biomedical Engineering, 54(10):1867-1874, September 2007 (article)

Abstract
Electrophysiological signals of the developing fetal brain and heart can be investigated by fetal magnetoencephalography (fMEG). During such investigations, the fetal heart activity and that of the mother should be monitored continuously to provide an important indication of current well-being. Due to physical constraints of an fMEG system, it is not possible to use clinically established heart monitors for this purpose. Considering this constraint, we developed a real-time heart monitoring system for biomagnetic measurements and showed its reliability and applicability in research and for clinical examinations. The developed system consists of real-time access to fMEG data, an algorithm based on Independent Component Analysis (ICA), and a graphical user interface (GUI). The algorithm extracts the current fetal and maternal heart signal from a noisy and artifact-contaminated data stream in real-time and is able to adapt automatically to continuously varying environmental parameters. This algorithm has been named Adaptive Real-time ICA (ARICA) and is applicable to real-time artifact removal as well as to related blind signal separation problems.

ei

PDF Web DOI [BibTex]

A Nonparametric Approach to Bottom-Up Visual Saliency

Kienzle, W., Wichmann, F., Schölkopf, B., Franz, M.

In Advances in Neural Information Processing Systems 19, pages: 689-696, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
This paper addresses the bottom-up influence of local image information on human eye movements. Most existing computational models use a set of biologically plausible linear filters, e.g., Gabor or Difference-of-Gaussians filters as a front-end, the outputs of which are nonlinearly combined into a real number that indicates visual saliency. Unfortunately, this requires many design parameters such as the number, type, and size of the front-end filters, as well as the choice of nonlinearities, weighting and normalization schemes etc., for which biological plausibility cannot always be justified. As a result, these parameters have to be chosen in a more or less ad hoc way. Here, we propose to learn a visual saliency model directly from human eye movement data. The model is rather simplistic and essentially parameter-free, and therefore contrasts recent developments in the field that usually aim at higher prediction rates at the cost of additional parameters and increasing model complexity. Experimental results show that - despite the lack of any biological prior knowledge - our model performs comparably to existing approaches, and in fact learns image features that resemble findings from several previous studies. In particular, its maximally excitatory stimuli have center-surround structure, similar to receptive fields in the early human visual system.

ei

PDF Web [BibTex]

Information Bottleneck for Non Co-Occurrence Data

Seldin, Y., Slonim, N., Tishby, N.

In Advances in Neural Information Processing Systems 19, pages: 1241-1248, (Editors: Schölkopf, B. , J. Platt, T. Hofmann), MIT Press, Cambridge, MA, USA, Twentieth Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We present a general model-independent approach to the analysis of data in cases when these data do not appear in the form of co-occurrence of two variables X, Y, but rather as a sample of values of an unknown (stochastic) function Z(X,Y). For example, in gene expression data, the expression level Z is a function of gene X and condition Y; or in movie ratings data the rating Z is a function of viewer X and movie Y. The approach represents a consistent extension of the Information Bottleneck method that has previously relied on the availability of co-occurrence statistics. By altering the relevance variable we eliminate the need for a sample of the joint distribution of all input variables. This new formulation also enables simple MDL-like model complexity control and prediction of missing values of Z. The approach is analyzed and shown to be on a par with the best known clustering algorithms for a wide range of domains. For the prediction of missing values (collaborative filtering) it improves the currently best known results.

ei

PDF Web [BibTex]

Learning with Hypergraphs: Clustering, Classification, and Embedding

Zhou, D., Huang, J., Schölkopf, B.

In Advances in Neural Information Processing Systems 19, pages: 1601-1608, (Editors: B Schölkopf and J Platt and T Hofmann), MIT Press, Cambridge, MA, USA, 20th Annual Conference on Neural Information Processing Systems (NIPS), September 2007 (inproceedings)

Abstract
We usually endow the investigated objects with pairwise relationships, which can be illustrated as graphs. In many real-world problems, however, relationships among the objects of our interest are more complex than pairwise. Naively squeezing the complex relationships into pairwise ones will inevitably lead to a loss of information that can be expected to be valuable for our learning tasks. Therefore we consider using hypergraphs instead to completely represent complex relationships among the objects of our interest, and thus the problem of learning with hypergraphs arises. Our main contribution in this paper is to generalize the powerful methodology of spectral clustering which originally operates on undirected graphs to hypergraphs, and further develop algorithms for hypergraph embedding and transductive classification on the basis of the spectral hypergraph clustering approach. Our experiments on a number of benchmarks showed the advantages of hypergraphs over usual graphs.
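
To make the idea of spectral clustering on a hypergraph concrete, the sketch below builds one commonly used normalized hypergraph Laplacian, I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}, from a vertex-by-hyperedge incidence matrix H and bipartitions the vertices by the sign of the second eigenvector. Whether this matches the paper's exact definition should be checked against the paper itself. The toy hypergraph, its weights and the function name are illustrative assumptions.

```python
import numpy as np

def hypergraph_laplacian(incidence, edge_weights):
    """Normalized Laplacian I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

    incidence: (n_vertices, n_edges) 0/1 matrix H; every vertex must lie in
    at least one hyperedge so that its degree is positive.
    """
    incidence = np.asarray(incidence, dtype=float)
    edge_weights = np.asarray(edge_weights, dtype=float)
    vertex_degree = incidence @ edge_weights  # weighted degree of each vertex
    edge_degree = incidence.sum(axis=0)       # number of vertices in each edge
    inv_sqrt_dv = 1.0 / np.sqrt(vertex_degree)
    scaled = inv_sqrt_dv[:, None] * incidence  # Dv^{-1/2} H
    theta = scaled @ np.diag(edge_weights / edge_degree) @ scaled.T
    return np.eye(len(incidence)) - theta, vertex_degree

# Toy hypergraph (assumption): two groups joined by one weak hyperedge.
hyperedges = [(0, 1, 2), (0, 1), (3, 4, 5), (4, 5), (2, 3)]
weights = np.array([1.0, 1.0, 1.0, 1.0, 0.1])
incidence = np.zeros((6, len(hyperedges)))
for edge_index, members in enumerate(hyperedges):
    incidence[list(members), edge_index] = 1.0

laplacian, vertex_degree = hypergraph_laplacian(incidence, weights)
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)

# Spectral bipartition: threshold the eigenvector of the second-smallest eigenvalue.
partition = eigenvectors[:, 1] > 0
```

Taking several of the lowest eigenvectors instead of one gives a spectral embedding of the vertices, which can then be clustered with any standard method.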

ei

PDF Web [BibTex]
