2007


Feature Selection for Trouble Shooting in Complex Assembly Lines

Pfingsten, T., Herrmann, D., Schnitzler, T., Feustel, A., Schölkopf, B.

IEEE Transactions on Automation Science and Engineering, 4(3):465-469, July 2007 (article)

Abstract
The final properties of sophisticated products can be affected by many unapparent dependencies within the manufacturing process, and the products’ integrity can often only be checked in a final measurement. Troubleshooting can therefore be very tedious if not impossible in large assembly lines. In this paper we show that Feature Selection is an efficient tool for serial-grouped lines to reveal causes for irregularities in product attributes. We compare the performance of several methods for Feature Selection on real-world problems in mass-production of semiconductor devices. Note to Practitioners: We present a data-based procedure to localize flaws in large production lines: using the results of final quality inspections and information about which machines processed which batches, we are able to identify machines which cause low yield.

ei

PDF Web DOI [BibTex]

Gene selection via the BAHSIC family of algorithms

Song, L., Bedo, J., Borgwardt, K., Gretton, A., Smola, A.

Bioinformatics, 23(13: ISMB/ECCB 2007 Conference Proceedings):i490-i498, July 2007 (article)

Abstract
Motivation: Identifying significant genes among thousands of sequences on a microarray is a central challenge for cancer research in bioinformatics. The ultimate goal is to detect the genes that are involved in disease outbreak and progression. A multitude of methods have been proposed for this task of feature selection, yet the selected gene lists differ greatly between different methods. To accomplish biologically meaningful gene selection from microarray data, we have to understand the theoretical connections and the differences between these methods. In this article, we define a kernel-based framework for feature selection based on the Hilbert–Schmidt independence criterion and backward elimination, called BAHSIC. We show that several well-known feature selectors are instances of BAHSIC, thereby clarifying their relationship. Furthermore, by choosing a different kernel, BAHSIC allows us to easily define novel feature selection algorithms. As a further advantage, feature selection via BAHSIC works directly on multiclass problems. Results: In a broad experimental evaluation, the members of the BAHSIC family reach high levels of accuracy and robustness when compared to other feature selection techniques. Experiments show that features selected with a linear kernel provide the best classification performance in general, but if strong non-linearities are present in the data then non-linear kernels can be more suitable.

ei

Web DOI [BibTex]
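
The idea in the abstract above, ranking features by backward elimination under the Hilbert–Schmidt independence criterion, can be illustrated with a small sketch. It is not the authors' implementation: it assumes a linear kernel on both features and labels, uses the biased HSIC estimate tr(KHLH)/(n-1)^2, and removes one feature per round. The function names (`linear_gram`, `hsic`, `backward_elimination`) and the toy data are hypothetical.

```python
def linear_gram(rows):
    """Gram matrix of the linear kernel k(x, y) = <x, y>."""
    return [[sum(a * b for a, b in zip(x, y)) for y in rows] for x in rows]


def center(K):
    """Return H K H, where H = I - (1/n) 11^T is the centering matrix."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + total for j in range(n)]
            for i in range(n)]


def hsic(K, L):
    """Biased HSIC estimate tr(K H L H) / (n - 1)^2 for symmetric Gram matrices."""
    n = len(K)
    Kc, Lc = center(K), center(L)
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2


def backward_elimination(X, y):
    """Order features from least to most relevant (illustrative, one per round)."""
    L = linear_gram([[label] for label in y])
    remaining = list(range(len(X[0])))
    removed = []
    while len(remaining) > 1:
        def score_without(f):
            keep = [g for g in remaining if g != f]
            K = linear_gram([[row[g] for g in keep] for row in X])
            return hsic(K, L)
        # Drop the feature whose removal leaves the most dependence on the labels.
        drop = max(remaining, key=score_without)
        remaining.remove(drop)
        removed.append(drop)
    return removed + remaining


# Toy data (assumed): feature 0 follows the label, feature 1 is unrelated,
# feature 2 is weakly related.
X = [[-1.0, 0.3, 0.5], [-1.2, -0.3, 0.1], [1.0, 0.3, 0.6], [0.9, -0.3, 0.4]]
y = [-1.0, -1.0, 1.0, 1.0]
ranking = backward_elimination(X, y)
```

The last entry of `ranking` is the feature retained longest, which this sketch treats as the most relevant one.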

Phenotyping of Chondrocytes In Vivo and In Vitro Using cDNA Array Technology

Zien, A., Gebhard, P., Fundel, K., Aigner, T.

Clinical Orthopaedics and Related Research, 460, pages: 226-233, July 2007 (article)

Abstract
The cDNA array technology is a powerful tool to analyze a high number of genes in parallel. We investigated whether large-scale gene expression analysis allows clustering and identification of cellular phenotypes of chondrocytes in different in vivo and in vitro conditions. In 100% of cases, clustering analysis distinguished between in vivo and in vitro samples, suggesting fundamental differences in chondrocytes in situ and in vitro regardless of the culture conditions or disease status. It also allowed us to differentiate between healthy and osteoarthritic cartilage. The clustering also revealed the relative importance of the investigated culturing conditions (stimulation agent, stimulation time, bead/monolayer). We augmented the cluster analysis with a statistical search for genes showing differential expression. The identified genes provided hints to the molecular basis of the differences between the sample classes. Our approach shows the power of modern bioinformatic algorithms for understanding and classifying chondrocytic phenotypes in vivo and in vitro. Although it does not generate new experimental data per se, it provides valuable information regarding the biology of chondrocytes and may provide tools for diagnosing and staging the osteoarthritic disease process.

ei

DOI [BibTex]

Common Sequence Polymorphisms Shaping Genetic Diversity in Arabidopsis thaliana

Clark, R., Schweikert, G., Toomajian, C., Ossowski, S., Zeller, G., Shinn, P., Warthmann, N., Hu, T., Fu, G., Hinds, D., Chen, H., Frazer, K., Huson, D., Schölkopf, B., Nordborg, M., Rätsch, G., Ecker, J., Weigel, D.

Science, 317(5836):338-342, July 2007 (article)

Abstract
The genomes of individuals from the same species vary in sequence as a result of different evolutionary processes. To examine the patterns of, and the forces shaping, sequence variation in Arabidopsis thaliana, we performed high-density array resequencing of 20 diverse strains (accessions). More than 1 million nonredundant single-nucleotide polymorphisms (SNPs) were identified at moderate false discovery rates (FDRs), and ~4% of the genome was identified as being highly dissimilar or deleted relative to the reference genome sequence. Patterns of polymorphism are highly nonrandom among gene families, with genes mediating interaction with the biotic environment having exceptional polymorphism levels. At the chromosomal scale, regional variation in polymorphism was readily apparent. A scan for recent selective sweeps revealed several candidate regions, including a notable example in which almost all variation was removed in a 500-kilobase window. Analyzing the polymorphisms we describe in larger sets of accessions will enable a detailed understanding of forces shaping population-wide sequence variation in A. thaliana.

ei

PDF DOI [BibTex]

Graph Laplacians and their Convergence on Random Neighborhood Graphs

Hein, M., Audibert, J., von Luxburg, U.

Journal of Machine Learning Research, 8, pages: 1325-1370, June 2007 (article)

Abstract
Given a sample from a probability measure with support on a submanifold in Euclidean space, one can construct a neighborhood graph which can be seen as an approximation of the submanifold. The graph Laplacian of such a graph is used in several machine learning methods such as semi-supervised learning, dimensionality reduction and clustering. In this paper we determine the pointwise limit of three different graph Laplacians used in the literature as the sample size increases and the neighborhood size approaches zero. We show that for a uniform measure on the submanifold all graph Laplacians have the same limit up to constants. However, in the case of a non-uniform measure on the submanifold, only the so-called random walk graph Laplacian converges to the weighted Laplace-Beltrami operator.

ei

PDF PDF [BibTex]
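
For readers unfamiliar with the objects named in the abstract above, here is a minimal sketch of a neighborhood graph with Gaussian weights and two of the graph Laplacians commonly built from it. The conventions used (weights exp(-d^2/h^2), random walk Laplacian I - D^{-1}W, unnormalized Laplacian D - W) are standard textbook choices and an assumption here; the paper's own sign and scaling conventions may differ. All function names are hypothetical.

```python
import math


def gaussian_weights(points, h):
    """Weight matrix W with W[i][j] = exp(-|x_i - x_j|^2 / h^2) and zero diagonal."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (h * h))
    return W


def unnormalized_laplacian(W):
    """L = D - W, where D is the diagonal matrix of vertex degrees."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def random_walk_laplacian(W):
    """L_rw = I - D^{-1} W; each row of D^{-1} W is a transition distribution."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [[(1.0 if i == j else 0.0) - W[i][j] / deg[i] for j in range(n)]
            for i in range(n)]


# Sample points on the unit circle, a one-dimensional submanifold of the plane.
points = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
W = gaussian_weights(points, h=0.5)
L_rw = random_walk_laplacian(W)
L_un = unnormalized_laplacian(W)
```

Both matrices annihilate constant vectors, so every row sums to zero up to rounding. They differ only in the degree normalization, which is what matters when the sampling density is non-uniform.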

Bayesian Reconstruction of the Density of States

Habeck, M.

Physical Review Letters, 98(20, 200601):1-4, May 2007 (article)

Abstract
A Bayesian framework is developed to reconstruct the density of states from multiple canonical simulations. The framework encompasses the histogram reweighting method of Ferrenberg and Swendsen. The new approach applies to nonparametric as well as parametric models and does not require simulation data to be discretized. It offers a means to assess the precision of the reconstructed density of states and of derived thermodynamic quantities.

ei

Web DOI [BibTex]

PALMA: mRNA to Genome Alignments using Large Margin Algorithms

Schulze, U., Hepp, B., Ong, C., Rätsch, G.

Bioinformatics, 23(15):1892-1900, May 2007 (article)

Abstract
Motivation: Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to genomic DNA is still a challenging task. Results: We present a novel approach based on large margin learning that combines accurate splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm, called PALMA, tunes the parameters of the model such that true alignments score higher than other alignments. We study the accuracy of alignments of mRNAs containing artificially generated micro-exons to genomic DNA. In a carefully designed experiment, we show that our algorithm accurately identifies the intron boundaries as well as boundaries of the optimal local alignment. It outperforms all other methods: for 5702 artificially shortened EST sequences from C. elegans and human it correctly identifies the intron boundaries in all except two cases. The best other method is a recently proposed method called exalin which misaligns 37 of the sequences. Our method also demonstrates robustness to mutations, insertions and deletions, retaining accuracy even at high noise levels. Availability: Datasets for training, evaluation and testing, additional results and a stand-alone alignment tool implemented in C++ and Python are available at http://www.fml.mpg.de/raetsch/projects/palma.

ei

Web DOI [BibTex]

Training a Support Vector Machine in the Primal

Chapelle, O.

Neural Computation, 19(5):1155-1178, March 2007 (article)

Abstract
Most literature on Support Vector Machines (SVMs) concentrates on the dual optimization problem. In this paper, we would like to point out that the primal problem can also be solved efficiently, both for linear and non-linear SVMs, and that there is no reason for ignoring this possibility. On the contrary, from the primal point of view new families of algorithms for large scale SVM training can be investigated.

ei

PDF Web DOI [BibTex]
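
To make the notion of primal training concrete, the following sketch minimizes a linear SVM objective directly over the weight vector and bias. It is only an illustration under stated assumptions: it uses a squared hinge loss, which is differentiable, and plain gradient descent, whereas the paper discusses its own optimization methods. The function names, step size, regularization constant and toy data are all hypothetical.

```python
def train_primal_svm(X, y, lam=0.01, lr=0.01, steps=500):
    """Gradient descent on lam*|w|^2 + sum_i max(0, 1 - y_i (w.x_i + b))^2."""
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(steps):
        grad_w = [2.0 * lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            output = sum(wj * xj for wj, xj in zip(w, xi)) + b
            slack = 1.0 - yi * output
            if slack > 0.0:  # only margin violators contribute to the gradient
                for j in range(d):
                    grad_w[j] -= 2.0 * slack * yi * xi[j]
                grad_b -= 2.0 * slack * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Sign of the decision function w.x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


# Linearly separable toy data (assumed for illustration).
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_primal_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

The optimization variables here are `w` and `b` themselves rather than dual coefficients, which is the distinction the abstract draws.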

Improving the Caenorhabditis elegans Genome Annotation Using Machine Learning

Rätsch, G., Sonnenburg, S., Srinivasan, J., Witte, H., Müller, K., Sommer, R., Schölkopf, B.

PLoS Computational Biology, 3(2, e20):0313-0322, February 2007 (article)

ei

PDF DOI [BibTex]

Statistical Consistency of Kernel Canonical Correlation Analysis

Fukumizu, K., Bach, F., Gretton, A.

Journal of Machine Learning Research, 8, pages: 361-383, February 2007 (article)

Abstract
While kernel canonical correlation analysis (CCA) has been applied in many contexts, the convergence of finite sample estimates of the associated functions to their population counterparts has not yet been established. This paper gives a mathematical proof of the statistical convergence of kernel CCA, providing a theoretical justification for the method. The proof uses covariance operators defined on reproducing kernel Hilbert spaces, and analyzes the convergence of their empirical estimates of finite rank to their population counterparts, which can have infinite rank. The result also gives a sufficient condition for convergence on the regularization coefficient involved in kernel CCA: this should decrease as n^{-1/3}, where n is the number of data.

ei

PDF [BibTex]

Some observations on the pedestal effect

Henning, G., Wichmann, F.

Journal of Vision, 7(1:3):1-15, January 2007 (article)

Abstract
The pedestal or dipper effect is the large improvement in the detectability of a sinusoidal grating observed when it is added to a masking or pedestal grating of the same spatial frequency, orientation, and phase. We measured the pedestal effect in both broadband and notched noise (noise from which a 1.5-octave band centered on the signal frequency had been removed). Although the pedestal effect persists in broadband noise, it almost disappears in the notched noise. Furthermore, the pedestal effect is substantial when either high- or low-pass masking noise is used. We conclude that the pedestal effect in the absence of notched noise results principally from the use of information derived from channels with peak sensitivities at spatial frequencies different from that of the signal and the pedestal. We speculate that the spatial-frequency components of the notched noise above and below the spatial frequency of the signal and the pedestal prevent “off-frequency looking,” that is, prevent the use of information about changes in contrast carried in channels tuned to spatial frequencies that are very much different from that of the signal and the pedestal. Thus, the pedestal or dipper effect measured without notched noise appears not to be a characteristic of individual spatial-frequency-tuned channels.

ei

PDF Web DOI [BibTex]

Cue Combination and the Effect of Horizontal Disparity and Perspective on Stereoacuity

Zalevski, AM., Henning, GB., Hill, NJ.

Spatial Vision, 20(1):107-138, January 2007 (article)

Abstract
Relative depth judgments of vertical lines based on horizontal disparity deteriorate enormously when the lines form part of closed configurations (Westheimer, 1979). In studies showing this effect, perspective was not manipulated and thus produced inconsistency between horizontal disparity and perspective. We show that stereoacuity improves dramatically when perspective and horizontal disparity are made consistent. Observers appear to use unhelpful perspective cues in judging the relative depth of the vertical sides of rectangles in a way not incompatible with a form of cue weighting. However, 95% confidence intervals for the weights derived for cues usually exceed the a-priori [0-1] range.

ei

PDF PDF DOI [BibTex]

iCub - The Design and Realization of an Open Humanoid Platform for Cognitive and Neuroscience Research

Tsagarakis, N., Metta, G., Sandini, G., Vernon, D., Beira, R., Becchi, F., Righetti, L., Santos-Victor, J., Ijspeert, A., Carrozza, M., Caldwell, D.

Advanced Robotics, 21(10):1151-1175, 2007 (article)

Abstract
The development of robotic cognition and the advancement of understanding of human cognition form two of the greatest current challenges in robotics and neuroscience, respectively. The RobotCub project aims to develop an embodied robotic child (iCub) with the physical (height 90 cm and mass less than 23 kg) and ultimately cognitive abilities of a 2.5-year-old human child. The iCub will be a freely available open system which can be used by scientists in all cognate disciplines from developmental psychology to epigenetic robotics to enhance understanding of cognitive systems through the study of cognitive development. The iCub will be open not only in software but, more importantly, in all aspects of the hardware and mechanical design. In this paper the design of the mechanisms and structures forming the basic 'body' of the iCub is described. The paper considers kinematic structures, dynamic design criteria, actuator specification and selection, and detailed mechanical and electronic design. The paper concludes with tests of the performance of sample joints, and comparison of these results with the design requirements and simulation projects.

mg

link (url) DOI [BibTex]


2006


Structure validation of the Josephin domain of ataxin-3: Conclusive evidence for an open conformation

Nicastro, G., Habeck, M., Masino, L., Svergun, DI., Pastore, A.

Journal of Biomolecular NMR, 36(4):267-277, December 2006 (article)

Abstract
The availability of new and fast tools in structure determination has led to a more than exponential growth of the number of structures solved per year. It is therefore increasingly essential to assess the accuracy of the new structures by reliable approaches able to assist validation. Here, we discuss a specific example in which the use of different complementary techniques, which include Bayesian methods and small angle scattering, proved essential for validating the two currently available structures of the Josephin domain of ataxin-3, a protein involved in the ubiquitin/proteasome pathway and responsible for neurodegenerative spinocerebellar ataxia of type 3. Taken together, our results demonstrate that only one of the two structures is compatible with the experimental information. Based on the high precision of our refined structure, we show that Josephin contains an open cleft which could be directly implicated in the interaction with polyubiquitin chains and other partners.

ei

Web DOI [BibTex]

A Unifying View of Wiener and Volterra Theory and Polynomial Kernel Regression

Franz, M., Schölkopf, B.

Neural Computation, 18(12):3097-3118, December 2006 (article)

Abstract
Volterra and Wiener series are perhaps the best understood nonlinear system representations in signal processing. Although both approaches have enjoyed a certain popularity in the past, their application has been limited to rather low-dimensional and weakly nonlinear systems due to the exponential growth of the number of terms that have to be estimated. We show that Volterra and Wiener series can be represented implicitly as elements of a reproducing kernel Hilbert space by utilizing polynomial kernels. The estimation complexity of the implicit representation is linear in the input dimensionality and independent of the degree of nonlinearity. Experiments show performance advantages in terms of convergence, interpretability, and system sizes that can be handled.

ei

PDF Web DOI [BibTex]
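
The implicit representation mentioned in the abstract above rests on a general property of polynomial kernels: a single kernel evaluation equals an inner product over all monomials up to the given degree. The sketch below checks this for the inhomogeneous kernel (1 + <x, y>)^2 on two-dimensional inputs. It illustrates the kernel identity only and is not the estimation procedure of the paper; the function names are hypothetical.

```python
import math


def poly_kernel(x, y, degree=2):
    """Inhomogeneous polynomial kernel (1 + <x, y>)^degree; cost is linear in len(x)."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** degree


def explicit_features_deg2(x):
    """Explicit monomial features for degree 2 and two inputs (6 terms)."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x = (0.5, -1.5)
y = (2.0, 0.25)
implicit = poly_kernel(x, y)
explicit = dot(explicit_features_deg2(x), explicit_features_deg2(y))
```

`implicit` and `explicit` agree up to rounding. The explicit feature vector grows rapidly with input dimension and degree, while the kernel evaluation stays linear in the input dimension.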

Statistical Analysis of Slow Crack Growth Experiments

Pfingsten, T., Glien, K.

Journal of the European Ceramic Society, 26(15):3061-3065, November 2006 (article)

Abstract
A common approach for the determination of Slow Crack Growth (SCG) parameters is the static and dynamic loading method. Since materials with small Weibull modulus show a large variability in strength, a correct statistical analysis of the data is indispensable. In this work we propose the use of the Maximum Likelihood method and a Bayesian analysis, which, in contrast to the standard procedures, take into account that failure strengths are Weibull distributed. The analysis provides estimates for the SCG parameters, the Weibull modulus, and the corresponding confidence intervals and overcomes the necessity of manual differentiation between inert and fatigue strength data. We compare the methods to a Least Squares approach, which can be considered the standard procedure. The results for dynamic loading data from the glass sealing of MEMS devices show that the assumptions inherent to the standard approach lead to significantly different estimates.

ei

PDF PDF DOI [BibTex]

Mining frequent stem patterns from unaligned RNA sequences

Hamada, M., Tsuda, K., Kudo, T., Kin, T., Asai, K.

Bioinformatics, 22(20):2480-2487, October 2006 (article)

Abstract
Motivation: In detection of non-coding RNAs, it is often necessary to identify the secondary structure motifs from a set of putative RNA sequences. Most of the existing algorithms aim to provide the best motif or a few good motifs, but biologists often need to inspect all the possible motifs thoroughly. Results: Our method RNAmine employs a graph theoretic representation of RNA sequences, and detects all the possible motifs exhaustively using a graph mining algorithm. The motif detection problem boils down to finding frequently appearing patterns in a set of directed and labeled graphs. In the tasks of common secondary structure prediction and local motif detection from long sequences, our method compared favorably both in accuracy and in efficiency with state-of-the-art methods such as CMFinder.

ei

PDF Web DOI [BibTex]

Large-Scale Gene Expression Profiling Reveals Major Pathogenetic Pathways of Cartilage Degeneration in Osteoarthritis

Aigner, T., Fundel, K., Saas, J., Gebhard, P., Haag, J., Weiss, T., Zien, A., Obermayr, F., Zimmer, R., Bartnik, E.

Arthritis and Rheumatism, 54(11):3533-3544, October 2006 (article)

Abstract
Objective. Despite many research efforts in recent decades, the major pathogenetic mechanisms of osteoarthritis (OA), including gene alterations occurring during OA cartilage degeneration, are poorly understood, and there is no disease-modifying treatment approach. The present study was therefore initiated in order to identify differentially expressed disease-related genes and potential therapeutic targets. Methods. This investigation consisted of a large gene expression profiling study performed based on 78 normal and disease samples, using a custom-made complementary DNA array covering >4,000 genes. Results. Many differentially expressed genes were identified, including the expected up-regulation of anabolic and catabolic matrix genes. In particular, the down-regulation of important oxidative defense genes, i.e., the genes for superoxide dismutases 2 and 3 and glutathione peroxidase 3, was prominent. This indicates that continuous oxidative stress to the cells and the matrix is one major underlying pathogenetic mechanism in OA. Also, genes that are involved in the phenotypic stability of cells, a feature that is greatly reduced in OA cartilage, appeared to be suppressed. Conclusion. Our findings provide a reference data set on gene alterations in OA cartilage and, importantly, indicate major mechanisms underlying central cell biologic alterations that occur during the OA disease process. These results identify molecular targets that can be further investigated in the search for therapeutic interventions.

ei

Web DOI [BibTex]

Implicit Surface Modelling with a Globally Regularised Basis of Compact Support

Walder, C., Schölkopf, B., Chapelle, O.

Computer Graphics Forum, 25(3):635-644, September 2006 (article)

Abstract
We consider the problem of constructing a globally smooth analytic function that represents a surface implicitly by way of its zero set, given sample points with surface normal vectors. The contributions of the paper include a novel means of regularising multi-scale compactly supported basis functions that leads to the desirable interpolation properties previously only associated with fully supported bases. We also provide a regularisation framework for simpler and more direct treatment of surface normals, along with a corresponding generalisation of the representer theorem lying at the core of kernel-based machine learning methods. We demonstrate the techniques on 3D problems of up to 14 million data points, as well as 4D time series data and four-dimensional interpolation between three-dimensional shapes.

ei

PDF GZIP DOI [BibTex]


An Online Support Vector Machine for Abnormal Events Detection

Davy, M., Desobry, F., Gretton, A., Doncarli, C.

Signal Processing, 86(8):2009-2025, August 2006 (article)

Abstract
The ability to detect abnormal events in signals online is essential in many real-world Signal Processing applications. Previous algorithms require an explicit statistical model of the signal, and interpret abnormal events as abrupt changes in that model. The corresponding implementations rely on maximum likelihood or on Bayes estimation theory, with generally excellent performance. However, there are numerous cases where a robust and tractable model cannot be obtained, and model-free approaches need to be considered. In this paper, we investigate a machine learning, descriptor-based approach that does not require an explicit statistical model of the descriptors, based on Support Vector novelty detection. A sequential optimization algorithm is introduced. Theoretical considerations as well as simulations on real signals demonstrate its practical efficiency.

ei

PDF PostScript PDF DOI [BibTex]

Integrating Structured Biological data by Kernel Maximum Mean Discrepancy

Borgwardt, K., Gretton, A., Rasch, M., Kriegel, H., Schölkopf, B., Smola, A.

Bioinformatics, 22(4: ISMB 2006 Conference Proceedings):e49-e57, August 2006 (article)

Abstract
Motivation: Many problems in data integration in bioinformatics can be posed as one common question: Are two sets of observations generated by the same distribution? We propose a kernel-based statistical test for this problem, based on the fact that two distributions are different if and only if there exists at least one function having different expectation on the two distributions. Consequently we use the maximum discrepancy between function means as the basis of a test statistic. The Maximum Mean Discrepancy (MMD) can take advantage of the kernel trick, which allows us to apply it not only to vectors, but strings, sequences, graphs, and other common structured data types arising in molecular biology. Results: We study the practical feasibility of an MMD-based test on three central data integration tasks: Testing cross-platform comparability of microarray data, cancer diagnosis, and data-content based schema matching for two different protein function classification schemas. In all of these experiments, including high-dimensional ones, MMD is very accurate in finding samples that were generated from the same distribution, and outperforms its best competitors. Conclusions: We have defined a novel statistical test of whether two samples are from the same distribution, compatible with both multivariate and structured data, that is fast, easy to implement, and works well, as confirmed by our experiments.

ei

Web DOI [BibTex]
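
The test statistic described in the abstract above compares mean kernel values within and across the two samples. A minimal sketch of the biased empirical estimate of the squared MMD follows, assuming a Gaussian kernel on real-valued vectors. The kernel choice, the bandwidth and the function names are assumptions for illustration, and the sketch omits the significance threshold that a full two-sample test needs.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-|x - y|^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def mean_kernel(A, B, sigma):
    """Average kernel value over all pairs (a, b) with a in A and b in B."""
    total = sum(gaussian_kernel(a, b, sigma) for a in A for b in B)
    return total / (len(A) * len(B))


def mmd2_biased(X, Y, sigma=1.0):
    """Biased estimate: mean k(x, x') + mean k(y, y') - 2 mean k(x, y)."""
    return (mean_kernel(X, X, sigma) + mean_kernel(Y, Y, sigma)
            - 2.0 * mean_kernel(X, Y, sigma))


# Two small one-dimensional samples with clearly different locations (assumed data).
X = [[0.0], [0.1], [-0.1]]
Y = [[3.0], [3.1], [2.9]]
same = mmd2_biased(X, X)
different = mmd2_biased(X, Y)
```

Because only kernel evaluations are needed, the same estimate applies to strings or graphs once `gaussian_kernel` is replaced by a kernel defined on those objects.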

Large Scale Transductive SVMs

Collobert, R., Sinz, F., Weston, J., Bottou, L.

Journal of Machine Learning Research, 7, pages: 1687-1712, August 2006 (article)

Abstract
We show how the Concave-Convex Procedure can be applied to the optimization of Transductive SVMs, which traditionally requires solving a combinatorial search problem. This provides for the first time a highly scalable algorithm in the nonlinear case. Detailed experiments verify the utility of our approach.

ei

PostScript PDF PDF [BibTex]

Building Support Vector Machines with Reduced Classifier Complexity

Keerthi, S., Chapelle, O., DeCoste, D.

Journal of Machine Learning Research, 7, pages: 1493-1515, July 2006 (article)

Abstract
Support vector machines (SVMs), though accurate, are not preferred in applications requiring great classification speed, due to the number of support vectors being large. To overcome this problem we devise a primal method with the following properties: (1) it decouples the idea of basis functions from the concept of support vectors; (2) it greedily finds a set of kernel basis functions of a specified maximum size ($d_{\max}$) to approximate the SVM primal cost function well; (3) it is efficient and roughly scales as $O(n d_{\max}^2)$ where $n$ is the number of training examples; and, (4) the number of basis functions it requires to achieve an accuracy close to the SVM accuracy is usually far less than the number of SVM support vectors.

ei

PDF [BibTex]

ARTS: Accurate Recognition of Transcription Starts in Human

Sonnenburg, S., Zien, A., Rätsch, G.

Bioinformatics, 22(14):e472-e480, July 2006 (article)

Abstract
Motivation: Protein-coding genes are among the most important features of genomic DNA. While it is of great value to identify those genes and the encoded proteins, it is also crucial to understand how their transcription is regulated. To this end one has to identify the corresponding promoters and the contained transcription factor binding sites. TSS finders can be used to locate potential promoters. They may also be used in combination with other signal and content detectors to resolve entire gene structures. Results: We have developed a novel kernel-based method, called ARTS, that accurately recognizes transcription start sites in human. The application of otherwise too computationally expensive Support Vector Machines was made possible by efficient training and evaluation techniques using suffix tries. In a carefully designed experimental study, we compare our TSS finder to state-of-the-art methods from the literature: McPromoter, Eponine and FirstEF. For given false positive rates within a reasonable range, we consistently achieve considerably higher true positive rates. For instance, ARTS finds about 24% true positives at a false positive rate of 1/1000, where the other methods find less than half (10.5%). Availability: Datasets, model selection results, whole genome predictions, and additional experimental results are available at http://www.fml.tuebingen.mpg.de/raetsch/projects/arts

ei

Web DOI [BibTex]

Large Scale Multiple Kernel Learning

Sonnenburg, S., Rätsch, G., Schäfer, C., Schölkopf, B.

Journal of Machine Learning Research, 7, pages: 1531-1565, July 2006 (article)

Abstract
While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm works for hundreds of thousands of examples or hundreds of kernels to be combined, and helps with automatic model selection, improving the interpretability of the learning result. In a second part we discuss general speed-up mechanisms for SVMs, especially when used with sparse feature maps as they appear for string kernels, allowing us to train a string kernel SVM on a real-world splice data set with 10 million examples from computational biology. We integrated multiple kernel learning in our machine learning toolbox SHOGUN, for which the source code is publicly available at http://www.fml.tuebingen.mpg.de/raetsch/projects/shogun.

ei

PDF [BibTex]

Factorial coding of natural images: how effective are linear models in removing higher-order dependencies?

Bethge, M.

Journal of the Optical Society of America A, 23(6):1253-1268, June 2006 (article)

Abstract
The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included in the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2x2 to 16x16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the 'edge filters' found with ICA lead only to a surprisingly small improvement in terms of its actual objective.

ei

PDF Web [BibTex]

Classifying EEG and ECoG Signals without Subject Training for Fast BCI Implementation: Comparison of Non-Paralysed and Completely Paralysed Subjects

Hill, N., Lal, T., Schröder, M., Hinterberger, T., Wilhelm, B., Nijboer, F., Mochty, U., Widman, G., Elger, C., Schölkopf, B., Kübler, A., Birbaumer, N.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2):183-186, June 2006 (article)

Abstract
We summarize results from a series of related studies that aim to develop a motor-imagery-based brain-computer interface using a single recording session of EEG or ECoG signals for each subject. We apply the same experimental and analytical methods to 11 non-paralysed subjects (8 EEG, 3 ECoG), and to 5 paralysed subjects (4 EEG, 1 ECoG) who had been unable to communicate for some time. While it was relatively easy to obtain classifiable signals quickly from most of the non-paralysed subjects, it proved impossible to classify the signals obtained from the paralysed patients by the same methods. This highlights the fact that though certain BCI paradigms may work well with healthy subjects, this does not necessarily indicate success with the target user group. We outline possible reasons for this failure to transfer.

ei

PDF PDF DOI [BibTex]

SCARNA: Fast and Accurate Structural Alignment of RNA Sequences by Matching Fixed-Length Stem Fragments

Tabei, Y., Tsuda, K., Kin, T., Asai, K.

Bioinformatics, 22(14):1723-1729, May 2006 (article)

Abstract
The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Thus, the sequence comparison in searching for similar RNAs should consider not only their sequence similarities but also their potential secondary structures. Sankoff’s algorithm predicts the common secondary structures of the sequences, but it is computationally too expensive to apply to large-scale analyses. Because we often want to compare a large number of cDNA sequences or to search for similar RNAs in whole genome sequences, much faster algorithms are required. We propose a new method of comparing RNA sequences based on the structural alignments of the fixed-length fragments of the stem candidates. The implemented software, SCARNA (Stem Candidate Aligner for RNAs), is fast enough to apply to long sequences in large-scale analyses. The accuracy of the alignments is better than or comparable to that of the much slower existing algorithms.

ei

PDF Web DOI [BibTex]


The Effect of Artifacts on Dependence Measurement in fMRI

Gretton, A., Belitski, A., Murayama, Y., Schölkopf, B., Logothetis, N.

Magnetic Resonance Imaging, 24(4):401-409, April 2006 (article)

ei

PDF Web DOI [BibTex]

Phase noise and the classification of natural images

Wichmann, F., Braun, D., Gegenfurtner, K.

Vision Research, 46(8-9):1520-1529, April 2006 (article)

Abstract
We measured the effect of global phase manipulations on a rapid animal categorization task. The Fourier spectra of our images of natural scenes were manipulated by adding zero-mean random phase noise at all spatial frequencies. The phase noise was the independent variable, uniformly and symmetrically distributed between 0 degrees and ±180 degrees. Subjects were remarkably resistant to phase noise. Even with ±120 degree phase noise subjects were still performing at 75% correct. The high resistance of the subjects’ animal categorization rate to phase noise suggests that the visual system is highly robust to such random image changes. The proportion of correct answers closely followed the correlation between the original and the phase-noise-distorted images. Animal detection rate was higher when the same task was performed with contrast-reduced versions of the same natural images, at contrasts where the contrast reduction mimicked that resulting from our phase randomization. Since the subjects’ categorization rate was better in the contrast experiment, reduction of local contrast alone cannot explain the performance in the phase noise experiment. This result obtained with natural images differs from those obtained for simple sinusoidal stimuli, where performance changes due to phase changes are attributed to local contrast changes only. Thus the global phase change accompanying disruption of image structure such as edges and object boundaries at different spatial scales reduces object classification over and above the performance deficit resulting from reducing contrast. Additional colour information improves the categorization performance by 2%.

ei

PDF Web DOI [BibTex]

A Direct Method for Building Sparse Kernel Learning Algorithms

Wu, M., Schölkopf, B., Bakır, G.

Journal of Machine Learning Research, 7:603-624, April 2006 (article)

Abstract
Many Kernel Learning Algorithms (KLA), including the Support Vector Machine (SVM), result in a Kernel Machine (KM), such as a kernel classifier, whose key component is a weight vector in a feature space implicitly introduced by a positive definite kernel function. This weight vector is usually obtained by solving a convex optimization problem. Based on this fact we present a direct method to build Sparse Kernel Learning Algorithms (SKLA) by adding one more constraint to the original convex optimization problem, such that the sparseness of the resulting KM is explicitly controlled while at the same time the performance of the resulting KM can be kept as high as possible. A gradient-based approach is provided to solve this modified optimization problem. Applying this method to the SVM results in a concrete algorithm for building Sparse Large Margin Classifiers (SLMC). Further analysis of the SLMC algorithm indicates that it essentially finds a discriminating subspace that can be spanned by a small number of vectors, and in this subspace, the different classes of data are linearly well separated. Experimental results over several classification benchmarks demonstrate the effectiveness of our approach.

ei

PDF PDF [BibTex]

Statistical Properties of Kernel Principal Component Analysis

Blanchard, G., Bousquet, O., Zwald, L.

Machine Learning, 66(2-3):259-294, March 2006 (article)

Abstract
We study the properties of the eigenvalues of Gram matrices in a non-asymptotic setting. Using local Rademacher averages, we provide data-dependent and tight bounds for their convergence towards eigenvalues of the corresponding kernel operator. We perform these computations in a functional analytic framework which allows us to deal implicitly with reproducing kernel Hilbert spaces of infinite dimension. This can have applications to various kernel algorithms, such as Support Vector Machines (SVM). We focus on Kernel Principal Component Analysis (KPCA) and, using such techniques, we obtain sharp excess risk bounds for the reconstruction error. In these bounds, the dependence on the decay of the spectrum and on the closeness of successive eigenvalues is made explicit.

ei

PDF PDF DOI [BibTex]

Network-based de-noising improves prediction from microarray data

Kato, T., Murata, Y., Miura, K., Asai, K., Horton, P., Tsuda, K., Fujibuchi, W.

BMC Bioinformatics, 7(Suppl. 1):S4-S4, March 2006 (article)

Abstract
Prediction of human cell response to anti-cancer drugs (compounds) from microarray data is a challenging problem, due to the noise properties of microarrays as well as the high variance of living cell responses to drugs. Hence there is a strong need for more practical and robust methods than standard methods for real-value prediction. We devised an extended version of the off-subspace noise-reduction (de-noising) method to incorporate heterogeneous network data such as sequence similarity or protein-protein interactions into a single framework. Using that method, we first de-noise the gene expression data for training and test data and also the drug-response data for training data. Then we predict the unknown responses of each drug from the de-noised input data. To ascertain whether de-noising improves prediction, we carry out 12-fold cross-validation to assess the prediction performance. We use Pearson’s correlation coefficient between the true and predicted response values as the measure of prediction performance. De-noising improves the prediction performance for 65% of drugs. Furthermore, we found that this noise reduction method is robust and effective even when a large amount of artificial noise is added to the input data. We found that our extended off-subspace noise-reduction method combining heterogeneous biological data is successful and quite useful for improving prediction of human cell cancer drug responses from microarray data.
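
The evaluation measure named in this abstract, Pearson's correlation coefficient between true and predicted response values, can be sketched as follows. This is a minimal illustration and not the authors' code; the function name `pearson` and the sample values are hypothetical.

```python
import math


def pearson(true_vals, pred_vals):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(true_vals)
    if n != len(pred_vals) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(true_vals) / n
    mean_p = sum(pred_vals) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true_vals, pred_vals))
    var_t = sum((t - mean_t) ** 2 for t in true_vals)
    var_p = sum((p - mean_p) ** 2 for p in pred_vals)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical drug-response values for one held-out fold.
true_response = [0.2, 0.5, 0.9, 1.4]
predicted_response = [0.1, 0.6, 1.0, 1.2]
score = pearson(true_response, predicted_response)
```

In a cross-validation setting, one such score would be computed per drug on held-out data and compared with and without de-noising.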

ei

PDF PDF DOI [BibTex]

Model-based Design Analysis and Yield Optimization

Pfingsten, T., Herrmann, D., Rasmussen, C.

IEEE Transactions on Semiconductor Manufacturing, 19(4):475-486, February 2006 (article)

Abstract
Fluctuations are inherent to any fabrication process. Integrated circuits and micro-electro-mechanical systems are particularly affected by these variations, and due to high quality requirements the effect on the devices’ performance has to be understood quantitatively. In recent years it has become possible to model the performance of such complex systems on the basis of design specifications, and model-based Sensitivity Analysis has made its way into industrial engineering. We show how an efficient Bayesian approach, using a Gaussian process prior, can replace the commonly used brute-force Monte Carlo scheme, making it possible to apply the analysis to computationally costly models. We introduce a number of global, statistically justified sensitivity measures for design analysis and optimization. Two models of integrated systems serve as case studies to introduce the analysis and to assess its convergence properties. We show that the Bayesian Monte Carlo scheme can save costly simulation runs and can ensure a reliable accuracy of the analysis.

ei

PDF Web DOI [BibTex]

Weighting of experimental evidence in macromolecular structure determination

Habeck, M., Rieping, W., Nilges, M.

Proceedings of the National Academy of Sciences of the United States of America, 103(6):1756-1761, February 2006 (article)

Abstract
The determination of macromolecular structures requires weighting of experimental evidence relative to prior physical information. Although it can critically affect the quality of the calculated structures, experimental data are routinely weighted on an empirical basis. At present, cross-validation is the most rigorous method to determine the best weight. We describe a general method to adaptively weight experimental data in the course of structure calculation. It is further shown that the necessity to define weights for the data can be completely alleviated. We demonstrate the method on a structure calculation from NMR data and find that the resulting structures are optimal in terms of accuracy and structural quality. Our method is devoid of the bias imposed by an empirical choice of the weight and has some advantages over estimating the weight by cross-validation.

ei

Web DOI [BibTex]

Classification of Faces in Man and Machine

Graf, A., Wichmann, F., Bülthoff, H., Schölkopf, B.

Neural Computation, 18(1):143-165, January 2006 (article)

ei

PDF Web [BibTex]

Dynamic Hebbian learning in adaptive frequency oscillators

Righetti, L., Buchli, J., Ijspeert, A.

Physica D: Nonlinear Phenomena, 216(2):269-281, 2006 (article)

Abstract
Nonlinear oscillators are widely used in biology, physics and engineering for modeling and control. They are interesting because of their synchronization properties when coupled to other dynamical systems. In this paper, we propose a learning rule for oscillators which adapts their frequency to the frequency of any periodic or pseudo-periodic input signal. Learning is done in a dynamic way: it is part of the dynamical system and not an offline process. An interesting property of our model is that it is easily generalizable to a large class of oscillators, from phase oscillators to relaxation oscillators and strange attractors with a generic learning rule. One major feature of our learning rule is that the oscillators constructed can adapt their frequency without any signal processing or the need to specify a time window or similar free parameters. All the processing is embedded in the dynamics of the adaptive oscillator. The convergence of the learning is proved for the Hopf oscillator, then numerical experiments are carried out to explore the learning capabilities of the system. Finally, we generalize the learning rule to non-harmonic oscillators like relaxation oscillators and strange attractors.
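
The abstract states no equations, so the following sketch assumes the form commonly given for an adaptive-frequency Hopf oscillator: a Hopf oscillator driven by a periodic input F(t), with the frequency adapted by dω/dt = −ε F(t) y / r. The function name, the parameter values and the forward-Euler integration are illustrative assumptions, not taken from the article.

```python
import math


def adapt_frequency(omega_f=10.0, omega0=12.0, eps=0.9, gamma=8.0, mu=1.0,
                    dt=1e-3, t_end=100.0):
    """Integrate an adaptive-frequency Hopf oscillator with forward Euler.

    omega_f: angular frequency of the input F(t) = cos(omega_f * t)
    omega0:  initial intrinsic frequency of the oscillator
    eps:     coupling strength (assumed value)
    Returns the oscillator frequency omega at time t_end.
    """
    x, y, omega = 1.0, 0.0, omega0
    steps = int(round(t_end / dt))
    for k in range(steps):
        f = math.cos(omega_f * k * dt)      # periodic teaching signal
        r2 = x * x + y * y
        r = math.sqrt(r2)
        # Hopf dynamics with the input added to the x equation.
        dx = gamma * (mu - r2) * x - omega * y + eps * f
        dy = gamma * (mu - r2) * y + omega * x
        # Assumed learning rule: the frequency is itself a state variable.
        domega = -eps * f * y / r
        x += dt * dx
        y += dt * dy
        omega += dt * domega
    return omega


learned = adapt_frequency()
```

The frequency is updated inside the same differential equations as the oscillator state, with no time window or separate signal-processing step, which is the sense in which the abstract calls the learning dynamic.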

mg

link (url) DOI [BibTex]

Engineering Entrainment and Adaptation in Limit Cycle Systems – From biological inspiration to applications in robotics

Buchli, J., Righetti, L., Ijspeert, A.

Biological Cybernetics, 95(6):645-664, December 2006 (article)

Abstract
Periodic behavior is key to life and is observed in multiple instances and at multiple time scales in our metabolism, our natural environment, and our engineered environment. A natural way of modeling or generating periodic behavior is to use oscillators, i.e., dynamical systems that exhibit limit cycle behavior. While there is extensive literature on methods to analyze such dynamical systems, much less work has been done on methods to synthesize an oscillator to exhibit some specific desired characteristics. The goal of this article is twofold: (1) to provide a framework for characterizing and designing oscillators and (2) to review how classes of well-known oscillators can be understood and related to this framework. The basis of the framework is to characterize oscillators in terms of their fundamental temporal and spatial behavior and in terms of properties that these two behaviors can be designed to exhibit. This focus on fundamental properties is important because it allows us to systematically compare a large variety of oscillators that might at first sight appear very different from each other. We identify several specifications that are useful for design, such as frequency-locking behavior, phase-locking behavior, and specific output signal shape. We also identify two classes of design methods by which these specifications can be met, namely offline methods and online methods. By relating these specifications to our framework and by presenting several examples of how oscillators have been designed in the literature, this article provides a useful methodology and toolbox for designing oscillators for a wide range of purposes. In particular, the focus on synthesis of limit cycle dynamical systems should be useful both for engineering and for computational modeling of physical or biological phenomena.

mg

link (url) DOI [BibTex]