Header logo is


2004


no image
On the representation, learning and transfer of spatio-temporal movement characteristics

Ilg, W., Bakir, GH., Mezger, J., Giese, M.

International Journal of Humanoid Robotics, 1(4):613-636, December 2004 (article)

ei

[BibTex]

2004


[BibTex]


no image
Insect-inspired estimation of egomotion

Franz, MO., Chahl, JS., Krapp, HG.

Neural Computation, 16(11):2245-2260, November 2004 (article)

Abstract
Tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during egomotion. In this study, we examine whether a simplified linear model based on the organization principles in tangential neurons can be used to estimate egomotion from the optic flow. We present a theory for the construction of an estimator consisting of a linear combination of optic flow vectors that incorporates prior knowledge both about the distance distribution of the environment, and about the noise and egomotion statistics of the sensor. The estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates are of reasonable quality, albeit less reliable.

ei

PDF PostScript Web DOI [BibTex]

PDF PostScript Web DOI [BibTex]


no image
Efficient face detection by a cascaded support-vector machine expansion

Romdhani, S., Torr, P., Schölkopf, B., Blake, A.

Proceedings of The Royal Society of London A, 460(2501):3283-3297, A, November 2004 (article)

Abstract
We describe a fast system for the detection and localization of human faces in images using a nonlinear ‘support-vector machine‘. We approximate the decision surface in terms of a reduced set of expansion vectors and propose a cascaded evaluation which has the property that the full support-vector expansion is only evaluated on the face-like parts of the image, while the largest part of typical images is classified using a single expansion vector (a simpler and more efficient classifier). As a result, only three reduced-set vectors are used, on average, to classify an image patch. Hence, the cascaded evaluation, presented in this paper, offers a thirtyfold speed-up over an evaluation using the full set of reduced-set vectors, which is itself already thirty times faster than classification using all the support vectors.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Discrete vs. Continuous: Two Sides of Machine Learning

Zhou, D.

October 2004 (talk)

Abstract
We consider the problem of transductive inference. In many real-world problems, unlabeled data is far easier to obtain than labeled data. Hence transductive inference is very significant in many practical problems. According to Vapnik's point of view, one should predict the function value only on the given points directly rather than a function defined on the whole space, the latter being a more complicated problem. Inspired by this idea, we develop discrete calculus on finite discrete spaces, and then build discrete regularization. A family of transductive algorithms is naturally derived from this regularization framework. We validate the algorithms on both synthetic and real-world data from text/web categorization to bioinformatics problems. A significant by-product of this work is a powerful way of ranking data based on examples including images, documents, proteins and many other kinds of data. This talk is mainly based on the followiing contribution: (1) D. Zhou and B. Sch{\"o}lkopf: Transductive Inference with Graphs, MPI Technical report, August, 2004; (2) D. Zhou, B. Sch{\"o}lkopf and T. Hofmann. Semi-supervised Learning on Directed Graphs. NIPS 2004; (3) D. Zhou, O. Bousquet, T.N. Lal, J. Weston and B. Sch{\"o}lkopf. Learning with Local and Global Consistency. NIPS 2003.

ei

PDF [BibTex]


no image
Grundlagen von Support Vector Maschinen und Anwendungen in der Bildverarbeitung

Eichhorn, J.

September 2004 (talk)

Abstract
Invited talk at the workshop "Numerical, Statistical and Discrete Methods in Image Processing" at the TU M{\"u}nchen (in GERMAN)

ei

PDF [BibTex]


no image
Learning kernels from biological networks by maximizing entropy

Tsuda, K., Noble, W.

Bioinformatics, 20(Suppl. 1):i326-i333, August 2004 (article)

Abstract
Motivation: The diffusion kernel is a general method for computing pairwise distances among all nodes in a graph, based on the sum of weighted paths between each pair of nodes. This technique has been used successfully, in conjunction with kernel-based learning methods, to draw inferences from several types of biological networks. Results: We show that computing the diffusion kernel is equivalent to maximizing the von Neumann entropy, subject to a global constraint on the sum of the Euclidean distances between nodes. This global constraint allows for high variance in the pairwise distances. Accordingly, we propose an alternative, locally constrained diffusion kernel, and we demonstrate that the resulting kernel allows for more accurate support vector machine prediction of protein functional classifications from metabolic and protein–protein interaction networks.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Masking effect produced by Mach bands on the detection of narrow bars of random polarity

Henning, GB., Hoddinott, KT., Wilson-Smith, ZJ., Hill, NJ.

Journal of the Optical Society of America, 21(8):1379-1387, A, August 2004 (article)

ei

[BibTex]

[BibTex]


no image
Riemannian Geometry on Graphs and its Application to Ranking and Classification

Zhou, D.

June 2004 (talk)

Abstract
We consider the problem of transductive inference. In many real-world problems, unlabeled data is far easier to obtain than labeled data. Hence transductive inference is very significant in many practical problems. According to Vapnik's point of view, one should predict the function value only on the given points directly rather than a function defined on the whole space, the latter being a more complicated problem. Inspired by this idea, we develop discrete calculus on finite discrete spaces, and then build discrete regularization. A family of transductive algorithms is naturally derived from this regularization framework. We validate the algorithms on both synthetic and real-world data from text/web categorization to bioinformatics problems. A significant by-product of this work is a powerful way of ranking data based on examples including images, documents, proteins and many other kinds of data.

ei

PDF [BibTex]


no image
Support Vector Channel Selection in BCI

Lal, T., Schröder, M., Hinterberger, T., Weston, J., Bogdan, M., Birbaumer, N., Schölkopf, B.

IEEE Transactions on Biomedical Engineering, 51(6):1003-1010, June 2004 (article)

Abstract
Designing a Brain Computer Interface (BCI) system one can choose from a variety of features that may be useful for classifying brain activity during a mental task. For the special case of classifying EEG signals we propose the usage of the state of the art feature selection algorithms Recursive Feature Elimination and Zero-Norm Optimization which are based on the training of Support Vector Machines (SVM). These algorithms can provide more accurate solutions than standard filter methods for feature selection. We adapt the methods for the purpose of selecting EEG channels. For a motor imagery paradigm we show that the number of used channels can be reduced significantly without increasing the classification error. The resulting best channels agree well with the expected underlying cortical activity patterns during the mental tasks. Furthermore we show how time dependent task specific information can be visualized.

ei

DOI [BibTex]

DOI [BibTex]


no image
Distance-Based Classification with Lipschitz Functions

von Luxburg, U., Bousquet, O.

Journal of Machine Learning Research, 5, pages: 669-695, June 2004 (article)

Abstract
The goal of this article is to develop a framework for large margin classification in metric spaces. We want to find a generalization of linear decision functions for metric spaces and define a corresponding notion of margin such that the decision function separates the training points with a large margin. It will turn out that using Lipschitz functions as decision functions, the inverse of the Lipschitz constant can be interpreted as the size of a margin. In order to construct a clean mathematical setup we isometrically embed the given metric space into a Banach space and the space of Lipschitz functions into its dual space. To analyze the resulting algorithm, we prove several representer theorems. They state that there always exist solutions of the Lipschitz classifier which can be expressed in terms of distance functions to training points. We provide generalization bounds for Lipschitz classifiers in terms of the Rademacher complexities of some Lipschitz function classes. The generality of our approach can be seen from the fact that several well-known algorithms are special cases of the Lipschitz classifier, among them the support vector machine, the linear programming machine, and the 1-nearest neighbor classifier.

ei

PDF PostScript PDF [BibTex]

PDF PostScript PDF [BibTex]


no image
cDNA-Microarray Technology in Cartilage Research - Functional Genomics of Osteoarthritis [in German]

Aigner, T., Finger, F., Zien, A., Bartnik, E.

Zeitschrift f{\"u}r Orthop{\"a}die und ihre Grenzgebiete, 142(2):241-247, April 2004 (article)

Abstract
Functional genomics represents a new challenging approach in order to analyze complex diseases such as osteoarthritis on a molecular level. The characterization of the molecular changes of the cartilage cells, the chondrocytes, enables a better understanding of the pathomechanisms of the disease. In particular, the identification and characterization of new target molecules for therapeutic intervention is of interest. Also, potential molecular markers for diagnosis and monitoring of osteoarthritis contribute to a more appropriate patient management. The DNA-microarray technology complements (but does not replace) biochemical and biological research in new disease-relevant genes. Large-scale functional genomics will identify molecular networks such as yet identified players in the anabolic-catabolic balance of articular cartilage as well as disease-relevant intracellular signaling cascades so far rather unknown in articular chondrocytes. However, at the moment it is also important to recognize the limitations of the microarray technology in order to avoid over-interpretation of the results. This might lead to misleading results and prevent to a significant extent a proper use of the potential of this technology in the field of osteoarthritis.

ei

[BibTex]

[BibTex]


no image
A Compression Approach to Support Vector Model Selection

von Luxburg, U., Bousquet, O., Schölkopf, B.

Journal of Machine Learning Research, 5, pages: 293-323, April 2004 (article)

Abstract
In this paper we investigate connections between statistical learning theory and data compression on the basis of support vector machine (SVM) model selection. Inspired by several generalization bounds we construct "compression coefficients" for SVMs which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane. The main idea is to relate the coding precision to geometrical concepts such as the width of the margin or the shape of the data in the feature space. The so derived compression coefficients combine well known quantities such as the radius-margin term R^2/rho^2, the eigenvalues of the kernel matrix, and the number of support vectors. To test whether they are useful in practice we ran model selection experiments on benchmark data sets. As a result we found that compression coefficients can fairly accurately predict the parameters for which the test error is minimized.

ei

PDF [BibTex]

PDF [BibTex]


no image
Learning from Labeled and Unlabeled Data: Semi-supervised Learning and Ranking

Zhou, D.

January 2004 (talk)

Abstract
We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.

ei

PDF [BibTex]


no image
Experimentally optimal v in support vector regression for different noise models and parameter settings

Chalimourda, A., Schölkopf, B., Smola, A.

Neural Networks, 17(1):127-141, January 2004 (article)

Abstract
In Support Vector (SV) regression, a parameter ν controls the number of Support Vectors and the number of points that come to lie outside of the so-called var epsilon-insensitive tube. For various noise models and SV parameter settings, we experimentally determine the values of ν that lead to the lowest generalization error. We find good agreement with the values that had previously been predicted by a theoretical argument based on the asymptotic efficiency of a simplified model of SV regression. As a side effect of the experiments, valuable information about the generalization behavior of the remaining SVM parameters and their dependencies is gained. The experimental findings are valid even for complex ‘real-world’ data sets. Based on our results on the role of the ν-SVM parameters, we discuss various model selection methods.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Introduction to Category Theory

Bousquet, O.

Internal Seminar, January 2004 (talk)

Abstract
A brief introduction to the general idea behind category theory with some basic definitions and examples. A perspective on higher dimensional categories is given.

ei

PDF [BibTex]

PDF [BibTex]


no image
Protein ranking: from local to global structure in the protein similarity network

Weston, J., Elisseeff, A., Zhou, D., Leslie, C., Noble, W.

Proceedings of the National Academy of Science, 101(17):6559-6563, 2004 (article)

Abstract
Biologists regularly search databases of DNA or protein sequences for evolutionary or functional relationships to a given query sequence. We describe a ranking algorithm that exploits the entire network structure of similarity relationships among proteins in a sequence database by performing a diffusion operation on a pre-computed, weighted network. The resulting ranking algorithm, evaluated using a human-curated database of protein structures, is efficient and provides significantly better rankings than a local network search algorithm such as PSI-BLAST.

ei

Web [BibTex]

Web [BibTex]


no image
Statistical Performance of Support Vector Machines

Blanchard, G., Bousquet, O., Massart, P.

2004 (article)

ei

PostScript [BibTex]


no image
Asymptotic Properties of the Fisher Kernel

Tsuda, K., Akaho, S., Kawanabe, M., Müller, K.

Neural Computation, 16(1):115-137, 2004 (article)

ei

PDF [BibTex]

PDF [BibTex]


no image
Some observations on the effects of slant and texture type on slant-from-texture

Rosas, P., Wichmann, F., Wagemans, J.

Vision Research, 44(13):1511-1535, 2004 (article)

Abstract
We measure the performance of five subjects in a slant-discrimination task for differently textured planes. As textures we used uniform lattices, randomly displaced lattices, circles (polka dots), Voronoi tessellations, plaids, 1/f noise, “coherent” noise and a leopard skin-like texture. Our results show: (1) Improving performance with larger slants for all textures. (2) Thus, following from (1), cases of “non-symmetrical” performance around a particular orientation. (3) For orientations sufficiently slanted, the different textures do not elicit major differences in performance, (4) while for orientations closer to the vertical plane there are marked differences between them. (5) These differences allow a rank-order of textures to be formed according to their “helpfulness”– that is, how easy the discrimination task is when a particular texture is mapped on the plane. Polka dots tend to allow the best slant discrimination performance, noise patterns the worst. Two additional experiments were conducted to test the generality of the obtained rank-order. First, the tilt of the planes was rotated to break the axis of gravity present in the original discrimination experiment. Second, the task was changed to a slant report task via probe adjustment. The results of both control experiments confirmed the texture-based rank-order previously obtained. We comment on the importance of these results for depth perception research in general, and in particular the implications our results have for studies of cue combination (sensor fusion) using texture as one of the cues involved.

ei

PDF [BibTex]

PDF [BibTex]


no image
Minimizing the Cross Validation Error to Mix Kernel Matrices of Heterogeneous Biological Data

Tsuda, K., Uda, S., Kin, T., Asai, K.

Neural Processing Letters, 19, pages: 63-72, 2004 (article)

ei

PDF [BibTex]

PDF [BibTex]


no image
A Tutorial on Support Vector Regression

Smola, A., Schölkopf, B.

Statistics and Computing, 14(3):199-222, 2004 (article)

ei

Web [BibTex]

Web [BibTex]


no image
Bayesian analysis of the Scatterometer Wind Retrieval Inverse Problem: Some New Approaches

Cornford, D., Csato, L., Evans, D., Opper, M.

Journal of the Royal Statistical Society B, 66, pages: 1-17, 3, 2004 (article)

Abstract
The retrieval of wind vectors from satellite scatterometer observations is a non-linear inverse problem.A common approach to solving inverse problems is to adopt a Bayesian framework and to infer the posterior distribution of the parameters of interest given the observations by using a likelihood model relating the observations to the parameters, and a prior distribution over the parameters.We show how Gaussian process priors can be used efficiently with a variety of likelihood models, using local forward (observation) models and direct inverse models for the scatterometer.We present an enhanced Markov chain Monte Carlo method to sample from the resulting multimodal posterior distribution.We go on to show how the computational complexity of the inference can be controlled by using a sparse, sequential Bayes algorithm for estimation with Gaussian processes.This helps to overcome the most serious barrier to the use of probabilistic, Gaussian process methods in remote sensing inverse problems, which is the prohibitively large size of the data sets.We contrast the sampling results with the approximations that are found by using the sparse, sequential Bayes algorithm.

ei

PDF [BibTex]

PDF [BibTex]


no image
Feature Selection for Support Vector Machines Using Genetic Algorithms

Fröhlich, H., Chapelle, O., Schölkopf, B.

International Journal on Artificial Intelligence Tools (Special Issue on Selected Papers from the 15th IEEE International Conference on Tools with Artificial Intelligence 2003), 13(4):791-800, 2004 (article)

ei

Web [BibTex]

Web [BibTex]


no image
Phenotypic Characterization of Human Chondrocyte Cell Line C-20/A4: A Comparison between Monolayer and Alginate Suspension Culture

Finger, F., Schorle, C., Söder, S., Zien, A., Goldring, M., Aigner, T.

Cells Tissues Organs, 178(2):65-77, 2004 (article)

Abstract
DNA microarray analysis was used to investigate the molecular phenotype of one of the first human chondrocyte cell lines, C-20/A4, derived from juvenile costal chondrocytes by immortalization with origin-defective simian virus 40 large T antigen. Clontech Human Cancer Arrays 1.2 and quantitative PCR were used to examine gene expression profiles of C-20/A4 cells cultured in the presence of serum in monolayer and alginate beads. In monolayer cultures, genes involved in cell proliferation were strongly upregulated compared to those expressed by human adult articular chondrocytes in primary culture. Of the cell cycle-regulated genes, only two, the CDK regulatory subunit and histone H4, were downregulated after culture in alginate beads, consistent with the ability of these cells to proliferate in suspension culture. In contrast, the expression of several genes that are involved in pericellular matrix formation, including MMP-14, COL6A1, fibronectin, biglycan and decorin, was upregulated when the C-20/A4 cells were transferred to suspension culture in alginate. Also, nexin-1, vimentin, and IGFBP-3, which are known to be expressed by primary chondrocytes, were differentially expressed in our study. Consistent with the proliferative phenotype of this cell line, few genes involved in matrix synthesis and turnover were highly expressed in the presence of serum. These results indicate that immortalized chondrocyte cell lines, rather than substituting for primary chondrocytes, may serve as models for extending findings on chondrocyte function not achievable by the use of primary chondrocytes.

ei

[BibTex]

[BibTex]


no image
Kernel Methods and their Potential Use in Signal Processing

Perez-Cruz, F., Bousquet, O.

IEEE Signal Processing Magazine, (Special issue on Signal Processing for Mining), 2004 (article) Accepted

ei

PostScript [BibTex]

PostScript [BibTex]


no image
Advanced Statistical Learning Theory

Bousquet, O.

Machine Learning Summer School, 2004 (talk)

ei

PDF [BibTex]

PDF [BibTex]

2003


no image
Concentration Inequalities for Sub-Additive Functions Using the Entropy Method

Bousquet, O.

Stochastic Inequalities and Applications, 56, pages: 213-247, Progress in Probability, (Editors: Giné, E., C. Houdré and D. Nualart), November 2003 (article)

Abstract
We obtain exponential concentration inequalities for sub-additive functions of independent random variables under weak conditions on the increments of those functions, like the existence of exponential moments for these increments. As a consequence of these general inequalities, we obtain refinements of Talagrand's inequality for empirical processes and new bounds for randomized empirical processes. These results are obtained by further developing the entropy method introduced by Ledoux.

ei

PostScript [BibTex]

2003


PostScript [BibTex]


no image
Statistical Learning Theory

Bousquet, O.

Machine Learning Summer School, August 2003 (talk)

ei

PDF [BibTex]

PDF [BibTex]


no image
Remarks on Statistical Learning Theory

Bousquet, O.

Machine Learning Summer School, August 2003 (talk)

ei

PDF [BibTex]

PDF [BibTex]


no image
Statistical Learning Theory, Capacity and Complexity

Schölkopf, B.

Complexity, 8(4):87-94, July 2003 (article)

Abstract
We give an exposition of the ideas of statistical learning theory, followed by a discussion of how a reinterpretation of the insights of learning theory could potentially also benefit our understanding of a certain notion of complexity.

ei

Web DOI [BibTex]


no image
Dealing with large Diagonals in Kernel Matrices

Weston, J., Schölkopf, B., Eskin, E., Leslie, C., Noble, W.

Annals of the Institute of Statistical Mathematics, 55(2):391-408, June 2003 (article)

Abstract
In kernel methods, all the information about the training data is contained in the Gram matrix. If this matrix has large diagonal values, which arises for many types of kernels, then kernel methods do not perform well: We propose and test several methods for dealing with this problem by reducing the dynamic range of the matrix while preserving the positive definiteness of the Hessian of the quadratic programming problem that one has to solve when training a Support Vector Machine, which is a common kernel approach for pattern recognition.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
The em Algorithm for Kernel Matrix Completion with Auxiliary Data

Tsuda, K., Akaho, S., Asai, K.

Journal of Machine Learning Research, 4, pages: 67-81, May 2003 (article)

ei

PDF [BibTex]

PDF [BibTex]


no image
Constructing Descriptive and Discriminative Non-linear Features: Rayleigh Coefficients in Kernel Feature Spaces

Mika, S., Rätsch, G., Weston, J., Schölkopf, B., Smola, A., Müller, K.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5):623-628, May 2003 (article)

Abstract
We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh coefficient, we propose nonlinear generalizations of Fisher‘s discriminant and oriented PCA using support vector kernel functions. Extensive simulations show the utility of our approach.

ei

DOI [BibTex]

DOI [BibTex]


no image
Tractable Inference for Probabilistic Data Models

Csato, L., Opper, M., Winther, O.

Complexity, 8(4):64-68, April 2003 (article)

Abstract
We present an approximation technique for probabilistic data models with a large number of hidden variables, based on ideas from statistical physics. We give examples for two nontrivial applications. © 2003 Wiley Periodicals, Inc.

ei

PDF GZIP Web [BibTex]

PDF GZIP Web [BibTex]


no image
Feature selection and transduction for prediction of molecular bioactivity for drug design

Weston, J., Perez-Cruz, F., Bousquet, O., Chapelle, O., Elisseeff, A., Schölkopf, B.

Bioinformatics, 19(6):764-771, April 2003 (article)

Abstract
Motivation: In drug discovery a key task is to identify characteristics that separate active (binding) compounds from inactive (non-binding) ones. An automated prediction system can help reduce resources necessary to carry out this task. Results: Two methods for prediction of molecular bioactivity for drug design are introduced and shown to perform well in a data set previously studied as part of the KDD (Knowledge Discovery and Data Mining) Cup 2001. The data is characterized by very few positive examples, a very large number of features (describing three-dimensional properties of the molecules) and rather different distributions between training and test data. Two techniques are introduced specifically to tackle these problems: a feature selection method for unbalanced data and a classifier which adapts to the distribution of the the unlabeled test data (a so-called transductive method). We show both techniques improve identification performance and in conjunction provide an improvement over using only one of the techniques. Our results suggest the importance of taking into account the characteristics in this data which may also be relevant in other problems of a similar type.

ei

Web [BibTex]


no image
Rademacher and Gaussian averages in Learning Theory

Bousquet, O.

Universite de Marne-la-Vallee, March 2003 (talk)

ei

PDF [BibTex]

PDF [BibTex]


no image
Use of the Zero-Norm with Linear Models and Kernel Methods

Weston, J., Elisseeff, A., Schölkopf, B., Tipping, M.

Journal of Machine Learning Research, 3, pages: 1439-1461, March 2003 (article)

Abstract
We explore the use of the so-called zero-norm of the parameters of linear models in learning. Minimization of such a quantity has many uses in a machine learning context: for variable or feature selection, minimizing training error and ensuring sparsity in solutions. We derive a simple but practical method for achieving these goals and discuss its relationship to existing techniques of minimizing the zero-norm. The method boils down to implementing a simple modification of vanilla SVM, namely via an iterative multiplicative rescaling of the training data. Applications we investigate which aid our discussion include variable and feature selection on biological microarray data, and multicategory classification.

ei

PDF PostScript PDF [BibTex]

PDF PostScript PDF [BibTex]


no image
Introduction: Robots with Cognition?

Franz, MO.

6, pages: 38, (Editors: H.H. Bülthoff, K.R. Gegenfurtner, H.A. Mallot, R. Ulrich, F.A. Wichmann), 6. T{\"u}binger Wahrnehmungskonferenz (TWK), February 2003 (talk)

Abstract
Using robots as models of cognitive behaviour has a long tradition in robotics. Parallel to the historical development in cognitive science, one observes two major, subsequent waves in cognitive robotics. The first is based on ideas of classical, cognitivist Artificial Intelligence (AI). According to the AI view of cognition as rule-based symbol manipulation, these robots typically try to extract symbolic descriptions of the environment from their sensors that are used to update a common, global world representation from which, in turn, the next action of the robot is derived. The AI approach has been successful in strongly restricted and controlled environments requiring well-defined tasks, e.g. in industrial assembly lines. AI-based robots mostly failed, however, in the unpredictable and unstructured environments that have to be faced by mobile robots. This has provoked the second wave in cognitive robotics which tries to achieve cognitive behaviour as an emergent property from the interaction of simple, low-level modules. Robots of the second wave are called animats as their architecture is designed to closely model aspects of real animals. Using only simple reactive mechanisms and Hebbian-type or evolutionary learning, the resulting animats often outperformed the highly complex AI-based robots in tasks such as obstacle avoidance, corridor following etc. While successful in generating robust, insect-like behaviour, typical animats are limited to stereotyped, fixed stimulus-response associations. If one adopts the view that cognition requires a flexible, goal-dependent choice of behaviours and planning capabilities (H.A. Mallot, Kognitionswissenschaft, 1999, 40-48) then it appears that cognitive behaviour cannot emerge from a collection of purely reactive modules. It rather requires environmentally decoupled structures that work without directly engaging the actions that it is concerned with. This poses the current challenge to cognitive robotics: How can we build cognitive robots that show the robustness and the learning capabilities of animats without falling back into the representational paradigm of AI? The speakers of the symposium present their approaches to this question in the context of robot navigation and sensorimotor learning. In the first talk, Prof. Helge Ritter introduces a robot system for imitation learning capable of exploring various alternatives in simulation before actually performing a task. The second speaker, Angelo Arleo, develops a model of spatial memory in rat navigation based on his electrophysiological experiments. He validates the model on a mobile robot which, in some navigation tasks, shows a performance comparable to that of the real rat. A similar model of spatial memory is used to investigate the mechanisms of territory formation in a series of robot experiments presented by Prof. Hanspeter Mallot. In the last talk, we return to the domain of sensorimotor learning where Ralf M{\"o}ller introduces his approach to generate anticipatory behaviour by learning forward models of sensorimotor relationships.

ei

Web [BibTex]

Web [BibTex]


no image
An Introduction to Variable and Feature Selection.

Guyon, I., Elisseeff, A.

Journal of Machine Learning, 3, pages: 1157-1182, 2003 (article)

ei

[BibTex]

[BibTex]


no image
New Approaches to Statistical Learning Theory

Bousquet, O.

Annals of the Institute of Statistical Mathematics, 55(2):371-389, 2003 (article)

Abstract
We present new tools from probability theory that can be applied to the analysis of learning algorithms. These tools allow to derive new bounds on the generalization performance of learning algorithms and to propose alternative measures of the complexity of the learning task, which in turn can be used to derive new learning algorithms.

ei

PostScript [BibTex]

PostScript [BibTex]

1998


no image
SVMs — a practical consequence of learning theory

Schölkopf, B.

IEEE Intelligent Systems and their Applications, 13(4):18-21, July 1998 (article)

Abstract
My first exposure to Support Vector Machines came this spring when heard Sue Dumais present impressive results on text categorization using this analysis technique. This issue's collection of essays should help familiarize our readers with this interesting new racehorse in the Machine Learning stable. Bernhard Scholkopf, in an introductory overview, points out that a particular advantage of SVMs over other learning algorithms is that it can be analyzed theoretically using concepts from computational learning theory, and at the same time can achieve good performance when applied to real problems. Examples of these real-world applications are provided by Sue Dumais, who describes the aforementioned text-categorization problem, yielding the best results to date on the Reuters collection, and Edgar Osuna, who presents strong results on application to face detection. Our fourth author, John Platt, gives us a practical guide and a new technique for implementing the algorithm efficiently.

ei

PDF Web DOI [BibTex]

1998


PDF Web DOI [BibTex]


no image
Learning view graphs for robot navigation

Franz, M., Schölkopf, B., Mallot, H., Bülthoff, H.

Autonomous Robots, 5(1):111-125, March 1998 (article)

Abstract
We present a purely vision-based scheme for learning a topological representation of an open environment. The system represents selected places by local views of the surrounding scene, and finds traversable paths between them. The set of recorded views and their connections are combined into a graph model of the environment. To navigate between views connected in the graph, we employ a homing strategy inspired by findings of insect ethology. In robot experiments, we demonstrate that complex visual exploration and navigation tasks can thus be performed without using metric information.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]