Header logo is


2009


no image
Efficient Graphlet Kernels for Large Graph Comparison

Shervashidze, N., Vishwanathan, S., Petri, T., Mehlhorn, K., Borgwardt, K.

In JMLR Workshop and Conference Proceedings Volume 5: AISTATS 2009, pages: 488-495, (Editors: Van Dyk, D. , M. Welling), MIT Press, Cambridge, MA, USA, Twelfth International Conference on Artificial Intelligence and Statistics, April 2009 (inproceedings)

Abstract
State-of-the-art graph kernels do not scale to large graphs with hundreds of nodes and thousands of edges. In this article we propose to compare graphs by counting {it graphlets}, ie subgraphs with $k$ nodes where $k in { 3, 4, 5 }$. Exhaustive enumeration of all graphlets being prohibitively expensive, we introduce two theoretically grounded speedup schemes, one based on sampling and the second one specifically designed for bounded degree graphs. In our experimental evaluation, our novel kernels allow us to efficiently compare large graphs that cannot be tackled by existing graph kernels.

ei

PDF Web [BibTex]

2009


PDF Web [BibTex]


no image
Overlap and refractory effects in a Brain-Computer Interface speller based on the visual P300 Event-Related Potential

Martens, S., Hill, N., Farquhar, J., Schölkopf, B.

Journal of Neural Engineering, 6(2):1-9, April 2009 (article)

Abstract
We reveal the presence of refractory and overlap effects in the event-related potentials in visual P300 speller datasets, and we show their negative impact on the performance of the system. This finding has important implications for how to encode the letters that can be selected for communication. However, we show that such effects are dependent on stimulus parameters: an alternative stimulus type based on apparent motion suffers less from the refractory effects and leads to an improved letter prediction performance.

ei

PDF DOI [BibTex]


no image
Online blind deconvolution for astronomical imaging

Harmeling, S., Hirsch, M., Sra, S., Schölkopf, B.

In Proceedings of the First IEEE International Conference Computational Photography (ICCP 2009), pages: 1-7, IEEE, Piscataway, NJ, USA, First IEEE International Conference on Computational Photography (ICCP), April 2009 (inproceedings)

Abstract
Atmospheric turbulences blur astronomical images taken by earth-based telescopes. Taking many short-time exposures in such a situation provides noisy images of the same object, where each noisy image has a different blur. Commonly astronomers apply a technique called “Lucky Imaging” that selects a few of the recorded frames that fulfill certain criteria, such as reaching a certain peak intensity (“Strehl ratio”). The selected frames are then averaged to obtain a better image. In this paper we introduce and analyze a new method that exploits all the frames and generates an improved image in an online fashion. Our initial experiments with controlled artificial data and real-world astronomical datasets yields promising results.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
A kernel method for unsupervised structured network inference

Lippert, C., Stegle, O., Ghahramani, Z., Borgwardt, KM.

In JMLR Workshop and Conference Proceedings Volume 5: AISTATS 2009, pages: 368-375, (Editors: Van Dyk, D. , M. Welling), MIT Press, Cambridge, MA, USA, Twelfth International Conference on Artificial Intelligence and Statistics, April 2009 (inproceedings)

Abstract
Network inference is the problem of inferring edges between a set of real-world objects, for instance, interactions between pairs of proteins in bioinformatics. Current kernel-based approaches to this problem share a set of common features: (i) they are supervised and hence require labeled training data; (ii) edges in the network are treated as mutually independent and hence topological properties are largely ignored; (iii) they lack a statistical interpretation. We argue that these common assumptions are often undesirable for network inference, and propose (i) an unsupervised kernel method (ii) that takes the global structure of the network into account and (iii) is statistically motivated. We show that our approach can explain commonly used heuristics in statistical terms. In experiments on social networks, different variants of our method demonstrate appealing predictive performance.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
PAC-Bayesian Generalization Bound for Density Estimation with Application to Co-clustering

Seldin, Y., Tishby, N.

In JMLR Workshop and Conference Proceedings Volume 5: AISTATS 2009, pages: 472-479, MIT Press, Cambridge, MA, USA, 12th International Conference on Artificial Intelligence and Statistics, April 2009 (inproceedings)

Abstract
We derive a PAC-Bayesian generalization bound for density estimation. Similar to the PAC-Bayesian generalization bound for classification, the result has the appealingly simple form of a tradeoff between empirical performance and the KL-divergence of the posterior from the prior. Moreover, the PAC-Bayesian generalization bound for classification can be derived as a special case of the bound for density estimation. To illustrate a possible application of our bound we derive a generalization bound for co-clustering. The bound provides a criterion to evaluate the ability of co-clustering to predict new co-occurrences, thus introducing a supervised flavor to this traditionally unsupervised task.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions

Bubeck, S., von Luxburg, U.

Journal of Machine Learning Research, 10, pages: 657-698, March 2009 (article)

Abstract
Clustering is often formulated as a discrete optimization problem. The objective is to find, among all partitions of the data set, the best one according to some quality measure. However, in the statistical setting where we assume that the finite data set has been sampled from some underlying space, the goal is not to find the best partition of the given sample, but to approximate the true partition of the underlying space. We argue that the discrete optimization approach usually does not achieve this goal, and instead can lead to inconsistency. We construct examples which provably have this behavior. As in the case of supervised learning, the cure is to restrict the size of the function classes under consideration. For appropriate “small” function classes we can prove very general consistency theorems for clustering optimization schemes. As one particular algorithm for clustering with a restricted function space we introduce “nearest neighbor clustering”. Similar to the k-nearest neighbor classifier in supervised learning, this algorithm can be seen as a general baseline algorithm to minimize arbitrary clustering objective functions. We prove that it is statistically consistent for all commonly used clustering objective functions.

ei

PDF Web [BibTex]


no image
ICA with Sparse Connections: Revisited

Zhang, K., Peng, H., Chan, L., Hyvärinen, A.

In Independent Component Analysis and Signal Separation, pages: 195-202, (Editors: Adali, T. , Christian Jutten, J.M. Travassos Romano, A. Kardec Barros), Springer, Berlin, Germany, 8th International Conference on Independent Component Analysis and Signal Separation (ICA), March 2009 (inproceedings)

Abstract
When applying independent component analysis (ICA), sometimes we expect the connections between the observed mixtures and the recovered independent components (or the original sources) to be sparse, to make the interpretation easier or to reduce the random effect in the results. In this paper we propose two methods to tackle this problem. One is based on adaptive Lasso, which exploits the L 1 penalty with data-adaptive weights. We show the relationship between this method and the classic information criteria such as BIC and AIC. The other is based on optimal brain surgeon, and we show how its stopping criterion is related to the information criteria. This method produces the solution path of the transformation matrix, with different number of zero entries. These methods involve low computational loads. Moreover, in each method, the parameter controlling the sparsity level of the transformation matrix has clear interpretations. By setting such parameters to certain values, the results of the proposed methods are consistent with those produced by classic information criteria.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
Tech-note: Iterative design and test of a multimodal experience

Reckter, H., Geiger, C., Singer, J., Streuber, S.

In Proceedings of the IEEE Symposium on 3D User Interfaces (3DUI 2009), pages: 99-102, (Editors: Kiyokawa, K. , S. Coquillart, R. Balakrishnan), IEEE Service Center, Piscataway, NJ, USA, IEEE Symposium on 3D User Interfaces (3DUI), March 2009 (inproceedings)

Abstract
The goal of the Turtle surf project described in this tech-note is to design, implement and evaluate a multimodal installation that should provide a good user experience in a virtual 3D world. For this purpose we combine audio-visual media forms and different types of haptic/tactile feedback. For the latter, we focus on the application of vibrational feedback, wind and water spray and heat. We follow a user-centered design approach and try to get user feedback as early as possible during the iterative design process. We present the conceptual idea of the Turtle surf project, and the iterative design and test of prototypes that helped us to refine the final design based on collected user feedback.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Protein Functional Class Prediction With a Combined Graph

Shin, H., Tsuda, K., Schölkopf, B.

Expert Systems with Applications, 36(2):3284-3292, March 2009 (article)

Abstract
In bioinformatics, there exist multiple descriptions of graphs for the same set of genes or proteins. For instance, in yeast systems, graph edges can represent different relationships such as protein–protein interactions, genetic interactions, or co-participation in a protein complex, etc. Relying on similarities between nodes, each graph can be used independently for prediction of protein function. However, since different graphs contain partly independent and partly complementary information about the problem at hand, one can enhance the total information extracted by combining all graphs. In this paper, we propose a method for integrating multiple graphs within a framework of semi-supervised learning. The method alternates between minimizing the objective function with respect to network output and with respect to combining weights. We apply the method to the task of protein functional class prediction in yeast. The proposed method performs significantly better than the same algorithm trained on any singl e graph.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Gaussian Process Dynamic Programming

Deisenroth, M., Rasmussen, C., Peters, J.

Neurocomputing, 72(7-9):1508-1524, March 2009 (article)

Abstract
Reinforcement learning (RL) and optimal control of systems with contin- uous states and actions require approximation techniques in most interesting cases. In this article, we introduce Gaussian process dynamic programming (GPDP), an approximate value-function based RL algorithm. We consider both a classic optimal control problem, where problem-specific prior knowl- edge is available, and a classic RL problem, where only very general priors can be used. For the classic optimal control problem, GPDP models the unknown value functions with Gaussian processes and generalizes dynamic programming to continuous-valued states and actions. For the RL problem, GPDP starts from a given initial state and explores the state space using Bayesian active learning. To design a fast learner, available data has to be used efficiently. Hence, we propose to learn probabilistic models of the a priori unknown transition dynamics and the value functions on the fly. In both cases, we successfully apply the resulting continuous-valued controllers to the under-actuated pendulum swing up and analyze the performances of the suggested algorithms. It turns out that GPDP uses data very efficiently and can be applied to problems, where classic dynamic programming would be cumbersome.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
An algebraic characterization of the optimum of regularized kernel methods

Dinuzzo, F., De Nicolao, G.

Machine Learning, 74(3):315-345, March 2009 (article)

Abstract
The representer theorem for kernel methods states that the solution of the associated variational problem can be expressed as the linear combination of a finite number of kernel functions. However, for non-smooth loss functions, the analytic characterization of the coefficients poses nontrivial problems. Standard approaches resort to constrained optimization reformulations which, in general, lack a closed-form solution. Herein, by a proper change of variable, it is shown that, for any convex loss function, the coefficients satisfy a system of algebraic equations in a fixed-point form, which may be directly obtained from the primal formulation. The algebraic characterization is specialized to regression and classification methods and the fixed-point equations are explicitly characterized for many loss functions of practical interest. The consequences of the main result are then investigated along two directions. First, the existence of an unconstrained smooth reformulation of the original non-smooth problem is proven. Second, in the context of SURE (Stein’s Unbiased Risk Estimation), a general formula for the degrees of freedom of kernel regression methods is derived.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Towards quantitative PET/MRI: a review of MR-based attenuation correction techniques

Hofmann, M., Pichler, B., Schölkopf, B., Beyer, T.

European Journal of Nuclear Medicine and Molecular Imaging, 36(Supplement 1):93-104, March 2009 (article)

Abstract
Introduction Positron emission tomography (PET) is a fully quantitative technology for imaging metabolic pathways and dynamic processes in vivo. Attenuation correction of raw PET data is a prerequisite for quantification and is typically based on separate transmission measurements. In PET/CT attenuation correction, however, is performed routinely based on the available CT transmission data. Objective Recently, combined PET/magnetic resonance (MR) has been proposed as a viable alternative to PET/CT. Current concepts of PET/MRI do not include CT-like transmission sources and, therefore, alternative methods of PET attenuation correction must be found. This article reviews existing approaches to MR-based attenuation correction (MR-AC). Most groups have proposed MR-AC algorithms for brain PET studies and more recently also for torso PET/MR imaging. Most MR-AC strategies require the use of complementary MR and transmission images, or morphology templates generated from transmission images. We review and discuss these algorithms and point out challenges for using MR-AC in clinical routine. Discussion MR-AC is work-in-progress with potentially promising results from a template-based approach applicable to both brain and torso imaging. While efforts are ongoing in making clinically viable MR-AC fully automatic, further studies are required to realize the potential benefits of MR-based motion compensation and partial volume correction of the PET data.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Generating Spike Trains with Specified Correlation Coefficients

Macke, J., Berens, P., Ecker, A., Tolias, A., Bethge, M.

Neural Computation, 21(2):397-423, February 2009 (article)

Abstract
Spike trains recorded from populations of neurons can exhibit substantial pairwise correlations between neurons and rich temporal structure. Thus, for the realistic simulation and analysis of neural systems, it is essential to have efficient methods for generating artificial spike trains with specified correlation structure. Here we show how correlated binary spike trains can be simulated by means of a latent multivariate gaussian model. Sampling from the model is computationally very efficient and, in particular, feasible even for large populations of neurons. The entropy of the model is close to the theoretical maximum for a wide range of parameters. In addition, this framework naturally extends to correlations over time and offers an elegant way to model correlated neural spike counts with arbitrary marginal distributions.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Automatic detection of preclinical neurodegeneration: Presymptomatic Huntington disease

Klöppel, S., Chu, C., Tan, G., Draganski, B., Johnson, H., Paulsen, J., Kienzle, W., Tabrizi, S., Ashburner, J., Frackowiak, R.

Neurology, 72(5):426-431, February 2009 (article)

Abstract
Background: Treatment of neurodegenerative diseases is likely to be most beneficial in the very early, possibly preclinical stages of degeneration. We explored the usefulness of fully automatic structural MRI classification methods for detecting subtle degenerative change. The availability of a definitive genetic test for Huntington disease (HD) provides an excellent metric for judging the performance of such methods in gene mutation carriers who are free of symptoms. Methods: Using the gray matter segment of MRI scans, this study explored the usefulness of a multivariate support vector machine to automatically identify presymptomatic HD gene mutation carriers (PSCs) in the absence of any a priori information. A multicenter data set of 96 PSCs and 95 age- and sex-matched controls was studied. The PSC group was subclassified into three groups based on time from predicted clinical onset, an estimate that is a function of DNA mutation size and age. Results: Subjects with at least a 33% chance of developing unequivocal signs of HD in 5 years were correctly assigned to the PSC group 69% of the time. Accuracy improved to 83% when regions affected by the disease were selected a priori for analysis. Performance was at chance when the probability of developing symptoms in 5 years was less than 10%. Conclusions: Presymptomatic Huntington disease gene mutation carriers close to estimated diagnostic onset were successfully separated from controls on the basis of single anatomic scans, without additional a priori information. Prior information is required to allow separation when degenerative changes are either subtle or variable.

ei

Web [BibTex]

Web [BibTex]


no image
Enumeration of condition-dependent dense modules in protein interaction networks

Georgii, E., Dietmann, S., Uno, T., Pagel, P., Tsuda, K.

Bioinformatics, 25(7):933-940, February 2009 (article)

Abstract
Motivation: Modern systems biology aims at understanding how the different molecular components of a biological cell interact. Often, cellular functions are performed by complexes consisting of many different proteins. The composition of these complexes may change according to the cellular environment, and one protein may be involved in several different processes. The automatic discovery of functional complexes from protein interaction data is challenging. While previous approaches use approximations to extract dense modules, our approach exactly solves the problem of dense module enumeration. Furthermore, constraints from additional information sources such as gene expression and phenotype data can be integrated, so we can systematically mine for dense modules with interesting profiles. Results: Given a weighted protein interaction network, our method discovers all protein sets that satisfy a user-defined minimum density threshold. We employ a reverse search strategy, which allows us to exploit the density criterion in an efficient way. Our experiments show that the novel approach is feasible and produces biologically meaningful results. In comparative validation studies using yeast data, the method achieved the best overall prediction performance with respect to confirmed complexes. Moreover, by enhancing the yeast network with phenotypic and phylogenetic profiles and the human network with tissue-specific expression data, we identified condition-dependent complex variants.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
Prototype Classification: Insights from Machine Learning

Graf, A., Bousquet, O., Rätsch, G., Schölkopf, B.

Neural Computation, 21(1):272-300, January 2009 (article)

Abstract
We shed light on the discrimination between patterns belonging to two different classes by casting this decoding problem into a generalized prototype framework. The discrimination process is then separated into two stages: a projection stage that reduces the dimensionality of the data by projecting it on a line and a threshold stage where the distributions of the projected patterns of both classes are separated. For this, we extend the popular mean-of-class prototype classification using algorithms from machine learning that satisfy a set of invariance properties. We report a simple yet general approach to express different types of linear classification algorithms in an identical and easy-to-visualize formal framework using generalized prototypes where these prototypes are used to express the normal vector and offset of the hyperplane. We investigate nonmargin classifiers such as the classical prototype classifier, the Fisher classifier, and the relevance vector machine. We then study hard and soft margin cl assifiers such as the support vector machine and a boosted version of the prototype classifier. Subsequently, we relate mean-of-class prototype classification to other classification algorithms by showing that the prototype classifier is a limit of any soft margin classifier and that boosting a prototype classifier yields the support vector machine. While giving novel insights into classification per se by presenting a common and unified formalism, our generalized prototype framework also provides an efficient visualization and a principled comparison of machine learning classification.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Automatic classification of brain resting states using fMRI temporal signals

Soldati, N., Robinson, S., Persello, C., Jovicich, J., Bruzzone, L.

Electronics Letters, 45(1):19-21, January 2009 (article)

Abstract
A novel technique is presented for the automatic discrimination between networks of dasiaresting statesdasia of the human brain and physiological fluctuations in functional magnetic resonance imaging (fMRI). The method is based on features identified via a statistical approach to group independent component analysis time courses, which may be extracted from fMRI data. This technique is entirely automatic and, unlike other approaches, uses temporal rather than spatial information. The method achieves 83% accuracy in the identification of resting state networks.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
The DICS repository: module-assisted analysis of disease-related gene lists

Dietmann, S., Georgii, E., Antonov, A., Tsuda, K., Mewes, H.

Bioinformatics, 25(6):830-831, January 2009 (article)

Abstract
The DICS database is a dynamic web repository of computationally predicted functional modules from the human protein–protein interaction network. It provides references to the CORUM, DrugBank, KEGG and Reactome pathway databases. DICS can be accessed for retrieving sets of overlapping modules and protein complexes that are significantly enriched in a gene list, thereby providing valuable information about the functional context.

ei

Web DOI [BibTex]

Web DOI [BibTex]


no image
mGene: accurate SVM-based gene finding with an application to nematode genomes

Schweikert, G., Zien, A., Zeller, G., Behr, J., Dieterich, C., Ong, C., Philips, P., De Bona, F., Hartmann, L., Bohlen, A., Krüger, N., Sonnenburg, S., Rätsch, G.

Genome Research, 19(11):2133-43, 2009 (article)

Abstract
We present a highly accurate gene-prediction system for eukaryotic genomes, called mGene. It combines in an unprecedented manner the flexibility of generalized hidden Markov models (gHMMs) with the predictive power of modern machine learning methods, such as Support Vector Machines (SVMs). Its excellent performance was proved in an objective competition based on the genome of the nematode Caenorhabditis elegans. Considering the average of sensitivity and specificity, the developmental version of mGene exhibited the best prediction performance on nucleotide, exon, and transcript level for ab initio and multiple-genome gene-prediction tasks. The fully developed version shows superior performance in 10 out of 12 evaluation criteria compared with the other participating gene finders, including Fgenesh++ and Augustus. An in-depth analysis of mGene's genome-wide predictions revealed that approximately 2200 predicted genes were not contained in the current genome annotation. Testing a subset of 57 of these genes by RT-PCR and sequencing, we confirmed expression for 24 (42%) of them. mGene missed 300 annotated genes, out of which 205 were unconfirmed. RT-PCR testing of 24 of these genes resulted in a success rate of merely 8%. These findings suggest that even the gene catalog of a well-studied organism such as C. elegans can be substantially improved by mGene's predictions. We also provide gene predictions for the four nematodes C. briggsae, C. brenneri, C. japonica, and C. remanei. Comparing the resulting proteomes among these organisms and to the known protein universe, we identified many species-specific gene inventions. In a quality assessment of several available annotations for these genomes, we find that mGene's predictions are most accurate.

ei

DOI [BibTex]

DOI [BibTex]


no image
Efficient Bregman Range Search

Cayton, L.

In Advances in Neural Information Processing Systems 22, pages: 243-251, (Editors: Bengio, Y. , D. Schuurmans, J. Lafferty, C. Williams, A. Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
We develop an algorithm for efficient range search when the notion of dissimilarity is given by a Bregman divergence. The range search task is to return all points in a potentially large database that are within some specified distance of a query. It arises in many learning algorithms such as locally-weighted regression, kernel density estimation, neighborhood graph-based algorithms, and in tasks like outlier detection and information retrieval. In metric spaces, efficient range search-like algorithms based on spatial data structures have been deployed on a variety of statistical tasks. Here we describe an algorithm for range search for an arbitrary Bregman divergence. This broad class of dissimilarity measures includes the relative entropy, Mahalanobis distance, Itakura-Saito divergence, and a variety of matrix divergences. Metric methods cannot be directly applied since Bregman divergences do not in general satisfy the triangle inequality. We derive geometric properties of Bregman divergences that yield an efficient algorithm for range search based on a recently proposed space decomposition for Bregman divergences.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Structure and activity of the N-terminal substrate recognition domains in proteasomal ATPases

Djuranovic, S., Hartmann, MD., Habeck, M., Ursinus, A., Zwickl, P., Martin, J., Lupas, AN., Zeth, K.

Molecular Cell, 34(5):580-590, 2009 (article)

Abstract
The proteasome forms the core of the protein quality control system in archaea and eukaryotes and also occurs in one bacterial lineage, the Actinobacteria. Access to its proteolytic compartment is controlled by AAA ATPases, whose N-terminal domains (N domains) are thought to mediate substrate recognition. The N domains of an archaeal proteasomal ATPase, Archaeoglobus fulgidus PAN, and of its actinobacterial homolog, Rhodococcus erythropolis ARC, form hexameric rings, whose subunits consist of an N-terminal coiled coil and a C-terminal OB domain. In ARC-N, the OB domains are duplicated and form separate rings. PAN-N and ARC-N can act as chaperones, preventing the aggregation of heterologous proteins in vitro, and this activity is preserved in various chimeras, even when these include coiled coils and OB domains from unrelated proteins. The structures suggest a molecular mechanism for substrate processing based on concerted radial motions of the coiled coils relative to the OB rings.

ei

DOI [BibTex]

DOI [BibTex]


no image
Discussion of: Brownian Distance Covariance

Gretton, A., Fukumizu, K., Sriperumbudur, B.

The Annals of Applied Statistics, 3(4):1285-1294, 2009 (article)

ei

[BibTex]

[BibTex]


no image
Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions

Sriperumbudur, B., Fukumizu, K., Gretton, A., Lanckriet, G., Schölkopf, B.

In Advances in Neural Information Processing Systems 22, pages: 1750-1758, (Editors: Y Bengio and D Schuurmans and J Lafferty and C Williams and A Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
Embeddings of probability measures into reproducing kernel Hilbert spaces have been proposed as a straightforward and practical means of representing and comparing probabilities. In particular, the distance between embeddings (the maximum mean discrepancy, or MMD) has several key advantages over many classical metrics on distributions, namely easy computability, fast convergence and low bias of finite sample estimates. An important requirement of the embedding RKHS is that it be characteristic: in this case, the MMD between two distributions is zero if and only if the distributions coincide. Three new results on the MMD are introduced in the present study. First, it is established that MMD corresponds to the optimal risk of a kernel classifier, thus forming a natural link between the distance between distributions and their ease of classification. An important consequence is that a kernel must be characteristic to guarantee classifiability between distributions in the RKHS. Second, the class of characteristic kernels is broadened to incorporate all strictly positive definite kernels: these include non-translation invariant kernels and kernels on non-compact domains. Third, a generalization of the MMD is proposed for families of kernels, as the supremum over MMDs on a class of kernels (for instance the Gaussian kernels with different bandwidths). This extension is necessary to obtain a single distance measure if a large selection or class of characteristic kernels is potentially appropriate. This generalization is reasonable, given that it corresponds to the problem of learning the kernel by minimizing the risk of the corresponding kernel classifier. The generalized MMD is shown to have consistent finite sample estimates, and its performance is demonstrated on a homogeneity testing example.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Nonlinear directed acyclic structure learning with weakly additive noise models

Tillman, R., Gretton, A., Spirtes, P.

In Advances in Neural Information Processing Systems 22, pages: 1847-1855, (Editors: Bengio, Y. , D. Schuurmans, J. Lafferty, C. Williams, A. Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
The recently proposed emph{additive noise model} has advantages over previous structure learning algorithms, when attempting to recover some true data generating mechanism, since it (i) does not assume linearity or Gaussianity and (ii) can recover a unique DAG rather than an equivalence class. However, its original extension to the multivariate case required enumerating all possible DAGs, and for some special distributions, e.g. linear Gaussian, the model is invertible and thus cannot be used for structure learning. We present a new approach which combines a PC style search using recent advances in kernel measures of conditional dependence with local searches for additive noise models in substructures of the equivalence class. This results in a more computationally efficient approach that is useful for arbitrary distributions even when additive noise models are invertible. Experiments with synthetic and real data show that this method is more accurate than previous methods when data are nonlinear and/or non-Gaussian.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Efficient factor GARCH models and factor-DCC models

Zhang, K., Chan, L.

Quantitative Finance, 9(1):71-91, 2009 (article)

Abstract
We report that, in the estimation of univariate GARCH or multivariate generalized orthogonal GARCH (GO-GARCH) models, maximizing the likelihood is equivalent to making the standardized residuals as independent as possible. Based on this, we propose three factor GARCH models in the framework of GO-GARCH: independent-factor GARCH exploits factors that are statistically as independent as possible; factors in best-factor GARCH have the largest autocorrelation in their squared values such that their volatilities could be forecast well by univariate GARCH; and factors in conditional-decorrelation GARCH are conditionally as uncorrelated as possible. A convenient two-step method for estimating these models is introduced. Since the extracted factors may still have weak conditional correlations, we further propose factor-DCC models as an extension to the above factor GARCH models with dynamic conditional correlation (DCC) modelling the remaining conditional correlations between factors. Experimental results for the Hong Kong stock market show that conditional-decorrelation GARCH and independent-factor GARCH have better generalization performance than the original GO-GARCH, and that conditional-decorrelation GARCH (among factor GARCH models) and its extension with DCC embedded (among factor-DCC models) behave best.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Graphical models for decoding in BCI visual speller systems

Martens, S., Farquhar, J., Hill, J., Schölkopf, B.

In pages: 470-473, IEEE, 4th International IEEE EMBS Conference on Neural Engineering (NER), 2009 (inproceedings)

ei

DOI [BibTex]

DOI [BibTex]


no image
A Fast, Consistent Kernel Two-Sample Test

Gretton, A., Fukumizu, K., Harchaoui, Z., Sriperumbudur, B.

In Advances in Neural Information Processing Systems 22, pages: 673-681, (Editors: Bengio, Y. , D. Schuurmans, J. Lafferty, C. Williams, A. Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
A kernel embedding of probability distributions into reproducing kernel Hilbert spaces (RKHS) has recently been proposed, which allows the comparison of two probability measures P and Q based on the distance between their respective embeddings: for a sufficiently rich RKHS, this distance is zero if and only if P and Q coincide. In using this distance as a statistic for a test of whether two samples are from different distributions, a major difficulty arises in computing the significance threshold, since the empirical statistic has as its null distribution (where P = Q) an infinite weighted sum of x2 random variables. Prior finite sample approximations to the null distribution include using bootstrap resampling, which yields a consistent estimate but is computationally costly; and fitting a parametric model with the low order moments of the test statistic, which can work well in practice but has no consistency or accuracy guarantees. The main result of the present work is a novel estimate of the null distribution, computed from the eigenspectrum of the Gram matrix on the aggregate sample from P and Q, and having lower computational cost than the bootstrap. A proof of consistency of this estimate is provided. The performance of the null distribution estimate is compared with the bootstrap and parametric approaches on an artificial example, high dimensional multivariate data, and text.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Augmenting Feature-driven fMRI Analyses: Semi-supervised learning and resting state activity

Blaschko, M., Shelton, J., Bartels, A.

In Advances in Neural Information Processing Systems 22, pages: 126-134, (Editors: Bengio, Y. , D. Schuurmans, J. Lafferty, C. Williams, A. Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
Resting state activity is brain activation that arises in the absence of any task, and is usually measured in awake subjects during prolonged fMRI scanning sessions where the only instruction given is to close the eyes and do nothing. It has been recognized in recent years that resting state activity is implicated in a wide variety of brain function. While certain networks of brain areas have different levels of activation at rest and during a task, there is nevertheless significant similarity between activations in the two cases. This suggests that recordings of resting state activity can be used as a source of unlabeled data to augment discriminative regression techniques in a semi-supervised setting. We evaluate this setting empirically yielding three main results: (i) regression tends to be improved by the use of Laplacian regularization even when no additional unlabeled data are available, (ii) resting state data seem to have a similar marginal distribution to that recorded during the execution of a visual processing task implying largely similar types of activation, and (iii) this source of information can be broadly exploited to improve the robustness of empirical inference in fMRI studies, an inherently data poor domain.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Non-linear System Identification: Visual Saliency Inferred from Eye-Movement Data

Wichmann, F., Kienzle, W., Schölkopf, B., Franz, M.

Journal of Vision, 9(8):article 32, 2009 (article)

Abstract
For simple visual patterns under the experimenter's control we impose which information, or features, an observer can use to solve a given perceptual task. For natural vision tasks, however, there are typically a multitude of potential features in a given visual scene which the visual system may be exploiting when analyzing it: edges, corners, contours, etc. Here we describe a novel non-linear system identification technique based on modern machine learning methods that allows the critical features an observer uses to be inferred directly from the observer's data. The method neither requires stimuli to be embedded in noise nor is it limited to linear perceptive fields (classification images). We demonstrate our technique by deriving the critical image features observers fixate in natural scenes (bottom-up visual saliency). Unlike previous studies where the relevant structure is determined manually—e.g. by selecting Gabors as visual filters—we do not make any assumptions in this regard, but numerically infer number and properties them from the eye-movement data. We show that center-surround patterns emerge as the optimal solution for predicting saccade targets from local image structure. The resulting model, a one-layer feed-forward network with contrast gain-control, is surprisingly simple compared to previously suggested saliency models. Nevertheless, our model is equally predictive. Furthermore, our findings are consistent with neurophysiological hardware in the superior colliculus. Bottom-up visual saliency may thus not be computed cortically as has been thought previously.

ei

Web DOI [BibTex]


no image
mGene.web: a web service for accurate computational gene finding

Schweikert, G., Behr, J., Zien, A., Zeller, G., Ong, C., Sonnenburg, S., Rätsch, G.

Nucleic Acids Research, 37, pages: W312-6, 2009 (article)

Abstract
We describe mGene.web, a web service for the genome-wide prediction of protein coding genes from eukaryotic DNA sequences. It offers pre-trained models for the recognition of gene structures including untranslated regions in an increasing number of organisms. With mGene.web, users have the additional possibility to train the system with their own data for other organisms on the push of a button, a functionality that will greatly accelerate the annotation of newly sequenced genomes. The system is built in a highly modular way, such that individual components of the framework, like the promoter prediction tool or the splice site predictor, can be used autonomously. The underlying gene finding system mGene is based on discriminative machine learning techniques and its high accuracy has been demonstrated in an international competition on nematode genomes. mGene.web is available at http://www.mgene.org/web, it is free of charge and can be used for eukaryotic genomes of small to moderate size (several hundred Mbp).

ei

DOI [BibTex]

DOI [BibTex]


no image
Fast subtree kernels on graphs

Shervashidze, N., Borgwardt, K.

In Advances in Neural Information Processing Systems 22, pages: 1660-1668, (Editors: Bengio, Y. , D. Schuurmans, J. Lafferty, C. Williams, A. Culotta), Curran, Red Hook, NY, USA, 23rd Annual Conference on Neural Information Processing Systems (NIPS), 2009 (inproceedings)

Abstract
In this article, we propose fast subtree kernels on graphs. On graphs with n nodes and m edges and maximum degree d, these kernels comparing subtrees of height h can be computed in O(mh), whereas the classic subtree kernel by Ramon & G{\"a}rtner scales as O(n24dh). Key to this efficiency is the observation that the Weisfeiler-Lehman test of isomorphism from graph theory elegantly computes a subtree kernel as a byproduct. Our fast subtree kernels can deal with labeled graphs, scale up easily to large graphs and outperform state-of-the-art graph kernels on several classification benchmark datasets in terms of accuracy and runtime.

ei

PDF Web [BibTex]

PDF Web [BibTex]


Thumb xl screen shot 2012 02 21 at 15.56.00  2
On feature combination for multiclass object classification

Gehler, P., Nowozin, S.

In Proceedings of the Twelfth IEEE International Conference on Computer Vision, pages: 221-228, ICCV, 2009, oral presentation (inproceedings)

ei ps

project page, code, data GoogleScholar pdf DOI [BibTex]

project page, code, data GoogleScholar pdf DOI [BibTex]


no image
Magnetic mobile micro-robots

Pawashe, C., Floyd, S., Sitti, M.

7eme Journees Nationales de la Recherche en Robotique, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Gecko-Inspired Directional and Controllable Adhesion

Murphy, M. P., Aksak, B., Sitti, M.

Small, 5(2):170-175, WILEY-VCH Verlag, 2009 (article)

pi

Project Page [BibTex]

Project Page [BibTex]


no image
Multiple magnetic microrobot control using electrostatic anchoring

Pawashe, C., Floyd, S., Sitti, M.

Applied Physics Letters, 94(16):164108, AIP, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Characterization of bacterial actuation of micro-objects

Behkam, B., Sitti, M.

In Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, pages: 1022-1027, 2009 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Wet self-cleaning of biologically inspired elastomer mushroom shaped microfibrillar adhesives

Kim, S., Cheung, E., Sitti, M.

Langmuir, 25(13):7196-7199, ACS Publications, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Compliant footpad design analysis for a bio-inspired quadruped amphibious robot

Park, H. S., Sitti, M.

In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages: 645-651, 2009 (inproceedings)

pi

[BibTex]

[BibTex]


no image
A novel artificial hair receptor based on aligned PVDF micro/nano fibers

Weiting, Liu, Bilsay, Sumer, Cesare, Stefanini, Arianna, Menciassi, Fei, Li, Dajing, Chen, Paolo, Dario, Metin, Sitti, Xin, Fu

In Robotics and Biomimetics, 2008. ROBIO 2008. IEEE International Conference on, pages: 49-54, 2009 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Robot ceiling climbers harness new tricks

Marks, Paul

New Scientist, 202(2705):18-19, Reed Business Information, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Waalbot: Agile climbing with synthetic fibrillar dry adhesives

Murphy, M. P., Sitti, M.

In Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, pages: 1599-1600, 2009 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Biologically-Inspired Patterned and Coated Adhesives for Medical Devices

Glass, P, Chung, H, Lee, C, Tworkoski, E, Washburn, NR, Sitti, M

Journal of Medical Devices, 3(2):027537, American Society of Mechanical Engineers, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Modeling and experimental characterization of an untethered magnetic micro-robot

Pawashe, C., Floyd, S., Sitti, M.

The International Journal of Robotics Research, 28(8):1077-1094, Sage Publications, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Towards automated nanoassembly with the atomic force microscope: A versatile drift compensation procedure

Krohs, F., Onal, C., Sitti, M., Fatikow, S.

Journal of Dynamic Systems, Measurement, and Control, 131(6):061106, American Society of Mechanical Engineers, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Enhanced adhesion by gecko-inspired hierarchical fibrillar adhesives

Murphy, M. P., Kim, S., Sitti, M.

ACS applied materials \& interfaces, 1(4):849-855, American Chemical Society, 2009 (article)

pi

Project Page [BibTex]

Project Page [BibTex]


no image
Miniature devices: Voyage of the microrobots

Sitti, M.

Nature, 458(7242):1121-1122, Nature Publishing Group, 2009 (article)

pi

[BibTex]

[BibTex]


no image
Piezoelectric ultrasonic resonant micromotor with a volume of less than 1 mm 3 for use in medical microbots

Watson, B., Friend, J., Yeo, L., Sitti, M.

In Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, pages: 2225-2230, 2009 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Dynamic modeling and analysis of pitch motion of a basilisk lizard inspired quadruped robot running on water

Park, H. S., Floyd, S., Sitti, M.

In Robotics and Automation, 2009. ICRA’09. IEEE International Conference on, pages: 2655-2660, 2009 (inproceedings)

pi

[BibTex]

[BibTex]


no image
Dry spinning based spinneret based tunable engineered parameters (STEP) technique for controlled and aligned deposition of polymeric nanofibers

Nain, A. S., Sitti, M., Jacobson, A., Kowalewski, T., Amon, C.

Macromolecular rapid communications, 30(16):1406-1412, WILEY-VCH Verlag, 2009 (article)

pi

[BibTex]

[BibTex]