2012

ShapePheno: Unsupervised extraction of shape phenotypes from biological image collections

Karaletsos, T., Stegle, O., Dreyer, C., Winn, J., Borgwardt, K.

Bioinformatics, 28(7):1001-1008, April 2012 (article)

Abstract
Motivation: Accurate large-scale phenotyping has recently gained considerable importance in biology. For example, in genome wide association studies technological advances have rendered genotyping cheap, leaving phenotype acquisition as the major bottleneck. Automatic image analysis is one major strategy to phenotype individuals in large numbers. Current approaches for visual phenotyping focus predominantly on summarizing statistics and geometric measures, such as height and width of an individual, or color histograms and patterns. However, more subtle, but biologically informative phenotypes, such as the local deformation of the shape of an individual with respect to the population mean cannot be automatically extracted and quantified by current techniques. Results: We propose a probabilistic machine learning model that allows for the extraction of deformation phenotypes from biological images, making them available as quantitative traits for downstream analysis. Our approach jointly models a collection of images using a learned common template that is mapped onto each image through a deformable smooth transformation. In a case study we analyze the shape deformations of 388 guppy fish (Poecilia reticulata). We find that the flexible shape phenotypes our model extracts are complementary to basic geometric measures. Moreover, these quantitative traits assort the observations into distinct groups and can be mapped to polymorphic genetic loci of the sample set.

Web DOI [BibTex]

A New Perceptual Bias Reveals Suboptimal Population Decoding of Sensory Responses

Putzeys, T., Bethge, M., Wichmann, F., Wagemans, J., Goris, R.

PLoS Computational Biology, 8(4):1-13, April 2012 (article)

Abstract
Several studies have reported optimal population decoding of sensory responses in two-alternative visual discrimination tasks. Such decoding involves integrating noisy neural responses into a more reliable representation of the likelihood that the stimuli under consideration evoked the observed responses. Importantly, an ideal observer must be able to evaluate likelihood with high precision and only consider the likelihood of the two relevant stimuli involved in the discrimination task. We report a new perceptual bias suggesting that observers read out the likelihood representation with remarkably low precision when discriminating grating spatial frequencies. Using spectrally filtered noise, we induced an asymmetry in the likelihood function of spatial frequency. This manipulation mainly affects the likelihood of spatial frequencies that are irrelevant to the task at hand. Nevertheless, we find a significant shift in perceived grating frequency, indicating that observers evaluate likelihoods of a broad range of irrelevant frequencies and discard prior knowledge of stimulus alternatives when performing two-alternative discrimination.

Web DOI [BibTex]

Patterns of cis regulatory variation in diverse human populations

Stranger, BE., Montgomery, SB., Dimas, AS., Parts, L., Stegle, O., Ingle, CE., Sekowska, M., Smith, GD., Evans, D., Gutierrez-Arcelus, M., others

PLoS Genetics, 8(4):e1002639, April 2012 (article)

Abstract
The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants, but also more recently to assist in the interpretation and elucidation of disease signals. To date, many studies have looked in specific tissues and population-based samples, but there has been limited assessment of the degree of inter-population variability in regulatory variation. We analyzed genome-wide gene expression in lymphoblastoid cell lines from a total of 726 individuals from 8 global populations from the HapMap3 project and correlated gene expression levels with HapMap3 SNPs located in cis to the genes. We describe the influence of ancestry on gene expression levels within and between these diverse human populations and uncover a non-negligible impact on global patterns of gene expression. We further dissect the specific functional pathways differentiated between populations. We also identify 5,691 expression quantitative trait loci (eQTLs) after controlling for both non-genetic factors and population admixture and observe that half of the cis-eQTLs are replicated in one or more of the populations. We highlight patterns of eQTL-sharing between populations, which are partially determined by population genetic relatedness, and discover significant sharing of eQTL effects between Asians, European-admixed, and African subpopulations. Specifically, we observe that both the effect size and the direction of effect for eQTLs are highly conserved across populations. We observe an increasing proximity of eQTLs toward the transcription start site as sharing of eQTLs among populations increases, highlighting that variants close to TSS have stronger effects and therefore are more likely to be detected across a wider panel of populations. Together these results offer a unique picture and resource of the degree of differentiation among human populations in functional regulatory variation and provide an estimate for the transferability of complex trait variants across populations.

Web DOI [BibTex]

A Kernel Two-Sample Test

Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.

Journal of Machine Learning Research, 13, pages: 723-773, March 2012 (article)

Abstract
We propose a framework for analyzing and comparing distributions, which we use to construct statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS), and is called the maximum mean discrepancy (MMD). We present two distribution-free tests based on large deviation bounds for the MMD, and a third test based on the asymptotic distribution of this statistic. The MMD can be computed in quadratic time, although efficient linear time approximations are available. Our statistic is an instance of an integral probability metric, and various classical metrics on distributions are obtained when alternative function classes are used in place of an RKHS. We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
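The quadratic-time statistic described in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a hand-picked bandwidth and computes the biased estimate of the squared MMD (mean within-sample kernel values minus twice the mean cross-sample kernel value). The function names are illustrative and are not taken from the authors' software; no test threshold is computed here.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased quadratic-time estimate of the squared MMD between two samples."""

    def mean_kernel(us, vs):
        # Average kernel value over all pairs (u, v), including u == v.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

A value near zero suggests the two samples look alike under the chosen kernel, while larger values indicate a difference; turning the statistic into a test requires one of the thresholds discussed in the paper.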

PDF Web [BibTex]

Technical performance evaluation of a human brain PET/MRI system

Kolb, A., Wehrl, H., Hofmann, M., Judenhofer, M., Eriksson, L., Ladebeck, R., Lichy, M., Byars, L., Michel, C., Schlemmer, H., Schmand, M., Claussen, C., Sossi, V., Pichler, B.

European Radiology, 22(8):1776-1788, March 2012 (article)

Web DOI [BibTex]

Real-time detection of colored objects in multiple camera streams with off-the-shelf hardware components

Lampert, C., Peters, J.

Journal of Real-Time Image Processing, 7(1):31-41, March 2012 (article)

Abstract
We describe RTblob, a high speed vision system that detects objects in cluttered scenes based on their color and shape at a speed of over 800 frames/s. Because the system is available as open-source software and relies only on off-the-shelf PC hardware components, it can provide the basis for multiple application scenarios. As an illustrative example, we show how RTblob can be used in a robotic table tennis scenario to estimate ball trajectories through 3D space simultaneously from four camera images at a speed of 200 Hz.

PDF PDF DOI [BibTex]

A short note on parameter approximation for von Mises-Fisher distributions: and a fast implementation of I_s(x)

Sra, S.

Computational Statistics, 27(1):177-190, March 2012 (article)

Abstract
In high-dimensional directional statistics one of the most basic probability distributions is the von Mises-Fisher (vMF) distribution. Maximum likelihood estimation for the vMF distribution turns out to be surprisingly hard because of a difficult transcendental equation that needs to be solved for computing the concentration parameter κ. This paper is a followup to the recent paper of Tanabe et al. (Comput Stat 22(1):145–157, 2007), who exploited inequalities about Bessel function ratios to obtain an interval in which the parameter estimate for κ should lie; their observation lends theoretical validity to the heuristic approximation of Banerjee et al. (JMLR 6:1345–1382, 2005). Tanabe et al. (Comput Stat 22(1):145–157, 2007) also presented a fixed-point algorithm for computing improved approximations for κ. However, their approximations require (potentially significant) additional computation, and in this short paper we show that given the same amount of computation as their method, one can achieve more accurate approximations using a truncated Newton method. A more interesting contribution of this paper is a simple algorithm for computing I_s(x): the modified Bessel function of the first kind. Surprisingly, our naïve implementation turns out to be several orders of magnitude faster for large arguments common to high-dimensional data, than the standard implementations in well-established software such as Mathematica ©, Maple ©, and Gp/Pari.
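The estimation problem in this abstract can be made concrete with a small sketch. It solves A_p(κ) = R̄, where A_p(κ) = I_{p/2}(κ) / I_{p/2-1}(κ) and R̄ is the length of the mean of the unit-norm data vectors. The starting point is the closed-form approximation of Banerjee et al., followed by a couple of Newton steps. The Bessel function is evaluated with its textbook power series, which is an assumption made for self-containment and is not claimed to be the paper's implementation; all function names are hypothetical, and the series is only suitable for moderate arguments.

```python
import math


def bessel_i(order, x, tol=1e-15, max_terms=10000):
    """Modified Bessel function of the first kind via its power series (x > 0)."""
    # First term: (x/2)^order / Gamma(order + 1), computed in log space.
    term = math.exp(order * math.log(x / 2.0) - math.lgamma(order + 1.0))
    total = term
    quarter_x2 = 0.25 * x * x
    for k in range(1, max_terms):
        # Ratio of consecutive series terms: (x^2 / 4) / (k * (order + k)).
        term *= quarter_x2 / (k * (order + k))
        total += term
        if term < tol * total:
            break
    return total


def bessel_ratio(dim, kappa):
    """A_p(kappa) = I_{p/2}(kappa) / I_{p/2 - 1}(kappa) for dimension p = dim."""
    return bessel_i(dim / 2.0, kappa) / bessel_i(dim / 2.0 - 1.0, kappa)


def kappa_banerjee(dim, r_bar):
    """Closed-form approximation: r_bar * (p - r_bar^2) / (1 - r_bar^2)."""
    return r_bar * (dim - r_bar ** 2) / (1.0 - r_bar ** 2)


def kappa_newton(dim, r_bar, steps=2):
    """Refine the closed-form approximation with a few Newton steps."""
    kappa = kappa_banerjee(dim, r_bar)
    for _ in range(steps):
        a = bessel_ratio(dim, kappa)
        # Derivative of A_p: 1 - A_p^2 - ((p - 1) / kappa) * A_p.
        derivative = 1.0 - a * a - (dim - 1.0) / kappa * a
        kappa -= (a - r_bar) / derivative
    return kappa
```

For p = 3 the ratio has the closed form coth(κ) - 1/κ, which gives a convenient sanity check for both the series and the refinement.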

PDF PDF DOI [BibTex]

An online brain–computer interface based on shifting attention to concurrent streams of auditory stimuli

Hill, N., Schölkopf, B.

Journal of Neural Engineering, 9(2):026011, February 2012 (article)

Abstract
We report on the development and online testing of an electroencephalogram-based brain–computer interface (BCI) that aims to be usable by completely paralysed users—for whom visual or motor-system-based BCIs may not be suitable, and among whom reports of successful BCI use have so far been very rare. The current approach exploits covert shifts of attention to auditory stimuli in a dichotic-listening stimulus design. To compare the efficacy of event-related potentials (ERPs) and steady-state auditory evoked potentials (SSAEPs), the stimuli were designed such that they elicited both ERPs and SSAEPs simultaneously. Trial-by-trial feedback was provided online, based on subjects' modulation of N1 and P3 ERP components measured during single 5 s stimulation intervals. All 13 healthy subjects were able to use the BCI, with performance in a binary left/right choice task ranging from 75% to 96% correct across subjects (mean 85%). BCI classification was based on the contrast between stimuli in the attended stream and stimuli in the unattended stream, making use of every stimulus, rather than contrasting frequent standard and rare 'oddball' stimuli. SSAEPs were assessed offline: for all subjects, spectral components at the two exactly known modulation frequencies allowed discrimination of pre-stimulus from stimulus intervals, and of left-only stimuli from right-only stimuli when one side of the dichotic stimulus pair was muted. However, attention modulation of SSAEPs was not sufficient for single-trial BCI communication, even when the subject's attention was clearly focused well enough to allow classification of the same trials via ERPs. ERPs clearly provided a superior basis for BCI. The ERP results are a promising step towards the development of a simple-to-use, reliable yes/no communication system for users in the most severely paralysed states, as well as potential attention-monitoring and -training applications outside the context of assistive technology.

PDF DOI [BibTex]

A non-monotonic method for large-scale non-negative least squares

Kim, D., Sra, S., Dhillon, I. S.

Optimization Methods and Software, 28(5):1012-1039, February 2012 (article)

DOI [BibTex]

Inferring Networks of Diffusion and Influence

Gomez Rodriguez, M., Leskovec, J., Krause, A.

ACM Transactions on Knowledge Discovery from Data, 5(4:21), February 2012 (article)

Abstract
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected with a virus or publish the information, observing individual transmissions (who infects whom, or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and finds provably near-optimal networks. We demonstrate the effectiveness of our approach by tracing information diffusion in a set of 170 million blogs and news articles over a one year period to infer how information flows through the online media space. We find that the diffusion network of news for the top 1,000 media sites and blogs tends to have a core-periphery structure with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence with more general news media sites acting as connectors between them.

Web DOI [BibTex]

Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses

Stegle, O., Parts, L., Piipari, M., Winn, J., Durbin, R.

Nature Protocols, 7(3):500–507, February 2012 (article)

Abstract
We present PEER (probabilistic estimation of expression residuals), a software package implementing statistical models that improve the sensitivity and interpretability of genetic associations in population-scale expression data. This approach builds on factor analysis methods that infer broad variance components in the measurements. PEER takes as input transcript profiles and covariates from a set of individuals, and then outputs hidden factors that explain much of the expression variability. Optionally, these factors can be interpreted as pathway or transcription factor activations by providing prior information about which genes are involved in the pathway or targeted by the factor. The inferred factors are used in genetic association analyses. First, they are treated as additional covariates, and are included in the model to increase detection power for mapping expression traits. Second, they are analyzed as phenotypes themselves to understand the causes of global expression variability. PEER extends previous related surrogate variable models and can be implemented within hours on a desktop computer.

PDF DOI [BibTex]

Context-aware brain-computer interfaces: exploring the information space of user, technical system and environment

Zander, TO., Jatzev, S.

Journal of Neural Engineering, 9(1):016003, 10, February 2012 (article)

Abstract
Brain–computer interface (BCI) systems are usually applied in highly controlled environments such as research laboratories or clinical setups. However, many BCI-based applications are implemented in more complex environments. For example, patients might want to use a BCI system at home, and users without disabilities could benefit from BCI systems in special working environments. In these contexts, it might be more difficult to reliably infer information about brain activity, because many intervening factors add up and disturb the BCI feature space. One solution for this problem would be adding context awareness to the system. We propose to augment the available information space with additional channels carrying information about the user state, the environment and the technical system. In particular, passive BCI systems seem to be capable of adding highly relevant context information—otherwise covert aspects of user state. In this paper, we present a theoretical framework based on general human–machine system research for adding context awareness to a BCI system. Building on that, we present results from a study on a passive BCI, which allows access to the covert aspect of user state related to the perceived loss of control. This study is a proof of concept and demonstrates that context awareness could beneficially be implemented in and combined with a BCI system or a general human–machine system. The EEG data from this experiment are available for public download at www.phypa.org.

PDF DOI [BibTex]

How the initialization affects the stability of the k-means algorithm

Bubeck, S., Meila, M., von Luxburg, U.

ESAIM: Probability and Statistics, 16, pages: 436-452, January 2012 (article)

Web DOI [BibTex]

Joint Modelling of Confounding Factors and Prominent Genetic Regulators Provides Increased Accuracy in Genetical Genomics Studies

Fusi, N., Stegle, O., Lawrence, ND.

PLoS Computational Biology, 8(1):1-9, January 2012 (article)

Abstract
Expression quantitative trait loci (eQTL) studies are an integral tool to investigate the genetic component of gene expression variation. A major challenge in the analysis of such studies is the presence of hidden confounding factors, such as unobserved covariates or unknown subtle environmental perturbations. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, this new model can more accurately distinguish true genetic association signals from confounding variation. We applied our model and compared it to existing methods on different datasets and biological systems. PANAMA consistently performs better than alternative methods, and finds in particular substantially more trans regulators. Importantly, our approach not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies. A software implementation of PANAMA is freely available online at http://ml.sheffield.ac.uk/qtl/.

Web DOI [BibTex]

Bayesian flexible fitting of biomolecular structures into EM maps

Habeck, M.

Biophysical Journal, 2012 (article) Submitted

[BibTex]

Measurement and Calibration of Noise Bias in Weak Lensing Galaxy Shape Estimation

Kacprzak, T., Zuntz, J., Rowe, B., Bridle, S., Refregier, A., Amara, A., Voigt, L., Hirsch, M.

Monthly Notices of the Royal Astronomical Society (MNRAS), 2012 (article)

[BibTex]

LMM-Lasso: A Lasso Multi-Marker Mixed Model for Association Mapping with Population Structure Correction

Rakitsch, B., Lippert, C., Stegle, O., Borgwardt, KM.

Bioinformatics, 29(2):206-214, 2012 (article)

Web DOI [BibTex]

Existential neuroscience: a functional magnetic resonance imaging investigation of neural responses to reminders of one’s mortality

Quirin, M., Loktyushin, A., Arndt, J., Küstermann, E., Lo, Y., Kuhl, J., Eggert, L.

Social Cognitive and Affective Neuroscience, 7(2):193-198, 2012 (article)

Web DOI [BibTex]

Active learning for domain adaptation in the supervised classification of remote sensing images

Persello, C., Bruzzone, L.

IEEE Transactions on Geoscience and Remote Sensing, 50(11):4468-4483, 2012 (article)

DOI [BibTex]

Reinforcement learning to adjust parametrized motor primitives to new situations

Kober, J., Wilhelm, A., Oztop, E., Peters, J.

Autonomous Robots, 33(4):361-379, 2012 (article)

Abstract
Humans manage to adapt learned movements very quickly to new situations by generalizing learned behaviors from similar situations. In contrast, robots currently often need to re-learn the complete movement. In this paper, we propose a method that learns to generalize parametrized motor plans by adapting a small set of global parameters, called meta-parameters. We employ reinforcement learning to learn the required meta-parameters to deal with the current situation, described by states. We introduce an appropriate reinforcement learning algorithm based on a kernelized version of the reward-weighted regression. To show its feasibility, we evaluate this algorithm on a toy example and compare it to several previous approaches. Subsequently, we apply the approach to three robot tasks, i.e., the generalization of throwing movements in darts, of hitting movements in table tennis, and of throwing balls where the tasks are learned on several different real physical robots, i.e., a Barrett WAM, a BioRob, the JST-ICORP/SARCOS CBi and a Kuka KR 6.

PDF PDF DOI [BibTex]

On the Empirical Estimation of Integral Probability Metrics

Sriperumbudur, B., Fukumizu, K., Gretton, A., Schölkopf, B., Lanckriet, G.

Electronic Journal of Statistics, 6, pages: 1550-1599, 2012 (article)

Web DOI [BibTex]

Effect of MR contrast agents on quantitative accuracy of PET in combined whole-body PET/MR imaging

Lois, C., Bezrukov, I., Schmidt, H., Schwenzer, N., Werner, M., Kupferschläger, J., Beyer, T.

European Journal of Nuclear Medicine and Molecular Imaging, 39(11):1756-1766, 2012 (article)

DOI [BibTex]

Multitask Learning in Computational Biology

Widmer, C., Rätsch, G.

JMLR W&CP. ICML 2011 Unsupervised and Transfer Learning Workshop, 27, pages: 207-216, 2012 (article)

Abstract
Computational Biology provides a wide range of applications for Multitask Learning (MTL) methods. As the generation of labels often is very costly in the biomedical domain, combining data from different related problems or tasks is a promising strategy to reduce label cost. In this paper, we present two problems from sequence biology, where MTL was successfully applied. For this, we use regularization-based MTL methods, with a special focus on the case of a hierarchical relationship between tasks. Furthermore, we propose strategies to refine the measure of task relatedness, which is of central importance in MTL, and finally give some practical guidelines on when MTL strategies are likely to pay off.

PDF [BibTex]

Arabidopsis defense against Botrytis cinerea: chronology and regulation deciphered by high-resolution temporal transcriptomic analysis

Windram, O., Madhou, P., McHattie, S., Hill, C., Hickman, R., Cooke, E., Jenkins, DJ., Penfold, CA., Baxter, Ll., Breeze, E., Kiddle, SJ., Rhodes, J., Atwell, S., Kliebenstein, D., Kim, Y-S., Stegle, O., Borgwardt, KM., others

The Plant Cell Online, 24(9):3530-3557, 2012, all authors: Oliver Windram, Priyadharshini Madhou, Stuart McHattie, Claire Hill, Richard Hickman, Emma Cooke, Dafyd J. Jenkins, Christopher A. Penfold, Laura Baxter, Emily Breeze, Steven J. Kiddle, Johanna Rhodes, Susanna Atwell, Daniel J. (article)

Web DOI [BibTex]

Improved Linear Mixed Models for Genome-Wide Association Studies

Listgarten, J., Lippert, C., Kadie, CM., Davidson, RI., Eskin, E., Heckerman, D.

Nature Methods, 9, pages: 525–526, 2012 (article)

DOI [BibTex]

Calibration of Boltzmann distribution priors in Bayesian data analysis

Mechelke, M., Habeck, M.

Physical Review E, 86(6):066705, 2012 (article)

DOI [BibTex]

CSB: A Python framework for computational structural biology

Kalev, I., Mechelke, M., Kopec, K., Holder, T., Carstens, S., Habeck, M.

Bioinformatics, 28(22):2996-2997, 2012 (article)

Abstract
Summary: Computational Structural Biology Toolbox (CSB) is a cross-platform Python class library for reading, storing and analyzing biomolecular structures with rich support for statistical analyses. CSB is designed for reusability and extensibility and comes with a clean, well-documented API following good object-oriented engineering practice. Availability: Stable release packages are available for download from the Python Package Index (PyPI), as well as from the project’s web site http://csb.codeplex.com.

Web DOI [BibTex]

Significant global reduction of carbon uptake by water-cycle driven extreme vegetation anomalies

Zscheischler, J., Mahecha, M., von Buttlar, J., Harmeling, S., Jung, M., Randerson, J., Reichstein, M.

Nature Geoscience, 2012 (article) In revision

[BibTex]

Design of a Haptic Interface for a Gastrointestinal Endoscopy Simulation

Yu, S., Woo, H. S., Son, H. I., Ahn, W., Jung, H., Lee, D. Y., Yi, S. Y.

Advanced Robotics, 26(18):2115-2143, 2012 (article)

DOI [BibTex]

Measurement and calibration of noise bias in weak lensing galaxy shape estimation

Kacprzak, T., Zuntz, J., Rowe, B., Bridle, S., Refregier, A., Amara, A., Voigt, L., Hirsch, M.

Monthly Notices of the Royal Astronomical Society, 427(4):2711-2722, Oxford University Press, 2012 (article)

DOI [BibTex]

Image analysis for cosmology: results from the GREAT10 Galaxy Challenge

Kitching, T. D., Balan, S. T., Bridle, S., Cantale, N., Courbin, F., Eifler, T., Gentile, M., Gill, M. S. S., Harmeling, S., Heymans, C., others

Monthly Notices of the Royal Astronomical Society, 423(4):3163-3208, Oxford University Press, 2012 (article)

DOI [BibTex]

First SN Discoveries from the Dark Energy Survey

Abbott, T., Abdalla, F., Achitouv, I., Ahn, E., Aldering, G., Allam, S., Alonso, D., Amara, A., Annis, J., Antonik, M., others

The Astronomer's Telegram, 4668, pages: 1, 2012 (article)

[BibTex]

A sensorimotor paradigm for Bayesian model selection

Genewein, T, Braun, DA

Frontiers in Human Neuroscience, 6(291):1-16, October 2012 (article)

Abstract
Sensorimotor control is thought to rely on predictive internal models in order to cope efficiently with uncertain environments. Recently, it has been shown that humans not only learn different internal models for different tasks, but that they also extract common structure between tasks. This raises the question of how the motor system selects between different structures or models, when each model can be associated with a range of different task-specific parameters. Here we design a sensorimotor task that requires subjects to compensate visuomotor shifts in a three-dimensional virtual reality setup, where one of the dimensions can be mapped to a model variable and the other dimension to the parameter variable. By introducing probe trials that are neutral in the parameter dimension, we can directly test for model selection. We found that model selection procedures based on Bayesian statistics provided a better explanation for subjects’ choice behavior than simple non-probabilistic heuristics. Our experimental design lends itself to the general study of model selection in a sensorimotor context, as it allows model and parameter variables to be queried separately from subjects.

DOI [BibTex]

Risk-Sensitivity in Bayesian Sensorimotor Integration

Grau-Moya, J, Ortega, PA, Braun, DA

PLoS Computational Biology, 8(9):1-7, September 2012 (article)

Abstract
Information processing in the nervous system during sensorimotor tasks with inherent uncertainty has been shown to be consistent with Bayesian integration. Bayes optimal decision-makers are, however, risk-neutral in the sense that they weigh all possibilities based on prior expectation and sensory evidence when they choose the action with highest expected value. In contrast, risk-sensitive decision-makers are sensitive to model uncertainty and bias their decision-making processes when they do inference over unobserved variables. In particular, they allow deviations from their probabilistic model in cases where this model makes imprecise predictions. Here we test for risk-sensitivity in a sensorimotor integration task where subjects exhibit Bayesian information integration when they infer the position of a target from noisy sensory feedback. When introducing a cost associated with subjects' response, we found that subjects exhibited a characteristic bias towards low cost responses when their uncertainty was high. This result is in accordance with risk-sensitive decision-making processes that allow for deviations from Bayes optimal decision-making in the face of uncertainty. Our results suggest that both Bayesian integration and risk-sensitivity are important factors to understand sensorimotor integration in a quantitative fashion.
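For intuition, the risk-neutral Bayesian integration that serves as the baseline in this abstract can be sketched for the textbook case of a Gaussian prior over target position combined with Gaussian sensory noise. The Gaussian assumptions and the function name are illustrative only; the paper's actual task model and its risk-sensitive extension are not reproduced here.

```python
def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one noisy Gaussian observation.

    Returns the posterior mean and variance of the latent target position.
    """
    # The observation is weighted by the relative reliability of the prior.
    weight = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + weight * (obs - prior_mean)
    post_var = prior_var * obs_var / (prior_var + obs_var)
    return post_mean, post_var
```

With equal prior and sensory variance the estimate lies halfway between the prior mean and the observation; as sensory noise grows, the estimate moves toward the prior mean. A risk-sensitive decision-maker would deviate from this estimate when uncertainty is high, as the abstract describes.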

DOI [BibTex]

2005

Kernel Methods for Measuring Independence

Gretton, A., Herbrich, R., Smola, A., Bousquet, O., Schölkopf, B.

Journal of Machine Learning Research, 6, pages: 2075-2129, December 2005 (article)

Abstract
We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prove that when the RKHSs are universal, both functionals are zero if and only if the random variables are pairwise independent. We also show that the kernel mutual information is an upper bound near independence on the Parzen window estimate of the mutual information. Analogous results apply for two correlation-based dependence functionals introduced earlier: we show the kernel canonical correlation and the kernel generalised variance to be independence measures for universal kernels, and prove the latter to be an upper bound on the mutual information near independence. The performance of the kernel dependence functionals in measuring independence is verified in the context of independent component analysis.

PDF PostScript PDF [BibTex]

A Unifying View of Sparse Approximate Gaussian Process Regression

Quinonero Candela, J., Rasmussen, C.

Journal of Machine Learning Research, 6, pages: 1935-1959, December 2005 (article)

Abstract
We provide a new unifying view, including all existing proper probabilistic sparse approximations for Gaussian process regression. Our approach relies on expressing the effective prior which the methods are using. This allows new insights to be gained, and highlights the relationship between existing methods. It also allows for a clear theoretically justified ranking of the closeness of the known approximations to the corresponding full GPs. Finally we point directly to designs of new better sparse approximations, combining the best of the existing strategies, within attractive computational constraints.

PDF [BibTex]

Method and device for detection of splice form and alternative splice forms in DNA or RNA sequences

Rätsch, G., Sonnenburg, S., Müller, K., Schölkopf, B.

European Patent Application, International No PCT/EP2005/005783, December 2005 (patent)

[BibTex]

Maximal Margin Classification for Metric Spaces

Hein, M., Bousquet, O., Schölkopf, B.

Journal of Computer and System Sciences, 71(3):333-359, October 2005 (article)

Abstract
In order to apply the maximum margin method in arbitrary metric spaces, we suggest embedding the metric space into a Banach or Hilbert space and performing linear classification in this space. We propose several embeddings and recall that an isometric embedding in a Banach space is always possible while an isometric embedding in a Hilbert space is only possible for certain metric spaces. As a result, we obtain a general maximum margin classification algorithm for arbitrary metric spaces (whose solution is approximated by an algorithm of Graepel). Interestingly enough, the embedding approach, when applied to a metric which can be embedded into a Hilbert space, yields the SVM algorithm, which emphasizes the fact that its solution depends on the metric and not on the kernel. Furthermore, we give upper bounds on the capacity of the function classes corresponding to both embeddings in terms of Rademacher averages. Finally, we compare the capacities of these function classes directly.

ei

PDF PDF DOI [BibTex]

Selective integration of multiple biological data for supervised network inference

Kato, T., Tsuda, K., Asai, K.

Bioinformatics, 21(10):2488, October 2005 (article)

ei

PDF [BibTex]

Assessing Approximate Inference for Binary Gaussian Process Classification

Kuss, M., Rasmussen, C.

Journal of Machine Learning Research, 6, pages: 1679, October 2005 (article)

Abstract
Gaussian process priors can be used to define flexible, probabilistic classification models. Unfortunately, exact Bayesian inference is analytically intractable and various approximation techniques have been proposed. In this work we review and compare Laplace's method and Expectation Propagation for approximate Bayesian inference in the binary Gaussian process classification model. We present a comprehensive comparison of the approximations, their predictive performance and marginal likelihood estimates to results obtained by MCMC sampling. We explain theoretically and corroborate empirically the advantages of Expectation Propagation compared to Laplace's method.

ei

PDF PDF [BibTex]

Clustering on the Unit Hypersphere using von Mises-Fisher Distributions

Banerjee, A., Dhillon, I., Ghosh, J., Sra, S.

Journal of Machine Learning Research, 6, pages: 1345-1382, September 2005 (article)

Abstract
Several large scale data mining applications, such as text categorization and gene expression analysis, involve high-dimensional data that is also inherently directional in nature. Often such data is L2 normalized so that it lies on the surface of a unit hypersphere. Popular models such as (mixtures of) multi-variate Gaussians are inadequate for characterizing such data. This paper proposes a generative mixture-model approach to clustering directional data based on the von Mises-Fisher (vMF) distribution, which arises naturally for data distributed on the unit hypersphere. In particular, we derive and analyze two variants of the Expectation Maximization (EM) framework for estimating the mean and concentration parameters of this mixture. Numerical estimation of the concentration parameters is non-trivial in high dimensions since it involves functional inversion of ratios of Bessel functions. We also formulate two clustering algorithms corresponding to the variants of EM that we derive. Our approach provides a theoretical basis for the use of cosine similarity that has been widely employed by the information retrieval community, and obtains the spherical kmeans algorithm (kmeans with cosine similarity) as a special case of both variants. Empirical results on clustering of high-dimensional text and gene-expression data based on a mixture of vMF distributions show that the ability to estimate the concentration parameter for each vMF component, which is not present in existing approaches, yields superior results, especially for difficult clustering tasks in high-dimensional spaces.
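The abstract notes that spherical kmeans (kmeans with cosine similarity) arises as a special case of the vMF mixture approach. As a rough illustration of that special case only, the sketch below clusters L2-normalized vectors by cosine similarity. It is not the paper's EM algorithm and does not estimate concentration parameters; the function names, the fixed initial centers, and the iteration count are assumptions made for this example.

```python
import math

def normalize(v):
    """Scale a non-zero vector to unit L2 norm (a point on the unit hypersphere)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Inner product; for unit vectors this equals the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def spherical_kmeans(points, init_centers, n_iter=20):
    """Hard-assignment clustering of directions by cosine similarity.

    points and init_centers are lists of non-zero vectors (hypothetical
    inputs for this sketch). Returns (labels, centers).
    """
    data = [normalize(p) for p in points]
    centers = [normalize(c) for c in init_centers]
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assignment step: choose the center with the largest cosine similarity.
        labels = [max(range(len(centers)), key=lambda k: dot(x, centers[k]))
                  for x in data]
        # Update step: sum the members of each cluster and renormalize.
        for k in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                total = [sum(col) for col in zip(*members)]
                if any(total):
                    centers[k] = normalize(total)
    return labels, centers

points = [[1.0, 0.1], [2.0, 0.0], [0.1, 1.0], [0.0, 3.0]]
labels, centers = spherical_kmeans(points, init_centers=[[1.0, 0.0], [0.0, 1.0]])
```

Because every vector is normalized first, the length of a point (for example, document length in text data) has no influence on the assignment; only its direction does.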

ei

PDF [BibTex]

Support Vector Machines for 3D Shape Processing

Steinke, F., Schölkopf, B., Blanz, V.

Computer Graphics Forum, 24(3, EUROGRAPHICS 2005):285-294, September 2005 (article)

Abstract
We propose statistical learning methods for approximating implicit surfaces and computing dense 3D deformation fields. Our approach is based on Support Vector (SV) Machines, which are state of the art in machine learning. It is straightforward to implement and computationally competitive; its parameters can be automatically set using standard machine learning methods. The surface approximation is based on a modified Support Vector regression. We present applications to 3D head reconstruction, including automatic removal of outliers and hole filling. In a second step, we build on our SV representation to compute dense 3D deformation fields between two objects. The fields are computed using a generalized SVMachine enforcing correspondence between the previously learned implicit SV object representations, as well as correspondences between feature points if such points are available. We apply the method to the morphing of 3D heads and other objects.

ei

PDF [BibTex]

Fast Protein Classification with Multiple Networks

Tsuda, K., Shin, H., Schölkopf, B.

Bioinformatics, 21(Suppl. 2):59-65, September 2005 (article)

Abstract
Support vector machines (SVM) have been successfully used to classify proteins into functional categories. Recently, to integrate multiple data sources, a semidefinite programming (SDP) based SVM method was introduced (Lanckriet et al., 2004). In SDP/SVM, multiple kernel matrices corresponding to each of the data sources are combined with weights obtained by solving an SDP. However, when trying to apply SDP/SVM to large problems, the computational cost can become prohibitive, since both converting the data to a kernel matrix for the SVM and solving the SDP are time and memory demanding. Another application-specific drawback arises when some of the data sources are protein networks. A common method of converting the network to a kernel matrix is the diffusion kernel method, which has time complexity of O(n^3), and produces a dense matrix of size n x n. We propose an efficient method of protein classification using multiple protein networks. Available protein networks, such as a physical interaction network or a metabolic network, can be directly incorporated. Vectorial data can also be incorporated after conversion into a network by means of neighbor point connection. Similarly to the SDP/SVM method, the combination weights are obtained by convex optimization. Due to the sparsity of network edges, the computation time is nearly linear in the number of edges of the combined network. Additionally, the combination weights provide information useful for discarding noisy or irrelevant networks. Experiments on function prediction of 3588 yeast proteins show promising results: the computation time is enormously reduced, while the accuracy is still comparable to the SDP/SVM method.

ei

PDF Web DOI [BibTex]

Correlation of EEG spectral entropy with regional cerebral blood flow during sevoflurane and propofol anaesthesia

Maksimow, A., Kaisti, K., Aalto, S., Mäenpää, M., Jääskeläinen, S., Hinkka, S., Martens, SMM., Särkelä, M., Viertiö-Oja, H., Scheinin, H.

Anaesthesia, 60(9):862-869, September 2005 (article)

Abstract
ENTROPY index monitoring, based on spectral entropy of the electroencephalogram, is a promising new method to measure the depth of anaesthesia. We examined the association between spectral entropy and regional cerebral blood flow in healthy subjects anaesthetised with 2%, 3% and 4% end-expiratory concentrations of sevoflurane and 7.6, 12.5 and 19.0 µg·ml⁻¹ plasma drug concentrations of propofol. Spectral entropy from the frequency band 0.8-32 Hz was calculated and cerebral blood flow assessed using positron emission tomography and [¹⁵O]-labelled water at baseline and at each anaesthesia level. Both drugs induced significant reductions in spectral entropy and cortical and global cerebral blood flow. Midfrontal-central spectral entropy was associated with individual frontal and whole brain blood flow values across all conditions, suggesting that this novel measure of anaesthetic depth can depict global changes in neuronal activity induced by the drugs. The cortical areas of the most significant associations were remarkably similar for both drugs.
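For readers unfamiliar with the quantity, spectral entropy can be pictured as the Shannon entropy of a signal's normalized power spectrum within a frequency band. The sketch below is a generic textbook-style computation over the 0.8-32 Hz band named in the abstract; it is not the ENTROPY monitor's algorithm nor the study's processing pipeline, and the function names, the naive DFT, and the sample rate used in the usage lines are assumptions for illustration.

```python
import cmath
import math

def power_spectrum(signal):
    """Return |DFT|^2 for frequency bins 1..n//2, using a naive O(n^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_entropy(signal, sample_rate, band=(0.8, 32.0)):
    """Shannon entropy of the in-band power spectrum, normalized to [0, 1]."""
    n = len(signal)
    power = power_spectrum(signal)
    # Bin k corresponds to the frequency k * sample_rate / n (in Hz).
    in_band = [p for k, p in enumerate(power, start=1)
               if band[0] <= k * sample_rate / n <= band[1]]
    total = sum(in_band)
    if total == 0 or len(in_band) < 2:
        return 0.0
    probs = [p / total for p in in_band]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Dividing by log(number of bins) maps a flat spectrum to 1.
    return entropy / math.log(len(in_band))

rate = 64  # assumed sampling rate in Hz, chosen only for this example
sine = [math.sin(2 * math.pi * 10 * t / rate) for t in range(128)]
impulse = [1.0] + [0.0] * 127
low = spectral_entropy(sine, rate)      # power concentrated in one bin
high = spectral_entropy(impulse, rate)  # power spread evenly over all bins
```

A signal dominated by a single rhythm gives a value near 0, while a signal with power spread evenly across the band gives a value near 1.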

ei

DOI [BibTex]

Iterative Kernel Principal Component Analysis for Image Modeling

Kim, K., Franz, M., Schölkopf, B.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(9):1351-1366, September 2005 (article)

Abstract
In recent years, Kernel Principal Component Analysis (KPCA) has been suggested for various image processing tasks requiring an image model, such as denoising or compression. The original form of KPCA, however, can only be applied to strongly restricted image classes due to the limited number of training examples that can be processed. We therefore propose a new iterative method for performing KPCA, the Kernel Hebbian Algorithm, which iteratively estimates the Kernel Principal Components with only linear order memory complexity. In our experiments, we compute models for complex image classes such as faces and natural images which require a large number of training examples. The resulting image models are tested in single-frame super-resolution and denoising applications. The KPCA model is not specifically tailored to these tasks; in fact, the same model can be used in super-resolution with variable input resolution, or denoising with unknown noise characteristics. In spite of this, both super-resolution and denoising performance are comparable to existing methods.

ei

Web DOI [BibTex]

Analyzing microarray data using quantitative association rules

Georgii, E., Richter, L., Rückert, U., Kramer, S.

Bioinformatics, 21(Suppl. 2):123-129, September 2005 (article)

Abstract
Motivation: We tackle the problem of finding regularities in microarray data. Various data mining tools, such as clustering, classification, Bayesian networks and association rules, have been applied so far to gain insight into gene-expression data. Association rule mining techniques used so far work on discretizations of the data and cannot account for cumulative effects. In this paper, we investigate the use of quantitative association rules that can operate directly on numeric data and represent cumulative effects of variables. Technically speaking, this type of quantitative association rules based on half-spaces can find non-axis-parallel regularities. Results: We performed a variety of experiments testing the utility of quantitative association rules for microarray data. First of all, the results should be statistically significant and robust against fluctuations in the data. Next, the approach should be scalable in the number of variables, which is important for such high-dimensional data. Finally, the rules should make sense biologically and be sufficiently different from rules found in regular association rule mining working with discretizations. In all of these dimensions, the proposed approach performed satisfactorily. Therefore, quantitative association rules based on half-spaces should be considered as a tool for the analysis of microarray gene-expression data.

ei

Web DOI [BibTex]

Large Margin Methods for Structured and Interdependent Output Variables

Tsochantaridis, I., Joachims, T., Hofmann, T., Altun, Y.

Journal of Machine Learning Research, 6, pages: 1453-1484, September 2005 (article)

Abstract
Learning general functional dependencies between arbitrary input and output spaces is one of the key challenges in computational intelligence. While recent progress in machine learning has mainly focused on designing flexible and powerful input representations, this paper addresses the complementary issue of designing classification algorithms that can deal with more complex outputs, such as trees, sequences, or sets. More generally, we consider problems involving multiple dependent output variables, structured output spaces, and classification problems with class attributes. In order to accomplish this, we propose to appropriately generalize the well-known notion of a separation margin and derive a corresponding maximum-margin formulation. While this leads to a quadratic program with a potentially prohibitive, i.e. exponential, number of constraints, we present a cutting plane algorithm that solves the optimization problem in polynomial time for a large class of problems. The proposed method has important applications in areas such as computational biology, natural language processing, information retrieval/extraction, and optical character recognition. Experiments from various domains involving different types of output spaces emphasize the breadth and generality of our approach.

ei

PDF [BibTex]

Gene Expression Profiling of Serum- and Interleukin-1beta-Stimulated Primary Human Adult Articular Chondrocytes - A Molecular Analysis Based on Chondrocytes Isolated from One Donor

Aigner, T., McKenna, L., Zien, A., Fan, Z., Gebhard, P., Zimmer, R.

Cytokine, 31(3):227-240, August 2005 (article)

Abstract
In order to understand the cellular disease mechanisms of osteoarthritic cartilage degeneration it is of primary importance to understand both the anabolic and the catabolic processes going on in parallel in the diseased tissue. In this study, we have applied cDNA-array technology (Clontech) to study gene expression patterns of primary human normal adult articular chondrocytes isolated from one donor cultured under anabolic (serum) and catabolic (IL-1beta) conditions. Significant differences between the different in vitro cultures tested were detected. Overall, serum and IL-1beta significantly altered gene expression levels of 102 and 79 genes, respectively. IL-1beta stimulated the matrix metalloproteinases-1, -3, and -13 as well as members of its intracellular signaling cascade, whereas serum increased the expression of many cartilage matrix genes. Comparative gene expression analysis with previously published in vivo data (normal and osteoarthritic cartilage) showed significant differences of all in vitro stimulations compared to the changes detected in osteoarthritic cartilage in vivo. This investigation allowed us to characterize gene expression profiles of two classical anabolic and catabolic stimuli of human adult articular chondrocytes in vitro. No in vitro model appeared to be adequate to study overall gene expression alterations in osteoarthritic cartilage. Serum-stimulated in vitro cultures largely reflected the results that were only consistent with the anabolic activation seen in osteoarthritic chondrocytes. In contrast, IL-1beta did not appear to be a good model for mimicking catabolic gene alterations in degenerating chondrocytes.

ei

Web [BibTex]