Header logo is


2018


no image
Kernel Recursive ABC: Point Estimation with Intractable Likelihood

Kajihara, T., Kanagawa, M., Yamazaki, K., Fukumizu, K.

Proceedings of the 35th International Conference on Machine Learning, pages: 2405-2414, PMLR, July 2018 (conference)

Abstract
We propose a novel approach to parameter estimation for simulator-based statistical models with intractable likelihood. Our proposed method involves recursive application of kernel ABC and kernel herding to the same observed data. We provide a theoretical explanation regarding why the approach works, showing (for the population setting) that, under a certain assumption, point estimates obtained with this method converge to the true parameter, as recursion proceeds. We have conducted a variety of numerical experiments, including parameter estimation for a real-world pedestrian flow simulator, and show that in most cases our method outperforms existing approaches.

pn

Paper [BibTex]

2018


Paper [BibTex]


Thumb xl tease pic
Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients

Balles, L., Hennig, P.

In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018 (inproceedings) Accepted

Abstract
The ADAM optimizer is exceedingly popular in the deep learning community. Often it works very well, sometimes it doesn't. Why? We interpret ADAM as a combination of two aspects: for each weight, the update direction is determined by the sign of stochastic gradients, whereas the update magnitude is determined by an estimate of their relative variance. We disentangle these two aspects and analyze them in isolation, gaining insight into the mechanisms underlying ADAM. This analysis also extends recent results on adverse effects of ADAM on generalization, isolating the sign aspect as the problematic one. Transferring the variance adaptation to SGD gives rise to a novel method, completing the practitioner's toolbox for problems where ADAM fails.

pn

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]

2012


Thumb xl thumb hennigk2012
Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

In Proceedings of the 29th International Conference on Machine Learning, pages: 25-32, ICML ’12, (Editors: John Langford and Joelle Pineau), Omnipress, New York, NY, USA, ICML, July 2012 (inproceedings)

Abstract
Four decades after their invention, quasi- Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

ei ps pn

website+code pdf link (url) [BibTex]

2012


website+code pdf link (url) [BibTex]


no image
Learning Tracking Control with Forward Models

Bócsi, B., Hennig, P., Csató, L., Peters, J.

In pages: 259 -264, IEEE International Conference on Robotics and Automation (ICRA), May 2012 (inproceedings)

Abstract
Performing task-space tracking control on redundant robot manipulators is a difficult problem. When the physical model of the robot is too complex or not available, standard methods fail and machine learning algorithms can have advantages. We propose an adaptive learning algorithm for tracking control of underactuated or non-rigid robots where the physical model of the robot is unavailable. The control method is based on the fact that forward models are relatively straightforward to learn and local inversions can be obtained via local optimization. We use sparse online Gaussian process inference to obtain a flexible probabilistic forward model and second order optimization to find the inverse mapping. Physical experiments indicate that this approach can outperform state-of-the-art tracking control algorithms in this context.

ei pn

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Approximate Gaussian Integration using Expectation Propagation

Cunningham, J., Hennig, P., Lacoste-Julien, S.

In pages: 1-11, -, January 2012 (inproceedings) Submitted

Abstract
While Gaussian probability densities are omnipresent in applied mathematics, Gaussian cumulative probabilities are hard to calculate in any but the univariate case. We offer here an empirical study of the utility of Expectation Propagation (EP) as an approximate integration method for this problem. For rectangular integration regions, the approximation is highly accurate. We also extend the derivations to the more general case of polyhedral integration regions. However, we find that in this polyhedral case, EP's answer, though often accurate, can be almost arbitrarily wrong. These unexpected results elucidate an interesting and non-obvious feature of EP not yet studied in detail, both for the problem of Gaussian probabilities and for EP more generally.

ei pn

Web [BibTex]

Web [BibTex]


no image
Kernel Topic Models

Hennig, P., Stern, D., Herbrich, R., Graepel, T.

In Fifteenth International Conference on Artificial Intelligence and Statistics, 22, pages: 511-519, JMLR Proceedings, (Editors: Lawrence, N. D. and Girolami, M.), JMLR.org, AISTATS , 2012 (inproceedings)

Abstract
Latent Dirichlet Allocation models discrete data as a mixture of discrete distributions, using Dirichlet beliefs over the mixture weights. We study a variation of this concept, in which the documents' mixture weight beliefs are replaced with squashed Gaussian distributions. This allows documents to be associated with elements of a Hilbert space, admitting kernel topic models (KTM), modelling temporal, spatial, hierarchical, social and other structure between documents. The main challenge is efficient approximate inference on the latent Gaussian. We present an approximate algorithm cast around a Laplace approximation in a transformed basis. The KTM can also be interpreted as a type of Gaussian process latent variable model, or as a topic model conditional on document features, uncovering links between earlier work in these areas.

ei pn

PDF Web [BibTex]

PDF Web [BibTex]