2020


Bayesian Optimization in Robot Learning - Automatic Controller Tuning and Sample-Efficient Methods

Marco-Valle, A.

University of Tübingen, June 2020 (thesis)

Abstract
The problem of designing controllers to regulate dynamical systems has been studied by engineers during the past millennia. Ever since, suboptimal performance has lingered in many closed loops as an unavoidable side effect of manually tuning the parameters of the controllers. Nowadays, industrial settings remain skeptical about data-driven methods that allow one to automatically learn controller parameters. In the context of robotics, machine learning (ML) keeps growing in influence, increasing autonomy and adaptability, for example by helping to automate controller tuning. However, data-hungry ML methods, such as standard reinforcement learning, require a large number of experimental samples, which is prohibitive in robotics, as hardware can deteriorate and break. This raises the following question: Can manual controller tuning in robotics be automated by using data-efficient machine learning techniques? In this thesis, we tackle this question by exploring Bayesian optimization (BO), a data-efficient ML framework, to reduce the human effort and side effects of manual controller tuning while keeping the number of experimental samples low. We focus this work on robotic systems, providing thorough theoretical results that aim to increase data efficiency, as well as demonstrations on real robots. Specifically, we present four main contributions. We first consider using BO to replace manual tuning on robotic platforms. To this end, we parametrize the design weights of a linear quadratic regulator (LQR) and learn its parameters using an information-efficient BO algorithm. This algorithm uses Gaussian processes (GPs) to model the unknown performance objective. BO uses the GP model to suggest controller parameters that are expected to increase the information about the optimal parameters, measured as a gain in entropy.
The resulting “automatic LQR tuning” framework is demonstrated on two robotic platforms: a robot arm balancing an inverted pole and a humanoid robot performing a squatting task. In both cases, an existing controller is automatically improved in a handful of experiments without human intervention. BO compensates for data scarcity by means of the GP, which is a probabilistic model that encodes prior assumptions about the unknown performance objective. Usually, incorrect or uninformed assumptions have negative consequences, such as a higher number of robot experiments, poor tuning performance or reduced sample efficiency. The second to fourth contributions presented herein attempt to alleviate this issue. The second contribution proposes to include the robot simulator in the learning loop as an additional information source for automatic controller tuning. While a real robot experiment generally entails high costs (e.g., it requires preparation and takes time), simulations are cheaper to obtain (e.g., they can be computed faster). However, because the simulator is an imperfect model of the robot, its information is biased and could have negative repercussions on the learning performance. To address this problem, we propose “simu-vs-real”, a principled multi-fidelity BO algorithm that trades off cheap but inaccurate information from simulations against expensive but accurate physical experiments in a cost-effective manner. The resulting algorithm is demonstrated on a cart-pole system, where simulations and real experiments are alternated, thus sparing many real evaluations. The third contribution explores how to adapt the expressiveness of the probabilistic prior to the control problem at hand. To this end, the mathematical structure of LQR controllers is leveraged and embedded into the GP by means of the kernel function. Specifically, we propose two different “LQR kernel” designs that retain the flexibility of Bayesian nonparametric learning.
Simulated results indicate that the LQR kernel outperforms uninformed kernel choices when used for controller learning with BO. Finally, the fourth contribution specifically addresses the problem of handling controller failures, which are typically unavoidable in practice while learning from data, especially if non-conservative solutions are expected. Although controller failures are generally problematic (e.g., the robot has to be emergency-stopped), they are also a rich source of information about what should be avoided. We propose “failures-aware excursion search”, a novel algorithm for Bayesian optimization under black-box constraints, where failures are limited in number. Our results on numerical benchmarks indicate that allowing a limited number of failures reveals better optima than state-of-the-art methods. The first contribution of this thesis, “automatic LQR tuning”, is among the first to apply BO to real robots. While it demonstrated automatic controller learning from few experimental samples, it also revealed several important challenges, such as the need for higher sample efficiency, which opened relevant research directions that we addressed through several methodological contributions. In summary, we proposed “simu-vs-real”, a novel BO algorithm that includes the simulator as an additional information source; an “LQR kernel” design that learns faster than standard choices; and “failures-aware excursion search”, a new BO algorithm for constrained black-box optimization problems, where the number of failures is limited.

ics

Repository (Universitätsbibliothek) - University of Tübingen PDF DOI [BibTex]


Interaction of hydrogen isotopes with flexible metal-organic frameworks

Bondorf, L.

Universität Stuttgart, Stuttgart, 2020 (mastersthesis)

mms

[BibTex]


2017


Nonparametric Disturbance Correction and Nonlinear Dual Control

New Directions for Learning with Kernels and Gaussian Processes (Dagstuhl Seminar 16481)

Gretton, A., Hennig, P., Rasmussen, C., Schölkopf, B.

Dagstuhl Reports, 6(11):142-167, 2017 (book)

ei pn

DOI [BibTex]

Understanding FORC using synthetic micro-structured systems with variable coupling- and coercive-field distributions

Groß, Felix

Universität Stuttgart, Stuttgart, 2017 (mastersthesis)

mms

[BibTex]


Adsorption von Wasserstoffmolekülen in nanoporösen Gerüststrukturen

Kotzur, Nadine

Universität Stuttgart, Stuttgart, 2017 (mastersthesis)

mms

[BibTex]


2011


Ferromagnetism of ZnO influenced by physical and chemical treatment

Chen, Y.

Universität Stuttgart, Stuttgart, 2011 (mastersthesis)

mms

[BibTex]

Herstellung und Charakterisierung von ultradünnen, funktionellen CoFeB Filmen

Streckenbach, F.

Hochschule Esslingen / Hochschule Aalen, Esslingen / Aalen, 2011 (mastersthesis)

mms

[BibTex]

Hydrogen adsorption on metal-organic frameworks

Streppel, B.

Universität Stuttgart, Stuttgart, 2011 (phdthesis)

mms

link (url) [BibTex]

Piezo driven strain effects on magneto-crystalline anisotropy

Badr, E.

Universität Stuttgart, Stuttgart, 2011 (mastersthesis)

mms

[BibTex]

Magnetooptische Untersuchungen an granularen und beschichteten MgB2 Filmen

Stahl, C.

Universität Stuttgart, Stuttgart, 2011 (mastersthesis)

mms

[BibTex]

Mikromagnetismus der Wechselwirkung von Spinwellen mit Domänenwänden in Ferromagneten

Macke, S.

Universität Stuttgart, Stuttgart, 2011 (phdthesis)

mms

[BibTex]

Herstellung und Qualifizierung gesputterter Magnesiumdiboridschichten

Breyer, F.

Hochschule Aalen, Aalen, 2011 (mastersthesis)

mms

[BibTex]

Study of krypton/xenon storage and separation in microporous frameworks

Soleimani Dorcheh, A.

Universität Darmstadt, Darmstadt, 2011 (mastersthesis)

mms

[BibTex]


2010


Approximate Inference in Graphical Models

Hennig, P.

University of Cambridge, November 2010 (phdthesis)

ei pn

Web [BibTex]

Statics and dynamics of simple fluids on chemically patterned substrates

Dörfler, F.

Universität Stuttgart, Stuttgart, Germany, 2010 (phdthesis)

mms

link (url) [BibTex]

Entnetzung verspannter Filme

Reindl, A.

Universität Stuttgart, Stuttgart, 2010 (mastersthesis)

mms

[BibTex]

Advanced ferromagnetic nanostructures

Goll, D.

Universität Stuttgart, Stuttgart, 2010 (phdthesis)

mms

[BibTex]

Wasserstoff in funktionellen Dünnschichtsystemen

Honert, J.

Universität Stuttgart, Stuttgart, 2010 (mastersthesis)

mms

[BibTex]

Handbook of Hydrogen Storage

Hirscher, M.

pages: 353 p., Wiley-VCH, Weinheim, 2010 (book)

mms

[BibTex]

Elektronentheorie der magnetischen EXAFS

Güßmann, M.

Universität Stuttgart, Stuttgart, 2006 (mastersthesis)

mms

[BibTex]

Elektronenspektroskopie an Übergangsmetallclustern

Heßler, M.

Bayerische Julius-Maximilians-Universität, Würzburg, 2006 (phdthesis)

mms

[BibTex]

Hydrogen storage by physisorption on porous materials

Panella, B.

Universität Stuttgart, Stuttgart, 2006 (phdthesis)

mms

link (url) [BibTex]

Theory of magnetic x-ray reflectometry on the Co2Pt7 multilayer system

Martosiswoyo, L.

Universität Stuttgart, Stuttgart, 2006 (mastersthesis)

mms

[BibTex]

Magnetischer zirkularer Röntgendichroismus an Übergangsmetalloxiden

Lafkioti, M.

Universität Stuttgart, Stuttgart, 2006 (mastersthesis)

mms

[BibTex]

Contributions to the theory of x-ray magnetic dichroism

Dörfler, F.

Universität Stuttgart, Stuttgart, 2006 (mastersthesis)

mms

[BibTex]
