2017


On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

am ics pn

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

Optimal gamification can help people procrastinate less

Lieder, F., Griffiths, T. L.

Annual Meeting of the Society for Judgment and Decision Making, November 2017 (conference)

re

Project Page [BibTex]

Optimizing Long-term Predictions for Model-based Policy Search

Doerr, A., Daniel, C., Nguyen-Tuong, D., Marco, A., Schaal, S., Toussaint, M., Trimpe, S.

Proceedings of 1st Annual Conference on Robot Learning (CoRL), 78, pages: 227-238, (Editors: Sergey Levine and Vincent Vanhoucke and Ken Goldberg), 1st Annual Conference on Robot Learning, November 2017 (conference)

Abstract
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, model-based RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

am ics

PDF Project Page [BibTex]

Event-based State Estimation: An Emulation-based Approach

Trimpe, S.

IET Control Theory & Applications, 11(11):1684-1693, July 2017 (article)

Abstract
An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.

am ics

arXiv Supplementary material PDF DOI Project Page [BibTex]

Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Doerr, A., Nguyen-Tuong, D., Marco, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics

PDF arXiv DOI Project Page [BibTex]

Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics pn

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

Embedded interruptions and task complexity influence schema-related cognitive load progression in an abstract learning task

Wirzberger, M., Bijarsari, S. E., Rey, G. D.

Acta Psychologica, 179, pages: 30-41, Elsevier, 2017 (article)

Abstract
Cognitive processes related to schema acquisition comprise an essential source of demands in learning situations. Since the related amount of cognitive load is supposed to change over time, plausible temporal models of load progression based on different theoretical backgrounds are inspected in this study. A total of 116 student participants completed a basal symbol sequence learning task, which provided insights into underlying cognitive dynamics. Two levels of task complexity were determined by the amount of elements within the symbol sequence. In addition, interruptions due to an embedded secondary task occurred at five predefined stages over the task. Within the resulting 2x5-factorial mixed between-within design, the continuous monitoring of efficiency in learning performance enabled assumptions on relevant resource investment. From the obtained results, a nonlinear change of learning efficiency over time seems most plausible in terms of cognitive load progression. Moreover, different effects of the induced interruptions show up in conditions of task complexity, which indicate the activation of distinct cognitive mechanisms related to structural aspects of the task. Findings are discussed in the light of evidence from research on memory and information processing.

re

DOI [BibTex]

Empirical Evidence for Resource-Rational Anchoring and Adjustment

Lieder, F., Griffiths, T. L., Huys, Q. J. M., Goodman, N. D.

Psychonomic Bulletin & Review, 25, pages: 775-784, Springer, 2017 (article)

re

[BibTex]

A reward shaping method for promoting metacognitive learning

Lieder, F., Krueger, P. M., Callaway, F., Griffiths, T. L.

In Proceedings of the Third Multidisciplinary Conference on Reinforcement Learning and Decision-Making, 2017 (inproceedings)

re

Project Page [BibTex]

The moderating role of arousal on the seductive detail effect

Schneider, S., Wirzberger, M., Augustin, Y., Rey, G. D.

In Abstracts of the 59th Conference of Experimental Psychologists (TeaP), pages: 96, Pabst Science Publishers, Lengerich, 2017 (inproceedings)

re

[BibTex]

Influences of cognitive load on learning performance, speech and physiological parameters in a dual-task setting

Wirzberger, M., Herms, R., Esmaeili Bijarsari, S., Rey, G. D., Eibl, M.

In Abstracts of the 20th Conference of the European Society for Cognitive Psychology, pages: 161, Potsdam, Germany, 2017 (inproceedings)

re

[BibTex]

Strategy selection as rational metareasoning

Lieder, F., Griffiths, T.

Psychological Review, 124, pages: 762-794, American Psychological Association, 2017 (article)

re

Project Page [BibTex]

A computerized training program for teaching people how to plan better

Lieder, F., Krueger, P. M., Callaway, F., Griffiths, T. L.

PsyArXiv, 2017 (article)

re

Project Page [BibTex]

Time – Space – Content? Interrupting features of hyperlinks in multimedia learning

Wirzberger, M., Schneider, S., Dlouhy, S., Rey, G. D.

In Abstracts of the 59th Conference of Experimental Psychologists (TeaP), pages: 97, Pabst Science Publishers, Lengerich, 2017 (inproceedings)

re

[BibTex]

Computer Science meets Cognition: Möglichkeiten und Herausforderungen interdisziplinärer Kognitionsforschung [Computer science meets cognition: Chances and challenges in interdisciplinary research on cognition]

Wirzberger, M., Truschzinski, M., Schmidt, R., Barlag, M.

In INFORMATIK 2017, Lecture Notes in Informatics (LNI), pages: 2273-2277, Gesellschaft für Informatik, Bonn, 2017 (inproceedings)

re

DOI [BibTex]

When does bounded-optimal metareasoning favor few cognitive systems?

Milli, S., Lieder, F., Griffiths, T. L.

In AAAI Conference on Artificial Intelligence, 31, 2017 (inproceedings)

re

[BibTex]

The Structure of Goal Systems Predicts Human Performance

Bourgin, D., Lieder, F., Reichman, D., Talmon, N., Griffiths, T.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017 (inproceedings)

re

[BibTex]

Learning to (mis)allocate control: maltransfer can lead to self-control failure

Bustamante, L., Lieder, F., Musslick, S., Shenhav, A., Cohen, J.

In The 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making. Ann Arbor, Michigan, 2017 (inproceedings)

re

[BibTex]

Inspecting cognitive load factors in digital learning settings with ACT-R

Wirzberger, M.

In Dagstuhl 2017. Proceedings of the 11th Joint Workshop of the German Research Training Groups in Computer Science, pages: 62, 2017 (inproceedings)

re

[BibTex]

Lernförderliche Gestaltung computerbasierter Instruktionen zur Roboterkonstruktion [Enhancing design of computer-based instructions in a robot construction task]

Esmaeili Bijarsari, S., Wirzberger, M., Rey, G. D.

In INFORMATIK 2017, Lecture Notes in Informatics (LNI), pages: 2279-2286, Gesellschaft für Informatik, Bonn, 2017 (inproceedings)

re

DOI [BibTex]

Toward a rational and mechanistic account of mental effort

Shenhav, A., Musslick, S., Lieder, F., Kool, W., Griffiths, T., Cohen, J., Botvinick, M.

Annual Review of Neuroscience, 40, pages: 99-124, Annual Reviews, 2017 (article)

re

Project Page [BibTex]

An automatic method for discovering rational heuristics for risky choice

Lieder, F., Krueger, P. M., Griffiths, T. L.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society. Austin, TX: Cognitive Science Society, 2017 (inproceedings)

re

Project Page [BibTex]

Mouselab-MDP: A new paradigm for tracing how people plan

Callaway, F., Lieder, F., Krueger, P. M., Griffiths, T. L.

In The 3rd Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2017 (inproceedings)

re

[BibTex]

A dynamic process model for predicting workload in an air traffic controller task

Truschzinski, M., Wirzberger, M.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society, pages: 1224-1229, Cognitive Science Society, Austin, TX, 2017 (inproceedings)

re

link (url) [BibTex]

Auswirkung systeminduzierter Delays auf die menschliche Gedächtnisleistung in einem virtuellen agentenbasierten Trainingssetting [Influence of system-induced delays on human memory performance in a virtual agent-based training scenario]

Wirzberger, M., Schmidt, R., Rey, G. D., Hardt, W.

In INFORMATIK 2017, Lecture Notes in Informatics (LNI), pages: 2287-2294, Gesellschaft für Informatik, Bonn, 2017 (inproceedings)

re

DOI [BibTex]

Enhancing metacognitive reinforcement learning using reward structures and feedback

Krueger, P. M., Lieder, F., Griffiths, T. L.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017 (inproceedings)

re

Project Page Project Page [BibTex]

The anchoring bias reflects rational use of cognitive resources

Lieder, F., Griffiths, T. L., Huys, Q. J. M., Goodman, N. D.

Psychonomic Bulletin & Review, 25, pages: 762-794, Springer, 2017 (article)

re

[BibTex]

Helping people choose subgoals with sparse pseudo rewards

Callaway, F., Lieder, F., Griffiths, T. L.

In Proceedings of the Third Multidisciplinary Conference on Reinforcement Learning and Decision Making, 2017 (inproceedings)

re

[BibTex]

Modeling cognitive load effects in an interrupted learning task: An ACT-R approach

Wirzberger, M., Rey, G. D., Krems, J.

In Proceedings of the 39th Annual Meeting of the Cognitive Science Society, pages: 3540-3545, Cognitive Science Society, Austin, TX, 2017 (inproceedings)

re

link (url) [BibTex]
