2018


Deep Reinforcement Learning for Event-Triggered Control

Baumann, D., Zhu, J., Martius, G., Trimpe, S.

In Proceedings of the 57th IEEE International Conference on Decision and Control (CDC), pages: 943-950, December 2018 (inproceedings)

arXiv PDF DOI Project Page Project Page [BibTex]

Discovering and Teaching Optimal Planning Strategies

Lieder, F., Callaway, F., Krueger, P. M., Das, P., Griffiths, T. L., Gul, S.

In The 14th biannual conference of the German Society for Cognitive Science, GK, September 2018 (inproceedings)

Project Page [BibTex]

Discovering Rational Heuristics for Risky Choice

Gul, S., Krueger, P. M., Callaway, F., Griffiths, T. L., Lieder, F.

The 14th biannual conference of the German Society for Cognitive Science, GK, September 2018 (conference)

Abstract
How should we think and decide to make the best possible use of our precious time and limited cognitive resources? And how do people’s cognitive strategies compare to this ideal? We study these questions in the domain of multi-alternative risky choice using the methodology of resource-rational analysis. To answer the first question, we leverage a new meta-level reinforcement learning algorithm to derive optimal heuristics for four different risky choice environments. We find that our method rediscovers two fast-and-frugal heuristics that people are known to use, namely Take-The-Best and choosing randomly, as resource-rational strategies for specific environments. Our method also discovered a novel heuristic that combines elements of Take-The-Best and Satisficing. To answer the second question, we use the Mouselab paradigm to measure how people’s decision strategies compare to the predictions of our resource-rational analysis. We found that our resource-rational analysis correctly predicted which strategies people use and under which conditions they use them. While people generally tend to make rational use of their limited resources overall, their strategy choices do not always fully exploit the structure of each decision problem. Overall, people’s decision operations were about 88% as resource-rational as they could possibly be. A formal model comparison confirmed that our resource-rational model explained people’s decision strategies significantly better than the Directed Cognition model of Gabaix et al. (2006). Our study is a proof-of-concept that optimal cognitive strategies can be automatically derived from the principle of resource-rationality. Our results suggest that resource-rational analysis is a promising approach for uncovering people’s cognitive strategies and revisiting the debate about human rationality with a more realistic normative standard.

link (url) Project Page [BibTex]

Learning to select computations

Callaway, F., Gul, S., Krueger, P. M., Griffiths, T. L., Lieder, F.

In Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference, August 2018, Frederick Callaway and Sayan Gul and Falk Lieder contributed equally to this publication. (inproceedings)

Abstract
The efficient use of limited computational resources is an essential ingredient of intelligence. Selecting computations optimally according to rational metareasoning would achieve this, but this is computationally intractable. Inspired by psychology and neuroscience, we propose the first concrete and domain-general learning algorithm for approximating the optimal selection of computations: Bayesian metalevel policy search (BMPS). We derive this general, sample-efficient search algorithm for a computation-selecting metalevel policy based on the insight that the value of information lies between the myopic value of information and the value of perfect information. We evaluate BMPS on three increasingly difficult metareasoning problems: when to terminate computation, how to allocate computation between competing options, and planning. Across all three domains, BMPS achieved near-optimal performance and compared favorably to previously proposed metareasoning heuristics. Finally, we demonstrate the practical utility of BMPS in an emergency management scenario, even accounting for the overhead of metareasoning.

link (url) Project Page [BibTex]

Nonlinear decoding of a complex movie from the mammalian retina

Botella-Soler, V., Deny, S., Martius, G., Marre, O., Tkačik, G.

PLOS Computational Biology, 14(5):1-27, Public Library of Science, May 2018 (article)

Abstract
Author summary: Neurons in the retina transform patterns of incoming light into sequences of neural spikes. We recorded from ∼100 neurons in the rat retina while it was stimulated with a complex movie. Using machine learning regression methods, we fit decoders to reconstruct the movie shown from the retinal output. We demonstrated that the retinal code can only be read out with a low error if decoders make use of correlations between successive spikes emitted by individual neurons. These correlations can be used to ignore spontaneous spiking that would otherwise cause even the best linear decoders to “hallucinate” nonexistent stimuli. This work represents the first high-resolution, single-trial, full-movie reconstruction and suggests a new paradigm for separating spontaneous from stimulus-driven neural activity.

DOI [BibTex]

L4: Practical loss-based stepsize adaptation for deep learning

Rolinek, M., Martius, G.

In Advances in Neural Information Processing Systems 31 (NeurIPS 2018), pages: 6434-6444, (Editors: S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett), Curran Associates, Inc., 2018 (inproceedings)

Github link (url) Project Page [BibTex]

Systematic self-exploration of behaviors for robots in a dynamical systems framework

Pinneri, C., Martius, G.

In Proc. Artificial Life XI, pages: 319-326, MIT Press, Cambridge, MA, 2018 (inproceedings)

Abstract
One of the challenges of this century is to understand the neural mechanisms behind cognitive control and learning. Recent investigations propose biologically plausible synaptic mechanisms for self-organizing controllers, in the spirit of Hebbian learning. In particular, differential extrinsic plasticity (DEP) [Der and Martius, PNAS 2015] has proven to enable embodied agents to self-organize their individual sensorimotor development and generate highly coordinated behaviors during their interaction with the environment. These behaviors are attractors of a dynamical system. In this paper, we use the DEP rule to generate attractors and we combine it with a “repelling potential” which allows the system to actively explore all its attractor behaviors in a systematic way. With a view to a self-determined exploration of goal-free behaviors, our framework enables switching between different motion patterns in an autonomous and sequential fashion. Our algorithm is able to recover all the attractor behaviors in a toy system and is also effective in two simulated environments: a spherical robot discovers all its major rolling modes, and a hexapod robot learns to locomote in 50 different ways in 30 min.

link (url) DOI Project Page [BibTex]

Learning equations for extrapolation and control

Sahoo, S. S., Lampert, C. H., Martius, G.

In Proc. 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden, 2018, 80, pages: 4442-4450, http://proceedings.mlr.press/v80/sahoo18a/sahoo18a.pdf, (Editors: Dy, Jennifer and Krause, Andreas), PMLR, 2018 (inproceedings)

Abstract
We present an approach to identify concise equations from data using a shallow neural network approach. In contrast to ordinary black-box regression, this approach allows understanding functional relations and generalizing them from observed data to unseen parts of the parameter space. We show how to extend the class of learnable equations for a recently proposed equation learning network to include divisions, and we improve the learning and model selection strategy to be useful for challenging real-world data. For systems governed by analytical expressions, our method can in many cases identify the true underlying equation and extrapolate to unseen domains. We demonstrate its effectiveness by experiments on a cart-pendulum system, where only 2 random rollouts are required to learn the forward dynamics and successfully achieve the swing-up task.

Code Arxiv Poster Slides link (url) Project Page [BibTex]

Robust Affordable 3D Haptic Sensation via Learning Deformation Patterns

Sun, H., Martius, G.

Proceedings International Conference on Humanoid Robots, pages: 846-853, IEEE, New York, NY, USA, 2018 IEEE-RAS International Conference on Humanoid Robots, 2018, Oral Presentation (conference)

Abstract
Haptic sensation is an important modality for interacting with the real world. This paper proposes a general framework of inferring haptic forces on the surface of a 3D structure from internal deformations using a small number of physical sensors instead of employing dense sensor arrays. Using machine learning techniques, we optimize the sensor number and their placement and are able to obtain high-precision force inference for a robotic limb using as few as 9 sensors. For the optimal and sparse placement of the measurement units (strain gauges), we employ data-driven methods based on data obtained by finite element simulation. We compare data-driven approaches with model-based methods relying on geometric distance and information criteria such as Entropy and Mutual Information. We validate our approach on a modified limb of the “Poppy” robot [1] and obtain 8 mm localization precision.

DOI Project Page [BibTex]

2013


Information Driven Self-Organization of Complex Robotic Behaviors

Martius, G., Der, R., Ay, N.

PLoS ONE, 8(5):e63400, Public Library of Science, 2013 (article)

link (url) DOI [BibTex]

Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis

Zahedi, K., Martius, G., Ay, N.

Frontiers in Psychology, 4(801), 2013 (article)

Abstract
One of the main challenges in the field of embodied artificial intelligence is the open-ended autonomous learning of complex behaviours. Our approach is to use task-independent, information-driven intrinsic motivation(s) to support task-dependent learning. The work presented here is a preliminary step in which we investigate the predictive information (the mutual information of the past and future of the sensor stream) as an intrinsic drive, ideally supporting any kind of task acquisition. Previous experiments have shown that the predictive information (PI) is a good candidate to support autonomous, open-ended learning of complex behaviours, because a maximisation of the PI corresponds to an exploration of morphology- and environment-dependent behavioural regularities. The idea is that these regularities can then be exploited in order to solve any given task. Three different experiments are presented, and their results lead to the conclusion that the linear combination of the one-step PI with an external reward function is not generally recommended in an episodic policy gradient setting. Only for hard tasks can a great speed-up be achieved, at the cost of a loss in asymptotic performance.

link (url) DOI [BibTex]


Robustness of guided self-organization against sensorimotor disruptions

Martius, G.

Advances in Complex Systems, 16(02n03):1350001, 2013 (article)

Abstract
Self-organizing processes are crucial for the development of living beings. Practical applications in robots may benefit from the self-organization of behavior, e.g. to increase fault tolerance and enhance flexibility, provided that external goals can also be achieved. We present results on the guidance of self-organizing control by visual target stimuli and show a remarkable robustness to sensorimotor disruptions. In a proof-of-concept study, an autonomous wheeled robot learns an object-finding and ball-pushing task from scratch within a few minutes in continuous domains. The robustness is demonstrated by the rapid recovery of the performance after severe changes of the sensor configuration.

DOI [BibTex]

2005


Learning to Feel the Physics of a Body

Der, R., Hesse, F., Martius, G.

In Computational Intelligence for Modelling, Control and Automation, CIMCA 2005, 2, pages: 252-257, Washington, DC, USA, 2005 (inproceedings)

Abstract
Despite the tremendous progress in robotic hardware and in both sensorial and computing efficiencies, the performance of contemporary autonomous robots is still far below that of simple animals. This has triggered an intensive search for alternative approaches to the control of robots. The present paper exemplifies a general approach to the self-organization of behavior which has been developed and tested in various examples in recent years. We apply this approach to an underactuated snake-like artifact with a complex physical behavior which is not known to the controller. Due to the weak forces available, the controller has, so to speak, to develop a kind of feeling for the body, which is seen to emerge from our approach in a natural way, with meandering and rotational collective modes being observed in computer simulation experiments.

[BibTex]
