

2017


Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets

Hausman, K., Chebotar, Y., Schaal, S., Sukhatme, G., Lim, J.

In Proceedings from the conference "Neural Information Processing Systems 2017", (Editors: Guyon, I., Luxburg, U. v., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R.), Curran Associates, Inc., Advances in Neural Information Processing Systems 30 (NIPS), December 2017 (inproceedings)

am

pdf video [BibTex]

On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

am ics pn

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

Interactive Perception: Leveraging Action in Perception and Perception in Action

Bohg, J., Hausman, K., Sankaran, B., Brock, O., Kragic, D., Schaal, S., Sukhatme, G.

IEEE Transactions on Robotics, 33, pages: 1273-1291, December 2017 (article)

Abstract
Recent approaches in robotics follow the insight that perception is facilitated by interactivity with the environment. These approaches are subsumed under the term of Interactive Perception (IP). We argue that IP provides the following benefits: (i) any type of forceful interaction with the environment creates a new type of informative sensory signal that would otherwise not be present and (ii) any prior knowledge about the nature of the interaction supports the interpretation of the signal. This is facilitated by knowledge of the regularity in the combined space of sensory information and action parameters. The goal of this survey is to postulate this as a principle and collect evidence in support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of Interactive Perception. We close this survey by discussing the remaining open questions. Thereby, we hope to define a field and inspire future work.

am

arXiv DOI Project Page [BibTex]

Optimizing Long-term Predictions for Model-based Policy Search

Doerr, A., Daniel, C., Nguyen-Tuong, D., Marco, A., Schaal, S., Toussaint, M., Trimpe, S.

Proceedings of 1st Annual Conference on Robot Learning (CoRL), 78, pages: 227-238, (Editors: Sergey Levine and Vincent Vanhoucke and Ken Goldberg), 1st Annual Conference on Robot Learning, November 2017 (conference)

Abstract
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, model-based RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

am ics

PDF Project Page [BibTex]

Acquiring Target Stacking Skills by Goal-Parameterized Deep Reinforcement Learning

Li, W., Bohg, J., Fritz, M.

arXiv, November 2017 (article) Submitted

Abstract
Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. Thereby, we avoid explicitly modeling physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies which are parametrized by a goal. We validated the model on a toy example navigating in a grid world with different target positions and in a block stacking task with different target structures of the final tower. In contrast to prior work, our policies show better generalization across different goals.

am

arXiv [BibTex]


A New Data Source for Inverse Dynamics Learning

Kappler, D., Meier, F., Ratliff, N., Schaal, S.

In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Piscataway, NJ, USA, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017 (inproceedings)

am

[BibTex]

Bayesian Regression for Artifact Correction in Electroencephalography

Fiebig, K., Jayaram, V., Hesse, T., Blank, A., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 131-136, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

am ei

DOI [BibTex]

Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces

Grossberger, L., Hohmann, M. R., Peters, J., Grosse-Wentrup, M.

Proceedings of the 7th Graz Brain-Computer Interface Conference 2017 - From Vision to Reality, pages: 160-164, (Editors: Müller-Putz G.R., Steyrl D., Wriessnegger S. C., Scherer R.), Graz University of Technology, Austria, Graz Brain-Computer Interface Conference, September 2017 (conference)

am ei

DOI [BibTex]

On the relevance of grasp metrics for predicting grasp success

Rubert, C., Kappler, D., Morales, A., Schaal, S., Bohg, J.

In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, September 2017 (inproceedings) Accepted

Abstract
We aim to reliably predict whether a grasp on a known object is successful before it is executed in the real world. There is an entire suite of grasp metrics that has already been developed which rely on precisely known contact points between object and hand. However, it remains unclear whether and how they may be combined into a general purpose grasp stability predictor. In this paper, we analyze these questions by leveraging a large scale database of simulated grasps on a wide variety of objects. For each grasp, we compute the value of seven metrics. Each grasp is annotated by human subjects with ground truth stability labels. Given this data set, we train several classification methods to find out whether there is some underlying, non-trivial structure in the data that is difficult to model manually but can be learned. Quantitative and qualitative results show the complexity of the prediction problem. We found that a good prediction performance critically depends on using a combination of metrics as input features. Furthermore, non-parametric and non-linear classifiers best capture the structure in the data.

am

Project Page [BibTex]

Local Bayesian Optimization of Motor Skills

Akrour, R., Sorokin, D., Peters, J., Neumann, G.

Proceedings of the 34th International Conference on Machine Learning, 70, pages: 41-50, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

am ei

link (url) Project Page [BibTex]

Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning

Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S.

Proceedings of the 34th International Conference on Machine Learning, 70, Proceedings of Machine Learning Research, (Editors: Doina Precup, Yee Whye Teh), PMLR, International Conference on Machine Learning (ICML), August 2017 (conference)

am

pdf video [BibTex]

Event-based State Estimation: An Emulation-based Approach

Trimpe, S.

IET Control Theory & Applications, 11(11):1684-1693, July 2017 (article)

Abstract
An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.

am ics

arXiv Supplementary material PDF DOI Project Page [BibTex]

Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Doerr, A., Nguyen-Tuong, D., Marco, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics

PDF arXiv DOI Project Page [BibTex]

Learning Feedback Terms for Reactive Planning and Control

Rai, A., Sutanto, G., Schaal, S., Meier, F.

Proceedings 2017 IEEE International Conference on Robotics and Automation (ICRA), IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (conference)

am

pdf video [BibTex]

Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics pn

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

Probabilistic Articulated Real-Time Tracking for Robot Manipulation

(Best Paper of RA-L 2017, Finalist of Best Robotic Vision Paper Award of ICRA 2017)

Garcia Cifuentes, C., Issac, J., Wüthrich, M., Schaal, S., Bohg, J.

IEEE Robotics and Automation Letters (RA-L), 2(2):577-584, April 2017 (article)

Abstract
We propose a probabilistic filtering method which fuses joint measurements with depth images to yield a precise, real-time estimate of the end-effector pose in the camera frame. This avoids the need for frame transformations when using it in combination with visual object tracking methods. Precision is achieved by modeling and correcting biases in the joint measurements as well as inaccuracies in the robot model, such as poor extrinsic camera calibration. We make our method computationally efficient through a principled combination of Kalman filtering of the joint measurements and asynchronous depth-image updates based on the Coordinate Particle Filter. We quantitatively evaluate our approach on a dataset recorded from a real robotic platform, annotated with ground truth from a motion capture system. We show that our approach is robust and accurate even under challenging conditions such as fast motion, significant and long-term occlusions, and time-varying biases. We release the dataset along with open-source code of our approach to allow for quantitative comparison with alternative approaches.

am

arXiv video code and dataset video PDF DOI Project Page [BibTex]


Anticipatory Action Selection for Human-Robot Table Tennis

Wang, Z., Boularias, A., Mülling, K., Schölkopf, B., Peters, J.

Artificial Intelligence, 247, pages: 399-414, 2017, Special Issue on AI and Robotics (article)

Abstract
Anticipation can enhance the capability of a robot in its interaction with humans, where the robot predicts the humans' intention for selecting its own action. We present a novel framework of anticipatory action selection for human-robot interaction, which is capable of handling nonlinear and stochastic human behaviors such as table tennis strokes and allows the robot to choose the optimal action based on prediction of the human partner's intention with uncertainty. The presented framework is generic and can be used in many human-robot interaction scenarios, for example, in navigation and human-robot co-manipulation. In this article, we conduct a case study on human-robot table tennis. Due to the limited amount of time for executing hitting movements, a robot usually needs to initiate its hitting movement before the opponent hits the ball, which requires the robot to be anticipatory based on visual observation of the opponent's movement. Previous work on Intention-Driven Dynamics Models (IDDM) allowed the robot to predict the intended target of the opponent. In this article, we address the problem of action selection and optimal timing for initiating a chosen action by formulating the anticipatory action selection as a Partially Observable Markov Decision Process (POMDP), where the transition and observation are modeled by the IDDM framework. We present two approaches to anticipatory action selection based on the POMDP formulation, i.e., a model-free policy learning method based on Least-Squares Policy Iteration (LSPI) that employs the IDDM for belief updates, and a model-based Monte-Carlo Planning (MCP) method, which benefits from the transition and observation model by the IDDM. Experimental results using real data in a simulated environment show the importance of anticipatory action selection, and that POMDPs are suitable to formulate the anticipatory action selection problem by taking into account the uncertainties in prediction. We also show that existing algorithms for POMDPs, such as LSPI and MCP, can be applied to substantially improve the robot's performance in its interaction with humans.

am ei

DOI Project Page [BibTex]

Functionalised metal-organic frameworks: a novel approach to stabilising single metal atoms

Szilágyi, P. Á., Rogers, D. M., Zaiser, I., Callini, E., Turner, S., Borgschulte, A., Züttel, A., Geerlings, H., Hirscher, M., Dam, B.

Journal of Materials Chemistry A, 5(30):15559-15566, Royal Society of Chemistry, Cambridge, UK, 2017 (article)

mms

DOI [BibTex]

Exploiting diffusion barrier and chemical affinity of metal-organic frameworks for efficient hydrogen isotope separation

Kim, J. Y., Balderas-Xicohténcatl, R., Zhang, L., Kang, S. G., Hirscher, M., Oh, H., Moon, H. R.

Journal of the American Chemical Society, 139(42):15135-15141, American Chemical Society, Washington, DC, 2017 (article)

mms

DOI [BibTex]

Facile fabrication of mesoporous silica micro-jets with multi-functionalities

Vilela, D., Hortelao, A. C., Balderas-Xicohténcatl, R., Hirscher, M., Hahn, K., Ma, X., Sánchez, S.

Nanoscale, 9(37):13990-13997, Royal Society of Chemistry, Cambridge, UK, 2017 (article)

mms

DOI [BibTex]

Selective hydrogen isotope separation via breathing transition in MIL-53(Al)

Kim, J. Y., Zhang, L., Balderas-Xicohténcatl, R., Park, J., Hirscher, M., Moon, H. R., Oh, H.

Journal of the American Chemical Society, 139(49):17743-17746, American Chemical Society, Washington, DC, 2017 (article)

mms

DOI [BibTex]

Efficient synthesis for large-scale production and characterization for hydrogen storage of ligand exchanged MOF-74/174/184-M (M = Mg2+, Ni2+)

Oh, H., Maurer, S., Balderas-Xicohténcatl, R., Arnold, L., Magdysyuk, O. V., Schütz, G., Müller, U., Hirscher, M.

International Journal of Hydrogen Energy, 42(2):1027-1035, Elsevier, Amsterdam, 2017 (article)

mms

DOI [BibTex]

Corrosion-protected hybrid nanoparticles

Jeong, H., Alarcón-Correa, M., Mark, A. G., Son, K., Lee, T., Fischer, P.

Advanced Science, 4(12), Wiley-VCH, Weinheim, 2017 (article)

mms

DOI [BibTex]

Investigation of the Dzyaloshinskii-Moriya interaction and room temperature skyrmions in W/CoFeB/MgO thin films and microwires

Jaiswal, S., Litzius, K., Lemesh, I., Büttner, F., Finizio, S., Raabe, J., Weigand, M., Lee, K., Langer, J., Ocker, B., Jakob, G., Beach, G. S. D., Kläui, M.

Applied Physics Letters, 111(2), American Institute of Physics, Melville, NY, 2017 (article)

mms

DOI [BibTex]

Ultrafast demagnetization after femtosecond laser pulses: Transfer of angular momentum from the electronic system to magnetoelastic spin-phonon modes

Fähnle, M., Tsatsoulis, T., Illg, C., Haag, M., Müller, B. Y., Zhang, L.

Journal of Superconductivity and Novel Magnetism, 30(5):1381-1387, Springer Science + Business Media B.V., New York, 2017 (article)

mms

DOI [BibTex]

Magnetic behavior of single chain magnets in metal organic frameworks CPO-27-Co

Son, K., Goering, E., Hirscher, M., Oh, H.

Journal of Nanoscience and Nanotechnology, 17(10):7541-7546, American Scientific Publishers, Stevenson Ranch, Calif., 2017 (article)

mms

DOI [BibTex]

Switching by domain-wall automotion in asymmetric ferromagnetic rings

Mawass, M., Richter, K., Bisig, A., Reeve, R. M., Krüger, B., Weigand, M., Stoll, H., Krone, A., Kronast, F., Schütz, G., Kläui, M.

Physical Review Applied, 7(4), American Physical Society, College Park, Md. [u.a.], 2017 (article)

mms

DOI [BibTex]

A neutral atom moving in an external magnetic field does not feel a Lorentz force

Fähnle, M.

American Journal of Modern Physics, 6(6):153-155, Science Publishing Group, New York, NY, 2017 (article)

mms

DOI [BibTex]

Temperature-dependent first-order reversal curve measurements on unusually hard magnetic low-temperature phase of MnBi

Muralidhar, S., Gräfe, J., Chen, Y., Etter, M., Gregori, G., Ener, S., Sawatzki, S., Hono, K., Gutfleisch, O., Kronmüller, H., Schütz, G., Goering, E. J.

Physical Review B, 95(2), American Physical Society, Woodbury, NY, 2017 (article)

mms

DOI Project Page [BibTex]

Smooth and rapid microwave synthesis of MIL-53(Fe) including superparamagnetic γ-Fe2O3 nanoparticles

Wengert, S., Albrecht, J., Ruoß, S., Stahl, C., Schütz, G., Schäfer, R.

Journal of Magnetism and Magnetic Materials, 444, pages: 168-172, NH, Elsevier, Amsterdam, 2017 (article)

mms

DOI [BibTex]

Characterization and differentiation of rock varnish types from different environments by microanalytical techniques

Macholdt, D. S., Jochum, K. P., Pöhlker, C., Arangio, A., Förster, J., Stoll, B., Weis, U., Weber, B., Müller, M., Kappl, M., Shiraiwa, M., Kilcoyne, A. L. D., Weigand, M., Scholz, D., Haug, G. H., Al-Amri, A., Andreae, M. O.

Chemical Geology, 459, pages: 91-118, Elsevier, Amsterdam, 2017 (article)

mms

DOI [BibTex]

Skyrmion Hall effect revealed by direct time-resolved X-ray microscopy

Litzius, K., Lemesh, I., Krüger, B., Bassirian, P., Caretta, L., Richter, K., Büttner, F., Sato, K., Tretiakov, O. A., Förster, J., Reeve, R. M., Weigand, M., Bykova, I., Stoll, H., Schütz, G., Beach, G. S. D., Kläui, M.

Nature Physics, 13(2):170-175, Nature Pub. Group, London, 2017 (article)

mms

DOI [BibTex]

Comment on magnonic black holes

Fähnle, M., Schütz, G.

Journal of Magnetism and Magnetic Materials, 444, pages: 146-146, NH, Elsevier, Amsterdam, 2017 (article)

mms

DOI [BibTex]

Cr-Substitution in Ba2In2O5 · (H2O)x (x = 0.16, 0.74)

Yoon, S., Son, K., Hagemann, H., Widenmeyer, M., Weidenkaff, A.

Solid State Sciences, 73, pages: 1-6, Elsevier Masson SAS, Paris, 2017 (article)

mms

DOI [BibTex]

Comment on half-integer quantum numbers for the total angular momentum of photons in light beams with finite lateral extensions

Fähnle, M.

American Journal of Modern Physics, 6(5):88-90, Science Publishing Group, New York, NY, 2017 (article)

mms

DOI [BibTex]

Advanced magneto-optical Kerr effect measurements of superconductors at low temperatures

Stahl, C., Gräfe, J., Ruoß, S., Zahn, P., Bayer, J., Simmendinger, J., Schütz, G., Albrecht, J.

AIP Advances, 7(10), 2017 (article)

mms

DOI [BibTex]

Unifying ultrafast demagnetization and intrinsic Gilbert damping in Co/Ni bilayers with electronic relaxation near the Fermi surface

Zhang, W., He, W., Zhang, X.-Q., Cheng, Z.-H., Teng, J., Fähnle, M.

Physical Review B, 96(22), American Physical Society, Woodbury, NY, 2017 (article)

mms

DOI [BibTex]

Influence of the skin barrier on the penetration of topically-applied dexamethasone probed by soft X-ray spectromicroscopy

Yamamoto, K., Klossek, A., Flesch, R., Rancan, F., Weigand, M., Bykova, I., Bechtel, M., Ahlberg, S., Vogt, A., Blume-Peytavi, U., Schrade, P., Bachmann, S., Hedtrich, S., Schäfer-Korting, M., Rühl, E.

European Journal of Pharmaceutics and Biopharmaceutics, 118, pages: 30-37, Elsevier, Amsterdam, 2017 (article)

mms

DOI [BibTex]

Capture of heavy hydrogen isotopes in a metal-organic framework with active Cu(I) sites

Weinrauch, I., Savchenko, I., Denysenko, D., Souliou, S. M., Kim, H., Le Tacon, M., Daemen, L. L., Cheng, Y., Mavrandonakis, A., Ramirez-Cuesta, A. J., Volkmer, D., Schütz, G., Hirscher, M., Heine, T.

Nature Communications, 8, Nature Publishing Group, London, 2017 (article)

mms

DOI [BibTex]

Multiscale simulations of topological transformations in magnetic-skyrmion spin structures

De Lucia, A., Litzius, K., Krüger, B., Tretiakov, O. A., Kläui, M.

Physical Review B, 96(2), American Physical Society, Woodbury, NY, 2017 (article)

mms

DOI [BibTex]

Unexpectedly marginal effect of electronic correlations on ultrafast demagnetization after femtosecond laser-pulse excitation

Weng, W., Huang, H., Briones Paz, J. Z., Teeny, N., Müller, B. Y., Haag, M., Kuhn, T., Fähnle, M.

Physical Review B, 95(22), American Physical Society, Woodbury, NY, 2017 (article)

mms

DOI [BibTex]

Black manganese-rich crusts on a Gothic cathedral

Macholdt, D. S., Herrmann, S., Jochum, K. P., Kilcoyne, A. L. D., Laubscher, T., Pfisterer, H. K., Pöhlker, C., Schwager, B., Weber, B., Weigand, M., Domke, K. F., Andreae, M. O.

Atmospheric Environment, 171, pages: 205-220, Elsevier, Amsterdam [u.a.], 2017 (article)

mms

DOI [BibTex]


2010


Reinforcement learning of full-body humanoid motor skills

Stulp, F., Buchli, J., Theodorou, E., Schaal, S.

In Humanoid Robots (Humanoids), 2010 10th IEEE-RAS International Conference on, pages: 405-410, December 2010, clmc (inproceedings)

Abstract
Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive amount of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, called Policy Improvement with Path Integrals (PI2), has a surprisingly simple form, has no open tuning parameters besides the exploration noise, is model-free, and performs numerically robustly in high dimensional learning problems. We demonstrate how PI2 is able to learn full-body motor skills on a 34-DOF humanoid robot. To demonstrate the generality of our approach, we also apply PI2 in the context of variable impedance control, where both planned trajectories and gain schedules for each joint are optimized simultaneously.

am

link (url) [BibTex]

Relative Entropy Policy Search

Peters, J., Mülling, K., Altun, Y.

In Proceedings of the Twenty-Fourth National Conference on Artificial Intelligence, pages: 1607-1612, (Editors: Fox, M., Poole, D.), AAAI Press, Menlo Park, CA, USA, Twenty-Fourth National Conference on Artificial Intelligence (AAAI-10), July 2010 (inproceedings)

Abstract
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature convergence and implausible solutions. As first suggested in the context of covariant policy gradients (Bagnell and Schneider 2003), many of these problems may be addressed by constraining the information loss. In this paper, we continue this path of reasoning and suggest the Relative Entropy Policy Search (REPS) method. The resulting method differs significantly from previous policy gradient approaches and yields an exact update step. It works well on typical reinforcement learning benchmark problems.

am ei

PDF Web [BibTex]

Reinforcement learning of motor skills in high dimensions: A path integral approach

Theodorou, E., Buchli, J., Schaal, S.

In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages: 2397-2403, May 2010, clmc (inproceedings)

Abstract
Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far due to the computational difficulties that reinforcement learning encounters in high dimensional continuous state-action spaces. In this paper, we derive a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals. While solidly grounded in optimal control theory and estimation theory, the update equations for learning are surprisingly simple and have no danger of numerical instabilities as neither matrix inversions nor gradient learning rates are required. Empirical evaluations demonstrate significant performance improvements over gradient-based policy learning and scalability to high-dimensional control problems. Finally, a learning experiment on a robot dog illustrates the functionality of our algorithm in a real-world scenario. We believe that our new algorithm, Policy Improvement with Path Integrals (PI2), offers currently one of the most efficient, numerically robust, and easy to implement algorithms for RL in robotics.

am

link (url) [BibTex]

Inverse dynamics control of floating base systems using orthogonal decomposition

Mistry, M., Buchli, J., Schaal, S.

In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages: 3406-3412, May 2010, clmc (inproceedings)

Abstract
Model-based control methods can be used to enable fast, dexterous, and compliant motion of robots without sacrificing control accuracy. However, implementing such techniques on floating base robots, e.g., humanoids and legged systems, is non-trivial due to under-actuation, dynamically changing constraints from the environment, and potentially closed loop kinematics. In this paper, we show how to compute the analytically correct inverse dynamics torques for model-based control of sufficiently constrained floating base rigid-body systems, such as humanoid robots with one or two feet in contact with the environment. While our previous inverse dynamics approach relied on an estimation of contact forces to compute an approximate inverse dynamics solution, here we present an analytically correct solution by using an orthogonal decomposition to project the robot dynamics onto a reduced dimensional space, independent of contact forces. We demonstrate the feasibility and robustness of our approach on a simulated floating base bipedal humanoid robot and an actual robot dog locomoting over rough terrain.

am

link (url) [BibTex]

Fast, robust quadruped locomotion over challenging terrain

Kalakrishnan, M., Buchli, J., Pastor, P., Mistry, M., Schaal, S.

In Robotics and Automation (ICRA), 2010 IEEE International Conference on, pages: 2665-2670, May 2010, clmc (inproceedings)

Abstract
We present a control architecture for fast quadruped locomotion over rough terrain. We approach the problem by decomposing it into many sub-systems, in which we apply state-of-the-art learning, planning, optimization and control techniques to achieve robust, fast locomotion. Unique features of our control strategy include: (1) a system that learns optimal foothold choices from expert demonstration using terrain templates, (2) a body trajectory optimizer based on the Zero-Moment Point (ZMP) stability criterion, and (3) a floating-base inverse dynamics controller that, in conjunction with force control, allows for robust, compliant locomotion over unperceived obstacles. We evaluate the performance of our controller by testing it on the LittleDog quadruped robot, over a wide variety of rough terrain of varying difficulty levels. We demonstrate the generalization ability of this controller by presenting test results from an independent external test team on terrains that have never been shown to us.

am

link (url) [BibTex]

Policy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfähigkeiten)

Peters, J., Kober, J., Schaal, S.

Automatisierungstechnik, 58(12):688-694, 2010, clmc (article)

Abstract
Robot learning methods which allow autonomous robots to adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise, as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.

am

link (url) [BibTex]


A Bayesian approach to nonlinear parameter identification for rigid-body dynamics

Ting, J., DSouza, A., Schaal, S.

Neural Networks, 2010, clmc (article)

Abstract
For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom lightweight systems, conventional identification of rigid body dynamics models using CAD data and actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method is data-driven parameter estimation, but significant noise in measured and inferred variables affects it adversely. Moreover, standard estimation procedures may give physically inconsistent results due to unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems, achieving an error of up to three times lower than other state-of-the-art machine learning methods.

am

link (url) [BibTex]