Header logo is


2019


no image
On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset

Gondal, M. W., Wuthrich, M., Miladinovic, D., Locatello, F., Breidt, M., Volchkov, V., Akpo, J., Bachem, O., Schölkopf, B., Bauer, S.

Advances in Neural Information Processing Systems 32, pages: 15714-15725, (Editors: H. Wallach and H. Larochelle and A. Beygelzimer and F. d’Alché-Buc and E. Fox and R. Garnett), Curran Associates, Inc., 33rd Annual Conference on Neural Information Processing Systems, December 2019 (conference)

am ei sf

link (url) [BibTex]

2019


link (url) [BibTex]


Learning to Explore in Motion and Interaction Tasks
Learning to Explore in Motion and Interaction Tasks

Bogdanovic, M., Righetti, L.

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, November 2019 (conference)

Abstract
Model free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies which leads to slow policy convergence. In this paper we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks to improve learning new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.

mg

arXiv [BibTex]

arXiv [BibTex]


no image
Robust Humanoid Locomotion Using Trajectory Optimization and Sample-Efficient Learning

Yeganegi, M. H., Khadiv, M., Moosavian, S. A. A., Zhu, J., Prete, A. D., Righetti, L.

Proceedings International Conference on Humanoid Robots, IEEE, 2019 IEEE-RAS International Conference on Humanoid Robots, October 2019 (conference)

Abstract
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to intractable problems. Furthermore, since the models used in TO have always some level of abstraction, it can be hard to find a realistic set of uncertainties in the model space. In this paper we leverage a sample-efficient learning technique (Bayesian optimization) to robustify TO for humanoid locomotion. The main idea is to use data from full-body simulations to make the TO stage robust by tuning the cost weights. To this end, we split the TO problem into two phases. The first phase solves a convex optimization problem for generating center of mass (CoM) trajectories based on simplified linear dynamics. The second stage employs iterative Linear-Quadratic Gaussian (iLQG) as a whole-body controller to generate full body control inputs. Then we use Bayesian optimization to find the cost weights to use in the first stage that yields robust performance in the simulation/experiment, in the presence of different disturbance/uncertainties. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.

mg

https://arxiv.org/abs/1907.04616 link (url) [BibTex]

https://arxiv.org/abs/1907.04616 link (url) [BibTex]


Learning Variable Impedance Control for Contact Sensitive Tasks
Learning Variable Impedance Control for Contact Sensitive Tasks

Bogdanovic, M., Khadiv, M., Righetti, L.

arXiv preprint, arXiv:1907.07500, July 2019 (article)

Abstract
Reinforcement learning algorithms have shown great success in solving different problems ranging from playing video games to robotics. However, they struggle to solve delicate robotic problems, especially those involving contact interactions. Though in principle a policy outputting joint torques should be able to learn these tasks, in practice we see that they have difficulty to robustly solve the problem without any structure in the action space. In this paper, we investigate how the choice of action space can give robust performance in presence of contact uncertainties. We propose to learn a policy that outputs impedance and desired position in joint space as a function of system states without imposing any other structure to the problem. We compare the performance of this approach to torque and position control policies under different contact uncertainties. Extensive simulation results on two different systems, a hopper (floating-base) with intermittent contacts and a manipulator (fixed-base) wiping a table, show that our proposed approach outperforms policies outputting torque or position in terms of both learning rate and robustness to environment uncertainty.

mg

[BibTex]


Accurate Vision-based Manipulation through Contact Reasoning
Accurate Vision-based Manipulation through Contact Reasoning

Kloss, A., Bauza, M., Wu, J., Tenenbaum, J. B., Rodriguez, A., Bohg, J.

In International Conference on Robotics and Automation, May 2019 (inproceedings) Accepted

Abstract
Planning contact interactions is one of the core challenges of many robotic tasks. Optimizing contact locations while taking dynamics into account is computationally costly and in only partially observed environments, executing contact-based tasks often suffers from low accuracy. We present an approach that addresses these two challenges for the problem of vision-based manipulation. First, we propose to disentangle contact from motion optimization. Thereby, we improve planning efficiency by focusing computation on promising contact locations. Second, we use a hybrid approach for perception and state estimation that combines neural networks with a physically meaningful state representation. In simulation and real-world experiments on the task of planar pushing, we show that our method is more efficient and achieves a higher manipulation accuracy than previous vision-based approaches.

am

Video link (url) [BibTex]

Video link (url) [BibTex]


Learning Latent Space Dynamics for Tactile Servoing
Learning Latent Space Dynamics for Tactile Servoing

Sutanto, G., Ratliff, N., Sundaralingam, B., Chebotar, Y., Su, Z., Handa, A., Fox, D.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings) Accepted

am

pdf video [BibTex]

pdf video [BibTex]


Leveraging Contact Forces for Learning to Grasp
Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings)

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two- fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

am mg

video arXiv [BibTex]

video arXiv [BibTex]


Automated Generation of Reactive Programs from Human Demonstration for Orchestration of Robot Behaviors
Automated Generation of Reactive Programs from Human Demonstration for Orchestration of Robot Behaviors

Berenz, V., Bjelic, A., Mainprice, J.

ArXiv, 2019 (article)

Abstract
Social robots or collaborative robots that have to interact with people in a reactive way are difficult to program. This difficulty stems from the different skills required by the programmer: to provide an engaging user experience the behavior must include a sense of aesthetics while robustly operating in a continuously changing environment. The Playful framework allows composing such dynamic behaviors using a basic set of action and perception primitives. Within this framework, a behavior is encoded as a list of declarative statements corresponding to high-level sensory-motor couplings. To facilitate non-expert users to program such behaviors, we propose a Learning from Demonstration (LfD) technique that maps motion capture of humans directly to a Playful script. The approach proceeds by identifying the sensory-motor couplings that are active at each step using the Viterbi path in a Hidden Markov Model (HMM). Given these activation patterns, binary classifiers called evaluations are trained to associate activations to sensory data. Modularity is increased by clustering the sensory-motor couplings, leading to a hierarchical tree structure. The novelty of the proposed approach is that the learned behavior is encoded not in terms of trajectories in a task space, but as couplings between sensory information and high-level motor actions. This provides advantages in terms of behavioral generalization and reactivity displayed by the robot.

am

Support Video link (url) [BibTex]

2009


no image
Modelling the interplay of central pattern generation and sensory feedback in the neuromuscular control of running

Daley, M., Righetti, L., Ijspeert, A.

In Comparative Biochemistry and Physiology - Part A: Molecular & Integrative Physiology. Annual Main Meeting for the Society for Experimental Biology, 153, Glasgow, Scotland, 2009 (inproceedings)

mg

link (url) DOI [BibTex]

2009


link (url) DOI [BibTex]


no image
Path integral-based stochastic optimal control for rigid body dynamics

Theodorou, E. A., Buchli, J., Schaal, S.

In Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL ’09. IEEE Symposium on, pages: 219-225, 2009, clmc (inproceedings)

Abstract
Recent advances on path integral stochastic optimal control [1],[2] provide new insights in the optimal control of nonlinear stochastic systems which are linear in the controls, with state independent and time invariant control transition matrix. Under these assumptions, the Hamilton-Jacobi-Bellman (HJB) equation is formulated and linearized with the use of the logarithmic transformation of the optimal value function. The resulting HJB is a linear second order partial differential equation which is solved by an approximation based on the Feynman-Kac formula [3]. In this work we review the theory of path integral control and derive the linearized HJB equation for systems with state dependent control transition matrix. In addition we derive the path integral formulation for the general class of systems with state dimensionality that is higher than the dimensionality of the controls. Furthermore, by means of a modified inverse dynamics controller, we apply path integral stochastic optimal control over the new control space. Simulations illustrate the theoretical results. Future developments and extensions are discussed.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Learning locomotion over rough terrain using terrain templates

Kalakrishnan, M., Buchli, J., Pastor, P., Schaal, S.

In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages: 167-172, 2009, clmc (inproceedings)

Abstract
We address the problem of foothold selection in robotic legged locomotion over very rough terrain. The difficulty of the problem we address here is comparable to that of human rock-climbing, where foot/hand-hold selection is one of the most critical aspects. Previous work in this domain typically involves defining a reward function over footholds as a weighted linear combination of terrain features. However, a significant amount of effort needs to be spent in designing these features in order to model more complex decision functions, and hand-tuning their weights is not a trivial task. We propose the use of terrain templates, which are discretized height maps of the terrain under a foothold on different length scales, as an alternative to manually designed features. We describe an algorithm that can simultaneously learn a small set of templates and a foothold ranking function using these templates, from expert-demonstrated footholds. Using the LittleDog quadruped robot, we experimentally show that the use of terrain templates can produce complex ranking functions with higher performance than standard terrain features, and improved generalization to unseen terrain.

am

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
CESAR: A lunar crater exploration and sample return robot

Schwendner, J., Grimminger, F., Bartsch, S., Kaupisch, T., Yüksel, M., Bresser, A., Akpo, J. B., Seydel, M. K. -., Dieterle, A., Schmidt, S., Kirchner, F.

In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3355-3360, October 2009 (inproceedings)

am

DOI [BibTex]

DOI [BibTex]


no image
Concept Evaluation of a New Biologically Inspired Robot “Littleape”

Kühn, D., Römmermann, M., Sauthoff, N., Grimminger, F., Kirchner, F.

In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 589–594, IROS’09, IEEE Press, 2009 (inproceedings)

am

DOI [BibTex]

DOI [BibTex]


no image
Adaptive Frequency Oscillators and Applications

Righetti, L., Buchli, J., Ijspeert, A.

The Open Cybernetics \& Systemics Journal, 3, pages: 64-69, 2009 (article)

Abstract
In this contribution we present a generic mechanism to transform an oscillator into an adaptive frequency oscillator, which can then dynamically adapt its parameters to learn the frequency of any periodic driving signal. Adaptation is done in a dynamic way: it is part of the dynamical system and not an offline process. This mechanism goes beyond entrainment since it works for any initial frequencies and the learned frequency stays encoded in the system even if the driving signal disappears. Interestingly, this mechanism can easily be applied to a large class of oscillators from harmonic oscillators to relaxation types and strange attractors. Several practical applications of this mechanism are then presented, ranging from adaptive control of compliant robots to frequency analysis of signals and construction of limit cycles of arbitrary shape.

mg

link (url) [BibTex]

link (url) [BibTex]


Valero-Cuevas, F., Hoffmann, H., Kurse, M. U., Kutch, J. J., Theodorou, E. A.

IEEE Reviews in Biomedical Engineering – (All authors have equally contributed), (2):110?135, 2009, clmc (article)

Abstract
Computational models of the neuromuscular system hold the potential to allow us to reach a deeper understanding of neuromuscular function and clinical rehabilitation by complementing experimentation. By serving as a means to distill and explore specific hypotheses, computational models emerge from prior experimental data and motivate future experimental work. Here we review computational tools used to understand neuromuscular function including musculoskeletal modeling, machine learning, control theory, and statistical model analysis. We conclude that these tools, when used in combination, have the potential to further our understanding of neuromuscular function by serving as a rigorous means to test scientific hypotheses in ways that complement and leverage experimental data.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Compact models of motor primitive variations for predictible reaching and obstacle avoidance

Stulp, F., Oztop, E., Pastor, P., Beetz, M., Schaal, S.

In IEEE-RAS International Conference on Humanoid Robots (Humanoids 2009), Paris, Dec.7-10, 2009, clmc (inproceedings)

Abstract
over and over again. This regularity allows humans and robots to reuse existing solutions for known recurring tasks. We expect that reusing a set of standard solutions to solve similar tasks will facilitate the design and on-line adaptation of the control systems of robots operating in human environments. In this paper, we derive a set of standard solutions for reaching behavior from human motion data. We also derive stereotypical reaching trajectories for variations of the task, in which obstacles are present. These stereotypical trajectories are then compactly represented with Dynamic Movement Primitives. On the humanoid robot Sarcos CB, this approach leads to reproducible, predictable, and human-like reaching motions.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Human optimization strategies under reward feedback

Hoffmann, H., Theodorou, E., Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2009), Waikoloa, Hawaii, 2009, 2009, clmc (inproceedings)

Abstract
Many hypothesis on human movement generation have been cast into an optimization framework, implying that movements are adapted to optimize a single quantity, like, e.g., jerk, end-point variance, or control cost. However, we still do not understand how humans actually learn when given only a cost or reward feedback at the end of a movement. Such a reinforcement learning setting has been extensively explored theoretically in engineering and computer science, but in human movement control, hardly any experiment studied movement learning under reward feedback. We present experiments probing which computational strategies humans use to optimize a movement under a continuous reward function. We present two experimental paradigms. The first paradigm mimics a ball-hitting task. Subjects (n=12) sat in front of a computer screen and moved a stylus on a tablet towards an unknown target. This target was located on a line that the subjects had to cross. During the movement, visual feedback was suppressed. After the movement, a reward was displayed graphically as a colored bar. As reward, we used a Gaussian function of the distance between the target location and the point of line crossing. We chose such a function since in sensorimotor tasks, the cost or loss function that humans seem to represent is close to an inverted Gaussian function (Koerding and Wolpert 2004). The second paradigm mimics pocket billiards. On the same experimental setup as above, the computer screen displayed a pocket (two bars), a white disk, and a green disk. The goal was to hit with the white disk the green disk (as in a billiard collision), such that the green disk moved into the pocket. Subjects (n=8) manipulated with the stylus the white disk to effectively choose start point and movement direction. Reward feedback was implicitly given as hitting or missing the pocket with the green disk. In both paradigms, subjects increased the average reward over trials. The surprising result was that in these experiments, humans seem to prefer a strategy that uses a reward-weighted average over previous movements instead of gradient ascent. The literature on reinforcement learning is dominated by gradient-ascent methods. However, our computer simulations and theoretical analysis revealed that reward-weighted averaging is the more robust choice given the amount of movement variance observed in humans. Apparently, humans choose an optimization strategy that is suitable for their own movement variance.

am

[BibTex]

[BibTex]


no image
Concept evaluation of a new biologically inspired robot “LittleApe”

Kühn, D., Römmermann, M., Sauthoff, N., Grimminger, F., Kirchner, F.

In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 589-594, October 2009 (inproceedings)

am

DOI [BibTex]

DOI [BibTex]


no image
Adaptive compliance control of a multi-legged stair-climbing robot based on proprioceptive data

Waldron, K. J., Eich, M., Grimminger, F., Kirchner, F.

Industrial Robot: An International Journal, 36, pages: 331-339, Emerald Group Publishing Limited, June 2009 (article)

am

DOI [BibTex]

DOI [BibTex]


no image
On-line learning and modulation of periodic movements with nonlinear dynamical systems

Gams, A., Ijspeert, A., Schaal, S., Lenarčič, J.

Autonomous Robots, 27(1):3-23, 2009, clmc (article)

Abstract
Abstract  The paper presents a two-layered system for (1) learning and encoding a periodic signal without any knowledge on its frequency and waveform, and (2) modulating the learned periodic trajectory in response to external events. The system is used to learn periodic tasks on a humanoid HOAP-2 robot. The first layer of the system is a dynamical system responsible for extracting the fundamental frequency of the input signal, based on adaptive frequency oscillators. The second layer is a dynamical system responsible for learning of the waveform based on a built-in learning algorithm. By combining the two dynamical systems into one system we can rapidly teach new trajectories to robots without any knowledge of the frequency of the demonstration signal. The system extracts and learns only one period of the demonstration signal. Furthermore, the trajectories are robust to perturbations and can be modulated to cope with a dynamic environment. The system is computationally inexpensive, works on-line for any periodic signal, requires no additional signal processing to determine the frequency of the input signal and can be applied in parallel to multiple dimensions. Additionally, it can adapt to changes in frequency and shape, e.g. to non-stationary signals, such as hand-generated signals and human demonstrations.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Local dimensionality reduction for non-parametric regression

Hoffman, H., Schaal, S., Vijayakumar, S.

Neural Processing Letters, 2009, clmc (article)

Abstract
Locally-weighted regression is a computationally-efficient technique for non-linear regression. However, for high-dimensional data, this technique becomes numerically brittle and computationally too expensive if many local models need to be maintained simultaneously. Thus, local linear dimensionality reduction combined with locally-weighted regression seems to be a promising solution. In this context, we review linear dimensionality-reduction methods, compare their performance on nonparametric locally-linear regression, and discuss their ability to extend to incremental learning. The considered methods belong to the following three groups: (1) reducing dimensionality only on the input data, (2) modeling the joint input-output data distribution, and (3) optimizing the correlation between projection directions and output data. Group 1 contains principal component regression (PCR); group 2 contains principal component analysis (PCA) in joint input and output space, factor analysis, and probabilistic PCA; and group 3 contains reduced rank regression (RRR) and partial least squares (PLS) regression. Among the tested methods, only group 3 managed to achieve robust performance even for a non-optimal number of components (factors or projection directions). In contrast, group 1 and 2 failed for fewer components since these methods rely on the correct estimate of the true intrinsic dimensionality. In group 3, PLS is the only method for which a computationally-efficient incremental implementation exists. Thus, PLS appears to be ideally suited as a building block for a locally-weighted regressor in which projection directions are incrementally added on the fly.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Proprioceptive control of a hybrid legged-wheeled robot

Eich, M., Grimminger, F., Kirchner, F.

In 2008 IEEE International Conference on Robotics and Biomimetics, pages: 774-779, February 2009 (inproceedings)

am

DOI [BibTex]

DOI [BibTex]


no image
Learning and generalization of motor skills by learning from demonstration

Pastor, P., Hoffmann, H., Asfour, T., Schaal, S.

In International Conference on Robotics and Automation (ICRA2009), Kobe, Japan, May 12-19, 2009, 2009, clmc (inproceedings)

Abstract
We provide a general approach for learning robotic motor skills from human demonstration. To represent an observed movement, a non-linear differential equation is learned such that it reproduces this movement. Based on this representation, we build a library of movements by labeling each recorded movement according to task and context (e.g., grasping, placing, and releasing). Our differential equation is formulated such that generalization can be achieved simply by adapting a start and a goal parameter in the equation to the desired position values of a movement. For object manipulation, we present how our framework extends to the control of gripper orientation and finger position. The feasibility of our approach is demonstrated in simulation as well as on a real robot. The robot learned a pick-and-place operation and a water-serving task and could generalize these tasks to novel situations.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Compliant quadruped locomotion over rough terrain

Buchli, J., Kalakrishnan, M., Mistry, M., Pastor, P., Schaal, S.

In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages: 814-820, 2009, clmc (inproceedings)

Abstract
Many critical elements for statically stable walking for legged robots have been known for a long time, including stability criteria based on support polygons, good foothold selection, recovery strategies to name a few. All these criteria have to be accounted for in the planning as well as the control phase. Most legged robots usually employ high gain position control, which means that it is crucially important that the planned reference trajectories are a good match for the actual terrain, and that tracking is accurate. Such an approach leads to conservative controllers, i.e. relatively low speed, ground speed matching, etc. Not surprisingly such controllers are not very robust - they are not suited for the real world use outside of the laboratory where the knowledge of the world is limited and error prone. Thus, to achieve robust robotic locomotion in the archetypical domain of legged systems, namely complex rough terrain, where the size of the obstacles are in the order of leg length, additional elements are required. A possible solution to improve the robustness of legged locomotion is to maximize the compliance of the controller. While compliance is trivially achieved by reduced feedback gains, for terrain requiring precise foot placement (e.g. climbing rocks, walking over pegs or cracks) compliance cannot be introduced at the cost of inferior tracking. Thus, model-based control and - in contrast to passive dynamic walkers - active balance control is required. To achieve these objectives, in this paper we add two crucial elements to legged locomotion, i.e., floating-base inverse dynamics control and predictive force control, and we show that these elements increase robustness in face of unknown and unanticipated perturbations (e.g. obstacles). Furthermore, we introduce a novel line-based COG trajectory planner, which yields a simpler algorithm than traditional polygon based methods and creates the appropriate input to our control system.We show results from bot- h simulation and real world of a robotic dog walking over non-perceived obstacles and rocky terrain. The results prove the effectivity of the inverse dynamics/force controller. The presented results show that we have all elements needed for robust all-terrain locomotion, which should also generalize to other legged systems, e.g., humanoid robots.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Incorporating Muscle Activation-Contraction dynamics to an optimal control framework for finger movements

Theodorou, Evangelos A., Valero-Cuevas, Francisco J.

Abstracts of Neural Control of Movement Conference (NCM 2009), 2009, clmc (article)

Abstract
Recent experimental and theoretical work [1] investigated the neural control of contact transition between motion and force during tapping with the index finger as a nonlinear optimization problem. Such transitions from motion to well-directed contact force are a fundamental part of dexterous manipulation. There are 3 alternative hypotheses of how this transition could be accomplished by the nervous system as a function of changes in direction and magnitude of the torque vector controlling the finger. These hypotheses are 1) an initial change in direction with a subsequent change in magnitude of the torque vector; 2) an initial change in magnitude with a subsequent directional change of the torque vector; and 3) a simultaneous and proportionally equal change of both direction and magnitude of the torque vector. Experimental work in [2] shows that the nervous system selects the first strategy, and in [1] we suggest that this may in fact be the optimal strategy. In [4] the framework of Iterative Linear Quadratic Optimal Regulator (ILQR) was extended to incorporate motion and force control. However, our prior simulation work assumed direct and instantaneous control of joint torques, which ignores the known delays and filtering properties of skeletal muscle. In this study, we implement an ILQR controller for a more biologically plausible biomechanical model of the index finger than [4], and add activation-contraction dynamics to the system to simulate muscle function. The planar biomechanical model includes the kinematics of the 3 joints while the applied torques are driven by activation?contraction dynamics with biologically plausible time constants [3]. In agreement with our experimental work [2], the task is to, within 500 ms, move the finger from a given resting configuration to target configuration with a desired terminal velocity. ILQR does not only stabilize the finger dynamics according to the objective function, but it also generates smooth joint space trajectories with minimal tuning and without an a-priori initial control policy (which is difficult to find for highly dimensional biomechanical systems). Furthemore, the use of this optimal control framework and the addition of activation-contraction dynamics considers the full nonlinear dynamics of the index finger and produces a sequence of postures which are compatible with experimental motion data [2]. These simulations combined with prior experimental results suggest that optimal control is a strong candidate for the generation of finger movements prior to abrupt motion-to-force transitions. This work is funded in part by grants NIH R01 0505520 and NSF EFRI-0836042 to Dr. Francisco J. Valero- Cuevas 1 Venkadesan M, Valero-Cuevas FJ. 
Effects of neuromuscular lags on controlling contact transitions. 
Philosophical Transactions of the Royal Society A: 2008. 2 Venkadesan M, Valero-Cuevas FJ. 
Neural Control of Motion-to-Force Transitions with the Fingertip. 
J. Neurosci., Feb 2008; 28: 1366 - 1373; 3 Zajac. Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Crit Rev Biomed Eng, 17 4. Weiwei Li., Francisco Valero Cuevas: ?Linear Quadratic Optimal Control of Contact Transition with Fingertip ? ACC 2009

am

PDF [BibTex]

PDF [BibTex]


no image
Inertial parameter estimation of floating-base humanoid systems using partial force sensing

Mistry, M., Schaal, S., Yamane, K.

In IEEE-RAS International Conference on Humanoid Robots (Humanoids 2009), Paris, Dec.7-10, 2009, clmc (inproceedings)

Abstract
Recently, several controllers have been proposed for humanoid robots which rely on full-body dynamic models. The estimation of inertial parameters from data is a critical component for obtaining accurate models for control. However, floating base systems, such as humanoid robots, incur added challenges to this task (e.g. contact forces must be measured, contact states can change, etc.) In this work, we outline a theoretical framework for whole body inertial parameter estimation, including the unactuated floating base. Using a least squares minimization approach, conducted within the nullspace of unmeasured degrees of freedom, we are able to use a partial force sensor set for full-body estimation, e.g. using only joint torque sensors, allowing for estimation when contact force measurement is unavailable or unreliable (e.g. due to slipping, rolling contacts, etc.). We also propose how to determine the theoretical minimum force sensor set for full body estimation, and discuss the practical limitations of doing so.

am

link (url) [BibTex]

link (url) [BibTex]


no image
On-line learning and modulation of periodic movements with nonlinear dynamical systems

Gams, A., Ijspeert, A., Schaal, S., Lenarčič, J.

Autonomous Robots, 27(1):3-23, 2009, clmc (article)

Abstract
Abstract  The paper presents a two-layered system for (1) learning and encoding a periodic signal without any knowledge on its frequency and waveform, and (2) modulating the learned periodic trajectory in response to external events. The system is used to learn periodic tasks on a humanoid HOAP-2 robot. The first layer of the system is a dynamical system responsible for extracting the fundamental frequency of the input signal, based on adaptive frequency oscillators. The second layer is a dynamical system responsible for learning of the waveform based on a built-in learning algorithm. By combining the two dynamical systems into one system we can rapidly teach new trajectories to robots without any knowledge of the frequency of the demonstration signal. The system extracts and learns only one period of the demonstration signal. Furthermore, the trajectories are robust to perturbations and can be modulated to cope with a dynamic environment. The system is computationally inexpensive, works on-line for any periodic signal, requires no additional signal processing to determine the frequency of the input signal and can be applied in parallel to multiple dimensions. Additionally, it can adapt to changes in frequency and shape, e.g. to non-stationary signals, such as hand-generated signals and human demonstrations.

am

link (url) [BibTex]

link (url) [BibTex]

1999


no image
Is imitation learning the route to humanoid robots?

Schaal, S.

Trends in Cognitive Sciences, 3(6):233-242, 1999, clmc (article)

Abstract
This review will focus on two recent developments in artificial intelligence and neural computation: learning from imitation and the development of humanoid robots. It will be postulated that the study of imitation learning offers a promising route to gain new insights into mechanisms of perceptual motor control that could ultimately lead to the creation of autonomous humanoid robots. This hope is justified because imitation learning channels research efforts towards three important issues: efficient motor learning, the connection between action and perception, and modular motor control in form of movement primitives. In order to make these points, first, a brief review of imitation learning will be given from the view of psychology and neuroscience. In these fields, representations and functional connections between action and perception have been explored that contribute to the understanding of motor acts of other beings. The recent discovery that some areas in the primate brain are active during both movement perception and execution provided a first idea of the possible neural basis of imitation. Secondly, computational approaches to imitation learning will be described, initially from the perspective of traditional AI and robotics, and then with a focus on neural network models and statistical learning research. Parallels and differences between biological and computational approaches to imitation will be highlighted. The review will end with an overview of current projects that actually employ imitation learning for humanoid robots.

am

link (url) [BibTex]

1999


link (url) [BibTex]


no image
Segmentation of endpoint trajectories does not imply segmented control

Sternad, D., Schaal, D.

Experimental Brain Research, 124(1):118-136, 1999, clmc (article)

Abstract
While it is generally assumed that complex movements consist of a sequence of simpler units, the quest to define these units of action, or movement primitives, still remains an open question. In this context, two hypotheses of movement segmentation of endpoint trajectories in 3D human drawing movements are re-examined: (1) the stroke-based segmentation hypothesis based on the results that the proportionality coefficient of the 2/3 power law changes discontinuously with each new â??strokeâ?, and (2) the segmentation hypothesis inferred from the observation of piecewise planar endpoint trajectories of 3D drawing movements. In two experiments human subjects performed a set of elliptical and figure-8 patterns of different sizes and orientations using their whole arm in 3D. The kinematic characteristics of the endpoint trajectories and the seven joint angles of the arm were analyzed. While the endpoint trajectories produced similar segmentation features as reported in the literature, analyses of the joint angles show no obvious segmentation but rather continuous oscillatory patterns. By approximating the joint angle data of human subjects with sinusoidal trajectories, and by implementing this model on a 7-degree-of-freedom anthropomorphic robot arm, it is shown that such a continuous movement strategy can produce exactly the same features as observed by the above segmentation hypotheses. The origin of this apparent segmentation of endpoint trajectories is traced back to the nonlinear transformations of the forward kinematics of human arms. The presented results demonstrate that principles of discrete movement generation may not be reconciled with those of rhythmic movement as easily as has been previously suggested, while the generalization of nonlinear pattern generators to arm movements can offer an interesting alternative to approach the question of units of action.

am

link (url) [BibTex]

link (url) [BibTex]