Optimizing Long-term Predictions for Model-based Policy Search

Autonomous Motion Intelligent Control Systems Conference Paper 2017

Intelligent Control Systems

Andreas Doerr

Intelligent Control Systems

Alonso Marco Valle

Autonomous Motion

Stefan Schaal

Director

Intelligent Control Systems

Sebastian Trimpe

We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, model-based RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.
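The difference between the two training criteria named in the abstract can be illustrated with a small sketch. This is not the authors' model or implementation. It assumes a toy scalar linear system and Gaussian observation noise with fixed variance, so that maximizing the trajectory likelihood reduces to minimizing a squared error. All names (`simulate`, `one_step_loss`, `rollout_loss`) and all parameter values are hypothetical.

```python
import numpy as np


def simulate(a, b, x0, u):
    """Roll the toy model x[t+1] = a*x[t] + b*u[t] forward from x0."""
    x = np.empty(len(u) + 1)
    x[0] = x0
    for t in range(len(u)):
        x[t + 1] = a * x[t] + b * u[t]
    return x


def one_step_loss(a, b, y, u):
    """One-step-ahead criterion: each prediction starts from the
    (possibly noisy) previous observation y[t]."""
    pred = a * y[:-1] + b * u
    return float(np.sum((y[1:] - pred) ** 2))


def rollout_loss(a, b, y, u):
    """Long-term criterion: predict the whole trajectory from the first
    observation only, feeding the model its own predictions, and compare
    against all observations. Under the fixed-variance Gaussian noise
    assumption this is proportional to the negative log-likelihood of
    the observed trajectory, up to an additive constant."""
    x_hat = simulate(a, b, y[0], u)
    return float(np.sum((y - x_hat) ** 2))


# Assumed example data: a = 0.9, b = 0.5, noisy observations of the state.
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=50)
x_true = simulate(0.9, 0.5, 1.0, u)
y = x_true + 0.1 * rng.standard_normal(x_true.shape)

print("one-step loss at true parameters:", one_step_loss(0.9, 0.5, y, u))
print("rollout loss at true parameters: ", rollout_loss(0.9, 0.5, y, u))
```

In the one-step criterion the noisy observation `y[t]` appears as a model input, whereas the rollout criterion only uses `y[0]` as an input and treats the remaining observations purely as targets. Either loss could be passed to a generic optimizer over `a` and `b`; the choice of optimizer is left open here.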

Author(s):	Andreas Doerr and Christian Daniel and Duy Nguyen-Tuong and Alonso Marco and Stefan Schaal and Marc Toussaint and Sebastian Trimpe
Book Title:	Proceedings of 1st Annual Conference on Robot Learning (CoRL)
Volume:	78
Pages:	227-238
Year:	2017
Month:	November
Editors:	Sergey Levine and Vincent Vanhoucke and Ken Goldberg

Project(s):	Learning Probabilistic Dynamics Models
Bibtex Type:	Conference Paper (conference)

Event Name:	1st Annual Conference on Robot Learning
Event Place:	Mountain View, CA, USA
State:	Published

Electronic Archiving:	grant_archive

BibTex

@conference{doerr2017optimizing,
  title = {Optimizing Long-term Predictions for Model-based Policy Search},
  booktitle = {Proceedings of 1st Annual Conference on Robot Learning (CoRL)},
  abstract = {We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, model-based RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.},
  volume = {78},
  pages = {227-238},
  editors = {Sergey Levine and Vincent Vanhoucke and Ken Goldberg},
  month = nov,
  year = {2017},
  slug = {doerr_corl_2017},
  author = {Doerr, Andreas and Daniel, Christian and Nguyen-Tuong, Duy and Marco, Alonso and Schaal, Stefan and Toussaint, Marc and Trimpe, Sebastian},
  month_numeric = {11}
}
