Sample-efficient Cross-Entropy Method for Real-time Planning

Autonomous Learning Embodied Vision Conference Paper 2020

Autonomous Learning

Cristina Pinneri

Robust Machine Learning

Sebastian Blaes

Postdoctoral Researcher

Max Planck Research Group Leader

Autonomous Learning

Michal Rolinek

Empirical Inference, Autonomous Learning

Georg Martius

Senior Research Scientist

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
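
For readers unfamiliar with CEM as a trajectory optimizer, the following is a minimal sketch of the plain (unimproved) method on a toy problem. It is not the authors' implementation. The 1-D integrator dynamics, the quadratic cost, and all names and default values (`rollout_cost`, `cem_plan`, `keep_elites`) are assumptions made for illustration. The optional `keep_elites` argument only loosely illustrates the idea of memory by carrying the best samples into the next iteration; temporally-correlated action noise is not implemented here.

```python
import numpy as np

def rollout_cost(actions, x0=0.0, target=1.0):
    """Cost of action sequences on a toy 1-D integrator x_{t+1} = x_t + a_t.

    Hypothetical stand-in for a learned or ground-truth dynamics model.
    Accepts shape (horizon,) or (num_samples, horizon).
    """
    states = x0 + np.cumsum(actions, axis=-1)
    return np.sum((states - target) ** 2, axis=-1)

def cem_plan(horizon=10, num_samples=64, num_elites=8, iterations=5,
             init_std=0.5, keep_elites=0, seed=0):
    """Plain CEM over action sequences with a diagonal Gaussian."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(horizon)
    std = np.full(horizon, init_std)
    kept = np.empty((0, horizon))  # elites carried over between iterations
    for _ in range(iterations):
        # Sample candidate action sequences from the current distribution.
        samples = mean + std * rng.standard_normal((num_samples, horizon))
        # Assumption: "memory" is approximated by re-adding earlier elites.
        samples = np.vstack([samples, kept])
        costs = rollout_cost(samples)
        # Keep the lowest-cost sequences and refit the Gaussian to them.
        elites = samples[np.argsort(costs)[:num_elites]]
        mean, std = elites.mean(axis=0), elites.std(axis=0)
        kept = elites[:keep_elites]
    best = elites[0]  # lowest-cost sequence of the final iteration
    return best, rollout_cost(best)

plan, cost = cem_plan(keep_elites=2)
print(plan.shape, float(cost))
```

In a receding-horizon controller, only the first action of `plan` would typically be executed before replanning from the new state; the sample budget per replanning step (`num_samples * iterations`) is what limits real-time use.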

Author(s):	Cristina Pinneri and Shambhuraj Sawant and Sebastian Blaes and Jan Achterhold and Joerg Stueckler and Michal Rolinek and Georg Martius
Book Title:	Conference on Robot Learning 2020
Year:	2020

Project(s):	Model-based Reinforcement Learning and Planning
Bibtex Type:	Conference Paper (inproceedings)

State:	Published
URL:	https://corlconf.github.io/corl2020/paper_217/

Electronic Archiving:	grant_archive

Links:	Paper Code Spotlight-Video

BibTeX

@inproceedings{PinneriEtAl2020:iCEM,
  title = {Sample-efficient Cross-Entropy Method for Real-time Planning},
  booktitle = {Conference on Robot Learning 2020},
  abstract = {Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x less samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.},
  year = {2020},
  slug = {pinnerietal2020-icem},
  author = {Pinneri, Cristina and Sawant, Shambhuraj and Blaes, Sebastian and Achterhold, Jan and Stueckler, Joerg and Rolinek, Michal and Martius, Georg},
  url = {https://corlconf.github.io/corl2020/paper_217/}
}
