Conference Paper
2020
Sample-efficient Cross-Entropy Method for Real-time Planning

Autonomous Learning
Cristina Pinneri

Robust Machine Learning
Sebastian Blaes
Postdoctoral Researcher

Embodied Vision
Jan Achterhold

Embodied Vision
Jörg Stückler
Max Planck Research Group Leader

Autonomous Learning
Michal Rolinek

Empirical Inference, Autonomous Learning
Georg Martius
Senior Research Scientist
Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
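The two additions named in the abstract, temporally correlated actions and memory, can be illustrated with a minimal sketch. This is not the authors' implementation: the power-law (colored) noise sampler, the `keep_frac` parameter, all default values, and the toy `point_mass_cost` function are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def colored_noise(beta, shape, rng):
    """Sample sequences whose power spectrum falls off as 1/f**beta (last axis)."""
    n = shape[-1]
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]  # avoid division by zero at the DC component
    scale = freqs ** (-beta / 2.0)
    spec_shape = shape[:-1] + (freqs.size,)
    spectrum = (rng.standard_normal(spec_shape)
                + 1j * rng.standard_normal(spec_shape)) * scale
    x = np.fft.irfft(spectrum, n=n, axis=-1)
    return x / x.std(axis=-1, keepdims=True)  # unit variance per sequence


def cem_plan(cost_fn, horizon, n_samples=64, n_elites=8, n_iters=5,
             beta=2.0, keep_frac=0.3, seed=0):
    """CEM over action sequences in [-1, 1] with correlated noise and elite memory."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(horizon), np.ones(horizon)
    elites = np.empty((0, horizon))
    elite_costs = np.empty(0)
    for _ in range(n_iters):
        # Temporally correlated exploration: beta=0 would give white noise.
        noise = colored_noise(beta, (n_samples, horizon), rng)
        samples = np.clip(mean + std * noise, -1.0, 1.0)
        # Memory (assumed form): carry the best previous elites into this round.
        n_keep = int(keep_frac * len(elites))
        samples = np.concatenate([samples, elites[:n_keep]])
        costs = np.array([cost_fn(a) for a in samples])
        order = np.argsort(costs)[:n_elites]
        elites, elite_costs = samples[order], costs[order]
        # Refit the sampling distribution to the elite set.
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return elites[0], elite_costs[0]


def point_mass_cost(actions, dt=0.1, target=1.0):
    """Hypothetical toy task: drive a 1D point mass from 0 to `target`."""
    pos, vel, cost = 0.0, 0.0, 0.0
    for a in actions:
        vel += a * dt
        pos += vel * dt
        cost += (pos - target) ** 2
    return cost


best_actions, best_cost = cem_plan(point_mass_cost, horizon=20)
print(best_cost, point_mass_cost(np.zeros(20)))
```

Because the best elites are re-evaluated alongside the fresh samples, the best cost in this sketch cannot increase from one iteration to the next, which is one plausible reading of how memory helps when the sample budget is small.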
Author(s): | Cristina Pinneri and Shambhuraj Sawant and Sebastian Blaes and Jan Achterhold and Joerg Stueckler and Michal Rolinek and Georg Martius |
Book Title: | Conference on Robot Learning 2020 |
Year: | 2020 |
Bibtex Type: | Conference Paper (inproceedings) |
State: | Published |
URL: | https://corlconf.github.io/corl2020/paper_217/ |
Links: |
BibTeX
@inproceedings{PinneriEtAl2020:iCEM,
  title = {Sample-efficient Cross-Entropy Method for Real-time Planning},
  booktitle = {Conference on Robot Learning 2020},
  abstract = {Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.},
  year = {2020},
  slug = {pinnerietal2020-icem},
  author = {Pinneri, Cristina and Sawant, Shambhuraj and Blaes, Sebastian and Achterhold, Jan and Stueckler, Joerg and Rolinek, Michal and Martius, Georg},
  url = {https://corlconf.github.io/corl2020/paper_217/}
}