Applying the episodic natural actor-critic architecture to motor primitive learning
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates, which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our newest algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.
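The abstract's core idea, that the critic recovers the natural policy gradient by regressing episode returns on summed log-policy gradients, can be sketched in a few lines. The toy task below (a scalar Gaussian "motor command" policy rewarded for reaching a target value) and all names in it are illustrative assumptions, not the paper's actual experiments:

```python
import numpy as np

# Minimal sketch of the episodic Natural Actor-Critic on a hypothetical toy
# task: a scalar action a ~ N(theta, sigma^2), rewarded for landing near a
# target. The task, constants, and names are illustrative assumptions.
rng = np.random.default_rng(0)
theta = np.array([0.0])   # policy parameter (mean of the Gaussian policy)
sigma = 0.5               # fixed exploration noise
alpha = 0.1               # learning rate
target = 2.0

def rollout(theta, n_steps=5):
    """Run one episode; return the summed log-policy gradient and the return."""
    grad_sum = np.zeros_like(theta)
    ret = 0.0
    for _ in range(n_steps):
        a = theta[0] + sigma * rng.standard_normal()
        ret += -(a - target) ** 2                # reward: negative squared error
        grad_sum += (a - theta[0]) / sigma ** 2  # grad of log N(a; theta, sigma^2)
    return grad_sum, ret

for _ in range(200):
    # Critic: regress episode returns on the summed log-gradients (plus a
    # constant baseline feature); the slope w is the natural-gradient estimate.
    G, R = [], []
    for _ in range(20):
        g, r = rollout(theta)
        G.append(np.append(g, 1.0))              # trailing 1.0 = baseline feature
        R.append(r)
    w, *_ = np.linalg.lstsq(np.array(G), np.array(R), rcond=None)
    # Actor: step along the natural gradient (all but the baseline coefficient).
    theta = theta + alpha * w[:-1]

print(round(theta[0], 1))  # → 2.0 (the policy mean converges to the target)
```

The regression replaces the explicit Fisher-matrix inversion: for this single-parameter policy the least-squares slope already points along the natural gradient, which is what makes the episodic variant so sample-efficient on episodic motor tasks.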
Author(s): | Peters, J. and Schaal, S. |
Book Title: | Proceedings of the 2007 European Symposium on Artificial Neural Networks (ESANN) |
Year: | 2007 |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | Bruges, Belgium, April 25-27 |
URL: | http://www-clmc.usc.edu/publications//P/peters-ESANN2007.pdf |
Cross Ref: | p2673 |
Electronic Archiving: | grant_archive |
Note: | clmc |
BibTex
@inproceedings{Peters_PESANN_2007, title = {Applying the episodic natural actor-critic architecture to motor primitive learning}, booktitle = {Proceedings of the 2007 European Symposium on Artificial Neural Networks (ESANN)}, abstract = {In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists out of actor updates which are achieved using natural stochastic policy gradients while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.}, address = {Bruges, Belgium, April 25-27}, year = {2007}, note = {clmc}, slug = {peters_pesann_2007}, author = {Peters, J. and Schaal, S.}, crossref = {p2673}, url = {http://www-clmc.usc.edu/publications//P/peters-ESANN2007.pdf} }