Learning from demonstration

Autonomous Motion Conference Paper 1997

By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases, as well as strategies for approaching a learning problem, from instructions and/or demonstrations by other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speed-up after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30-second demonstration by the human instructor.
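
To make the idea of priming the model of the task dynamics concrete for the LQR case, here is a minimal sketch. It is not the paper's implementation. It assumes a small, hypothetical discrete-time linear system, a hypothetical stabilizing "instructor" gain (`K_demo`) with added action noise, a least-squares fit of the dynamics from the demonstrated transitions, and a plain Riccati iteration to turn the fitted model into a controller. All names and numbers are illustrative assumptions.

```python
import numpy as np


def fit_linear_model(states, actions, next_states):
    """Least-squares fit of x' = A x + B u from demonstrated transitions."""
    inputs = np.hstack([states, actions])  # shape (T, n + m)
    theta, *_ = np.linalg.lstsq(inputs, next_states, rcond=None)
    n = states.shape[1]
    return theta[:n].T, theta[n:].T  # A is (n, n), B is (n, m)


def lqr_gain(A, B, Q, R, iterations=500):
    """Iterate the discrete-time Riccati recursion; returns K with u = -K x."""
    P = Q.copy()
    for _ in range(iterations):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def run_demo(steps=300, seed=0):
    """Record one noisy demonstration, fit a model, and derive a controller."""
    rng = np.random.default_rng(seed)
    # Hypothetical unstable system standing in for a linearized pole balancer.
    A_true = np.array([[1.0, 0.1], [0.5, 1.0]])
    B_true = np.array([[0.0], [0.1]])
    # Hypothetical instructor policy; the action noise makes A and B identifiable.
    K_demo = np.array([[10.0, 5.0]])

    x = np.array([0.1, 0.0])
    states, actions, next_states = [], [], []
    for _ in range(steps):
        u = -K_demo @ x + 0.1 * rng.standard_normal(1)
        x_next = A_true @ x + B_true @ u
        states.append(x)
        actions.append(u)
        next_states.append(x_next)
        x = x_next

    A_hat, B_hat = fit_linear_model(
        np.array(states), np.array(actions), np.array(next_states)
    )
    K = lqr_gain(A_hat, B_hat, Q=np.eye(2), R=np.eye(1))
    return A_true, B_true, A_hat, B_hat, K


if __name__ == "__main__":
    A_true, B_true, A_hat, B_hat, K = run_demo()
    closed_loop = A_true - B_true @ K
    print("fitted A:\n", A_hat)
    print("closed-loop spectral radius:", max(abs(np.linalg.eigvals(closed_loop))))
```

In this sketch the demonstration is used only to estimate the dynamics; the controller is then computed from the fitted model rather than copied from the instructor. On a real robot the transitions would be noisy and the fit would be approximate.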

Author(s):	Schaal, S.
Book Title:	Advances in Neural Information Processing Systems 9
Pages:	1040-1046
Year:	1997
Editors:	Mozer, M. C.;Jordan, M.;Petsche, T.
Publisher:	MIT Press

Bibtex Type:	Conference Paper (inproceedings)

Address:	Cambridge, MA
URL:	http://www-clmc.usc.edu/publications/S/schaal-NIPS1997.pdf

Cross Ref:	p873
Electronic Archiving:	grant_archive
Note:	clmc

BibTeX

@inproceedings{Schaal_ANIPS_1997,
  title = {Learning from demonstration},
  booktitle = {Advances in Neural Information Processing Systems 9},
  abstract = {By now it is widely accepted that learning a task from scratch, i.e., without any prior knowledge, is a daunting undertaking. Humans, however, rarely attempt to learn from scratch. They extract initial biases, as well as strategies for approaching a learning problem, from instructions and/or demonstrations by other humans. For learning control, this paper investigates how learning from demonstration can be applied in the context of reinforcement learning. We consider priming the Q-function, the value function, the policy, and the model of the task dynamics as possible areas where demonstrations can speed up learning. In general nonlinear learning problems, only model-based reinforcement learning shows significant speed-up after a demonstration, while in the special case of linear quadratic regulator (LQR) problems, all methods profit from the demonstration. In an implementation of pole balancing on a complex anthropomorphic robot arm, we demonstrate that, when facing the complexities of real signal processing, model-based reinforcement learning offers the most robustness for LQR problems. Using the suggested methods, the robot learns pole balancing in just a single trial after a 30-second demonstration by the human instructor.},
  pages = {1040--1046},
  editor = {Mozer, M. C. and Jordan, M. and Petsche, T.},
  publisher = {MIT Press},
  address = {Cambridge, MA},
  year = {1997},
  note = {clmc},
  slug = {schaal_anips_1997},
  author = {Schaal, S.},
  crossref = {p873},
  url = {http://www-clmc.usc.edu/publications/S/schaal-NIPS1997.pdf}
}