Robot learning from demonstration
The goal of robot learning from demonstration is to have a robot learn from watching a demonstration of the task to be performed. In our approach to learning from demonstration, the robot learns a reward function from the demonstration and a task model from repeated attempts to perform the task. A policy is computed based on the learned reward function and task model. Lessons learned from an implementation on an anthropomorphic robot arm using a pendulum swing-up task include: 1) simply mimicking demonstrated motions is not adequate to perform this task; 2) a task planner can use a learned model and reward function to compute an appropriate policy; 3) this model-based planning process supports rapid learning; 4) both parametric and nonparametric models can be learned and used; and 5) incorporating a task-level direct learning component, which is non-model-based, in addition to the model-based planner is useful in compensating for structural modeling errors and slow model learning.
@inproceedings{Atkeson_MLPFIC_1997,
  title     = {Robot learning from demonstration},
  author    = {Atkeson, C. G. and Schaal, S.},
  booktitle = {Machine Learning: Proceedings of the Fourteenth International Conference (ICML '97)},
  editor    = {Fisher Jr., D. H.},
  publisher = {Morgan Kaufmann},
  address   = {Nashville, TN, July 8-12, 1997},
  year      = {1997},
  pages     = {12--20},
  abstract  = {The goal of robot learning from demonstration is to have a robot learn from watching a demonstration of the task to be performed. In our approach to learning from demonstration, the robot learns a reward function from the demonstration and a task model from repeated attempts to perform the task. A policy is computed based on the learned reward function and task model. Lessons learned from an implementation on an anthropomorphic robot arm using a pendulum swing-up task include: 1) simply mimicking demonstrated motions is not adequate to perform this task; 2) a task planner can use a learned model and reward function to compute an appropriate policy; 3) this model-based planning process supports rapid learning; 4) both parametric and nonparametric models can be learned and used; and 5) incorporating a task-level direct learning component, which is non-model-based, in addition to the model-based planner is useful in compensating for structural modeling errors and slow model learning.},
  note      = {clmc},
  slug      = {atkeson_mlpfic_1997},
  crossref  = {p42},
  url       = {http://www-clmc.usc.edu/publications/A/atkeson-ICML1997.pdf}
}