Autonomous Motion / Intelligent Control Systems, Conference Paper, 2015

Direct Loss Minimization Inverse Optimal Control


Inverse Optimal Control (IOC) has strongly impacted the systems engineering process, enabling automated planner tuning through straightforward and intuitive demonstration. The most successful and established applications, though, have been in lower-dimensional problems such as navigation planning, where exact optimal planning or control is feasible. In higher-dimensional systems, such as humanoid robots, research has made substantial progress toward generalizing the ideas to model-free or locally optimal settings, but these systems are complicated to the point where demonstration itself can be difficult. Typically, real-world applications are restricted to demonstrations that are at best noisy, and often partial or incomplete, which proves cumbersome in existing frameworks. This work derives a very flexible method of IOC based on a form of Structured Prediction known as Direct Loss Minimization. The resulting algorithm is essentially Policy Search on a reward function that rewards similarity to demonstrated behavior (using Covariance Matrix Adaptation (CMA) in our experiments). Our framework blurs the distinction between IOC, other forms of Imitation Learning, and Reinforcement Learning, enabling us to derive simple, versatile, and practical algorithms that blend imitation and reinforcement signals into a unified framework. Our experiments analyze various aspects of its performance and demonstrate its efficacy in conveying preferences for motion shaping and combined reach and grasp quality optimization.
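As a rough illustration of the idea the abstract describes (policy search on a reward that measures similarity to demonstrated behavior), the sketch below treats the planner's cost weights as policy parameters and searches over them with a simplified evolution strategy standing in for CMA-ES. The toy planner, the loss, and all names here are hypothetical illustrations, not the paper's actual method or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def plan(w, T=20):
    """Toy 'planner': a trajectory whose shape is controlled by weights w."""
    t = np.linspace(0.0, 1.0, T)
    return w[0] * t + w[1] * np.sin(np.pi * t)

def imitation_loss(w, demo):
    """Direct loss: distance between planned and demonstrated trajectories."""
    return np.linalg.norm(plan(w) - demo)

def es_minimize(loss, dim=2, iters=50, pop=20, sigma=0.5):
    """Simplified (mu, lambda) evolution strategy standing in for CMA-ES."""
    mean = np.zeros(dim)
    for _ in range(iters):
        # Sample candidate weight vectors around the current mean.
        samples = mean + sigma * rng.standard_normal((pop, dim))
        scores = np.array([loss(s) for s in samples])
        # Recombine the best quarter of the population.
        elite = samples[np.argsort(scores)[: pop // 4]]
        mean = elite.mean(axis=0)   # move toward low-loss candidates
        sigma *= 0.95               # shrink the search radius over time
    return mean

demo = plan(np.array([1.0, -0.5]))                      # synthetic demonstration
w_hat = es_minimize(lambda w: imitation_loss(w, demo))  # recovered weights
```

Because the search only needs loss evaluations of rolled-out trajectories, the same loop works even when the planner is a black box, which is the flexibility the abstract emphasizes for noisy or partial demonstrations.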

Author(s): Doerr, Andreas and Ratliff, Nathan and Bohg, Jeannette and Toussaint, Marc and Schaal, Stefan
Book Title: Proceedings of Robotics: Science and Systems
Year: 2015
Month: July
Bibtex Type: Conference Paper (inproceedings)
Address: Rome, Italy
Event Name: Robotics: Science and Systems XI
State: Published
Electronic Archiving: grant_archive

BibTeX

@inproceedings{Doerr-RSS-15,
  title = {Direct Loss Minimization Inverse Optimal Control},
  booktitle = {Proceedings of Robotics: Science and Systems},
  abstract = {Inverse Optimal Control (IOC) has strongly impacted the systems engineering process, enabling automated planner tuning through straightforward and intuitive demonstration. The most successful and established applications, though, have been in lower-dimensional problems such as navigation planning, where exact optimal planning or control is feasible. In higher-dimensional systems, such as humanoid robots, research has made substantial progress toward generalizing the ideas to model-free or locally optimal settings, but these systems are complicated to the point where demonstration itself can be difficult. Typically, real-world applications are restricted to demonstrations that are at best noisy, and often partial or incomplete, which proves cumbersome in existing frameworks. This work derives a very flexible method of IOC based on a form of Structured Prediction known as Direct Loss Minimization. The resulting algorithm is essentially Policy Search on a reward function that rewards similarity to demonstrated behavior (using Covariance Matrix Adaptation (CMA) in our experiments). Our framework blurs the distinction between IOC, other forms of Imitation Learning, and Reinforcement Learning, enabling us to derive simple, versatile, and practical algorithms that blend imitation and reinforcement signals into a unified framework. Our experiments analyze various aspects of its performance and demonstrate its efficacy in conveying preferences for motion shaping and combined reach and grasp quality optimization.},
  address = {Rome, Italy},
  month = jul,
  year = {2015},
  slug = {doerr-rss-15},
  author = {Doerr, Andreas and Ratliff, Nathan and Bohg, Jeannette and Toussaint, Marc and Schaal, Stefan},
  month_numeric = {7}
}