On the Design of LQR Kernels for Efficient Controller Learning

Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.
Author(s): | Alonso Marco and Philipp Hennig and Stefan Schaal and Sebastian Trimpe |
Book Title: | Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC) |
Pages: | 5193--5200 |
Year: | 2017 |
Month: | December |
Day: | 12-15 |
Publisher: | IEEE |
Bibtex Type: | Conference Paper (conference) |
DOI: | 10.1109/CDC.2017.8264429 |
Event Name: | IEEE Conference on Decision and Control |
Event Place: | Melbourne, VIC, Australia |
State: | Published |
BibTeX
@conference{MaHeScTr17,
  title = {On the Design of {LQR} Kernels for Efficient Controller Learning},
  author = {Marco, Alonso and Hennig, Philipp and Schaal, Stefan and Trimpe, Sebastian},
  booktitle = {Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC)},
  abstract = {Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.},
  pages = {5193--5200},
  publisher = {IEEE},
  month = dec,
  year = {2017},
  slug = {mahesctr17},
  month_numeric = {12}
}