Offline Diversity Maximization under Imitation Constraints

Autonomous Learning Article 2023

Autonomous Learning

Marin Vlastelica Pogancic

Autonomous Learning

Jin Cheng

Intern

Empirical Inference, Autonomous Learning

Georg Martius

Senior Research Scientist

Autonomous Learning

Pavel Kolev

There has been significant recent progress in the area of unsupervised skill discovery, utilizing various information-theoretic objectives as measures of diversity. Despite these advances, challenges remain: current methods require significant online interaction, fail to leverage vast amounts of available task-agnostic data and typically lack a quantitative measure of skill utility. We address these challenges by proposing a principled offline algorithm for unsupervised skill discovery that, in addition to maximizing diversity, ensures that each learned skill imitates state-only expert demonstrations to a certain degree. Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery to maximize a mutual information objective subject to KL-divergence state occupancy constraints. Furthermore, we demonstrate the effectiveness of our method on the standard offline benchmark D4RL and on a custom offline dataset collected from a 12-DoF quadruped robot for which the policies trained in simulation transfer well to the real robotic system.
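To make the two quantities named in the abstract concrete, the following is a minimal sketch on toy discrete distributions. It is not the authors' algorithm, which works offline through Fenchel duality and reinforcement learning; it only evaluates a mutual-information diversity measure and a KL-divergence imitation check for hand-specified state occupancies. The function names, the constraint direction KL(d_z || d_expert), the threshold `epsilon`, and all numbers are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities.

    Assumes q has full support wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mutual_information(skill_prior, occupancies):
    """I(S; Z) = sum_z p(z) * KL(d_z || d_mix), with d_mix = sum_z p(z) * d_z."""
    n_states = len(occupancies[0])
    mixture = [
        sum(pz * d[s] for pz, d in zip(skill_prior, occupancies))
        for s in range(n_states)
    ]
    return sum(
        pz * kl_divergence(d, mixture)
        for pz, d in zip(skill_prior, occupancies)
    )


def satisfies_imitation_constraint(occupancies, expert, epsilon):
    """True if every skill's occupancy is within epsilon (in KL) of the expert's."""
    return all(kl_divergence(d, expert) <= epsilon for d in occupancies)


# Hypothetical toy problem: three states, two skills, uniform skill prior.
expert = [0.4, 0.4, 0.2]
skills = [[0.6, 0.3, 0.1], [0.3, 0.6, 0.1]]
prior = [0.5, 0.5]

diversity = mutual_information(prior, skills)
feasible = satisfies_imitation_constraint(skills, expert, epsilon=0.1)
print(f"I(S; Z) = {diversity:.4f}, constraint satisfied: {feasible}")
```

In this toy setting, identical skills give zero mutual information, while skills that spread over different states raise it; tightening `epsilon` forces every skill closer to the expert occupancy and thus limits how diverse the skills can be.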

Author(s):	Vlastelica, Marin and Cheng, Jin and Martius, Georg and Kolev, Pavel
Book Title:	Offline Diversity Maximization under Imitation Constraints
Journal:	Reinforcement Learning Journal
Volume:	3
Pages:	1377-1409
Year:	2023
Month:	July
Day:	21

Project(s):	Offline Diversity Under Imitation Constraints
Bibtex Type:	Article (article)

DOI:	https://doi.org/10.5281/zenodo.13899776
State:	Published
URL:	https://api.semanticscholar.org/CorpusID:264709383

BibTex

@article{vlastelica2023:OfflineDM,
  title = {Offline Diversity Maximization under Imitation Constraints},
  journal = {Reinforcement Learning Journal},
  booktitle = {Offline Diversity Maximization under Imitation Constraints},
  abstract = {There has been significant recent progress in the area of unsupervised skill discovery, utilizing various information-theoretic objectives as measures of diversity. Despite these advances, challenges remain: current methods require significant online interaction, fail to leverage vast amounts of available task-agnostic data and typically lack a quantitative measure of skill utility. We address these challenges by proposing a principled offline algorithm for unsupervised skill discovery that, in addition to maximizing diversity, ensures that each learned skill imitates state-only expert demonstrations to a certain degree. Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery to maximize a mutual information objective subject to KL-divergence state occupancy constraints. Furthermore, we demonstrate the effectiveness of our method on the standard offline benchmark D4RL and on a custom offline dataset collected from a 12-DoF quadruped robot for which the policies trained in simulation transfer well to the real robotic system.},
  volume = {3},
  pages = {1377-1409},
  month = jul,
  year = {2023},
  slug = {vlastelica2023-offlinedm},
  author = {Vlastelica, Marin and Cheng, Jin and Martius, Georg and Kolev, Pavel},
  url = {https://api.semanticscholar.org/CorpusID:264709383},
  month_numeric = {7}
}
