Autonomous Learning Conference Paper 2023

Regularity as Intrinsic Reward for Free Play


We propose regularity as a novel reward signal for intrinsically-motivated reinforcement learning. Taking inspiration from child development, we postulate that striving for structure and order helps guide exploration towards a subspace of tasks that are not favored by naive uncertainty-based intrinsic rewards. Our generalized formulation of Regularity as Intrinsic Reward (RaIR) allows us to operationalize it within model-based reinforcement learning. In a synthetic environment, we showcase the plethora of structured patterns that can emerge from pursuing this regularity objective. We also demonstrate the strength of our method in a multi-object robotic manipulation environment. We incorporate RaIR into free play and use it to complement the model’s epistemic uncertainty as an intrinsic reward. Doing so, we witness the autonomous construction of towers and other regular structures during free play, which leads to a substantial improvement in zero-shot downstream task performance on assembly tasks.
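As a concrete illustration of the regularity objective described in the abstract, here is a minimal sketch that scores an object configuration by the negative entropy of its discretized pairwise distances, so that more regular (repetitive) arrangements receive higher reward. This is an illustrative reconstruction under that entropy-based assumption, not the authors' implementation; the function name and binning parameter are hypothetical.

```python
import numpy as np
from collections import Counter

def rair_reward(positions, bin_size=0.1):
    """Hypothetical regularity reward: negative entropy of the
    distribution over discretized pairwise object distances.
    Regular layouts repeat few distance values -> low entropy -> high reward.
    """
    n = len(positions)
    # Collect all pairwise distances, discretized into bins.
    symbols = []
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            symbols.append(round(d / bin_size))
    # Empirical distribution over distance symbols.
    counts = Counter(symbols)
    probs = np.array([c / len(symbols) for c in counts.values()])
    entropy = -np.sum(probs * np.log(probs))
    return -entropy  # higher for more regular configurations

# Usage: four evenly spaced objects score higher than an irregular scatter.
line = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
scatter = np.array([[0.0, 0.0], [0.37, 0.0], [1.91, 0.0], [5.2, 0.0]])
print(rair_reward(line) > rair_reward(scatter))
```

In a model-based setup as described above, such a reward would be evaluated on predicted future states during planning, optionally combined with the model's epistemic uncertainty.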

Author(s): Sancaktar, Cansu and Piater, Justus and Martius, Georg
Book Title: Advances in Neural Information Processing Systems (NeurIPS)
Year: 2023
Month: September
Day: 21
BibTeX Type: Conference Paper (inproceedings)
Event Name: Advances in Neural Information Processing Systems 36
Event Place: New Orleans, USA
State: Published
URL: https://openreview.net/forum?id=BHHrX3CRE1

BibTeX

@inproceedings{sancaktar2023:regularity,
  title = {Regularity as Intrinsic Reward for Free Play},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  abstract = {We propose regularity as a novel reward signal for intrinsically-motivated reinforcement learning. Taking inspiration from child development, we postulate that striving for structure and order helps guide exploration towards a subspace of tasks that are not favored by naive uncertainty-based intrinsic rewards. Our generalized formulation of Regularity as Intrinsic Reward (RaIR) allows us to operationalize it within model-based reinforcement learning. In a synthetic environment, we showcase the plethora of structured patterns that can emerge from pursuing this regularity objective. We also demonstrate the strength of our method in a multi-object robotic manipulation environment. We incorporate RaIR into free play and use it to complement the model’s epistemic uncertainty as an intrinsic reward. Doing so, we witness the autonomous construction of towers and other regular structures during free play, which leads to a substantial improvement in zero-shot downstream task performance on assembly tasks.},
  month = sep,
  year = {2023},
  slug = {sancaktar2023-regularity},
  author = {Sancaktar, Cansu and Piater, Justus and Martius, Georg},
  url = {https://openreview.net/forum?id=BHHrX3CRE1},
  month_numeric = {9}
}