Autonomous Learning Conference Paper 2023

Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning

In off-policy deep reinforcement learning with continuous action spaces, exploration is often implemented by injecting action noise into the action selection process. Popular algorithms based on stochastic policies, such as SAC or MPO, inject white noise by sampling actions from uncorrelated Gaussian distributions. In many tasks, however, white noise does not provide sufficient exploration, and temporally correlated noise is used instead. A common choice is Ornstein-Uhlenbeck (OU) noise, which is closely related to Brownian motion (red noise). Both red noise and white noise belong to the broad family of colored noise. In this work, we perform a comprehensive experimental evaluation on MPO and SAC to explore the effectiveness of other colors of noise as action noise. We find that pink noise, which is halfway between white and red noise, significantly outperforms white noise, OU noise, and other alternatives on a wide range of environments. Thus, we recommend it as the default choice for action noise in continuous control.
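The abstract describes colored noise as a family that contains white noise, red noise, and pink noise lying halfway between the two. As an illustration only, the following is a minimal sketch of one standard way to sample such noise: shape a random spectrum so that power falls off as 1/f^beta, with beta = 0 for white, beta = 1 for pink, and beta = 2 for red noise. The function name `colored_noise`, its parameters, and the unit-variance normalization are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def colored_noise(beta, n_steps, rng=None):
    """Sample n_steps values whose power spectrum falls off roughly as 1/f**beta.

    Hypothetical helper for illustration: beta=0 gives white noise,
    beta=1 pink noise, beta=2 red (Brownian-like) noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.fft.rfftfreq(n_steps)  # non-negative frequencies, freqs[0] == 0

    # Amplitude scales as f^(-beta/2) so that power scales as f^(-beta).
    # The zero-frequency (DC) term is dropped, which makes the signal zero-mean.
    scale = np.zeros_like(freqs)
    scale[1:] = freqs[1:] ** (-beta / 2.0)

    # Random complex spectrum with independent Gaussian real and imaginary parts.
    spectrum = scale * (
        rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
    )

    # Back to the time domain, then normalize to unit variance (an assumption
    # made here so that different colors are comparable in scale).
    signal = np.fft.irfft(spectrum, n=n_steps)
    return signal / signal.std()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for name, beta in [("white", 0.0), ("pink", 1.0), ("red", 2.0)]:
        x = colored_noise(beta, 4096, rng)
        lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]  # temporal correlation at lag 1
        print(f"{name}: lag-1 autocorrelation = {lag1:.3f}")
```

In an agent, one option would be to draw one such sequence per action dimension for each episode and use it in place of the independent Gaussian samples; how the noise is scaled and combined with the policy output is left open here.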

Author(s): Onno Eberhard and Jakob Hollenstein and Cristina Pinneri and Georg Martius
Book Title: Proceedings of the Eleventh International Conference on Learning Representations (ICLR)
Year: 2023
Month: May
Bibtex Type: Conference Paper (inproceedings)
Event Name: The Eleventh International Conference on Learning Representations (ICLR)
Event Place: Kigali, Rwanda
URL: https://openreview.net/forum?id=hQ9V5QN27eS
Electronic Archiving: grant_archive
Talk Type: Spotlight (notable top 25%)

BibTeX

@inproceedings{EberhardEtal2023:PinkNoise,
  title = {Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning},
  booktitle = {Proceedings of the Eleventh International Conference on Learning Representations (ICLR)},
  abstract = {In off-policy deep reinforcement learning with continuous action spaces, exploration is often implemented by injecting action noise into the action selection process. Popular algorithms based on stochastic policies, such as SAC or MPO, inject white noise by sampling actions from uncorrelated Gaussian distributions. In many tasks, however, white noise does not provide sufficient exploration, and temporally correlated noise is used instead. A common choice is Ornstein-Uhlenbeck (OU) noise, which is closely related to Brownian motion (red noise). Both red noise and white noise belong to the broad family of colored noise. In this work, we perform a comprehensive experimental evaluation on MPO and SAC to explore the effectiveness of other colors of noise as action noise. We find that pink noise, which is halfway between white and red noise, significantly outperforms white noise, OU noise, and other alternatives on a wide range of environments. Thus, we recommend it as the default choice for action noise in continuous control.},
  month = may,
  year = {2023},
  slug = {eberhardetal2023-pinknoise},
  author = {Eberhard, Onno and Hollenstein, Jakob and Pinneri, Cristina and Martius, Georg},
  url = {https://openreview.net/forum?id=hQ9V5QN27eS},
  month_numeric = {5}
}