
Backpropagation through Combinatorial Algorithms: Identity with Projection Works


Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined; therefore, a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach that exploits the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass, and we provide a theoretical justification. Our experiments demonstrate that such a straightforward, hyper-parameter-free approach competes with previous, more complex methods on numerous tasks, such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we replace the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.
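To make the idea concrete, here is a minimal sketch (not the authors' reference implementation) of the negative-identity backward pass, assuming PyTorch and a hypothetical black-box solver that maps a cost vector to a discrete solution of the same shape; the toy solver and the unit-norm projection below are illustrative stand-ins.

import torch

class NegativeIdentity(torch.autograd.Function):
    """Forward: run the combinatorial solver. Backward: negative identity."""

    @staticmethod
    def forward(ctx, cost, solver):
        # The solver is non-differentiable; run it outside the autograd graph.
        with torch.no_grad():
            solution = solver(cost)
        return solution

    @staticmethod
    def backward(ctx, grad_output):
        # Treat the solver as -I: pass the incoming gradient through with a
        # sign flip (for a minimization solver, lowering a component's cost
        # makes it more likely to enter the solution). No extra solver call,
        # no hyper-parameters.
        return -grad_output, None

# Toy usage: a hypothetical "solver" that selects the cheapest half of the
# components. Projecting the costs onto the unit sphere (normalization)
# stands in here for the paper's generic regularization against cost collapse.
solver = lambda c: (c <= c.median()).float()
cost = torch.randn(10, requires_grad=True)
solution = NegativeIdentity.apply(cost / cost.norm(), solver)
solution.sum().backward()  # gradients reach `cost` via the negative identity

Note that the backward pass touches the solver not at all: the gradient is simply negated and routed through the (differentiable) cost projection, which is what makes the approach hyper-parameter-free and free of additional solver calls.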

Author(s): Subham Sahoo and Anselm Paulus and Marin Vlastelica and Vít Musil and Volodymyr Kuleshov and Georg Martius
Book Title: Proceedings of the Eleventh International Conference on Learning Representations
Year: 2023
Month: May
Day: 1-5
Bibtex Type: Conference Paper (inproceedings)
Event Place: Kigali, Rwanda
State: Accepted
URL: https://openreview.net/forum?id=JZMR727O29

BibTeX

@inproceedings{SahooPaulus2023:Identity,
  title = {Backpropagation through Combinatorial Algorithms: Identity with Projection Works},
  booktitle = {Proceedings of the Eleventh International Conference on Learning Representations},
  abstract = {Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities.
  The derivative of these solvers is zero or undefined; therefore, a meaningful replacement is crucial for effective gradient-based learning.
  Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance.
  We propose a principled approach that exploits the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass, and we provide a theoretical justification.
  Our experiments demonstrate that such a straightforward, hyper-parameter-free approach competes with previous, more complex methods on numerous tasks, such as backpropagation through discrete samplers, deep graph matching, and image retrieval.
  Furthermore, we replace the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.},
  month = may,
  year = {2023},
  slug = {sahoopaulus2023-identity},
  author = {Sahoo, Subham and Paulus, Anselm and Vlastelica, Marin and Musil, Vít and Kuleshov, Volodymyr and Martius, Georg},
  url = {https://openreview.net/forum?id=JZMR727O29},
  month_numeric = {5}
}