Perceiving Systems Conference Paper 2021

Learning To Regress Bodies From Images Using Differentiable Semantic Rendering


Learning to regress 3D human body shape and pose (e.g. SMPL parameters) from monocular images typically exploits losses on 2D keypoints, silhouettes, and/or part segmentation when 3D training data is not available. Such losses, however, are limited because 2D keypoints do not supervise body shape, and segmentations of people in clothing do not match projected minimally-clothed SMPL shapes. To exploit richer image information about clothed people, we introduce higher-level semantic information about clothing to penalize clothed and non-clothed regions of the human body differently. To do so, we train a body regressor using a novel “Differentiable Semantic Rendering (DSR)” loss. For Minimally-Clothed (MC) regions, we define the DSR-MC loss, which encourages a tight match between a rendered SMPL body and the minimally-clothed regions of the image. For clothed regions, we define the DSR-C loss to encourage the rendered SMPL body to be inside the clothing mask. To ensure end-to-end differentiable training, we learn a semantic clothing prior for SMPL vertices from thousands of clothed human scans. We perform extensive qualitative and quantitative experiments to evaluate the role of clothing semantics on the accuracy of 3D human pose and shape estimation. We outperform all previous state-of-the-art methods on 3DPW and Human3.6M and obtain on-par results on MPI-INF-3DHP. Code and trained models are available for research at https://dsr.is.tue.mpg.de/
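The abstract describes two loss terms: DSR-MC, a tight per-pixel match on minimally-clothed regions, and DSR-C, a weaker constraint that the rendered body lies inside the clothing mask. The following is a minimal sketch of how such terms could be written, assuming the per-vertex clothing semantics have already been rendered to a soft image by a differentiable renderer. The function names, tensor shapes, and exact loss forms (binary cross-entropy for DSR-MC, a one-sided penalty for DSR-C) are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of the two DSR loss terms described in the abstract.
# dsr_mc_loss / dsr_c_loss and the loss weights are illustrative, not official.
import torch
import torch.nn.functional as F


def dsr_mc_loss(rendered_mc_prob, gt_mc_mask):
    """DSR-MC (sketch): tight per-pixel match between the rendered probability
    that a pixel shows a minimally-clothed body region and the image
    segmentation of those regions.

    rendered_mc_prob: (B, H, W) soft rendering of the per-vertex
        "minimally clothed" probability from a differentiable renderer.
    gt_mc_mask: (B, H, W) binary mask of minimally-clothed pixels in the image.
    """
    return F.binary_cross_entropy(
        rendered_mc_prob.clamp(1e-6, 1 - 1e-6), gt_mc_mask
    )


def dsr_c_loss(rendered_silhouette, gt_clothing_mask):
    """DSR-C (sketch): encourage the rendered SMPL body to lie *inside* the
    clothed region; only body pixels falling outside the clothing mask are
    penalized, so loose clothing is not forced onto the minimal body shape."""
    outside = rendered_silhouette * (1.0 - gt_clothing_mask)
    return outside.mean()


# Toy usage with random tensors standing in for renderer / segmentation output.
B, H, W = 2, 64, 64
rendered_mc = torch.rand(B, H, W, requires_grad=True)
rendered_sil = torch.rand(B, H, W, requires_grad=True)
gt_mc = (torch.rand(B, H, W) > 0.5).float()
gt_cloth = (torch.rand(B, H, W) > 0.3).float()

loss = dsr_mc_loss(rendered_mc, gt_mc) + 0.1 * dsr_c_loss(rendered_sil, gt_cloth)
loss.backward()  # gradients flow back through the renderer to SMPL parameters
```

In the paper's pipeline the rendered inputs come from differentiably rasterizing the SMPL mesh with its learned per-vertex clothing prior; here they are stand-in tensors so the sketch runs on its own.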

Author(s): Dwivedi, Sai Kumar and Athanasiou, Nikos and Kocabas, Muhammed and Black, Michael J.
Book Title: Proc. International Conference on Computer Vision (ICCV)
Pages: 11230--11239
Year: 2021
Month: October
Publisher: IEEE
Bibtex Type: Conference Paper (inproceedings)
Address: Piscataway, NJ
DOI: 10.1109/ICCV48922.2021.01106
Event Name: International Conference on Computer Vision 2021
Event Place: virtual (originally Montreal, Canada)
State: Published
Digital: True
ISBN: 978-1-6654-2812-5

BibTeX

@inproceedings{DSR:ICCV:2021,
  title = {Learning To Regress Bodies From Images Using Differentiable Semantic Rendering},
  booktitle = {Proc. International Conference on Computer Vision (ICCV)},
  abstract = {Learning to regress 3D human body shape and pose (e.g. SMPL parameters) from monocular images typically exploits losses on 2D keypoints, silhouettes, and/or part segmentation when 3D training data is not available. Such losses, however, are limited because 2D keypoints do not supervise body shape, and segmentations of people in clothing do not match projected minimally-clothed SMPL shapes. To exploit richer image information about clothed people, we introduce higher-level semantic information about clothing to penalize clothed and non-clothed regions of the human body differently. To do so, we train a body regressor using a novel “Differentiable Semantic Rendering (DSR)” loss. For Minimally-Clothed (MC) regions, we define the DSR-MC loss, which encourages a tight match between a rendered SMPL body and the minimally-clothed regions of the image. For clothed regions, we define the DSR-C loss to encourage the rendered SMPL body to be inside the clothing mask. To ensure end-to-end differentiable training, we learn a semantic clothing prior for SMPL vertices from thousands of clothed human scans. We perform extensive qualitative and quantitative experiments to evaluate the role of clothing semantics on the accuracy of 3D human pose and shape estimation. We outperform all previous state-of-the-art methods on 3DPW and Human3.6M and obtain on-par results on MPI-INF-3DHP. Code and trained models are available for research at https://dsr.is.tue.mpg.de/},
  pages = {11230--11239},
  publisher = {IEEE},
  address = {Piscataway, NJ},
  month = oct,
  year = {2021},
  slug = {dsr-iccv-2021},
  author = {Dwivedi, Sai Kumar and Athanasiou, Nikos and Kocabas, Muhammed and Black, Michael J.},
  month_numeric = {10}
}