Perceiving Systems Conference Paper 2021

SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks


We present SCANimate, an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar. These avatars are driven by pose parameters and have realistic clothing that moves and deforms naturally. SCANimate does not rely on a customized mesh template or surface mesh registration. We observe that fitting a parametric 3D body model, like SMPL, to a clothed human scan is tractable while surface registration of the body topology to the scan is often not, because clothing can deviate significantly from the body shape. We also observe that articulated transformations are invertible, resulting in geometric cycle-consistency in the posed and unposed shapes. These observations lead us to a weakly supervised learning method that aligns scans into a canonical pose by disentangling articulated deformations without template-based surface registration. Furthermore, to complete missing regions in the aligned scans while modeling pose-dependent deformations, we introduce a locally pose-aware implicit function that learns to complete and model geometry with learned pose correctives. In contrast to commonly used global pose embeddings, our local pose conditioning significantly reduces long-range spurious correlations and improves generalization to unseen poses, especially when training data is limited. Our method can be applied to pose-aware appearance modeling to generate a fully textured avatar. We demonstrate our approach on various clothing types with different amounts of training data, outperforming existing solutions and other variants in terms of fidelity and generality in every setting. The code is available at https://scanimate.is.tue.mpg.de.
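The cycle-consistency observation above rests on a standard property of linear blend skinning (LBS): a canonical point is posed by a skinning-weighted blend of per-joint rigid transforms, and because the blended transform is invertible, applying its inverse to the posed point recovers the canonical point. The following minimal sketch (not the authors' code; joint transforms and skinning weights here are made-up examples) illustrates this round trip:

```python
import numpy as np

def rigid(angle, t):
    """4x4 rigid transform: rotation about z by `angle`, translation `t`."""
    c, s = np.cos(angle), np.sin(angle)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    T[:3, 3] = t
    return T

def lbs(x, weights, transforms):
    """Pose a 3D point by a skinning-weighted blend of joint transforms."""
    A = sum(w * T for w, T in zip(weights, transforms))  # blended 4x4
    return (A @ np.append(x, 1.0))[:3], A

# Two hypothetical joints with example skinning weights (summing to 1).
transforms = [rigid(0.3, [0.1, 0.0, 0.0]), rigid(-0.2, [0.0, 0.2, 0.0])]
weights = [0.7, 0.3]

x_canonical = np.array([0.5, 0.1, 0.2])
x_posed, A = lbs(x_canonical, weights, transforms)

# Cycle consistency: inverting the blended transform un-poses the point,
# recovering the canonical point exactly.
x_unposed = (np.linalg.inv(A) @ np.append(x_posed, 1.0))[:3]
assert np.allclose(x_unposed, x_canonical)
```

In SCANimate the skinning weights themselves are predicted by networks rather than fixed as here, and this invertibility is what supplies the weak supervision signal for canonicalizing scans without surface registration.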

Award: Candidate for Best Paper Award
Author(s): Shunsuke Saito and Jinlong Yang and Qianli Ma and Michael J. Black
Book Title: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)
Pages: 2885--2896
Year: 2021
Month: June
Publisher: IEEE
Bibtex Type: Conference Paper (inproceedings)
Address: Piscataway, NJ
DOI: 10.1109/CVPR46437.2021.00291
Event Name: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)
Event Place: Virtual
State: Published
URL: https://scanimate.is.tue.mpg.de
Award Paper: Candidate for Best Paper Award
Electronic Archiving: grant_archive
ISBN: 978-1-6654-4510-8

BibTeX

@inproceedings{Saito:CVPR:2021,
  title = {{SCANimate}: Weakly Supervised Learning of Skinned Clothed Avatar Networks},
  award_paper = {Candidate for Best Paper Award},
  booktitle = {2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)},
  abstract = {We present SCANimate, an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar. These avatars are driven by pose parameters and have realistic clothing that moves and deforms naturally. SCANimate does not rely on a customized mesh template or surface mesh registration. We observe that fitting a parametric 3D body model, like SMPL, to a clothed human scan is tractable while surface registration of the body topology to the scan is often not, because clothing can deviate significantly from the body shape. We also observe that articulated transformations are invertible, resulting in geometric cycle-consistency in the posed and unposed shapes. These observations lead us to a weakly supervised learning method that aligns scans into a canonical pose by disentangling articulated deformations without template-based surface registration. Furthermore, to complete missing regions in the aligned scans while modeling pose-dependent deformations, we introduce a locally pose-aware implicit function that learns to complete and model geometry with learned pose correctives. In contrast to commonly used global pose embeddings, our local pose conditioning significantly reduces long-range spurious correlations and improves generalization to unseen poses, especially when training data is limited. Our method can be applied to pose-aware appearance modeling to generate a fully textured avatar. We demonstrate our approach on various clothing types with different amounts of training data, outperforming existing solutions and other variants in terms of fidelity and generality in every setting. The code is available at https://scanimate.is.tue.mpg.de.
  },
  pages = {2885--2896},
  publisher = {IEEE},
  address = {Piscataway, NJ},
  month = jun,
  year = {2021},
  slug = {saito-cvpr-2021},
  author = {Saito, Shunsuke and Yang, Jinlong and Ma, Qianli and Black, Michael J.},
  url = {https://scanimate.is.tue.mpg.de},
  month_numeric = {6}
}