PuzzleAvatar: Assembling 3D Avatars from Personal Albums

Given a casual photo collection containing diverse poses, viewpoints, and crops, we create an animatable avatar. PuzzleAvatar bypasses the challenging problem of body and camera pose estimation by fine-tuning a vision-language model (VLM) to encode the appearance, identity, garments, hairstyles, and accessories of a person into separate learned tokens, which we exploit as "puzzle pieces" to assemble a personalized 3D avatar.

Publications

PuzzleAvatar: Assembling 3D Avatars from Personal Albums. Xiu, Y., Liu, Z., Tzionas, D., Black, M. J. ACM Transactions on Graphics, 43(6):1-15, ACM, December 2024 (Published).