Text-Driven 3D Modeling of Avatars

Generating 3D objects poses notable challenges because annotated 3D datasets are far scarcer than their 2D counterparts. Current approaches often resort to models trained on 2D data, which require prolonged per-instance optimization. Conversely, models trained directly on 3D datasets enable optimization-free inference but suffer from limited dataset diversity. This talk explores methodologies for generative 3D modeling of human heads and garments, both pivotal for human avatar creation. First, we introduce "CLIP-Head," a text-driven model that generates textured heads in the NPHM (Neural Parametric Head Models) representation, bypassing expensive optimization and producing textured 3D heads directly from text prompts. Second, we briefly discuss "WordRobe," a text-to-garment generation framework that learns a latent space of garments. WordRobe produces open-surface garment meshes ready for standard graphics pipelines, along with consistent texture maps. This approach paves the way for text-driven garment design and virtual try-on applications.
Speaker Biography
Pranav Manu (Centre for Visual Information Technology (CVIT) at IIIT Hyderabad, India)
Pranav is a Master's student at CVIT, IIIT Hyderabad, working under the supervision of Dr. Avinash Sharma and Dr. P. J. Narayanan. His research focuses primarily on 3D and 4D human reconstruction and generation, particularly of human heads. He has also worked on text-driven generation of textured 3D heads and garments.