Perceiving Systems

VOCA: Capture, Learning, and Synthesis of 3D Speaking Styles


VOCA (Voice Operated Character Animation) is a framework that takes a speech signal as input and realistically animates a wide range of adult faces.

Code: We provide Python demo code that outputs a 3D head animation given a speech signal and a static 3D head mesh. The codebase further provides animation control to alter the speaking style, identity-dependent facial shape, and head pose (i.e. head rotation around the neck) during animation. The code also demonstrates how to sample 3D head meshes from the publicly available FLAME model, which can then be animated with the provided code.

Dataset: We capture a unique 4D face dataset (VOCASET) with about 29 minutes of 3D scans captured at 60 fps and synchronized audio from 12 speakers. We provide the raw 3D scans, registrations in FLAME topology, and unposed registrations (i.e. registrations in "zero pose").
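To illustrate the idea described above, a minimal sketch follows. This is not the actual VOCA API; all names, shapes, and the dummy model are hypothetical. The core concept is that a learned model maps per-frame speech features to per-vertex displacements, which are added to a static template head mesh to produce the animation:

```python
import numpy as np

# Hypothetical sketch of a VOCA-style pipeline: speech features drive
# per-frame vertex displacements added to a static template head mesh.
# Names and the dummy model below are illustrative, not the real API.

NUM_VERTICES = 5023  # the FLAME mesh topology has 5023 vertices


def animate(template, speech_features, predict_offsets):
    """Return per-frame meshes: template plus predicted displacements.

    template:        (NUM_VERTICES, 3) static 3D head mesh
    speech_features: (num_frames, feature_dim) audio features
    predict_offsets: model mapping one feature vector to (NUM_VERTICES, 3)
    """
    frames = []
    for feat in speech_features:
        offsets = predict_offsets(feat)    # learned speech-to-motion mapping
        frames.append(template + offsets)  # displace the static mesh
    return np.stack(frames)                # (num_frames, NUM_VERTICES, 3)


# Stand-in for the trained network (predicts zero motion).
def dummy_model(feat):
    return np.zeros((NUM_VERTICES, 3))


template = np.zeros((NUM_VERTICES, 3))      # placeholder template mesh
features = np.random.randn(60, 16)          # e.g. one second at 60 fps
animation = animate(template, features, dummy_model)
print(animation.shape)  # (60, 5023, 3)
```

Because identity lives entirely in the template mesh, the same predicted motion can animate any head sampled from FLAME, which is what makes the demo's identity and speaking-style controls possible.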

Release Date: 01 May 2019
License: The MIT License
Copyright: Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V.
Authors: Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, and Michael Black
Link (URL): https://voca.is.tue.mpg.de
Repository: https://github.com/TimoBolkart/voca