Perceiving Systems

VOCA: Capture, Learning, and Synthesis of 3D Speaking Styles


VOCA (Voice Operated Character Animation) is a framework that takes a speech signal as input and realistically animates a wide range of adult faces.

Code: We provide Python demo code that outputs a 3D head animation given a speech signal and a static 3D head mesh. The codebase further provides animation control to alter the speaking style, identity-dependent facial shape, and head pose (i.e. head rotation around the neck) during animation. The code also demonstrates how to sample 3D head meshes from the publicly available FLAME model, which can then be animated with the provided code.

Dataset: We capture a unique 4D face dataset (VOCASET) with about 29 minutes of 3D scans captured at 60 fps and synchronized audio from 12 speakers. We provide the raw 3D scans, registrations in FLAME topology, and unposed registrations (i.e. registrations in "zero pose").
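To illustrate the idea described above, a minimal sketch follows. This is not the actual VOCA API; all names, shapes, and the dummy model are hypothetical. The core concept is that a learned model maps per-frame speech features to per-vertex displacements, which are added to a static template head mesh to produce the animation:

```python
import numpy as np

# Hypothetical sketch of a VOCA-style pipeline: speech features drive
# per-frame vertex displacements added to a static template head mesh.
# Names and the dummy model below are illustrative, not the real API.

NUM_VERTICES = 5023  # the FLAME mesh topology has 5023 vertices


def animate(template, speech_features, predict_offsets):
    """Return per-frame meshes: template plus predicted displacements.

    template:        (NUM_VERTICES, 3) static 3D head mesh
    speech_features: (num_frames, feature_dim) audio features
    predict_offsets: model mapping one feature vector to (NUM_VERTICES, 3)
    """
    frames = []
    for feat in speech_features:
        offsets = predict_offsets(feat)    # learned speech-to-motion mapping
        frames.append(template + offsets)  # displace the static mesh
    return np.stack(frames)                # (num_frames, NUM_VERTICES, 3)


# Stand-in for the trained network (predicts zero motion).
def dummy_model(feat):
    return np.zeros((NUM_VERTICES, 3))


template = np.zeros((NUM_VERTICES, 3))      # placeholder template mesh
features = np.random.randn(60, 16)          # e.g. one second at 60 fps
animation = animate(template, features, dummy_model)
print(animation.shape)  # (60, 5023, 3)
```

Because identity lives entirely in the template mesh, the same predicted motion can animate any head sampled from FLAME, which is what makes the demo's identity and speaking-style controls possible.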

Release Date: 01 May 2019
License: The MIT License
Copyright: Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V.
Authors: Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, and Michael Black
Link (URL): https://voca.is.tue.mpg.de
Repository: https://github.com/TimoBolkart/voca