Toward Reconstructing Face from Voice

ORGANIZERS

Perceiving Systems

Timo Bolkart

Research Scientist

We address a new challenge posed by voice profiling: reconstructing someone’s face from their voice. Specifically, given an audio clip spoken by an unseen person, we aim to reconstruct a face that matches the speaker's identity as closely as possible. In this talk, I will describe how we approach this goal step by step. First, we investigate the audio-visual association by matching voices to faces based on identity, and vice versa. Second, we set up a baseline for reconstructing 2D face images from a voice recording and show reasonable reconstruction results. Finally, we extend the generated face from a 2D image to a 3D face to discover more specific associations between voices and facial geometry.

Speaker Biography

Yandong Wen (Carnegie Mellon University)

Ph.D. candidate

Yandong Wen is a Ph.D. candidate at Carnegie Mellon University. Before that, he obtained his bachelor's and master's degrees in Electronic and Information Engineering from South China University of Technology. In summer and fall 2020, he was a research intern at Facebook Reality Labs. His current research interests are deep learning for face recognition, audio-visual association learning, and 3D face reconstruction.
