Perceiving Systems Talk Biography
05 October 2021 at 14:00 - 14:45 | Zoom meeting

Toward Reconstructing Face from Voice

Yandong Wen

We address a new challenge posed by voice profiling: reconstructing someone’s face from their voice. Specifically, given an audio clip spoken by an unseen person, we aim to reconstruct a face that has as many identity-related associations with the speaker as possible. In this talk, I will describe how we approach this goal step by step. First, we investigate the audio-visual association by matching voices to faces, and faces to voices, based on identity. Second, we set up a baseline for reconstructing 2D face images from a voice recording and show reasonable reconstruction results. Finally, we extend the reconstruction from 2D face images to 3D faces to discover more specific associations between voices and facial geometry.

Speaker Biography

Yandong Wen (Carnegie Mellon University)

Ph.D. candidate

Yandong Wen is a Ph.D. candidate at Carnegie Mellon University. Before that, he obtained his master's and bachelor's degrees in Electronic and Information Engineering from South China University of Technology. In summer and fall 2020, he was a research intern at Facebook Reality Labs. His current research interests are deep learning for face recognition, audio-visual association learning, and 3D face reconstruction.