
EMAGE: Full-body Gestures from Audio

EMAGE uses a novel framework and a holistic gesture dataset to jointly generate facial expressions, body and hand movements, and global translation, all conditioned on audio. The BEAT2 dataset behind EMAGE provides 60 hours of full-body emotional behavior in SMPL-X format.

Publications

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling. Liu, H., Zhu, Z., Becherini, G., Peng, Y., Su, M., Zhou, Y., Zhe, X., Iwamoto, N., Zheng, B., Black, M. J. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pages 1144-1154, June 2024. Conference paper (published), Perceiving Systems.