Significant progress has been made in recent years in estimating people's shape and motion from video, yet the problem remains unsolved. This is especially true in uncontrolled environments, such as streets or offices, where background clutter and occlusions make the problem even more challenging. The goal of our research is to develop computational methods that enable human pose estimation from video and inertial sensors in indoor and outdoor environments. Specifically, I will focus on one of our past projects, in which we introduce a hybrid human motion capture system that combines video input with sparse inertial sensor input. Employing a particle-based optimization scheme, our idea is to use orientation cues derived from the inertial input to sample particles from the manifold of valid poses. Additionally, we introduce a novel sensor noise model, based on the von Mises-Fisher distribution, to account for uncertainties. In this way, orientation constraints are naturally fulfilled and the number of particles needed can be kept very small. More generally, our method can be used to sample poses that fulfill arbitrary orientation or positional kinematic constraints. In the experiments, we show that our system can track even highly dynamic motions in an outdoor environment with changing illumination, background clutter, and shadows.
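To give a rough feel for the kind of noise model mentioned in the abstract, the following minimal sketch draws unit vectors from a von Mises-Fisher distribution on the sphere in 3D by inverting the cumulative distribution of the component along the mean direction. It is not the implementation used in the project: the function name `sample_vmf_s2`, the idea that `mu` stands for a direction measured by an inertial sensor, and the chosen concentration `kappa` are all assumptions made for illustration.

```python
import math
import random


def sample_vmf_s2(mu, kappa, n, rng=None):
    """Draw n unit vectors from a von Mises-Fisher distribution on the 2-sphere.

    mu    -- mean direction (any non-zero 3-vector; normalised internally)
    kappa -- concentration (> 0); larger values mean less angular noise
    Hypothetical helper for illustration only.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    rng = rng or random.Random()

    norm = math.sqrt(sum(c * c for c in mu))
    m = [c / norm for c in mu]

    # Build an orthonormal basis (t1, t2, m) around the mean direction.
    h = (1.0, 0.0, 0.0) if abs(m[0]) < 0.9 else (0.0, 1.0, 0.0)
    t1 = [h[1] * m[2] - h[2] * m[1],
          h[2] * m[0] - h[0] * m[2],
          h[0] * m[1] - h[1] * m[0]]
    t1_norm = math.sqrt(sum(c * c for c in t1))
    t1 = [c / t1_norm for c in t1]
    t2 = [m[1] * t1[2] - m[2] * t1[1],
          m[2] * t1[0] - m[0] * t1[2],
          m[0] * t1[1] - m[1] * t1[0]]

    samples = []
    for _ in range(n):
        # Inverse-CDF draw of w = <x, m>, whose density is proportional
        # to exp(kappa * w) on [-1, 1]. u lies in (0, 1], so log() is safe.
        u = 1.0 - rng.random()
        w = 1.0 + math.log(u + (1.0 - u) * math.exp(-2.0 * kappa)) / kappa
        w = max(-1.0, min(1.0, w))
        r = math.sqrt(1.0 - w * w)
        # The direction in the tangent plane is uniform on the circle.
        phi = rng.uniform(0.0, 2.0 * math.pi)
        c, s = r * math.cos(phi), r * math.sin(phi)
        samples.append(tuple(c * t1[i] + s * t2[i] + w * m[i] for i in range(3)))
    return samples


if __name__ == "__main__":
    # Assumed example: a sensor-derived direction along +z with moderate noise.
    for x in sample_vmf_s2((0.0, 0.0, 1.0), kappa=50.0, n=3, rng=random.Random(0)):
        print(x)
```

Every sample is a unit vector by construction, so particles perturbed in this way stay on the sphere of valid directions rather than needing to be projected back onto it afterwards.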
Gerard Pons-Moll (Leibniz Universität Hannover)
Short Bio: Gerard Pons-Moll was born in Barcelona in 1984. He obtained a degree in Telecommunications Engineering, with an emphasis on signal processing, from the Technical University of Catalonia (UPC). From September 2007 to July 2008 he completed his Master's thesis at Northeastern University in Boston, USA, with a fellowship from the Vodafone Foundation. Since 2009, he has been working towards his PhD degree in the TNT group at Leibniz Universität Hannover, Germany. During 2012 he did research internships at the University of Toronto and Microsoft Research, Cambridge. His research interests include computer vision, machine learning, and computer graphics, with a focus on 3D human pose and shape estimation from video and inertial sensors.