Multi-Camera Capture

While multi-camera video data facilitates markerless motion capture, many challenges remain.
We formulate the problem of 3D human pose estimation and tracking as inference in a graphical model []. The body is modeled as a collection of loosely connected body parts (a 3D pictorial structure) using an undirected graphical model in which nodes correspond to parts and edges to kinematic, penetration, and temporal constraints. These constraints are encoded as pairwise statistical distributions learned from mocap data. Human pose and motion are computed using Particle Message Passing, a form of non-parametric belief propagation that can be applied to graphical models with loops. The loose-limbed model and decentralized graph structure allow us to incorporate "bottom-up" visual cues, such as limb and head detectors, into the inference process. These detectors enable automatic initialization and aid recovery from transient tracking failures.
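The core computation in such an inference scheme is the message a part sends to a neighboring part: a weighted sum of the pairwise potential evaluated over the sender's particles. The sketch below illustrates one Particle Message Passing update on a two-part chain. It is a minimal illustration under strong simplifying assumptions: 1-D part states, hand-picked Gaussian potentials, and scalar detector responses stand in for the 6-D rigid-body poses, learned pairwise distributions, and image likelihoods of the actual model.

```python
# Minimal sketch of one Particle Message Passing (non-parametric BP) step
# on a two-part "loose-limbed" chain. All models below are assumptions
# for illustration, not the method's actual potentials.
import numpy as np

rng = np.random.default_rng(0)
N = 200  # particles per body part

def image_likelihood(x, observed):
    """Hypothetical bottom-up cue: a Gaussian around a detector response."""
    return np.exp(-0.5 * ((x - observed) / 0.5) ** 2)

def kinematic_potential(x_i, x_j):
    """Pairwise prior keeping loosely connected parts near each other."""
    return np.exp(-0.5 * ((x_i - x_j) / 0.3) ** 2)

def message(particles_i, weights_i, particles_j):
    """Message from part i to part j, evaluated at part j's particles:
    m_ij(x_j) = sum_k w_i[k] * psi(x_i[k], x_j)."""
    return np.array([
        np.sum(weights_i * kinematic_potential(particles_i, x_j))
        for x_j in particles_j
    ])

# Initialize particles for torso and limb from detector proposals.
torso = rng.normal(0.0, 1.0, N)
limb = rng.normal(0.5, 1.0, N)
w_torso = image_likelihood(torso, observed=0.1)
w_torso /= w_torso.sum()

# Belief at the limb: local image evidence times the incoming message.
belief_limb = image_likelihood(limb, observed=0.4) * message(torso, w_torso, limb)
belief_limb /= belief_limb.sum()

# Resample limb particles according to the belief (one BP iteration).
limb = limb[rng.choice(N, size=N, p=belief_limb)]
print("limb pose estimate:", limb.mean())
```

In the full model the same update runs over the whole loopy part graph, with detector-driven proposals providing the automatic initialization described above.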
Capturing the skeleton motion and detailed time-varying surface geometry of multiple, closely interacting persons is harder still, even in a multi-camera setup, due to frequent occlusions and ambiguities in assigning image features to persons. To address this, we propose a framework that exploits multi-view image segmentation []. A probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person. Given the articulated template model of each person and the labeled pixels, a combined optimization scheme, which splits skeleton pose optimization into a local problem and a lower-dimensional global one, is applied to each individual in turn, followed by surface estimation to capture detailed non-rigid deformations. Our approach accurately captures the 3D motion of humans even when they move rapidly, wear apparel, and engage in challenging multi-person motions.
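The pixel-to-person assignment can be pictured as a per-pixel posterior over person-specific models. Below is a minimal sketch under simplifying assumptions: each person is summarized by a single isotropic Gaussian color model with uniform priors, and the shape term is omitted, whereas the actual framework combines shape and appearance evidence across all camera views.

```python
# Minimal sketch of per-pixel person assignment from appearance models.
# The Gaussian color models and synthetic image are assumptions made
# for illustration only.
import numpy as np

rng = np.random.default_rng(1)

def log_gaussian(x, mean, var):
    """Log density of an isotropic Gaussian color model, per pixel."""
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var), axis=-1)

# Hypothetical appearance models for two interacting persons (RGB space).
means = np.array([[0.8, 0.2, 0.2],   # person A: reddish clothing
                  [0.2, 0.3, 0.8]])  # person B: bluish clothing
variances = np.array([0.02, 0.02])

# Stand-in input image: H x W x 3 array of colors in [0, 1].
image = rng.random((120, 160, 3))

# Per-pixel log-likelihood under each person's model, then a hard,
# unique assignment of every pixel to one person (argmax posterior).
log_liks = np.stack([log_gaussian(image, means[k], variances[k])
                     for k in range(len(means))])
labels = np.argmax(log_liks, axis=0)  # H x W map of person indices
print("pixels assigned to person A:", int((labels == 0).sum()))
```

With every pixel attributed to exactly one person, the subsequent pose and surface optimization for each individual only has to explain its own labeled pixels, which is what makes the per-person split of the optimization tractable.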