Reinforcement Learning and Control
Model-based Reinforcement Learning and Planning
Object-centric Self-supervised Reinforcement Learning
Self-exploration of Behavior
Causal Reasoning in RL
Equation Learner for Extrapolation and Control
Intrinsically Motivated Hierarchical Learner
Regularity as Intrinsic Reward for Free Play
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
Natural and Robust Walking from Generic Rewards
Goal-conditioned Offline Planning
Offline Diversity Under Imitation Constraints
Learning Diverse Skills for Local Navigation
Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations
Combinatorial Optimization as a Layer / Blackbox Differentiation
Symbolic Regression and Equation Learning
Representation Learning
Stepsize adaptation for stochastic optimization
Probabilistic Neural Networks
Learning with 3D rotations: A hitchhiker’s guide to SO(3)
Human Pose, Shape and Action

Human pose estimation, 3D mesh registration, and action recognition techniques have made significant progress in recent years. However, most existing datasets for evaluating them fail to capture the challenges of real-world scenarios. We introduce novel datasets and benchmarks, all publicly available for research purposes.
In [], we survey the datasets currently available for pose estimation and the performance of state-of-the-art methods on them. In [], we introduce a novel benchmark for pose estimation, "MPII Human Pose", which significantly advances over previous work in terms of diversity and difficulty. It includes around 25,000 images containing over 40,000 people performing more than 400 different activities. We provide a rich set of labels, including body joint positions, occlusion labels, and activity labels. Given these rich annotations, we perform a detailed analysis of the leading human pose estimation approaches, gaining insights into the successes and failures of these methods.
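Pose estimation on MPII-style annotations is commonly scored with the PCKh metric: a predicted joint counts as correct if it lies within a fraction of the person's head-segment length of the ground truth. A minimal sketch of such an evaluation is below; the function and argument names are illustrative, not the benchmark's official tooling.

```python
import numpy as np

def pckh(pred, gt, head_sizes, visible, alpha=0.5):
    """PCKh: fraction of annotated joints whose prediction lies within
    alpha * head-segment length of the ground-truth joint.

    pred, gt:    (N, J, 2) arrays of 2D joint coordinates
    head_sizes:  (N,) per-person head-segment lengths in pixels
    visible:     (N, J) boolean mask of annotated (visible) joints
    """
    dists = np.linalg.norm(pred - gt, axis=-1)   # (N, J) joint errors
    thresh = alpha * head_sizes[:, None]         # per-person threshold
    correct = (dists <= thresh) & visible        # only score annotated joints
    return correct.sum() / visible.sum()
```

Normalizing by head size (rather than a fixed pixel threshold) makes the score comparable across people at different scales in the image.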
FAUST [] is the first dataset for 3D mesh registration providing both real data (300 human body scans of different people in a wide range of poses) and automatically computed ground-truth correspondences between them. We define a benchmark on FAUST, and find that current shape registration methods have trouble with this real-world data.
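With ground-truth correspondences available, registration quality on a benchmark like FAUST reduces to a simple geometric error: the distance between where a method maps each scan vertex and where the ground truth says it should land. A hedged sketch of that measure, with illustrative names:

```python
import numpy as np

def correspondence_error(pred_points, gt_points):
    """Mean Euclidean distance between the 3D points a registration
    assigns to each vertex and the ground-truth corresponding points.

    pred_points, gt_points: (V, 3) arrays of 3D coordinates.
    Returns the average per-vertex error (same unit as the input).
    """
    return np.linalg.norm(pred_points - gt_points, axis=1).mean()
```

Reporting the full distribution of per-vertex errors (not just the mean) is also common, since registration failures tend to be localized to difficult regions such as hands and feet.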
With the "Joints for the HMDB" dataset (J-HMDB), we focus on action recognition []. We annotate complex videos using a 2D "puppet" body model to obtain "ground-truth" joint locations as well as optical flow and segmentation. We evaluate current methods on this dataset by systematically replacing the input to various algorithms with ground truth. This lets us discover what matters most -- e.g., should we improve flow algorithms, or pose estimation? We find that high-level pose features greatly outperform low- and mid-level features; in particular, pose over time is critical. Our analysis and the J-HMDB dataset should facilitate a deeper understanding of action recognition algorithms.