Efficient and Scalable Inference

Figure: (A) Effects of adaptive batch size. From top to bottom: training loss, test accuracy, and batch size, each plotted against the number of accessed examples for the Street View House Numbers dataset. Green, blue, and grey curves use constant batch sizes; red uses an alternative batch size adaptation criterion; orange corresponds to the coupled adaptive batch size (CAPS) introduced in []. (B) Top and middle: validation and test loss for validation-based stopping (red) and the evidence-based criterion []. The criterion itself is shown at the bottom. The blue bar marks the proposed stopping point.

Machine learning is an important tool for computer vision. After the success of deep neural networks (DNNs) in image classification, many other tasks have been addressed with DNNs. However, researchers working with deep learning still face open problems in implementation, hyper-parameter selection, training time, training data, and computation time.

Writing code for a deep learning project is often very time-consuming. Many works are still written using Caffe, even though it has disadvantages compared to other frameworks: for example, it lacks a full Python interface and relies heavily on configuration files. To address these issues we developed Barrista [], a wrapper for Caffe that offers full Pythonic control over it.

Another difficulty in deep learning is the large amount of training data needed. The problem is aggravated because part of the data must be held out as validation and test sets to detect over-fitting. In new work, we present an alternative to a validation set for early stopping. The criterion depends on fast-to-compute local statistics of the gradients, so no data has to be held out for validation, which eases the problem of small training sets [].
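The exact criterion is not spelled out on this page. As a rough sketch of what a stopping rule built on local gradient statistics could look like, the snippet below compares the squared mini-batch gradient with its estimated variance, coordinate by coordinate. The function name `evidence_stop_signal`, the normalization, and the stabilizing constant are assumptions made for illustration, not the published method.

```python
from statistics import fmean, variance


def evidence_stop_signal(per_example_grads):
    """Return a value that is positive when the mini-batch gradient looks
    like pure sampling noise (hypothetical stopping signal, an assumption).

    per_example_grads: list of per-example gradient vectors (lists of floats),
    with at least two examples.
    """
    batch_size = len(per_example_grads)
    dims = len(per_example_grads[0])
    total = 0.0
    for k in range(dims):
        column = [g[k] for g in per_example_grads]
        mean_k = fmean(column)      # mini-batch gradient, coordinate k
        var_k = variance(column)    # unbiased per-example variance, coordinate k
        total += mean_k ** 2 / (var_k + 1e-12)  # small constant avoids division by zero
    # If the true gradient is zero, batch_size * mean_k**2 / var_k is about 1 on average,
    # so the result hovers around 0; a strong gradient signal pushes it far below 0.
    return 1.0 - (batch_size / dims) * total
```

Under these assumptions, training would stop once the signal becomes positive. In practice such a signal would probably be smoothed over several iterations before acting on it.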

Training neural networks involves many hyper-parameters, and choosing them is often time-consuming. Mini-batch stochastic gradient descent and its variants are the standard for training neural networks. The batch size is commonly chosen by empirical inspection and for practical reasons. However, the batch size determines the variance of the gradient estimates and therefore strongly influences the behavior of the optimization process. Moreover, this variance changes during optimization. For these reasons we propose a dynamic batch size adaptation in []. It leads to faster convergence and simplifies learning rate tuning.
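To make the idea of variance-driven batch sizes concrete, here is a minimal sketch of one possible adaptation rule: the next batch size grows with the estimated gradient variance and the learning rate, and shrinks with the current loss. The function name `adapt_batch_size`, the specific formula, and the clipping bounds are illustrative assumptions and are not taken from the cited work.

```python
from statistics import variance


def adapt_batch_size(per_example_grads, loss, learning_rate,
                     min_size=16, max_size=4096):
    """Propose the next batch size from the current gradient statistics.

    Hypothetical rule (an assumption): learning_rate * trace(variance) / loss,
    rounded and clipped to [min_size, max_size].
    """
    dims = len(per_example_grads[0])
    # Trace of the estimated gradient covariance: sum of per-coordinate variances.
    trace = sum(variance([g[k] for g in per_example_grads]) for k in range(dims))
    proposal = learning_rate * trace / max(loss, 1e-12)
    return int(min(max_size, max(min_size, round(proposal))))


# Usage: the same gradient noise calls for larger batches as the loss shrinks.
grads = [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
early = adapt_batch_size(grads, loss=1.0, learning_rate=0.1)
late = adapt_batch_size(grads, loss=0.001, learning_rate=0.1)
```

The design intuition is that noisy gradients matter little while the loss is large, but dominate near a minimum, where averaging over more examples pays off.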

Many computer vision problems can be formulated as a joint graph partitioning and node labeling task. The resulting problem, known as the minimum cost node labeling multicut problem, is NP-hard. Heuristics are therefore needed to obtain reasonably fast computer vision algorithms based on it. In [] we propose two local search algorithms that converge to a local optimum. Both achieve state-of-the-art accuracy on the tasks considered. The general formulation allows researchers to apply the method to many different tasks.
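The flavor of such a local search can be illustrated with a generic greedy single-node move heuristic on a simplified objective: a per-node label cost plus a cost for every edge whose endpoints fall into different parts. This is not one of the two algorithms proposed in the cited paper; the objective, the move set, and the names `labeling_multicut_cost` and `greedy_local_search` are assumptions chosen to keep the example short.

```python
def labeling_multicut_cost(unary, edges, labels, parts):
    """Unary cost of each node's label plus the cost of every cut edge."""
    node_cost = sum(unary[v][labels[v]] for v in range(len(labels)))
    cut_cost = sum(w for u, v, w in edges if parts[u] != parts[v])
    return node_cost + cut_cost


def greedy_local_search(unary, edges, labels, parts):
    """Apply improving single-node moves until none lowers the cost."""
    labels, parts = list(labels), list(parts)
    n, num_labels = len(labels), len(unary[0])
    best = labeling_multicut_cost(unary, edges, labels, parts)
    improved = True
    while improved:
        improved = False
        for v in range(n):
            keep = (labels[v], parts[v])
            # Candidate parts: every existing part plus one fresh singleton part.
            for part in sorted(set(parts) | {max(parts) + 1}):
                for label in range(num_labels):
                    labels[v], parts[v] = label, part
                    cost = labeling_multicut_cost(unary, edges, labels, parts)
                    if cost < best - 1e-12:
                        best, keep, improved = cost, (label, part), True
            labels[v], parts[v] = keep  # commit the best move found for node v
    return labels, parts, best


# Usage: two attractive edges (positive cost when cut) joined by a repulsive one.
unary = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
edges = [(0, 1, 2.0), (2, 3, 2.0), (1, 2, -3.0)]
labels, parts, cost = greedy_local_search(unary, edges, [0, 0, 0, 0], [0, 0, 0, 0])
```

Because every accepted move strictly lowers the cost and the number of distinct partitions and labelings is finite, the loop terminates in a local optimum with respect to single-node moves. Recomputing the full cost per candidate is wasteful; a practical version would evaluate only the change caused by the move.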

Members

Affiliated Researcher

Autonomous Motion

Daniel Kappler

Doctoral Researcher

Robust Machine Learning

Martin Kiefel

Perceiving Systems

Peter Vincent Gehler

Research Group Leader

Probabilistic Numerics

Maren Mahsereci

Doctoral Researcher

Probabilistic Numerics

Lukas Balles

Probabilistic Numerics, Empirical Inference

Philipp Hennig

Affiliated Researcher

Perceiving Systems

Javier Romero

Affiliated Researcher

Publications

Joint Graph Decomposition and Node Labeling by Local Search. Levinkov, E., Uhrig, J., Tang, S., Omran, M., Insafutdinov, E., Kirillov, A., Rother, C., Brox, T., Schiele, B., Andres, B. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1904-1912, IEEE, July 2017. (Perceiving Systems; Conference Paper)

OctNet: Learning Deep 3D Representations at High Resolutions. Riegler, G., Ulusoy, O., Geiger, A. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages 6620-6629, IEEE, Piscataway, NJ, USA, July 2017. (Autonomous Vision, Perceiving Systems; Conference Paper)

Early Stopping Without a Validation Set. Mahsereci, M., Balles, L., Lassner, C., Hennig, P. arXiv preprint arXiv:1703.09580, 2017. (Probabilistic Numerics, Perceiving Systems; Article)

Learning to Filter Object Detections. Prokudin, S., Kappler, D., Nowozin, S., Gehler, P. In Pattern Recognition: 39th German Conference, GCPR 2017, Basel, Switzerland, September 12–15, 2017, Proceedings, pages 52-62, Springer International Publishing, Cham, 2017. (Perceiving Systems; Conference Paper)

Barrista - Caffe Well-Served. Lassner, C., Kappler, D., Kiefel, M., Gehler, P. In ACM Multimedia Open Source Software Competition, ACM OSSC16, October 2016. (Perceiving Systems, Autonomous Motion; Conference Paper; Published)

Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks. Jampani, V., Kiefel, M., Gehler, P. V. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 4452-4461, June 2016. (Perceiving Systems; Conference Paper)