Self-exploration of Behavior

Humans and other animals learn behavior by exploring their sensorimotor capabilities. In this project, we aim to understand how robotic systems can generate structured behavior without specifying any reward or objective function. Instead, the systems should explore behavior purely based on their embodiment and their interactions with the environment.
One way to bootstrap a goal-free exploration process is to use differential extrinsic plasticity (DEP) []. DEP is a biologically plausible synaptic mechanism that can be used within very simple neural network controllers. When the DEP controller is applied to embodied agents, highly coordinated, rhythmic behaviors emerge. These behaviors correspond to attractors of a dynamical system. By combining DEP with a “repelling potential”, an agent can actively explore all of its attractor behaviors in a systematic way []. As a result, highly complex robotic systems, e.g., a hexapod robot (Fig. a), discover a variety of meaningful behaviors (Fig. b), such as various modes of locomotion.
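To make the mechanism concrete, below is a minimal, illustrative sketch of a DEP-style update in Python. It assumes an identity inverse model (so the numbers of sensors and motors match), finite-difference sensor derivatives, and a low-pass-filtered, normalized weight update; all names, gains, and constants are illustrative assumptions, not the exact rule or parameters from the publications.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_motors = 4, 4   # identity inverse model assumed: sizes must match

C = rng.normal(scale=0.1, size=(n_motors, n_sensors))  # controller weights
h = np.zeros(n_motors)       # motor bias
kappa = 5.0                  # feedback gain (assumed value)
tau = 2                      # time shift between correlated derivatives
lam = 0.1                    # low-pass factor for the weight update

# Short sensor history so finite-difference derivatives are available.
x_hist = [np.zeros(n_sensors) for _ in range(tau + 2)]

def normalize(M, eps=1e-8):
    """Keep the weight matrix at unit scale so the dynamics stay active."""
    return M / (np.linalg.norm(M) + eps)

def dep_step(x_new):
    """One combined control and plasticity step for a fresh sensor reading."""
    global C
    x_hist.append(x_new)
    x_hist.pop(0)
    dx_now = x_hist[-1] - x_hist[-2]               # current sensor change
    dx_past = x_hist[-1 - tau] - x_hist[-2 - tau]  # change tau steps earlier
    # DEP-style rule: correlate the current sensor change with a
    # time-shifted one (identity inverse model assumed) and low-pass it.
    C = normalize((1 - lam) * C + lam * np.outer(dx_now, dx_past))
    return np.tanh(kappa * (C @ x_new) + h)        # motor command

# Toy closed loop: a leaky plant whose sensors follow the motor command.
x = np.zeros(n_sensors)
for t in range(200):
    y = dep_step(x)
    x = 0.9 * x + 0.1 * y + rng.normal(scale=0.01, size=n_sensors)
```

The key design point is that the weights are driven entirely by correlations in the sensor stream itself; no reward or objective enters the update, so any rhythmic structure in the resulting behavior is self-organized through the embodiment.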
Such self-organizing behavior can be used to discover and learn a repertoire of behavioral primitives from scratch. Our SUBMODES system [] explores behavior using the DEP-controller. During exploration, internal models are trained to predict the motor commands and the resulting sensory consequences of the performed behavior. The SUBMODES system use an unexpected increase in prediction error to detect transitions between behaviors and switch between internal models (Fig. d). In this way, the system is able systematically structure its sensorimotor experience on-line into compositional models of behavior that can later be used for planning and control.