A primary question of this project is how to generate structured behaviors for a robotic system without specifying any reward or objective function. The systems we are interested in should exploit the coupling between brain, body, and environment to self-explore a wide range of behaviors and automatically extract a suitable controller for each of them. To bootstrap this goal-free exploration, we use a biologically plausible synaptic mechanism for self-organizing controllers, namely differential extrinsic plasticity (DEP), which has been shown to enable embodied agents to self-organize their individual sensorimotor development and to generate highly coordinated behaviors while interacting with the environment.
Within a dynamical systems framework, we describe each behavior as an attractor of the brain-body-environment system under DEP. The behaviors self-organize within a few seconds of live interaction and are specific to the embodiment of the robot. Each behavior corresponds to a potentially useful motion primitive.
The behavioral landscape generated by DEP is then explored by means of a "repelling potential", which allows the system to actively and systematically explore all its attractor behaviors. With a view to self-determined exploration of goal-free behaviors, our framework enables autonomous, sequential switching between different motion patterns. Our algorithm recovers all the attractor behaviors in a toy system and is also effective in two simulated environments: a spherical robot discovers all its major rolling modes, and a hexapod robot learns to locomote in 50 different ways in 30 min.
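To make the idea of a repelling potential concrete, the following is a minimal sketch on a hypothetical toy system and not the algorithm used in this work. It assumes a one-dimensional state on a circle with four attractors, a periodic Gaussian-like bump placed on every attractor found so far, and an alternation between a relaxation phase (bumps off) and an escape phase (bumps on). All names and parameter values (`native_force`, `repelling_force`, `explore`, `HEIGHT`, `WIDTH2`, `kick`) are illustrative assumptions.

```python
import math

N_ATTRACTORS = 4   # assumed toy system: attractors at multiples of 2*pi/4
HEIGHT = 2.0       # assumed bump height h of the repelling potential
WIDTH2 = 0.25      # assumed squared bump width sigma^2
DT = 0.01          # forward-Euler step size


def native_force(theta):
    """Toy dynamics on a circle with stable fixed points at k*2*pi/N_ATTRACTORS."""
    return -math.sin(N_ATTRACTORS * theta)


def repelling_force(theta, centers):
    """Negative gradient of sum_c h * exp((cos(theta - c) - 1) / sigma^2)."""
    total = 0.0
    for c in centers:
        d = theta - c
        total += HEIGHT * math.sin(d) / WIDTH2 * math.exp((math.cos(d) - 1.0) / WIDTH2)
    return total


def integrate(theta, centers, duration):
    """Integrate the state; centers=[] gives the unperturbed system."""
    for _ in range(round(duration / DT)):
        theta += DT * (native_force(theta) + repelling_force(theta, centers))
    return theta % (2.0 * math.pi)


def circ_dist(a, b):
    """Shortest angular distance between two angles."""
    return abs((a - b + math.pi) % (2.0 * math.pi) - math.pi)


def explore(theta0=0.3, kick=0.01, max_rounds=20):
    """Alternate relaxation and repelled escape until no new attractor appears."""
    found = []
    theta = theta0
    for _ in range(max_rounds):
        theta = integrate(theta, [], 20.0)            # relax onto an attractor
        if any(circ_dist(theta, c) < 1e-3 for c in found):
            break                                     # landed on a known attractor
        found.append(theta)
        theta = integrate(theta + kick, found, 5.0)   # escape with bumps switched on
    return found


if __name__ == "__main__":
    print([round(a, 3) for a in explore()])
```

With these illustrative parameters, `explore()` is expected to return four angles close to 0, π/2, π, and 3π/2. The bumps on earlier attractors bias each escape away from regions already visited, which is what makes the search sequential rather than random; the small `kick` is only needed to break the symmetry at the very first attractor.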