Intelligent Systems


2024


The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks

Spieler, A., Rahaman, N., Martius, G., Schölkopf, B., Levina, A.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks

Khajehabdollahi, S., Zeraati, R., Giannakakis, E., Schäfer, T. J., Martius, G., Levina, A.

In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (inproceedings)

link (url) [BibTex]

Learning Hierarchical World Models with Adaptive Temporal Abstractions from Discrete Latent Dynamics

Gumbsch, C., Sajid, N., Martius, G., Butz, M. V.

In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (inproceedings)

link (url) [BibTex]

Multi-View Causal Representation Learning with Partial Observability

Yao, D., Xu, D., Lachapelle, S., Magliacane, S., Taslakian, P., Martius, G., von Kügelgen, J., Locatello, F.

Proceedings of the Twelfth International Conference on Learning Representations (ICLR), May 2024 (conference) Accepted

arXiv [BibTex]

Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models
2024 (misc)

Abstract
Humans excel at robust bipedal walking in complex natural environments. In each step, they adequately tune the interaction of biomechanical muscle dynamics and neuronal signals to be robust against uncertainties in ground conditions. However, it is still not fully understood how the nervous system resolves the musculoskeletal redundancy to solve the multi-objective control problem considering stability, robustness, and energy efficiency. In computer simulations, energy minimization has been shown to be a successful optimization target, reproducing natural walking with trajectory optimization or reflex-based control methods. However, these methods focus on particular motions at a time and the resulting controllers are limited when compensating for perturbations. In robotics, reinforcement learning (RL) methods recently achieved highly stable (and efficient) locomotion on quadruped systems, but the generation of human-like walking with bipedal biomechanical models has required extensive use of expert data sets. This strong reliance on demonstrations often results in brittle policies and limits the application to new behaviors, especially considering the potential variety of movements for high-dimensional musculoskeletal models in 3D. Achieving natural locomotion with RL without sacrificing its incredible robustness might pave the way for a novel approach to studying human walking in complex natural environments. Videos: https://sites.google.com/view/naturalwalkingrl

link (url) [BibTex]


Machine learning of a density functional for anisotropic patchy particles

Simon, A., Weimar, J., Martius, G., Oettel, M.

Journal of Chemical Theory and Computation, 2024 (article)

link (url) DOI [BibTex]


2023


Online Learning under Adversarial Nonlinear Constraints

Kolev, P., Martius, G., Muehlebach, M.

In Advances in Neural Information Processing Systems 36, December 2023 (inproceedings)

link (url) [BibTex]

Regularity as Intrinsic Reward for Free Play

Sancaktar, C., Piater, J., Martius, G.

In Advances in Neural Information Processing Systems 36, December 2023 (inproceedings)

Website Code link (url) [BibTex]

Object-Centric Learning for Real-World Videos by Predicting Temporal Feature Similarities

Zadaianchuk, A., Seitzer, M., Martius, G.

In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023), Advances in Neural Information Processing Systems 36, December 2023 (inproceedings)

Abstract
Unsupervised video-based object-centric learning is a promising avenue to learn structured representations from large, unlabeled video collections, but previous approaches have only managed to scale to real-world datasets in restricted domains. Recently, it was shown that the reconstruction of pre-trained self-supervised features leads to object-centric representations on unconstrained real-world image datasets. Building on this approach, we propose a novel way to use such pre-trained features in the form of a temporal feature similarity loss. This loss encodes semantic and temporal correlations between image patches and is a natural way to introduce a motion bias for object discovery. We demonstrate that this loss leads to state-of-the-art performance on the challenging synthetic MOVi datasets. When used in combination with the feature reconstruction loss, our model is the first object-centric video model that scales to unconstrained video datasets such as YouTube-VIS.

arXiv Website OpenReview link (url) [BibTex]
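
The temporal feature similarity loss described in the abstract can be pictured with a short sketch: target affinities between frozen self-supervised patch features of consecutive frames are computed and matched by the model's predictions via cross-entropy. This is a hedged illustration of the idea only; the function names, the temperature value, and the exact target construction are assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F


def temporal_similarity_targets(feat_t, feat_tp1, temperature=0.1):
    """Soft affinities between patches of frame t and frame t+1.

    feat_t, feat_tp1: (batch, num_patches, dim) frozen self-supervised
    features (e.g. from a ViT). Sketch of the 'temporal feature
    similarity' target, not the authors' exact construction.
    """
    feat_t = F.normalize(feat_t, dim=-1)
    feat_tp1 = F.normalize(feat_tp1, dim=-1)
    sim = torch.einsum("bnd,bmd->bnm", feat_t, feat_tp1) / temperature
    return sim.softmax(dim=-1)  # each patch distributes mass over next frame


def similarity_loss(pred_logits, feat_t, feat_tp1):
    """Cross-entropy between predicted and target patch affinities."""
    with torch.no_grad():
        target = temporal_similarity_targets(feat_t, feat_tp1)
    log_pred = pred_logits.log_softmax(dim=-1)
    return -(target * log_pred).sum(dim=-1).mean()
```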

Goal-conditioned Offline Planning from Curious Exploration

Bagatella, M., Martius, G.

In Advances in Neural Information Processing Systems 36, December 2023 (inproceedings)

Abstract
Curiosity has established itself as a powerful exploration strategy in deep reinforcement learning. Notably, leveraging expected future novelty as intrinsic motivation has been shown to efficiently generate exploratory trajectories, as well as a robust dynamics model. We consider the challenge of extracting goal-conditioned behavior from the products of such unsupervised exploration techniques, without any additional environment interaction. We find that conventional goal-conditioned reinforcement learning approaches for extracting a value function and policy fall short in this difficult offline setting. By analyzing the geometry of optimal goal-conditioned value functions, we relate this issue to a specific class of estimation artifacts in learned values. In order to mitigate their occurrence, we propose to combine model-based planning over learned value landscapes with a graph-based value aggregation scheme. We show how this combination can correct both local and global artifacts, obtaining significant improvements in zero-shot goal-reaching performance across diverse simulated environments.

link (url) [BibTex]

Improving Behavioural Cloning with Positive Unlabeled Learning

Wang, Q., McCarthy, R., Bulens, D. C., McGuinness, K., O’Connor, N. E., Sanchez, F. R., Gürtler, N., Widmaier, F., Redmond, S. J.

7th Annual Conference on Robot Learning (CoRL), November 2023 (conference) Accepted

[BibTex]

Minsight: A Fingertip-Sized Vision-Based Tactile Sensor for Robotic Manipulation

Andrussow, I., Sun, H., Kuchenbecker, K. J., Martius, G.

Advanced Intelligent Systems, 5(8):2300042, August 2023, Inside back cover (article)

Abstract
Intelligent interaction with the physical world requires perceptual abilities beyond vision and hearing; vibrant tactile sensing is essential for autonomous robots to dexterously manipulate unfamiliar objects or safely contact humans. Therefore, robotic manipulators need high-resolution touch sensors that are compact, robust, inexpensive, and efficient. The soft vision-based haptic sensor presented herein is a miniaturized and optimized version of the previously published sensor Insight. Minsight has the size and shape of a human fingertip and uses machine learning methods to output high-resolution maps of 3D contact force vectors at 60 Hz. Experiments confirm its excellent sensing performance, with a mean absolute force error of 0.07 N and contact location error of 0.6 mm across its surface area. Minsight's utility is shown in two robotic tasks on a 3-DoF manipulator. First, closed-loop force control enables the robot to track the movements of a human finger based only on tactile data. Second, the informative value of the sensor output is shown by detecting whether a hard lump is embedded within a soft elastomer with an accuracy of 98%. These findings indicate that Minsight can give robots the detailed fingertip touch sensing needed for dexterous manipulation and physical human–robot interaction.

DOI Project Page [BibTex]


Backpropagation through Combinatorial Algorithms: Identity with Projection Works

Sahoo, S., Paulus, A., Vlastelica, M., Musil, V., Kuleshov, V., Martius, G.

In Proceedings of the Eleventh International Conference on Learning Representations, May 2023 (inproceedings) Accepted

Abstract
Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined, therefore a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach to exploit the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass and further provide a theoretical justification. Our experiments demonstrate that such a straightforward hyper-parameter-free approach is able to compete with previous more complex methods on numerous experiments such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we substitute the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.

OpenReview Arxiv Pdf link (url) [BibTex]
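
The backward-pass rule summarized above can be sketched in a few lines of PyTorch: the discrete solver runs unchanged on the forward pass, and the backward pass returns the negative of the incoming gradient. The wrapper class and the toy arg-min solver below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


class NegativeIdentitySolver(torch.autograd.Function):
    """Wraps a black-box combinatorial solver as a differentiable layer.

    Forward: call the (non-differentiable) solver on the continuous cost.
    Backward: treat the solver as a negative identity, i.e. pass the
    incoming gradient through with a sign flip (hypothetical sketch of
    the idea summarized in the abstract above).
    """

    @staticmethod
    def forward(ctx, cost, solver):
        with torch.no_grad():
            solution = solver(cost)
        return solution

    @staticmethod
    def backward(ctx, grad_output):
        # Negative identity: d(solution)/d(cost) is replaced by -I.
        return -grad_output, None


def toy_solver(cost):
    """Illustrative discrete solver: pick the arg-min entry per row."""
    idx = cost.argmin(dim=-1)
    return F.one_hot(idx, cost.shape[-1]).float()


if __name__ == "__main__":
    cost = torch.randn(4, 5, requires_grad=True)
    y = NegativeIdentitySolver.apply(cost, toy_solver)
    loss = ((y - torch.ones_like(y)) ** 2).sum()
    loss.backward()
    print(cost.grad.shape)  # gradients flow despite the discrete solver
```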

DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

Schumacher, P., Haeufle, D. F., Büchler, D., Schmitt, S., Martius, G.

In Proceedings of the Eleventh International Conference on Learning Representations (ICLR), The Eleventh International Conference on Learning Representations (ICLR), May 2023 (inproceedings)

Abstract
Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast amount of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by our finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.

Arxiv pdf Website link (url) [BibTex]

Benchmarking Offline Reinforcement Learning on Real-Robot Hardware

Gürtler, N., Blaes, S., Kolev, P., Widmaier, F., Wüthrich, M., Bauer, S., Schölkopf, B., Martius, G.

In Proceedings of the Eleventh International Conference on Learning Representations, The Eleventh International Conference on Learning Representations (ICLR), May 2023 (inproceedings)

Abstract
Learning policies from previously recorded data is a promising direction for real-world robotics tasks, as online learning is often infeasible. Dexterous manipulation in particular remains an open problem in its general form. The combination of offline reinforcement learning with large diverse datasets, however, has the potential to lead to a breakthrough in this challenging domain analogously to the rapid progress made in supervised learning in recent years. To coordinate the efforts of the research community toward tackling this problem, we propose a benchmark including: i) a large collection of data for offline learning from a dexterous manipulation platform on two tasks, obtained with capable RL agents trained in simulation; ii) the option to execute learned policies on a real-world robotic system and a simulation for efficient debugging. We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.

Website arXiv Code link (url) [BibTex]

Efficient Learning of High Level Plans from Play

Armengol Urpi, N., Bagatella, M., Hilliges, O., Martius, G., Coros, S.

In International Conference on Robotics and Automation, 2023 (inproceedings) Accepted

Abstract
Real-world robotic manipulation tasks remain an elusive challenge, since they involve both fine-grained environment interaction, as well as the ability to plan for long-horizon goals. Although deep reinforcement learning (RL) methods have shown encouraging results when planning end-to-end in high-dimensional environments, they remain fundamentally limited by poor sample efficiency due to inefficient exploration, and by the complexity of credit assignment over long horizons. In this work, we present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL to achieve long-horizon complex manipulation tasks. We leverage task-agnostic play data to learn a discrete behavioral prior over object-centric primitives, modeling their feasibility given the current context. We then design a high-level goal-conditioned policy which (1) uses primitives as building blocks to scaffold complex long-horizon tasks and (2) leverages the behavioral prior to accelerate learning. We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks and learns policies that can be easily transferred to physical hardware.

Arxiv Website Poster [BibTex]

Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning

Eberhard, O., Hollenstein, J., Pinneri, C., Martius, G.

In Proceedings of the Eleventh International Conference on Learning Representations (ICLR), The Eleventh International Conference on Learning Representations (ICLR), May 2023 (inproceedings)

Abstract
In off-policy deep reinforcement learning with continuous action spaces, exploration is often implemented by injecting action noise into the action selection process. Popular algorithms based on stochastic policies, such as SAC or MPO, inject white noise by sampling actions from uncorrelated Gaussian distributions. In many tasks, however, white noise does not provide sufficient exploration, and temporally correlated noise is used instead. A common choice is Ornstein-Uhlenbeck (OU) noise, which is closely related to Brownian motion (red noise). Both red noise and white noise belong to the broad family of colored noise. In this work, we perform a comprehensive experimental evaluation on MPO and SAC to explore the effectiveness of other colors of noise as action noise. We find that pink noise, which is halfway between white and red noise, significantly outperforms white noise, OU noise, and other alternatives on a wide range of environments. Thus, we recommend it as the default choice for action noise in continuous control.

link (url) [BibTex]
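
As a rough illustration of colored action noise, the sketch below samples temporally correlated noise with a 1/f^beta power spectrum by shaping Gaussian noise in the frequency domain; beta = 1 gives pink noise. The spectral-shaping recipe and the unit-variance normalization are generic assumptions, not the authors' exact sampler.

```python
import numpy as np


def colored_noise(beta, n_steps, n_dims, rng=None):
    """Sample noise with power spectrum ~ 1/f^beta along the time axis.

    beta = 0 -> white noise, beta = 1 -> pink noise, beta = 2 -> red noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.fft.rfftfreq(n_steps)
    # Shape the spectrum; leave the DC component unscaled to avoid 1/0.
    scale = np.ones_like(freqs)
    scale[1:] = freqs[1:] ** (-beta / 2.0)
    spectrum = scale[:, None] * (
        rng.standard_normal((freqs.size, n_dims))
        + 1j * rng.standard_normal((freqs.size, n_dims))
    )
    noise = np.fft.irfft(spectrum, n=n_steps, axis=0)
    return noise / noise.std(axis=0, keepdims=True)  # unit variance per dim


# Usage sketch: pink action noise for an episode of 1000 steps, 6-D actions.
eps = colored_noise(beta=1.0, n_steps=1000, n_dims=6)
# a_t = policy_mean(s_t) + sigma * eps[t]   (added at each control step)
```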

Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions

Li, C., Blaes, S., Kolev, P., Vlastelica, M., Frey, J., Martius, G.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE International Conference on Robotics and Automation (ICRA), May 2023 (inproceedings) Accepted

Abstract
Learning diverse skills is one of the main challenges in robotics. To this end, imitation learning approaches have achieved impressive results. These methods require explicitly labeled datasets or assume consistent skill execution to enable learning and active control of individual behaviors, which limits their applicability. In this work, we propose a cooperative adversarial method for obtaining controllable skill sets from unlabeled datasets containing diverse state transition patterns by maximizing their discriminability. Moreover, we show that by utilizing unsupervised skill discovery in the generative adversarial imitation learning framework, novel and useful skills emerge with successful task fulfillment. Finally, the obtained universal policies are tested on an agile quadruped robot called Solo 8 and present faithful replications of diverse skills encoded in the demonstrations.

Arxiv Videos Project [BibTex]

Bridging the Gap to Real-World Object-Centric Learning

Seitzer, M., Horn, M., Zadaianchuk, A., Zietlow, D., Xiao, T., Simon-Gabriel, C., He, T., Zhang, Z., Schölkopf, B., Brox, T., Locatello, F.

In Proceedings of the Eleventh International Conference on Learning Representations, The Eleventh International Conference on Learning Representations (ICLR), May 2023 (inproceedings)

Abstract
Humans naturally decompose their environment into entities at the appropriate level of abstraction to act in the world. Allowing machine learning algorithms to derive this decomposition in an unsupervised way has become an important line of research. However, current methods are restricted to simulated data or require additional information in the form of motion or depth in order to successfully discover objects. In this work, we overcome this limitation by showing that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way. Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data and is the first unsupervised object-centric model that scales to real-world datasets such as COCO and PASCAL VOC. DINOSAUR is conceptually simple and shows competitive performance compared to more involved pipelines from the computer vision literature.

Code Website link (url) [BibTex]
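
The training signal described above, reconstructing frozen self-supervised features, reduces to a simple loss. The sketch below shows one hedged way to write it, with frozen_vit and slot_model as placeholder modules rather than the published architecture.

```python
import torch
import torch.nn.functional as F


def feature_reconstruction_loss(frames, frozen_vit, slot_model):
    """Train an object-centric model by reconstructing frozen ViT features.

    frozen_vit: pretrained self-supervised encoder (e.g. DINO), kept fixed.
    slot_model: maps the feature grid to slots and decodes them back to the
    feature grid. Both names are placeholders for whatever modules are used.
    """
    with torch.no_grad():
        targets = frozen_vit(frames)          # (B, num_patches, dim)
    slots = slot_model.encode(targets)        # (B, num_slots, slot_dim)
    recon, _masks = slot_model.decode(slots)  # (B, num_patches, dim)
    return F.mse_loss(recon, targets)
```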

Predicting the Force Map of an ERT-Based Tactile Sensor Using Simulation and Deep Networks

Lee, H., Sun, H., Park, H., Serhat, G., Javot, B., Martius, G., Kuchenbecker, K. J.

IEEE Transactions on Automation Science and Engineering, 20(1):425-439, January 2023 (article)

Abstract
Electrical resistance tomography (ERT) can be used to create large-scale soft tactile sensors that are flexible and robust. Good performance requires a fast and accurate mapping from the sensor's sequential voltage measurements to the distribution of force across its surface. However, particularly with multiple contacts, this task is challenging for both previously developed approaches: physics-based modeling and end-to-end data-driven learning. Some promising results were recently achieved using sim-to-real transfer learning, but estimating multiple contact locations and accurate contact forces remains difficult because simulations tend to be less accurate with a high number of contact locations and/or high force. This paper introduces a modular hybrid method that combines simulation data synthesized from an electromechanical finite element model with real measurements collected from a new ERT-based tactile sensor. We use about 290,000 simulated and 90,000 real measurements to train two deep neural networks: the first (Transfer-Net) captures the inevitable gap between simulation and reality, and the second (Recon-Net) reconstructs contact forces from voltage measurements. The number of contacts, contact locations, force magnitudes, and contact diameters are evaluated for a manually collected multi-contact dataset of 150 measurements. Our modular pipeline's results outperform predictions by both a physics-based model and end-to-end learning.

DOI Project Page [BibTex]


Discovering causal relations and equations from data

Camps-Valls, G., Gerhardus, A., Ninad, U., Varando, G., Martius, G., Balaguer-Ballester, E., Vinuesa, R., Diaz, E., Zanna, L., Runge, J.

Physics Reports, 1044, pages: 1-68, 2023 (article)

Abstract
Physics is a field of science that has traditionally used the scientific method to answer questions about why natural phenomena occur and to make testable models that explain the phenomena. Discovering equations, laws, and principles that are invariant, robust, and causal has been fundamental in physical sciences throughout the centuries. Discoveries emerge from observing the world and, when possible, performing interventions on the system under study. With the advent of big data and data-driven methods, the fields of causal and equation discovery have developed and accelerated progress in computer science, physics, statistics, philosophy, and many applied fields. This paper reviews the concepts, methods, and relevant works on causal and equation discovery in the broad field of physics and outlines the most important challenges and promising future lines of research. We also provide a taxonomy for data-driven causal and equation discovery, point out connections, and showcase comprehensive case studies in Earth and climate sciences, fluid dynamics and mechanics, and the neurosciences. This review demonstrates that discovering fundamental laws and causal relations by observing natural phenomena is revolutionised with the efficient exploitation of observational data and simulations, modern machine learning algorithms and the combination with domain knowledge. Exciting times are ahead with many challenges and opportunities to improve our understanding of complex systems.

DOI [BibTex]

Interpretable Symbolic Regression for Data Science: Analysis of the 2022 Competition

Franca, F. D., Virgolin, M., Kommenda, M., Majumder, M., Cranmer, M., Espada, G., Ingelse, L., Fonseca, A., Landajuela, M., Petersen, B., Glatt, R., Mundhenk, N., Lee, C., Hochhalter, J., Randall, D., Kamienny, P., Zhang, H., Dick, G., Simon, A., Burlacu, B., Kasak, J., Machado, M., Wilstrup, C., Cava, W. L.

arXiv, 2023 (article)

link (url) [BibTex]


2022


Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations

(Best Paper Award Finalist)

Li, C., Vlastelica, M., Blaes, S., Frey, J., Grimminger, F., Martius, G.

Proceedings of the 6th Conference on Robot Learning (CoRL), Conference on Robot Learning (CoRL), December 2022 (conference) Accepted

Abstract
Learning agile skills is one of the main challenges in robotics. To this end, reinforcement learning approaches have achieved impressive results. These methods require explicit task information in terms of a reward function or an expert that can be queried in simulation to provide a target control output, which limits their applicability. In this work, we propose a generative adversarial method for inferring reward functions from partial and potentially physically incompatible demonstrations for successful skill acquirement where reference or expert demonstrations are not easily accessible. Moreover, we show that by using a Wasserstein GAN formulation and transitions from demonstrations with rough and partial information as input, we are able to extract policies that are robust and capable of imitating demonstrated behaviors. Finally, the obtained skills such as a backflip are tested on an agile quadruped robot called Solo 8 and present faithful replication of hand-held human demonstrations.

Arxiv Videos Project link (url) [BibTex]
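
One way to picture the Wasserstein-GAN-style imitation objective mentioned in the abstract is a critic over state transitions whose output serves as the RL reward. The sketch below shows a generic WGAN critic loss with gradient penalty under that reading; the critic architecture, coefficients, and transition format are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class WassersteinCritic(nn.Module):
    """Critic scoring state transitions; higher means 'more demonstration-like'."""

    def __init__(self, obs_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s, s_next):
        return self.net(torch.cat([s, s_next], dim=-1)).squeeze(-1)


def critic_loss(critic, demo, policy, gp_coef=10.0):
    """Standard WGAN-GP critic objective on (s, s') transition pairs."""
    d_demo = critic(*demo).mean()
    d_policy = critic(*policy).mean()
    # Gradient penalty on interpolated transitions.
    alpha = torch.rand(demo[0].shape[0], 1)
    inter = tuple(
        (alpha * d + (1 - alpha) * p).detach().requires_grad_(True)
        for d, p in zip(demo, policy)
    )
    d_inter = critic(*inter)
    grads = torch.autograd.grad(d_inter.sum(), inter, create_graph=True)
    grad_norm = torch.cat(grads, dim=-1).norm(dim=-1)
    return d_policy - d_demo + gp_coef * ((grad_norm - 1.0) ** 2).mean()


# Imitation reward used by the RL agent (assumed form): r(s, s') = critic(s, s').
```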

Learning with Muscles: Benefits for Data-Efficiency and Robustness in Anthropomorphic Tasks

Wochner, I., Schumacher, P., Martius, G., Büchler, D., Schmitt, S., Haeufle, D.

Proceedings of the 6th Conference on Robot Learning (CoRL), 205, pages: 1178-1188, Proceedings of Machine Learning Research, (Editors: Liu, Karen and Kulic, Dana and Ichnowski, Jeff), PMLR, December 2022 (conference)

link (url) [BibTex]

A Sequential Group VAE for Robot Learning of Haptic Representations

Richardson, B. A., Kuchenbecker, K. J., Martius, G.

pages: 1-11, Workshop paper (8 pages) presented at the CoRL Workshop on Aligning Robot Representations with Humans, Auckland, New Zealand, December 2022 (misc)

Abstract
Haptic representation learning is a difficult task in robotics because information can be gathered only by actively exploring the environment over time, and because different actions elicit different object properties. We propose a Sequential Group VAE that leverages object persistence to learn and update latent general representations of multimodal haptic data. As a robot performs sequences of exploratory procedures on an object, the model accumulates data and learns to distinguish between general object properties, such as size and mass, and trial-to-trial variations, such as initial object position. We demonstrate that after very few observations, the general latent representations are sufficiently refined to accurately encode many haptic object properties.

link (url) Project Page [BibTex]

Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation

Sancaktar, C., Blaes, S., Martius, G.

In Advances in Neural Information Processing Systems 35 (NeurIPS 2022), Curran Associates, Inc., 36th Annual Conference on Neural Information Processing Systems, December 2022 (inproceedings)

Arxiv Videos Openreview link (url) [BibTex]

Embrace the Gap: VAEs Perform Independent Mechanism Analysis

Reizinger*, P., Gresele*, L., Brady*, J., von Kügelgen, J., Zietlow, D., Schölkopf, B., Martius, G., Brendel, W., Besserve, M.

Advances in Neural Information Processing Systems (NeurIPS 2022), 35, pages: 12040-12057, (Editors: S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh), Curran Associates, Inc., 36th Annual Conference on Neural Information Processing Systems, December 2022, *equal first authorship (conference)

Arxiv PDF link (url) [BibTex]

Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World

Gürtler, N., Widmaier, F., Sancaktar, C., Blaes, S., Kolev, P., Bauer, S., Wüthrich, M., Wulfmeier, M., Riedmiller, M., Allshire, A., Wang, Q., McCarthy, R., Kim, H., Baek, J., Kwon, W., Qian, S., Toshimitsu, Y., Michelis, M. Y., Kazemipour, A., Raayatsanati, A., Zheng, H., Cangan, B. G., Schölkopf, B., Martius, G.

Proceedings of the NeurIPS 2022 Competitions Track, 220, pages: 133-150, Proceedings of Machine Learning Research, (Editors: Ciccone, Marco and Stolovitzky, Gustavo and Albrecht, Jacob), PMLR, December 2022 (conference)

link (url) [BibTex]

Self-supervised Reinforcement Learning with Independently Controllable Subgoals

Zadaianchuk, A., Martius, G., Yang, F.

In Proceedings of the 5th Conference on Robot Learning, 164, pages: 384-394, PMLR, 2022 (inproceedings) Accepted

Abstract
To successfully tackle challenging manipulation tasks, autonomous agents must learn a diverse set of skills and how to combine them. Recently, self-supervised agents that set their own abstract goals by exploiting the discovered structure in the environment were shown to perform well on many different tasks. In particular, some of them were applied to learn basic manipulation skills in compositional multi-object environments. However, these methods learn skills without taking the dependencies between objects into account. Thus, the learned skills are difficult to combine in realistic environments. We propose a novel self-supervised agent that estimates relations between environment components and uses them to independently control different parts of the environment state. In addition, the estimated relations between objects can be used to decompose a complex goal into a compatible sequence of subgoals. We show that, by using this framework, an agent can efficiently and automatically learn manipulation tasks in multi-object environments with different relations between objects.

Arxiv Openreview Poster link (url) Project Page [BibTex]

A Soft Vision-Based Tactile Sensor for Robotic Fingertip Manipulation

Andrussow, I., Sun, H., Kuchenbecker, K. J., Martius, G.

Workshop paper (1 page) presented at the IROS Workshop on Large-Scale Robotic Skin: Perception, Interaction and Control, Kyoto, Japan, October 2022 (misc)

Abstract
For robots to become fully dexterous, their hardware needs to provide rich sensory feedback. High-resolution haptic sensing similar to the human fingertip can enable robots to execute delicate manipulation tasks like picking up small objects, inserting a key into a lock, or handing a cup of coffee to a human. Many tactile sensors have emerged in recent years; one especially promising direction is vision-based tactile sensors due to their low cost, low wiring complexity and high-resolution sensing capabilities. In this work, we build on previous findings to create a soft fingertip-sized tactile sensor. It can sense normal and shear contact forces all around its 3D surface with an average prediction error of 0.05 N, and it localizes contact on its shell with an average prediction error of 0.5 mm. The software of this sensor uses a data-efficient machine-learning pipeline to run in real time on hardware with low computational power like a Raspberry Pi. It provides a maximum data frame rate of 60 Hz via USB.

link (url) Project Page [BibTex]

Developing hierarchical anticipations via neural network-based event segmentation

(SmartBot Challenge award winner)

Gumbsch, C., Adam, M., Elsner, B., Martius, G., Butz, M. V.

In Proceedings of the IEEE International Conference on Development and Learning (ICDL 2022), pages: 1-8, 2022 IEEE International Conference on Development and Learning (ICDL), September 2022 (inproceedings)

Abstract
Humans can make predictions on various time scales and hierarchical levels. Thereby, the learning of event encodings seems to play a crucial role. In this work we model the development of hierarchical predictions via autonomously learned latent event codes. We present a hierarchical recurrent neural network architecture, whose inductive learning biases foster the development of sparsely changing latent state that compress sensorimotor sequences. A higher level network learns to predict the situations in which the latent states tend to change. Using a simulated robotic manipulator, we demonstrate that the system (i) learns latent states that accurately reflect the event structure of the data, (ii) develops meaningful temporal abstract predictions on the higher level, and (iii) generates goal-anticipatory behavior similar to gaze behavior found in eye-tracking studies with infants. The architecture offers a step towards the autonomous learning of compressed hierarchical encodings of gathered experiences and the exploitation of these encodings to generate adaptive behavior.

link (url) [BibTex]

InvGAN: Invertible GANs

(Best Paper Award)

Ghosh, P., Zietlow, D., Black, M. J., Davis, L. S., Hu, X.

In Pattern Recognition, pages: 3-19, Lecture Notes in Computer Science, 13485, (Editors: Andres, Björn and Bernard, Florian and Cremers, Daniel and Frintrop, Simone and Goldlücke, Bastian and Ihrke, Ivo), Springer, Cham, 44th DAGM German Conference on Pattern Recognition (DAGM GCPR 2022), September 2022 (inproceedings)

Abstract
Generation of photo-realistic images, semantic editing and representation learning are only a few of many applications of high-resolution generative models. Recent progress in GANs have established them as an excellent choice for such tasks. However, since they do not provide an inference model, downstream tasks such as classification cannot be easily applied on real images using the GAN latent space. Despite numerous efforts to train an inference model or design an iterative method to invert a pre-trained generator, previous methods are dataset (e.g. human face images) and architecture (e.g. StyleGAN) specific. These methods are nontrivial to extend to novel datasets or architectures. We propose a general framework that is agnostic to architecture and datasets. Our key insight is that, by training the inference and the generative model together, we allow them to adapt to each other and to converge to a better quality model. Our InvGAN, short for Invertible GAN, successfully embeds real images in the latent space of a high quality generative model. This allows us to perform image inpainting, merging, interpolation and online data augmentation. We demonstrate this with extensive qualitative and quantitative experiments.

pdf DOI [BibTex]

Leveling Down in Computer Vision: Pareto Inefficiencies in Fair Deep Classifiers

Zietlow, D., Lohaus, M., Balakrishnan, G., Kleindessner, M., Locatello, F., Schölkopf, B., Russell, C.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 10410-10421, June 2022 (conference)

arXiv link (url) [BibTex]

Orchestrated Value Mapping for Reinforcement Learning

Fatemi, M., Tavakoli, A.

International Conference on Learning Representations, April 2022 (conference)

link (url) [BibTex]

Guiding the Design of Superresolution Tactile Skins with Taxel Value Isolines Theory

Sun, H., Martius, G.

Science Robotics, 7(63):eabm0608, February 2022 (article)

Abstract
Tactile feedback is essential to make robots more agile and effective in unstructured environments. However, high-resolution tactile skins are not widely available; this is due to the large size of robust sensing units and because many units typically lead to fragility in wiring and to high costs. One route toward high-resolution and robust tactile skins involves the embedding of a few sensor units (taxels) into a flexible surface material and the use of signal processing to achieve sensing with superresolution accuracy. Here, we propose a theory for geometric superresolution to guide the development of tactile sensors of this kind and link it to machine learning techniques for signal processing. This theory is based on sensor isolines and allows us to compute the possible force sensitivity and accuracy in contact position and force magnitude as a spatial quantity before building a sensor. We evaluate the influence of different factors, such as elastic properties of the material, structure design, and transduction methods, using finite element simulations and by implementing real sensors. We empirically determine sensor isolines and validate the theory in two custom-built sensors with 1D and 2D measurement surfaces that use barometric units. Using machine learning methods to infer contact information, our sensors obtain an average superresolution factor of over 100 and 1200, respectively. Our theory can guide future tactile sensor designs and inform various design choices. We propose a theory using taxel value isolines to guide superresolution tactile sensor design and evaluate it empirically.

Authors copy link (url) DOI Project Page [BibTex]


A Soft Thumb-Sized Vision-Based Sensor with Accurate All-Round Force Perception

Sun, H., Kuchenbecker, K. J., Martius, G.

Nature Machine Intelligence, 4(2):135-145, February 2022 (article)

Abstract
Vision-based haptic sensors have emerged as a promising approach to robotic touch due to affordable high-resolution cameras and successful computer-vision techniques. However, their physical design and the information they provide do not yet meet the requirements of real applications. We present a robust, soft, low-cost, vision-based, thumb-sized 3D haptic sensor named Insight: it continually provides a directional force-distribution map over its entire conical sensing surface. Constructed around an internal monocular camera, the sensor has only a single layer of elastomer over-molded on a stiff frame to guarantee sensitivity, robustness, and soft contact. Furthermore, Insight is the first system to combine photometric stereo and structured light using a collimator to detect the 3D deformation of its easily replaceable flexible outer shell. The force information is inferred by a deep neural network that maps images to the spatial distribution of 3D contact force (normal and shear). Insight has an overall spatial resolution of 0.4 mm, force magnitude accuracy around 0.03 N, and force direction accuracy around 5 degrees over a range of 0.03--2 N for numerous distinct contacts with varying contact area. The presented hardware and software design concepts can be transferred to a wide variety of robot parts.

Paper link (url) DOI Project Page [BibTex]


Machine-Learning-Driven Haptic Sensor Design

Sun, H.

University of Tuebingen, Library, 2022 (phdthesis)

Abstract
Similar to biological systems, robots may need skin-like sensing ability to perceive interactions in complex, changing, and human-involved environments. Current skin-like sensing technologies are still far behind their biological counterparts when considering resolution, dynamics range, robustness, and surface coverage together. One key challenge is the wiring of sensing elements. During my Ph.D. study, I explore how machine learning can enable the design of a new kind of haptic sensors to deal with such a challenge. On the one hand, I propose super-resolution-oriented tactile skins, reducing the number of physical sensing elements while achieving high spatial accuracy. On the other hand, I explore vision-based haptic sensor designs. In this thesis, I present four types of machine-learning-driven haptic sensors that I designed for coarse and fine robotic applications, varying from large surface (robot limbs) to small surface sensing (robot fingers). Moreover, I propose a super-resolution theory to guide sensor designs at all levels ranging from hardware design (material/structure/transduction), data collection (real/simulated), and signal processing methods (analytical/data-driven). I investigate two designs for large-scale coarse-resolution sensing, e.g., robotic limbs. HapDef sparsely attaches a few strain gauges on a large curved surface internally to measure the deformation over the whole surface. ERT-DNN wraps a large surface with a piece of multi-layered conductive fabric, which varies its conductivity upon contacts exerted. I also conceive two approaches for small-scale fine-resolution sensing, e.g., robotic fingertips. BaroDome sparsely embeds a few barometers inside a soft elastomer to measure internal pressure changes caused by external contact. Insight encloses a high-resolution camera to view a soft shell from within. Generically, an inverse problem needs to be solved when trying to obtain high-resolution sensing with a few physical sensing elements. I develop machine-learning frameworks suitable for solving this inverse problem. They process various raw sensor data and extract useful haptic information in practice. Machine learning methods rely on data collected by an automated robotic stimulation device or synthesized using finite element methods. I build several physical testbeds and finite element models to collect copious data. I propose machine learning frameworks to combine data from different sources that are good enough to deal with the noise in real data and generalize well from seen to unseen situations. While developing my prototype sensors, I have faced reoccurring design choices. To help my developments and guide future research, I propose a unified theory with the concept of taxel-value-isolines. It captures the physical effects required for super-resolution, ties them to all parts of the sensor design, and allows us to assess them quantitatively. The theory offers an explanation about physically achievable accuracies for localizing and quantifying contact based on uncertainties introduced by measurement noise in sensor elements. The theoretical analysis aims to predict the best performance before a physical prototype is built and helps to evaluate the hardware design, data collection, and data processing methods during implementation. This thesis presents a new perspective on haptic sensor design. 
Using machine learning to substitute the entire data-processing pipeline, I present several haptic sensor designs for applications ranging from large-surface skins to high-resolution tactile fingertip sensors. The developed theory for obtaining optimal super-resolution can guide future sensor designs.

link (url) [BibTex]

Inference of affordances and active motor control in simulated agents

Scholz, F., Gumbsch, C., Otte, S., Butz, M. V.

Frontiers in Neurorobotics, 16, 2022 (article)

DOI [BibTex]

Intelligent problem-solving as integrated hierarchical reinforcement learning

Eppe, M., Gumbsch, C., Kerzel, M., Nguyen, P. D. H., Butz, M. V., Wermter, S.

Nature Machine Intelligence, 4(1):11-20, 2022 (article)

Abstract
According to cognitive psychology and related disciplines, the development of complex problem-solving behaviour in biological agents depends on hierarchical cognitive mechanisms. Hierarchical reinforcement learning is a promising computational approach that may eventually yield comparable problem-solving behaviour in artificial agents and robots. However, so far, the problem-solving abilities of many human and non-human animals are clearly superior to those of artificial systems. Here we propose steps to integrate biologically inspired hierarchical mechanisms to enable advanced problem-solving skills in artificial agents. We first review the literature in cognitive psychology to highlight the importance of compositional abstraction and predictive processing. Then we relate the gained insights with contemporary hierarchical reinforcement learning methods. Interestingly, our results suggest that all identified cognitive mechanisms have been implemented individually in isolated computational architectures, raising the question of why there exists no single unifying architecture that integrates them. As our final contribution, we address this question by providing an integrative perspective on the computational challenges to develop such a unifying architecture. We expect our results to guide the development of more sophisticated cognitively inspired hierarchical machine learning architectures.

link (url) DOI [BibTex]

Uncertainty in Equation Learning

Werner, M., Junginger, A., Hennig, P., Martius, G.

In Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO), pages: 2298-2305, Association for Computing Machinery, 2022 (inproceedings)

Abstract
Equation learning is a deep learning framework for explainable machine learning in regression settings, with applications in engineering and the natural sciences. Equation learners typically do not capture uncertainty about the model or its predictions, although uncertainty is often highly structured and particularly relevant for these kinds of applications. We show how simple, yet effective, forms of Bayesian deep learning can be used to build structure and explainable uncertainty over a set of found equations. Specifically, we use a mixture of Laplace approximations, where each mixture component captures a different equation structure, and the local Laplace approximations capture parametric uncertainty within one family of equations. We present results on both synthetic and real world examples.

Paper PDF DOI [BibTex]
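
The "mixture of Laplace approximations" can be pictured with a textbook diagonal Laplace fit per candidate equation, with mixture weights given by each component's evidence approximation. The sketch below is a generic construction under that reading; the function names and the diagonal-Hessian simplification are assumptions, not the paper's implementation.

```python
import torch


def diagonal_laplace(log_joint, theta_map, eps=1e-6):
    """Diagonal Laplace approximation around a MAP estimate.

    Returns the per-parameter posterior variance and the Laplace evidence
    approximation log p(D) ~= log p(D, theta_map) + 0.5 * sum log(2*pi*var).
    Generic textbook construction, used here only to illustrate the
    'mixture of Laplace approximations' idea from the abstract above.
    """
    theta = theta_map.detach().requires_grad_(True)
    loss = -log_joint(theta)
    grad = torch.autograd.grad(loss, theta, create_graph=True)[0]
    # Diagonal of the Hessian via one extra backward pass per coordinate.
    hess_diag = torch.stack(
        [torch.autograd.grad(grad[i], theta, retain_graph=True)[0][i]
         for i in range(theta.numel())]
    )
    var = 1.0 / (hess_diag + eps)
    log_evidence = log_joint(theta_map) + 0.5 * torch.log(2 * torch.pi * var).sum()
    return var, log_evidence


def mixture_weights(log_evidences):
    """Weight each candidate equation structure by its Laplace evidence."""
    return torch.softmax(torch.stack(log_evidences), dim=0)
```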


2021


Causal Influence Detection for Improving Efficiency in Reinforcement Learning

Seitzer, M., Schölkopf, B., Martius, G.

In Advances in Neural Information Processing Systems 34, 27, pages: 22905-22918, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (inproceedings)

arXiv PDF Data Code link (url) Project Page [BibTex]

Hierarchical Reinforcement Learning with Timed Subgoals

Gürtler, N., Büchler, D., Martius, G.

In Advances in Neural Information Processing Systems 34, 26, pages: 21732-21743, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (inproceedings)

video arXiv code link (url) [BibTex]

Planning from Pixels in Environments with Combinatorially Hard Search Spaces

Bagatella, M., Olšák, M., Rolínek, M., Martius, G.

In Advances in Neural Information Processing Systems 34, 30, pages: 24707-24718, Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (inproceedings)

Abstract
The ability to form complex plans based on raw visual input is a litmus test for current capabilities of artificial intelligence, as it requires a seamless combination of visual processing and abstract algorithmic execution, two traditionally separate areas of computer science. A recent surge of interest in this field brought advances that yield good performance in tasks ranging from arcade games to continuous control; these methods however do not come without significant issues, such as limited generalization capabilities and difficulties when dealing with combinatorially hard planning instances. Our contribution is two-fold: (i) we present a method that learns to represent its environment as a latent graph and leverages state reidentification to reduce the complexity of finding a good policy from exponential to linear (ii) we introduce a set of lightweight environments with an underlying discrete combinatorial structure in which planning is challenging even for humans. Moreover, we show that our methods achieves strong empirical generalization to variations in the environment, even across highly disadvantaged regimes, such as “one-shot” planning, or in an offline RL paradigm which only provides low-quality trajectories.

link (url) [BibTex]

Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains

Gumbsch, C., Butz, M. V., Martius, G.

In Advances in Neural Information Processing Systems 34, 21, pages: 17518-17531, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (inproceedings)

Abstract
A common approach to prediction and planning in partially observable domains is to use recurrent neural networks (RNNs), which ideally develop and maintain a latent memory about hidden, task-relevant factors. We hypothesize that many of these hidden factors in the physical world are constant over time, changing only sparsely. Accordingly, we propose Gated $L_0$ Regularized Dynamics (GateL0RD), a novel recurrent architecture that incorporates the inductive bias to maintain stable, sparsely changing latent states. The bias is implemented by means of a novel internal gating function and a penalty on the $L_0$ norm of latent state changes. We demonstrate that GateL0RD can compete with or outperform state-of-the-art RNNs in a variety of partially observable prediction and control tasks. GateL0RD tends to encode the underlying generative factors of the environment, ignores spurious temporal dependencies, and generalizes better, improving sampling efficiency and prediction accuracy as well as behavior in model-based planning and reinforcement learning tasks. Moreover, we show that the developing latent states can be easily interpreted, which is a step towards better explainability in RNNs.

arXiv Openreview link (url) [BibTex]
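
The inductive bias described above, latent states that change only sparsely, can be illustrated with a gated recurrent cell whose update is switched per dimension and whose gate activity is penalized as a surrogate for the L0 norm of latent changes. The cell below is a simplified sketch under that reading, not the published GateL0RD architecture.

```python
import torch
import torch.nn as nn


class SparseGatedCell(nn.Module):
    """Recurrent cell with sparsely changing latent state (illustrative).

    The latent state is only updated where a gate opens, and training adds
    a penalty on gate activity as a surrogate for the L0 norm of latent
    state changes described in the abstract above.
    """

    def __init__(self, input_dim, latent_dim):
        super().__init__()
        self.candidate = nn.Linear(input_dim + latent_dim, latent_dim)
        self.gate = nn.Linear(input_dim + latent_dim, latent_dim)

    def forward(self, x, h):
        z = torch.cat([x, h], dim=-1)
        delta = torch.tanh(self.candidate(z))
        # Hard-ish gate clipped to [0, 1]; 0 keeps the old latent value.
        g = torch.clamp(torch.sigmoid(self.gate(z)) * 1.2 - 0.1, 0.0, 1.0)
        h_new = h + g * delta
        gate_penalty = g.abs().mean()  # surrogate for the L0 sparsity term
        return h_new, gate_penalty


# Usage sketch: total_loss = prediction_loss + lambda_l0 * gate_penalty
```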

Risk-Averse Zero-Order Trajectory Optimization

Vlastelica*, M., Blaes*, S., Pinneri, C., Martius, G.

In Conference on Robot Learning, 2021, *Equal Contribution (inproceedings) Accepted

Abstract
We introduce a simple but effective method for managing risk in zero-order trajectory optimization that involves probabilistic safety constraints and balancing of optimism in the face of epistemic uncertainty and pessimism in the face of aleatoric uncertainty of an ensemble of stochastic neural networks. Various experiments indicate that the separation of uncertainties is essential to performing well with data-driven MPC approaches in uncertain and safety-critical control environments.

OpenReview PDF link (url) Project Page [BibTex]
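
The separation of uncertainties mentioned in the abstract can be sketched for an ensemble of stochastic dynamics models: disagreement between ensemble means acts as epistemic uncertainty (treated optimistically), while the average predicted variance acts as aleatoric uncertainty (treated pessimistically). The scoring function below is an assumed form of such an objective for ranking candidate action sequences in a zero-order optimizer, not the paper's exact cost.

```python
import numpy as np


def risk_adjusted_return(ensemble_returns_mean, ensemble_returns_var,
                         w_epistemic=1.0, w_aleatoric=1.0):
    """Score candidate action sequences from an ensemble of stochastic models.

    ensemble_returns_mean: (n_models, n_candidates) predicted mean return of
        each candidate under each ensemble member.
    ensemble_returns_var: (n_models, n_candidates) predicted (aleatoric)
        return variance under each member.
    """
    mean_return = ensemble_returns_mean.mean(axis=0)
    epistemic_std = ensemble_returns_mean.std(axis=0)           # model disagreement
    aleatoric_std = np.sqrt(ensemble_returns_var.mean(axis=0))  # inherent noise
    # Optimism in the face of epistemic uncertainty, pessimism for aleatoric.
    return mean_return + w_epistemic * epistemic_std - w_aleatoric * aleatoric_std


# In a zero-order optimizer (e.g. CEM), candidates would be ranked by this score.
```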


Falsification of hybrid systems with symbolic reachability analysis and trajectory splicing

Bogomolov, S., Frehse, G., Gurung, A., Li, D., Martius, G., Ray, R.

Nonlinear Analysis: Hybrid Systems, 42, pages: 101093, Elsevier, November 2021 (article)

link (url) DOI [BibTex]

Making Higher Order MOT Scalable: An Efficient Approximate Solver for Lifted Disjoint Paths

Hornakova, A., Kaiser, T., Swoboda, P., Rolinek, M., Rosenhahn, B., Henschel, R.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), pages: 6310-6320, IEEE, ICCV 2021, October 2021 (conference)

DOI [BibTex]
