A deep learning-based method for vision-based tactile sensing

To interact effectively with their surroundings, robots should be able to identify the characteristics of different objects just by touching them, as humans do. This would allow them to grasp and handle objects more efficiently, using sensor feedback to adjust their grasp and manipulation strategies.

With this in mind, research groups worldwide have been trying to develop techniques that give robots a sense of touch by analyzing data collected by sensors. Many of these techniques are based on deep learning architectures. While some are promising, they typically require vast amounts of training data and do not always generalize well to previously unseen objects.

Researchers at ETH Zurich have recently introduced a new deep learning-based strategy that could enable tactile sensing in robots without requiring large amounts of real-world data. Their approach, outlined in a paper pre-published on arXiv, entails training deep neural networks entirely on simulation data.

"Our technique learns from data how to predict the distribution of the forces exerted by an object in contact with the sensing surface," Carlo Sferrazza, one of the researchers who carried out the study, told TechXplore. "So far, this data (in the order of tens of thousands of data points) needed to be collected in an experimental setup over several hours, which was expensive in terms of time and equipment. In this work, we generated our data entirely in simulation, retaining high sensing accuracy when deploying our technique in the real world."

In their experiments, Sferrazza and his colleagues used a sensor they built from simple, low-cost components. The sensor consists of a standard camera placed below a soft material that contains a random spread of tiny plastic particles.

When a force is applied to its surface, the soft material deforms and causes the plastic particles to move. This motion is then captured by the sensor's camera and recorded.

"We exploit the image patterns created by the moving particles to extract information about the forces causing the material deformation," Sferrazza explained. "By densely embedding the particles into the material we can obtain an extremely high resolution. Since we take a data-driven approach to solve this task, we can overcome the complexity of modeling contact with soft materials and estimate the distribution of these forces with high accuracy."

Essentially, the researchers modeled the sensor's soft material and camera projection using state-of-the-art computational methods. They then used these models in simulations to create a dataset of 13,448 synthetic images for training tactile sensing algorithms. Generating the training data in simulation is highly advantageous, as it spared them from having to collect and annotate data in the real world.
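To make the idea of simulated training data more concrete, here is a minimal Python sketch of how one synthetic image and its label could be produced. It is not the researchers' pipeline: the Gaussian "dent" is a toy stand-in for their material model, the ideal pinhole camera is an assumed projection model, and every name and number in it (`scatter_particles`, `indent`, `project`, `synth_sample`, the focal length, the slab dimensions) is hypothetical.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
F_PX = 100.0                 # focal length, in pixels
IMG = 64                     # side of the square image, in pixels
Z_BASE, Z_TOP = 8.0, 12.0    # slab extent along the camera axis, in mm


def scatter_particles(n, half_width=2.0, seed=0):
    """Spread n particles at random inside the soft slab above the camera."""
    rng = random.Random(seed)
    return [(rng.uniform(-half_width, half_width),
             rng.uniform(-half_width, half_width),
             rng.uniform(Z_BASE, Z_TOP)) for _ in range(n)]


def indent(p, cx, cy, depth, sigma=1.0):
    """Toy deformation: push a particle toward the camera under a Gaussian dent.

    A real pipeline would use a physical model of the material instead.
    """
    x, y, z = p
    r2 = (x - cx) ** 2 + (y - cy) ** 2
    # Displacement fades with lateral distance and vanishes at the fixed base.
    dz = depth * math.exp(-r2 / (2 * sigma ** 2)) * (z - Z_BASE) / (Z_TOP - Z_BASE)
    return (x, y, z - dz)


def project(p):
    """Ideal pinhole projection of a 3D point to pixel coordinates."""
    x, y, z = p
    c = IMG / 2
    return (F_PX * x / z + c, F_PX * y / z + c)


def render(particles):
    """Rasterize projected particles into a binary image (list of rows)."""
    img = [[0] * IMG for _ in range(IMG)]
    for p in particles:
        u, v = project(p)
        col, row = int(round(u)), int(round(v))
        if 0 <= row < IMG and 0 <= col < IMG:
            img[row][col] = 1
    return img


def synth_sample(particles, cx, cy, depth):
    """Return one (image, label) pair for a simulated contact."""
    moved = [indent(p, cx, cy, depth) for p in particles]
    return render(moved), {"cx": cx, "cy": cy, "depth": depth}


particles = scatter_particles(200, seed=1)
image, label = synth_sample(particles, cx=0.0, cy=0.0, depth=1.0)
```

Repeating `synth_sample` over many contact locations and depths would yield a labeled dataset without touching a physical sensor. In this toy version the label is just the contact parameters, whereas the approach described in the article predicts the distribution of forces on the sensing surface.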

"We also developed a transfer learning technique that allows us to use the same model on multiple instances of the tactile sensors we produce in the real-world, without the need for additional data," Sferrazza said. "This means that each sensor becomes cheaper to produce, as they don't require additional calibration efforts."

The researchers used the synthetic dataset they created to train a neural network architecture for vision-based tactile sensing applications and then evaluated its performance in a series of tests. The neural network achieved remarkable results, making accurate sensing predictions on real data even though it was trained entirely on simulated data.

"The tailored neural network architecture that we trained also shows very promising generalization possibilities for use in other situations, when applied to data that is quite different from that used in our simulations, e.g., for the estimation of contact with single or multiple objects of arbitrary shapes," Sferrazza said.

In the future, the deep learning architecture developed by Sferrazza and his colleagues could provide robots with an artificial sense of touch, potentially enhancing their grasping and manipulation skills. In addition, the synthetic dataset they compiled could be used to train other models for tactile sensing or may inspire the creation of new simulation-based datasets.

"We now want to evaluate our algorithms in tasks that involve very general interactions with complex objects, and we are also working on improving their accuracy," Sferrazza said. "We think that this technique will show its advantages when applied to real-world robotic tasks, such as applications that involve the fine manipulation of fragile objects—such as a glass or an egg."

More information: Learning the sense of touch in simulation: a sim-to-real strategy for vision-based tactile sensing. arXiv:2003.02640 [cs.RO]. arxiv.org/abs/2003.02640

C. Sferrazza, A. Wahlsten, C. Trueeb and R. D'Andrea, "Ground Truth Force Distribution for Learning-Based Tactile Sensing: A Finite Element Approach," in IEEE Access, vol. 7, pp. 173438-173449, 2019.