Sim-to-Real Transfer for Optical Tactile Sensing
Deep learning and reinforcement learning methods have been shown to enable learning of flexible and complex robot controllers. However, the reliance on large amounts of training data often requires data collection to be carried out in simulation, with a number of sim-to-real transfer methods being developed in recent years. In this paper, we study these techniques for tactile sensing using the TacTip optical tactile sensor, which consists of a deformable tip with a camera observing the positions of pins inside this tip. We designed a model for soft body simulation which was implemented using the Unity physics engine, and trained a neural network to predict the locations and angles of edges when in contact with the sensor. Using domain randomisation techniques for sim-to-real transfer, we show how this framework can be used to accurately predict edges with less than 1 mm prediction error in real-world testing, without any real-world data at all.
Try this test: with your fingers placed on a bottle top, try unscrewing the top from the bottle with your eyes closed, and again with your eyes open. You'll find that your eyes are really not necessary to do this. Yet if we could do the same experiment by de-activating our tactile senses (not quite so easy...), then we would likely find that the task becomes almost impossible. So, whilst we use our eyes for making high-level decisions (e.g. where is the bottle?), our tactile senses take over when we actually physically interact with objects. But even though cameras and computer vision have comparable abilities to human eyes and human visual perception, we are some way from bringing tactile sensing to robots.
One example of a step in the right direction is the TacTip sensor developed at the University of Bristol. It consists of a set of pins attached to a deformable tip, with a camera built into the sensor which observes the movement of these pins as the tip deforms, creating a "tactile image":
We decided to take a closer look at this sensor and study whether we could use deep learning on these tactile images. Specifically, we extracted the 2D locations of the pins in the camera image, and trained a neural network to predict the position and orientation of edges that the sensor is in contact with. This method was tested on three different tasks, each involving prediction of a different coordinate (in red below):
However, deep learning requires a very large amount of training data. One option would be to collect this data with the sensor mounted on the robot, but this is not very practical for collecting large datasets. Whilst it may not be so challenging for the simple tasks above, extending this framework to learn a control policy, such as with deep reinforcement learning, would require an impractically large amount of human supervision during training. Therefore, we began to think about whether we could collect tactile data in simulation, train a neural network with this simulated data, and then apply the neural network to a real sensor and real objects.
To this end, we designed a soft-body model of the sensor's deformable tip, and built this into the Unity physics engine:
We can see that the simulated sensor deforms in a similar way to the real sensor, but on close inspection there is still a significant difference between the simulated behaviour and real-world deformation. This could be troublesome when transferring our trained model from simulation to reality, because the neural network would not have seen anything like the real-world data. One simple method for addressing this is to randomise simulation parameters, often referred to as domain randomisation, so that the learned model is robust to the difference between simulation and reality. Effectively, this aims to cover all the possible deformation behaviours that may occur in the real world, so that even if the real data is just one small subset of the total simulated data, the trained model will still know how to interpret it. The accompanying figure shows the simulated sensor with three different sets of simulation parameters, each undergoing the same external force.
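As a rough illustration of what randomising simulation parameters can look like, here is a minimal Python sketch. It is not the Unity implementation used in this work: the parameter names (`stiffness`, `damping`, `pin_scale`), their ranges, and the `level` knob for narrowing or widening the randomisation are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class TipParams:
    """Hypothetical soft-body parameters; the real simulator's may differ."""
    stiffness: float
    damping: float
    pin_scale: float


# Illustrative ranges only (assumed, not taken from the paper).
PARAM_RANGES = {
    "stiffness": (0.5, 2.0),
    "damping": (0.1, 1.0),
    "pin_scale": (0.9, 1.1),
}


def sample_params(rng: random.Random, level: float = 1.0) -> TipParams:
    """Draw one parameter set uniformly at random.

    `level` in [0, 1] shrinks every range around its midpoint:
    0 means no randomisation, 1 means the full range.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    values = {}
    for name, (low, high) in PARAM_RANGES.items():
        mid = 0.5 * (low + high)
        half_width = 0.5 * (high - low) * level
        values[name] = rng.uniform(mid - half_width, mid + half_width)
    return TipParams(**values)


# Draw a fresh parameter set for each simulated data-collection episode.
rng = random.Random(0)
episode_params = [sample_params(rng, level=1.0) for _ in range(3)]
```

In a setup like this, each simulated episode would run with its own sampled parameters, so the training data spans a family of deformation behaviours instead of a single hand-tuned one.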
Using these randomised parameters, we collected data in simulation for all three tasks. The tip was moved along an object in the simulated environment, recording the tactile image along with the position of the sensor relative to the object:
A simple neural network was then trained to take the 2D positions of the sensor's pins as input and predict the position or angle of the object relative to the sensor. The trained network was then tested in the real world by mounting the TacTip sensor on our Sawyer robot.
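To make the input and output shapes concrete, the sketch below shows one way such a network could be written with NumPy: a single-hidden-layer regressor that flattens the pin coordinates and is trained by gradient descent on a mean squared error. The architecture, layer sizes, learning rate, and function names are assumptions for illustration; the post does not specify the network that was actually used.

```python
import numpy as np


def init_mlp(num_pins, hidden, num_outputs, rng):
    """Create weights for a one-hidden-layer MLP over flattened (x, y) pins."""
    in_dim = 2 * num_pins
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / in_dim), size=(in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, np.sqrt(1.0 / hidden), size=(hidden, num_outputs)),
        "b2": np.zeros(num_outputs),
    }


def predict(params, pins):
    """pins: (batch, num_pins, 2) array -> (batch, num_outputs) predictions."""
    x = pins.reshape(pins.shape[0], -1)
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU hidden layer
    return h @ params["W2"] + params["b2"]


def mse_step(params, pins, targets, lr=1e-2):
    """One full-batch gradient-descent step; returns the loss before the update."""
    x = pins.reshape(pins.shape[0], -1)
    z = x @ params["W1"] + params["b1"]
    h = np.maximum(0.0, z)
    err = (h @ params["W2"] + params["b2"]) - targets
    loss = float(np.mean(err ** 2))

    # Backpropagate the mean squared error through both layers.
    g_pred = 2.0 * err / err.size
    g_z = (g_pred @ params["W2"].T) * (z > 0)
    params["W2"] -= lr * (h.T @ g_pred)
    params["b2"] -= lr * g_pred.sum(axis=0)
    params["W1"] -= lr * (x.T @ g_z)
    params["b1"] -= lr * g_z.sum(axis=0)
    return loss
```

Here `num_outputs` would be 1 for a single position or angle and 2 for an x-y position. Training would use only simulated pin positions and labels, and `predict` would then be applied unchanged to pin positions extracted from the real sensor.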
The real-world results led us to a number of interesting conclusions. First, a simple representation of the tactile image, taking the 2D positions of the pins, performed significantly better than manually engineered alternatives. Second, performance was robust to the level of randomisation used in the simulator, remaining equally good across the different levels. This is helpful in avoiding the need to carefully calibrate the simulator to the specific real-world task. The figure below illustrates this for the third task of predicting the x-y position of the sensor on the pole:
Third, the trained neural network was able to predict positions and orientations with errors below 10% of the training data range, which corresponds to less than 1 mm in position and 15 degrees in orientation.
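Expressing error as a percentage of the training data range is a simple normalisation, sketched below. The helper name is hypothetical, and the 10 mm and 150 degree ranges in the usage lines are back-calculated from the figures quoted above on the assumption that 10% maps exactly onto 1 mm and 15 degrees; the ranges themselves are not stated in this post.

```python
def error_fraction_of_range(abs_error, range_min, range_max):
    """Return an absolute error as a fraction of the label range it was trained on."""
    span = range_max - range_min
    if span <= 0:
        raise ValueError("range_max must be greater than range_min")
    return abs_error / span


# Assumed ranges, inferred from the quoted 10% / 1 mm / 15 degree figures.
position_fraction = error_fraction_of_range(1.0, 0.0, 10.0)    # error in mm
angle_fraction = error_fraction_of_range(15.0, 0.0, 150.0)     # error in degrees
```

Normalising in this way lets position and orientation errors be compared on the same scale, even though they are measured in different units.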
This project has shown us that it is possible to transfer tactile data from simulation to reality for the TacTip, using a simple representation of the pin positions and a standard domain randomisation technique for sim-to-real transfer. In other preliminary experiments not presented in this paper, we have found that learning a control policy (e.g. for edge following) using reinforcement learning, and transferring this to the real world, is a far more challenging task than transferring a supervised learning model as in this paper. However, the success with supervised learning gives us hope that, with further investigation, it may be possible to apply this method to more complex tasks, with a view towards dexterous manipulation policies that can be trained in simulation and applied directly to the real world, without any real-world training.
To read the full paper, please click here.