MEng Projects 2019-2020


Our robot, Sawyer

If you are an MEng student in the Department of Computing at Imperial College, and are keen to do your individual project with me in the Robot Learning Lab, then I would like to hear from you! This is a unique opportunity for top students who are interested in robotics, as well as deep learning, reinforcement learning, or computer vision, to gain some experience in a state-of-the-art robotics research lab.


These projects are best suited to students considering a career in research, either in academia or a top tech firm, and will also give you an ideal experience of what life as a PhD student would be like. PhD positions in my lab may be available to the best students, and those who are productive will also have the opportunity to publish and present their work at a leading international conference.


All projects aim to solve state-of-the-art research challenges that I am genuinely interested in exploring with you; they are not simply old projects repeated every year. Each project has a clear plan and set of milestones, but you are also free to take the research in your own direction if you discover something in particular you would like to develop.

Listed below are the three projects I am offering to MEng students for the 2019-2020 academic year. All projects involve aspects of robotics, machine learning, and computer vision, but if you are unsure which project is right for you, then here's a rough guide. If you are most interested in robotics, choose Project 1; if you are most interested in machine learning (specifically reinforcement learning), choose Project 2; and if you are most interested in computer vision, choose Project 3.

Please contact me to express your interest, and we will arrange a meeting where you can find out more. I look forward to hearing from you!

Further departmental guidelines on the MEng individual projects can be found by clicking here.

Project 1: Deep Learning for Image-Based Robot Control from Human Demonstrations

The power of deep learning has enabled robots to interact with the world directly from raw images, by mapping pixels to motor actions in an end-to-end manner. Recent works have shown this to be successful for tasks such as picking up objects, operating a hammer, and folding a towel (see video on right). The method uses imitation learning: image-action pairs are collected during demonstrations, which then form a dataset for training a convolutional neural network. However, this requires a very large number of demonstrations to enable the robot to generalise to small changes in the environment, such as object positions and illumination effects.
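As a rough illustration of the imitation learning idea described above, the sketch below fits a policy to observation-action pairs by minimising squared error with stochastic gradient descent. It is only a toy under stated assumptions: a linear model stands in for the convolutional neural network, short feature vectors stand in for images, and the names `train_linear_policy` and `demos` are hypothetical.

```python
def train_linear_policy(dataset, lr=0.1, epochs=200):
    """Behaviour cloning with a linear policy (a stand-in for a CNN).

    dataset: list of (observation, action) pairs, where each observation
    is a list of floats and each action is a single float.
    """
    n_features = len(dataset[0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        for obs, action in dataset:
            # Predicted action for this observation.
            pred = sum(w * x for w, x in zip(weights, obs))
            err = pred - action
            # Gradient step on the squared error (constant factor folded into lr).
            weights = [w - lr * err * x for w, x in zip(weights, obs)]
    return weights


# Hypothetical demonstrations generated by the rule action = 2*x0 - 1*x1.
demos = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
weights = train_linear_policy(demos)
print(weights)  # close to [2.0, -1.0]
```

With real images, the number of parameters is far larger, which is one reason why many more demonstrations are needed for the policy to generalise.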

In this project, you will study automatic control of robot arms using the above method, but with a difference. Rather than controlling the robot in an end-to-end manner with a single neural network, two different neural networks will be trained. The first will be trained to detect an object and estimate its pose, and the second will be trained to control the robot based on this object pose. As such, the overall pipeline is broken down into a modular approach, whilst still retaining the power of deep learning. You will explore whether this approach can train a robot with fewer demonstrations than the end-to-end approach. Experiments will initially be conducted in simulation, and you will then be able to evaluate your method on the robot in our lab, on a range of different tasks.
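The two-stage structure can be sketched as a composition of two functions. This is only a minimal sketch, assuming hand-written stand-ins for both networks: `estimate_pose` uses an intensity-weighted centroid in place of the learned pose estimator, and `control_from_pose` uses a proportional step in place of the learned controller. All function names and the `gain` parameter are hypothetical.

```python
def estimate_pose(image):
    """Stand-in for network 1: image -> object position (row, col).

    Uses the intensity-weighted centroid of the image instead of a
    trained detector and pose estimator.
    """
    total = sum(sum(row) for row in image)
    r = sum(i * v for i, row in enumerate(image) for v in row) / total
    c = sum(j * v for row in image for j, v in enumerate(row)) / total
    return (r, c)


def control_from_pose(pose, gripper, gain=0.5):
    """Stand-in for network 2: object pose -> motor action.

    Moves the gripper a fraction of the way towards the object.
    """
    return tuple(gain * (p - g) for p, g in zip(pose, gripper))


def modular_policy(image, gripper):
    """Full pipeline: the controller never sees the raw pixels."""
    return control_from_pose(estimate_pose(image), gripper)


# A 3x3 "image" with a single bright pixel at row 1, column 2.
image = [[0, 0, 0],
         [0, 0, 1],
         [0, 0, 0]]
action = modular_policy(image, gripper=(0.0, 0.0))
print(action)  # (0.5, 1.0)
```

Because the second stage only receives a low-dimensional pose, it can in principle be trained and tested separately from the vision stage.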

Click here and here to read some related papers.

Project 2: Deep Reinforcement Learning for Robot Control with Visual Attention


(Kelvin Xu et al.)

Deep reinforcement learning is an intriguing method for training robots to perform complex tasks by "trial and error". Typically, the observations are images, and the actions are motor controls. But in the field of computer vision, reinforcement learning has recently been used in a new way: to learn an attention mechanism that focusses on the important parts of an image (see image above). This can help with tasks where different parts of an image, and their spatial relationships, contribute to the overall meaning of the image. However, this idea has not yet been explored for robot learning.

In this project, you will train a convolutional neural network using deep reinforcement learning, to automatically decide which parts of an image are most useful for a particular robot control task. For example, if the task is to operate a screwdriver, the robot should focus on the parts of the scene corresponding to the screwdriver and the screw, and ignore all other parts of the scene. The experimental setup will involve training a robot initially with human demonstrations (as with Project 1), followed by further training with deep reinforcement learning, which will control a sliding window moving around the observed images. Experiments will initially be conducted in simulation, and you will then be able to evaluate your method on the robot in our lab, on a range of different tasks.
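To make the sliding-window idea concrete, the sketch below shows one possible action interface for the attention window. It covers only the window mechanics (moving and cropping), not the reinforcement learning algorithm that would choose the actions. The four-direction action set, the one-pixel step size, and the names `move_window` and `crop` are assumptions made for illustration.

```python
# Hypothetical discrete actions: (row offset, column offset).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def move_window(top_left, action, image_shape, window_size):
    """Apply an action to the window, clamping it inside the image."""
    dr, dc = ACTIONS[action]
    max_r = image_shape[0] - window_size[0]
    max_c = image_shape[1] - window_size[1]
    r = min(max(top_left[0] + dr, 0), max_r)
    c = min(max(top_left[1] + dc, 0), max_c)
    return (r, c)


def crop(image, top_left, window_size):
    """Return the part of the image currently under the window."""
    r, c = top_left
    h, w = window_size
    return [row[c:c + w] for row in image[r:r + h]]


# A 4x4 "image" whose pixel values encode their own coordinates.
image = [[10 * i + j for j in range(4)] for i in range(4)]
pos = (0, 0)
for a in ["down", "right", "right", "right"]:
    # The last "right" is clamped at the image border.
    pos = move_window(pos, a, image_shape=(4, 4), window_size=(2, 2))
glimpse = crop(image, pos, (2, 2))
print(pos, glimpse)  # (1, 2) [[12, 13], [22, 23]]
```

In a full system, a learned policy would pick the action at each step, and the cropped glimpse would be passed to the network that predicts the motor controls.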

Click here and here to read some related papers.

Project 3: Deep Learning for Object Pose Estimation and Robot Control using Feature Augmentation


(Sumit Saha)

Deep learning with convolutional neural networks has been shown to be very powerful for recognising objects in images. This same method can also be used to estimate the pose of an object, by predicting a position and orientation alongside the object class. This could then be used to train a robot to interact with that object. However, typical network architectures involve a set of convolutional layers, followed by a "flattening" operation, and then a set of fully-connected layers (see image above). This flattening means that each node in the fully-connected layer is responsive to only part of the original image, and therefore, a huge dataset is required to estimate object poses if the object can appear anywhere in the image.

In this project, you will investigate a new idea to address this problem, based on data augmentation. First, features from the final convolutional layer will be interpolated based on how similar the ground-truth object poses are for pairs of images. These new interpolated features will then be added to the original dataset and incorporated into the training. Experiments will explore whether this allows for object pose estimation with far fewer training images than existing methods, and how far apart the ground-truth poses can be when interpolation is applied whilst still creating new data that is representative of real images. The project can then be extended to a robot control task where, instead of predicting object poses based on an image of an object, the network predicts robot actions based on the image.
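One possible reading of the interpolation step is sketched below. It is a sketch under stated assumptions, not the method the project will necessarily use: features and poses are blended linearly with a fixed weight `alpha`, poses are position-only vectors compared by Euclidean distance, and pairs are kept only if their poses lie within a threshold `max_pose_distance`. All names and parameters are hypothetical.

```python
import itertools
import math


def interpolate(a, b, alpha):
    """Linear blend of two equal-length vectors."""
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]


def augment_features(features, poses, max_pose_distance, alpha=0.5):
    """Create new (feature, pose) pairs from images with similar poses.

    features: feature vectors from the final convolutional layer.
    poses: ground-truth poses, here position-only (an assumption).
    """
    new_features, new_poses = [], []
    for i, j in itertools.combinations(range(len(features)), 2):
        # Only interpolate between pairs whose poses are close together.
        if math.dist(poses[i], poses[j]) <= max_pose_distance:
            new_features.append(interpolate(features[i], features[j], alpha))
            new_poses.append(interpolate(poses[i], poses[j], alpha))
    return new_features, new_poses


# Three hypothetical training examples; only the first two poses are close.
features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
poses = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0]]
new_f, new_p = augment_features(features, poses, max_pose_distance=0.5)
print(len(new_f), new_f[0])  # 1 [0.5, 0.5]
```

Varying `max_pose_distance` corresponds to the question of how far apart two ground-truth poses can be before the interpolated features stop resembling those of real images. Orientation would need more careful handling than a linear blend, since angles wrap around.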

Click here and here to read some related papers.