At The Robot Learning Lab, we develop advanced robots, empowered by artificial intelligence, to assist us all in everyday environments. Our research lies at the intersection of robotics, computer vision, and machine learning, with a primary focus on robot manipulation: robots that can physically interact with objects using their arms and hands. We are currently investigating new strategies based on Imitation Learning, Reinforcement Learning, and Vision-Language Models, to enable efficient and general learning capabilities. Applications include domestic robots (e.g. tidying the home), manufacturing robots (e.g. assembling products in a factory), and warehouse robots (e.g. picking and placing items from and into storage). The lab is led by Dr Edward Johns in the Department of Computing at Imperial College London. Welcome!
Dream2Real accepted at ICRA 2024!
Dream2Real enables robots to "dream" in 3D using NeRFs, and "evaluate" in 2D using VLMs. First, an object-centric NeRF of a scene is created. Then, 2D images of plausible reconfigurations of the scene are rendered, and evaluated with respect to a language command using CLIP. Finally, the robot recreates the configuration with the best score via pick-and-place.
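The final scoring step of this pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the image and text embeddings are assumed to come from a CLIP encoder, and `best_configuration` simply ranks the rendered reconfigurations by cosine similarity to the command.

```python
import numpy as np

def clip_score(image_emb: np.ndarray, text_emb: np.ndarray) -> float:
    # Cosine similarity between a rendered image's embedding and the
    # language command's embedding (both assumed from a CLIP encoder).
    a = image_emb / np.linalg.norm(image_emb)
    b = text_emb / np.linalg.norm(text_emb)
    return float(a @ b)

def best_configuration(candidate_embs, text_emb):
    """Return the index of the rendered reconfiguration whose embedding
    best matches the command, along with all candidate scores."""
    scores = [clip_score(e, text_emb) for e in candidate_embs]
    return int(np.argmax(scores)), scores
```

The robot would then execute pick-and-place to recreate the arrangement at the returned index.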
Prismer published in TMLR!
Prismer is a data- and parameter-efficient vision-language model that leverages an ensemble of diverse, pre-trained, task-specific experts. Prismer achieves fine-tuned and few-shot vision-language reasoning performance competitive with the current state of the art, whilst requiring up to two orders of magnitude less training data.
On the Effectiveness of Retrieval, Alignment, and Replay in Manipulation
published in RA-Letters!
We study a taxonomy of recent imitation learning methods along two axes: whether generalisation is achieved via retrieval or via interpolation, and whether or not a trajectory is decomposed into "approaching" and "interacting" phases. We show that, for efficient learning with a single demonstration per object, the optimal combination is "retrieval, alignment, and replay".
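The "retrieval, alignment, and replay" combination can be sketched in a simplified planar setting. This is an illustrative sketch, not the paper's method: demonstrations are assumed to be stored as (embedding, trajectory, object pose) triples, and alignment is a planar SE(2) transform of the demo waypoints into the new object's frame.

```python
import numpy as np

def retrieve(demo_embs, obs_emb):
    # Retrieval: pick the nearest stored demonstration by embedding distance.
    dists = [np.linalg.norm(e - obs_emb) for e in demo_embs]
    return int(np.argmin(dists))

def align(traj, demo_obj_pose, new_obj_pose):
    # Alignment: express demo waypoints relative to the demo object's
    # pose, then map them into the new object's frame (planar SE(2)).
    def to_mat(x, y, th):
        c, s = np.cos(th), np.sin(th)
        return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])
    T = to_mat(*new_obj_pose) @ np.linalg.inv(to_mat(*demo_obj_pose))
    pts = np.c_[traj, np.ones(len(traj))]   # homogeneous (x, y, 1)
    return (pts @ T.T)[:, :2]

def replay(demo_embs, demo_trajs, demo_poses, obs_emb, new_obj_pose):
    # Replay: execute the retrieved, aligned trajectory open-loop.
    i = retrieve(demo_embs, obs_emb)
    return align(demo_trajs[i], demo_poses[i], new_obj_pose)
```

With a single demonstration per object, retrieval selects which demo to reuse, and alignment accounts for the new object pose before replay.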
Annual review of all our research in 2023!
Language Models as Zero-Shot Trajectory Generators
presented at the CoRL 2023 LangRob and NeurIPS Robot Learning Workshops!
Can an LLM (GPT-4) directly predict a dense sequence of end-effector poses for manipulation skills, when given access to only object detection and segmentation vision models? We study how well a single task-agnostic prompt, without any in-context examples, motion primitives, or external trajectory optimisers, can perform across 26 real-world language-based tasks.
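The output side of such a system can be sketched as below. This is an assumption-laden illustration, not the paper's actual prompt or output format: the LLM's reply is assumed to be a JSON list of `[x, y, z, roll, pitch, yaw, gripper]` waypoints, which are then linearly interpolated into a dense pose sequence for execution.

```python
import json
import numpy as np

def parse_pose_trajectory(llm_response: str) -> np.ndarray:
    """Parse the LLM's reply into a waypoint array. The reply format
    (JSON list of 7-D waypoints) is an assumption made for this sketch."""
    waypoints = np.array(json.loads(llm_response), dtype=float)
    assert waypoints.ndim == 2 and waypoints.shape[1] == 7
    return waypoints

def densify(waypoints: np.ndarray, steps_per_segment: int = 10) -> np.ndarray:
    # Linearly interpolate between consecutive waypoints to obtain a
    # dense sequence of end-effector poses.
    out = []
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        for t in np.linspace(0.0, 1.0, steps_per_segment, endpoint=False):
            out.append((1 - t) * a + t * b)
    out.append(waypoints[-1])
    return np.array(out)
```

Linear interpolation of Euler angles is only a rough stand-in; a real controller would interpolate orientations properly (e.g. via quaternions).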
SceneScore: Learning a Cost Function for Object Arrangement
presented at the CoRL 2023 LEAP Workshop!
We propose an energy-based graph neural network which can learn to predict the cost of an arrangement of objects, when trained on example arrangements. Minimising the energy during inference then enables a robot to determine object goal poses for rearrangement.
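The inference step can be sketched as gradient descent on the energy over object poses. This is a toy illustration: a hand-written quadratic cost stands in for the learned graph neural network (which would supply gradients via autograd rather than the finite differences used here).

```python
import numpy as np

def energy(poses: np.ndarray) -> float:
    # Stand-in for the learned energy network: a toy cost that prefers
    # objects evenly spaced along the line y = 0.
    targets = np.stack([np.arange(len(poses)), np.zeros(len(poses))], axis=1)
    return float(np.sum((poses - targets) ** 2))

def minimise_energy(poses, lr=0.1, iters=200, eps=1e-4):
    """Infer goal poses by descending the energy with central-difference
    gradients, starting from the objects' current poses."""
    poses = poses.astype(float).copy()
    for _ in range(iters):
        grad = np.zeros_like(poses)
        for i in np.ndindex(poses.shape):
            bump = np.zeros_like(poses)
            bump[i] = eps
            grad[i] = (energy(poses + bump) - energy(poses - bump)) / (2 * eps)
        poses -= lr * grad
    return poses
```

The minimiser's output would then serve as the goal poses for a downstream rearrangement planner.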