Researchers at Cornell University have introduced a robot learning system that lets a robot perform a task after viewing a single instructional video. The system, known as RHyME (Retrieval for Hybrid Imitation under Mismatched Execution), is aimed at overcoming the limitations of traditional robot training methods, which often require extensive, precisely labeled datasets and struggle with deviations from scripted actions.
The research, led by doctoral student Kushal Kedia and computer science assistant professor Sanjiban Choudhury, addresses a core challenge in robotic learning: the mismatch between human demonstrations and robotic execution. Existing imitation learning systems often fail when human movements in instructional videos diverge from what robots are capable of replicating. RHyME proposes a retrieval-based framework that enables robots to reference previously observed actions to bridge these gaps.
The system enables a robot to view a demonstration—such as placing a mug in a sink—and then retrieve similar motion sequences from its memory to complete the task, even if the demonstration does not exactly align with the robot’s own capabilities or environment. According to the researchers, this allows for generalization across different task contexts and significantly reduces the need for extensive training datasets. RHyME-trained robots required only 30 minutes of robot-specific data and, in laboratory tests, demonstrated more than a 50% improvement in task completion rates compared to earlier methods.
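The article does not describe RHyME's actual architecture, but the core retrieval idea it reports can be sketched simply: embed segments of the human video, then look up the robot's most similar previously recorded motion clips and reuse them. The snippet below is a minimal illustrative sketch of that retrieval step only; the names `MotionMemory`, `imitate_from_video`, and the toy embeddings are hypothetical and invented for this example, not RHyME's published API.

```python
import numpy as np

class MotionMemory:
    """Hypothetical store of previously recorded robot motion clips,
    each paired with a feature embedding of the motion."""

    def __init__(self):
        self.embeddings = []   # list of 1-D feature vectors
        self.clips = []        # list of corresponding robot action sequences

    def add(self, embedding, clip):
        self.embeddings.append(np.asarray(embedding, dtype=float))
        self.clips.append(clip)

    def retrieve(self, query):
        """Return the stored clip whose embedding is most similar
        (by cosine similarity) to the query embedding."""
        query = np.asarray(query, dtype=float)
        sims = [
            float(np.dot(e, query) / (np.linalg.norm(e) * np.linalg.norm(query) + 1e-8))
            for e in self.embeddings
        ]
        return self.clips[int(np.argmax(sims))]


def imitate_from_video(segment_embeddings, memory):
    """Map each segment of a human demonstration to the closest
    robot motion clip in memory and stitch the results together."""
    return [memory.retrieve(seg) for seg in segment_embeddings]


# Toy usage: two stored robot clips, one two-segment human demonstration.
memory = MotionMemory()
memory.add([1.0, 0.0], clip=["reach_mug", "grasp_mug"])
memory.add([0.0, 1.0], clip=["move_to_sink", "release_mug"])

demo_segments = [[0.9, 0.1], [0.2, 0.8]]   # embeddings of the video segments
plan = imitate_from_video(demo_segments, memory)
print(plan)  # [['reach_mug', 'grasp_mug'], ['move_to_sink', 'release_mug']]
```

In this toy version, retrieval is a plain nearest-neighbor lookup over fixed embeddings; the published system would learn how to embed and compose such clips. The sketch is only meant to show how retrieval lets a robot fall back on its own prior experience when a human demonstration does not match what it can execute directly.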
The research will be presented in May at the IEEE International Conference on Robotics and Automation in Atlanta.