Toyota Research Institute (TRI) recently revealed a generative AI approach, based on Diffusion Policy, that lets robots learn dexterous skills quickly. The method is a step toward building “Large Behavior Models (LBMs)” for robots, analogous to the Large Language Models (LLMs) that have transformed conversational AI.
Gill Pratt, CEO of TRI and Chief Scientist for Toyota Motor Corporation, emphasized that the focus of the institute’s robotics research is to augment human capabilities. The introduction of this new teaching method, according to Pratt, has enabled robots to better support human activities.
In the past, instructing robots was slow, inconsistent, and often narrowly focused, typically requiring intensive coding and many trial-and-error cycles to instill desired behaviors. With TRI’s new approach, by contrast, robots have acquired more than 60 complex skills, such as pouring liquids, using tools, and manipulating deformable objects, without any new code; the only change was supplying the robot with fresh training data. Following this trajectory, TRI plans to teach its robots hundreds more skills by the end of the current year and aims for a total of 1,000 skills by the end of 2024.
This advancement indicates that robots can be trained to function in diverse scenarios and execute a broad spectrum of behaviors beyond just basic tasks like picking up and placing objects. The robots at TRI have demonstrated the ability to interact with their surroundings in a multifaceted manner, paving the way for robots to assist humans in a variety of contexts and dynamic environments.
Russ Tedrake, Vice President of Robotics Research at TRI, voiced his astonishment at the capabilities of these robots. He stated that the distinctiveness of this approach lies in its speed and reliability in teaching new skills, especially those involving traditionally challenging materials like cloth and liquids.
From a technical standpoint, TRI’s robot behavior model learns from haptic demonstrations paired with a language description of the desired outcome. The AI-based Diffusion Policy then allows the robot to learn the demonstrated skill. This method has consistently produced rapid and dependable results.
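To make the underlying idea concrete, here is a deliberately simplified, pure-Python sketch of the core mechanism behind a diffusion policy: an action is generated by starting from Gaussian noise and iteratively denoising it, conditioned on the robot's observation. Everything here is a toy stand-in, not TRI's implementation — the real method uses a trained neural network as the noise predictor and operates on action sequences conditioned on vision, touch, and language; the `noise_predictor` below is a hypothetical oracle that makes the loop converge for illustration.

```python
import math
import random

T = 50  # number of denoising steps
# Linear noise schedule (a common DDPM-style choice, assumed here)
BETAS = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
prod = 1.0
for a in ALPHAS:
    prod *= a
    ALPHA_BARS.append(prod)

def noise_predictor(action, obs, t):
    """Toy stand-in for the trained network epsilon_theta(x_t, obs, t).
    It returns the noise implied if `action` were a noised version of the
    target action suggested by the observation (a hypothetical setpoint)."""
    target = obs["target_action"]
    alpha_bar = ALPHA_BARS[t]
    return (action - math.sqrt(alpha_bar) * target) / math.sqrt(1.0 - alpha_bar)

def sample_action(obs, rng):
    """DDPM-style reverse process: denoise a Gaussian sample into an action."""
    x = rng.gauss(0.0, 1.0)
    for t in reversed(range(T)):
        eps = noise_predictor(x, obs, t)
        alpha, alpha_bar = ALPHAS[t], ALPHA_BARS[t]
        # Epsilon-parameterized posterior mean (standard DDPM update)
        x = (x - (1.0 - alpha) / math.sqrt(1.0 - alpha_bar) * eps) / math.sqrt(alpha)
        if t > 0:  # no noise is added on the final step
            x += math.sqrt(BETAS[t]) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
obs = {"target_action": 0.7}  # e.g. a gripper setpoint inferred from perception
action = sample_action(obs, rng)
print(round(action, 2))  # converges to the target action, here 0.7
```

The appeal of this formulation for robot learning is that the same sampling loop can represent multimodal behavior: when demonstrations show several valid ways to do a task, the learned denoiser can produce any of them rather than averaging them into an invalid middle ground.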