Researchers at Princeton University, in collaboration with Google, have developed a novel method to enhance the decision-making capabilities of robots. This technique teaches robots to recognize their own uncertainty and seek human assistance when necessary.
The approach quantifies the ambiguity inherent in human language, enabling robots to determine when to ask for further instructions. For example, a command like “pick up a bowl” is highly ambiguous when several bowls sit on the table, prompting the robot to request clarification.
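One simple way to picture this quantification (an illustrative sketch, not the paper’s exact method) is to have the language model score each candidate action and measure the entropy of the resulting distribution: an even spread of probability mass signals an ambiguous instruction, while a single dominant action signals a clear one. The probabilities below are stand-in numbers.

```python
import math

def entropy(probs):
    """Shannon entropy in bits; higher means more ambiguity."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# "Pick up a bowl" with three bowls on the table: probability mass is
# spread almost evenly over the candidate actions, so entropy is high.
ambiguous = [0.35, 0.33, 0.32]

# "Pick up the red bowl": one action dominates, so entropy is low.
clear = [0.96, 0.02, 0.02]

print(entropy(ambiguous))  # high: roughly 1.58 bits
print(entropy(clear))      # low: roughly 0.28 bits
```

A robot could compare such a score against a threshold to decide whether the instruction needs clarification before acting.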
This capability matters most for tasks that go beyond simple, explicit commands. Large language models (LLMs) like ChatGPT help robots interpret instructions and navigate complex environments, but their outputs can be unreliable. Anirudha Majumdar, assistant professor of mechanical and aerospace engineering at Princeton and the study’s senior author, emphasized that robots equipped with LLMs need to recognize the limits of their own knowledge.
The system also allows users to set desired success rates for robots, which correspond to specific uncertainty thresholds. For instance, a surgical robot would have a much lower error tolerance compared to a robot used for domestic tasks.
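The mapping from a target success rate to an uncertainty threshold can be sketched with split conformal calibration: collect the model’s confidence in the correct action on a set of held-out calibration tasks, then pick the quantile that the target rate dictates. This is a minimal sketch under that assumption; the function name and scores are hypothetical, not the authors’ code.

```python
def calibrate_threshold(calibration_scores, target_success_rate):
    """Return a confidence threshold from held-out calibration scores.

    calibration_scores: model confidence in the *correct* action on
    each calibration task. The conformal rank (n + 1 adjustment) makes
    the cutoff slightly conservative for finite calibration sets.
    """
    n = len(calibration_scores)
    alpha = 1.0 - target_success_rate          # allowed error rate
    rank = int(alpha * (n + 1)) - 1            # index of the cutoff score
    rank = max(0, min(rank, n - 1))            # clamp to a valid index
    return sorted(calibration_scores)[rank]

# Hypothetical calibration scores from eight held-out tasks:
scores = [0.92, 0.81, 0.67, 0.95, 0.74, 0.88, 0.59, 0.90]

# A home robot tolerating 20% error gets a permissive threshold;
# a safety-critical robot demanding 99% success would get a stricter one.
tau = calibrate_threshold(scores, target_success_rate=0.8)
print(tau)  # 0.59
```

Raising the target success rate raises the threshold, so the robot asks for help more often; lowering it lets the robot act autonomously more of the time.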
Allen Ren, a graduate student at Princeton and the study’s lead author, highlighted the balance between achieving user-defined success levels and minimizing the need for human intervention. Ren’s work earned him a best student paper award at the Conference on Robot Learning in Atlanta.
The team tested their method on a simulated robotic arm and on actual robots at Google’s facilities in New York City and Mountain View, California. One experiment involved a robotic arm tasked with sorting toy food items, while another featured a robot in an office kitchen environment faced with ambiguous instructions about microwaving a bowl.
The researchers employed a statistical approach called conformal prediction to enable the robot to request human help based on a calculated probability threshold. This method proved effective in scenarios where the robot had to decide between multiple actions, like choosing which bowl to place in a microwave.
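The decision rule can be sketched as follows: keep every candidate action whose confidence clears the calibrated threshold, act if exactly one survives, and defer to a human otherwise. This is an illustrative sketch, not the authors’ implementation; the action names and scores are invented for the bowl-and-microwave example.

```python
def prediction_set(action_scores, tau):
    """Keep every action whose model confidence clears the threshold tau
    (tau would be calibrated offline via conformal prediction)."""
    return [a for a, s in action_scores.items() if s >= tau]

def decide(action_scores, tau):
    candidates = prediction_set(action_scores, tau)
    if len(candidates) == 1:
        return ("act", candidates[0])
    # Ambiguous (or empty) prediction set: defer to the human.
    return ("ask_human", candidates)

# "Place the bowl in the microwave" with two bowls on the counter:
scores = {"place metal bowl": 0.48, "place plastic bowl": 0.46, "do nothing": 0.06}
print(decide(scores, tau=0.40))
# → ('ask_human', ['place metal bowl', 'place plastic bowl'])
```

Because both bowl-placing actions clear the threshold, the prediction set contains more than one option and the robot requests clarification instead of guessing.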
Coauthor Andy Zeng, a research scientist at Google DeepMind, noted the insights gained from addressing the physical limitations of robots, which are not apparent in abstract systems like LLMs alone.
The collaboration between Ren and Majumdar with Zeng began after Zeng’s presentation at the Princeton Robotics Seminar series. This partnership leveraged Google’s resources, including access to large language models and diverse hardware platforms.
Ren is now expanding this research to address challenges in active perception for robots, such as predicting the location of objects in different parts of a house.
Photo credit: Allen Ren et al./Princeton University