Researchers from Duke University and the Army Research Laboratory have introduced a new artificial intelligence (AI) training framework called GUIDE, which enables AI to learn complex tasks through real-time human feedback rather than relying solely on extensive datasets and simulations. The framework, which aims to make AI training more adaptive and responsive, will be presented at the Conference on Neural Information Processing Systems (NeurIPS 2024) in Vancouver, Canada.
GUIDE allows humans to provide continuous feedback as they observe an AI system’s performance, much as a driving instructor offers real-time guidance, fostering incremental improvements in the AI’s decision-making. Unlike existing methods that restrict feedback to a few discrete categories such as “good,” “bad,” or “neutral,” GUIDE uses a continuous, graded feedback scale that permits far more nuanced and detailed input.
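To make that distinction concrete, the sketch below shows one plausible way a continuous human signal might be folded into a reinforcement-learning reward, in contrast to a three-way button press. The function names, the blending rule, and the weight `alpha` are illustrative assumptions for exposition, not the published GUIDE algorithm.

```python
import numpy as np

def discrete_feedback_reward(label: str) -> float:
    """Conventional approach: feedback limited to a few categories."""
    return {"good": 1.0, "neutral": 0.0, "bad": -1.0}[label]

def continuous_feedback_reward(env_reward: float,
                               human_signal: float,
                               alpha: float = 0.5) -> float:
    """Blend the environment reward with a real-valued human signal.

    human_signal: a scalar in [-1, 1], e.g. read from a slider the
    trainer moves while watching the agent, rather than a button press.
    (This blending scheme is a hypothetical sketch, not GUIDE itself.)
    """
    human_signal = float(np.clip(human_signal, -1.0, 1.0))
    return (1.0 - alpha) * env_reward + alpha * human_signal

# A gentle nudge (0.3) and a strong correction (-0.9) shape the reward
# differently; a discrete scheme would collapse both into "bad" or "good".
print(continuous_feedback_reward(env_reward=0.1, human_signal=0.3))   # 0.2
print(continuous_feedback_reward(env_reward=0.1, human_signal=-0.9))  # -0.4
```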
In a study demonstrating the platform, participants trained an AI to play a hide-and-seek game featuring two computer-controlled players: an AI-guided red seeker pursuing a green hider through an obstacle-filled environment. Feedback from human trainers, delivered in sessions of just 10 minutes, produced a 30% improvement in the AI’s performance over traditional reinforcement learning methods.
Lead researcher Boyuan Chen noted that GUIDE addresses the limitations of conventional AI training, which often depends on extensive pre-existing datasets and adapts poorly to new situations. The system also incorporates a simulated human trainer: an AI model trained to reproduce the feedback patterns observed during human sessions, which lets the agent continue training after direct human involvement ends.
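One way to picture such a surrogate trainer is as a regression model fit to the human’s logged feedback and then queried in the human’s place. The sketch below uses scikit-learn for brevity; the model choice, the state features, and the toy feedback pattern are all assumptions made for illustration, not the paper’s architecture.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical logged data from a human session: state features (e.g.,
# seeker-hider distance, relative bearing) paired with the continuous
# feedback the trainer gave at that moment.
states = rng.uniform(-1.0, 1.0, size=(500, 4))
human_feedback = np.tanh(-states[:, 0] + 0.5 * states[:, 1])  # toy pattern

# Fit a small network to imitate the human's feedback behavior.
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32),
                         max_iter=2000, random_state=0)
surrogate.fit(states, human_feedback)

# After the human leaves, the surrogate stands in for them: new states
# are scored without further human input, so training can continue.
new_states = rng.uniform(-1.0, 1.0, size=(3, 4))
predicted_feedback = surrogate.predict(new_states)
print(predicted_feedback)  # continuous pseudo-feedback for ongoing training
```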
The study, which involved 50 participants with no prior experience in training AI, highlighted how individual cognitive abilities, such as spatial reasoning and decision-making, influence the effectiveness of the guidance humans provide. These findings open possibilities for further research into enhancing human capabilities to improve AI training.
Chen emphasized the broader potential of GUIDE, including its application in environments with limited information and the development of more intuitive systems that everyday users can interact with effectively. Future work aims to integrate diverse communication signals, such as language, gestures, and facial expressions, to enhance the system’s responsiveness.