
‘OpenAI’s o3 surpasses humans in reasoning test’

by Marco van der Hoeven

OpenAI’s new o3 model has become the first AI system to surpass human performance on the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI), scoring 76% accuracy compared with an average human score of just over 75%. The result was confirmed in testing coordinated by OpenAI and François Chollet, the creator of ARC-AGI and a researcher at Google.

ARC-AGI is designed to evaluate an AI system’s ability to adapt to novel tasks, testing abstract reasoning and pattern recognition in a visual domain rather than through natural language processing. OpenAI’s o3 represents a departure from the architecture of previous GPT models, employing methods that allow it to handle these challenges with a new level of adaptability.

Chollet described o3’s performance as a “step-function increase in AI capabilities” and attributed its success to innovations in its underlying architecture, which he suggests involve sophisticated search processes during problem-solving. Despite this accomplishment, Chollet noted that o3’s results were achieved with the aid of training on ARC-related datasets, raising questions about how the model would perform without such preparation.

While o3’s achievement is being hailed as a breakthrough in AI capabilities, Chollet and OpenAI emphasize that it does not represent the arrival of Artificial General Intelligence (AGI). The model still fails certain straightforward tasks within the ARC-AGI framework, underscoring its limitations compared to human cognition. OpenAI plans to release a “mini” version of o3 by the end of January 2025, with the full version to follow later.
