OpenAI recently unveiled its new o1 models, giving ChatGPT users their first experience with AI that pauses to ‘think’ before responding.
Dubbed ‘Strawberry’ internally, the o1 models promise enhanced reasoning capabilities. However, they come with several limitations and a higher cost compared to their predecessor, GPT-4o.
Thinking Through Big Ideas
OpenAI o1 stands out for its approach to problem-solving, breaking down complex tasks into smaller, manageable steps. This capability, termed ‘multi-step reasoning,’ has been explored in research but is now practically implemented.
According to Kian Katanforoosh, CEO of Workera and adjunct lecturer at Stanford, ‘If you can train a reinforcement learning algorithm paired with some of the language model techniques that OpenAI has, you can technically create step-by-step thinking and allow the AI model to walk backwards from big ideas you’re trying to work through.’
Despite its capabilities, o1 is notably expensive. Users pay for both input and output tokens, as well as hidden ‘reasoning tokens’ that account for the model’s internal processing steps. These hidden costs can add up quickly, making careful usage important to avoid excessive charges.
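The billing impact of reasoning tokens can be sketched with a small cost estimator. The per-million-token prices below are illustrative assumptions, not official figures, and the function name is hypothetical; the key point is that hidden reasoning tokens are billed at the output rate even though they never appear in the response.

```python
def estimate_o1_cost(input_tokens, output_tokens, reasoning_tokens,
                     input_price_per_m=15.00, output_price_per_m=60.00):
    """Estimate a single request's cost in dollars.

    Prices are illustrative assumptions (dollars per million tokens),
    not quoted from OpenAI's price list. Hidden reasoning tokens are
    billed at the output-token rate, so they are added to the visible
    output total.
    """
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (billed_output / 1_000_000) * output_price_per_m

# A request with 2,000 input tokens, 500 visible output tokens, and
# 4,000 hidden reasoning tokens: the unseen tokens dominate the bill.
cost = estimate_o1_cost(2_000, 500, 4_000)
```

In this sketch the 4,000 hidden tokens cost eight times as much as the 500 visible ones, which is why a chatty reasoning trace on a simple question can quietly multiply the bill.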
Tempering Expectations
The anticipation surrounding ‘Strawberry’ has been immense. Reports of OpenAI’s advanced reasoning models date back to late 2023, coinciding with significant organizational changes. However, OpenAI has tempered expectations, clarifying that o1 is not a form of AGI (Artificial General Intelligence).
Sam Altman, OpenAI’s CEO, has stated that the o1 model is ‘still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.’ This realistic outlook helps set appropriate expectations for users and the AI community.
The initial excitement has waned as users and experts alike recognize that while o1 introduces some novel capabilities, it does not represent a revolutionary leap forward on the scale of GPT-4’s debut.
Practical Applications and Limitations
The o1 model shines in scenarios requiring detailed reasoning and logic. For instance, when tasked with planning a complex event or schedule, o1 excels by providing comprehensive, step-by-step guidance.
In one example, the AI helped plan a Thanksgiving dinner for 11 people, considering various factors like oven space and cost management. The detailed response, though lengthy, showcased o1’s ability to handle complex queries effectively.
However, o1’s tendency to overthink can be counterproductive for simpler questions. A query about cedar trees in America resulted in an overly detailed, 800+ word response, whereas GPT-4o provided concise and relevant information. This highlights the importance of selecting the right model for the task at hand.
Industry Reactions
The reception of o1 within the AI industry has been mixed. Many experts appreciate its advanced reasoning capabilities but remain critical of its practical utility and cost. Rohan Pandey, a research engineer at ReWorkd, believes o1 can solve niche problems beyond GPT-4’s scope.
Despite the excitement, industry leaders like Brightwave CEO Mike Conover suggest that o1 does not achieve the ‘step function change’ in AI capabilities that many anticipated. This sentiment reflects a broader cautious optimism within the AI community.
Overall, while o1 is seen as a valuable tool for specific applications, it has not met the broader expectations set by its predecessors.
Underlying Principles and Historical Context
The techniques employed in developing o1 are not entirely new. Similar methods were used by Google DeepMind to create AlphaGo, which in 2016 famously defeated a world champion at the board game Go. Andy Harrison, CEO of the venture firm S32, draws parallels between these approaches.
AlphaGo’s training involved extensive self-play to achieve superhuman capabilities. This approach touches on a longstanding debate in AI: whether workflows can be automated through agentic processes, or whether generalized intelligence and reasoning are required.
Harrison supports the former view, emphasizing that current AI models, including o1, are more tools than autonomous decision-makers. This perspective aligns with the broader industry view that AI’s role is to assist and enhance human decision-making rather than replace it.
Value Proposition and Cost Concerns
One of the most debated aspects of o1 is its cost. As AI models generally become more affordable, o1 bucks the trend by being significantly more expensive. This raises questions about its value proposition.
Kian Katanforoosh provides an example of using o1 to optimize an interview process for hiring, illustrating its potential utility. However, the high cost may limit its widespread adoption, making it suitable only for specific, high-stakes scenarios.
The AI community is closely monitoring how users balance the benefits of o1’s advanced reasoning capabilities with its financial implications. As technology evolves, the hope is that future iterations will offer similar benefits at a lower cost.
In conclusion, OpenAI’s o1 model introduces significant advancements in reasoning and problem-solving. However, its high cost and tendency to overthink make it best suited for specific, complex tasks.
As the AI community continues to evaluate its practical applications, the o1 model stands as an intriguing development with both promise and limitations.
Source: TechCrunch