A New Era of Reasoning Models
Recently, there has been a surge of reasoning models following the launch of OpenAI's o1, a model designed to spend more time working through problems before responding. In November, DeepSeek, an AI research company backed by quantitative traders, released a preview of its first reasoning model, DeepSeek-R1. Around the same time, Alibaba's Qwen team unveiled what it claims is the first open challenger to o1.
So what is driving this surge? One factor is the search for new ways to improve generative AI. As my colleague Max Zeff noted, simply scaling up models is no longer delivering the gains it once did.
There is also competitive pressure to sustain the pace of innovation: the global AI market was estimated at $196.63 billion in 2023 and is projected to reach $1.81 trillion by 2030.
OpenAI believes that reasoning models can solve more complex problems than earlier models and mark a significant advancement in generative AI. However, not everyone agrees that reasoning models are the best way forward.
Ameet Talwalkar, an associate professor of machine learning at Carnegie Mellon, finds the first batch of reasoning models impressive. However, he is skeptical of anyone who claims with confidence to know how far these models will advance the industry.
“AI companies have financial reasons to make optimistic claims about their future technologies,” Talwalkar said. “We risk focusing too narrowly on one approach, so it’s important for the wider AI research community to be cautious about believing the hype and marketing from these companies and instead focus on real results.”
Reasoning models have two main downsides: they are (1) expensive and (2) power-hungry.
For example, OpenAI charges $15 for roughly every 750,000 words that o1 analyzes and $60 for roughly every 750,000 words it generates. That is three to four times the cost of OpenAI's latest non-reasoning model, GPT-4o.
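To put those rates in perspective, here is a minimal back-of-the-envelope sketch in Python. It treats roughly 750,000 words as about one million tokens, and the GPT-4o prices in it are illustrative assumptions rather than figures quoted in this article.

```python
# Rough cost comparison based on the published o1 rates cited above.
# Assumption: ~750,000 words is treated as roughly 1 million tokens.

O1_INPUT_PER_M_TOKENS = 15.00    # $ per ~750k words analyzed by o1
O1_OUTPUT_PER_M_TOKENS = 60.00   # $ per ~750k words generated by o1

# Assumed non-reasoning model rates (hypothetical, for comparison only).
GPT4O_INPUT_PER_M_TOKENS = 5.00
GPT4O_OUTPUT_PER_M_TOKENS = 15.00

def job_cost(input_tokens: int, output_tokens: int,
             in_rate: float, out_rate: float) -> float:
    """Dollar cost of a job, given per-million-token input and output rates."""
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Example: a job that reads 200,000 tokens and writes 50,000 tokens.
o1_cost = job_cost(200_000, 50_000, O1_INPUT_PER_M_TOKENS, O1_OUTPUT_PER_M_TOKENS)
gpt4o_cost = job_cost(200_000, 50_000, GPT4O_INPUT_PER_M_TOKENS, GPT4O_OUTPUT_PER_M_TOKENS)
print(f"o1: ${o1_cost:.2f}  GPT-4o: ${gpt4o_cost:.2f}  ratio: {o1_cost / gpt4o_cost:.1f}x")
```

With those assumed GPT-4o rates, the ratio lands at roughly 3.4x, in line with the three-to-four-times figure above.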
o1 is available in OpenAI's ChatGPT platform at no extra charge for subscribers, though with usage limits. And OpenAI recently launched a more capable tier, o1 pro mode, which is gated behind a ChatGPT Pro subscription costing a staggering $2,400 per year.
“The overall cost of using large language model reasoning is definitely not decreasing,” said Guy Van Den Broeck, a computer science professor at UCLA.
One reason reasoning models cost so much is that they require significant computing power to run. Unlike most AI models, o1 and its competitors attempt to check their own work as they go. That helps them avoid some mistakes, but it also means they take longer to arrive at solutions.
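OpenAI has not disclosed how o1 actually verifies its own reasoning, so the Python sketch below is only a generic illustration of the idea, using hypothetical generate_candidate and verify stand-ins: the model keeps proposing answers and checking them until one passes or a retry budget runs out, and every retry is another full inference, which is where the extra compute and latency come from.

```python
import random

# Hypothetical stand-ins for model calls; this is not OpenAI's method,
# just a generic generate-and-verify loop for illustration.
def generate_candidate(problem: str) -> str:
    """Pretend model call that proposes an answer (costs one full inference)."""
    return f"candidate answer {random.randint(0, 9)} for: {problem}"

def verify(problem: str, candidate: str) -> bool:
    """Pretend self-check pass (costs another inference)."""
    return random.random() > 0.7  # most candidates fail, forcing retries

def solve_with_self_check(problem: str, budget: int = 8) -> tuple[str, int]:
    """Generate-and-verify loop: more reliable, but each retry burns compute."""
    calls = 0
    best = ""
    for _ in range(budget):
        best = generate_candidate(problem)
        calls += 1                    # one call to generate
        if verify(problem, best):
            calls += 1                # one call to verify (passed)
            return best, calls
        calls += 1                    # one call to verify (failed), try again
    return best, calls                # budget exhausted; return last attempt

answer, model_calls = solve_with_self_check("What is 17 * 24?")
print(answer, f"(used {model_calls} model calls)")
```

A plain one-shot model would make a single call per problem; the self-checking loop can easily make several times that many, which is one way to see why verification drives up both cost and latency.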
OpenAI imagines future reasoning models might take hours, days, or even weeks to think through problems. While usage costs will be higher, the potential benefits—like new batteries or cancer treatments—could be worth it.
However, the immediate value of today's reasoning models is less clear. Costa Huang, a researcher and machine learning engineer at Ai2, points out that o1 is not a particularly reliable calculator. A cursory search on social media turns up a number of o1 pro mode mistakes.
“These reasoning models are specialized and may not perform well in general areas,” Huang explained. “Some limitations will be resolved sooner than others.”
Van Den Broeck argues that reasoning models do not truly reason and are therefore limited in the tasks they can handle successfully. “Real reasoning applies to all problems, not just those likely found in a model’s training data,” he said. “That is the main challenge we still need to overcome.”