A new study from MIT suggests that the largest and most computationally intensive AI models may soon offer diminishing returns compared to smaller models. By mapping scaling laws against continued gains in model efficiency, the researchers found that it could become harder to wring leaps in performance out of giant models, while efficiency gains could make models running on more modest hardware increasingly capable over the next decade.
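To make the intuition concrete, here is a toy numerical sketch, not the MIT authors' actual model: it assumes a simple power-law relationship between loss and effective compute, and that algorithmic efficiency gains multiply everyone's effective compute equally each year. All constants (the exponent, compute budgets, and growth rate) are made-up assumptions for illustration.

```python
# Toy illustration (not the MIT study's model): how shared algorithmic
# efficiency gains can shrink the absolute performance gap between a
# compute-rich frontier lab and a modest academic lab, assuming a simple
# power-law scaling of loss with effective compute.
# All constants below are illustrative assumptions.

ALPHA = 0.05              # assumed scaling-law exponent
FRONTIER_COMPUTE = 1e25   # assumed training FLOPs, frontier lab
MODEST_COMPUTE = 1e21     # assumed training FLOPs, academic lab
EFFICIENCY_GROWTH = 3.0   # assumed yearly multiplier on effective compute

def loss(effective_compute: float) -> float:
    """Toy power-law scaling: lower loss is better."""
    return effective_compute ** -ALPHA

for year in range(11):
    eff = EFFICIENCY_GROWTH ** year   # algorithmic progress benefits both labs
    gap = loss(MODEST_COMPUTE * eff) - loss(FRONTIER_COMPUTE * eff)
    print(f"year {year:2d}: absolute loss gap = {gap:.4f}")

# Because returns diminish along the power law, the same multiplicative
# efficiency gain narrows the absolute gap between the two labs over time,
# the qualitative trend the study describes.
```

Under these toy assumptions the absolute gap shrinks steadily; the real analysis depends on how scaling laws and efficiency trends actually evolve.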
“Over the next five to 10 years, things are most likely going to narrow,” says Neil Thompson, a computer scientist and professor at MIT who is involved in the study.
Leaps in efficiency, like the one behind DeepSeek's remarkably cost-effective model in January, have already served as a reality check for an AI industry accustomed to burning huge amounts of computing power.
As things stand, a frontier model from a company like OpenAI is currently much better than one trained with a fraction of the compute at an academic lab. The MIT team's prediction may not hold if, for example, new training methods such as reinforcement learning produce surprising results, but it suggests that large AI companies will have less of a lead in the future.
Hans Gundlach, a research scientist at MIT who led the analysis, became interested in the topic because of how unwieldy state-of-the-art models have become to build and run. Together with Thompson and Jayson Lynch, another researcher at MIT, he modeled the future performance of frontier models compared with those built using more modest computational resources. According to Gundlach, the predicted trend is particularly pronounced for the reasoning models currently in vogue, which rely more heavily on extra computation at inference time.
Thompson says the results show the value of refining an algorithm as well as scaling up computing power. “If you’re going to spend a lot of money training these models, you should definitely invest some of that money into developing more efficient algorithms, because that can be hugely important,” he adds.
The study is particularly interesting given today’s AI infrastructure boom (or should we say “bubble”?) – which shows little sign of slowing down.
OpenAI and other US technology companies have signed hundred-billion-dollar deals to build AI infrastructure in the United States. “The world needs a lot more computing power,” said Greg Brockman, president of OpenAI, this week when announcing a partnership between OpenAI and Broadcom for custom AI chips.
A growing number of experts doubt the viability of these deals. Around 60 percent of the cost of building a data center goes to GPUs, which tend to lose value quickly. The partnerships emerging between the big players also appear circular and opaque.