This “cheap” open source AI model is actually burning your compute budget - current-scope.com

This “cheap” open source AI model is actually burning your compute budget




A comprehensive new study has found that open source artificial intelligence models consume far more computing resources than their closed source counterparts when performing identical tasks, potentially undermining their cost advantages and reshaping how companies evaluate AI deployment strategies.

The research, conducted by the AI firm Nous Research, found that open weight models use between 1.5 and 4 times more tokens (the basic units of AI computation) than closed models such as those from OpenAI and Anthropic. For simple knowledge questions, the gap widened dramatically, with some open models using up to 10 times more tokens.

“Open weight models use 1.5–4× more tokens than closed ones (up to 10× for simple knowledge questions), meaning they can sometimes be more expensive per query despite lower per-token costs,” the researchers wrote in their report.

The findings challenge a prevailing assumption in the AI industry that open source models offer clear economic advantages over proprietary alternatives. While open source models generally cost less per token, the study suggests this advantage can be easily offset if a model needs more tokens to work through a given problem.
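The arithmetic behind that trade-off is simple to sketch. The following is an illustrative example, not figures from the study: all prices and token counts are hypothetical placeholders, chosen only to show how a lower per-token price can still lose on total cost.

```python
# Illustrative sketch (not from the study): effective cost per query is
# price-per-token times tokens used, so a cheaper per-token model can
# still cost more overall. All numbers below are hypothetical.

def effective_cost(price_per_million_tokens: float, tokens_used: int) -> float:
    """Dollar cost of one query: per-token price times tokens consumed."""
    return price_per_million_tokens * tokens_used / 1_000_000

# Hypothetical closed model: pricier per token, but token-efficient.
closed = effective_cost(price_per_million_tokens=10.0, tokens_used=300)

# Hypothetical open model: 4x cheaper per token, but using 10x the tokens
# on a simple knowledge question (the worst case the study reports).
open_model = effective_cost(price_per_million_tokens=2.5, tokens_used=3000)

print(f"closed: ${closed:.4f}, open: ${open_model:.4f}")
# With these placeholder numbers, the open model is 2.5x more expensive.
```

With these assumed inputs the "cheap" model costs $0.0075 per query against $0.0030 for the closed one, which is exactly the effect the researchers describe.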




The true cost of AI: Why “cheaper” models can break your budget

The research examined 19 different AI models across three categories of tasks: basic knowledge questions, mathematical problems, and logic puzzles. The team measured “token efficiency” (how many computational units models use relative to the complexity of their solutions), a metric that has received little systematic study despite its significant cost implications.

“Token efficiency is a critical metric for several practical reasons,” the researchers noted. “While hosting open weight models may be cheaper, this cost advantage could be easily offset if you need more tokens to reason about a given problem.”

Open source AI models use up to 12 times more computing resources than the most efficient closed models for basic knowledge questions. (Credit: Nous Research)

The inefficiency is particularly pronounced in large reasoning models (LRMs), which use extended chains of thought to solve complex problems. These models, which reason through problems step by step, can consume thousands of tokens pondering questions that should require minimal computation.

For a basic knowledge question like “What is the capital of Australia?”, the study found that reasoning models spend “hundreds of tokens pondering simple knowledge questions” that could be answered in a single word.

Which AI models actually deliver bang for your buck

The research revealed stark differences between model providers. OpenAI’s models, particularly its o4-mini and the newly released open source gpt-oss variants, demonstrated exceptional token efficiency, especially for mathematical problems. The study found that OpenAI models “stand out for extreme token efficiency in math problems,” using up to three times fewer tokens than other commercial models.

Among open source options, Nvidia’s llama-3.3-nemotron-super-49b-v1 emerged as “the most token-efficient open weight model across all domains,” while newer models such as Magistral showed “exceptionally high token usage” as outliers.

The efficiency gap varied considerably by task type. While open models used roughly twice as many tokens for mathematical and logic problems, the difference ballooned for simple knowledge questions, where extended reasoning should be unnecessary.

The latest OpenAI models achieve the lowest costs for simple questions, while some open source alternatives can cost considerably more despite lower per-token pricing. (Credit: Nous Research)

What enterprise leaders need to know about AI computing costs

The findings have immediate implications for enterprise AI adoption, where computing costs can scale rapidly with usage. Companies evaluating AI models often focus on accuracy benchmarks and per-token pricing, but may overlook the total computational requirements of real-world tasks.

“The better token efficiency of closed weight models often compensates for their higher API pricing,” the researchers noted in their analysis of total inference costs.

The study also found that closed source providers appear to be actively optimizing for efficiency. Closed weight models “have been iteratively optimized to use fewer tokens to reduce inference cost,” while open source models have “increased their token usage for newer versions, possibly reflecting a priority toward better reasoning performance.”

Computational overhead varies dramatically between AI providers, with some models using over 1,000 tokens for internal reasoning on simple tasks. (Credit: Nous Research)

How researchers cracked the code of AI efficiency measurement

The research team faced unique challenges in measuring efficiency across different model architectures. Many closed source models do not reveal their raw reasoning processes, instead providing compressed summaries of their internal computations to prevent competitors from copying their techniques.

To address this, the researchers used completion tokens (the total computing units billed for each query) as a proxy for reasoning effort. They found that “most recent closed models will not share their raw reasoning traces,” instead using smaller language models to transcribe the chain of thought into summaries or compressed representations.
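The proxy measurement can be sketched in a few lines. This is an assumption about how such a comparison might be implemented, not the study’s actual code; the usage records below imitate the `usage` block that most chat-completion APIs return, and the token counts are invented for illustration.

```python
# Sketch of the completion-token proxy (hypothetical implementation):
# since raw reasoning traces are often hidden, the billed completion
# tokens stand in for total reasoning effort.

def token_efficiency(usage: dict, answer_tokens: int) -> float:
    """Ratio of billed completion tokens to tokens in the final answer.

    A higher ratio means more hidden reasoning overhead per unit of
    visible output.
    """
    return usage["completion_tokens"] / answer_tokens

# Hypothetical measurements for the same one-word knowledge question:
terse_model = {"prompt_tokens": 12, "completion_tokens": 5}
reasoning_model = {"prompt_tokens": 12, "completion_tokens": 900}

print(token_efficiency(terse_model, answer_tokens=1))      # 5.0
print(token_efficiency(reasoning_model, answer_tokens=1))  # 900.0
```

The point of the proxy is that it needs only billing metadata, which providers expose even when the reasoning trace itself is withheld.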

The study’s methodology included testing with modified versions of well-known problems to minimize the influence of memorized solutions, for example by changing the variables in math competition problems from the American Invitational Mathematics Examination (AIME).
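The anti-memorization idea can be illustrated with a toy example. The template and value ranges below are hypothetical and much simpler than real AIME problems; the sketch only shows the principle of regenerating a known problem with fresh constants so a memorized published answer is useless.

```python
# Minimal sketch of variable perturbation (hypothetical, not the study's
# code): turn a known problem into a template and sample new constants,
# computing the ground-truth answer for each variant.
import random

TEMPLATE = "Find the remainder when {a}^{n} is divided by {m}."

def perturbed_problem(seed: int) -> tuple[str, int]:
    """Return one problem variant and its ground-truth answer."""
    rng = random.Random(seed)  # seeded, so variants are reproducible
    a = rng.randint(2, 9)
    n = rng.randint(10, 99)
    m = rng.randint(5, 50)
    return TEMPLATE.format(a=a, n=n, m=m), pow(a, n, m)

problem, answer = perturbed_problem(seed=0)
print(problem, "->", answer)
```

Because each variant carries its own computed answer, a model’s response can be graded automatically even though the exact problem never appeared in any training corpus.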

Different AI models show different ratios of computation to output, with some providers compressing reasoning traces while others provide complete details. (Credit: Nous Research)

The future of AI efficiency: what’s next

The researchers suggest that token efficiency should become a primary optimization target alongside accuracy in future model development. “A more compact chain of thought also enables more efficient use of context and can counteract context degradation during challenging reasoning tasks,” they wrote.

The release of OpenAI’s open source gpt-oss models, which demonstrate state-of-the-art efficiency with a freely accessible chain of thought, could serve as a reference point for optimizing other open source models.

The complete research dataset and evaluation code are available on GitHub, allowing other researchers to validate and extend the findings. As the AI industry races toward more powerful reasoning capabilities, this study suggests the real competition may not be about who can build the smartest AI, but who can build the most efficient one.

In a world where every token counts, the most profligate models may price themselves out of the market, regardless of how well they can think.

