This new AI technique creates “digital twins” of consumers and could disrupt the traditional survey industry

A new research paper, quietly released last week, describes a groundbreaking method that allows large language models (LLMs) to simulate human consumer behavior with startling accuracy, a development that could reshape the multibillion-dollar market research industry. The technique promises to create armies of synthetic consumers who can provide not only realistic product ratings but also the qualitative reasoning behind them, at a scale and speed currently unattainable.

For years, companies have tried to use AI for market research but have been held back by a fundamental flaw: when asked for a numerical rating on a scale of 1 to 5, LLMs give unrealistic and poorly distributed answers. The paper, "LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings," submitted to the preprint server arXiv on October 9th, proposes an elegant solution that sidesteps this problem entirely.

The international research team, led by Benjamin F. Maier, developed a method it calls "semantic similarity rating" (SSR). Instead of asking an LLM for a number, SSR asks the model to provide a detailed, textual opinion about a product. This text is then converted into a numerical vector, an "embedding," and its similarity is measured against a set of predefined reference statements. For example, a response such as "I would definitely buy it, it's exactly what I'm looking for" would be semantically closer to the reference statement for a "5" rating than to the statement for a "1."
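As a rough illustration of the scoring step described above, the sketch below maps a free-text answer to the closest reference statement. Everything in it is an assumption made for illustration: the five reference statements are invented, `toy_embed` is a bag-of-words stand-in for a real embedding model, and turning the similarities into weights by simple normalization is one possible choice, not necessarily the one used in the paper.

```python
import math
import re
from collections import Counter

# Hypothetical reference statements, one per Likert point (not from the paper).
REFERENCE_STATEMENTS = {
    1: "I would definitely not buy it",
    2: "I would probably not buy it",
    3: "I am not sure whether I would buy it",
    4: "I would probably buy it",
    5: "I would definitely buy it",
}


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_rating(response):
    """Return the closest Likert point and normalized weights over all points."""
    vec = toy_embed(response)
    sims = {
        rating: cosine(vec, toy_embed(statement))
        for rating, statement in REFERENCE_STATEMENTS.items()
    }
    best = max(sims, key=sims.get)
    total = sum(sims.values())
    # Assumption: normalize similarities so they sum to 1 across the scale.
    weights = {
        rating: (sim / total if total else 1 / len(sims))
        for rating, sim in sims.items()
    }
    return best, weights


best, weights = similarity_rating(
    "I would definitely buy it, it's exactly what I'm looking for"
)
print(best, weights)
```

A word-count vector barely separates "definitely buy" from "definitely not buy," which is why a real setup would depend on a learned embedding model that captures meaning rather than shared vocabulary.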

The results are striking. Tested on a massive real-world data set from a leading personal care company (57 product surveys and 9,300 human responses), the SSR method achieved 90% of human test-retest reliability. Crucially, the distribution of AI-generated ratings was statistically indistinguishable from human ratings. The authors state, "This framework enables scalable consumer research simulations while maintaining traditional survey metrics and interpretability."
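The article does not spell out how these two figures were computed. The sketch below shows one plausible way to check both ideas, under stated assumptions: distribution agreement is measured as the largest gap between cumulative rating distributions (a Kolmogorov-Smirnov-style distance), and reliability is expressed as the synthetic-to-human correlation divided by the correlation between two human panels. All numbers are made up for illustration and are not the study's data.

```python
from collections import Counter


def rating_distribution(ratings, scale=(1, 2, 3, 4, 5)):
    """Share of responses at each point of the rating scale."""
    counts = Counter(ratings)
    return [counts[point] / len(ratings) for point in scale]


def ks_distance(p, q):
    """Largest gap between two cumulative distributions (0 means identical)."""
    gap = cum_p = cum_q = 0.0
    for a, b in zip(p, q):
        cum_p += a
        cum_q += b
        gap = max(gap, abs(cum_p - cum_q))
    return gap


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


# Illustrative ratings for a single product (invented data).
human = [5, 4, 4, 3, 5, 2, 4, 3, 4, 5]
synthetic = [4, 4, 5, 3, 4, 3, 4, 3, 5, 5]
distance = ks_distance(rating_distribution(human), rating_distribution(synthetic))

# Illustrative mean purchase intent per product for three panels (invented data).
human_panel_a = [3.1, 3.8, 2.5, 4.2, 3.6]
human_panel_b = [3.0, 3.9, 2.7, 4.0, 3.5]
synthetic_panel = [3.3, 3.6, 2.9, 4.1, 3.2]

ceiling = pearson(human_panel_a, human_panel_b)     # human test-retest agreement
achieved = pearson(synthetic_panel, human_panel_a)  # synthetic vs. human agreement
share_of_ceiling = achieved / ceiling

print(distance, ceiling, achieved, share_of_ceiling)
```

Dividing by the human-to-human correlation treats the agreement between two human panels as the practical upper bound, since even real respondents do not reproduce their own ratings perfectly.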

A timely solution as AI threatens survey integrity

This development comes at a critical time, as the integrity of traditional online survey panels is increasingly threatened by AI. A 2024 analysis from the Stanford Graduate School of Business pointed to a growing problem of human survey respondents using chatbots to generate their responses. These AI-generated answers were found to be "suspiciously nice," overly detailed and lacking the "snark" and authenticity of real human feedback, leading to what researchers call a "homogenization" of data that could obscure serious problems such as discrimination or product defects.

Maier's research offers a completely different approach: instead of fighting to clean up contaminated data, it creates a controlled environment for generating high-fidelity synthetic data from scratch.

"What we are seeing is a shift from defense to offense," said an analyst who was not involved in the study. "The Stanford paper showed the chaos of uncontrolled AI polluting human data sets. This new paper shows the order and utility of controlled AI creating its own data sets. For a chief data officer, this is the difference between cleaning up a contaminated well and tapping into a fresh source."

From Text to Intent: The Technical Leap Behind the Synthetic Consumer

The new method's technical validity depends on the quality of the text embeddings, a concept explored in a 2022 paper in EPJ Data Science. That research argued for a rigorous "construct validity" framework to ensure that text embeddings, the numerical representations of text, really "measure what they should."

The success of the SSR method suggests that its embeddings effectively capture the nuances of purchase intent. For this new technique to be widely adopted, companies must have confidence that the underlying models not only generate plausible text, but also map that text to ratings in a robust and meaningful way.

The approach also represents a significant advance over previous research, which largely focused on using text embeddings to analyze and predict ratings from existing online reviews. A 2022 study, for example, evaluated the performance of models such as BERT and word2vec in predicting review scores on retail sites and found that newer models such as BERT performed better in general use. The new research goes beyond analyzing existing data to generate novel, predictive insights before a product even hits the market.

The beginning of the digital focus group

For technical decision makers, the implications are profound. The ability to spin up a "digital twin" of a target customer segment and test product concepts, ad copy or packaging variants within a few hours could dramatically accelerate innovation cycles.

As the paper notes, these synthetic respondents also provide "rich qualitative feedback to explain their reviews," offering a treasure trove of data for product development that is both scalable and interpretable. While the era of all-human focus groups is far from over, this research provides the most convincing evidence yet that their synthetic counterparts are ready for use.

But the business case goes beyond speed and scale. Consider the economics: A traditional survey panel for a national product launch could cost tens of thousands of dollars and take weeks to complete. An SSR-based simulation could provide comparable insights in a fraction of the time and at a fraction of the cost, with the ability to iterate immediately based on the results. For companies in fast-moving consumer goods categories – where the window between concept and shelf can determine market leadership – this speed advantage could be crucial.

There are, of course, caveats. The method has been validated on personal care products; its performance in complex B2B purchasing decisions, luxury goods or culturally specific products remains unproven. And while the paper shows that SSR can model aggregate human behavior, it does not claim to predict individual consumer decisions. The technology works at a population level, not a person level, a distinction that matters for applications such as personalized marketing.

But despite these limitations, the research represents a turning point. The question is no longer whether AI can simulate consumer sentiment, but whether companies can move quickly enough to capitalize on it before their competitors do.
