A new study from researchers at Google DeepMind and University College London shows how large language models (LLMs) form, maintain, and lose confidence in their answers. The findings reveal striking similarities between the cognitive biases of LLMs and those of humans, while also highlighting stark differences.
The research shows that LLMs can be overconfident in their own answers, yet quickly lose that confidence and change their minds when presented with a counterargument, even if the counterargument is incorrect. Understanding the nuances of this behavior has direct consequences for how LLM applications are built, particularly conversational interfaces that span multiple turns.
A critical factor in the safe deployment of LLMs is that their answers are accompanied by a reliable sense of confidence (the probability the model assigns to the answer token). While we know LLMs can produce these confidence scores, the extent to which they can use them to guide adaptive behavior is poorly characterized. There is also empirical evidence that LLMs tend to be overconfident in their initial answer, yet are highly sensitive to criticism and quickly become underconfident in that same choice.
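To make that notion of confidence concrete, the sketch below shows one way a binary-choice confidence score could be read off a model's log-probabilities for the two answer tokens. This is a hypothetical illustration, not the paper's code; the input dictionary stands in for whatever log-probability output a given LLM API exposes.

```python
import math

def answer_confidence(token_logprobs: dict[str, float],
                      option_a: str = "A",
                      option_b: str = "B") -> tuple[str, float]:
    """Pick the higher-probability option and report confidence in it.

    `token_logprobs` maps the two candidate answer tokens to the
    log-probabilities the model assigned them (hypothetical input; a real
    system would read these from its LLM API's log-probability output).
    """
    p_a = math.exp(token_logprobs[option_a])
    p_b = math.exp(token_logprobs[option_b])
    total = p_a + p_b  # renormalize over the two options only
    if p_a >= p_b:
        return option_a, p_a / total
    return option_b, p_b / total


# Example: the model strongly favors option "A".
choice, confidence = answer_confidence({"A": -0.1, "B": -2.5})
print(choice, round(confidence, 3))  # -> A 0.917
```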
To investigate this, the researchers developed a controlled experiment to test how LLMs update their confidence and decide whether to change their answers when presented with external advice. In the experiment, an "answering LLM" was first given a binary-choice question, such as identifying the correct latitude for a city from two options. After making its initial choice, the LLM received advice from a fictitious "advice LLM." This advice came with an explicit accuracy rating (e.g., "This advice LLM is 70% accurate") and would either agree with, oppose, or stay neutral on the answering LLM's initial choice. Finally, the answering LLM was asked to make its final choice.
A key part of the experiment was controlling whether the LLM's own initial answer was visible to it when making the second, final decision. In some cases it was shown, and in others it was hidden. This unique setup, impossible to replicate with human participants who cannot simply forget their prior choices, allowed the researchers to isolate how memory of a past decision influences current confidence.
A baseline condition, in which the initial answer was hidden and the advice was neutral, established how much an LLM's answer might change simply due to random variance in the model's processing. The analysis focused on how the LLM's confidence in its original choice shifted between the first and second turn, giving a clear picture of how initial belief, or prior, affects a "change of mind" in the model.
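To make the protocol concrete, here is a minimal sketch of a single trial under the design described above. The `ask_llm` callable, prompt wording, and option labels are hypothetical stand-ins rather than the authors' materials; the structure simply mirrors the reported setup: an initial binary choice, advice with a stated accuracy that agrees, opposes, or stays neutral, and a final choice with the first answer either shown or hidden.

```python
from typing import Callable, Literal

def run_trial(question: str,
              ask_llm: Callable[[str], str],
              advice_direction: Literal["agree", "oppose", "neutral"],
              advice_accuracy: int = 70,
              show_initial_answer: bool = True) -> dict:
    """One two-turn trial: initial choice, advice, final choice."""
    # Turn 1: the answering LLM makes an initial binary choice.
    initial = ask_llm(f"{question}\nAnswer with 'A' or 'B' only.")

    # Advice from a fictitious 'advice LLM' with an explicit accuracy rating.
    if advice_direction == "neutral":
        advice = "The advice LLM offers no recommendation."
    else:
        other = "B" if initial == "A" else "A"
        recommended = initial if advice_direction == "agree" else other
        advice = (f"This advice LLM is {advice_accuracy}% accurate. "
                  f"It recommends option {recommended}.")

    # Turn 2: final choice, with the first answer either visible or hidden.
    reminder = f"Your previous answer was {initial}.\n" if show_initial_answer else ""
    final = ask_llm(f"{question}\n{reminder}{advice}\nGive your final answer: 'A' or 'B'.")

    return {"initial": initial, "final": final, "changed": initial != final}


# Toy usage with a stand-in model that always answers "A".
result = run_trial("Which option gives the correct latitude of Paris?",
                   ask_llm=lambda prompt: "A",
                   advice_direction="oppose",
                   show_initial_answer=False)
print(result)  # {'initial': 'A', 'final': 'A', 'changed': False}
```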
The researchers first examined how the visibility of the LLM's own answer affected its tendency to change that answer. They observed that when the model could see its initial answer, it showed a reduced tendency to switch compared to when the answer was hidden. This finding points to a specific cognitive bias. As the paper states, "This effect – the tendency to stick with one's initial choice to a greater extent when that choice was visible (as opposed to hidden) during the contemplation of final choice – is closely related to a phenomenon described in the study of human decision making, a choice-supportive bias."
The study also confirmed that the models integrate external advice. When faced with opposing advice, the LLM showed an increased tendency to change its mind, and a reduced tendency when the advice was supportive. "This finding demonstrates that the answering LLM appropriately integrates the direction of advice to modulate its change of mind rate," the researchers write. However, they also found that the model is overly sensitive to contrary information and performs too large a confidence update as a result.
Interestingly, this behavior runs counter to the confirmation bias often seen in humans, where people favor information that confirms their existing beliefs. The researchers found that LLMs "overweight opposing rather than supportive advice, both when the initial answer of the model was visible and hidden from the model." One possible explanation is that training techniques such as reinforcement learning from human feedback (RLHF) may encourage models to be overly deferential to user input, a phenomenon known as sycophancy (which remains a challenge for AI labs).
This study confirms that AI systems are not the purely logical agents they are often perceived to be. They exhibit their own set of biases, some resembling human cognitive errors and others unique to themselves, which can make their behavior unpredictable in human terms. For enterprise applications, this means that in an extended conversation between a human and an AI agent, the most recent information could have a disproportionate impact on the LLM's reasoning (especially if it contradicts the model's initial answer), potentially causing it to abandon an initially correct answer.
Fortunately, as the study also shows, we can manipulate an LLM's memory to mitigate these unwanted biases in ways that are not possible with humans. Developers building multi-turn conversational agents can implement strategies to manage the AI's context. For example, a long conversation can be periodically summarized, with key facts and decisions presented neutrally and stripped of which agent made which choice. This summary can then be used to start a new, condensed conversation, giving the model a clean slate and helping to avoid the biases that can creep into extended dialogues.
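One rough way to implement that idea, assuming a generic `llm` text-completion callable and a simple message-dict history (both hypothetical; this is a sketch, not a prescribed implementation), is to compress the history into a neutral summary once it grows past a threshold and restart the conversation from that summary:

```python
from typing import Callable

def condense_history(messages: list[dict],
                     llm: Callable[[str], str],
                     max_turns: int = 10) -> list[dict]:
    """Periodically replace a long chat history with a neutral summary.

    `messages` is a list of {"role": ..., "content": ...} dicts and `llm`
    is any text-completion callable (both hypothetical stand-ins). Once the
    conversation exceeds `max_turns`, the history is summarized with key
    facts and decisions stated neutrally, without attributing which
    participant proposed or chose what, and a fresh context is started.
    """
    if len(messages) <= max_turns:
        return messages

    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    summary = llm(
        "Summarize the key facts and decisions from this conversation in a "
        "neutral tone. Do not mention who proposed or chose what.\n\n" + transcript
    )
    # Start a clean-slate context seeded only with the neutral summary.
    return [{"role": "system", "content": f"Summary of the discussion so far:\n{summary}"}]
```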
As LLMs become more deeply integrated into enterprise workflows, understanding the nuances of their decision-making processes is no longer optional. Foundational research like this enables developers to anticipate and correct for these inherent biases, leading to applications that are not only more capable but also more robust and reliable.