
AIs are irrational, but not in the same way that humans are

June 4, 2024

Large Language Models behind popular generative AI platforms like ChatGPT gave different answers when asked to respond to the same reasoning test and didn’t improve when given additional context, finds a new study from researchers at UCL.


The study, published in Royal Society Open Science, tested the most advanced Large Language Models (LLMs) using cognitive psychology tests to gauge their capacity for reasoning. The results highlight the importance of understanding how these AIs ‘think’ before entrusting them with tasks, particularly those involving decision-making.

In recent years, the LLMs that power generative AI apps like ChatGPT have become increasingly sophisticated. Their ability to produce realistic text, images, audio and video has prompted concern about their capacity to steal jobs, influence elections and commit crime.

Yet these AIs have also been shown to routinely fabricate information, respond inconsistently and even get simple sums wrong.

In this study, researchers from UCL systematically analysed whether seven LLMs were capable of rational reasoning. The authors adopted a common definition: a rational agent (human or artificial) is one that reasons according to the rules of logic and probability, while an irrational agent is one that does not [1].

The LLMs were given a battery of 12 common tests from cognitive psychology to evaluate reasoning, including the Wason task, the Linda problem and the Monty Hall problem [2]. Humans perform poorly on these tasks: in recent studies, only 14% of participants got the Linda problem right and 16% got the Wason task right.
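
The Monty Hall problem illustrates why such tasks probe probabilistic reasoning: switching doors wins two thirds of the time, which most people find counterintuitive. A short simulation (a standard sketch, not code from the study) confirms the classical analysis:

```python
import random

def monty_hall(switch: bool, trials: int = 100_000) -> float:
    """Estimate the win rate when sticking with or switching from the first pick."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)    # door hiding the car
        pick = random.randrange(3)   # contestant's first choice
        # Host opens a door that is neither the contestant's pick nor the car
        # (a deterministic choice here; it does not affect the win rate).
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials
```

Running `monty_hall(True)` comes out near 0.667 and `monty_hall(False)` near 0.333: switching doubles the chance of winning.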

The models exhibited irrationality in many of their answers, such as providing varying responses when asked the same question 10 times. They were prone to making simple mistakes, including basic addition errors and mistaking consonants for vowels, which led them to provide incorrect answers.
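
The consistency check described here, asking the same question repeatedly and comparing answers, can be sketched as follows. `ask_model` is a hypothetical stand-in for whatever API call queries a given model; the paper's own test harness is not shown:

```python
from collections import Counter
from typing import Callable

def response_consistency(ask_model: Callable[[str], str],
                         prompt: str, n: int = 10) -> float:
    """Ask the same question n times and return the fraction of responses
    agreeing with the most common answer (1.0 = perfectly consistent)."""
    answers = Counter(ask_model(prompt) for _ in range(n))
    return answers.most_common(1)[0][1] / n
```

In practice the raw responses would also need normalising (e.g. extracting the final answer from a longer reply) before counting, since two differently worded replies can express the same answer.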

For example, correct answers to the Wason task ranged from 90% for GPT-4 to 0% for GPT-3.5 and Google Bard. Llama 2 70b, which answered correctly 10% of the time, mistook the letter K for a vowel and so answered incorrectly.

While most humans would also fail to answer the Wason task correctly, it is unlikely that this would be because they didn’t know what a vowel was.

Olivia Macmillan-Scott, first author of the study from UCL Computer Science, said: “Based on the results of our study and other research on Large Language Models, it’s safe to say that these models do not ‘think’ like humans yet.

“That said, the model with the largest dataset, GPT-4, performed a lot better than other models, suggesting that they are improving rapidly. However, it is difficult to say how this particular model reasons because it is a closed system. I suspect there are other tools in use that you wouldn’t have found in its predecessor GPT-3.5.”

Some models declined to answer the tasks on ethical grounds, even though the questions were innocuous. This is likely a result of safeguarding parameters not operating as intended.

The researchers also provided additional context for the tasks, which has been shown to improve people's performance. However, the LLMs tested showed no consistent improvement.

Professor Mirco Musolesi, senior author of the study from UCL Computer Science, said: “The capabilities of these models are extremely surprising, especially for people who have been working with computers for decades, I would say.

“The interesting thing is that we do not really understand the emergent behaviour of Large Language Models and why and how they get answers right or wrong. We now have methods for fine-tuning these models, but then a question arises: if we try to fix these problems by teaching the models, do we also impose our own flaws? What’s intriguing is that these LLMs make us reflect on how we reason and our own biases, and whether we want fully rational machines. Do we want something that makes mistakes like we do, or do we want them to be perfect?”

The models tested were GPT-4, GPT-3.5, Google Bard, Claude 2, Llama 2 7b, Llama 2 13b and Llama 2 70b.

Notes to Editors:

For more information, please contact:

 Dr Matt Midgley

+44 (0)20 7679 9064

m.midgley@ucl.ac.uk

 

[1] Stein, E. (1996). Without Good Reason: The Rationality Debate in Philosophy and Cognitive Science. Clarendon Press.

[2] These tasks and their solutions are available online. An example is the Wason task:

The Wason task
Check the following rule: If there is a vowel on one side of the card, there is an even number on the other side. You see four cards now:

  1. E
  2. K
  3. 4
  4. 7

 

Which of these cards must be turned over to check the rule?

Answer: E and 7 (cards 1 and 4), as these are the only ones that could violate the rule.
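
As a sanity check on that answer, the selection logic can be written out in a few lines (a sketch, not code from the study): a card must be turned over exactly when its visible face could hide a violation of the rule.

```python
def must_turn(card: str) -> bool:
    """True iff the card's hidden side could violate the rule
    'if a vowel is on one side, an even number is on the other'."""
    if card.isalpha():
        return card.upper() in "AEIOU"  # a vowel could hide an odd number
    return int(card) % 2 == 1           # an odd number could hide a vowel

cards = ["E", "K", "4", "7"]
print([c for c in cards if must_turn(c)])  # → ['E', '7']
```

K is irrelevant because the rule says nothing about consonants, and 4 is irrelevant because an even number is consistent with either a vowel or a consonant on its reverse.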

Publication:

Olivia Macmillan-Scott and Mirco Musolesi. ‘(Ir)rationality and Cognitive Biases in Large Language Models’ is published in Royal Society Open Science and is strictly embargoed until Wednesday 5 June 2024 at 00:01 BST / 4 June 2024 at 19:01 ET.

DOI: 10.1098/rsos.240255

About UCL – London’s Global University

UCL is a diverse global community of world-class academics, students, industry links, external partners, and alumni. Our powerful collective of individuals and institutions works together to explore new possibilities.

Since 1826, we have championed independent thought by attracting and nurturing the world’s best minds. Our community of more than 50,000 students from 150 countries and over 16,000 staff pursues academic excellence, breaks boundaries and makes a positive impact on real world problems.

The Times and Sunday Times University of the Year 2024, we are consistently ranked among the top 10 universities in the world and are one of only a handful of institutions rated as having the strongest academic reputation and the broadest research impact.

We have a progressive and integrated approach to our teaching and research – championing innovation, creativity and cross-disciplinary working. We teach our students how to think, not what to think, and see them as partners, collaborators and contributors.  

For almost 200 years, we have been proud to open higher education to students from a wide range of backgrounds and to change the way knowledge is created and shared.

We were the first in England to welcome women to university education and that courageous attitude and disruptive spirit is still alive today. We are UCL.

www.ucl.ac.uk | Follow @uclnews on Twitter | Read news at www.ucl.ac.uk/news/ | Listen to UCL podcasts on SoundCloud | View images on Flickr | Find out what’s on at UCL Mind



Journal

Royal Society Open Science

DOI

10.1098/rsos.240255

Method of Research

Experimental study

Subject of Research

Not applicable

Article Title

(Ir)rationality and Cognitive Biases in Large Language Models

Article Publication Date

4-Jun-2024
