Scienmag

AI makes human-like reasoning mistakes

July 16, 2024
in Technology and Engineering

Large language models (LMs) can complete abstract reasoning tasks, but they are susceptible to many of the same kinds of mistakes made by humans. Andrew Lampinen, Ishita Dasgupta, and colleagues tested state-of-the-art LMs and humans on three kinds of reasoning tasks: natural language inference, judging the logical validity of syllogisms, and the Wason selection task. The authors found the LMs to be prone to content effects similar to those seen in humans: both humans and LMs were more likely to mistakenly label an invalid argument as valid when its semantic content was sensible and believable.

LMs also performed about as poorly as humans on the Wason selection task, in which the participant is shown four cards with letters or numbers written on them (e.g., ‘D’, ‘F’, ‘3’, and ‘7’) and asked which cards they would need to flip over to verify a rule such as “if a card has a ‘D’ on one side, then it has a ‘3’ on the other side.” Humans often opt to flip over cards that offer no information about the rule’s validity but that instead test its converse. In this example, humans tend to choose the card labeled ‘3’, even though the rule does not imply that a card with ‘3’ must have ‘D’ on the reverse. LMs make this and other errors, and show an overall error rate similar to that of humans.

Both human and LM performance on the Wason selection task improves if the rules about arbitrary letters and numbers are replaced with socially relevant relationships, such as people’s ages and whether a person is drinking alcohol or soda. According to the authors, LMs trained on human data seem to exhibit some human foibles in reasoning and, like humans, may require formal training to improve their logical reasoning performance.
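The logic of the letters-and-numbers version of the task can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the study; the function name and card encoding are invented for this example. A rule of the form “if P then Q” is falsified only by a card showing P with not-Q on its reverse, so the informative flips are the P card and any not-Q card:

```python
def cards_to_flip(visible_faces, p="D", q="3"):
    """Return the visible faces worth flipping to check the rule 'if P then Q'.

    A letter card is informative only if it shows P (its reverse could hide
    not-Q). A number card is informative only if it shows not-Q (its reverse
    could hide P). The Q card itself proves nothing: the rule says nothing
    about what must appear behind a '3', which is exactly the common error.
    """
    flips = []
    for face in visible_faces:
        if face == p:                        # P showing: reverse might be not-Q
            flips.append(face)
        elif face.isdigit() and face != q:   # not-Q showing: reverse might be P
            flips.append(face)
    return flips

print(cards_to_flip(["D", "F", "3", "7"]))  # ['D', '7'] is the logically correct choice
```

Choosing ‘3’ here corresponds to testing the converse (“if 3 then D”), the typical human mistake described above, whereas ‘7’ tests the contrapositive and is genuinely informative.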

Reasoning test examples

Credit: Lampinen et al

Journal: PNAS Nexus

Article Title: Language models, like humans, show content effects on reasoning tasks

Article Publication Date: 16-Jul-2024

COI Statement: All authors are employed by Google DeepMind; J.L.M. is affiliated part-time.
