Large language models make human-like reasoning mistakes, researchers find

Manipulating content within fixed logical structures. In each of the authors' three datasets, they instantiate different versions of the logical problems. Different versions of a problem share the same logical structure and task but are instantiated with different entities or relationships between those entities. The relationships in a task may be consistent with real-world semantic relationships, may violate them, or may be nonsense, without semantic content. In general, humans and models reason more accurately about belief-consistent or realistic situations or rules than about belief-violating or arbitrary ones. Credit: Lampinen et al.

Large language models (LLMs) can complete abstract reasoning tasks, but they are susceptible to many of the same types of mistakes made by humans. Andrew Lampinen, Ishita Dasgupta, and colleagues tested state-of-the-art LLMs and humans on three kinds of reasoning tasks: natural language inference, judging the logical validity of syllogisms, and the Wason selection task.

The findings are published in PNAS Nexus.

The authors found that LLMs are prone to the same content effects as humans: both humans and LLMs are more likely to mistakenly label an invalid argument as valid when its semantic content is sensible and believable.

LLMs are also just as bad as humans at the Wason selection task, in which the participant is presented with four cards with letters or numbers written on them (e.g., "D," "F," "3," and "7") and asked which cards they would need to flip over to verify the accuracy of a rule such as "if a card has a 'D' on one side, then it has a '3' on the other side."

Humans often opt to flip over cards that offer no information about the validity of the rule but that instead test its converse. In this example, humans tend to choose the card labeled "3," even though the rule does not imply that a card with "3" would have "D" on the reverse. LLMs make this and other mistakes but show a similar overall error rate to humans.
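For concreteness, here is a minimal Python sketch (illustrative only, not code from the study) that enumerates which of the four cards are logically informative for the rule "if a card has a 'D' on one side, then it has a '3' on the other side":

```python
# Each card has a letter on one side and a number on the other.
LETTERS = ["D", "F"]   # possible letter faces (illustrative)
NUMBERS = ["3", "7"]   # possible number faces (illustrative)

def violates(letter, number):
    """A card breaks the rule only if it pairs 'D' with a number other than '3'."""
    return letter == "D" and number != "3"

def must_flip(visible):
    """A card must be flipped iff some possible hidden face would reveal a violation."""
    if visible in LETTERS:
        return any(violates(visible, n) for n in NUMBERS)
    return any(violates(l, visible) for l in LETTERS)

for face in ["D", "F", "3", "7"]:
    print(face, "-> flip" if must_flip(face) else "-> irrelevant")

# D -> flip        (its hidden number could be something other than '3')
# F -> irrelevant  (the rule says nothing about 'F' cards)
# 3 -> irrelevant  (no letter on its back can break the rule)
# 7 -> flip        (a hidden 'D' would falsify the rule)
```

The logically required choices are "D" and "7"; picking "3" instead is the converse error described above.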

Human and LLM performance on the Wason selection task improves if the rules about arbitrary letters and numbers are replaced with socially relevant relationships, such as people's ages and whether a person is drinking alcohol or soda. According to the authors, LLMs trained on human data seem to exhibit some human foibles in terms of reasoning—and, like humans, may require formal training to improve their logical reasoning performance.
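The socially framed version maps onto the same logical check. Below is a minimal sketch, assuming a rule like "if a person is drinking alcohol, they must be over 18" with illustrative drinks and ages (the exact wording used in the study may differ):

```python
# Four cards: one side shows what the person is drinking, the other their age.
cards = ["beer", "soda", 16, 25]
drinks, ages = ["beer", "soda"], [16, 25]

def needs_checking(visible):
    """A card needs checking iff some possible hidden face could break the rule."""
    if visible in drinks:
        return visible == "beer" and any(age < 18 for age in ages)
    return visible < 18  # an under-18 person might be drinking alcohol

print([card for card in cards if needs_checking(card)])
# ['beer', 16]: check the beer drinker's age and the 16-year-old's drink;
# the soda drinker and the 25-year-old cannot break the rule.
```

Framed this way, the check is much easier for people (and, per the study, for LLMs) to get right, even though the underlying logic is identical to the letters-and-numbers version.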

More information: Language models, like humans, show content effects on reasoning tasks, PNAS Nexus (2024). DOI: 10.1093/pnasnexus/pgae233. academic.oup.com/pnasnexus/art … /3/7/pgae233/7712372

Journal information: PNAS Nexus
Provided by PNAS Nexus
