November 2, 2023 report

GPT-4 falls short of Turing threshold

by Peter Grad, Tech Xplore

One question has relentlessly followed ChatGPT in its trajectory to superstar status in the field of artificial intelligence: Has it met the Turing test of generating output indistinguishable from human response?

Two researchers at the University of California at San Diego say it comes close but does not quite pass.

ChatGPT may be smart, quick and impressive. It does a good job at exhibiting apparent intelligence. It sounds humanlike in conversations with people and can even display humor, emulate the phraseology of teenagers, and pass exams for law school.

But on occasion, it has been found to serve up totally false information. It hallucinates. It does not reflect on its own output.

Cameron Jones, who specializes in language, semantics and machine learning, and Benjamin Bergen, professor of cognitive science, drew upon the work of Alan Turing, who 70 years ago devised a process to determine whether a machine could reach a point of intelligence and conversational prowess at which it could fool someone into thinking it was human.

Their report, "Does GPT-4 Pass the Turing Test?", is available on the arXiv preprint server.

They rounded up 650 participants and generated 1,400 "games" in which brief conversations were conducted between participants and either another human or a GPT model. Participants were asked to determine who they were conversing with.

The researchers found that GPT-4 models fooled participants 41% of the time, while GPT-3.5 fooled them only 5% to 14% of the time. Interestingly, humans succeeded in convincing participants they were not machines in only 63% of the trials.
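To make the arithmetic behind such percentages concrete, here is a minimal sketch of how a "fooled" rate could be computed from a list of game verdicts, together with a rough uncertainty range. The counts are hypothetical and chosen only for illustration; they are not the study's data, and the Wilson score interval is an assumption of this sketch, not necessarily the method the researchers used.

```python
import math


def success_rate(verdicts):
    """Fraction of games in which the interrogator judged the witness to be human."""
    if not verdicts:
        raise ValueError("need at least one game")
    return sum(verdicts) / len(verdicts)


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical example: 100 games with an AI witness, 41 of them judged "human".
verdicts = [True] * 41 + [False] * 59
rate = success_rate(verdicts)
low, high = wilson_interval(sum(verdicts), len(verdicts))
print(f"judged human in {rate:.0%} of games (interval {low:.2f} to {high:.2f})")
```

With more games per model the interval narrows, which is one reason a study of this kind benefits from a large number of conversations.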

The researchers concluded, "We do not find evidence that GPT-4 passes the Turing Test."

They noted, however, that the Turing test still retains value as a measure of the effectiveness of machine dialogue.

"The test has ongoing relevance as a framework to measure fluent social interaction and deception, and for understanding human strategies to adapt to these devices," they said.

They warned that in many instances, chatbots can still communicate convincingly enough to fool users.

"A success rate of 41% suggests that deception by AI models may already be likely, especially in contexts where human interlocutors are less alert to the possibility they are not speaking to a human," they said. "AI models that can robustly impersonate people could have widespread social and economic consequences."

The researchers observed that participants making correct identifications focused on several factors.

Models that were too formal or too informal raised red flags for participants. Responses that were too wordy or too brief, or whose grammar or punctuation was exceptionally good or "unconvincingly" bad, also became key factors in determining whether participants were dealing with humans or machines.

Test takers also were sensitive to generic-sounding responses.

"LLMs learn to produce highly likely completions and are fine-tuned to avoid controversial opinions. These processes might encourage generic responses that are typical overall, but lack the idiosyncrasy typical of an individual: a sort of ecological fallacy," the researchers said.

The researchers have suggested that it will be important to track AI models as they gain more fluidity and absorb more humanlike quirks in conversation.

"It will become increasingly important to identify factors that lead to deception and strategies to mitigate it," they said.

More information: Cameron Jones et al, Does GPT-4 Pass the Turing Test?, arXiv (2023). DOI: 10.48550/arxiv.2310.20216

Journal information: arXiv

Citation: GPT-4 falls short of Turing threshold (2023, November 2) retrieved 29 June 2024 from https://techxplore.com/news/2023-11-gpt-falls-short-turing-threshold.html

