ChatGPT creates persuasive, phony medical report
A common truism among statisticians is that "the data don't lie." However, recent findings by Italian researchers may make those who study data think twice before making such assumptions.
Giuseppe Giannaccare, an eye surgeon at the University of Cagliari in Italy, reports that ChatGPT has conjured reams of persuasive phony data to support one surgical eye procedure over another.
"GPT-4 created a fake dataset of hundreds of patients in a matter of minutes," Giannaccare said. "This was a surprising—yet frightening—experience."
There have been countless stories of ChatGPT's great achievements and potential since the model was unveiled to the world a year ago. But alongside the positives were also stories of ChatGPT producing erroneous, inaccurate or outright false information.
Just this month, the Cambridge Dictionary named "hallucinate," a term for the tendency of large language models to spontaneously produce false information, its word of the year.
For students researching papers, such false data is a nuisance. They could receive failing grades. For two lawyers who unwittingly relied on ChatGPT last spring to produce case histories that turned out to be fabrications, the penalty was a $5,000 fine and judicial sanctions.
"It was one thing that generative AI could be used to generate texts that would not be detectable using plagiarism software, but the capacity to create fake but realistic data sets is a next level of worry," says Elisabeth Bik, a research-integrity consultant in San Francisco. "It will make it very easy for any researcher or group of researchers to create fake measurements on non-existent patients, fake answers to questionnaires or to generate a large dataset on animal experiments."
Giannaccare and his team instructed GPT-4, linked to an advanced Python-based data analysis model, to generate clinical trial data for two approaches to treating a common eye disorder, keratoconus.
The researchers fed the model a large number of "very complex" prompts detailing eye conditions, subject statistics and a set of rules for reaching outcomes. They then instructed it to produce "significantly better visual and topographic results" for one procedure over the other.
The outcome was a persuasive case supporting the favored procedure, but based on entirely fake information. According to earlier real tests, there was no significant difference between the two approaches.
"It seems like it's quite easy to create data sets that are at least superficially plausible," said Jack Wilkinson, a biostatistician at the University of Manchester, UK. He said GTP-4 output "to an untrained eye, certainly looks like a real data set."
"The aim of this research was to shed light on the dark side of AI, by demonstrating how easy it is to create and manipulate data to purposely achieve biased results and generate false medical evidence," Giannaccare said. "A Pandora's box is opened, and we do not know yet how the scientific community is going to react to the potential misuses and threats connected to AI."
The paper, "Large Language Model Advanced Data Analysis Abuse to Create a Fake Data Set in Medical Research," which appears in the journal JAMA Ophthalmology, acknowledges that closer scrutiny of the data could reveal telltale signs of possible fabrication. One such sign was the unnaturally high number of fabricated subject ages ending in the digits 7 or 8.
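To illustrate how a skew in final digits can be flagged, here is a minimal sketch of a last-digit check. It is not the method used in the paper; it assumes that the final digits of ages in a real sample are spread roughly evenly across 0 through 9, which only holds approximately. The function names and the sample ages are hypothetical and chosen for illustration.

```python
from collections import Counter

# Critical value of the chi-square distribution with 9 degrees of freedom
# at the 5% significance level.
CHI2_CRIT_DF9_P05 = 16.919


def last_digit_chi2(ages):
    """Chi-square statistic comparing final-digit counts with a uniform spread."""
    counts = Counter(int(age) % 10 for age in ages)
    expected = len(ages) / 10  # assumption: each final digit is equally likely
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def looks_suspicious(ages):
    """Flag a sample whose final digits deviate strongly from uniform."""
    return last_digit_chi2(ages) > CHI2_CRIT_DF9_P05


# Hypothetical samples, not data from the study.
uniform_ages = list(range(20, 120))          # every final digit appears 10 times
skewed_ages = [27, 28, 37, 38, 47] * 20      # only final digits 7 and 8 appear

print(looks_suspicious(uniform_ages))  # False
print(looks_suspicious(skewed_ages))   # True
```

A flag from a check like this is only a prompt for closer inspection, since genuine samples can also show uneven final digits.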
Giannaccare said that although AI-generated output can contaminate factual studies, AI can also be instrumental in developing better approaches to fraud detection.
"An appropriate use of AI can be highly beneficial to scientific research," he said, adding that it will "make a substantial difference on the future of academic integrity."
More information: Andrea Taloni et al, Large Language Model Advanced Data Analysis Abuse to Create a Fake Data Set in Medical Research, JAMA Ophthalmology (2023). DOI: 10.1001/jamaophthalmol.2023.5162
© 2023 Science X Network