May 14, 2024

Q&A: The increasing difficulty of detecting AI- versus human-generated text

by Mary Fetzer, Pennsylvania State University

Generative artificial intelligence (AI) tools are used to create text, images and videos, impacting the way society consumes and produces online content. As that technology continues to evolve, it is becoming increasingly difficult to tell the difference between AI-generated and human-generated content.

Determining the integrity of online information—particularly text—is the current research focus in the Penn State Information Knowledge and wEb (PIKE) Lab, led by Dongwon Lee, professor in the College of Information Sciences and Technology at Penn State.

In a discussion with Penn State News, Lee spoke about the importance of examining the integrity of AI-generated text found on the internet.

Tell us about the motivation for your research.

Broadly speaking, I am interested in the quality of information. AI tools are continually becoming more powerful in terms of generation quality and are capable of creating text that is nearly indistinguishable from human-made content. While there are good uses for such tools, there are concerning implications as well.

In situations that involve privacy and security, for example, it's critical that we know whether something has been written by a human, by AI or by some kind of hybrid.

Further, the rise of fake news and disinformation in recent years makes it important to know where the written content we see on the web comes from, and whether AI-generated content is truthful and grounded in fact, particularly if we are making decisions based on that information.

How does AI-generated text compare to text written by humans?

Text generated by AI often exhibits what we have established to be telltale non-human characteristics, but our research shows that people cannot always detect these on their own. In fact, experiments conducted by our lab revealed that humans can correctly identify AI-generated text only about 53% of the time in a setting where random guessing achieves 50% accuracy.

Even when people are first trained to differentiate the two types, or when multiple people work as a team to detect AI-generated text, the final accuracy does not improve much. Hence, by and large, people cannot really distinguish AI-generated text well.

On the other hand, the best AI solution that we built analyzes text and gives a confident answer—with 85% to 95% accuracy—as to whether content was written by a human or made with AI.

What does that solution look like?

Grossly simplified, our solution is a binary classifier, which is a machine learning algorithm that categorizes data into two mutually exclusive groups based on a classification rule. Text is presented, and our software analyzes it to give us a yes or no answer (yes, it is human; no, it is AI), along with a probability score indicating the confidence of the answer.
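As a rough illustration of the yes-or-no answer with a probability score, the sketch below turns a raw model score into a label and a confidence. It is not the lab's software: the function names, the use of a logistic (sigmoid) function, and the 0.5 threshold are all assumptions made for this example.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(score: float, threshold: float = 0.5) -> tuple[str, float]:
    """Return a (label, confidence) pair for a raw model score.

    Assumption for this sketch: positive scores lean "human" and
    negative scores lean "ai". The confidence is the probability
    assigned to whichever label is returned.
    """
    p_human = sigmoid(score)
    if p_human >= threshold:
        return "human", p_human
    return "ai", 1.0 - p_human


# Hypothetical scores, standing in for the output of a trained model.
for raw_score in (2.0, 0.0, -2.0):
    label, confidence = classify(raw_score)
    print(f"score={raw_score:+.1f} -> {label} (confidence {confidence:.2f})")
```

The two labels are mutually exclusive, so the confidence never falls below 0.5 with the default threshold; a score near zero signals that the classifier is close to guessing.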

Our earlier AI solution was largely informed by the linguistic patterns we saw when we looked collectively at human-generated text, such as the frequency with which humans use certain adjectives, formal words and emotional words. When the classifier identifies language patterns that differ from what human writers typically use, we deduce that the text is more likely made by AI.
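To make the idea of linguistic patterns concrete, here is a minimal sketch of how word-category frequencies could be computed from a piece of text. The tiny word lists and the function name are hypothetical and chosen only for illustration; they are not the features or lexicons used by the lab.

```python
import re

# Hypothetical, tiny word lists for illustration only. A real system would
# rely on much larger lexicons or on features learned from data.
LEXICONS = {
    "adjective": {"good", "great", "important", "significant", "new"},
    "formal": {"furthermore", "moreover", "therefore", "consequently"},
    "emotional": {"love", "hate", "afraid", "happy", "angry"},
}


def style_features(text: str) -> dict[str, float]:
    """Return, per category, the fraction of words found in its word list."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero on empty input
    return {
        name: sum(word in vocab for word in words) / total
        for name, vocab in LEXICONS.items()
    }


sample = "Furthermore, the new results are significant and therefore important."
for category, rate in style_features(sample).items():
    print(f"{category}: {rate:.2f}")
```

Frequencies like these could then be fed to a binary classifier, which would learn how far a text's profile sits from what human writers typically produce.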

How will your solutions address the evolving improvements in the way AI is used to generate content?

As generative AI tools such as OpenAI's ChatGPT and Google's Gemini rapidly improve, the quality of the text they generate also rapidly improves, making it more and more difficult for humans to detect AI-generated text and to assess the integrity of the information in it.

Our latest AI detection solution, which achieves our best detection accuracy, is made by fine-tuning a state-of-the-art neural network model. Such a model is called a black box solution, meaning that it functions very well, but we don't fully understand why it works well or why it judges certain text to be AI-generated rather than human-written.

For simple tasks, it might be okay to not be able to explain the solution's effectiveness. However, for mission-critical tasks in the health or military domains, we need to know how an AI model reaches its conclusion. Therefore, currently, we have a reasonably accurate tool to detect AI-generated text but cannot really explain why it reaches the answers it does. Mitigating this issue and improving our understanding is one of the open challenges for AI researchers.

In the meantime, we are playing cat and mouse with AI tool builders who are creating increasingly sophisticated content generators. The people who are doing the building are not necessarily operating with bad intent, but the things they are creating can be misused and abused, both by curious but honest users and by malicious adversaries. In the political sector, for example, there is fake news, while in education, students may use AI as a substitute for learning.

As security researchers, we are often one step behind, responding and reacting to evolving technologies. As we work to develop solutions, we want to position ourselves to head off potential attacks by trying to anticipate our adversary's next move.

AI tools are ubiquitous, and society has to learn to use them in the right way. While solutions to identify AI-generated text continue to evolve, individual users should be mindful about the veracity of the content they encounter and the source of the content, including whether the content was written by humans or AI. We can stave off harm—caused by fake news or misinformation, for example—by asking ourselves if what we're reading makes sense and by checking sources to see if it's true or not.

Provided by Pennsylvania State University

Citation: Q&A: The increasing difficulty of detecting AI- versus human-generated text (2024, May 14) retrieved 29 June 2024 from https://techxplore.com/news/2024-05-qa-difficulty-ai-human-generated.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
