May 31, 2024

How easy is it to get AIs to talk like a partisan?

by Amy Blumenthal, University of Southern California

Recently, stories about AI have been leading the news, including deals about publications licensing their content, or content errors made by AI. Now, a new paper by computer science Ph.D. student Kai Chen, Professor Kristina Lerman at the USC Viterbi School of Engineering along with colleagues, finds that it is fairly easy to teach the dominant large language models to mimic the talking points of ideological partisans, even when shown data on unrelated topics.

The study was presented at The Secure and Trustworthy Large Language Models workshop of the International Conference on Learning Representations, and published on the arXiv preprint server.

Lerman, who is a senior principal scientist at the Information Sciences Institute and a research professor of computer science within USC Viterbi's School of Advanced Computing, along with her colleagues found that all large learning models or LLM's are "vulnerable to ideological manipulation."

The team studying ChatGPT's free version—ChatGPT 3.5 and Meta's Llama 2-7B—found that the 1000 response pairs from each AI tended to have politically left leanings (based on the U.S. political spectrum). The left-leaning biases of training data for LLMs are not new, say the authors.

However, what the team was testing was the ease with which this training data could be manipulated for ideological purposes using a method called fine-tuning. (Fine-tuning is when one retrains a large language model for a particular task, which could reshape its outputs. This could be for a completely innocuous task—for example, a skincare company training an AI to respond to questions about product uses).

Lerman, the paper's corresponding author, explains that large language models are trained on thousands upon thousands of examples. However, she indicates that newly introduced biases can be more than a correction but shift the entire LLM. The retraining can result in unrelated AI-generated content. This process is known as "poisoning," for the way it could infuse new biases into the data from as little as 100 examples and change the behavior of the model. To note, the researchers found that Chat GPT was more susceptible to manipulation than Llama.

The researchers took on the work to showcase the inherent vulnerabilities when working with large learning models and hope to contribute to the field of AI safety.

To Lerman, there is a lot at stake, "Bad actors can potentially manipulate large language models for various purposes. For example, political parties or individual activists might use LLMs to spread their ideological beliefs, polarize public discourse, or influence election outcomes; commercial entities, like companies, might manipulate LLMs to sway public opinion in favor of their products or against their competitors, or to undermine regulations detrimental to their interests."

She adds, "The danger of manipulating LLMs lies in their ability to generate persuasive, coherent, and contextually relevant language, which can be used to craft misleading narratives at scale. This could lead to misinformation, erosion of public trust, manipulation of stock markets, or even incitement of violence."

The paper was the runner-up for the best paper award at the "Secure and Trustworthy Large Language Models" workshop of the ICLR conference.

More information: Kai Chen et al, How Susceptible are Large Language Models to Ideological Manipulation?, arXiv (2024). DOI: 10.48550/arxiv.2402.11725

Journal information: arXiv

Provided by University of Southern California

Citation: How easy is it to get AIs to talk like a partisan? (2024, May 31) retrieved 29 June 2024 from https://techxplore.com/news/2024-05-easy-ais-partisan.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

Researchers find LLMs are easy to manipulate into giving harmful information

7 shares

Feedback to editors

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Jun 28, 2024

Researchers develop the fastest possible flow algorithm

Jun 28, 2024

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Jun 28, 2024

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Jun 27, 2024

Wireless receiver blocks interference for better mobile device performance

Jun 27, 2024

Researchers successfully develop domestic 6G antenna measurement system

Jun 27, 2024

Research shows how common plastics could passively cool and heat buildings with the seasons

Jun 27, 2024

Researchers suggest smart solution to harness waste heat from industry

Jun 27, 2024

Robotic hand with tactile fingertips achieves new dexterity feat

Jun 27, 2024

Help or hindrance? ER robots have potential to aid health care workers

Jun 27, 2024

Load comments (0)

How easy is it to get AIs to talk like a partisan?

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Researchers develop the fastest possible flow algorithm

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Wireless receiver blocks interference for better mobile device performance

Researchers successfully develop domestic 6G antenna measurement system

Research shows how common plastics could passively cool and heat buildings with the seasons

Researchers suggest smart solution to harness waste heat from industry

Robotic hand with tactile fingertips achieves new dexterity feat

Help or hindrance? ER robots have potential to aid health care workers

Researchers find LLMs are easy to manipulate into giving harmful information

Improving AI large language models helps them better align with human brain activity

Researcher suggests how to effectively utilize large language models

Two types of LLMs found able to equal or outperform humans on theory of mind tests

Microsoft's small language model outperforms larger models on standardized math tests

Computer scientists find a better method to detect and prevent toxic AI prompts

Researchers develop the fastest possible flow algorithm

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Robotic hand with tactile fingertips achieves new dexterity feat

Sony introduces AI for single-instrument accompaniment generation in music production

Mechanical computer relies on kirigami cubes, not electronics

New work explores optimal circumstances for reaching a common goal with humanoid robots

Phys.org

Medical Xpress

Science X

How easy is it to get AIs to talk like a partisan?

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Researchers develop the fastest possible flow algorithm

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Wireless receiver blocks interference for better mobile device performance

Researchers successfully develop domestic 6G antenna measurement system

Research shows how common plastics could passively cool and heat buildings with the seasons

Researchers suggest smart solution to harness waste heat from industry

Robotic hand with tactile fingertips achieves new dexterity feat

Help or hindrance? ER robots have potential to aid health care workers

Related Stories

Researchers find LLMs are easy to manipulate into giving harmful information

Improving AI large language models helps them better align with human brain activity

Researcher suggests how to effectively utilize large language models

Two types of LLMs found able to equal or outperform humans on theory of mind tests

Microsoft's small language model outperforms larger models on standardized math tests

Computer scientists find a better method to detect and prevent toxic AI prompts

Recommended for you

Researchers develop the fastest possible flow algorithm

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Robotic hand with tactile fingertips achieves new dexterity feat

Sony introduces AI for single-instrument accompaniment generation in music production

Mechanical computer relies on kirigami cubes, not electronics

New work explores optimal circumstances for reaching a common goal with humanoid robots

Your Privacy