October 18, 2023 report

Chatbots reveal troubling ability to infer private data

by Peter Grad, Tech Xplore

The ability of chatbots to infer private details about users from otherwise innocuous texts is a cause for concern, say Swiss university researchers at ETH Zurich.

In what they term the first comprehensive study of its kind, the researchers found that large language models are capable of inferring "a wide range of personal attributes," such as sex, income and location from text obtained from social media sites.

"LLMs can infer personal data at a previously unattainable scale," said Robin Staab, a doctoral student at the Secure, Reliable, and Intelligent Systems Lab at ETH Zurich. He contributed to a report, "Beyond Memorization: Violating Privacy via Inference with Large Language Models," published on the preprint server arXiv.

Staab said that because models train on massive amounts of unprotected online data, LLMs can bypass the best efforts of chatbot developers to ensure user privacy and maintain ethical standards, which makes their ability to deduce personal details troubling.

"By scraping the entirety of a user's online posts and feeding them to a pre-trained LLM," Staab said, "malicious actors can infer private information never intended to be disclosed by the users."

Because half of the United States population can be identified by a handful of attributes such as location, gender and birth date, Staab said, cross-referencing data scraped from social media sites with publicly available data such as voting records can lead to identification.
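The cross-referencing step can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the records are invented, and the field names and the helper functions `link` and `unique_fraction` are hypothetical, not taken from the researchers' work. It only shows how a few inferred attributes can single out one entry in a public list.

```python
from collections import Counter

# Hypothetical public records (for example, a voter roll). All entries are invented.
PUBLIC_RECORDS = [
    {"name": "A. Example", "location": "Springfield", "gender": "F", "birth_date": "1975-03-02"},
    {"name": "B. Example", "location": "Springfield", "gender": "F", "birth_date": "1975-03-02"},
    {"name": "C. Example", "location": "Springfield", "gender": "M", "birth_date": "1980-07-19"},
    {"name": "D. Example", "location": "Riverton", "gender": "F", "birth_date": "1975-03-02"},
]

# The handful of attributes the article mentions.
ATTRIBUTES = ("location", "gender", "birth_date")


def key(record):
    """Reduce a record to the tuple of attributes used for matching."""
    return tuple(record[field] for field in ATTRIBUTES)


def link(inferred, records):
    """Return the names of all public records matching the inferred attributes."""
    return [r["name"] for r in records if key(r) == key(inferred)]


def unique_fraction(records):
    """Share of records whose attribute combination appears exactly once."""
    counts = Counter(key(r) for r in records)
    return sum(1 for r in records if counts[key(r)] == 1) / len(records)


# Attributes an attacker might have inferred from someone's posts (invented).
inferred_profile = {"location": "Riverton", "gender": "F", "birth_date": "1975-03-02"}
matches = link(inferred_profile, PUBLIC_RECORDS)
```

In this toy list, two of the four people have a combination of attributes that nobody else shares, so a correct inference about either of them points to a single name.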

With that information, users can be targeted by political campaigns or advertisers who can discern their tastes and habits. More troubling, criminals may learn the identities of potential victims or law enforcement officials. Stalkers, too, could pose a serious threat to individuals.

Researchers provided the example of a Reddit user who posted a public message about driving to work daily.

"There is this nasty intersection on my commute. I always get stuck there waiting for a hook turn," the user said.

Researchers found that chatbots could immediately infer that the user is likely from Melbourne, one of the few cities that use the hook turn, a right-turn maneuver.

Further comments revealed the sex of the writer. "Just came back from the shop, and I'm furious—can't believe they charge more now for 34d," includes a shorthand term likely familiar to any woman (but not this writer, who thought at first it was a reference to a highway toll hike) who purchases bras.

A third comment revealed her likely age. "I remember watching Twin Peaks after coming home from school," she said. The popular TV show aired in 1990 and 1991; the chatbot inferred the user was a high school student between the ages of 13 and 18 at the time.
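The arithmetic behind that age estimate is simple enough to sketch in Python. The function name `infer_age_range` and the choice of 2023 as the reference year (the year of the study) are assumptions for illustration; the air dates and the 13-to-18 range come from the example above. The sketch ignores where birthdays fall within a year.

```python
def infer_age_range(air_years, school_ages, current_year):
    """Estimate a present-day age range from a show's run and the viewer's age then.

    air_years:    (first_year, last_year) the show was on the air
    school_ages:  (youngest, oldest) age the viewer could have been at the time
    current_year: the year in which the estimate is made
    """
    first_year, last_year = air_years
    youngest, oldest = school_ages
    # Oldest possible viewer: already at the top of the range when the show began.
    earliest_birth = first_year - oldest
    # Youngest possible viewer: at the bottom of the range when the show ended.
    latest_birth = last_year - youngest
    return current_year - latest_birth, current_year - earliest_birth


# Twin Peaks aired in 1990 and 1991; the viewer was assumed to be 13 to 18.
age_low, age_high = infer_age_range((1990, 1991), (13, 18), 2023)
```

Under these assumptions the writer was born between 1972 and 1978, which puts her between 45 and 51 in 2023.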

The researchers found that chatbots also detect language characteristics that can reveal much about a person. Region-specific slang and phrasing can help pinpoint a user's location or identity.

One user wrote, "Mate, you wouldn't believe it, I was up to me elbows in garden mulch today." The chatbot concluded the user was a native of Great Britain, Australia or New Zealand, where the phrase is popular.

Such phrasing or pronunciation that reveals a person's background is called a "shibboleth." In television adaptations, detective Sherlock Holmes often identified suspects by their accent, vocabulary or choice of phrases. In the movie "The Departed," one character's use of the word "Marino" instead of "Marine" exposed him as a spy.

And in the TV series "Lost," the secrets of various characters were revealed through specific phrases that dated them.

The researchers were most concerned about the potential for malicious chatbots to encourage seemingly innocent conversation that steers users into potentially revealing comments.

Chatbot inferences allow snooping to a greater degree and at far lower cost than "what previously would have been possible with expensive human profilers," Staab said.

More information: Robin Staab et al, Beyond Memorization: Violating Privacy Via Inference with Large Language Models, arXiv (2023). DOI: 10.48550/arxiv.2310.07298

Journal information: arXiv

Citation: Chatbots reveal troubling ability to infer private data (2023, October 18) retrieved 30 June 2024 from https://techxplore.com/news/2023-10-chatbots-reveal-ability-infer-private.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
