Spotting misinformation online via artificial intelligence

We live in an era of too much information: an endless stream of status updates, memes, reposts, infographics, quotes and hashtags rolls daily through our social media feeds, meant to express viewpoints, drum up solidarity, provide information, change minds or cause controversy.

Problem is, the average online browser or social media user doesn't have the time or wherewithal to investigate the legitimacy or provenance of everything that shows up on their feeds. And it is this vulnerability that less scrupulous content generators exploit to spread misinformation, with results that can range from a little egg-on-face embarrassment to downright life-changing or potentially deadly consequences.

For UC Santa Barbara computer scientist William Wang, this chaotic morass is fertile ground for exploration. Wang believes that deep learning techniques, when deployed on the text and hyperlink network of online posts and news articles, can help us with some of the heavy lifting of critical thinking. This concept lies at the heart of his three-year project "Dynamo: Dynamic Multichannel Modeling of Misinformation."

"So the question is, given a post, how would you be able to understand whether this is specifically misleading or if this is a genuine post," Wang said, "and, given the structure of the network, can you identify the [misinformation] and how it is going to be different compared to standard or nonstandard articles?"

A tall order

It's a tall order, especially in the social media arena, which has leveled the playing field between legitimate, established news websites and questionable sites that do their best to look official, or appeal to a user's emotions before they can step back and question the source of their information.

However, thanks to natural language processing, Wang's specialty, the text in these posts and articles can be used to reveal information about their creators and propagators, such as their affiliations, ideologies and incentives for posting, as well as who their intended audience may be. The algorithm crawls through millions of articles posted by anonymized users on platforms such as Twitter and Reddit and examines the articles' titles, content and links. The purpose is to get a sense not only of the entities behind them, but also of their patterns of dissemination across the network.
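
To make the idea of reading links and dissemination patterns out of posts more concrete, here is a minimal sketch in Python. It is not the Dynamo project's code: the post format, the function names `linked_domains` and `sharing_network`, and the toy data are all assumptions made up for illustration. It simply pulls the linked domains out of each post and records which anonymized users shared each one.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Rough pattern for URLs in free text (an assumption; real posts are messier).
URL_PATTERN = re.compile(r"https?://\S+")


def linked_domains(text):
    """Return the set of domains linked from a piece of post text."""
    domains = set()
    for url in URL_PATTERN.findall(text):
        # Drop punctuation that often trails a URL at the end of a sentence.
        host = urlparse(url.rstrip(".,;:!?)\"'")).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.add(host)
    return domains


def sharing_network(posts):
    """Map each linked domain to the set of (anonymized) users who shared it."""
    network = defaultdict(set)
    for post in posts:
        for domain in linked_domains(post["text"]):
            network[domain].add(post["user"])
    return dict(network)


# Hypothetical toy posts; the user IDs stand in for anonymized accounts.
posts = [
    {"user": "u1", "text": "Wow: https://www.example.com/story-1"},
    {"user": "u2", "text": "Reposting https://example.com/story-1."},
    {"user": "u3", "text": "Read http://news.example.org/a?id=7 now"},
]
network = sharing_network(posts)
```

A mapping like this is only a starting point: it says who shared what, while the questions Wang raises about incentives and intent would need the text itself to be modeled as well.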

"A lot of us take websites for granted and casually retweet or repost misinformation and that's how it gets propagated, cascades and spreads virally," Wang said. "Some of the most important questions we're asking are: What are the patterns? What are the incentives?"

To find out, he and his team proposed a learning mechanism that susses out why certain stories get reposted or retweeted in addition to whether the content itself is true or false. Along the way, Wang said, they could figure out who is involved in the spread of the misinformation and what patterns might emerge in that process. Images will also become part of the dataset, he added.

Later on, the researchers plan to integrate other aspects of their work with misinformation, such as clickbait, which uses catchy, often sensational titles to lure readers into clicking a link that at best sends them to a dodgy website, or at worst, steals their information.

"Clickbait mainly is low-quality articles which can indeed contain a lot of misinformation and false information because they have to exaggerate," Wang said. Together with computer science Ph.D. student Jiawei Wu, Wang developed a method called "reinforced co-training," in which a few hundred articles are labeled and then used to train a machine learning classifier, which in turn labels what it thinks may be clickbait in an enormous, million-story dataset.

"Then we take these newly labeled instances and retrain the classifier," Wang said. "This iterative process allows us to collect more label data over time," he added, which refines the accuracy of the tool.
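
The label-then-retrain loop Wang describes can be sketched in a few lines of Python. This is a simplified, plain self-training loop with a toy word-count classifier, not the team's reinforced co-training method; the function names, the confidence threshold and the example headlines are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(title):
    return title.lower().split()


def train(labeled):
    """Fit per-class word counts from (title, label) pairs; label 1 = clickbait."""
    counts = {0: Counter(), 1: Counter()}
    for title, label in labeled:
        counts[label].update(tokenize(title))
    return counts


def clickbait_probability(counts, title):
    """Naive Bayes-style score with add-one smoothing and equal class priors."""
    vocab = set(counts[0]) | set(counts[1])
    total1 = sum(counts[1].values()) + len(vocab) + 1
    total0 = sum(counts[0].values()) + len(vocab) + 1
    log_odds = 0.0
    for word in tokenize(title):
        p1 = (counts[1][word] + 1) / total1
        p0 = (counts[0][word] + 1) / total0
        log_odds += math.log(p1 / p0)
    return 1 / (1 + math.exp(-log_odds))


def self_train(seed, unlabeled, rounds=3, threshold=0.8):
    """Repeatedly label confident examples and retrain on the enlarged set."""
    labeled, pool = list(seed), list(unlabeled)
    for _ in range(rounds):
        model = train(labeled)
        remaining = []
        for title in pool:
            p = clickbait_probability(model, title)
            if p >= threshold:
                labeled.append((title, 1))
            elif p <= 1 - threshold:
                labeled.append((title, 0))
            else:
                remaining.append(title)  # not confident enough yet
        if len(remaining) == len(pool):
            break  # nothing new was labeled this round
        pool = remaining
    return train(labeled), labeled


# Hypothetical hand-labeled seed set and unlabeled pool.
seed = [
    ("you won't believe this shocking trick", 1),
    ("shocking secret doctors hate", 1),
    ("city council approves annual budget", 0),
    ("senate committee reviews budget report", 0),
]
unlabeled = [
    "shocking trick you must see",
    "council reviews annual report",
    "weather today",
]
model, labeled = self_train(seed, unlabeled)
```

In this toy run the two headlines that resemble the seed examples are labeled and folded back into the training set, while the ambiguous one stays unlabeled. A real system would need a much stronger classifier and some safeguard against reinforcing its own mistakes.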

Using artificial intelligence to understand and find patterns in the tidal wave of text we send each other every day would give us insight into how we, intentionally or unwittingly, propagate misinformation.

"That's really the beauty of [natural language] processing and machine learning," Wang said. "We have a huge amount of data in different formats, and the question is: How do you turn unstructured data into structured knowledge? That's one of the goals of deep learning and of data science."

Provided by University of California
Citation: Spotting misinformation online via artificial intelligence (2019, August 8) retrieved 20 April 2024 from
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
