How people's stance on a topic can be inferred from their online activity

As recent controversies such as the Facebook and Cambridge Analytica scandal have shown, social media can be a goldmine of user information. In fact, most social researchers and analytics companies regard social media as one of the most valuable resources for understanding public opinion and how individuals react to specific events.

With this in mind, research groups worldwide have been trying to develop tools to analyze social media activity and automatically gather information about people's stances on specific topics. In a recent study, a group of researchers at the University of Edinburgh has set out to unveil some of the key factors that can help to determine the stances of individuals based on their social media profiles. Their paper, pre-published on arXiv, offers interesting new insight that could lead to the development of more advanced analytics tools.

"Stance prediction on social media plays a critical role in various analytics studies aimed at gauging the public opinion about various topics," Abeer Aldayel, one of the researchers who carried out the study, told TechXplore. "Lately, research studies have proposed various methods to model stance on social media. This study examines how people's stance on specific topics can be predicted from social media data using multiple online interaction signals. One of the main messages of our paper is that there is a real concern about user privacy. We hope that this study will be used to raise the awareness of individuals about their activity online and how it can be used."

To better understand the online signals that can reveal a user's viewpoint on an event or topic, the researchers carried out an in-depth study on a popular stance-detection dataset, the SemEval stance dataset, which contains 4,000 tweets on five political, social and religious topics.

Aldayel and her colleague Dr. Magdy examined which online signals support stance prediction on social media, focusing on three key network interaction factors. The first factor, called 'interaction networks,' includes the accounts and web domains that users interact with or cite in their tweets. The second, called 'preference networks,' covers indirect interactions: the accounts and web domains that appear in posts that users have liked. The third and final factor, called the 'connection network,' includes all accounts that follow the users and that the users follow.
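
As a rough illustration of how these three factors might be represented, the following Python sketch builds each network as a set of accounts and web domains. The function names, the input format (plain tweet text plus lists of follower and followed accounts) and the regular expressions are assumptions made for this example; they are not taken from the researchers' paper.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns for @mentions and links inside tweet text.
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def references(text):
    """Return the @accounts and web domains that appear in one tweet."""
    accounts = {"@" + name.lower() for name in MENTION.findall(text)}
    domains = set()
    for url in URL.findall(text):
        # Drop trailing punctuation so it does not end up in the domain.
        netloc = urlparse(url.rstrip(".,;:!?)")).netloc.lower()
        if netloc:
            domains.add(netloc)
    return accounts | domains


def build_networks(own_tweets, liked_tweets, following, followers):
    """Build the three networks described above as plain Python sets."""
    interaction = set()
    for text in own_tweets:  # accounts and domains the user cites directly
        interaction |= references(text)

    preference = set()
    for text in liked_tweets:  # accounts and domains inside liked posts
        preference |= references(text)

    # Accounts the user follows plus accounts that follow the user.
    connection = {a.lower() for a in following} | {a.lower() for a in followers}

    return {
        "interaction": interaction,
        "preference": preference,
        "connection": connection,
    }
```

Each set could then be turned into features for a classifier, for instance one binary feature per account or domain. None of the three sets depends on what the user's own tweets say about the topic.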

"It's worth noting that these network factors are independent of having users expressing their stance toward the topic of analysis, since these factors depend on the social interactions and websites the users interacted with regardless of the content of their tweets," Aldayel explained.

The results gathered by the researchers suggest that a user's stance can be detected by analyzing multiple aspects of their online activity, including their posts, the accounts they interact with or follow, the websites they cite, and the content they like. Interestingly, when analyzing only network features, the team achieved performance similar to that of state-of-the-art models that focus on the textual content of posts alone. In addition, when combining network features (i.e., a user's online connections) and content features (i.e., a user's posts), the researchers achieved the highest stance detection performance reported to date, with an F-measure of 72.49 percent.
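
To make the reported score more concrete, here is a minimal Python sketch of an F-measure for stance labels. It assumes the convention of the SemEval-2016 stance task, in which the score is the average of the F1 values for the "favor" and "against" classes; the article itself does not spell out the formula, and the label strings and function names below are illustrative.

```python
def f1_for(label, gold, predicted):
    """F1 score for a single stance label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def stance_f_measure(gold, predicted):
    """Average of the F1 scores for FAVOR and AGAINST (assumed convention).

    Tweets labelled NONE are not averaged as a class of their own, but
    they still count as errors when predicted as FAVOR or AGAINST.
    """
    favor = f1_for("FAVOR", gold, predicted)
    against = f1_for("AGAINST", gold, predicted)
    return (favor + against) / 2


gold = ["FAVOR", "FAVOR", "AGAINST", "AGAINST", "NONE"]
predicted = ["FAVOR", "AGAINST", "AGAINST", "AGAINST", "FAVOR"]
score = stance_f_measure(gold, predicted)
```

Under this convention, a model that predicted every label correctly would score 1.0, so the reported 72.49 percent corresponds to a value of about 0.72 on this scale.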

"Our study demonstrates explicitly, through the use of online network features, how one can predict the unexpressed stance through the use of different network interaction signals," Aldayel said. "Most key online features can sometimes be topically unrelated to the topic of analysis and yet have a high impact on deciding the stance. For instance, the interactions with accounts such as @goodreads and @SkyNews help in detecting the stance toward feminist movement (FM) and climate change (CC), respectively."

Most previous studies focusing on stance detection did not demonstrate how each of the online 'traces' left by users can help to detect their stance on a given matter. Aldayel and her colleagues, on the other hand, gathered specific insight about the significance of each action that an individual social media user performs online, including 'silent' ones such as following accounts or liking others' posts.

"Another interesting finding of our study is that the overall similarity between accounts in each of the three networks is minuscule," Aldayel added. "This means that users tend to interact and like contents from users outside their connection network and like tweets with links generally different from the domains they link in their tweets. This is a very interesting finding, as it raises further research questions about the reason of having a similar performance for the three networks in stance detection when they are mostly different."

In the future, the observations collected by Aldayel and her colleagues could inform the development of more advanced analytics tools to detect people's stances based on their social media interactions. Their work, however, also provides important information for social media users, highlighting how much can be inferred about their views and opinions based on their actions online.

"We are now working on designing a methodological framework that could help to protect user privacy on social media," Aldayel said.

More information: Your stance is exposed! Analysing possible factors for stance detection on social media. arXiv:1908.03146 [cs.SI]. arxiv.org/abs/1908.03146