April 24, 2020

Researchers train tech tool to find relationship clues from written conversations

Credit: CC0 Public Domain

Social scientists have identified 10 dimensions that describe the nature of human relationships, but little research has examined how these concepts are expressed in written language or what role they play in shaping social interactions.

New research from the University of Michigan and Nokia Bell Labs has used crowdsourcing and a tech tool to detect how these characteristics are expressed in everyday language and how they shape social dynamics.

In particular, the researchers wanted to find out if conversations could provide insight into knowledge, wealth, education and mental illness, including suicide. By examining 160 million Reddit messages, 290,000 email messages from the defunct Enron Corp., and 300,000 lines of dialogue from movies, the researchers were able to identify the 10 characteristics in written communications.

Using natural language processing, the team predicted social dimensions, including the type of relationship between people (for example, one of conflict or support) and the type of real-world communities those relationships shape (for example, wealthy or deprived).

"We first demonstrate how we build those models for measuring the levels of each from a given conversation. We then show that our models perform well in predicting not only the dimensions that exist within a conversation, but also at a higher level, between individuals," said Minje Choi, a doctoral student at the School of Information, who conducted the research during an internship at Nokia Bell Labs.

"We also showed that levels of dimensions such as knowledge or can relate to societal outcomes such as how wealthy they are, or what the suicide rate is."

Choi, the study's first author, and colleagues used crowdsourcing to first identify messages according to the 10 characteristics: knowledge, power, status, trust, support, romance, similarity, identity, fun and conflict.

More than 900 crowdsourced annotators labeled 7,855 sentences from Reddit posts, 400 from movie lines and 436 from Enron emails, which demonstrated the presence of the 10 characteristics.

The researchers then trained a deep-learning classifying tool to look for those characteristics and the relationships they represented in all of the Reddit and Enron messages, and the movie dialogue.

They also used data from Tinghy.org, a gamified psychological test that measures Twitter users' perceptions of their online relationships using the 10 dimensions. They studied 1,772 relationships between 1,406 unique individuals.

In addition to identifying the known dimensions in the messages, the researchers found:

Choi said the team's hope is that others will use their model to continue to explore the connections between relationship dimensions and written communication.

"This can be used as an analysis tool for researchers who have these conversation data and would like to measure levels or changes in dimensions, such as social support or conflict from their data," Choi said. "It can be used to look for temporal changes (as we did in the Enron example) or community-wise differences (as we did with U.S. state-level Reddit comments)."
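
For readers curious what that kind of analysis might look like in practice, here is a minimal sketch in Python. It is not the researchers' code. It assumes each message has already been given a score between 0 and 1 for each dimension by some classifier, and it simply averages one dimension within each group, where a group could be a time period or a community. The function name, the record layout and the scores are all hypothetical and made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_group(records, dimension):
    """Average one dimension's score within each group.

    A group can be anything the analyst chooses, such as a time
    period or a community. The record layout is an assumption
    made for this sketch.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record["group"]].append(record["scores"][dimension])
    return {group: mean(values) for group, values in grouped.items()}


# Hypothetical per-message scores, invented for illustration only.
records = [
    {"group": "early", "scores": {"conflict": 0.1, "support": 0.6}},
    {"group": "early", "scores": {"conflict": 0.3, "support": 0.8}},
    {"group": "late", "scores": {"conflict": 0.7, "support": 0.2}},
    {"group": "late", "scores": {"conflict": 0.9, "support": 0.4}},
]

conflict_by_period = mean_score_by_group(records, "conflict")
support_by_period = mean_score_by_group(records, "support")
```

Comparing the averages across groups shows whether a dimension such as conflict rises or falls from one period or community to the next. In this invented data, the average conflict score is higher in the "late" group than in the "early" one.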

More information: Ten Social Dimensions of Conversations and Relationships, WWW '20: Proceedings of The Web Conference 2020. doi.org/10.1145/3366423.3380224
