May 27, 2022
AI analysis of social media data shows language related to depression didn't spike after initial pandemic wave
Researchers who analyzed language related to depression on social media during the pandemic say the data suggest people learned to cope as the waves wore on.
University of Alberta researcher Alona Fyshe and her collaborators at the University of Western Ontario hypothesized that depression-related language would spike during each wave of COVID-19. But their study shows that wasn't the case.
"There was a big reaction at the beginning and then people sort of found their new normal," says Fyshe, an assistant professor of computing science and psychology. "It's a message of resilience, people figuring out how to keep on keeping on in a pandemic."
For the study, the researchers turned their attention to online platforms such as Reddit and Twitter. Social media is a useful tool in assessing mental health at the population level, explains Fyshe, a fellow of the Alberta Machine Intelligence Institute and Canada CIFAR AI chair.
The researchers first identified keywords by analyzing the language posters used in discussions on Reddit. The kind of self-identification found in those subreddits and forums isn't available on many other social media platforms, Fyshe explains.
"Essentially we trained a machine learning model that can differentiate between the language of people who post to a thread on the topic of depression versus people who don't," says Fyshe.
Using this information and the identified keywords, they turned their attention to Twitter. They analyzed data from four cities (Sydney, Mumbai, Seattle and Toronto) that experienced different waves of COVID-19, so they could determine which changes in language reflected global trends and which were local. They restricted the data to areas with a large percentage of English-language tweets so they could use the same methodology to analyze all the data.
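One simple way to picture this kind of analysis is to track, for each city and week, the share of tweets that contain at least one keyword, and then compare the resulting curves across cities. The sketch below does that under stated assumptions: the keyword set, the sample tweets and the function name `weekly_keyword_rate` are hypothetical, and the study's own measure may differ.

```python
from collections import defaultdict
from datetime import date

# Hypothetical keyword set; the study's actual keywords are not listed in the article.
KEYWORDS = {"hopeless", "worthless", "empty"}

def weekly_keyword_rate(tweets, keywords):
    """Return {(city, iso_year, iso_week): share of tweets containing a keyword}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for city, day, text in tweets:
        iso = day.isocalendar()  # (ISO year, ISO week number, weekday)
        key = (city, iso[0], iso[1])
        totals[key] += 1
        if keywords & set(text.lower().split()):
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

# Invented sample records: (city, date, tweet text).
tweets = [
    ("Toronto", date(2020, 3, 16), "feeling hopeless about everything"),
    ("Toronto", date(2020, 3, 17), "nice weather today"),
    ("Toronto", date(2020, 3, 23), "coffee with a friend"),
    ("Mumbai", date(2020, 3, 16), "so empty and tired"),
]

rates = weekly_keyword_rate(tweets, KEYWORDS)
```

A rise that appears in all four cities in the same weeks would point to a global trend, while a rise confined to one city would point to something local.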
The results were surprising, says Fyshe. In general, spikes in COVID-19 cases and the various waves throughout the pandemic weren't reflected in the data. In fact, the only city with an increase in depression-related language after the first wave was Mumbai, which saw a significant second wave.
Fyshe says the machine learning approach, which scrapes Reddit subforums to identify keywords and then analyzes Twitter data, could be applied to a wide range of subjects. For example, when examining data from Seattle, the researchers found strong reactions to the Black Lives Matter movement.
"It was indicative of there being a large change to the general mood—what people were talking about and how people were feeling about the world they lived in."
The research was published in the International Journal of Population Data Science.