April 28, 2020
Researchers use machine learning to unearth underground Instagram 'pods'
Likes, shares, followers, and comments are the currency of online social networks. Posts with high levels of engagement are prioritized by content curation algorithms, allowing social network "influencers" to monetize the size and loyalty of their audience.
Yet not all engagement is organic, according to a team of researchers at New York University Tandon School of Engineering and Drexel University, who have published the first analysis of a robust underground ecosystem of "pods." These groups of users manipulate curation algorithms and artificially boost content popularity—whether to increase the reach of promoted content or amplify rhetoric—through a tactic known as "reciprocity abuse," whereby each member reciprocally interacts with content posted by other members of the group.
The researchers also developed a machine learning tool to detect posts with a high likelihood of having gained popularity through pod engagement. This tool could be deployed as part of content curation algorithms.
"One of the most surprising findings was how effective reciprocity abuse is at not only raising the visibility of a post, but in increasing real, organic engagement," said Rachel Greenstadt, associate professor of computer science and engineering at NYU Tandon and the lead author of the paper "The Pod People: Understanding Manipulation of Social Media Popularity via Reciprocity Abuse," published in the Proceedings of the The World Wide Web Conference. The team included NYU Tandon Professor of Computer Science and Engineering Damon McCoy, Ph.D. student Janith Weerasinghe, and Drexel University researchers Bailey Flanigan and Aviel Stein.
The project is the first characterization of the distinguishing features, usage patterns, and rules of operation of a portion of the pod ecosystem. It involved the analysis of 1.8 million Instagram posts belonging to 111,455 unique Instagram accounts, advertised across more than 400 Instagram pods hosted on the instant messaging service Telegram.
The team collected metadata from pod groups and gathered Instagram data associated with both pod posts and control posts. They used this data to train a classifier, a machine learning function that assigns labels to particular data points, to detect pod engagement, and then analyzed the efficacy of the pods to determine whether using them increases organic interaction.
The researchers used a machine learning model to predict with a high degree of precision whether or not an Instagram post was part of a pod, regardless of levels of interaction and engagement. By exploring how interactions with a post changed over time across users' profiles, they found that posting in pods boosted organic post interaction.
"The key observation driving our exploration was that pods are often advertised on the message boards of other pods, allowing us to search pod message boards to discover new pods," said Greenstadt. "There are likely a number of pods we did not observe that focus on special interest topics such as fashion, photography, or entrepreneurship, as well as pods with entry requirements based on the number of followers."
The researchers found that:
- Seventy percent of users experienced a two-fold or greater increase in interaction level on control posts after they began posting in pods, and on average, these users saw a five-fold increase in comments
- When users who had never posted in pods began posting 50% of their posts in pods, they saw a greater than five-fold increase in organic interaction with the posts that they did not post in pods
- Each pod had, on average, about 900 users, though some had as many as 17,000 users
- The barrier to entry is low: only 4% of the pods discovered required users to have a minimum number of followers before joining
- Very active pods received more than 4,000 messages per day
"Most attempts to game the system have involved techniques such as automated bots and scripts, and social media companies have gotten better at mitigating these attacks," said Weerasinghe. "Pods, however, involve humans taking action manually, so they are harder to detect." He pointed out that the team was able to detect posts that had been amplified by pod interactions by the style of comments and interaction timing, not merely the level of engagement.
Weerasinghe explained that while the researchers' approach was as precise as their limited data set would allow, broader commercial application of their methods could be much more accurate.
"We did this with limited data, but social media companies, because they have a much richer data set, can use a similar approach and create even better models," he said.
The ease with which the team could discover pods via Google search, the low barrier to joining them, and their structural consistency all increase the potential for these groups to be rapidly adopted, according to Weerasinghe. "Already there is evidence of recently increasing adoption of this strategy: the pods we discovered have emerged at an accelerating pace over the last two years."