July 15, 2019
Study finds online restaurant information can closely predict key neighborhood indicators
Apartment seekers in big cities often use the presence of restaurants to determine if a neighborhood would be a good place to live. It turns out there is a lot to this rule of thumb: MIT urban studies scholars have now found that in China, restaurant data can be used to predict key socioeconomic attributes of neighborhoods.
Indeed, using online restaurant data, the researchers say, they can effectively predict a neighborhood's daytime population, nighttime population, the number of businesses located in it, and the amount of overall spending in the neighborhood.
"The restaurant industry is one of the most decentralized and deregulated local consumption industries," says Siqi Zheng, an urban studies professor at MIT and co-author of a new paper outlining the findings. "It is highly correlated with local socioeconomic attributes, like population, wealth, and consumption."
Using restaurant data as a proxy for other economic indicators can have a practical purpose for urban planners and policymakers, the researchers say. In China, as in many places, a census is taken only once a decade, which makes it difficult to track how a city's ever-changing areas evolve between counts. Thus, new methods of quantifying residential levels and economic activity could help guide city officials.
"Even without census data, we can predict a variety of a neighborhood's attributes, which is very valuable," adds Zheng, who is the Samuel Tak Lee Associate Professor of Real Estate Development and Entrepreneurship, and faculty director of the MIT China Future City Lab.
"Today there is a big data divide," says Carlo Ratti, director of MIT's Senseable City Lab, and a co-author of the paper. "Data is crucial to better understanding cities, but in many places we don't have much [official] data. At the same time, we have more and more data generated by apps and websites. If we use this method we [can] understand socioeconomic data in cities where they don't collect data."
The paper, "Predicting neighborhoods' socioeconomic attributes using restaurant data," appears in the Proceedings of the National Academy of Sciences. The authors are Zheng, who is the corresponding author; Ratti; and Lei Dong, a postdoc co-hosted by the MIT China Future City Lab and the Senseable City Lab.
The study takes a close neighborhood-level look at nine cities in China: Baoding, Beijing, Chengdu, Hengyang, Kunming, Shenyang, Shenzhen, Yueyang, and Zhengzhou. To conduct the study, the researchers extracted restaurant data from the website Dianping, which they describe as the Chinese equivalent of Yelp, the English-language business-review site.
By matching the Dianping data to reliable, existing data for those cities (including anonymized and aggregated mobile phone location data from 56.3 million people, bank card records, company registration records, and some census data), the researchers found they could predict 95 percent of the variation in daytime population among neighborhoods. They also predicted 95 percent of the variation in nighttime population, 93 percent of the variation in the number of businesses, and 90 percent of the variation in levels of consumer spending.
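For readers unfamiliar with the phrasing, "percent of the variation predicted" is commonly expressed as the coefficient of determination, R². The sketch below shows how that number is computed for a simple one-variable fit. It is only an illustration: the neighborhood figures are made up, the helper names `fit_line` and `r_squared` are hypothetical, and a single-predictor straight line stands in for whatever model the researchers actually used.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(actual, predicted):
    """Share of the variance in `actual` accounted for by `predicted`."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up illustration data, NOT from the study:
# restaurant counts and daytime population for six neighborhoods.
restaurants = [12, 30, 45, 60, 85, 110]
daytime_pop = [1500, 3200, 4400, 6500, 8300, 11500]

intercept, slope = fit_line(restaurants, daytime_pop)
predicted = [intercept + slope * r for r in restaurants]
score = r_squared(daytime_pop, predicted)
print(f"R^2 = {score:.3f}")
```

An R² of 0.95 would correspond to the "95 percent of the variation" wording. Note that this toy example scores the fit on the same data it was trained on, which tends to flatter the model.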
"We have used new publicly available data and developed new data augmentation methods to address these urban issues," says Dong, who adds that the study's model is a "new contribution to [the use of] both data science for social good, and big data for urban economics communities."
The researchers note that this is a more accurate proxy for estimating neighborhood-level demographic and economic activity than other methods previously used. For instance, other researchers have used satellite imaging to calculate the amount of nighttime light in cities, and in turn used the quantity of light to estimate neighborhood-level activity. While that method fares well for population estimates, the restaurant-data method is better overall, and much better at estimating business activity and consumer spending.
Zheng says she feels "confident" that the researchers' model could be applied to other Chinese cities because it already shows good predictive power across cities. But the researchers also believe the method they employed—which uses machine learning techniques to zero in on significant correlations—could potentially be applied to cities around the globe.
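One common way to check whether a model transfers across cities is to train it on some cities and score it on a city it has never seen. The sketch below illustrates that idea under loud assumptions: the city names and numbers are invented, the function `leave_one_city_out` is hypothetical, and a one-variable linear fit is swapped in for the machine learning techniques the article mentions. It is not the researchers' procedure.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(actual, predicted):
    """Share of the variance in `actual` accounted for by `predicted`."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented data, NOT from the study: for each placeholder city, a list of
# (restaurant_count, nighttime_population) pairs, one per neighborhood.
cities = {
    "city_a": [(10, 1100), (40, 4100), (75, 7400), (120, 12300)],
    "city_b": [(15, 1700), (35, 3400), (90, 9200), (130, 12800)],
    "city_c": [(20, 2300), (55, 5200), (80, 8300), (100, 9900)],
}


def leave_one_city_out(cities):
    """Fit on all cities but one, then score predictions on the held-out city."""
    scores = {}
    for held_out, test_rows in cities.items():
        train_rows = [
            row
            for name, rows in cities.items()
            if name != held_out
            for row in rows
        ]
        intercept, slope = fit_line(
            [count for count, _ in train_rows],
            [pop for _, pop in train_rows],
        )
        actual = [pop for _, pop in test_rows]
        predicted = [intercept + slope * count for count, _ in test_rows]
        scores[held_out] = r_squared(actual, predicted)
    return scores


scores = leave_one_city_out(cities)
for name, value in scores.items():
    print(f"held out {name}: R^2 = {value:.3f}")
```

Scoring on a held-out city is a stricter test than scoring on the training data, because a model that merely memorizes one city's quirks will do poorly on another.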
"These results indicate the restaurant data can capture common indicators of socioeconomic outcomes, and these commonalities can be transferred ... with reasonable accuracy in cities where survey outcomes are unobserved," the researchers state in the paper.
As the scholars acknowledge, their study observed correlations between restaurant data and neighborhood characteristics, rather than specifying the exact causal mechanisms at work. Ratti notes that the causal link between restaurants and neighborhood characteristics can run both ways: Sometimes restaurants can fill demand in an already-thriving area, while at other times their presence is a harbinger of future development.
"There is always [both] a push and a pull" between restaurants and neighborhood development, Ratti says. "But we show the socioeconomic data is very well-reflected in the restaurant landscape, in the cities we look at. The interesting finding is that this seems to be so good as a proxy."
Zheng says she hopes additional scholars will pick up on the method, which in principle could be applied to many urban studies topics.
"The restaurant data itself, as well as the variety of neighborhood attributes it predicts, can help other researchers study all kinds of urban issues, which is very valuable," Zheng says.