December 19, 2019

Model beats Wall Street analysts in forecasting business financials

by Rob Matheson, Massachusetts Institute of Technology

Knowing a company's true sales can help determine its value. Investors, for instance, often employ financial analysts to predict a company's upcoming earnings using various public data, computational tools, and their own intuition. Now MIT researchers have developed an automated model that significantly outperforms humans in predicting business sales using very limited, "noisy" data.

In finance, there's growing interest in using imprecise but frequently generated consumer data—called "alternative data"—to help predict a company's earnings for trading and investment purposes. Alternative data can comprise credit card purchases, location data from smartphones, or even satellite images showing how many cars are parked in a retailer's lot. Combining alternative data with more traditional but infrequent ground-truth financial data—such as quarterly earnings, press releases, and stock prices—can paint a clearer picture of a company's financial health on even a daily or weekly basis.

But, so far, it's been very difficult to get accurate, frequent estimates using alternative data. In a paper published this week in the Proceedings of ACM Sigmetrics Conference, the researchers describe a model for forecasting financials that uses only anonymized weekly credit card transactions and three-month earning reports.

Tasked with predicting quarterly earnings of more than 30 companies, the model outperformed the combined estimates of expert Wall Street analysts on 57 percent of predictions. Notably, the analysts had access to any available private or public data and other machine-learning models, while the researchers' model used a very small dataset of the two data types.

"Alternative data are these weird, proxy signals to help track the underlying financials of a company," says first author Michael Fleder, a postdoc in the Laboratory for Information and Decision Systems (LIDS). "We asked, 'Can you combine these noisy signals with quarterly numbers to estimate the true financials of a company at high frequencies?' Turns out the answer is yes."

The model could give an edge to investors, traders, or companies looking to frequently compare their sales with competitors. Beyond finance, the model could help social and political scientists, for example, to study aggregated, anonymous data on public behavior. "It'll be useful for anyone who wants to figure out what people are doing," Fleder says.

Joining Fleder on the paper is EECS Professor Devavrat Shah, who is the director of MIT's Statistics and Data Science Center, a member of the Laboratory for Information and Decision Systems, a principal investigator for the MIT Institute for Foundations of Data Science, and an adjunct professor at the Tata Institute of Fundamental Research.

Tackling the "small data" problem

For better or worse, a lot of consumer data is up for sale. Retailers, for instance, can buy credit card transactions or location data to see how many people are shopping at a competitor. Advertisers can use the data to see how their advertisements are impacting sales. But getting those answers still primarily relies on humans. No machine-learning model has been able to adequately crunch the numbers.

Counterintuitively, the problem is actually lack of data. Each financial input, such as a quarterly report or weekly credit card total, is only one number. Quarterly reports over two years total only eight data points. Credit card data for, say, every week over the same period is only roughly another 100 "noisy" data points, meaning they contain potentially uninterpretable information.

"We have a 'small data' problem," Fleder says. "You only get a tiny slice of what people are spending and you have to extrapolate and infer what's really going on from that fraction of data."

For their work, the researchers obtained consumer credit card transactions—at typically weekly and biweekly intervals—and quarterly reports for 34 retailers from 2015 to 2018 from a hedge fund. Across all companies, they gathered 306 quarters-worth of data in total.

Computing daily sales is fairly simple in concept. The model assumes a company's daily sales remain similar, only slightly decreasing or increasing from one day to the next. Mathematically, that means sales values for consecutive days are multiplied by some constant value plus some statistical noise value—which captures some of the inherent randomness in a company's sales. Tomorrow's sales, for instance, equal today's sales multiplied by, say, 0.998 or 1.01, plus the estimated number for noise.

If given accurate model parameters for the daily constant and noise level, a standard inference algorithm can calculate that equation to output an accurate forecast of daily sales. But the trick is calculating those parameters.

Untangling the numbers

That's where quarterly reports and probability techniques come in handy. In a simple world, a quarterly report could be divided by, say, 90 days to calculate the daily sales (implying sales are roughly constant day-to-day). In reality, sales vary from day to day. Also, including alternative data to help understand how sales vary over a quarter complicates matters: Apart from being noisy, purchased credit card data always consist of some indeterminate fraction of the total sales. All that makes it very difficult to know how exactly the credit card totals factor into the overall sales estimate.

"That requires a bit of untangling the numbers," Fleder says. "If we observe 1 percent of a company's weekly sales through credit card transactions, how do we know it's 1 percent? And, if the credit card data is noisy, how do you know how noisy it is? We don't have access to the ground truth for daily or weekly sales totals. But the quarterly aggregates help us reason about those totals."

To do so, the researchers use a variation of the standard inference algorithm, called Kalman filtering or Belief Propagation, which has been used in various technologies from space shuttles to smartphone GPS. Kalman filtering uses data measurements observed over time, containing noise inaccuracies, to generate a probability distribution for unknown variables over a designated timeframe. In the researchers' work, that means estimating the possible sales of a single day.

To train the model, the technique first breaks down quarterly sales into a set number of measured days, say 90—allowing sales to vary day-to-day. Then, it matches the observed, noisy credit card data to unknown daily sales. Using the quarterly numbers and some extrapolation, it estimates the fraction of total sales the credit card data likely represents. Then, it calculates each day's fraction of observed sales, noise level, and an error estimate for how well it made its predictions.

The inference algorithm plugs all those values into the formula to predict daily sales totals. Then, it can sum those totals to get weekly, monthly, or quarterly numbers. Across all 34 companies, the model beat a consensus benchmark—which combines estimates of Wall Street analysts—on 57.2 percent of 306 quarterly predictions.

Next, the researchers are designing the model to analyze a combination of credit card transactions and other alternative data, such as location information. "This isn't all we can do. This is just a natural starting point," Fleder says.

More information: Michael Fleder et al. Forecasting with Alternative Data, Proceedings of the ACM on Measurement and Analysis of Computing Systems (2019). DOI: 10.1145/3366694

Provided by Massachusetts Institute of Technology

Citation: Model beats Wall Street analysts in forecasting business financials (2019, December 19) retrieved 26 April 2024 from https://techxplore.com/news/2019-12-wall-street-analysts-business-financials.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

$7,500 federal tax credit for Tesla buyers to end Dec. 31

426 shares

Feedback to editors

Proof of concept study shows path to easier recycling of solar modules

4 hours ago

New circuit boards can be repeatedly recycled

6 hours ago

Researchers develop an automated benchmark for language-based task planners

6 hours ago

Built-in bionic computing: Researchers develop method to control pneumatic artificial muscles

6 hours ago

Custom-made catalyst leads to longer-lasting and more sustainable green hydrogen production

6 hours ago

Researchers outline path forward for tandem solar cells

8 hours ago

Researcher develop high-performance amorphous p-type oxide semiconductor

8 hours ago

Scientists create new atomic clock that is both ultra-precise and sturdy

8 hours ago

A framework to compare lithium battery testing data and results during operation

11 hours ago

New approach could make reusing captured carbon far cheaper, less energy-intensive

16 hours ago

Load comments (0)

Model beats Wall Street analysts in forecasting business financials

Tackling the "small data" problem

Untangling the numbers

Proof of concept study shows path to easier recycling of solar modules

New circuit boards can be repeatedly recycled

Researchers develop an automated benchmark for language-based task planners

Built-in bionic computing: Researchers develop method to control pneumatic artificial muscles

Custom-made catalyst leads to longer-lasting and more sustainable green hydrogen production

Researchers outline path forward for tandem solar cells

Researcher develop high-performance amorphous p-type oxide semiconductor

Scientists create new atomic clock that is both ultra-precise and sturdy

A framework to compare lithium battery testing data and results during operation

New approach could make reusing captured carbon far cheaper, less energy-intensive

$7,500 federal tax credit for Tesla buyers to end Dec. 31

Want to optimize sales performance? Reduce commissions on sales of popular items and provide sales incentives

The Apple credit card is here

Tesla delivers record number of vehicles, cuts prices $2,000

Fiat Chrysler to pay $40 mn fine for misleading sales figures

Early US data show big jump in online holiday shopping

Researchers develop an automated benchmark for language-based task planners

Study explores why human-inspired machines can be perceived as eerie

Adobe's VideoGigaGAN uses AI to make blurry videos sharp and clear

Emulating neurodegeneration and aging in artificial intelligence systems

Microsoft claims that small, localized language models can be powerful as well

Scientists pioneer new X-ray microscopy method for data analysis 'on the fly'

Phys.org

Medical Xpress

Science X

Model beats Wall Street analysts in forecasting business financials

Tackling the "small data" problem

Untangling the numbers

Proof of concept study shows path to easier recycling of solar modules

New circuit boards can be repeatedly recycled

Researchers develop an automated benchmark for language-based task planners

Built-in bionic computing: Researchers develop method to control pneumatic artificial muscles

Custom-made catalyst leads to longer-lasting and more sustainable green hydrogen production

Researchers outline path forward for tandem solar cells

Researcher develop high-performance amorphous p-type oxide semiconductor

Scientists create new atomic clock that is both ultra-precise and sturdy

A framework to compare lithium battery testing data and results during operation

New approach could make reusing captured carbon far cheaper, less energy-intensive

Related Stories

$7,500 federal tax credit for Tesla buyers to end Dec. 31

Want to optimize sales performance? Reduce commissions on sales of popular items and provide sales incentives

The Apple credit card is here

Tesla delivers record number of vehicles, cuts prices $2,000

Fiat Chrysler to pay $40 mn fine for misleading sales figures

Early US data show big jump in online holiday shopping

Recommended for you

Researchers develop an automated benchmark for language-based task planners

Study explores why human-inspired machines can be perceived as eerie

Adobe's VideoGigaGAN uses AI to make blurry videos sharp and clear

Emulating neurodegeneration and aging in artificial intelligence systems

Microsoft claims that small, localized language models can be powerful as well

Scientists pioneer new X-ray microscopy method for data analysis 'on the fly'

Your Privacy