Model beats Wall Street analysts in forecasting business financials

Using limited data, this automated system predicts a company’s quarterly sales.

An automated machine-learning model developed by MIT researchers significantly outperforms human Wall Street analysts in predicting quarterly business sales.

An automated machine-learning model developed by MIT researchers significantly outperforms human Wall Street analysts in predicting quarterly business sales.

Knowing a company’s true sales can help determine its value. Investors, for instance, often employ financial analysts to predict a company’s upcoming earnings using various public data, computational tools, and their own intuition. Now MIT researchers have developed an automated model that significantly outperforms humans in predicting business sales using very limited, “noisy” data.

In finance, there’s growing interest in using imprecise but frequently generated consumer data – called “alternative data” – to help predict a company’s earnings for trading and investment purposes. Alternative data can comprise credit card purchases, location data from smartphones, or even satellite images showing how many cars are parked in a retailer’s lot. Combining alternative data with more traditional but infrequent ground-truth financial data – such as quarterly earnings, press releases, and stock prices – can paint a clearer picture of a company’s financial health on even a daily or weekly basis.

But, so far, it’s been very difficult to get accurate, frequent estimates using alternative data. In a paper published this week in the Proceedings of ACM Sigmetrics Conference, the researchers describe a model for forecasting financials that uses only anonymized weekly credit card transactions and three-month earning reports.

Tasked with predicting quarterly earnings of more than 30 companies, the model outperformed the combined estimates of expert Wall Street analysts on 57 percent of predictions. Notably, the analysts had access to any available private or public data and other machine-learning models, while the researchers’ model used a very small dataset of the two data types.

/University Release. View in full here.