Identifying value stocks with machine learning

26-09-2022 | Research

Machine learning (ML) mispricing models are designed to detect hidden nonlinearities that are important in predicting the fundamental value of stocks. As such, they have the potential to outperform corresponding linear regression (LR) mispricing models. Thus, it is important to allow for nonlinearities and interactions in fundamental analysis.

  • Matthias Hanauer

Speed read

  • LR and ML models can be used to estimate fair firm values
  • ML-based signals fare better in predicting fundamental values
  • A combination of different ML signals can yield even better results

Fundamental analysis is an approach that is used to determine the intrinsic or fair value of a firm and forms the basis of evaluating whether the company is undervalued or overvalued. Investors can potentially gain from such assessments if they subscribe to the notion that a company’s share price converges to its fair value over the long run: either by buying undervalued stocks or selling overvalued ones.

According to the academic literature, fundamental analysis is typically based on highly stylized valuation approaches such as discounted cashflow models, which require inputs such as cashflow forecasts and discount rates. This approach is complicated by the discretion a researcher has over the choice of variables and parameters of the model.

Although these stylized models are extremely popular, explicit cash flow forecasts and discount rates are not necessarily required for fundamental analysis. For instance, an agnostic approach can estimate the fair value of a company as a linear function of its balance sheet, income statement and cashflow statement items.

To this end, a direct approach for estimating fair values is proposed by Bartram and Grinblatt in two academic studies.1 They “take the view of a statistician with little knowledge of finance” and use LR to proxy the “peer-implied fair value” of a firm as a linear function of 21 commonly reported accounting items. They find that their signal reliably predicts future returns in the US and most regions in the world, with the exception of the European market.
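The cross-sectional regression idea can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the studies: the data are synthetic, only three accounting items are used instead of 21, the fair value is taken to be the fitted value of an ordinary least squares regression of market capitalization on the accounting items within a single cross section, and the mispricing signal is taken to be the percentage gap between fair value and market value. The function names are hypothetical.

```python
import numpy as np


def peer_implied_fair_value(accounting_items, market_caps):
    """Regress market caps on accounting items (OLS with intercept) in one
    cross section and return the fitted values as fair-value estimates."""
    X = np.column_stack([np.ones(len(market_caps)), accounting_items])
    coef, *_ = np.linalg.lstsq(X, market_caps, rcond=None)
    return X @ coef


def mispricing_signal(fair_values, market_caps):
    """Assumed definition: positive values mean the stock looks undervalued."""
    return (fair_values - market_caps) / market_caps


# Synthetic cross section: 200 firms, 3 accounting items (illustrative only).
rng = np.random.default_rng(0)
n_firms = 200
items = rng.lognormal(mean=3.0, sigma=1.0, size=(n_firms, 3))
true_value = items @ np.array([1.5, 0.8, 2.0])
market_caps = true_value * rng.lognormal(mean=0.0, sigma=0.3, size=n_firms)

fair = peer_implied_fair_value(items, market_caps)
signal = mispricing_signal(fair, market_caps)

# Indices of the five firms with the highest (most undervalued) signal.
cheapest = np.argsort(signal)[-5:]
```

In this setup a firm is flagged as undervalued when peers with similar accounting items trade at a higher market value, which is the sense in which the fair value is "peer-implied".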


Taking a data scientist approach in valuing stocks

In a recent research paper,2 Hanauer, Kononova and Rapp opt for a different approach as they “take the view of a data scientist with little knowledge of finance”. Inspired by the studies of Bartram and Grinblatt, they apply LR and ML methods to estimate the monthly fair values of stocks from 17 European countries for the period January 1993 to December 2019. Then, based on the results, they assess the return predictability of the corresponding mispricing signals, i.e., the difference between model-based fair values and actual market values.

In their analysis, the researchers determined the fundamental values of stocks using six different approaches based on:

  • an LR model that closely followed the one set out by Bartram and Grinblatt,
  • a linear model estimated on the pooled cross section of stocks from the last 48 months, the same sample they used for the other approaches (LR pooled),
  • a model applying the least absolute shrinkage and selection operator (LASSO) to the 21 accounting variables,
  • a random forest model (RF),
  • a gradient boosting model (GBRT), and
  • a model that combines the RF and GBRT signals.

More specifically, the researchers used RF and GBRT models given that these can deal with nonlinearities and interactions, handle noisy features well, and do not require subtle tuning as is the case for more complex methods.
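To make the contrast between linear and tree-based models concrete, here is a minimal sketch using scikit-learn. It is not the researchers' setup: the data are synthetic, the models are fitted and evaluated in-sample on a single cross section, the hyperparameters are arbitrary, and the combined signal is assumed to be a simple equal-weighted average of the RF and GBRT signals (the article does not say how the two are combined). The synthetic value depends on an interaction between two features, which a linear regression cannot represent.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Synthetic firms: value depends on an interaction (x0 * x1) plus a linear term.
rng = np.random.default_rng(1)
n_firms = 500
X = rng.normal(size=(n_firms, 3))
value = 1.0 + X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=n_firms)

models = {
    "LR": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0),
    "GBRT": GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0),
}

# Fitted values play the role of model-based fair values.
fitted = {name: model.fit(X, value).predict(X) for name, model in models.items()}

# Signal = model-based fair value minus observed value.
signals = {name: fit - value for name, fit in fitted.items()}

# Assumed combination rule: equal-weighted average of the two ML signals.
signals["RF+GBRT"] = 0.5 * (signals["RF"] + signals["GBRT"])

# Mean squared signal per model: a smaller number means a tighter fit.
mse = {name: float(np.mean(s ** 2)) for name, s in signals.items()}
```

On data like this, the tree-based models fit the interaction term and leave much smaller gaps between fitted and observed values than the linear model does. A realistic application would fit on past data and evaluate out of sample.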

ML-based signals are effective in spotting mispricing opportunities

The researchers sorted the stocks into quintile portfolios based on the various mispricing signals. They observed that all the models reflected large negative (positive) mispricing signals for the first (fifth) quintile portfolios. Interestingly, the LASSO and ML signals were considerably smaller in magnitude than their LR counterparts due to the nonlinearity of their valuation models and their ability to better fit the data.

To assess the efficacy of the models, they calculated the value-weighted and industry-adjusted monthly portfolio returns to study the relationship between the mispricing signals and subsequent monthly returns. As depicted in Figure 1, they saw that the ML approaches generated statistically and economically significant industry-adjusted return spreads, benefiting uniformly from both long and short positions. While the LR and LASSO signal spreads were significant, their economic relevance was substantially weaker, with a higher portion of their returns coming from the short leg.

Figure 1 | ML-based models displayed efficacy in predicting fundamental values

Source: Refinitiv, Robeco. The figure shows the annualized Fama-French six-factor alphas for long minus short quintile portfolio returns based on mispricing signals obtained from different models. The quintile portfolio returns are value-weighted and industry adjusted. The sample period is January 1993 to November 2019.
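The portfolio sorts described above can be sketched in a few lines. This is an illustrative simplification with made-up numbers: it covers a single month, uses rank-based quintile breakpoints, and omits the industry adjustment that the researchers apply. The function names are hypothetical.

```python
import numpy as np


def quintile_labels(signal):
    """Label each stock 1 (lowest signal) to 5 (highest signal) by rank."""
    ranks = np.argsort(np.argsort(signal))  # 0 = lowest signal
    return ranks * 5 // len(signal) + 1


def value_weighted_return(returns, market_caps):
    """Market-cap-weighted average return of a group of stocks."""
    return float(np.sum(returns * market_caps) / np.sum(market_caps))


def long_short_spread(signal, returns, market_caps):
    """Return of the top-quintile portfolio minus the bottom-quintile one."""
    q = quintile_labels(signal)
    long_leg = value_weighted_return(returns[q == 5], market_caps[q == 5])
    short_leg = value_weighted_return(returns[q == 1], market_caps[q == 1])
    return long_leg - short_leg


# Ten hypothetical stocks: signal, next-month return, market cap.
signal = np.array([-0.4, -0.3, -0.1, 0.0, 0.1, 0.2, 0.3, 0.5, 0.6, 0.9])
returns = np.array([-0.02, -0.01, 0.00, 0.00, 0.01, 0.00, 0.01, 0.01, 0.02, 0.03])
market_caps = np.array([100.0, 300.0, 50.0, 80.0, 120.0, 60.0, 90.0, 40.0, 100.0, 100.0])

spread = long_short_spread(signal, returns, market_caps)
```

Repeating this each month yields a time series of long-short returns, which is the input for the factor-model tests.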

The researchers also verified the results by taking into account four different factor models. In their tests, they noted that the returns of the LR strategy were largely explained by the common factors. Similarly, the alphas for the LR (pooled) signal decreased. By contrast, the ML models delivered similar or even stronger alphas across all factor models. As such, ML methods seem to detect hidden nonlinearities that are important in predicting the fundamental value of stocks.
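A factor-model alpha of this kind is the intercept of a time-series regression of the strategy's returns on factor returns. The sketch below shows the mechanics on simulated data; the six factor series, the betas, the monthly alpha of 0.4% and the simple multiplication by 12 to annualize are all assumptions for illustration, not figures from the paper.

```python
import numpy as np


def annualized_alpha(strategy_returns, factor_returns, periods_per_year=12):
    """OLS of strategy returns on factor returns.

    Returns the annualized intercept (alpha) and the factor loadings (betas).
    """
    X = np.column_stack([np.ones(len(strategy_returns)), factor_returns])
    coef, *_ = np.linalg.lstsq(X, strategy_returns, rcond=None)
    return periods_per_year * coef[0], coef[1:]


# Simulated monthly data: six factor return series and a long-short spread.
rng = np.random.default_rng(2)
n_months = 320
factors = rng.normal(0.0, 0.03, size=(n_months, 6))
true_betas = np.array([0.1, -0.2, 0.5, 0.0, 0.1, 0.0])
spread = 0.004 + factors @ true_betas + rng.normal(0.0, 0.01, size=n_months)

alpha, betas = annualized_alpha(spread, factors)
```

If a strategy's returns are largely explained by the common factors, the estimated alpha shrinks toward zero once the factors are included; a strategy that captures something the factors miss keeps a sizeable intercept.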


ML methods are expected to discover additional structure in data due to their ability to spot nonlinear patterns. Consistent with this view, this analysis shows that the portfolio spreads based on ML mispricing signals can earn large and significant alphas, and outperform corresponding LR mispricing models. These findings suggest that it is important to allow for nonlinearities and interactions in fundamental analysis.

1 Bartram, S. M., and Grinblatt, M., April 2018, “Agnostic fundamental analysis works”, Journal of Financial Economics; and Bartram, S. M., and Grinblatt, M., January 2021, “Global market inefficiencies”, Journal of Financial Economics.
2 Hanauer, M. X., Kononova, M., and Rapp, M., April 2022, “Boosting agnostic fundamental analysis: using machine learning to identify mispricing in European stock markets”, Finance Research Letters.

Important information

The contents of this document have not been reviewed by any regulatory authority in Hong Kong. If you are in any doubt about any of the contents of this document, you should obtain independent professional advice. This document has been distributed by Robeco Hong Kong Limited (‘Robeco’). Robeco is regulated by the Securities and Futures Commission in Hong Kong.
This document has been prepared on a confidential basis solely for the recipient and is for information purposes only. Any reproduction or distribution of this documentation, in whole or in part, or the disclosure of its contents, without the prior written consent of Robeco, is prohibited. By accepting this documentation, the recipient agrees to the foregoing.
This document is intended to provide the reader with information on Robeco’s specific capabilities, but does not constitute a recommendation to buy or sell certain securities or investment products. Investment decisions should only be based on the relevant prospectus and on thorough financial, fiscal and legal advice.
The contents of this document are based upon sources of information believed to be reliable. This document is not intended for distribution to or use by any person or entity in any jurisdiction or country where such distribution or use would be contrary to local law or regulation.
Investment involves risks. Historical returns are provided for illustrative purposes only and do not necessarily reflect Robeco’s expectations for the future. The value of your investments may fluctuate. Past performance is no indication of current or future performance.


