Research on pre-1926 database reveals equity factors are ‘eternal’

21-12-2021 | Research

New research reveals that equity factor premiums have existed and persisted since the mid-19th century. The finding is based on a groundbreaking US stock database built by a team of Robeco researchers, led by Guido Baltussen, in collaboration with the Erasmus University. It underlines that factor premiums do not depend on specific market regimes, which is good news for long-term quant investors.

  • Guido Baltussen
    Head of Factor Investing
  • Bart van Vliet
    Investment Specialist
  • Pim van Vliet, PhD
    Head of Conservative Equities and Head of Quantitative Equities

Speed read

  • Authors constructed a novel US stock database, from 1866 to 1926
  • Low risk, Momentum and Value premiums are significant in this era
  • Analysis showcases the potential of machine learning techniques

Over the last few decades, asset pricing literature has uncovered numerous equity factors, such as low risk, momentum and value, that explain cross-sectional differences in stock returns. The empirical evidence presented in support of these findings has largely relied on the Center for Research in Security Prices (CRSP) database, which houses US stock data – including returns – dating all the way back to 1926.

This sample period has been so intensively analyzed that many experts have warned that factor studies could be plagued by data dredging or p-hacking effects.¹ In other words, many of the factors that seem important in-sample could lose explanatory power, or fail to hold up altogether, out-of-sample. This issue can be addressed with a truly independent and sufficiently large dataset that can be used for out-of-sample testing.


Constructing a novel database

To this end, Guido Baltussen, Bart van Vliet and Pim van Vliet (from our Quantitative Investing team), in collaboration with the Erasmus University, have constructed a novel US stock database for the period 1866 to 1926, containing stock prices, dividend yields and market capitalization values. This huge effort, spanning several years, entailed the hand-collection of market capitalization data, double-checking of all inputs, as well as data cleaning and adjustments for stock delistings and stock splits using digitized financial journals. The team then merged this information with data from an external data provider – Global Financial Data – for the same period.

This 61-year ‘pre-CRSP’ sample period is of similar length to those used in existing CRSP-based studies, and covers an economically important period that is independent of prevailing datasets. This era was characterized by strong economic growth and rapid industrial development, laying the foundations for the preeminence of the US economy. Meanwhile, the US stock market played a pivotal role in economic growth and the financing of key innovations during this phase.

The novel database provides new ground for independent tests that allow us to better understand return drivers and stock prices. The authors used the data to examine the cross-section of US stock returns over the pre-CRSP period in their research.² The research focused on well-documented stock characteristics, namely beta, momentum (12-1 month price momentum), short-term reversal (1-month), size and value (dividend yield).
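As a rough illustration of how the two return-based characteristics could be computed from a stock's monthly return history, the sketch below uses the common convention for 12-1 month momentum (the cumulative return over the prior 12 months, skipping the most recent one) and takes the latest monthly return as the short-term reversal signal. The function names and input layout are assumptions made for this example and are not taken from the paper.

```python
from math import prod


def momentum_12_1(monthly_returns):
    """Cumulative return over months t-12 to t-2, skipping month t-1.

    `monthly_returns` holds simple returns ordered oldest to newest;
    the last element is the most recent completed month (t-1).
    """
    if len(monthly_returns) < 12:
        raise ValueError("need at least 12 monthly returns")
    window = monthly_returns[-12:-1]  # 11 months: t-12 .. t-2
    return prod(1.0 + r for r in window) - 1.0


def short_term_reversal(monthly_returns):
    """Most recent monthly return (t-1); reversal strategies buy low values."""
    if not monthly_returns:
        raise ValueError("need at least one monthly return")
    return monthly_returns[-1]


# Hypothetical history: 1% a month for 11 months, then a 50% jump.
history = [0.01] * 11 + [0.50]
print(momentum_12_1(history))        # ignores the final 50% month
print(short_term_reversal(history))  # picks up only the final month
```

Skipping the latest month keeps the momentum signal separate from the 1-month reversal effect, which is why the two characteristics can be studied side by side.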

Evidence of equity factor premiums pre-1926

The analysis started with Fama-MacBeth regressions³ and univariate portfolio sorts on the dataset. The authors found that market beta was not priced and the capital asset pricing model (CAPM) largely failed to explain asset prices, as low-beta stocks generated positive alpha and high-beta stocks delivered negative alpha. Furthermore, momentum and value exhibited significant premiums and return spreads. By contrast, size failed to do so on both counts, while short-term reversal displayed a significant premium but yielded an insignificant return spread.
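For readers unfamiliar with the procedure, a Fama-MacBeth regression runs in two steps: each month, stock returns are regressed on lagged characteristics across stocks, and the resulting monthly slopes are then averaged over time and tested against zero. The sketch below shows a single-characteristic version in plain Python. The data layout and function names are hypothetical, and the authors' actual specification (several characteristics at once, any standard-error adjustments) is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope of a univariate OLS regression of y on x (with intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def fama_macbeth(panel):
    """Two-step Fama-MacBeth estimate for one characteristic.

    `panel` is a list of months; each month is a list of
    (lagged_characteristic, subsequent_return) pairs, one per stock.
    Returns (average_premium, t_statistic).
    """
    # Step 1: one cross-sectional regression per month.
    slopes = [
        ols_slope([c for c, _ in month], [r for _, r in month])
        for month in panel
    ]
    # Step 2: time-series average of the slopes and its t-statistic.
    premium = mean(slopes)
    t_stat = premium / (stdev(slopes) / sqrt(len(slopes)))
    return premium, t_stat


# Hypothetical three-month panel with three stocks per month.
panel = [
    [(0, 0.00), (1, 0.01), (2, 0.02)],
    [(0, 0.00), (1, 0.03), (2, 0.06)],
    [(0, 0.00), (1, 0.02), (2, 0.04)],
]
premium, t_stat = fama_macbeth(panel)
print(premium, t_stat)
```

A characteristic is said to carry a significant premium when the average slope is reliably different from zero, which is the sense in which momentum and value are significant in the text above.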

The authors then built market-neutral and size-neutral factor portfolios, by double-sorting on size and a specific factor characteristic. They observed economically substantial and statistically significant premiums and CAPM alphas for low-risk (beta), momentum and value (dividend yield), while the size premium was again insignificant for both measures. In terms of short-term reversal, they saw significant premiums but insignificant CAPM alphas. The main results are summarized in Figures 1 and 2.
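The double sort described above can be sketched for a single month as follows. This is a minimal illustration under stated assumptions: a median size split, 30th/70th percentile breakpoints on the characteristic, independent sorts and equal weighting within each cell. None of these choices is confirmed by the article, and the helper names are hypothetical.

```python
from statistics import mean, median


def percentile(values, q):
    """Linear-interpolation percentile for q between 0 and 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def factor_return_2x3(stocks):
    """High-minus-low return from a 2x3 size/characteristic sort.

    `stocks` is a list of dicts with keys 'size', 'signal' and 'ret'
    for one month. The middle signal group is left out of both legs.
    """
    size_cut = median(s["size"] for s in stocks)
    signals = [s["signal"] for s in stocks]
    low_cut, high_cut = percentile(signals, 0.3), percentile(signals, 0.7)

    def cell_return(is_small, is_high):
        rets = [
            s["ret"] for s in stocks
            if (s["size"] <= size_cut) == is_small
            and (s["signal"] >= high_cut if is_high else s["signal"] <= low_cut)
        ]
        return mean(rets)  # raises StatisticsError if the cell is empty

    # Averaging the small and big cells makes each leg size-neutral.
    high_leg = (cell_return(True, True) + cell_return(False, True)) / 2
    low_leg = (cell_return(True, False) + cell_return(False, False)) / 2
    return high_leg - low_leg


# Hypothetical month with eight stocks; returns rise with the signal.
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
signals = [1, 4, 5, 8, 2, 3, 6, 7]
stocks = [{"size": sz, "signal": sg, "ret": sg / 100}
          for sz, sg in zip(sizes, signals)]
print(factor_return_2x3(stocks))
```

Because each leg holds small and large stocks in equal proportion, the resulting spread is not driven by any size effect, which is the point of sorting on size first.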

Figure 1 | Return spread (%), for the periods 1866 to 1926 and 1927 to 2019

Source: Robeco Quantitative Research. The figure shows the average annualized returns for the size, value, momentum, short-term reversal and beta factors for the pre-CRSP and CRSP samples. Factors are constructed from top-bottom portfolios from 2x3 size-characteristic-based portfolios. The pre-CRSP sample starts in January 1866 and ends December 1926. The CRSP sample runs between January 1927 and December 2019. Performance is measured on a monthly frequency.

Figure 2 | CAPM alpha (%), for the periods 1866 to 1926 and 1927 to 2019

Source: Robeco Quantitative Research. The figure shows the average annualized CAPM alphas for the size, value, momentum, short-term reversal and beta factors for the pre-CRSP and CRSP samples. Factors are constructed from top-bottom portfolios from 2x3 size-characteristic-based portfolios. The pre-CRSP sample starts in January 1866 and ends December 1926. The CRSP sample runs between January 1927 and December 2019. Performance is measured on a monthly frequency.
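As a reminder of what a CAPM alpha measures, the sketch below regresses a factor's monthly returns on the market's excess returns and reports the intercept. It assumes the factor is a long-short portfolio (so no risk-free rate is subtracted from it) and annualizes by simple multiplication by 12. Both are simplifying assumptions for this example rather than details taken from the paper.

```python
from statistics import mean


def capm_alpha(factor_returns, market_excess_returns, periods_per_year=12):
    """Annualized CAPM alpha and market beta of a long-short factor.

    Runs an OLS regression of factor returns on market excess returns;
    alpha is the intercept and beta the slope.
    """
    if len(factor_returns) != len(market_excess_returns):
        raise ValueError("return series must have equal length")
    f_bar = mean(factor_returns)
    m_bar = mean(market_excess_returns)
    cov = sum((m - m_bar) * (f - f_bar)
              for m, f in zip(market_excess_returns, factor_returns))
    var = sum((m - m_bar) ** 2 for m in market_excess_returns)
    beta = cov / var
    alpha = f_bar - beta * m_bar  # per-period intercept
    return alpha * periods_per_year, beta


# Hypothetical data: the factor earns 0.2% a month plus half the market move.
market = [0.01, -0.02, 0.03, 0.00]
factor = [0.002 + 0.5 * m for m in market]
annual_alpha, beta = capm_alpha(factor, market)
print(annual_alpha, beta)
```

A positive alpha means the factor earned more than its market exposure alone would explain, which is how the low-risk, momentum and value results above should be read.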

Overall, there was no material out-of-sample decay in factor premiums, as they were broadly similar in both the pre-1926 and post-1926 eras. The authors also confirmed that these results were generally robust over time, to different testing choices, and across industries and exchanges. Moreover, factor spanning tests revealed that low-risk, momentum, short-term reversal and value are non-redundant asset pricing factors, while size is subsumed by other factors. This indicates that low-risk, momentum and value are durable asset pricing factors.

Machine learning techniques offer valuable insight on stock returns

The authors also conducted an out-of-sample test of machine learning (ML) methods that had previously been successfully applied in the asset pricing literature. For example, some researchers⁴ have argued that cross-sectional regressions and portfolio sorts can miss important dynamics and interactions between variables, such as return volatility and price momentum. These researchers found that ML models (random forests and neural networks that allow for nonlinear predictor interactions) could predict cross-sectional differences in stock returns over the period 1957 to 2016.

However, this sample period coincides with the CRSP era, and ML models ultimately also require out-of-sample testing in independent samples, just as traditional factor tests do. The authors therefore applied the most promising ML techniques (random forest and neural network models) to the new 61-year sample period. They found that the ML methods also worked in the pre-CRSP era, as both models delivered significant CAPM alphas. As such, the research indicates that ML tools offer valuable information for understanding the cross-section of stock returns.

In conclusion, this deep historical research underlines that factor premiums are not very dependent on specific market regimes, nor specific market structures. Instead, they are probably an ‘eternal’ feature of financial markets.

1 See: Harvey, C. R., July 2017, “Presidential address: the scientific outlook in financial economics”, Journal of Finance.
2 See: Baltussen, G., Van Vliet, B. P., and Van Vliet, P., November 2021, “The cross-section of stock returns before 1926 (and beyond)”, working paper.
3 See: Fama, E. F., and MacBeth, J. D., June 1973, “Risk, return, and equilibrium: empirical tests”, Journal of Political Economy.
4 See: Gu, S., Kelly, B., and Xiu, D., February 2020, “Empirical asset pricing via machine learning”, The Review of Financial Studies.

Important information

The contents of this document have not been reviewed by any regulatory authority in Hong Kong. If you are in any doubt about any of the contents of this document, you should obtain independent professional advice. This document has been distributed by Robeco Hong Kong Limited (‘Robeco’). Robeco is regulated by the Securities and Futures Commission in Hong Kong.
This document has been prepared on a confidential basis solely for the recipient and is for information purposes only. Any reproduction or distribution of this documentation, in whole or in part, or the disclosure of its contents, without the prior written consent of Robeco, is prohibited. By accepting this documentation, the recipient agrees to the foregoing.
This document is intended to provide the reader with information on Robeco’s specific capabilities, but does not constitute a recommendation to buy or sell certain securities or investment products. Investment decisions should only be based on the relevant prospectus and on thorough financial, fiscal and legal advice.
The contents of this document are based upon sources of information believed to be reliable. This document is not intended for distribution to or use by any person or entity in any jurisdiction or country where such distribution or use would be contrary to local law or regulation.
Investment involves risks. Historical returns are provided for illustrative purposes only and do not necessarily reflect Robeco’s expectations for the future. The value of your investments may fluctuate. Past performance is no indication of current or future performance.


