21-03-2023 · Insight

Quant chart: how NLP can anticipate GICS changes

The recent changes to the Global Industry Classification Standard (GICS) illustrate its rigid and sluggish nature. This article argues that natural language processing (NLP) techniques can offer additional insights in today’s fast-changing market environment.

    Authors

  • Matthias Hanauer - Researcher

  • Rob Huisman - Researcher

The GICS is the classic framework to classify similar firms into sectors, industry groups, industries and sub-industries. But the GICS methodology is rigid. Revisions are infrequent and take years to implement, as they involve extensive consultations with market participants. As a result, alternative methods of classification have been suggested based on customer-supplier data, textual similarities in companies’ 10-K business descriptions, comparable technologies based on patent data or shared analyst coverage.

One of the major changes in the recent GICS revision is the creation of the new sub-industry transaction and payment processing services under the financials sector. This new sub-industry includes companies such as Visa, Mastercard and PayPal, which were previously included in the data processing & outsourced services sub-industry, under the software & services industry group and the information technology sector.

The change reflects both the increasing role these companies play in facilitating payments across various platforms and markets, and the fact that these activities are closely aligned with the business activities covered under the financial services industry group. However, this change only took effect on 17 March 2023, two years after the first consultation on the subject started (see footnote 1).

Text-based stock clustering (TBSC) is an interesting alternative to GICS. It uses NLP techniques to analyze textual data from various sources, such as 10-K reports. TBSC has several advantages over GICS:

  • TBSC can be more adaptive and flexible because it can update its classifications more frequently based on new information.

  • TBSC can be more granular and accurate because it can capture the similarities and differences among companies within or across sectors based on their specific products or services.

  • TBSC can be more informative and insightful because it provides explanations for its classifications based on textual evidence.
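The idea behind text-based similarity can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses simple TF-IDF word weights and cosine similarity rather than a transformer model, and the three company names and business descriptions are invented for this sketch (they are not real 10-K text). The helper names `tfidf_vectors`, `cosine` and `most_similar` are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical, invented business descriptions (NOT real 10-K text).
DESCRIPTIONS = {
    "PayCo": "processes card payments and settles transactions between merchants and banks",
    "BankCo": "accepts deposits and provides loans and settles payments for merchants and consumers",
    "SoftCo": "develops cloud software and sells software subscriptions to enterprises",
}


def tfidf_vectors(docs):
    """Return {name: {term: tf-idf weight}} for a dict of raw text documents."""
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for name, words in tokens.items():
        counts = Counter(words)
        vectors[name] = {
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(words))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)


def most_similar(name, vectors):
    """Return the other company whose text vector is closest to `name`."""
    others = [other for other in vectors if other != name]
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))


vectors = tfidf_vectors(DESCRIPTIONS)
for company in vectors:
    print(company, "->", most_similar(company, vectors))
```

In this toy setup the payments firm is matched to the bank rather than the software firm, because the shared vocabulary (payments, settles, merchants) outweighs everything it shares with the software description. Such overlapping terms also serve as the textual evidence behind the grouping.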

As technology advances, so do the opportunities for quantitative investors. By incorporating more data and leveraging advanced modelling techniques, we can develop deeper insights and enhance decision-making.

To illustrate these advantages, Figure 1 shows a 2D projection of company-specific vector embeddings derived from 10-K filings using the bidirectional encoder representations from transformers (BERT) model. We use 10-K reports for the fiscal year 2021 as input for the model to test whether the NLP technique could already anticipate the current GICS revisions.
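A 2D projection of this kind reduces each high-dimensional embedding to two coordinates that can be plotted. The article does not state which projection method was used, so the sketch below assumes principal component analysis (PCA) purely for illustration, implemented with power iteration in plain Python. The function names and the tiny three-dimensional input are hypothetical; real BERT embeddings typically have hundreds of dimensions.

```python
import math


def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_component(matrix, iterations=500):
    """Approximate the leading eigenpair of a symmetric matrix by power iteration."""
    d = len(matrix)
    # Deterministic, non-symmetric start vector (an arbitrary choice).
    vec = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    mv = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(vec[i] * mv[i] for i in range(d))
    return eigenvalue, vec


def project_2d(rows):
    """Project rows onto their first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(2):
        value, vec = top_component(cov)
        components.append(vec)
        # Deflation: remove the component just found before searching again.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)] for i in range(d)]
    return [[sum(row[j] * comp[j] for j in range(d)) for comp in components] for row in centered]


# Hypothetical embeddings, one row per company.
embeddings = [
    [2.0, 0.1, 0.0], [-2.0, -0.1, 0.0],
    [0.1, 1.0, 0.0], [-0.1, -1.0, 0.0],
    [0.0, 0.0, 0.1], [0.0, 0.0, -0.1],
]
for point in project_2d(embeddings):
    print([round(x, 3) for x in point])
```

Non-linear methods such as t-SNE or UMAP are common alternatives for visualizing embeddings. PCA is chosen here only because it is short enough to write out in full.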

The results show that the transaction and payment processing services companies, such as Visa, Mastercard and PayPal (light blue), are indeed closer to their new industry group financial services (green) than to their previous industry group software and services (brown). This finding suggests that TBSC can anticipate changes in GICS before they are officially implemented. However, we also find that the financial services industry group is rather heterogeneous compared to other industry groups such as banks, insurance, or semiconductors & semiconductor equipment.
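Both observations, proximity to a group and heterogeneity within a group, can be expressed as simple computations on the embeddings. The sketch below is one possible way to do this, not necessarily the method behind Figure 1: it measures closeness as cosine similarity to a group centroid and heterogeneity as the mean cosine distance of members to their centroid. All vectors are invented three-dimensional placeholders, and the names `nearest_group` and `dispersion` are hypothetical.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
# They do not reproduce the article's data or results.
GROUPS = {
    "financial_services": [[0.9, 0.2, 0.1], [0.6, 0.7, 0.0], [0.8, -0.3, 0.4]],
    "software_services": [[0.1, 0.9, 0.8], [0.0, 1.0, 0.7], [0.2, 0.8, 0.9]],
}
PAYMENT_FIRM = [0.7, 0.4, 0.2]  # placeholder for a payments company


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]


def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_group(vector, groups):
    """Name of the group whose centroid is most similar to `vector`."""
    return max(groups, key=lambda g: cosine_similarity(vector, centroid(groups[g])))


def dispersion(vectors):
    """Mean cosine distance of members to their own centroid (higher = more heterogeneous)."""
    c = centroid(vectors)
    return sum(1.0 - cosine_similarity(v, c) for v in vectors) / len(vectors)


print("nearest group:", nearest_group(PAYMENT_FIRM, GROUPS))
for name, members in GROUPS.items():
    print(name, "dispersion:", round(dispersion(members), 3))
```

With these placeholder numbers the payments firm lands nearest the financial services centroid, and that group shows the larger dispersion. This only demonstrates the mechanics of the comparison; it is not evidence for the finding itself.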

Figure 1 | 2D projection of word embeddings based on 10-K filings for the fiscal year 2021.

Source: SEC, Refinitiv, Robeco. The figure shows a 2D projection of numerical embeddings derived from BERT based on firms’ 10-K filings for the fiscal year 2021. The analysis is restricted to MSCI USA Index constituents augmented with large and liquid constituents of the FTSE World Developed and S&P Broad Market Index. The different colors indicate different GICS industry groups within the Information Technology (Software & Services, Technology Hardware & Equipment, and Semiconductors & Semiconductor Equipment) and Financials (Banks, Financial Services, and Insurance) sectors. Furthermore, the stocks from the newly created Transaction and Payment Processing Services sub-industry under the Financial Services industry group are highlighted. Previously, these stocks were included in the Software & Services industry group.

In conclusion, TBSC might be a better and more timely alternative to standard sector or industry classifications, such as GICS. By using NLP techniques to analyze textual data from various sources, TBSC can provide more adaptive, granular, accurate, informative and insightful classifications for stock analysis.

Footnote

1 The consultation on potential changes started in 2021; the changes were announced in March 2022 but only became effective in March 2023.
