Statistical Clustering Techniques in Stock and Bond Selection

Financial data is large, multi-dimensional and often unstructured. To reduce its dimensionality and make it easier to analyze, researchers have traditionally relied on their knowledge and experience to categorize it into useful groups, such as industries.

In recent years, however, advances in statistical and machine learning techniques, increased computational power, and the availability of new data sources have opened up new possibilities for data analysis. Since forming categorical groupings and discovering commonalities are of central importance to stock and bond selection, exploring clustering techniques based on unsupervised learning has become one of the most promising areas of quantitative financial research.

The Robeco Quantitative Research (QR) Department has formulated a number of research questions to be investigated using clustering techniques, with the aim of uncovering non-trivial structure in the financial markets. We are looking for exceptional candidates to join us in exploring them. Breakthroughs in answering these questions can improve the investment process across the board and add a tremendous amount of value.

Financial knowledge is not a prerequisite, but creativity, programming skills and a strong understanding of statistical learning methods, as well as a willingness to work with large and messy data, are essential to completing the project successfully.

Are you interested?
Let us know your motivation and send it together with your top-3 favorite internship topics, your CV and list of grades to

[1] The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Hastie, T. and Tibshirani, R. and Friedman, J.H., Springer 2009

[2] Statistical Industry Classification, Kakushadze Z. and Yu W., arXiv 2016

[3] Creating Diversified Portfolios Using Cluster Analysis, Marvin K. and Bhatt S., 2015