
Machine Learning Fund Categorizations

Authors

  • Dhagash Mehta
  • Dhruv Desai
  • Jithin Pradeep

Abstract

Given the surge in popularity of mutual funds (including exchange-traded funds (ETFs)) as a diversified financial investment, a vast variety of mutual funds from various investment management firms, following various diversification strategies, has become available in the market. Identifying similar mutual funds across such a wide landscape has become more important than ever because of its many applications, ranging from sales and marketing to portfolio replication, portfolio diversification and tax-loss harvesting. The current best method is the categorization provided by data vendors, which usually relies on curation by human experts with the help of available data. In this work, we establish that an industry-wide, well-regarded categorization system is learnable using machine learning and largely reproducible, which in turn yields a truly data-driven categorization. We discuss the intellectual challenges in learning this man-made system, our results and their implications.
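
To make the idea of a "learnable" categorization concrete, here is a minimal sketch of the general setup: treat the vendor-assigned category as a label, fit a classifier on fund features, and measure how often it reproduces the label on held-out funds. Everything in it is an assumption made for illustration. The two features (equity and bond weight), the three category names, the synthetic data and the nearest-centroid classifier are hypothetical stand-ins, not the data, features or model used in the paper.

```python
import random
from collections import defaultdict


def make_synthetic_funds(n_per_category, seed=0):
    """Generate hypothetical (features, category) pairs.

    Features are made-up stand-ins: (equity weight, bond weight).
    """
    rng = random.Random(seed)
    centers = {
        "equity": (0.9, 0.1),
        "bond": (0.1, 0.9),
        "balanced": (0.5, 0.5),
    }
    funds = []
    for category, (eq, bd) in centers.items():
        for _ in range(n_per_category):
            features = (eq + rng.gauss(0, 0.05), bd + rng.gauss(0, 0.05))
            funds.append((features, category))
    rng.shuffle(funds)
    return funds


def fit_centroids(train):
    """Compute the mean feature vector of each labelled category."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), category in train:
        sums[category][0] += x
        sums[category][1] += y
        counts[category] += 1
    return {c: (s[0] / counts[c], s[1] / counts[c]) for c, s in sums.items()}


def predict(centroids, features):
    """Assign the category whose centroid is closest in squared distance."""
    x, y = features
    return min(
        centroids,
        key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
    )


funds = make_synthetic_funds(100)
split = int(0.8 * len(funds))
train, test = funds[:split], funds[split:]

centroids = fit_centroids(train)
accuracy = sum(predict(centroids, f) == c for f, c in test) / len(test)
print(f"held-out agreement with the labels: {accuracy:.2%}")
```

On real data the held-out agreement would play the role of the "reproducibility" of the categorization, and the disagreements would be the funds worth a closer look.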

Suggested Citation

  • Dhagash Mehta & Dhruv Desai & Jithin Pradeep, 2020. "Machine Learning Fund Categorizations," Papers 2006.00123, arXiv.org.
  • Handle: RePEc:arx:papers:2006.00123

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2006.00123
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. McDonald, John G., 1974. "Objectives and Performance of Mutual Funds, 1960–1969," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 9(3), pages 311-333, June.
    2. Miceli, M.A. & Susinno, G., 2004. "Ultrametricity in fund of funds diversification," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 344(1), pages 95-99.
    3. Athanasios Orphanides, "undated". "Compensation Incentives and Risk Taking Behavior: Evidence from Mutual Funds," Finance and Economics Discussion Series 1996-21, Board of Governors of the Federal Reserve System (U.S.), revised 10 Dec 2019.
    4. Fan Cai & Nhien-An Le-Khac & Tahar Kechadi, 2016. "Clustering Approaches for Financial Data Analysis: a Survey," Papers 1609.08520, arXiv.org.
    5. Pattarin, Francesco & Paterlini, Sandra & Minerva, Tommaso, 2004. "Clustering financial time series: an application to mutual funds style analysis," Computational Statistics & Data Analysis, Elsevier, vol. 47(2), pages 353-372, September.
    6. Rajna Gibson & Sébastien Gyger, 2007. "The Style Consistency of Hedge Funds," European Financial Management, European Financial Management Association, vol. 13(2), pages 287-308, March.
    7. Moreno, David & Marco, Paulina & Olmeda, Ignacio, 2006. "Self-organizing maps could improve the classification of Spanish mutual funds," European Journal of Operational Research, Elsevier, vol. 174(2), pages 1039-1054, October.
    8. Edwin J. Elton & Martin J. Gruber & Christopher R. Blake, 2003. "Incentive Fees and Mutual Funds," Journal of Finance, American Finance Association, vol. 58(2), pages 779-804, April.
    9. Corduas, Marcella & Piccolo, Domenico, 2008. "Time series clustering and classification by the autoregressive metric," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 1860-1872, January.
    10. Nandita Das, 2003. "Hedge Fund Classification using K-means Clustering Method," Computing in Economics and Finance 2003 284, Society for Computational Economics.
    11. repec:onb:oenbwp:y:2005:i:9:b:1 is not listed on IDEAS
    12. Brown, Stephen J. & Goetzmann, William N., 1997. "Mutual fund styles," Journal of Financial Economics, Elsevier, vol. 43(3), pages 373-399, March.
    13. Kim, Moon & Shukla, Ravi & Tomas, Michael, 2000. "Mutual fund objective misclassification," Journal of Economics and Business, Elsevier, vol. 52(4), pages 309-323.
    14. Ramin Baghai-Wadj & Rami El-Berry & Stefan Klocker & Markus Schwaiger, 2005. "The Consistency of Self-Declared Hedge Fund Styles — A Return-Based Analysis with Self-Organizing Maps," Financial Stability Report, Oesterreichische Nationalbank (Austrian Central Bank), issue 9, pages 64-76.

    Citations



    Cited by:

    1. Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can Machine Learning Help to Select Portfolios of Mutual Funds?," Working Papers 1245, Barcelona School of Economics.
    2. Vipul Satone & Dhruv Desai & Dhagash Mehta, 2021. "Fund2Vec: Mutual Funds Similarity using Graph Learning," Papers 2106.12987, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Vipul Satone & Dhruv Desai & Dhagash Mehta, 2021. "Fund2Vec: Mutual Funds Similarity using Graph Learning," Papers 2106.12987, arXiv.org.
    2. Dhruv Desai & Ashmita Dhiman & Tushar Sharma & Deepika Sharma & Dhagash Mehta & Stefano Pasquali, 2023. "Quantifying Outlierness of Funds from their Categories using Supervised Similarity," Papers 2308.06882, arXiv.org.
    3. Jerinsh Jeyapaulraj & Dhruv Desai & Peter Chu & Dhagash Mehta & Stefano Pasquali & Philip Sommer, 2022. "Supervised similarity learning for corporate bonds using Random Forest proximities," Papers 2207.04368, arXiv.org, revised Oct 2022.
    4. Liu, Shen & Maharaj, Elizabeth Ann & Inder, Brett, 2014. "Polarization of forecast densities: A new approach to time series classification," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 345-361.
    5. Sensoy, Berk A., 2009. "Performance evaluation and self-designated benchmark indexes in the mutual fund industry," Journal of Financial Economics, Elsevier, vol. 92(1), pages 25-39, April.
    6. Anjum, Sohail & Qayyum, Unbreen & Qureshi, Madeeha Gohar, 2019. "Aggregate performance evaluation of US Equity Mutual Funds - Explaining the performance of Growth Funds vs. Value Funds," MPRA Paper 100043, University Library of Munich, Germany.
    7. repec:onb:oenbwp:y:2005:i:9:b:1 is not listed on IDEAS
    8. E. Otranto, 2011. "Classification of Volatility in Presence of Changes in Model Parameters," Working Paper CRENoS 201113, Centre for North South Economic Research, University of Cagliari and Sassari, Sardinia.
    9. Prather, Larry J. & Middleton, Karen L., 2002. "Are N+1 heads better than one?: The case of mutual fund managers," Journal of Economic Behavior & Organization, Elsevier, vol. 47(1), pages 103-120, January.
    10. Luca De Angelis, 2013. "Latent class models for financial data analysis: some statistical developments," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 22(2), pages 227-242, June.
    11. Giovanni De Luca & Paola Zuccolotto, 2011. "A tail dependence-based dissimilarity measure for financial time series clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 5(4), pages 323-340, December.
    12. De Luca Giovanni & Zuccolotto Paola, 2017. "A double clustering algorithm for financial time series based on extreme events," Statistics & Risk Modeling, De Gruyter, vol. 34(1-2), pages 1-12, June.
    13. Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can Machine Learning Help to Select Portfolios of Mutual Funds?," Working Papers 1245, Barcelona School of Economics.
    14. Alexander Kempf & Peer Osthoff, 2008. "SRI Funds: Nomen est Omen," Journal of Business Finance & Accounting, Wiley Blackwell, vol. 35(9-10), pages 1276-1294.
    15. Liu, Shen & Maharaj, Elizabeth Ann, 2013. "A hypothesis test using bias-adjusted AR estimators for classifying time series in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 60(C), pages 32-49.
    16. Otranto, Edoardo, 2008. "Clustering heteroskedastic time series by model-based procedures," Computational Statistics & Data Analysis, Elsevier, vol. 52(10), pages 4685-4698, June.
    17. Irina Bezhentseva Mateus & Cesario Mateus & Natasa Todorovic, 2019. "Benchmark-adjusted performance of US equity mutual funds and the issue of prospectus benchmarks," Journal of Asset Management, Palgrave Macmillan, vol. 20(1), pages 15-30, February.
    18. Moreno, David & Marco, Paulina & Olmeda, Ignacio, 2006. "Self-organizing maps could improve the classification of Spanish mutual funds," European Journal of Operational Research, Elsevier, vol. 174(2), pages 1039-1054, October.
    19. Peng, Hongfeng & Zhang, Zhenqi & Goodell, John W. & Li, Mingsheng, 2023. "Socially responsible investing: Is it for real or just for show?," International Review of Financial Analysis, Elsevier, vol. 86(C).
    20. Fabrizio Durante & Roberta Pappadà & Nicola Torelli, 2014. "Clustering of financial time series in risky scenarios," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(4), pages 359-376, December.
