Company classification using machine learning

Authors

  • Sven Husmann
  • Antoniya Shivarova
  • Rick Steinert

Abstract

The recent advancements in computational power and machine learning algorithms have led to vast improvements in manifold areas of research. Especially in finance, the application of machine learning enables both researchers and practitioners to gain new insights into financial data and well-studied areas such as company classification. In our paper, we demonstrate that unsupervised machine learning algorithms can be used to visualize and classify company data in an economically meaningful and effective way. In particular, we implement the data-driven dimension reduction and visualization tool t-distributed stochastic neighbor embedding (t-SNE) in combination with spectral clustering. The resulting company groups can then be utilized by experts in the field for empirical analysis and optimal decision making. By providing an exemplary out-of-sample study within a portfolio optimization framework, we show that the application of t-SNE and spectral clustering improves the overall portfolio performance. Therefore, we introduce our approach to the financial community as a valuable technique in the context of data analysis and company classification.
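The abstract pairs t-SNE with spectral clustering. As a rough illustration of the clustering step only, the sketch below splits a handful of two-dimensional points into two groups using the second eigenvector of a normalized Gaussian affinity matrix. It is not the authors' implementation: the function name `spectral_bipartition`, the kernel width `sigma`, the power-iteration solver, and the hand-made points (standing in for a t-SNE embedding of company data) are all assumptions made for this example.

```python
import math
import random


def spectral_bipartition(points, sigma=1.0, iters=300, seed=0):
    """Two-cluster spectral clustering on a list of coordinate tuples.

    Hypothetical helper for illustration; assumes every point has at
    least one reasonably close neighbour so that all degrees are > 0.
    """
    n = len(points)
    # Gaussian (RBF) affinity between every pair of points, zero diagonal.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    deg = [sum(row) for row in w]
    # Symmetrically normalized affinity M = D^{-1/2} W D^{-1/2}.
    m = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of M (eigenvalue 1) is proportional to sqrt(deg).
    norm = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm for d in deg]
    # Power iteration on (M + I) / 2, projecting out v1 at every step,
    # converges to the eigenvector with the second-largest eigenvalue of M.
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        v = [0.5 * (sum(m[i][j] * v[j] for j in range(n)) + v[i])
             for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # The sign pattern of this eigenvector defines the two groups.
    return [1 if x >= 0 else 0 for x in v]


# Toy stand-in for a 2-D embedding: two tight, well-separated groups.
embedding = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.2),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.3), (5.1, 5.1),
]
labels = spectral_bipartition(embedding)
print(labels)
```

Which group receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. For more than two clusters, or for real data, an established library implementation of t-SNE and spectral clustering would be the more sensible choice.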

Suggested Citation

  • Sven Husmann & Antoniya Shivarova & Rick Steinert, 2020. "Company classification using machine learning," Papers 2004.01496, arXiv.org, revised May 2020.
  • Handle: RePEc:arx:papers:2004.01496

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2004.01496
    File Function: Latest version
    Download Restriction: no

    Citations

    Cited by:

    1. Simerjot Kaur & Andrea Stefanucci & Sameena Shah, 2023. "InProC: Industry and Product/Service Code Classification," Papers 2305.13532, arXiv.org.
