Empirical Asset Pricing via Machine Learning

Author

Listed:
  • Shihao Gu
  • Bryan Kelly
  • Dacheng Xiu

Abstract

We perform a comparative analysis of machine learning methods for the canonical problem of empirical asset pricing: measuring asset risk premiums. We demonstrate large economic gains to investors using machine learning forecasts, in some cases doubling the performance of leading regression-based strategies from the literature. We identify the best-performing methods (trees and neural networks) and trace their predictive gains to allowing nonlinear predictor interactions missed by other methods. All methods agree on the same set of dominant predictive signals, a set that includes variations on momentum, liquidity, and volatility. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.

Suggested Citation

  • Shihao Gu & Bryan Kelly & Dacheng Xiu, 2020. "Empirical Asset Pricing via Machine Learning," Review of Financial Studies, Society for Financial Studies, vol. 33(5), pages 2223-2273.
  • Handle: RePEc:oup:rfinst:v:33:y:2020:i:5:p:2223-2273.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/rfs/hhaa009
    Download Restriction: Access to full text is restricted to subscribers.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    2. Xi Dong & Yan Li & David E. Rapach & Guofu Zhou, 2022. "Anomalies and the Expected Market Return," Journal of Finance, American Finance Association, vol. 77(1), pages 639-681, February.
    3. Wolfgang Drobetz & Tizian Otto, 2021. "Empirical asset pricing via machine learning: evidence from the European stock market," Journal of Asset Management, Palgrave Macmillan, vol. 22(7), pages 507-538, December.
    4. Oleg Rytchkov & Xun Zhong, 2020. "Information Aggregation and P-Hacking," Management Science, INFORMS, vol. 66(4), pages 1605-1626, April.
    5. Alois Weigand, 2019. "Machine learning in empirical asset pricing," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 33(1), pages 93-104, March.
    6. Thomas Conlon & John Cotter & Iason Kynigakis, 2021. "Machine Learning and Factor-Based Portfolio Optimization," Papers 2107.13866, arXiv.org.
    7. Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can Machine Learning Help to Select Portfolios of Mutual Funds?," Working Papers 1245, Barcelona School of Economics.
    8. Koo, Bonsoo & Anderson, Heather M. & Seo, Myung Hwan & Yao, Wenying, 2020. "High-dimensional predictive regression in the presence of cointegration," Journal of Econometrics, Elsevier, vol. 219(2), pages 456-477.
    9. Vigo Pereira, Caio, 2021. "Portfolio efficiency with high-dimensional data as conditioning information," International Review of Financial Analysis, Elsevier, vol. 77(C).
    10. Buncic, Daniel & Tischhauser, Martin, 2017. "Macroeconomic factors and equity premium predictability," International Review of Economics & Finance, Elsevier, vol. 51(C), pages 621-644.
    11. Huang, Dashan & Li, Jiangyuan & Wang, Liyao, 2021. "Are disagreements agreeable? Evidence from information aggregation," Journal of Financial Economics, Elsevier, vol. 141(1), pages 83-101.
    12. Kelly, Bryan T. & Pruitt, Seth & Su, Yinan, 2019. "Characteristics are covariances: A unified model of risk and return," Journal of Financial Economics, Elsevier, vol. 134(3), pages 501-524.
    13. Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," CERGE-EI Working Papers wp677, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    14. Kozak, Serhiy & Nagel, Stefan & Santosh, Shrihari, 2020. "Shrinking the cross-section," Journal of Financial Economics, Elsevier, vol. 135(2), pages 271-292.
    15. Clarke, Charles, 2022. "The level, slope, and curve factor model for stocks," Journal of Financial Economics, Elsevier, vol. 143(1), pages 159-187.
    16. Barbara Rossi, 2019. "Forecasting in the presence of instabilities: How do we know whether models predict well and how to improve them," Economics Working Papers 1711, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2021.
    17. Xing, Li-Min & Zhang, Yue-Jun, 2022. "Forecasting crude oil prices with shrinkage methods: Can nonconvex penalty and Huber loss help?," Energy Economics, Elsevier, vol. 110(C).
    18. Dichtl, Hubert & Drobetz, Wolfgang & Neuhierl, Andreas & Wendt, Viktoria-Sophie, 2021. "Data snooping in equity premium prediction," International Journal of Forecasting, Elsevier, vol. 37(1), pages 72-94.
    19. Daniel Borup & Bent Jesper Christensen & Nicolaj Nørgaard Muhlbach & Mikkel Slot Nielsen, 2020. "Targeting predictors in random forest regression," Papers 2004.01411, arXiv.org, revised Nov 2020.
    20. Fan, Jianqing & Xue, Lingzhou & Yao, Jiawei, 2017. "Sufficient forecasting using factor models," Journal of Econometrics, Elsevier, vol. 201(2), pages 292-306.

    More about this item

    JEL classification:

    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C58 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Financial Econometrics
    • G0 - Financial Economics - - General
    • G1 - Financial Economics - - General Financial Markets
    • G17 - Financial Economics - - General Financial Markets - - - Financial Forecasting and Simulation
