Printed from https://ideas.repec.org/a/eee/finlet/v71y2025ics1544612324014351.html

Feature importance in linear models with ensemble machine learning: A study of the Fama and French five-factor model

Author

Listed:
  • Kwon, Tae Yeon

Abstract

This study explores key considerations for interpreting feature influence and importance in Machine Learning (ML) for financial models that commonly assume linearity. Simulations demonstrate that ML techniques, including Random Forest, XGBoost, and CatBoost, may produce misleading feature importance ranks when the underlying model is linear. We empirically examine the Fama–French five-factor model using U.S. monthly data from July 1964 to June 2024. While the most important factors are consistently identified, the ranks of moderately important factors vary depending on the estimation method. These results highlight the need for a critical application of ML in financial modeling when the purpose is interpretability.
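
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the author's simulation design: the function names (`simulate_linear`, `rank_desc`, `compare_importance`), the coefficient values, the noise level, and the choice of independent standard-normal features are all assumptions made for illustration. The sample size of 720 matches the number of months from July 1964 to June 2024. The sketch fits ordinary least squares and a scikit-learn Random Forest to data from a known linear model and prints the resulting importance ranks side by side.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def simulate_linear(n=720, betas=(1.0, 0.5, 0.4, 0.3, 0.0), noise_sd=0.5, seed=0):
    """Draw data from y = X @ betas + noise (illustrative values, not from the paper)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, len(betas)))
    y = X @ np.asarray(betas) + noise_sd * rng.standard_normal(n)
    return X, y


def rank_desc(scores):
    """Return ranks where 1 marks the largest score."""
    order = np.argsort(-np.asarray(scores))
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks


def compare_importance(X, y, seed=0):
    """Return (standardized |OLS coefficient|, Random Forest impurity importance)."""
    ols = LinearRegression().fit(X, y)
    # Standardized coefficients make features with different scales comparable.
    ols_score = np.abs(ols.coef_) * X.std(axis=0) / y.std()
    rf = RandomForestRegressor(n_estimators=200, random_state=seed).fit(X, y)
    return ols_score, rf.feature_importances_


X, y = simulate_linear()
ols_score, rf_score = compare_importance(X, y)
print("OLS ranks:          ", rank_desc(ols_score))
print("Random Forest ranks:", rank_desc(rf_score))
```

Varying `betas`, `noise_sd`, or `seed` shows how stable each ranking is. Whether the two rankings disagree depends on those settings, so this sketch does not by itself reproduce the paper's findings.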

Suggested Citation

  • Kwon, Tae Yeon, 2025. "Feature importance in linear models with ensemble machine learning: A study of the Fama and French five-factor model," Finance Research Letters, Elsevier, vol. 71(C).
  • Handle: RePEc:eee:finlet:v:71:y:2025:i:c:s1544612324014351
    DOI: 10.1016/j.frl.2024.106406

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1544612324014351
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bryzgalova, Svetlana & Huang, Jiantao & Julliard, Christian, 2023. "Bayesian solutions for the factor zoo: we just ran two quadrillion models," LSE Research Online Documents on Economics 126151, London School of Economics and Political Science, LSE Library.
    2. Thomas Conlon & John Cotter & Iason Kynigakis, 2021. "Machine Learning and Factor-Based Portfolio Optimization," Papers 2107.13866, arXiv.org.
    3. Söhnke M. Bartram & Harald Lohre & Peter F. Pope & Ananthalakshmi Ranganathan, 2021. "Navigating the factor zoo around the world: an institutional investor perspective," Journal of Business Economics, Springer, vol. 91(5), pages 655-703, July.
    4. Wolfgang Drobetz & Tizian Otto, 2021. "Empirical asset pricing via machine learning: evidence from the European stock market," Journal of Asset Management, Palgrave Macmillan, vol. 22(7), pages 507-538, December.
    5. Cong Wang, 2024. "Stock return prediction with multiple measures using neural network models," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 10(1), pages 1-34, December.
    6. Sun, Chuanping, 2024. "Factor correlation and the cross section of asset returns: A correlation-robust machine learning approach," Journal of Empirical Finance, Elsevier, vol. 77(C).
    7. Ma, Tian & Leong, Wen Jun & Jiang, Fuwei, 2023. "A latent factor model for the Chinese stock market," International Review of Financial Analysis, Elsevier, vol. 87(C).
    8. De Nard, Gianluca & Zhao, Zhao, 2023. "Using, taming or avoiding the factor zoo? A double-shrinkage estimator for covariance matrices," Journal of Empirical Finance, Elsevier, vol. 72(C), pages 23-35.
    9. Fieberg, Christian & Liedtke, Gerrit & Zaremba, Adam & Cakici, Nusret, 2025. "A factor model for the cross-section of country equity risk premia," Journal of Banking & Finance, Elsevier, vol. 171(C).
    10. Vafai, Nima & Rakowski, David, 2024. "The sources of portfolio volatility and mutual fund performance," International Review of Financial Analysis, Elsevier, vol. 91(C).
    11. Anatolyev, Stanislav & Mikusheva, Anna, 2022. "Factor models with many assets: Strong factors, weak factors, and the two-pass procedure," Journal of Econometrics, Elsevier, vol. 229(1), pages 103-126.
    12. Assoe, Kodjovi & Attig, Najah & Sy, Oumar, 2024. "The battle of factors," Global Finance Journal, Elsevier, vol. 62(C).
    13. Fathi, Masoumeh & Grobys, Klaus & Äijö, Janne, 2025. "A common component of Fama and French factor variances," The North American Journal of Economics and Finance, Elsevier, vol. 75(PA).
    14. Ray Ball & Gil Sadka & Ayung Tseng, 2022. "Using accounting earnings and aggregate economic indicators to estimate firm-level systematic risk," Review of Accounting Studies, Springer, vol. 27(2), pages 607-646, June.
    15. Bandi, Federico M. & Chaudhuri, Shomesh E. & Lo, Andrew W. & Tamoni, Andrea, 2021. "Spectral factor models," Journal of Financial Economics, Elsevier, vol. 142(1), pages 214-238.
    16. Pedro M. Mirete-Ferrer & Alberto Garcia-Garcia & Juan Samuel Baixauli-Soler & Maria A. Prats, 2022. "A Review on Machine Learning for Asset Management," Risks, MDPI, vol. 10(4), pages 1-46, April.
    17. Baba-Yara, Fahiz & Boons, Martijn & Tamoni, Andrea, 2024. "Persistent and transitory components of firm characteristics: Implications for asset pricing," Journal of Financial Economics, Elsevier, vol. 154(C).
    18. Svetlana Bryzgalova & Jiantao Huang & Christian Julliard, 2023. "Bayesian Solutions for the Factor Zoo: We Just Ran Two Quadrillion Models," Journal of Finance, American Finance Association, vol. 78(1), pages 487-557, February.
    19. Molero-González, L. & Trinidad-Segovia, J.E. & Sánchez-Granero, M.A. & García-Medina, A., 2023. "Market Beta is not dead: An approach from Random Matrix Theory," Finance Research Letters, Elsevier, vol. 55(PA).
    20. Wang, Jinzhe & Zhu, Yifeng, 2024. "A comparison of factor models in China," Journal of Empirical Finance, Elsevier, vol. 79(C).
