Printed from https://ideas.repec.org/a/spr/fininn/v11y2025i1d10.1186_s40854-025-00779-8.html

Stock return forecasting based on the proxy variables of category factors

Authors

Listed:
  • Yuan Zhao

    (Lanzhou University of Technology
    Greater Bay Intelligent Finance and Risk Management Research Base)

  • Xue Gong

    (Nanjing University of Science and Technology
    Greater Bay Intelligent Finance and Risk Management Research Base)

  • Weiguo Zhang

    (Shenzhen University)

  • Weijun Xu

    (South China University of Technology
    Greater Bay Intelligent Finance and Risk Management Research Base)

Abstract

Stock return prediction has been in the spotlight because it involves numerous factors. Improving the accuracy of stock return prediction and quantifying the impact of individual factors on forecasting remain challenging tasks. Motivated by these challenges, we propose a novel forecasting method that entails proxy variables of category factors and the random forest technique. This new method aims to quantify the information and importance of category factors, thereby enhancing the predictability of stock returns. Specifically, we categorize a large set of return predictors into several category factors. We then utilize the importance of the original variables to construct proxy variables for these category factors. Subsequently, we use the proxy variables to build a random forest model for predicting stock returns. Our empirical analysis results demonstrate that the proposed method effectively quantifies the importance of both the original factors and category factors. Furthermore, we find that the fundamental information factor consistently ranks as the most crucial category factor for stock return forecasting. Additionally, the proposed method exhibits a more robust and prominent prediction performance than competing models such as single-category-factor-based random forest models, dimension-reduction, and forecast-combination methods. Most importantly, the proposed method produces forecast results that can assist investors with understanding stock market dynamics and facilitate higher investment returns.
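The proxy-construction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this sketch assumes that each category proxy is an importance-weighted average of the standardized predictors in that category, and that a category's importance is the sum of its members' importances. The predictor names, category labels, toy data, and importance scores below are all hypothetical. In practice the importance scores might come from a fitted random forest (for example, the `feature_importances_` attribute of scikit-learn's `RandomForestRegressor`), and the resulting proxies would then be used as inputs to a second random forest.

```python
from statistics import mean, pstdev


def standardize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def category_proxy(columns, importances):
    """Importance-weighted average of the standardized predictors in one category.

    columns: dict mapping predictor name -> list of observations (equal lengths)
    importances: dict mapping predictor name -> non-negative importance score
    NOTE: this weighting scheme is an assumption, not the paper's stated formula.
    """
    total = sum(importances[name] for name in columns)
    if total <= 0:
        raise ValueError("importances in a category must sum to a positive value")
    z = {name: standardize(col) for name, col in columns.items()}
    n_obs = len(next(iter(z.values())))
    return [
        sum(importances[name] / total * z[name][t] for name in z)
        for t in range(n_obs)
    ]


# Hypothetical toy predictors grouped into hypothetical category factors.
categories = {
    "fundamental": {
        "dividend_yield": [0.02, 0.03, 0.025, 0.04],
        "earnings_price": [0.05, 0.06, 0.055, 0.07],
    },
    "technical": {
        "momentum": [1.0, -1.0, 1.0, -1.0],
    },
}

# Hypothetical importance scores for the original predictors.
importances = {"dividend_yield": 0.5, "earnings_price": 0.3, "momentum": 0.2}

# One proxy series per category factor.
proxies = {
    cat: category_proxy(cols, importances) for cat, cols in categories.items()
}

# Assumed aggregation: a category's importance is the sum over its members.
category_importance = {
    cat: sum(importances[name] for name in cols)
    for cat, cols in categories.items()
}
```

Standardizing before weighting keeps predictors measured on different scales from dominating a proxy purely because of their units; other normalizations would fit the same outline.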

Suggested Citation

  • Yuan Zhao & Xue Gong & Weiguo Zhang & Weijun Xu, 2025. "Stock return forecasting based on the proxy variables of category factors," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 11(1), pages 1-48, December.
  • Handle: RePEc:spr:fininn:v:11:y:2025:i:1:d:10.1186_s40854-025-00779-8
    DOI: 10.1186/s40854-025-00779-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1186/s40854-025-00779-8
    File Function: Abstract
    Download Restriction: no



