
Stock market forecasting based on machine learning: The role of investor sentiment

Author

Listed:
  • Ren, Tingting
  • Li, Shaofang

Abstract

Stock market prediction remains a classical yet challenging problem, and investor sentiment has become an increasingly significant focus in the big data era. This analysis examines whether, and to what extent, the stock market is predictable when investor sentiment is taken into account. Using the original and customized LM financial lexicons, VADER, and the Word2vec, Doc2vec, and BERT embedding vector methods (along with two fine-tuned models, FinBERT and SentiBERT), we first construct nine investor sentiment indexes from Twitter text data posted between November 2019 and December 2021. We then employ three machine learning algorithms (SVR, AdaBoost, and RF) to predict the daily return of the S&P 500 index. The experimental results confirm that the investor sentiment indexes enhance prediction accuracy beyond market indicators alone, in line with prior research. Embedding vector methods outperform the fine-tuned models, and the customized dictionaries outperform their traditional counterparts. Furthermore, the composite sentiment index, which integrates all the single indexes, achieves the best overall performance. To further validate our findings, we conduct robustness checks on the DJIA index and across different economic cycles, and observe that the single sentiment indexes perform worse on shorter datasets, whereas the composite index delivers consistent improvement in both volatile and steady periods. These findings offer valuable insights for future research and have practical applications in stock market prediction.
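As a rough illustration of the first step described in the abstract, the sketch below builds a daily lexicon-based sentiment index from dated tweets and then combines several single indexes into a composite. The abstract does not give the scoring formula or the weighting behind the composite index, so the toy word lists, the (pos - neg) / (pos + neg) score, and the equal-weight average of z-scored indexes are all assumptions made for illustration, not the authors' method. The function names are hypothetical, and the prediction step with SVR, AdaBoost, and RF is not shown.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical toy lexicon (assumption); the paper uses the LM financial
# lexicon and a customized variant, neither of which is reproduced here.
POSITIVE = {"gain", "bullish", "strong", "beat"}
NEGATIVE = {"loss", "bearish", "weak", "miss"}


def tweet_score(text):
    """Return (pos - neg) / (pos + neg) for one tweet, or 0.0 if no lexicon word matches."""
    tokens = [t.strip(".,!?$#").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / (pos + neg) if pos + neg else 0.0


def daily_index(tweets):
    """Average the tweet scores per day; `tweets` is an iterable of (date, text) pairs."""
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(tweet_score(text))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


def zscore(values):
    """Standardize a series; a constant series maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def composite_index(indexes):
    """Equal-weight mean of z-scored single indexes aligned on the same days (assumed scheme)."""
    standardized = [zscore(series) for series in indexes]
    return [mean(day_values) for day_values in zip(*standardized)]


tweets = [
    ("2021-01-04", "Bullish on $SPX, strong earnings beat!"),
    ("2021-01-04", "Weak guidance, big loss."),
    ("2021-01-05", "Bearish, another miss."),
]
lexicon_idx = daily_index(tweets)  # {'2021-01-04': 0.0, '2021-01-05': -1.0}
print(lexicon_idx)
```

Standardizing each index before averaging keeps an index with a wide numeric range from dominating the composite; other combination schemes, such as principal components, would be equally plausible readings of the abstract.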

Suggested Citation

  • Ren, Tingting & Li, Shaofang, 2025. "Stock market forecasting based on machine learning: The role of investor sentiment," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 666(C).
  • Handle: RePEc:eee:phsmap:v:666:y:2025:i:c:s0378437125001852
    DOI: 10.1016/j.physa.2025.130533

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0378437125001852
    Download Restriction: Full text for ScienceDirect subscribers only. Journal offers the option of making the article available online on Science direct for a fee of $3,000

    File URL: https://libkey.io/10.1016/j.physa.2025.130533?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Hossein Jokar & Vahid Daneshi, 2020. "Investor sentiment, stock price, and audit quality," International Journal of Managerial and Financial Accounting, Inderscience Enterprises Ltd, vol. 12(1), pages 25-47.
    2. Gholampour, Vahid, 2019. "Daily expectations of returns index," Journal of Empirical Finance, Elsevier, vol. 54(C), pages 236-252.
    3. Weiguo Zhang & Xue Gong & Chao Wang & Xin Ye, 2021. "Predicting stock market volatility based on textual sentiment: A nonlinear analysis," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(8), pages 1479-1500, December.
    4. Sapkota, Niranjan, 2022. "News-based sentiment and bitcoin volatility," International Review of Financial Analysis, Elsevier, vol. 82(C).
    5. Malcolm Baker & Jeffrey Wurgler, 2006. "Investor Sentiment and the Cross‐Section of Stock Returns," Journal of Finance, American Finance Association, vol. 61(4), pages 1645-1680, August.
    6. Eric W. K. See-To & Yang Yang, 2017. "Market sentiment dispersion and its effects on stock return and volatility," Electronic Markets, Springer;IIM University of St. Gallen, vol. 27(3), pages 283-296, August.
    7. Song, Ziyu & Gong, Xiaomin & Zhang, Cheng & Yu, Changrui, 2023. "Investor sentiment based on scaled PCA method: A powerful predictor of realized volatility in the Chinese stock market," International Review of Economics & Finance, Elsevier, vol. 83(C), pages 528-545.
    8. Renault, Thomas, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Journal of Banking & Finance, Elsevier, vol. 84(C), pages 25-40.
    9. Deng, Shangkun & Huang, Xiaoru & Zhu, Yingke & Su, Zhihao & Fu, Zhe & Shimada, Tatsuro, 2023. "Stock index direction forecasting using an explainable eXtreme Gradient Boosting and investor sentiments," The North American Journal of Economics and Finance, Elsevier, vol. 64(C).
    10. Lucey, Brian & Ren, Boru, 2021. "Does news tone help forecast oil?," Economic Modelling, Elsevier, vol. 104(C).
    11. Thomas Renault, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-03205113, HAL.
    12. Li, Yelin & Bu, Hui & Li, Jiahong & Wu, Junjie, 2020. "The role of text-extracted investor sentiment in Chinese stock price prediction with the enhancement of deep learning," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1541-1562.
    13. Kumar, Satish & Rao, Sandeep & Goyal, Kirti & Goyal, Nisha, 2022. "Journal of Behavioral and Experimental Finance: A bibliometric overview," Journal of Behavioral and Experimental Finance, Elsevier, vol. 34(C).
    14. Liang, Chao & Tang, Linchun & Li, Yan & Wei, Yu, 2020. "Which sentiment index is more informative to forecast stock market volatility? Evidence from China," International Review of Financial Analysis, Elsevier, vol. 71(C).
    15. Gang Chu & John W. Goodell & Dehua Shen & Yongjie Zhang, 2022. "Machine learning to establish proxies for investor attention: evidence of improved stock-return prediction," Annals of Operations Research, Springer, vol. 318(1), pages 103-128, November.
    16. Wang, Hu & Li, Shouwei & Ma, Yuyin & Jiang, Shuyang, 2022. "Does investor sentiment affect fund crashes? Evidence from Chinese open-end funds," The North American Journal of Economics and Finance, Elsevier, vol. 60(C).
    17. Fama, Eugene F, 1970. "Efficient Capital Markets: A Review of Theory and Empirical Work," Journal of Finance, American Finance Association, vol. 25(2), pages 383-417, May.
    18. Obaid, Khaled & Pukthuanthong, Kuntara, 2022. "A picture is worth a thousand words: Measuring investor sentiment by combining machine learning and photos from news," Journal of Financial Economics, Elsevier, vol. 144(1), pages 273-297.
    19. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.
    20. Rui Jiang & Conghua Wen, 2022. "A Comparison between Parametric and Nonparametric Volatility Forecasting of Stock Index Futures in China," Emerging Markets Finance and Trade, Taylor & Francis Journals, vol. 58(9), pages 2522-2537, July.
    21. Wang, Gaoshan & Yu, Guangjin & Shen, Xiaohong, 2021. "The effect of online environmental news on green industry stocks: The mediating role of investor sentiment," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 573(C).
    22. Fu, Junhui & Wu, Xiang & Liu, Yufang & Chen, Rongda, 2021. "Firm-specific investor sentiment and stock price crash risk," Finance Research Letters, Elsevier, vol. 38(C).

    Citations

    Cited by:

    1. Zhao, Xia & Hu, Qing & Song, Yuping & Huang, Jiefei, 2025. "Systemic risk spillovers incorporating investor sentiment: Evidence from an improved TENET analysis," Economic Modelling, Elsevier, vol. 151(C).
    2. Xue, Xiaorui & Li, Shaofang & Wang, Xiaonan & Ren, Tingting, 2026. "Enhancing stock market predictions with multivariate signal decomposition and dynamic feature optimization," The North American Journal of Economics and Finance, Elsevier, vol. 81(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Abakah, Emmanuel Joel Aikins & Abdullah, Mohammad & Yousaf, Imran & Kumar Tiwari, Aviral & Li, Yanshuang, 2024. "Economic sanctions sentiment and global stock markets," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 91(C).
    2. Zhang, Cheng & Gao, Bin & Xu, Xiangrong & Qin, Mimi, 2025. "MFA RPC news sentiment and stock returns," Pacific-Basin Finance Journal, Elsevier, vol. 92(C).
    3. Chu, Xiaojun & Wan, Xinmin & Qiu, Jianying, 2023. "The relative importance of overnight sentiment versus trading-hour sentiment in volatility forecasting," Journal of Behavioral and Experimental Finance, Elsevier, vol. 39(C).
    4. Song, Ziyu & Gong, Xiaomin & Zhang, Cheng & Yu, Changrui, 2023. "Investor sentiment based on scaled PCA method: A powerful predictor of realized volatility in the Chinese stock market," International Review of Economics & Finance, Elsevier, vol. 83(C), pages 528-545.
    5. Xiaohong Shen & Gaoshan Wang & Yue Wang & Alfred Peris, 2021. "The Influence of Research Reports on Stock Returns: The Mediating Effect of Machine-Learning-Based Investor Sentiment," Discrete Dynamics in Nature and Society, Hindawi, vol. 2021, pages 1-14, December.
    6. Lin Ren & Yingyue Sun & Deping Xiong & Yu Wei, 2025. "Evaluating the Impact of Private and Public Sentiments on the Linkage Between Gold and Stock Markets: Evidence from China," Evaluation Review, vol. 49(4), pages 739-772, August.
    7. Santi, Caterina, 2023. "Investor climate sentiment and financial markets," International Review of Financial Analysis, Elsevier, vol. 86(C).
    8. Zachary McGurk & Adam Nowak & Joshua C. Hall, 2020. "Stock returns and investor sentiment: textual analysis and social media," Journal of Economics and Finance, Springer;Academy of Economics and Finance, vol. 44(3), pages 458-485, July.
    9. Tang, Zhenpeng & Lin, Qiaofeng & Cai, Yi & Chen, Kaijie & Liu, Dinggao, 2024. "Harnessing the power of real-time forum opinion: Unveiling its impact on stock market dynamics using intraday high-frequency data in China," International Review of Financial Analysis, Elsevier, vol. 93(C).
    10. Mubeen Abdur Rehman & Saeed Ahmad Sabir & Muhammad Zahid Javed & Haider Mahmood, 2024. "The Connectedness Knowledge from Investors’ Sentiments, Financial Crises, and Trade Policy: An Economic Perspective," Journal of the Knowledge Economy, Springer;Portland International Center for Management of Engineering and Technology (PICMET), vol. 15(4), pages 20038-20062, December.
    11. Nader Mahmoudi & Łukasz P. Olech & Paul Docherty, 2022. "A comprehensive study of domain-specific emoji meanings in sentiment classification," Computational Management Science, Springer, vol. 19(2), pages 159-197, June.
    12. Zhang, Junhuan & Zhang, Ziyan & Wen, Jiaqi, 2025. "A multifactor model using large language models and multimodal investor sentiment," International Review of Economics & Finance, Elsevier, vol. 102(C).
    13. Qing Liu & Hosung Son, 2025. "Text Sentiment Mining used for Constructing Investor Sentiment in Social Media: Survey and Recommendations," SAGE Open, vol. 15(1), pages 21582440251, March.
    14. Shen, Yiran & Liu, Chang & Sun, Xiaolei & Guo, Kun, 2023. "Investor sentiment and the Chinese new energy stock market: A risk–return perspective," International Review of Economics & Finance, Elsevier, vol. 84(C), pages 395-408.
    15. Diego Pitta Jesus & Elvira Helena Oliveira Medeiros & Lucas Lúcio Godeiro & Andressa Lemes Proque, 2025. "Forecasting Brazilian Stock Market Using Sentiment Indices from Textual Data, Chat-GPT-Based and Technical Indicators," Computational Economics, Springer;Society for Computational Economics, vol. 66(5), pages 3735-3780, November.
    16. Gu, Ming & Hirshleifer, David & Teoh, Siew Hong & Wu, Shijia, 2025. "GIFfluence: A Visual Approach to Investor Sentiment and the Stock Market," MPRA Paper 127438, University Library of Munich, Germany.
    17. Maher Hamid, 2026. "Implementing domain-specific LLMs for strategic investment decisions: a retrospective case study comparing AI and human expertise," Digital Finance, Springer, vol. 8(1), pages 1-134, March.
    18. Zongwu Cai & Pixiong Chen, 2022. "New Online Investor Sentiment and Asset Returns," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202216, University of Kansas, Department of Economics, revised Nov 2022.
    19. Eierle, Brigitte & Klamer, Sebastian & Muck, Matthias, 2022. "Does it really pay off for investors to consider information from social media?," International Review of Financial Analysis, Elsevier, vol. 81(C).
    20. Prajwal Eachempati & Praveen Ranjan Srivastava, 2021. "Accounting for unadjusted news sentiment for asset pricing," Qualitative Research in Financial Markets, Emerald Group Publishing Limited, vol. 13(3), pages 383-422, May.
