Printed from https://ideas.repec.org/a/isp/journl/v19y2025i1p215-224.html

From Financial News To Price Forecasts: Assessing The Role Of NLP-Based Sentiment Analysis And Variable Selection In Stock Price Prediction

Authors

  • Erdem Korhan Akçay
  • İsmail Yenilmez

Abstract

This study investigates the impact of NLP-based sentiment analysis and variable selection techniques on stock price prediction. Sentiment indicators are extracted using Natural Language Processing (NLP) methods, including TextBlob-based sentiment polarity scores, VADER compound scores, and domain-specific lexicon-based sentiment scores. To address the complexity and dimensionality of financial data, variable selection techniques (PCA, LASSO, Elastic Net, PCA + LASSO, and PCA + Elastic Net) are employed. These methods help in constructing a more efficient feature set by reducing noise and multicollinearity. The selected features, combined with sentiment variables, are utilized in predictive models including ARIMAX, ANN, LSTM, and GRU. The models are tested on eight publicly traded stocks (AAPL, AMZN, GOOG, META, MSFT, NFLX, NVDA, TSLA) over a four-year period. The results indicate that the inclusion of sentiment variables improves forecasting performance, particularly when combined with dimensionality reduction and regularization techniques. Among these approaches, combinations of PCA with regularization techniques often lead to more stable and competitive forecasting performance. The findings highlight the value of integrating unstructured textual data from financial news into time series forecasting models, contributing to improved predictive performance in stock market applications.
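
As a rough illustration of the lexicon-based scoring step mentioned in the abstract, the sketch below scores each headline against a small word list and averages the scores per ticker and day, giving one sentiment feature that could sit alongside price-based variables. The lexicon, its weights, the sample headlines, and the function names `headline_score` and `daily_sentiment` are all hypothetical; the abstract does not say which lexicon or aggregation rule the authors used.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon: words and weights are illustrative
# assumptions, not the domain-specific lexicon used in the study.
LEXICON = {
    "beat": 1.0, "growth": 1.0, "upgrade": 1.0, "record": 0.5,
    "miss": -1.0, "lawsuit": -1.0, "downgrade": -1.0, "recall": -0.5,
}


def headline_score(text):
    """Mean lexicon weight over matched tokens; 0.0 if nothing matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def daily_sentiment(news):
    """Average headline scores per (ticker, date): one feature per day."""
    buckets = defaultdict(list)
    for ticker, date, headline in news:
        buckets[(ticker, date)].append(headline_score(headline))
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Made-up headlines, used only to exercise the functions.
news = [
    ("AAPL", "2024-01-02", "Apple posts record growth"),
    ("AAPL", "2024-01-02", "Analyst downgrade after earnings miss"),
    ("TSLA", "2024-01-02", "Tesla faces lawsuit over recall"),
]
features = daily_sentiment(news)
for key in sorted(features):
    print(key, features[key])
```

Averaging over matched tokens keeps the score in the range of the lexicon weights regardless of headline length. This is one possible design choice; a study could equally sum the weights or normalise by total token count.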

Suggested Citation

  • Erdem Korhan Akçay & İsmail Yenilmez, 2025. "From Financial News To Price Forecasts: Assessing The Role Of NLP-Based Sentiment Analysis And Variable Selection In Stock Price Prediction," Economy & Business Journal, International Scientific Publications, Bulgaria, vol. 19(1), pages 215-224.
  • Handle: RePEc:isp:journl:v:19:y:2025:i:1:p:215-224

    Download full text from publisher

    File URL: https://www.scientific-publications.net/get/1000072/1767176181899605.pdf
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Paul Hubert & Fabien Labondance, 2016. "Central Bank Sentiment and Policy Expectations," Sciences Po Economics Publications (main) hal-03459227, HAL.
    2. Thiasha Naidoo & Peter Moores-Pitt & Paul-Francois Muzindutsi & Kazeem O Isah, 2025. "Analysing investor sentiment and stock market volatility of the JSE size-based indices: a GARCH-MIDAS approach," Risk Management, Palgrave Macmillan, vol. 27(3), pages 1-23, September.
    3. Zhu (Drew) Zhang & Jie Yuan & Amulya Gupta, 2024. "Let the Laser Beam Connect the Dots: Forecasting and Narrating Stock Market Volatility," INFORMS Journal on Computing, INFORMS, vol. 36(6), pages 1400-1416, December.
    4. Paul Hubert & Fabien Labondance, 2019. "Central bank tone and the dispersion of views within monetary policy committees," Working Papers hal-03403256, HAL.
    5. Junyu Chen & Tom Boot & Lingwei Kong & Weining Wang, 2026. "Transformer-based CoVaR: Systemic Risk in Textual Information," Papers 2602.12490, arXiv.org.
    6. Sushant Chari & Purva Hegde Desai & Nilesh Borde & Babu George, 2023. "Aggregate News Sentiment and Stock Market Returns in India," JRFM, MDPI, vol. 16(8), pages 1-18, August.
    7. Scott R. Baker & Nicholas Bloom & Steven J. Davis & Marco C. Sammon, 2021. "What Triggers Stock Market Jumps?," NBER Working Papers 28687, National Bureau of Economic Research, Inc.
    8. Massimiliano Caporin & Francesco Poli, 2017. "Building News Measures from Textual Data and an Application to Volatility Forecasting," Econometrics, MDPI, vol. 5(3), pages 1-46, August.
    9. Takeshi Inuduka & Akihito Yokose & Shunsuke Managi, 2024. "Influencing cryptocurrency: analyzing celebrity sentiments on X (formerly Twitter) and their impact on bitcoin prices," Digital Finance, Springer, vol. 6(3), pages 379-426, September.
    10. Domenica Mino & Cillian Williamson, 2025. "Sentiment and Volatility in Financial Markets: A Review of BERT and GARCH Applications during Geopolitical Crises," Papers 2510.16503, arXiv.org.
    11. Roman Frydman & Soren Johansen & Anders Rahbek & Morten Tabor, 2017. "The Qualitative Expectations Hypothesis: Model Ambiguity, Consistent Representations of Market Forecasts, and Sentiment," Working Papers Series 59, Institute for New Economic Thinking.
    12. Gianna Figà-Talamanca & Marco Patacca, 2024. "An explorative analysis of sentiment impact on S&P 500 components returns, volatility and downside risk," Annals of Operations Research, Springer, vol. 342(3), pages 2095-2117, November.
    13. Dániel Léber & Balázs Egyed, 2026. "The Sentiment Augmented GARCH-LSTM Hybrid Model for Value-at-Risk Forecasting," Computational Economics, Springer;Society for Computational Economics, vol. 67(1), pages 313-353, January.
    14. Filippo Lechthaler & Lisa Leinert, 2019. "Moody oil: What is driving the crude oil price?," Empirical Economics, Springer, vol. 57(5), pages 1547-1578, November.
    15. repec:spo:wpmain:info:hdl:2441/7mota32nad8aopst8f7d5aebpo is not listed on IDEAS
    16. repec:hal:wpaper:hal-01374710 is not listed on IDEAS
    17. Marcelo Sardelich & Suresh Manandhar, 2018. "Multimodal deep learning for short-term stock volatility prediction," Papers 1812.10479, arXiv.org.
    18. Prajwal Eachempati & Praveen Ranjan Srivastava, 2021. "Accounting for unadjusted news sentiment for asset pricing," Qualitative Research in Financial Markets, Emerald Group Publishing Limited, vol. 13(3), pages 383-422, May.
    19. Arif Pathan, 2025. "Transformers Beyond Order: A Chaos-Markov-Gaussian Framework for Short-Term Sentiment Forecasting of Any Financial OHLC timeseries Data," Papers 2506.17244, arXiv.org.
    20. Ho, Kin-Yip & Shi, Yanlin & Zhang, Zhaoyong, 2013. "How does news sentiment impact asset volatility? Evidence from long memory and regime-switching approaches," The North American Journal of Economics and Finance, Elsevier, vol. 26(C), pages 436-456.
    21. Masoud Soleimani, 2025. "LLM-Generated Counterfactual Stress Scenarios for Portfolio Risk Simulation via Hybrid Prompt-RAG Pipeline," Papers 2512.07867, arXiv.org.
    22. Yen-Ju Hsu & Yang-Cheng Lu & J. Jimmy Yang, 2021. "News sentiment and stock market volatility," Review of Quantitative Finance and Accounting, Springer, vol. 57(3), pages 1093-1122, October.

    More about this item

    JEL classification:

    • A - General Economics and Teaching
