
Stock Movement Prediction with Financial News using Contextualized Embedding from BERT

Author

Listed:
  • Qinkai Chen

Abstract

News events can greatly influence equity markets. In this paper, we predict the short-term movement of stock prices after financial news events using only the news headlines. To this end, we introduce a new text mining method called Fine-Tuned Contextualized-Embedding Recurrent Neural Network (FT-CE-RNN). Whereas previous approaches use static vector representations of the news (static embeddings), our model uses contextualized vector representations of the headlines (contextualized embeddings) generated by Bidirectional Encoder Representations from Transformers (BERT). Our model achieves state-of-the-art results on this stock movement prediction task and significantly improves on the baseline models in both accuracy and trading simulations. Through various trading simulations based on millions of headlines from Bloomberg News, we demonstrate the model's effectiveness in real-world scenarios.
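
The general idea of feeding per-token headline vectors into a recurrent classifier can be sketched as follows. This is only an illustrative toy under stated assumptions, not the paper's implementation: the abstract does not specify the recurrent cell, the fine-tuning procedure, or any hyperparameters, so a plain tanh RNN is used for brevity, the weights are random, and the token vectors are random stand-ins for what a BERT encoder would produce (for example, 768-dimensional vectors for BERT-base). The names `rnn_classify` and `random_matrix` are hypothetical.

```python
import math
import random


def rnn_classify(token_vectors, w_in, w_rec, w_out):
    """Run a plain tanh RNN over per-token vectors and return a score in (0, 1).

    The score plays the role of P(price moves up); with untrained random
    weights it carries no predictive meaning.
    """
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size  # initial hidden state
    for x in token_vectors:
        # New hidden state depends on the current token and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
    # Linear read-out of the final hidden state, squashed by a sigmoid.
    logit = sum(w_out[j] * h[j] for j in range(hidden_size))
    return 1.0 / (1.0 + math.exp(-logit))


def random_matrix(rows, cols, rng, scale=0.5):
    """Hypothetical helper: a rows x cols matrix of uniform random floats."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
EMB_DIM, HIDDEN = 4, 3  # tiny sizes for readability (assumption, not from the paper)

w_in = random_matrix(HIDDEN, EMB_DIM, rng)
w_rec = random_matrix(HIDDEN, HIDDEN, rng)
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

# Stand-in for the contextualized embeddings of a 5-token headline.
headline_vectors = random_matrix(5, EMB_DIM, rng, scale=1.0)

p_up = rnn_classify(headline_vectors, w_in, w_rec, w_out)
print(f"score for 'up' movement: {p_up:.3f}")
```

In a contextualized setting, the vector for each token depends on the whole headline, so the same word can receive different vectors in different headlines; a static embedding would assign it one fixed vector regardless of context.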

Suggested Citation

  • Qinkai Chen, 2021. "Stock Movement Prediction with Financial News using Contextualized Embedding from BERT," Papers 2107.08721, arXiv.org.
  • Handle: RePEc:arx:papers:2107.08721

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2107.08721
    File Function: Latest version
    Download Restriction: no

    Citations

    Cited by:

    1. Jianfei Zhang & Mathieu Rosenbaum, 2023. "Towards systematic intraday news screening: a liquidity-focused approach," Papers 2304.05115, arXiv.org.
    2. Yanzhao Zou & Dorien Herremans, 2022. "PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin," Papers 2206.00648, arXiv.org, revised Oct 2023.
    3. Liping Wang & Jiawei Li & Lifan Zhao & Zhizhuo Kou & Xiaohan Wang & Xinyi Zhu & Hao Wang & Yanyan Shen & Lei Chen, 2023. "Methods for Acquiring and Incorporating Knowledge into Stock Price Prediction: A Survey," Papers 2308.04947, arXiv.org.
    4. Jimei Shen & Zhehu Yuan & Yifan Jin, 2022. "AlphaMLDigger: A Novel Machine Learning Solution to Explore Excess Return on Investment," Papers 2206.11072, arXiv.org, revised Dec 2022.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eghbal Rahimikia & Stefan Zohren & Ser-Huang Poon, 2021. "Realised Volatility Forecasting: Machine Learning via Financial Word Embedding," Papers 2108.00480, arXiv.org, revised Mar 2023.
    2. Massimo Ferrari Minesso & Laura Lebastard & Helena Mezo, 2023. "Text-Based Recession Probabilities," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 71(2), pages 415-438, June.
    3. Ge, S., 2020. "Text-Based Linkages and Local Risk Spillovers in the Equity Market," Cambridge Working Papers in Economics 20115, Faculty of Economics, University of Cambridge.
    4. Luca Grilli & Domenico Santoro, 2022. "Forecasting financial time series with Boltzmann entropy through neural networks," Computational Management Science, Springer, vol. 19(4), pages 665-681, October.
    5. Salman Bahoo & Marco Cucculelli & Xhoana Goga & Jasmine Mondolo, 2024. "Artificial intelligence in Finance: a comprehensive review through bibliometric and content analysis," SN Business & Economics, Springer, vol. 4(2), pages 1-46, February.
    6. Hansen, Stephen & Davis, Steven & Seminario-Amez, Cristhian, 2020. "Firm-level Risk Exposures and Stock Returns in the Wake of COVID-19," CEPR Discussion Papers 15314, C.E.P.R. Discussion Papers.
    7. Mengda Li & Charles-Albert Lehalle, 2021. "Do Word Embeddings Really Understand Loughran-McDonald's Polarities?," Papers 2103.09813, arXiv.org.
    8. Simon Fritzsch & Philipp Scharner & Gregor Weiß, 2021. "Estimating the relation between digitalization and the market value of insurers," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 88(3), pages 529-567, September.
    9. Dimitri Kroujiline & Maxim Gusev & Dmitry Ushanov & Sergey V. Sharov & Boris Govorkov, 2018. "An Endogenous Mechanism of Business Cycles," Papers 1803.05002, arXiv.org, revised Sep 2019.
    10. Marcus Cordi & Damien Challet & Serge Kassibrakis, 2021. "The market nanostructure origin of asset price time reversal asymmetry," Quantitative Finance, Taylor & Francis Journals, vol. 21(2), pages 295-304, February.
    11. Zheng, Hannan & Schwenkler, Gustavo, 2020. "The network of firms implied by the news," ESRB Working Paper Series 108, European Systemic Risk Board.
    12. Chao, Xiangrui & Ran, Qin & Chen, Jia & Li, Tie & Qian, Qian & Ergu, Daji, 2022. "Regulatory technology (Reg-Tech) in financial stability supervision: Taxonomy, key methods, applications and future directions," International Review of Financial Analysis, Elsevier, vol. 80(C).
    13. Marozzi, Armando, 2021. "The ECB's tracker: nowcasting the press conferences of the ECB," Working Paper Series 2609, European Central Bank.
    14. Tom Marty & Bruce Vanstone & Tobias Hahn, 2020. "News media analytics in finance: a survey," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 60(2), pages 1385-1434, June.
    15. Marcus Cordi & Serge Kassibrakis & Damien Challet, 2018. "The market nanostructure origin of asset price time reversal asymmetry," Working Papers hal-01966419, HAL.
    16. Zhang, Yongjie & Chu, Gang & Shen, Dehua, 2021. "The role of investor attention in predicting stock prices: The long short-term memory networks perspective," Finance Research Letters, Elsevier, vol. 38(C).
    17. Charles W. Calomiris & Nida Çakır Melek & Harry Mamaysky, 2021. "Predicting the Oil Market," NBER Working Papers 29379, National Bureau of Economic Research, Inc.
    18. Ardia, David & Bluteau, Keven & Boudt, Kris, 2022. "Media abnormal tone, earnings announcements, and the stock market," Journal of Financial Markets, Elsevier, vol. 61(C).
    19. Karl Naumann-Woleske & Michael Benzaquen & Maxim Gusev & Dimitri Kroujiline, 2021. "Capital Demand Driven Business Cycles: Mechanism and Effects," Papers 2110.00360, arXiv.org, revised Sep 2022.
    20. Schnaubelt, Matthias & Fischer, Thomas G. & Krauss, Christopher, 2020. "Separating the signal from the noise – Financial machine learning for Twitter," Journal of Economic Dynamics and Control, Elsevier, vol. 114(C).
