Realised Volatility Forecasting: Machine Learning via Financial Word Embedding

Author

Listed:
  • Eghbal Rahimikia
  • Stefan Zohren
  • Ser-Huang Poon

Abstract

This study develops FinText, a financial word embedding compiled from 15 years of business news archives. The results show that FinText produces substantially more accurate results than general word embeddings based on the gold-standard financial benchmark we introduced. In contrast to well-known econometric models, and over the sample period from 27 July 2007 to 27 January 2022 for 23 NASDAQ stocks, using stock-related news, our simple natural language processing model supported by different word embeddings improves realised volatility forecasts on high volatility days. This improvement in realised volatility forecasting performance switches to normal volatility days when general hot news is used. By utilising SHAP, an Explainable AI method, we also identify and classify key phrases in stock-related and general hot news that moved volatility.
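As a rough illustration of the forecasting target named in the abstract, realised variance for a day is conventionally computed as the sum of squared intraday log returns. The sketch below shows that calculation together with a simple quantile split into "high" and "normal" volatility days. The function names, the price inputs, and the 0.8 cutoff are assumptions made for illustration; the abstract does not state how the paper samples prices or defines a high volatility day.

```python
import math


def realised_variance(prices):
    """Sum of squared log returns over one day's intraday price path."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return sum(r * r for r in returns)


def split_by_volatility(daily_rv, quantile=0.8):
    """Label each day 'high' or 'normal' against an empirical quantile cutoff.

    The 0.8 default is a placeholder assumption, not the paper's definition.
    """
    ordered = sorted(daily_rv)
    cutoff = ordered[min(len(ordered) - 1, int(quantile * len(ordered)))]
    return ["high" if rv >= cutoff else "normal" for rv in daily_rv]


if __name__ == "__main__":
    # Hypothetical intraday prices for three days.
    days = [
        [100.0, 100.1, 100.0, 100.2],
        [100.0, 101.5, 99.0, 100.5],
        [100.0, 100.0, 100.1, 100.1],
    ]
    rv = [realised_variance(p) for p in days]
    print(list(zip(rv, split_by_volatility(rv))))
```

Realised volatility is the square root of this quantity. A forecasting model would then be evaluated separately on the two groups of days.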

Suggested Citation

  • Eghbal Rahimikia & Stefan Zohren & Ser-Huang Poon, 2021. "Realised Volatility Forecasting: Machine Learning via Financial Word Embedding," Papers 2108.00480, arXiv.org, revised Mar 2023.
  • Handle: RePEc:arx:papers:2108.00480

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2108.00480
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hoang, Daniel & Wiegratz, Kevin, 2022. "Machine learning methods in finance: Recent applications and prospects," Working Paper Series in Economics 158, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
    2. Andrew J. Patton & Yasin Simsek, 2023. "Generalized Autoregressive Score Trees and Forests," Papers 2305.18991, arXiv.org.
    3. Chao Zhang & Yihuang Zhang & Mihai Cucuringu & Zhongmin Qian, 2022. "Volatility forecasting with machine learning and intraday commonality," Papers 2202.08962, arXiv.org, revised Feb 2023.
    4. Sergio Consoli & Luca Tiozzo Pezzoli & Elisa Tosetti, 2022. "Neural forecasting of the Italian sovereign bond market with economic news," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 197-224, December.
    5. Casas Villalba, Maria Isabel & Mao, Xiuping & Lopes Moreira Da Veiga, María Helena, 2020. "Adaptative predictability of stock market returns," DES - Working Papers. Statistics and Econometrics. WS 31648, Universidad Carlos III de Madrid. Departamento de Estadística.
    6. Zhu, Haibin & Bai, Lu & He, Lidan & Liu, Zhi, 2023. "Forecasting realized volatility with machine learning: Panel data perspective," Journal of Empirical Finance, Elsevier, vol. 73(C), pages 251-271.
    7. Borup, Daniel & Christensen, Bent Jesper & Mühlbach, Nicolaj Søndergaard & Nielsen, Mikkel Slot, 2023. "Targeting predictors in random forest regression," International Journal of Forecasting, Elsevier, vol. 39(2), pages 841-868.
    8. Zhao, Albert Bo & Cheng, Tingting, 2022. "Stock return prediction: Stacking a variety of models," Journal of Empirical Finance, Elsevier, vol. 67(C), pages 288-317.
    9. Luo, Qin & Bu, Jinfeng & Xu, Weiju & Huang, Dengshi, 2023. "Stock market volatility prediction: Evidence from a new bagging model," International Review of Economics & Finance, Elsevier, vol. 87(C), pages 445-456.
    10. Paul Geertsema & Helen Lu, 2023. "Relative Valuation with Machine Learning," Journal of Accounting Research, Wiley Blackwell, vol. 61(1), pages 329-376, March.
    11. Cakici, Nusret & Fieberg, Christian & Metko, Daniel & Zaremba, Adam, 2023. "Machine learning goes global: Cross-sectional return predictability in international stock markets," Journal of Economic Dynamics and Control, Elsevier, vol. 155(C).
    12. Kim Christensen & Mathias Siggaard & Bezirgen Veliyev, 2021. "A machine learning approach to volatility forecasting," CREATES Research Papers 2021-03, Department of Economics and Business Economics, Aarhus University.
    13. Victor DeMiguel & Javier Gil-Bazo & Francisco J. Nogales & André A. P. Santos, 2021. "Can Machine Learning Help to Select Portfolios of Mutual Funds?," Working Papers 1245, Barcelona School of Economics.
    14. Bollerslev, Tim & Medeiros, Marcelo C. & Patton, Andrew J. & Quaedvlieg, Rogier, 2022. "From zero to hero: Realized partial (co)variances," Journal of Econometrics, Elsevier, vol. 231(2), pages 348-360.
    15. Gang Chu & John W. Goodell & Dehua Shen & Yongjie Zhang, 2022. "Machine learning to establish proxies for investor attention: evidence of improved stock-return prediction," Annals of Operations Research, Springer, vol. 318(1), pages 103-128, November.
    16. Chao Zhang & Xingyue Pu & Mihai Cucuringu & Xiaowen Dong, 2023. "Graph Neural Networks for Forecasting Multivariate Realized Volatility with Spillover Effects," Papers 2308.01419, arXiv.org.
    17. Helena Chuliá & Sabuhi Khalili & Jorge M. Uribe, 2024. "Monitoring time-varying systemic risk in sovereign debt and currency markets with generative AI," IREA Working Papers 202402, University of Barcelona, Research Institute of Applied Economics, revised Feb 2024.
    18. Lu, Xinjie & Ma, Feng & Xu, Jin & Zhang, Zehui, 2022. "Oil futures volatility predictability: New evidence based on machine learning models," International Review of Financial Analysis, Elsevier, vol. 83(C).
    19. Schnaubelt, Matthias & Seifert, Oleg, 2020. "Valuation ratios, surprises, uncertainty or sentiment: How does financial machine learning predict returns from earnings announcements?," FAU Discussion Papers in Economics 04/2020, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    20. Jorge Guijarro-Ordonez & Markus Pelger & Greg Zanotti, 2021. "Deep Learning Statistical Arbitrage," Papers 2106.04028, arXiv.org, revised Oct 2022.
