Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market

Author

Listed:
  • Pavel Netolický

    (Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic)

  • Jonáš Petrovský

    (Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic)

  • František Dařena

    (Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic)

Abstract

Large volumes of text data are generated every day. This data comes from various sources and may contain valuable information. In this article, we use text mining methods to investigate whether there is a connection between news articles and changes in the S&P 500 stock index. The index values and documents were divided into time windows according to the direction of the index value changes. We achieved a classification accuracy of 65-74%.
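
The windowing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rule for flat days, the half-open window boundaries, and the sample values are all assumptions made for illustration. The sketch splits a daily index series into runs of rising or falling values and gives each document the label of the run its date falls in.

```python
from datetime import date


def direction_windows(series):
    """Split a chronologically sorted list of (day, value) pairs into
    maximal runs in which the value keeps moving in one direction."""
    windows = []
    for (d0, v0), (d1, v1) in zip(series, series[1:]):
        if v1 == v0:
            continue  # assumption: a flat step neither extends nor opens a window
        label = "up" if v1 > v0 else "down"
        if windows and windows[-1]["label"] == label:
            windows[-1]["end"] = d1  # same direction: extend the current window
        else:
            windows.append({"start": d0, "end": d1, "label": label})
    return windows


def label_documents(docs, windows):
    """Pair each (day, text) document with the label of the window whose
    interval (start, end] contains its day; unmatched documents are dropped."""
    labeled = []
    for day, text in docs:
        for w in windows:
            if w["start"] < day <= w["end"]:
                labeled.append((text, w["label"]))
                break
    return labeled


# Hypothetical index values and headlines, not data from the article.
index = [
    (date(2018, 1, 2), 100.0),
    (date(2018, 1, 3), 101.5),
    (date(2018, 1, 4), 103.0),
    (date(2018, 1, 5), 102.0),
    (date(2018, 1, 8), 100.5),
]
docs = [
    (date(2018, 1, 3), "markets rally on earnings"),
    (date(2018, 1, 8), "investors retreat amid uncertainty"),
]

windows = direction_windows(index)
labeled = label_documents(docs, windows)
print(labeled)
```

With these sample values the series yields one rising window followed by one falling window, so the first headline is labeled "up" and the second "down". Labeled pairs of this kind could then serve as training data for a text classifier; the abstract does not specify the classifier or the features used.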

Suggested Citation

  • Pavel Netolický & Jonáš Petrovský & František Dařena, 2018. "Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 66(6), pages 1573-1580.
  • Handle: RePEc:mup:actaun:actaun_2018066061573
    DOI: 10.11118/actaun201866061573

    Download full text from publisher

    File URL: http://acta.mendelu.cz/doi/10.11118/actaun201866061573.html
    Download Restriction: free of charge

    File URL: http://acta.mendelu.cz/doi/10.11118/actaun201866061573.pdf
    Download Restriction: free of charge

    File URL: https://libkey.io/10.11118/actaun201866061573?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yan Luo & Linying Zhou, 2020. "Textual tone in corporate financial disclosures: a survey of the literature," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 17(2), pages 101-110, September.
    2. Bennani, Hamza, 2018. "Media coverage and ECB policy-making: Evidence from an augmented Taylor rule," Journal of Macroeconomics, Elsevier, vol. 57(C), pages 26-38.
    3. Joshua Zoen Git Hiew & Xin Huang & Hao Mou & Duan Li & Qi Wu & Yabo Xu, 2019. "BERT-based Financial Sentiment Index and LSTM-based Stock Return Predictability," Papers 1906.09024, arXiv.org, revised Jul 2022.
    4. David Bholat & Stephen Hans & Pedro Santos & Cheryl Schonhardt-Bailey, 2015. "Text mining for central banks," Handbooks, Centre for Central Banking Studies, Bank of England, number 33, April.
    5. Ahmed, Yousry & Elshandidy, Tamer, 2016. "The effect of bidder conservatism on M&A decisions: Text-based evidence from US 10-K filings," International Review of Financial Analysis, Elsevier, vol. 46(C), pages 176-190.
    6. Peter Malec, 2016. "A Semiparametric Intraday GARCH Model," Cambridge Working Papers in Economics 1633, Faculty of Economics, University of Cambridge.
    7. Rongjiang Cai & Tao Lv & Cheng Wang & Nana Liu, 2023. "Can Environmental Information Disclosure Enhance Firm Value?—An Analysis Based on Textual Characteristics of Annual Reports," IJERPH, MDPI, vol. 20(5), pages 1-21, February.
    8. František Dařena & Jan Přichystal, 2018. "Analysis of the Association between Topics in Online Documents and Stock Price Movements," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 66(6), pages 1431-1439.
    9. Muhammad Farhan Malik & Yuan George Shan & Jamie Yixing Tong, 2022. "Do auditors price litigious tone?," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 62(S1), pages 1715-1760, April.
    10. Ahmad, Khurshid & Han, JingGuang & Hutson, Elaine & Kearney, Colm & Liu, Sha, 2016. "Media-expressed negative tone and firm-level stock returns," Journal of Corporate Finance, Elsevier, vol. 37(C), pages 152-172.
    11. Diego F. Téllez & Jesús M. Godoy, 2017. "Mission Power and Firm Financial Performance," Documentos de Trabajo CIEF 15655, Universidad EAFIT.
    12. Davidovic, Milivoje, 2021. "From pandemic to financial contagion: High-frequency risk metrics and Bayesian volatility analysis," Finance Research Letters, Elsevier, vol. 42(C).
    13. Yuting Chen & Don Bredin & Valerio Potì & Roman Matkovskyy, 2022. "COVID risk narratives: a computational linguistic approach to the econometric identification of narrative risk during a pandemic," Digital Finance, Springer, vol. 4(1), pages 17-61, March.
    14. Maria Pacurar, 2008. "Autoregressive Conditional Duration Models In Finance: A Survey Of The Theoretical And Empirical Literature," Journal of Economic Surveys, Wiley Blackwell, vol. 22(4), pages 711-751, September.
    15. Voges, Michelle & Leschinski, Christian & Sibbertsen, Philipp, 2017. "Seasonal long memory in intraday volatility and trading volume of Dow Jones stocks," Hannover Economic Papers (HEP) dp-599, Leibniz Universität Hannover, Wirtschaftswissenschaftliche Fakultät.
    16. David E. Allen & Michael McAleer, 2019. "Fake News and Propaganda: Trump’s Democratic America and Hitler’s National Socialist (Nazi) Germany," Sustainability, MDPI, vol. 11(19), pages 1-19, September.
    17. Kumar, Rahul & Deb, Soumya Guha & Mukherjee, Shubhadeep, 2020. "Do words reveal the latent truth? Identifying communication patterns of corporate losers," Journal of Behavioral and Experimental Finance, Elsevier, vol. 26(C).
    18. Xiufeng Yan, 2021. "Autoregressive conditional duration modelling of high frequency data," Papers 2111.02300, arXiv.org.
    19. Picault, Matthieu & Pinter, Julien & Renault, Thomas, 2022. "Media sentiment on monetary policy: Determinants and relevance for inflation expectations," Journal of International Money and Finance, Elsevier, vol. 124(C).
    20. Renato Camodeca & Alex Almici & Umberto Sagliaschi, 2018. "Sustainability Disclosure in Integrated Reporting: Does It Matter to Investors? A Cheap Talk Approach," Sustainability, MDPI, vol. 10(12), pages 1-34, November.
