
Analysis of the Association between Topics in Online Documents and Stock Price Movements

Authors
  • František Dařena

    (Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 61300 Brno, Czech Republic)

  • Jan Přichystal

    (Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 61300 Brno, Czech Republic)

Abstract

This paper aims to discover topics hidden in newspaper articles that have an impact on the stock price movements of the corresponding companies. Document topics are characterized by combinations of specific words in documents and are shared across a document collection. We describe the process of discovering the topics, creating a mapping of the topics to stock price movements, and quantifying and evaluating the results. To find and quantify the association, we use machine learning-based classification. We achieved an accuracy of stock price movement prediction higher than 70%. A feature selection procedure was applied to the features characterizing the topics to make it easier for a human expert to assign a label to each topic.
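
The pipeline outlined in the abstract (topic features per document, a mapping to price movements, then classification) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the topic model or the classifier, so the topic proportions are assumed to be given, a nearest-centroid rule stands in for the classifier, and the 1 % labelling threshold, the toy data, and all function names are hypothetical.

```python
from statistics import mean


def label_movement(price_before, price_after, threshold=0.01):
    """Map a relative price change to 'up', 'down', or None (change too small).

    The 1 % threshold is an assumption made for this sketch only.
    """
    change = (price_after - price_before) / price_before
    if change >= threshold:
        return "up"
    if change <= -threshold:
        return "down"
    return None


def train_centroids(vectors, labels):
    """Average the topic vectors of each class (nearest-centroid stand-in)."""
    centroids = {}
    for cls in set(labels):
        members = [v for v, lab in zip(vectors, labels) if lab == cls]
        centroids[cls] = [mean(column) for column in zip(*members)]
    return centroids


def predict(centroids, vector):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls], vector))


# Toy data (invented): topic proportions per document, price before, price after.
docs = [
    ([0.8, 0.1, 0.1], 100.0, 104.0),
    ([0.7, 0.2, 0.1], 50.0, 51.5),
    ([0.1, 0.8, 0.1], 80.0, 76.0),
    ([0.2, 0.7, 0.1], 20.0, 19.0),
    ([0.3, 0.3, 0.4], 10.0, 10.02),  # change below threshold, left unlabelled
]

# Keep only documents whose price change is large enough to label.
labeled = [(vec, label_movement(before, after)) for vec, before, after in docs]
labeled = [(vec, lab) for vec, lab in labeled if lab is not None]

vectors = [vec for vec, _ in labeled]
labels = [lab for _, lab in labeled]
centroids = train_centroids(vectors, labels)

# Classify a new document that is dominated by the first topic.
prediction = predict(centroids, [0.75, 0.15, 0.10])
print(prediction)
```

In a real study the prediction accuracy would be estimated on held-out documents; the toy data here is far too small for that and only shows how the pieces connect.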

Suggested Citation

  • František Dařena & Jan Přichystal, 2018. "Analysis of the Association between Topics in Online Documents and Stock Price Movements," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 66(6), pages 1431-1439.
  • Handle: RePEc:mup:actaun:actaun_2018066061431
    DOI: 10.11118/actaun201866061431

    Download full text from publisher

    File URL: http://acta.mendelu.cz/doi/10.11118/actaun201866061431.html
    Download Restriction: free of charge

    File URL: http://acta.mendelu.cz/doi/10.11118/actaun201866061431.pdf
    Download Restriction: free of charge



    Citations


    Cited by:

    1. Pavel Netolický & Jonáš Petrovský & František Dařena, 2018. "Text-Mining in Streams of Textual Data Using Time Series Applied to Stock Market," Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, Mendel University Press, vol. 66(6), pages 1573-1580.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Frantisek Darena & Jonas Petrovsky & Jan Zizka & Jan Prichystal, 2016. "Analyzing the correlation between online texts and stock price movements at micro-level using machine learning," MENDELU Working Papers in Business and Economics 2016-67, Mendel University in Brno, Faculty of Business and Economics.
    2. Patrick Houlihan & Germán G. Creamer, 2021. "Leveraging Social Media to Predict Continuation and Reversal in Asset Prices," Computational Economics, Springer;Society for Computational Economics, vol. 57(2), pages 433-453, February.
    3. Andrew Todd & James Bowden & Yashar Moshfeghi, 2024. "Text‐based sentiment analysis in finance: Synthesising the existing literature and exploring future directions," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 31(1), March.
    4. Yan Luo & Linying Zhou, 2020. "Textual tone in corporate financial disclosures: a survey of the literature," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 17(2), pages 101-110, September.
    5. Bennani, Hamza, 2018. "Media coverage and ECB policy-making: Evidence from an augmented Taylor rule," Journal of Macroeconomics, Elsevier, vol. 57(C), pages 26-38.
    6. Yuting Chen & Don Bredin & Valerio Potì & Roman Matkovskyy, 2022. "COVID risk narratives: a computational linguistic approach to the econometric identification of narrative risk during a pandemic," Digital Finance, Springer, vol. 4(1), pages 17-61, March.
    7. Sun, Andrew & Lachanski, Michael & Fabozzi, Frank J., 2016. "Trade the tweet: Social media text mining and sparse matrix factorization for stock market prediction," International Review of Financial Analysis, Elsevier, vol. 48(C), pages 272-281.
    8. Picault, Matthieu & Pinter, Julien & Renault, Thomas, 2022. "Media sentiment on monetary policy: Determinants and relevance for inflation expectations," Journal of International Money and Finance, Elsevier, vol. 124(C).
    9. Renato Camodeca & Alex Almici & Umberto Sagliaschi, 2018. "Sustainability Disclosure in Integrated Reporting: Does It Matter to Investors? A Cheap Talk Approach," Sustainability, MDPI, vol. 10(12), pages 1-34, November.
    10. Shuangyan Li & Guangrui Wang & Yongli Luo, 2022. "Tone of language, financial disclosure, and earnings management: a textual analysis of form 20-F," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-24, December.
    11. Massimo Ferrari Minesso & Frederik Kurcz & Maria Sole Pagliari, 2022. "Do words hurt more than actions? The impact of trade tensions on financial markets," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(6), pages 1138-1159, September.
    12. Picault, Matthieu & Renault, Thomas, 2017. "Words are not all created equal: A new measure of ECB communication," Journal of International Money and Finance, Elsevier, vol. 79(C), pages 136-156.
    13. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2022. "Asset returns, news topics, and media effects," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 838-868, July.
    14. Ricardo Correa & Keshav Garud & Juan M Londono & Nathan Mislang, 2021. "Sentiment in Central Banks’ Financial Stability Reports," Review of Finance, European Finance Association, vol. 25(1), pages 85-120.
    15. Anastasiou, Dimitrios & Katsafados, Apostolos G., 2020. "Bank Deposits Flows and Textual Sentiment: When an ECB President's speech is not just a speech," MPRA Paper 99729, University Library of Munich, Germany.
    16. Saadon, Yossi & Schreiber, Ben Z., 2023. "Newspapers tone and the overnight-intraday stock return anomaly," Journal of Financial Markets, Elsevier, vol. 65(C).
    17. Oz, Seda, 2024. "The impact of terrorist attacks and mass shootings on earnings management," The British Accounting Review, Elsevier, vol. 56(3).
    18. Loughran, Tim & McDonald, Bill & Pragidis, Ioannis, 2019. "Assimilation of oil news into prices," International Review of Financial Analysis, Elsevier, vol. 63(C), pages 105-118.
    19. Renault, Thomas, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Journal of Banking & Finance, Elsevier, vol. 84(C), pages 25-40.
    20. Nicolò Fraccaroli & Alessandro Giovannini & Jean-François Jamet & Eric Persson, 2023. "Central Banks in Parliaments: A Text Analysis of the Parliamentary Hearings of the Bank of England, the European Central Bank, and the Federal Reserve," International Journal of Central Banking, International Journal of Central Banking, vol. 19(2), pages 543-600, June.

