
Predicting Financial Markets: Comparing Survey, News, Twitter and Search Engine Data


  • Huina Mao
  • Scott Counts
  • Johan Bollen


Financial market prediction based on online sentiment tracking has recently drawn considerable attention. However, most results in this emerging domain rely on a single, particular combination of data sets and sentiment tracking tools. This makes it difficult to separate measurement and instrument effects from the factors that actually drive the apparent relation between online sentiment and market values. In this paper, we survey a range of online data sets (Twitter feeds, news headlines, and volumes of Google search queries) and sentiment tracking methods (Twitter Investor Sentiment, Negative News Sentiment, and Tweet & Google Search volumes of financial terms), and compare their value for predicting market indices such as the Dow Jones Industrial Average, trading volumes, and market volatility (VIX), as well as gold prices. We also compare the predictive power of traditional investor sentiment survey data, i.e. Investor Intelligence and Daily Sentiment Index, against that of these online sentiment indicators. Our results show that traditional surveys of Investor Intelligence are lagging indicators of the financial markets. However, weekly Google Insight Search volumes on financial search queries do have predictive value. An indicator of Twitter Investor Sentiment and the frequency of occurrence of financial terms on Twitter in the previous 1-2 days are also found to be highly statistically significant predictors of daily market log returns. Survey sentiment indicators, however, are not statistically significant predictors of financial market values once we control for all other mood indicators as well as the VIX.
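
The abstract refers to daily log returns and to indicators from the previous 1-2 days as predictors. As a minimal sketch of what such a setup can look like, the Python below computes log returns from closing prices and fits a one-variable least-squares regression of the return on day t against an indicator value from `lag` days earlier. This is an illustration only, not the authors' model: the function names `log_returns` and `lagged_ols` are hypothetical, the numbers are made up, and the paper's own analysis (which controls for other mood indicators and the VIX) is not reproduced here.

```python
import math


def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1}) from a list of closing prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def lagged_ols(returns, indicator, lag=1):
    """Fit r_t = a + b * s_{t-lag} by ordinary least squares.

    `returns` and `indicator` are assumed to be aligned by day
    (same length, index i refers to the same trading day in both).
    Returns the tuple (a, b).
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    if len(returns) != len(indicator):
        raise ValueError("series must have the same length")
    y = returns[lag:]                      # returns on days lag .. n-1
    x = indicator[:len(indicator) - lag]   # indicator on days 0 .. n-1-lag
    n = len(y)
    if n < 2:
        raise ValueError("not enough observations for the chosen lag")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("indicator has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Made-up illustrative data: six closing prices give five daily returns,
# paired with five hypothetical daily sentiment scores for the same days.
prices = [100.0, 101.0, 100.5, 102.0, 101.2, 103.0]
sentiment = [0.2, -0.1, 0.4, -0.3, 0.5]
intercept, slope = lagged_ols(log_returns(prices), sentiment, lag=1)
print(intercept, slope)
```

A single-predictor fit keeps the sketch dependency-free. Testing significance or adding control variables, as the abstract describes, would call for a multivariate regression in a statistics library.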

Suggested Citation

  • Huina Mao & Scott Counts & Johan Bollen, 2011. "Predicting Financial Markets: Comparing Survey, News, Twitter and Search Engine Data," Papers 1112.1051.
  • Handle: RePEc:arx:papers:1112.1051


    References listed on IDEAS

    1. Kahneman, Daniel & Tversky, Amos, 1979. "Prospect Theory: An Analysis of Decision under Risk," Econometrica, Econometric Society, vol. 47(2), pages 263-291, March.
    2. Paul C. Tetlock, 2007. "Giving Content to Investor Sentiment: The Role of Media in the Stock Market," Journal of Finance, American Finance Association, vol. 62(3), pages 1139-1168, June.
    3. Werner Antweiler & Murray Z. Frank, 2004. "Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards," Journal of Finance, American Finance Association, vol. 59(3), pages 1259-1294, June.
    4. Zhi Da & Joseph Engelberg & Pengjie Gao, 2011. "In Search of Attention," Journal of Finance, American Finance Association, vol. 66(5), pages 1461-1499, October.

    Cited by:

    1. Thomas Forss & Peter Sarlin, 2017. "News-sentiment networks as a risk indicator," Papers 1706.05812.
    2. Stanislaus Maier-Paape & Andreas Platen, 2015. "Lead-Lag Relationship using a Stop-and-Reverse-MinMax Process," Papers 1504.06235.
    3. Ji, Qiang & Guo, Jian-Feng, 2015. "Oil price volatility and oil-related events: An Internet concern study perspective," Applied Energy, Elsevier, vol. 137(C), pages 256-264.
    4. Jaroslav Bukovina, 2016. "Social Media and Capital Markets – an Overview," MENDELU Working Papers in Business and Economics 2016-57, Mendel University in Brno, Faculty of Business and Economics.
    5. Jaroslav Bukovina, 2015. "Sentiment of a society and large-cap stock liquidity," MENDELU Working Papers in Business and Economics 2015-56, Mendel University in Brno, Faculty of Business and Economics.
    6. Guo, Jian-Feng & Ji, Qiang, 2013. "How does market concern derived from the Internet affect oil prices?," Applied Energy, Elsevier, vol. 112(C), pages 1536-1543.
    7. Gabriele Ranco & Ilaria Bordino & Giacomo Bormetti & Guido Caldarelli & Fabrizio Lillo & Michele Treccani, 2014. "Coupling news sentiment with web browsing data improves prediction of intra-day price dynamics," Papers 1412.3948, revised Dec 2015.
    8. Felix Ming Fai Wong & Zhenming Liu & Mung Chiang, 2014. "Stock Market Prediction from WSJ: Text Mining via Sparse Matrix Factorization," Papers 1406.7330.
