IDEAS home Printed from https://ideas.repec.org/a/spr/annopr/v318y2022i1d10.1007_s10479-022-04892-0.html

Machine learning to establish proxies for investor attention: evidence of improved stock-return prediction

Authors

  • Gang Chu (Tianjin University)
  • John W. Goodell (University of Akron)
  • Dehua Shen (Nankai University, Jinnan District)
  • Yongjie Zhang (Tianjin University)

Abstract

It is widely recognized that the limited attention capacity of individual investors affects stock performance. We construct five aggregate investor attention indices for each stock by extracting common information components related to stock returns from various attention proxies using equal-weighted (EW), principal component analysis (PCA), partial least squares (PLS), gradient boosting decision tree (GBDT), and random forest (RF) methods. In a sample of all Shanghai Stock Exchange 50 constituent stocks, we identify two attention indices constructed by machine learning algorithms, RF and GBDT, that provide economically meaningful enhanced prediction of stock returns in both in-sample and out-of-sample periods. Moreover, these indices are negatively related to return volatility. The results suggest the utility of using machine learning to form proxies of investor attention and reveal the excellent forecasting power of these proxies in asset pricing.
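
To make the idea of an aggregate attention index concrete, the sketch below builds two of the five index types named in the abstract, the equal-weighted (EW) index and a one-component partial least squares (PLS) index, from a handful of attention proxies. It is an illustration only and not the authors' implementation: the proxy names (`search_volume`, `news_count`, `forum_posts`), the toy numbers, and the choice of a single PLS component computed from standardized series are all assumptions. The PCA, GBDT and RF variants are omitted because they would need a linear-algebra or machine-learning library.

```python
from statistics import mean, pstdev


def standardize(series):
    """Return z-scores of a series (population standard deviation)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def equal_weighted_index(proxies):
    """EW index: average of the standardized proxies at each date."""
    z = [standardize(p) for p in proxies]
    return [mean(column) for column in zip(*z)]


def pls_index(proxies, next_returns):
    """One-component PLS index.

    Each standardized proxy is weighted by its covariance with the
    standardized next-period return; weights are scaled to unit length.
    """
    z = [standardize(p) for p in proxies]
    r = standardize(next_returns)
    n = len(r)
    weights = [sum(a * b for a, b in zip(zp, r)) / n for zp in z]
    norm = sum(w * w for w in weights) ** 0.5
    if norm > 0:
        weights = [w / norm for w in weights]
    index = [sum(w * zp[t] for w, zp in zip(weights, z)) for t in range(n)]
    return index, weights


# Hypothetical toy data for one stock over five periods (assumed values).
search_volume = [1.0, 2.0, 3.0, 4.0, 5.0]
news_count = [2.0, 1.0, 4.0, 3.0, 5.0]
forum_posts = [3.0, 3.0, 1.0, 5.0, 2.0]
next_returns = [0.1, 0.2, 0.3, 0.4, 0.5]  # return in the following period

proxies = [search_volume, news_count, forum_posts]
ew_index = equal_weighted_index(proxies)
pls_idx, pls_weights = pls_index(proxies, next_returns)

print("EW index: ", [round(v, 3) for v in ew_index])
print("PLS index:", [round(v, 3) for v in pls_idx])
print("PLS weights:", [round(w, 3) for w in pls_weights])
```

The EW index treats every proxy alike, whereas the PLS weights lean toward the proxies that co-move with subsequent returns. In this toy data the third proxy is uncorrelated with the next-period return and so receives a weight of essentially zero.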

Suggested Citation

  • Gang Chu & John W. Goodell & Dehua Shen & Yongjie Zhang, 2022. "Machine learning to establish proxies for investor attention: evidence of improved stock-return prediction," Annals of Operations Research, Springer, vol. 318(1), pages 103-128, November.
  • Handle: RePEc:spr:annopr:v:318:y:2022:i:1:d:10.1007_s10479-022-04892-0
    DOI: 10.1007/s10479-022-04892-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-022-04892-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations



    Cited by:

    1. Chu, Gang & Dowling, Michael & Shen, Dehua & Zhang, Yongjie, 2023. "Information demand density matters: Evidence from the post-earnings announcement drift," International Review of Financial Analysis, Elsevier, vol. 86(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Goodell, John W. & Kumar, Satish & Li, Xiao & Pattnaik, Debidutta & Sharma, Anuj, 2022. "Foundations and research clusters in investor attention: Evidence from bibliometric and topic modelling analysis," International Review of Economics & Finance, Elsevier, vol. 82(C), pages 511-529.
    2. Cheng, Feiyang & Chiao, Chaoshin & Wang, Chunfeng & Fang, Zhenming & Yao, Shouyu, 2021. "Does retail investor attention improve stock liquidity? A dynamic perspective," Economic Modelling, Elsevier, vol. 94(C), pages 170-183.
    3. Christophe Desagre & Catherine D'Hondt, 2020. "Googlization and retail investors' trading activity," LIDAM Discussion Papers LFIN 2020004, Université catholique de Louvain, Louvain Finance (LFIN).
    4. Ramos, Sofia B. & Latoeiro, Pedro & Veiga, Helena, 2020. "Limited attention, salience of information and stock market activity," Economic Modelling, Elsevier, vol. 87(C), pages 92-108.
    5. Desagre, Christophe & D’Hondt, Catherine, 2021. "Googlization and retail trading activity," Journal of Behavioral and Experimental Finance, Elsevier, vol. 29(C).
    6. Qadan, Mahmoud & Zoua’bi, Maher, 2019. "Financial attention and the demand for information," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 82(C).
    7. Tihana Škrinjarić, 2019. "Time Varying Spillovers between the Online Search Volume and Stock Returns: Case of CESEE Markets," IJFS, MDPI, vol. 7(4), pages 1-30, October.
    8. Dong, Dayong & Wu, Keke & Fang, Jianchun & Gozgor, Giray & Yan, Cheng, 2022. "Investor attention factors and stock returns: Evidence from China," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 77(C).
    9. Hervé, Fabrice & Zouaoui, Mohamed & Belvaux, Bertrand, 2019. "Noise traders and smart money: Evidence from online searches," Economic Modelling, Elsevier, vol. 83(C), pages 141-149.
    10. Chen, Zhongdong & Craig, Karen Ann, 2023. "Active attention, retail investor base, and stock returns," Journal of Behavioral and Experimental Finance, Elsevier, vol. 39(C).
    11. Georgios Bampinas & Theodore Panagiotidis & Christina Rouska, 2019. "Volatility persistence and asymmetry under the microscope: the role of information demand for gold and oil," Scottish Journal of Political Economy, Scottish Economic Society, vol. 66(1), pages 180-197, February.
    12. Adam Zaremba & Jacob Koby Shemer, 2018. "Price-Based Investment Strategies," Springer Books, Springer, number 978-3-319-91530-2, December.
    13. Cai, Wenwu & Lu, Jing, 2019. "Investors’ financial attention frequency and trading activity," Pacific-Basin Finance Journal, Elsevier, vol. 58(C).
    14. Ahmad, Fawad & Oriani, Raffaele, 2022. "Investor attention, information acquisition, and value premium: A mispricing perspective," International Review of Financial Analysis, Elsevier, vol. 79(C).
    15. Chaiyuth Padungsaksawasdi & Sirimon Treepongkaruna & Robert Brooks, 2019. "Investor Attention and Stock Market Activities: New Evidence from Panel Data," IJFS, MDPI, vol. 7(2), pages 1-19, June.
    16. Blankespoor, Elizabeth & deHaan, Ed & Marinovic, Iván, 2020. "Disclosure processing costs, investors’ information choice, and equity market outcomes: A review," Journal of Accounting and Economics, Elsevier, vol. 70(2).
    17. Tripathi, Abhinava & Pandey, Ashish, 2021. "Information dissemination across global markets during the spread of COVID-19 pandemic," International Review of Economics & Finance, Elsevier, vol. 74(C), pages 103-115.
    18. González-Fernández, Marcos & González-Velasco, Carmen, 2020. "A sentiment index to measure sovereign risk using Google data," International Review of Economics & Finance, Elsevier, vol. 69(C), pages 406-418.
    19. Vozlyublennaia, Nadia, 2014. "Investor attention, index performance, and return predictability," Journal of Banking & Finance, Elsevier, vol. 41(C), pages 17-35.
    20. Cai, Haidong & Jiang, Ying & Liu, Xiaoquan, 2022. "Investor attention, aggregate limit-hits, and stock returns," International Review of Financial Analysis, Elsevier, vol. 83(C).
