Forecasting stock price movements with multiple data sources: Evidence from stock market in China

Authors

  • Zhou, Zhongbao
  • Gao, Meng
  • Liu, Qing
  • Xiao, Helu

Abstract

We employ multiple heterogeneous data sources, including historical transaction data, technical indicators, stock posts, news and the Baidu index, to predict the direction of stock price movements. We focus on the distinct prediction patterns of active and inactive stocks, and we examine the predictive power of the support vector machine (SVM) at different levels of activity for a single stock. We construct a total of 14 data source combinations from the above 5 heterogeneous data sources and choose three forecasting horizons, namely 1 day, 2 days and 3 days, so that we can investigate how well stock price movements in the China A-share market can be forecast under different data source combinations and forecasting horizons. We find that the optimal data source combinations for active and inactive stocks differ. Active stocks achieve the highest accuracy when multiple non-traditional data sources are combined, while inactive stocks achieve the highest accuracy when traditional data sources are combined with non-traditional ones. We further divide each stock's history into inactive, active and very active periods, and compare the forecast performance for the same stock across these periods. We conclude that, for most combinations of data sources, the more active the stock is, the higher the accuracy we achieve, which indicates that our approach is more powerful for predicting the price movements of stocks in active and very active periods.
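
The experimental grid described in the abstract (data source combinations crossed with forecasting horizons) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the source identifiers, the made-up closing prices, and the labeling rule (1 if the close `horizon` days ahead is higher than the current close, else 0) are not taken from the paper. The abstract also does not say which 14 combinations the authors selected, so the sketch only enumerates all 31 non-empty subsets of the 5 sources.

```python
from itertools import combinations

# The five data sources named in the abstract; identifiers are illustrative.
SOURCES = ["transactions", "technical_indicators", "stock_posts",
           "news", "baidu_index"]
HORIZONS = (1, 2, 3)  # forecasting horizons in days, as in the abstract


def direction_labels(closes, horizon):
    """Label each day 1 if the close `horizon` days later is higher, else 0.

    This labeling rule is an assumption; the abstract does not define it.
    The last `horizon` days have no label because their future is unknown.
    """
    return [1 if closes[t + horizon] > closes[t] else 0
            for t in range(len(closes) - horizon)]


def all_source_combinations(sources):
    """Return every non-empty subset of the sources as a tuple."""
    return [combo
            for size in range(1, len(sources) + 1)
            for combo in combinations(sources, size)]


# Made-up closing prices, used only to show the label shapes per horizon.
closes = [10.0, 10.2, 10.1, 10.4, 10.3]
labels = {h: direction_labels(closes, h) for h in HORIZONS}
candidate_combos = all_source_combinations(SOURCES)
```

In a setup like this, each (combination, horizon) pair would get its own classifier, trained on the features from the chosen sources against the labels for that horizon. The paper reports using an SVM for that step.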

Suggested Citation

  • Zhou, Zhongbao & Gao, Meng & Liu, Qing & Xiao, Helu, 2020. "Forecasting stock price movements with multiple data sources: Evidence from stock market in China," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 542(C).
  • Handle: RePEc:eee:phsmap:v:542:y:2020:i:c:s0378437119318941
    DOI: 10.1016/j.physa.2019.123389

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0378437119318941
    Download Restriction: Full text for ScienceDirect subscribers only. The journal offers the option of making the article available online on ScienceDirect for a fee of $3,000.

    References listed on IDEAS

    1. Hyejung Chung & Kyung-shik Shin, 2018. "Genetic Algorithm-Optimized Long Short-Term Memory Network for Stock Market Prediction," Sustainability, MDPI, vol. 10(10), pages 1-18, October.
    2. Cao, Jian & Li, Zhi & Li, Jian, 2019. "Financial time series forecasting model based on CEEMDAN and LSTM," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 519(C), pages 127-139.
    3. Yong Jiang & Zhongbao Zhou, 2018. "Does the time horizon of the return predictive effect of investor sentiment vary with stock characteristics? A Granger causality analysis in the frequency domain," Papers 1803.02962, arXiv.org.
    4. Renault, Thomas, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Journal of Banking & Finance, Elsevier, vol. 84(C), pages 25-40.
    5. Basak, Suryoday & Kar, Saibal & Saha, Snehanshu & Khaidem, Luckyson & Dey, Sudeepa Roy, 2019. "Predicting the direction of stock market prices using tree-based classifiers," The North American Journal of Economics and Finance, Elsevier, vol. 47(C), pages 552-567.
    6. Xi Zhang & Yunjia Zhang & Senzhang Wang & Yuntao Yao & Binxing Fang & Philip S. Yu, 2018. "Improving Stock Market Prediction via Heterogeneous Information Fusion," Papers 1801.00588, arXiv.org.
    7. Michael H. Breitner & Christian Dunis & Hans-Jörg Mettenheim & Christopher Neely & Georgios Sermpinis & Rafael Rosillo & Javier Giner & David De la Fuente, 2014. "Stock Market Simulation Using Support Vector Machines," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 33(6), pages 488-500, September.
    8. Wei, Yu-Chen & Lu, Yang-Cheng & Chen, Jen-Nan & Hsu, Yen-Ju, 2017. "Informativeness of the market news sentiment in the Taiwan stock market," The North American Journal of Economics and Finance, Elsevier, vol. 39(C), pages 158-181.
    9. Zhou, Zhongbao & Lin, Ling & Li, Shuxian, 2018. "International stock market contagion: A CEEMDAN wavelet analysis," Economic Modelling, Elsevier, vol. 72(C), pages 333-352.
    10. Zhang, Ningning & Lin, Aijing & Shang, Pengjian, 2017. "Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 477(C), pages 161-173.
    11. Hong Zhang & Lixing Chen & Yong Qu & Guo Zhao & Zhenwei Guo, 2014. "Support Vector Regression Based on Grid-Search Method for Short-Term Wind Power Forecasting," Journal of Applied Mathematics, Hindawi, vol. 2014, pages 1-11, June.
    12. Prasad Sankar Bhattacharya & Dimitrios D. Thomakos, 2018. "Robust model rankings of forecasting performance," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 37(6), pages 676-690, September.

    Citations

    Cited by:

    1. Day, Min-Yuh & Ni, Yensen, 2023. "Do clean energy indices outperform using contrarian strategies based on contrarian trading rules?," Energy, Elsevier, vol. 272(C).
    2. Xiao, Helu & Ren, Tiantian & Zhou, Zhongbao & Liu, Wenbin, 2021. "Parameter uncertainty in estimation of portfolio efficiency: Evidence from an interval diversification-consistent DEA approach," Omega, Elsevier, vol. 103(C).
    3. Zhou, Zhongbao & Gao, Meng & Xiao, Helu & Wang, Rui & Liu, Wenbin, 2021. "Big data and portfolio optimization: A novel approach integrating DEA with multiple data sources," Omega, Elsevier, vol. 104(C).
    4. Seddigh, Mohammad Reza & Targholizadeh, Aida & Shokouhyar, Sajjad & Shokoohyar, Sina, 2023. "Social media and expert analysis cast light on the mechanisms of underlying problems in pharmaceutical supply chain: An exploratory approach," Technological Forecasting and Social Change, Elsevier, vol. 191(C).
    5. Marian Pompiliu Cristescu & Raluca Andreea Nerisanu & Dumitru Alexandru Mara & Simona-Vasilica Oprea, 2022. "Using Market News Sentiment Analysis for Stock Market Prediction," Mathematics, MDPI, vol. 10(22), pages 1-12, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou, Zhongbao & Gao, Meng & Xiao, Helu & Wang, Rui & Liu, Wenbin, 2021. "Big data and portfolio optimization: A novel approach integrating DEA with multiple data sources," Omega, Elsevier, vol. 104(C).
    2. Kamaladdin Fataliyev & Aneesh Chivukula & Mukesh Prasad & Wei Liu, 2021. "Stock Market Analysis with Text Data: A Review," Papers 2106.12985, arXiv.org, revised Jul 2021.
    3. Min Liu & Wei‐Chong Choo & Chi‐Chuan Lee & Chien‐Chiang Lee, 2023. "Trading volume and realized volatility forecasting: Evidence from the China stock market," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(1), pages 76-100, January.
    4. Wang, Gaoshan & Yu, Guangjin & Shen, Xiaohong, 2021. "The effect of online environmental news on green industry stocks: The mediating role of investor sentiment," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 573(C).
    5. Lin, Yu & Yan, Yan & Xu, Jiali & Liao, Ying & Ma, Feng, 2021. "Forecasting stock index price using the CEEMDAN-LSTM model," The North American Journal of Economics and Finance, Elsevier, vol. 57(C).
    6. Cristescu Marian Pompiliu & Nerişanu Raluca Andreea & Mara Dumitru Alexandru, 2022. "Using Data Mining in the Sentiment Analysis Process on the Financial Market," Journal of Social and Economic Statistics, Sciendo, vol. 11(1-2), pages 36-58, December.
    7. Nawaf Almaskati, 2022. "Machine learning in finance: Major applications, issues, metrics, and future trends," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 9(03), pages 1-32, September.
    8. Baoqiang Zhan & Shu Zhang & Helen S. Du & Xiaoguang Yang, 2022. "Exploring Statistical Arbitrage Opportunities Using Machine Learning Strategy," Computational Economics, Springer;Society for Computational Economics, vol. 60(3), pages 861-882, October.
    9. Henriques, Irene & Sadorsky, Perry, 2023. "Forecasting rare earth stock prices with machine learning," Resources Policy, Elsevier, vol. 86(PA).
    10. Shen, Yiran & Liu, Chang & Sun, Xiaolei & Guo, Kun, 2023. "Investor sentiment and the Chinese new energy stock market: A risk–return perspective," International Review of Economics & Finance, Elsevier, vol. 84(C), pages 395-408.
    11. Zhang, Junsheng & Peng, Zezhi & Zeng, Yamin & Yang, Haisheng, 2023. "Do big data mutual funds outperform?," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 88(C).
    12. Junior, Peterson Owusu & Tiwari, Aviral Kumar & Padhan, Hemachandra & Alagidede, Imhotep, 2020. "Analysis of EEMD-based quantile-in-quantile approach on spot- futures prices of energy and precious metals in India," Resources Policy, Elsevier, vol. 68(C).
    13. Linyi Yang & Yingpeng Ma & Yue Zhang, 2023. "Measuring Consistency in Text-based Financial Forecasting Models," Papers 2305.08524, arXiv.org, revised Jun 2023.
    14. Sumit Saroha & Marta Zurek-Mortka & Jerzy Ryszard Szymanski & Vineet Shekher & Pardeep Singla, 2021. "Forecasting of Market Clearing Volume Using Wavelet Packet-Based Neural Networks with Tracking Signals," Energies, MDPI, vol. 14(19), pages 1-21, September.
    15. Saqib Farid & Rubeena Tashfeen & Tahseen Mohsan & Arsal Burhan, 2023. "Forecasting stock prices using a data mining method: Evidence from emerging market," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 28(2), pages 1911-1917, April.
    16. Thomas Renault, 2020. "Sentiment analysis and machine learning in finance: a comparison of methods and models on one million messages," Digital Finance, Springer, vol. 2(1), pages 1-13, September.
    17. Chao Liu & Fengfeng Gao & Mengwan Zhang & Yuanrui Li & Cun Qian, 2024. "Reference Vector-Based Multiobjective Clustering Ensemble Approach for Time Series Forecasting," Computational Economics, Springer;Society for Computational Economics, vol. 64(1), pages 181-210, July.
    18. Gianluca Anese & Marco Corazza & Michele Costola & Loriana Pelizzon, 2023. "Impact of public news sentiment on stock market index return and volatility," Computational Management Science, Springer, vol. 20(1), pages 1-36, December.
    19. Barboza, Flavio & Altman, Edward, 2024. "Predicting financial distress in Latin American companies: A comparative analysis of logistic regression and random forest models," The North American Journal of Economics and Finance, Elsevier, vol. 72(C).
    20. Wei-Chang Yeh & Yu-Hsin Hsieh & Chia-Ling Huang, 2022. "Newly Developed Flexible Grid Trading Model Combined ANN and SSO algorithm," Papers 2211.12839, arXiv.org.
