Printed from https://ideas.repec.org/a/gam/jijerp/v17y2020i7p2365-d339385.html

Prediction of Number of Cases of 2019 Novel Coronavirus (COVID-19) Using Social Media Search Index

Author

Listed:
  • Lei Qin

    (School of Statistics, University of International Business and Economics, Beijing 100029, China
    These authors have contributed equally to this study (joint primary authors).)

  • Qiang Sun

    (School of Statistics, University of International Business and Economics, Beijing 100029, China)

  • Yidan Wang

    (School of Statistics, University of International Business and Economics, Beijing 100029, China)

  • Ke-Fei Wu

    (Graduate Institute of Business Administration, College of Management, Fu Jen Catholic University, New Taipei City 242, Taiwan)

  • Mingchih Chen

    (Graduate Institute of Business Administration, College of Management, Fu Jen Catholic University, New Taipei City 242, Taiwan)

  • Ben-Chang Shia

    (Research Center of Big Data, College of Management, Taipei Medical University, Taipei 110, Taiwan
    College of Management, Taipei Medical University, Taipei 110, Taiwan
    Executive Master Program of Business Administration in Biotechnology, College of Management, Taipei Medical University, Taipei 110, Taiwan
    These authors have contributed equally to this study (joint primary authors).)

  • Szu-Yuan Wu

    (Department of Food Nutrition and Health Biotechnology, College of Medical and Health Science, Asia University, Taichung 41354, Taiwan
    Division of Radiation Oncology, Lo-Hsu Medical Foundation, Lotung Poh-Ai Hospital, Yilan 265, Taiwan
    Big Data Center, Lo-Hsu Medical Foundation, Lotung Poh-Ai Hospital, Yilan 265, Taiwan
    Department of Healthcare Administration, College of Medical and Health Science, Asia University, Taichung 41354, Taiwan)

Abstract

Predicting the number of new suspected or confirmed cases of novel coronavirus disease 2019 (COVID-19) is crucial to preventing and controlling the COVID-19 outbreak. Social media search indexes (SMSI) for dry cough, fever, chest distress, coronavirus, and pneumonia were collected from 31 December 2019 to 9 February 2020. Data on new suspected COVID-19 cases were collected from 20 January 2020 to 9 February 2020. We used the lagged series of SMSI to predict new suspected COVID-19 case numbers during this period. To avoid overfitting, five methods, namely subset selection, forward selection, lasso regression, ridge regression, and elastic net, were used to estimate coefficients. We selected the optimal method to predict new suspected COVID-19 case numbers from 20 January 2020 to 9 February 2020, and further validated it on new confirmed cases of COVID-19 from 31 December 2019 to 17 February 2020. The new suspected COVID-19 case numbers correlated significantly with the lagged series of SMSI. SMSI could be detected 6–9 days earlier than new suspected cases of COVID-19. The optimal method was subset selection, which had the lowest estimation error and a moderate number of predictors. After validation, the subset selection method's predictions also correlated significantly with new confirmed COVID-19 cases, and SMSI findings on lag day 10 were significantly correlated with new confirmed COVID-19 cases. SMSI could therefore be a significant and effective early predictor of the number of COVID-19 infections, which would enable government health departments to locate potential and high-risk outbreak areas.
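
The idea of correlating case counts with a lagged search-index series can be illustrated with a minimal sketch. This is not the authors' code or data: the function names (pearson, lagged_correlations), the two equal-length series, and the 3-day delay are all assumptions made up for illustration. The sketch only shows how a lead time might be read off from lagged correlations; it does not reproduce the regression methods named in the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lagged_correlations(index, cases, max_lag):
    """Correlate cases[t] with index[t - lag] for lag = 0..max_lag.

    Hypothetical helper: assumes both series cover the same days.
    """
    result = {}
    for lag in range(max_lag + 1):
        # Drop the last `lag` index values and the first `lag` case
        # values so that each case count is paired with the index
        # observed `lag` days earlier.
        shifted_index = index[:len(index) - lag]
        aligned_cases = cases[lag:]
        result[lag] = pearson(shifted_index, aligned_cases)
    return result

# Synthetic data, invented for this example: the case series is the
# search series delayed by 3 days and doubled.
search_index = [1, 2, 4, 7, 11, 9, 6, 4, 3, 2, 2, 1, 1, 1]
new_cases = [0, 0, 0] + [2 * v for v in search_index[:-3]]

correlations = lagged_correlations(search_index, new_cases, max_lag=5)
best_lag = max(correlations, key=correlations.get)
print(best_lag)
```

On this synthetic input the strongest correlation falls at the lag that was built into the data. With real series, the lagged index values would more plausibly enter a penalized or subset-selected regression as candidate predictors, as the abstract describes.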

Suggested Citation

  • Lei Qin & Qiang Sun & Yidan Wang & Ke-Fei Wu & Mingchih Chen & Ben-Chang Shia & Szu-Yuan Wu, 2020. "Prediction of Number of Cases of 2019 Novel Coronavirus (COVID-19) Using Social Media Search Index," IJERPH, MDPI, vol. 17(7), pages 1-14, March.
  • Handle: RePEc:gam:jijerp:v:17:y:2020:i:7:p:2365-:d:339385

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/17/7/2365/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/17/7/2365/
    Download Restriction: no

    Citations


    Cited by:

    1. Abdelrahman E. E. Eltoukhy & Ibrahim Abdelfadeel Shaban & Felix T. S. Chan & Mohammad A. M. Abdel-Aal, 2020. "Data Analytics for Predicting COVID-19 Cases in Top Affected Countries: Observations and Recommendations," IJERPH, MDPI, vol. 17(19), pages 1-25, September.
    2. Taixia Shen & Chao Wang, 2021. "Big Data Technology Applications and the Right to Health in China during the COVID-19 Pandemic," IJERPH, MDPI, vol. 18(14), pages 1-15, July.
    3. Heli Lu & Menglin Xia & Ziyuan Qin & Siqi Lu & Ruimin Guan & Yuna Yang & Changhong Miao & Taizheng Chen, 2022. "The Built Environment Assessment of Residential Areas in Wuhan during the Coronavirus Disease (COVID-19) Outbreak," IJERPH, MDPI, vol. 19(13), pages 1-20, June.
    4. Mengyue Yuan & Tong Liu & Chao Yang, 2022. "Exploring the Relationship among Human Activities, COVID-19 Morbidity, and At-Risk Areas Using Location-Based Social Media Data: Knowledge about the Early Pandemic Stage in Wuhan," IJERPH, MDPI, vol. 19(11), pages 1-22, May.
    5. Abdallah Alsayed & Hayder Sadir & Raja Kamil & Hasan Sari, 2020. "Prediction of Epidemic Peak and Infected Cases for COVID-19 Disease in Malaysia, 2020," IJERPH, MDPI, vol. 17(11), pages 1-15, June.
    6. Taicir Mezghani & Mouna Boujelbène Abbes, 2023. "Forecast the Role of GCC Financial Stress on Oil Market and GCC Financial Markets Using Convolutional Neural Networks," Asia-Pacific Financial Markets, Springer;Japanese Association of Financial Economics and Engineering, vol. 30(3), pages 505-530, September.
    7. Israel Edem Agbehadji & Bankole Osita Awuzie & Alfred Beati Ngowi & Richard C. Millham, 2020. "Review of Big Data Analytics, Artificial Intelligence and Nature-Inspired Computing Models towards Accurate Detection of COVID-19 Pandemic Cases and Contact Tracing," IJERPH, MDPI, vol. 17(15), pages 1-16, July.
    8. Yongzhu Xiong & Yunpeng Wang & Feng Chen & Mingyong Zhu, 2020. "Spatial Statistics and Influencing Factors of the COVID-19 Epidemic at Both Prefecture and County Levels in Hubei Province, China," IJERPH, MDPI, vol. 17(11), pages 1-26, May.
    9. Manuel Hermosilla & Jian Ni & Haizhong Wang & Jin Zhang, 2023. "Leveraging the E-commerce footprint for the surveillance of healthcare utilization," Health Care Management Science, Springer, vol. 26(4), pages 604-625, December.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.