
Regional level influenza study based on Twitter and machine learning method

Authors

  • Hongxin Xue
  • Yanping Bai
  • Hongping Hu
  • Haijian Liang

Abstract

Flu prediction matters because it lets the relevant departments take appropriate preventive and control measures after assessing the predicted data, which can reduce morbidity and mortality. This paper proposes three flu prediction models (models 1-3), based on Twitter data and the US Centers for Disease Control and Prevention's (CDC's) Influenza-Like Illness (ILI) data, to verify the factors that affect the spread of the flu. It also proposes IPSO-SVR, in which an Improved Particle Swarm Optimization (IPSO) algorithm optimizes the parameters of Support Vector Regression (SVR). IPSO-SVR is trained with the independent variables of each of the three models as input and the dependent variables as output, and the trained method is then used to predict the regional unweighted percentage of ILI (%ILI) in the US. The prediction results of each model are analyzed and compared. The results show that the IPSO-SVR method (model 3) performs excellently in real-time prediction of ILI and further highlight the benefits of using real-time Twitter data, thus providing an effective means for the prevention and control of flu.
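
The abstract does not spell out the IPSO update rules or the SVR settings, so the following is only a minimal sketch of the general idea: a standard global-best particle swarm (not the authors' improved variant) searching a two-dimensional hyperparameter box. The function names, the swarm coefficients, and the toy objective that stands in for an SVR validation error are all assumptions made for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO over a box (illustrative, not the paper's IPSO).

    objective: maps a list of floats to a score to minimize.
    bounds:    list of (low, high) pairs, one per dimension.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move, then clamp the particle back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    # Hypothetical stand-in for "SVR validation error at (log10 C, log10 gamma)".
    # In a real setting this would train an SVR and score it on held-out data.
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_err = pso_minimize(
    toy_validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)]
)
```

Replacing `toy_validation_error` with a function that fits a regressor on training weeks and returns its error on held-out weeks would turn this into a hyperparameter search; searching in log space is a common choice because useful SVR parameter values often span several orders of magnitude.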

Suggested Citation

  • Hongxin Xue & Yanping Bai & Hongping Hu & Haijian Liang, 2019. "Regional level influenza study based on Twitter and machine learning method," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-23, April.
  • Handle: RePEc:plo:pone00:0215600
    DOI: 10.1371/journal.pone.0215600

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0215600
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0215600&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Tatiana Petukhova & Davor Ojkic & Beverly McEwen & Rob Deardon & Zvonimir Poljak, 2018. "Assessment of autoregressive integrated moving average (ARIMA), generalized linear autoregressive moving average (GLARMA), and random forest (RF) time series regression models for predicting influenza," PLOS ONE, Public Library of Science, vol. 13(6), pages 1-17, June.
    2. Declan Butler, 2013. "When Google got flu wrong," Nature, Nature, vol. 494(7436), pages 155-156, February.
    3. Eui-Ki Kim & Jong Hyeon Seok & Jang Seok Oh & Hyong Woo Lee & Kyung Hyun Kim, 2013. "Use of Hangeul Twitter to Track and Predict Human Influenza Infection," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-11, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Soo Beom Choi & Insung Ahn, 2020. "Forecasting seasonal influenza-like illness in South Korea after 2 and 30 weeks using Google Trends and influenza data from Argentina," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-14, July.
    2. Steven Heston & Nitish R. Sinha, 2016. "News versus Sentiment : Predicting Stock Returns from News Stories," Finance and Economics Discussion Series 2016-048, Board of Governors of the Federal Reserve System (U.S.).
    3. Zeynep Ertem & Dorrie Raymond & Lauren Ancel Meyers, 2018. "Optimal multi-source forecasting of seasonal influenza," PLOS Computational Biology, Public Library of Science, vol. 14(9), pages 1-16, September.
    4. Ibrahim Musa & Hyun Woo Park & Lkhagvadorj Munkhdalai & Keun Ho Ryu, 2018. "Global Research on Syndromic Surveillance from 1993 to 2017: Bibliometric Analysis and Visualization," Sustainability, MDPI, vol. 10(10), pages 1-20, September.
    5. Rivera, Roberto, 2016. "A dynamic linear model to forecast hotel registrations in Puerto Rico using Google Trends data," Tourism Management, Elsevier, vol. 57(C), pages 12-20.
    6. Nataliya Shakhovska & Ivan Izonin & Nataliia Melnykova, 2021. "The Hierarchical Classifier for COVID-19 Resistance Evaluation," Data, MDPI, vol. 6(1), pages 1-17, January.
    7. Jiachen Sun & Peter A. Gloor, 2021. "Assessing the Predictive Power of Online Social Media to Analyze COVID-19 Outbreaks in the 50 U.S. States," Future Internet, MDPI, vol. 13(7), pages 1-13, July.
    8. Daniel E. O'Leary & Veda C. Storey, 2020. "A Google–Wikipedia–Twitter Model as a Leading Indicator of the Numbers of Coronavirus Deaths," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 27(3), pages 151-158, July.
    9. Jun, Seung-Pyo & Park, Do-Hyung, 2016. "Consumer information search behavior and purchasing decisions: Empirical evidence from Korea," Technological Forecasting and Social Change, Elsevier, vol. 107(C), pages 97-111.
    10. Schaer, Oliver & Kourentzes, Nikolaos & Fildes, Robert, 2019. "Demand forecasting with user-generated online information," International Journal of Forecasting, Elsevier, vol. 35(1), pages 197-212.
    11. Jun, Seung-Pyo & Yoo, Hyoung Sun & Lee, Jae-Seong, 2021. "The impact of the pandemic declaration on public awareness and behavior: Focusing on COVID-19 google searches," Technological Forecasting and Social Change, Elsevier, vol. 166(C).
    12. Jose Ramon Albert & Arturo Martinez Jr. & Katrina Miradora & Jan Arvin Lapuz & Marymell Martillan & Criselda De Dios & Iva Sebastian-Samaniego, 2019. "Readiness of National Statistical Systems in Asia and the Pacific for Leveraging Big Data to Monitor the SDGs," Working Papers id:13017, eSocialSciences.
    13. Peter Congdon, 2022. "A spatio-temporal autoregressive model for monitoring and predicting COVID infection rates," Journal of Geographical Systems, Springer, vol. 24(4), pages 583-610, October.
    14. Woloszko, Nicolas, 2024. "Nowcasting with panels and alternative data: The OECD weekly tracker," International Journal of Forecasting, Elsevier, vol. 40(4), pages 1302-1335.
    15. Zhijuan Song & Xiaocan Jia & Junzhe Bao & Yongli Yang & Huili Zhu & Xuezhong Shi, 2021. "Spatio-Temporal Analysis of Influenza-Like Illness and Prediction of Incidence in High-Risk Regions in the United States from 2011 to 2020," IJERPH, MDPI, vol. 18(13), pages 1-14, July.
    16. Katsikopoulos, Konstantinos V. & Şimşek, Özgür & Buckmann, Marcus & Gigerenzer, Gerd, 2022. "Transparent modeling of influenza incidence: Big data or a single data point from psychological theory?," International Journal of Forecasting, Elsevier, vol. 38(2), pages 613-619.
    17. Dushmanta Kumar Padhi & Neelamadhab Padhy & Akash Kumar Bhoi & Jana Shafi & Muhammad Fazal Ijaz, 2021. "A Fusion Framework for Forecasting Financial Market Direction Using Enhanced Ensemble Models and Technical Indicators," Mathematics, MDPI, vol. 9(21), pages 1-31, October.
    18. Kui Liu & Li Li & Tao Jiang & Bin Chen & Zhenggang Jiang & Zhengting Wang & Yongdi Chen & Jianmin Jiang & Hua Gu, 2016. "Chinese Public Attention to the Outbreak of Ebola in West Africa: Evidence from the Online Big Data Platform," IJERPH, MDPI, vol. 13(8), pages 1-15, August.
    19. Jiangpeng Chen & Xun Lei & Li Zhang & Bin Peng, 2015. "Using Extreme Value Theory Approaches to Forecast the Probability of Outbreak of Highly Pathogenic Influenza in Zhejiang, China," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-10, February.
    20. Yulin Hswen & Alyssa J. Moran & Siona Prasad & Anna Li & Denise Simon & Lauren Cleveland & Jared B. Hawkins & John S. Brownstein & Jason Block, 2021. "The Federal Menu Labeling Law and Twitter Discussions about Calories in the United States: An Interrupted Time-Series Analysis," IJERPH, MDPI, vol. 18(20), pages 1-11, October.
