Printed from https://ideas.repec.org/a/gam/jsusta/v13y2021i6p3099-d515306.html

A Model for Rapid Selection and COVID-19 Prediction with Dynamic and Imbalanced Data

Author

Listed:
  • Jeonghun Kim

    (Department of Management, Kyung Hee University, Seoul 02447, Korea)

  • Ohbyung Kwon

    (School of Management, Kyung Hee University, Seoul 02447, Korea)

Abstract

The COVID-19 pandemic is threatening our quality of life and economic sustainability. Its rapid spread around the world requires each country or region to establish appropriate anti-proliferation policies in a timely manner. Making COVID-19-related health policy decisions requires predicting the number of confirmed COVID-19 patients as accurately and quickly as possible. Predictions are already being made with traditional models such as the susceptible, infected, and recovered (SIR) and susceptible, exposed, infected, and resistant (SEIR) frameworks, but these predictions may be inaccurate because the models are simple, so a prediction model with more diverse input features is needed. However, it is difficult to propose a single predictive model that works globally because data availability differs by country and region. Moreover, the training data for predicting confirmed patients is typically an imbalanced dataset consisting mostly of normal data, and this imbalance reduces prediction accuracy. Hence, this study aims to extract rules for selecting appropriate prediction algorithms and data imbalance resolution methods according to the characteristics of the datasets available for each country or region, and to predict the number of COVID-19 patients with the selected algorithms. To this end, a decision tree-type rule was extracted to identify 13 data characteristics, and a discrimination algorithm was selected based on those characteristics. With this system, we predicted the COVID-19 situation in four regions: Africa, China, Korea, and the United States. The proposed method achieves higher prediction accuracy than the random selection method, the ensemble method, or the greedy method of discriminant analysis, and prediction takes very little time.
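
For context on the "simplicity" attributed to the traditional models, the standard SIR framework tracks only three compartments and two rates: dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I. The following is a minimal sketch of that textbook model, not the authors' method. The function name `simulate_sir`, the forward Euler integration, and all parameter values (population, beta, gamma, horizon) are illustrative assumptions and are not taken from the paper.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the textbook SIR equations with forward Euler steps.

    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    beta is the transmission rate and gamma the recovery rate (per day).
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical parameters chosen only for illustration.
history = simulate_sir(
    population=1_000_000, initial_infected=10, beta=0.3, gamma=0.1, days=160
)
# Day on which the infected compartment is largest.
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("peak day:", peak_day, "infected at peak:", round(history[peak_day][1]))
```

The whole epidemic curve here is determined by `beta`, `gamma`, and the initial state, which illustrates why the abstract argues for models that can take more diverse input features.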

Suggested Citation

  • Jeonghun Kim & Ohbyung Kwon, 2021. "A Model for Rapid Selection and COVID-19 Prediction with Dynamic and Imbalanced Data," Sustainability, MDPI, vol. 13(6), pages 1-18, March.
  • Handle: RePEc:gam:jsusta:v:13:y:2021:i:6:p:3099-:d:515306

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/13/6/3099/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/13/6/3099/
    Download Restriction: no


    Citations


    Cited by:

    1. Francesco Pilati & Riccardo Tronconi & Giandomenico Nollo & Sunderesh S. Heragu & Florian Zerzer, 2021. "Digital Twin of COVID-19 Mass Vaccination Centers," Sustainability, MDPI, vol. 13(13), pages 1-26, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marin, Giovanni & Vona, Francesco, 2023. "Finance and the reallocation of scientific, engineering and mathematical talent," Research Policy, Elsevier, vol. 52(5).
    2. Müller, Raphael & Spengel, Christoph & Weck, Stefan, 2021. "How do investors value the publication of tax information? Evidence from the European public country-by-country reporting," ZEW Discussion Papers 21-077, ZEW - Leibniz Centre for European Economic Research.
    3. Zahra Dehghan Shabani & Rouhollah Shahnazi, 2020. "Spatial distribution dynamics and prediction of COVID‐19 in Asian countries: spatial Markov chain approach," Regional Science Policy & Practice, Wiley Blackwell, vol. 12(6), pages 1005-1025, December.
    4. Singhal, Amit & Singh, Pushpendra & Lall, Brejesh & Joshi, Shiv Dutt, 2020. "Modeling and prediction of COVID-19 pandemic using Gaussian mixture model," Chaos, Solitons & Fractals, Elsevier, vol. 138(C).
    5. Lars Christian Bruno & Riana Steen, 2022. "Norwegian oil market concentration and its effects on the oil service companies 1993–2013," Scottish Journal of Political Economy, Scottish Economic Society, vol. 69(2), pages 242-262, May.
    6. Luca Bonacini & Giovanni Gallo & Fabrizio Patriarca, 2021. "Identifying policy challenges of COVID-19 in hardly reliable data and judging the success of lockdown measures," Journal of Population Economics, Springer;European Society for Population Economics, vol. 34(1), pages 275-301, January.
    7. Filippo Bontadini & Francesco Vona, 2020. "Anatomy of Green Specialization: Evidence from EU Production Data, 1995-2015," Working Papers hal-03403070, HAL.
    8. Weikang Zhang & Isabel K. M. Yan & Yin-Wong Cheung, 2023. "The COVID-19 pandemics and import demand elasticities: evidence from China’s customs data," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-25, December.
    9. Amaral, Marco A. & Oliveira, Marcelo M. de & Javarone, Marco A., 2021. "An epidemiological model with voluntary quarantine strategies governed by evolutionary game dynamics," Chaos, Solitons & Fractals, Elsevier, vol. 143(C).
    10. Clement Tisdell & Mohammad Alauddin & Md. Abdur Rashid Sarker & Md Anwarul Kabir, 2019. "Agricultural Diversity and Sustainability: General Features and Bangladeshi Illustrations," Sustainability, MDPI, vol. 11(21), pages 1-22, October.
    11. Swapnarekha, H. & Behera, Himansu Sekhar & Nayak, Janmenjoy & Naik, Bighnaraj, 2020. "Role of intelligent computing in COVID-19 prognosis: A state-of-the-art review," Chaos, Solitons & Fractals, Elsevier, vol. 138(C).
    12. Huang, Chiou-Jye & Shen, Yamin & Kuo, Ping-Huan & Chen, Yung-Hsiang, 2022. "Novel spatiotemporal feature extraction parallel deep neural network for forecasting confirmed cases of coronavirus disease 2019," Socio-Economic Planning Sciences, Elsevier, vol. 80(C).
    13. Ausloos, Marcel, 2020. "Rank–size law, financial inequality indices and gain concentrations by cyclist teams. The case of a multiple stage bicycle race, like Tour de France," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 540(C).
    14. repec:hal:spmain:info:hdl:2441/6m5kss847r91no96hiublu6anu is not listed on IDEAS
    15. Koutsellis, Themistoklis & Nikas, Alexandros, 2020. "A predictive model and country risk assessment for COVID-19: An application of the Limited Failure Population concept," Chaos, Solitons & Fractals, Elsevier, vol. 140(C).
    16. Byeongki Jeong & Janghyeok Yoon, 2017. "Competitive Intelligence Analysis of Augmented Reality Technology Using Patent Information," Sustainability, MDPI, vol. 9(4), pages 1-22, March.
    17. Filippo Bontadini & Francesco Vona, 2020. "Anatomy of Green Specialization: Evidence from EU Production Data, 1995-2015," SciencePo Working papers Main hal-03403070, HAL.
    18. Parbat, Debanjan & Chakraborty, Monisha, 2020. "A python based support vector regression model for prediction of COVID19 cases in India," Chaos, Solitons & Fractals, Elsevier, vol. 138(C).
    19. Henry Kankwamba & Mariam Kadzamira & Karl Pauw, 2018. "How diversified is cropping in Malawi? Patterns, determinants and policy implications," Food Security: The Science, Sociology and Economics of Food Production and Access to Food, Springer;The International Society for Plant Pathology, vol. 10(2), pages 323-338, April.
    20. Dalton Garcia Borges de Souza & Erivelton Antonio dos Santos & Francisco Tarcísio Alves Júnior & Mariá Cristina Vasconcelos Nascimento, 2021. "On Comparing Cross-Validated Forecasting Models with a Novel Fuzzy-TOPSIS Metric: A COVID-19 Case Study," Sustainability, MDPI, vol. 13(24), pages 1-25, December.
    21. Frode Eika Sandnes, 2021. "Everyone onboard? Participation ratios as a metric for research activity assessments within young universities," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6105-6113, July.
