Printed from https://ideas.repec.org/a/plo/pone00/0269685.html

Multi-class classification algorithms for the diagnosis of anemia in an outpatient clinical setting

Author

Listed:
  • Rajan Vohra
  • Abir Hussain
  • Anil Kumar Dudyala
  • Jankisharan Pahareeya
  • Wasiq Khan

Abstract

Anemia is one of the most pressing public health issues worldwide, with iron deficiency a particular concern. Its prevalence is highest in developing countries. The complete blood count (CBC) is a blood test used to diagnose anemia. While earlier studies have framed diagnosis as a binary classification problem, this paper frames it as a multi-class problem with three classes: mild, moderate and severe. These classes were chosen because World Health Organization (WHO) guidelines formalize this categorization based on haemoglobin (HGB) values, here taken from the sampled patients in the CBC data set. The CBC test data were collected in an outpatient clinical setting in India. We used feature selection with majority voting to identify the key attributes in the input patient data set. In addition, since the original data set was imbalanced, we used the Synthetic Minority Oversampling Technique (SMOTE) to balance it. Four data sets, including the original, were used in the experiments. Six standard machine learning algorithms were applied to the four data sets to perform multi-class classification. The algorithms were benchmarked, and the results tabulated, using both 10-fold cross-validation and hold-out methods. The experimental results indicated that the multilayer perceptron network predominantly gave good recall values for the mild and moderate classes, which are the early and middle stages of the disease. With a good prediction model at the early stages, medical intervention can prevent further deterioration into the severe stage, or supplements can be recommended to address the problem.
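
The balancing step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote_oversample`, the two-feature rows, the class sizes and the choice of `k` are all assumptions made for illustration, and the values are not taken from the paper's data set.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating between minority-class neighbours.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base row (itself excluded)
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Assumed toy rows of the form [HGB, RBC]; illustrative values only
severe = [[6.5, 2.9], [7.1, 3.1], [7.8, 3.3]]
n_majority = 8  # assumed size of the largest class
new_rows = smote_oversample(severe, n_majority - len(severe), k=2)
balanced_severe = severe + new_rows
print(len(balanced_severe))
```

Because every synthetic row is a convex combination of two real minority rows, it stays within the range of the observed minority values. In practice, oversampling would be applied to the training folds only, so that synthetic rows never leak into the cross-validation or hold-out test data.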

Suggested Citation

  • Rajan Vohra & Abir Hussain & Anil Kumar Dudyala & Jankisharan Pahareeya & Wasiq Khan, 2022. "Multi-class classification algorithms for the diagnosis of anemia in an outpatient clinical setting," PLOS ONE, Public Library of Science, vol. 17(7), pages 1-18, July.
  • Handle: RePEc:plo:pone00:0269685
    DOI: 10.1371/journal.pone.0269685

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0269685
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0269685&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0269685?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W. & Lessmann, Stefan, 2020. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1563-1578.
    2. Döpke, Jörg & Fritsche, Ulrich & Pierdzioch, Christian, 2017. "Predicting recessions with boosted regression trees," International Journal of Forecasting, Elsevier, vol. 33(4), pages 745-759.
    3. Theresa Maria Rausch & Tobias Albrecht & Daniel Baier, 2022. "Beyond the beaten paths of forecasting call center arrivals: on the use of dynamic harmonic regression with predictor variables," Journal of Business Economics, Springer, vol. 92(4), pages 675-706, May.
    4. Smirnov, Dmitry & Huchzermeier, Arnd, 2020. "Analytics for labor planning in systems with load-dependent service times," European Journal of Operational Research, Elsevier, vol. 287(2), pages 668-681.
    5. Makridakis, Spyros & Hyndman, Rob J. & Petropoulos, Fotios, 2020. "Forecasting in social settings: The state of the art," International Journal of Forecasting, Elsevier, vol. 36(1), pages 15-28.
    6. Seungwook Kim & Daeyoung Choi & Eunjung Lee & Wonjong Rhee, 2017. "Churn prediction of mobile and online casual games using play log data," PLOS ONE, Public Library of Science, vol. 12(7), pages 1-19, July.
    7. Maher Selim & Ryan Zhou & Wenying Feng & Peter Quinsey, 2021. "Estimating Energy Forecasting Uncertainty for Reliable AI Autonomous Smart Grid Design," Energies, MDPI, vol. 14(1), pages 1-15, January.
    8. Barrow, Devon K. & Crone, Sven F., 2016. "A comparison of AdaBoost algorithms for time series forecast combination," International Journal of Forecasting, Elsevier, vol. 32(4), pages 1103-1119.
    9. Samuel Atuahene & Yukun Bao & Patricia Semwaah Gyan & Yao Yevenyo Ziggah, 2019. "Accurate Forecast Improvement Approach for Short Term Load Forecasting Using Hybrid Filter-Wrap Feature Selection," International Journal of Management Science and Business Administration, Inovatus Services Ltd., vol. 5(2), pages 37-49, January.
    10. Souhaib Ben Taieb & Rob J Hyndman, 2014. "Boosting multi-step autoregressive forecasts," Monash Econometrics and Business Statistics Working Papers 13/14, Monash University, Department of Econometrics and Business Statistics.
    11. Luo, Jian & Hong, Tao & Fang, Shu-Cherng, 2018. "Benchmarking robustness of load forecasting models under data integrity attacks," International Journal of Forecasting, Elsevier, vol. 34(1), pages 89-104.
    12. Araz Taeihagh, 2017. "Crowdsourcing: a new tool for policy-making?," Policy Sciences, Springer;Society of Policy Sciences, vol. 50(4), pages 629-647, December.
    13. Paulino José Garcia Nieto & Esperanza García Gonzalo & Fernando Sanchez Lasheras & Antonio Bernardo Sánchez, 2020. "A Hybrid Predictive Approach for Chromium Layer Thickness in the Hard Chromium Plating Process Based on the Differential Evolution/Gradient Boosted Regression Tree Methodology," Mathematics, MDPI, vol. 8(6), pages 1-20, June.
    14. Khoshrou, Abdolrahman & Pauwels, Eric J., 2019. "Short-term scenario-based probabilistic load forecasting: A data-driven approach," Applied Energy, Elsevier, vol. 238(C), pages 1258-1268.
    15. Abbas Keramati & Hajar Ghaneei & Seyed Mohammad Mirmohammadi, 2016. "Developing a prediction model for customer churn from electronic banking services using data mining," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 2(1), pages 1-13, December.
    16. Vera Miguéis & Dirk Poel & Ana Camanho & João Falcão e Cunha, 2012. "Predicting partial customer churn using Markov for discrimination for modeling first purchase sequences," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 6(4), pages 337-353, December.
    17. Alexis Gerossier & Robin Girard & Alexis Bocquet & George Kariniotakis, 2018. "Robust Day-Ahead Forecasting of Household Electricity Demand and Operational Challenges," Energies, MDPI, vol. 11(12), pages 1-18, December.
    18. Souhaib Ben Taieb & Raphael Huser & Rob J. Hyndman & Marc G. Genton, 2015. "Probabilistic time series forecasting with boosted additive models: an application to smart meter data," Monash Econometrics and Business Statistics Working Papers 12/15, Monash University, Department of Econometrics and Business Statistics.
    19. repec:osf:socarx:jer9k_v1 is not listed on IDEAS
    20. Moreno-Carbonell, Santiago & Sánchez-Úbeda, Eugenio F. & Muñoz, Antonio, 2020. "Rethinking weather station selection for electric load forecasting using genetic algorithms," International Journal of Forecasting, Elsevier, vol. 36(2), pages 695-712.
    21. Antulov-Fantulin, Nino & Lagravinese, Raffaele & Resce, Giuliano, 2021. "Predicting bankruptcy of local government: A machine learning approach," Journal of Economic Behavior & Organization, Elsevier, vol. 183(C), pages 681-699.
