IDEAS home Printed from https://ideas.repec.org/a/gam/jijerp/v21y2023i1p2-d1303380.html
   My bibliography  Save this article

Classification of Obesity among South African Female Adolescents: Comparative Analysis of Logistic Regression and Random Forest Algorithms

Author

Listed:
  • Ronel Sewpaul

    (Public Health, Societies and Belonging, Human Sciences Research Council, Merchant House, 2 Dock Rail Road, Cape Town 8001, South Africa)

  • Olushina Olawale Awe

    (Institute of Mathematics, Statistics and Scientific Computing (IMECC), University of Campinas, Campinas 13083-859, Brazil)

  • Dennis Makafui Dogbey

    (Medical Biotechnology and Immunotherapy Research Unit, Institute of Infectious Diseases and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town 7700, South Africa)

  • Machoene Derrick Sekgala

    (Non-Communicable Diseases, South African Medical Research Council, Cape Town 7505, South Africa)

  • Natisha Dukhi

    (Public Health, Societies and Belonging, Human Sciences Research Council, Merchant House, 2 Dock Rail Road, Cape Town 8001, South Africa)

Abstract

Background: This study evaluates the performance of logistic regression (LR) and random forest (RF) algorithms to model obesity among female adolescents in South Africa. Methods: Data was analysed on 375 females aged 15–17 from the South African National Health and Nutrition Examination Survey 2011/2012. The primary outcome was obesity, defined as body mass index (BMI) ≥ 30 kg/m 2 . A total of 31 explanatory variables were included, ranging from socio-economic, demographic, family history, dietary and health behaviour. RF and LR models were run using imbalanced data as well as after oversampling, undersampling, and hybrid sampling of the data. Results: Using the imbalanced data, the RF model performed better with higher precision, recall, F1 score, and balanced accuracy. Balanced accuracy was highest with the hybrid data (0.618 for RF and 0.668 for LR). Using the hybrid balanced data, the RF model performed better (F1-score = 0.940 for RF vs. 0.798 for LR). Conclusion: The model with the highest overall performance metrics was the RF model both before balancing the data and after applying hybrid balancing. Future work would benefit from using larger datasets on adolescent female obesity to assess the robustness of the models.

Suggested Citation

  • Ronel Sewpaul & Olushina Olawale Awe & Dennis Makafui Dogbey & Machoene Derrick Sekgala & Natisha Dukhi, 2023. "Classification of Obesity among South African Female Adolescents: Comparative Analysis of Logistic Regression and Random Forest Algorithms," IJERPH, MDPI, vol. 21(1), pages 1-15, December.
  • Handle: RePEc:gam:jijerp:v:21:y:2023:i:1:p:2-:d:1303380
    as

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/21/1/2/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/21/1/2/
    Download Restriction: no
    ---><---

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jijerp:v:21:y:2023:i:1:p:2-:d:1303380. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.