Predictive models for bariatric surgery risks with imbalanced medical datasets

Author

Listed:
  • Talayeh Razzaghi

    (New Mexico State University)

  • Ilya Safro

    (Clemson University)

  • Joseph Ewing

    (Greenville Health System)

  • Ehsan Sadrfaridpour

    (Clemson University)

  • John D. Scott

    (Greenville Hospital System University Medical Center)

Abstract

Bariatric surgery (BAR) has become a popular treatment for type 2 diabetes mellitus, which is among the most critical obesity-related comorbidities. Patients who undergo bariatric surgery are exposed to postoperative complications. Furthermore, mid- to long-term complications after bariatric surgery can be deadly, and they increase both healthcare costs and the complexity of managing the safety of these operations. Current studies on BAR complications have mainly used risk scoring to identify patients who are more likely to have complications after surgery. However, these studies do not take into account the imbalanced nature of the data, in which the class of interest (patients who have complications after surgery) is relatively small. We propose the use of imbalanced classification techniques to tackle the imbalanced bariatric surgery data: the synthetic minority oversampling technique (SMOTE), random undersampling, and ensemble learning classification methods including Random Forest, Bagging, and AdaBoost. Moreover, we improve classification performance by using Chi-squared, Information Gain, and Correlation-based feature selection techniques. We study the Premier Healthcare Database with a focus on the most frequent complications, including Diabetes, Angina, Heart Failure, and Stroke. Our results show that ensemble learning-based classification techniques, combined with any of the feature selection methods mentioned above, are the best approach for handling the imbalanced nature of the bariatric surgical outcome data. In our evaluation, we find a slight preference for SMOTE over random undersampling. These results demonstrate the potential of machine-learning tools as clinical decision support for identifying risks and outcomes associated with bariatric surgery, and their effectiveness in reducing surgical complications and improving patient care.
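
The two resampling ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter k, and the seed handling are assumptions made for illustration, and the oversampler is only a simplified SMOTE-style interpolation written in plain Python. It assumes numeric feature rows and a binary label where the minority label marks patients with a complication.

```python
import random


def random_undersample(X, y, minority_label=1, seed=0):
    """Randomly drop majority rows until both classes have the same size."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    kept = rng.sample(majority, min(len(minority), len(majority)))
    idx = sorted(minority + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def smote_like_oversample(X, y, minority_label=1, k=3, seed=0):
    """Add synthetic minority rows until both classes have the same size.

    Each synthetic row is a random point on the segment between a minority
    row and one of its k nearest minority neighbours (the SMOTE idea).
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)
    if len(minority) < 2 or n_needed <= 0:
        return list(X), list(y)

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    X_new, y_new = list(X), list(y)
    for _ in range(n_needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row, excluding itself
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: sqdist(base, m),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        X_new.append([p + gap * (q - p) for p, q in zip(base, other)])
        y_new.append(minority_label)
    return X_new, y_new
```

Undersampling discards majority-class records, while the SMOTE-style step keeps every record and adds interpolated minority rows. Either resampled set would then be passed to a classifier such as a Random Forest; in practice a maintained library implementation would normally be preferred over this sketch.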

Suggested Citation

  • Talayeh Razzaghi & Ilya Safro & Joseph Ewing & Ehsan Sadrfaridpour & John D. Scott, 2019. "Predictive models for bariatric surgery risks with imbalanced medical datasets," Annals of Operations Research, Springer, vol. 280(1), pages 1-18, September.
  • Handle: RePEc:spr:annopr:v:280:y:2019:i:1:d:10.1007_s10479-019-03156-8
    DOI: 10.1007/s10479-019-03156-8


    References listed on IDEAS

    1. King, Gary & Zeng, Langche, 2001. "Logistic Regression in Rare Events Data," Political Analysis, Cambridge University Press, vol. 9(2), pages 137-163, January.
    2. Ya-Ju Fan & Wanpracha Chaovalitwongse, 2010. "Optimizing feature selection to improve medical diagnosis," Annals of Operations Research, Springer, vol. 174(1), pages 169-183, February.
    3. Yazan F. Roumani & Yaman Roumani & Joseph K. Nwankpa & Mohan Tanniru, 2018. "Classifying readmissions to a cardiac intensive care unit," Annals of Operations Research, Springer, vol. 263(1), pages 429-451, April.
    4. Sorin Alexe & Eugene Blackstone & Peter Hammer & Hemant Ishwaran & Michael Lauer & Claire Pothier Snader, 2003. "Coronary Risk Prediction by Logical Analysis of Data," Annals of Operations Research, Springer, vol. 119(1), pages 15-42, March.
    5. Talayeh Razzaghi & Oleg Roderick & Ilya Safro & Nicholas Marko, 2016. "Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-18, May.
    6. Cawley, John & Meyerhoefer, Chad, 2012. "The medical care costs of obesity: An instrumental variables approach," Journal of Health Economics, Elsevier, vol. 31(1), pages 219-230.
    7. Onur Şeref & Talayeh Razzaghi & Petros Xanthopoulos, 2017. "Weighted relaxed support vector machines," Annals of Operations Research, Springer, vol. 249(1), pages 235-271, February.
    8. Yazan Roumani & Jerrold May & David Strum & Luis Vargas, 2013. "Classifying highly imbalanced ICU data," Health Care Management Science, Springer, vol. 16(2), pages 119-128, June.

    Citations


    Cited by:

    1. Che Xu & Wenjun Chang & Weiyong Liu, 2023. "Data-driven decision model based on local two-stage weighted ensemble learning," Annals of Operations Research, Springer, vol. 325(2), pages 995-1028, June.
    2. Manrui Jiang & Lifen Jia & Zhensong Chen & Wei Chen, 2022. "The two-stage machine learning ensemble models for stock price prediction by combining mode decomposition, extreme learning machine and improved harmony search algorithm," Annals of Operations Research, Springer, vol. 309(2), pages 553-585, February.
    3. Viswanath Venkatesh, 2022. "Adoption and use of AI tools: a research agenda grounded in UTAUT," Annals of Operations Research, Springer, vol. 308(1), pages 641-652, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Gartner & Rainer Kolisch & Daniel B. Neill & Rema Padman, 2015. "Machine Learning Approaches for Early DRG Classification and Resource Allocation," INFORMS Journal on Computing, INFORMS, vol. 27(4), pages 718-734, November.
    2. Md. Alauddin Majumder, 2013. "Does Obesity Matter for Wages? Evidence from the United States," Economic Papers, The Economic Society of Australia, vol. 32(2), pages 200-217, June.
    3. Bemile, Esther & Anders, Sven M., 2014. "Linking Diet-Health Behaviour and Obesity using Propensity Score Matching," 2014 International Congress, August 26-29, 2014, Ljubljana, Slovenia 182832, European Association of Agricultural Economists.
    4. Angel M. Morales & Patrick Tarwater & Indika Mallawaarachchi & Alok Kumar Dwivedi & Juan B. Figueroa-Casas, 2015. "Multinomial logistic regression approach for the evaluation of binary diagnostic test in medical research," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 16(2), pages 203-222, June.
    5. F. Gauthier & D. Germain & B. Hétu, 2017. "Logistic models as a forecasting tool for snow avalanches in a cold maritime climate: northern Gaspésie, Québec, Canada," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 89(1), pages 201-232, October.
    6. Allais, Olivier & Etilé, Fabrice & Lecocq, Sébastien, 2015. "Mandatory labels, taxes and market forces: An empirical evaluation of fat policies," Journal of Health Economics, Elsevier, vol. 43(C), pages 27-44.
    7. Douglas Cumming & Lars Hornuf & Moein Karami & Denis Schweizer, 2023. "Disentangling Crowdfunding from Fraudfunding," Journal of Business Ethics, Springer, vol. 182(4), pages 1103-1128, February.
    8. Courtemanche, Charles & Tchernis, Rusty & Zhou, Xilin, 2017. "Parental Work Hours and Childhood Obesity: Evidence Using Instrumental Variables Related to Sibling School Eligibility," IZA Discussion Papers 10739, Institute of Labor Economics (IZA).
    9. Eunae Yoo & Elliot Rabinovich & Bin Gu, 2020. "The Growth of Follower Networks on Social Media Platforms for Humanitarian Operations," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2696-2715, December.
    10. Cemal Eren Arbatli & Quamrul H. Ashraf & Oded Galor & Marc Klemp, 2018. "Diversity and Conflict," Working Papers 2018-6, Brown University, Department of Economics.
    11. Lo Turco, Alessia & Maggioni, Daniela, 2018. "Effects of Islamic religiosity on bilateral trust in trade: The case of Turkish exports," Journal of Comparative Economics, Elsevier, vol. 46(4), pages 947-965.
    12. Matija Kovacic & Claudio Zoli, 2021. "Ethnic distribution, effective power and conflict," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 57(2), pages 257-299, August.
    13. Blackman, Allen & Guerrero, Santiago, 2012. "What drives voluntary eco-certification in Mexico?," Journal of Comparative Economics, Elsevier, vol. 40(2), pages 256-268.
    14. Jacob Ausderan, 2018. "Reassessing the democratic advantage in interstate wars using k-adic datasets," Conflict Management and Peace Science, Peace Science Society (International), vol. 35(5), pages 451-473, September.
    15. Paul Poast, 2013. "Issue linkage and international cooperation: An empirical investigation," Conflict Management and Peace Science, Peace Science Society (International), vol. 30(3), pages 286-303, July.
    16. Yerko Rojas, 2017. "Evictions and short-term all-cause mortality: a 3-year follow-up study of a middle-aged Swedish population," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 62(3), pages 343-351, April.
    17. Mehrez Ben Slama & Dhafer Saidane & Hassouna Fedhila, 2012. "How to identify targets in the M&A banking operations? Case of cross-border strategies in Europe by line of activity," Review of Quantitative Finance and Accounting, Springer, vol. 38(2), pages 209-240, February.
    18. Marcin Chlebus, 2014. "One-day prediction of state of turbulence for financial instrument based on models for binary dependent variable," Ekonomia journal, Faculty of Economic Sciences, University of Warsaw, vol. 37.
    19. Bastian, Nathaniel D. & Swenson, Eric R. & Ma, Linlin & Na, Hyeong Suk & Griffin, Paul M., 2017. "Incentive contract design for food retailers to reduce food deserts in the US," Socio-Economic Planning Sciences, Elsevier, vol. 60(C), pages 87-98.
    20. Lorenzo Cassi & Anne Plunket, 2014. "Proximity, network formation and inventive performance: in search of the proximity paradox," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 53(2), pages 395-422, September.
