Printed from https://ideas.repec.org/a/plo/pone00/0242334.html

Predicting time to graduation at a large enrollment American university

Author

Listed:
  • John M Aiken
  • Riccardo De Bin
  • Morten Hjorth-Jensen
  • Marcos D Caballero

Abstract

The time it takes a student to graduate with a university degree is influenced by a variety of factors, such as their background, their academic performance at university, and their integration into the social communities of the university they attend. Different universities have different populations, student services, instruction styles, and degree programs; however, they all collect institutional data. This study presents data for 160,933 students attending a large American research university. The data include performance, enrollment, demographics, and preparation features. Discrete time hazard models for the time-to-graduation are presented in the context of Tinto’s Theory of Drop Out. Additionally, a novel machine learning method, gradient boosted trees, is applied and compared to the typical maximum likelihood method. We demonstrate that enrollment factors (such as changing a major) lead to greater increases in the model's predictive performance for when a student graduates than performance factors (such as grades) or preparation factors (such as high school GPA).
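
As a rough illustration of what a discrete time hazard model estimates, the sketch below expands student records into one row per term at risk and computes the covariate-free hazard of graduating in each term (graduations divided by students still enrolled). This is not the authors' model, which includes covariates and is also fit with gradient boosted trees. The record layout, the function names, and the toy data are all assumptions made for illustration. With covariates, the same person-period rows would typically be passed to a logistic regression instead.

```python
from collections import Counter


def person_period(records):
    """Expand (student_id, last_term, graduated) into one row per term at risk."""
    rows = []
    for sid, last_term, graduated in records:
        for term in range(1, last_term + 1):
            # The event indicator is 1 only in the term the student graduates.
            event = int(graduated and term == last_term)
            rows.append((sid, term, event))
    return rows


def hazard_by_term(rows):
    """Covariate-free hazard per term: graduations / students at risk."""
    at_risk = Counter()
    events = Counter()
    for _, term, event in rows:
        at_risk[term] += 1
        events[term] += event
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}


def survival(hazard):
    """Probability of not yet having graduated by the end of each term."""
    s, out = 1.0, {}
    for t in sorted(hazard):
        s *= 1.0 - hazard[t]
        out[t] = s
    return out


# Hypothetical toy data: (student_id, last observed term, graduated?).
# Student "d" is censored: observed for 6 terms without graduating.
records = [
    ("a", 8, True),
    ("b", 8, True),
    ("c", 10, True),
    ("d", 6, False),
    ("e", 9, True),
]

rows = person_period(records)
hazard = hazard_by_term(rows)
not_yet_graduated = survival(hazard)

for term in sorted(hazard):
    print(term, round(hazard[term], 3), round(not_yet_graduated[term], 3))
```

Censored students contribute rows only for the terms in which they were observed, which is how the person-period layout handles students who leave the data without graduating.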

Suggested Citation

  • John M Aiken & Riccardo De Bin & Morten Hjorth-Jensen & Marcos D Caballero, 2020. "Predicting time to graduation at a large enrollment American university," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-28, November.
  • Handle: RePEc:plo:pone00:0242334
    DOI: 10.1371/journal.pone.0242334

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242334
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0242334&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0242334?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. L. Lombardo & M. Cama & C. Conoscenti & M. Märker & E. Rotigliano, 2015. "Binary logistic regression versus stochastic gradient boosted decision trees in assessing landslide susceptibility for multiple-occurring landslide events: application to the 2009 storm event in Messi," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 79(3), pages 1621-1648, December.
    2. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
    3. Hongtao Yue & Xuanning Fu, 2017. "Rethinking Graduation and Time to Degree: A Fresh Perspective," Research in Higher Education, Springer;Association for Institutional Research, vol. 58(2), pages 184-213, March.
    4. DesJardins, S. L. & Ahlburg, D. A. & McCall, B. P., 1999. "An event history model of student departure," Economics of Education Review, Elsevier, vol. 18(3), pages 375-390, June.
    5. Jaison R. Abel & Richard Deitz & Yaquin Su, 2014. "Are recent college graduates finding good jobs?," Current Issues in Economics and Finance, Federal Reserve Bank of New York, vol. 20.
    6. James Vaupel & Kenneth Manton & Eric Stallard, 1979. "The impact of heterogeneity in individual frailty on the dynamics of mortality," Demography, Springer;Population Association of America (PAA), vol. 16(3), pages 439-454, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shruti Sachdeva & Bijendra Kumar, 2020. "A Comparative Study between Frequency Ratio Model and Gradient Boosted Decision Trees with Greedy Dimensionality Reduction in Groundwater Potential Assessment," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 34(15), pages 4593-4615, December.
    2. Mansoor, Umer & Jamal, Arshad & Su, Junbiao & Sze, N.N. & Chen, Anthony, 2023. "Investigating the risk factors of motorcycle crash injury severity in Pakistan: Insights and policy recommendations," Transport Policy, Elsevier, vol. 139(C), pages 21-38.
    3. Bagdonavicius, Vilijandas & Nikulin, Mikhail, 2000. "On goodness-of-fit for the linear transformation and frailty models," Statistics & Probability Letters, Elsevier, vol. 47(2), pages 177-188, April.
    4. Yahia Salhi & Pierre-Emmanuel Thérond, 2016. "Age-Specific Adjustment of Graduated Mortality," Working Papers hal-01391285, HAL.
    5. Feehan, Dennis & Wrigley-Field, Elizabeth, 2020. "How do populations aggregate?," SocArXiv 2fkw3, Center for Open Science.
    6. M. K. Lintu & Asha Kamath, 2022. "Performance of recurrent event models on defect proneness data," Annals of Operations Research, Springer, vol. 315(2), pages 2209-2218, August.
    7. Jaison R. Abel & Richard Deitz, 2017. "Underemployment in the Early Careers of College Graduates following the Great Recession," NBER Chapters, in: Education, Skills, and Technical Change: Implications for Future US GDP Growth, pages 149-181, National Bureau of Economic Research, Inc.
    8. Il Do Ha & Maengseok Noh & Youngjo Lee, 2010. "Bias Reduction of Likelihood Estimators in Semiparametric Frailty Models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 37(2), pages 307-320, June.
    9. John M. Nunley & Adam Pugh & Nicholas Romero & Richard Alan Seals, Jr., 2014. "Unemployment, Underemployment, and Employment Opportunities: Results from a Correspondence Audit of the Labor Market for College Graduates," Auburn Economics Working Paper Series auwp2014-04, Department of Economics, Auburn University.
    10. Andreas Wienke & Anne M. Herskind & Kaare Christensen & Axel Skytthe & Anatoli I. Yashin, 2002. "The influence of smoking and BMI on heritability in susceptibility to coronary heart disease," MPIDR Working Papers WP-2002-003, Max Planck Institute for Demographic Research, Rostock, Germany.
    11. Bissan Ghaddar & Ignacio Gómez-Casares & Julio González-Díaz & Brais González-Rodríguez & Beatriz Pateiro-López & Sofía Rodríguez-Ballesteros, 2023. "Learning for Spatial Branching: An Algorithm Selection Approach," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1024-1043, September.
    12. Richard Murphy & Gill Wyness, 2023. "Testing Means-Tested Aid," Journal of Labor Economics, University of Chicago Press, vol. 41(3), pages 687-727.
    13. Ibrahim Bicak & Lauren Schudde & Kristina Flores, 2023. "Predictors and Consequences of Math Course Repetition: The Role of Horizontal and Vertical Repetition in Success Among Community College Transfer Students," Research in Higher Education, Springer;Association for Institutional Research, vol. 64(2), pages 260-299, March.
    14. Akash Malhotra, 2018. "A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy," Papers 1806.04517, arXiv.org, revised Aug 2020.
    15. Svetlana V. Ukraintseva & Anatoli I. Yashin, 2005. "Economic progress as cancer risk factor. I: Puzzling facts of cancer epidemiology," MPIDR Working Papers WP-2005-021, Max Planck Institute for Demographic Research, Rostock, Germany.
    16. Silke van Daalen & Hal Caswell, 2015. "Lifetime reproduction and the second demographic transition: Stochasticity and individual variation," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 33(20), pages 561-588.
    17. Nahushananda Chakravarthy H G & Karthik M Seenappa & Sujay Raghavendra Naganna & Dayananda Pruthviraja, 2023. "Machine Learning Models for the Prediction of the Compressive Strength of Self-Compacting Concrete Incorporating Incinerated Bio-Medical Waste Ash," Sustainability, MDPI, vol. 15(18), pages 1-22, September.
    18. Tim Voigt & Martin Kohlhase & Oliver Nelles, 2021. "Incremental DoE and Modeling Methodology with Gaussian Process Regression: An Industrially Applicable Approach to Incorporate Expert Knowledge," Mathematics, MDPI, vol. 9(19), pages 1-26, October.
    19. K. Motarjem & M. Mohammadzadeh & A. Abyar, 2020. "Geostatistical survival model with Gaussian random effect," Statistical Papers, Springer, vol. 61(1), pages 85-107, February.
    20. Schultz, T. Paul, 2010. "Population and Health Policies," Handbook of Development Economics, in: Dani Rodrik & Mark Rosenzweig (ed.), Handbook of Development Economics, edition 1, volume 5, chapter 0, pages 4785-4881, Elsevier.
