Printed from https://ideas.repec.org/a/jns/jbstat/v240y2020i6p743-789n1.html

Early Prediction of University Dropouts – A Random Forest Approach

Author

Listed:
  • Behr Andreas

    (Chair of Statistics, University of Duisburg-Essen, Essen, Germany)

  • Giese Marco

    (Chair of Statistics, University of Duisburg-Essen, Essen, Germany)

  • Teguim K Herve D.

    (Chair of Statistics, University of Duisburg-Essen, Essen, Germany)

  • Theune Katja

    (Chair of Statistics, University of Duisburg-Essen, Essen, Germany)

Abstract

We predict university dropout using random forests based on conditional inference trees and on a broad German data set covering a wide range of aspects of student life and study courses. We model the dropout decision as a binary classification (graduate or dropout) and focus on very early prediction of student dropout by modeling, step by step, students’ transition from school (pre-study) through the study-decision phase (decision phase) to the first semesters at university (early study phase). We evaluate how predictive performance changes across the three models and observe substantially better performance once variables from the first study experiences are included, resulting in an AUC (area under the curve) of 0.86. Important predictors are the final grade at secondary school as well as determinants associated with student satisfaction and students’ subjective academic self-concept and self-assessment. A direct outcome of this research is information for universities wishing to implement early warning systems and more personalized counseling services to support students at risk of dropping out at an early stage of study.
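
To make the evaluation metric in the abstract concrete, the following is a minimal sketch of how an AUC could be computed for each of the three stage models. It is an illustration under stated assumptions, not the authors' code: the function name `auc`, the stage labels used as dictionary keys, and all labels and scores are invented toy values. The conditional inference forests themselves are not reproduced here; the scores simply stand in for predicted dropout probabilities.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 = dropout, 0 = graduate (hypothetical coding).
    scores: predicted dropout probabilities, higher = more at risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one dropout and one graduate")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # dropout ranked above graduate
            elif p == n:
                wins += 0.5   # ties count half
    return wins / (len(pos) * len(neg))


# Toy data only: six invented students, not values from the paper.
labels = [1, 1, 1, 0, 0, 0]
stage_scores = {
    "pre-study":         [0.6, 0.4, 0.5, 0.5, 0.3, 0.7],
    "decision phase":    [0.7, 0.4, 0.6, 0.5, 0.3, 0.6],
    "early study phase": [0.9, 0.6, 0.8, 0.4, 0.2, 0.7],
}

for stage, scores in stage_scores.items():
    print(f"{stage}: AUC = {auc(labels, scores):.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of dropouts from graduates, so comparing the value across stages shows how much each additional block of variables improves the ranking of students at risk.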

Suggested Citation

  • Behr Andreas & Giese Marco & Teguim K Herve D. & Theune Katja, 2020. "Early Prediction of University Dropouts – A Random Forest Approach," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 240(6), pages 743-789, December.
  • Handle: RePEc:jns:jbstat:v:240:y:2020:i:6:p:743-789:n:1
    DOI: 10.1515/jbnst-2019-0006

    Download full text from publisher

    File URL: https://doi.org/10.1515/jbnst-2019-0006
    Download Restriction: no



    Citations


    Cited by:

    1. Diego Opazo & Sebastián Moreno & Eduardo Álvarez-Miranda & Jordi Pereira, 2021. "Analysis of First-Year University Student Dropout through Machine Learning Models: A Comparison between Universities," Mathematics, MDPI, vol. 9(20), pages 1-27, October.
    2. Atanas Ivanov, 2020. "Decision Trees for Evaluation of Mathematical Competencies in the Higher Education: A Case Study," Mathematics, MDPI, vol. 8(5), pages 1-16, May.
    3. Snezhana Gocheva-Ilieva & Hristina Kulina & Atanas Ivanov, 2020. "Assessment of Students’ Achievements and Competencies in Mathematics Using CART and CART Ensembles and Bagging with Combined Model Improvement by MARS," Mathematics, MDPI, vol. 9(1), pages 1-17, December.
    4. Cecilia Martinez-Castillo & Gonzalo Astray & Juan Carlos Mejuto, 2021. "Modelling and Prediction of Monthly Global Irradiation Using Different Prediction Models," Energies, MDPI, vol. 14(8), pages 1-16, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Contini, Dalit & Salza, Guido, 2020. "Too few university graduates. Inclusiveness and effectiveness of the Italian higher education system," Socio-Economic Planning Sciences, Elsevier, vol. 71(C).
    2. Aina, Carmen & Baici, Eliana & Casalone, Giorgia & Pastore, Francesco, 2018. "The Economics of University Dropouts and Delayed Graduation: A Survey," IZA Discussion Papers 11421, Institute of Labor Economics (IZA).
    3. Aina, Carmen & Baici, Eliana & Casalone, Giorgia & Pastore, Francesco, 2022. "The determinants of university dropout: A review of the socio-economic literature," Socio-Economic Planning Sciences, Elsevier, vol. 79(C).
    4. Gitto, Lara & Minervini, Leo Fulvio & Monaco, Luisa, 2016. "University dropouts in Italy: Are supply side characteristics part of the problem?," Economic Analysis and Policy, Elsevier, vol. 49(C), pages 108-116.
    5. Contini, Dalit & Ricciardi, Riccardo & Romito, Marco & Salza, Guido & Zotti, Roberto, 2020. "Improving university dropout and student careers. What room for institutional action?," Department of Economics and Statistics Cognetti de Martiis. Working Papers 202004, University of Turin.
    6. Paola Perchinunno & Massimo Bilancia & Domenico Vitale, 2021. "A Statistical Analysis of Factors Affecting Higher Education Dropouts," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 156(2), pages 341-362, August.
    7. Horacio Matos-Díaz, 2009. "Determinantes de las tasas universitarias de graduación, retención y deserción en Puerto Rico: Un estudio de Caso," Revista de Economía del Rosario, Universidad del Rosario, May.
    8. Gitto, Lara & Minervini, Leo Fulvio & Monaco, Luisa, 2012. "University dropouts: supply-side issues in Italy," MPRA Paper 56656, University Library of Munich, Germany, revised Nov 2013.
    9. Vincenzo Carrieri & Marcello D’Amato & Roberto Zotti, 2015. "On the causal effects of selective admission policies on students’ performances: evidence from a quasi-experiment in a large Italian university," Oxford Economic Papers, Oxford University Press, vol. 67(4), pages 1034-1056.
    10. Stephen E. Childs & Ross Finnie & Felice Martinello, 2017. "Postsecondary Student Persistence and Pathways: Evidence From the YITS-A in Canada," Research in Higher Education, Springer;Association for Institutional Research, vol. 58(3), pages 270-294, May.
    11. Contini, Dalit & Salza, Guido & Scagni, Andrea, 2017. "Dropout and Time to Degree in Italian Universities Around the Economic Crisis," Department of Economics and Statistics Cognetti de Martiis. Working Papers 201716, University of Turin.
    12. Maria Marchenko, 2019. "Endogenous Shocks in Social Networks: Exam Failures and Friends' Future Performance," Department of Economics Working Papers wuwp292, Vienna University of Economics and Business, Department of Economics.
    13. Elias Katsikas & Theologos Dergiades, 2012. "Revising higher education policy in Greece: filling the Danaids’ Jar," Empirica, Springer;Austrian Institute for Economic Research;Austrian Economic Association, vol. 39(3), pages 279-292, August.
    14. Rossella Iraci Capuccinello, 2014. "Determinants and timing of dropping out decisions: evidence from the UK FE sector," Working Papers 15742191, Lancaster University Management School, Economics Department.
    15. Isphording, Ingo E. & Raabe, Tobias, 2019. "Early Identification of College Dropouts Using Machine-Learning: Conceptual Considerations and an Empirical Example," IZA Research Reports 89, Institute of Labor Economics (IZA).
    16. Wydra-Somaggio, Gabriele, 2017. "Early termination of vocational training: dropout or stopout?," IAB-Discussion Paper 201703, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    17. CARRIERI, Vincenzo & D'AMATO, Marcello & ZOTTI, Roberto, 2013. "Selective Admission Tests and Students' Performances. Evidence from a Natural Experiment in a Large Italian University," CELPE Working Papers 0/00, CELPE - Centre of Labour Economics and Economic Policy, University of Salerno, Italy.
    18. Roberto Zotti, 2015. "Should I Stay Or Should I Go? Dropping Out From University: An Empirical Analysis Of Students’ Performances," Working Papers 70, AlmaLaurea Inter-University Consortium.
    19. Annalina Sarra & Lara Fontanella & Simone Zio, 2019. "Identifying Students at Risk of Academic Failure Within the Educational Data Mining Framework," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 146(1), pages 41-60, November.
    20. Schnepf, Sylke V., 2014. "Do Tertiary Dropout Students Really Not Succeed in European Labour Markets?," IZA Discussion Papers 8015, Institute of Labor Economics (IZA).

    More about this item

    Keywords

    student dropout; higher education; dropout prediction; educational data mining; random forest;

    JEL classification:

    • I23 - Health, Education, and Welfare - Education - Higher Education; Research Institutions
