Printed from https://ideas.repec.org/p/geo/guwopa/gueconwpa~17-17-09.html

Beyond Early Warning Indicators: High School Dropout and Machine Learning

Author

  • Dario Sansone

Abstract

This paper provides an algorithm to predict which students are going to drop out of high school, relying only on information from 9th grade. It verifies that using a parsimonious early warning system, as implemented in many schools, leads to poor results. It shows that schools can obtain more precise predictions by exploiting the available high-dimensional data jointly with machine learning tools such as Support Vector Machines, Boosted Regression and Post-LASSO. It carefully selects goodness-of-fit criteria based on the context and the underlying theoretical framework: model parameters are calibrated by taking into account policy goals and budget constraints. Finally, it uses unsupervised machine learning to divide students at risk of dropping out into different clusters.
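
As a rough illustration of what calibrating predictions to a budget constraint could look like, the sketch below flags only as many students as a school could afford to support and then reports precision and recall for that choice. It is not the paper's code: the function names, the risk scores, the outcomes, and the budget of three students are all hypothetical, and the scores stand in for the output of any fitted classifier.

```python
# Illustrative sketch only: not the paper's method or data.
# All names, scores, and outcomes below are hypothetical.
from typing import List, Tuple


def flag_within_budget(scores: List[float], budget: int) -> List[int]:
    """Return indices of the `budget` students with the highest predicted risk."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


def precision_recall(flagged: List[int], dropped_out: List[int]) -> Tuple[float, float]:
    """Precision and recall of the flagged set (1 = student dropped out)."""
    true_pos = sum(dropped_out[i] for i in flagged)
    total_pos = sum(dropped_out)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


# Made-up predicted dropout probabilities and observed outcomes for eight students.
scores = [0.91, 0.15, 0.62, 0.08, 0.77, 0.30, 0.55, 0.05]
dropped_out = [1, 0, 0, 0, 1, 0, 1, 0]

# Assume the school can only afford to support three students.
flagged = flag_within_budget(scores, budget=3)
precision, recall = precision_recall(flagged, dropped_out)
print(flagged, round(precision, 3), round(recall, 3))
```

Tightening or loosening the budget trades precision against recall, which is one way a policy goal can enter the choice of cutoff instead of a fixed 0.5 probability threshold.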

Suggested Citation

  • Dario Sansone, 2017. "Beyond Early Warning Indicators: High School Dropout and Machine Learning," Working Papers gueconwpa~17-17-09, Georgetown University, Department of Economics.
  • Handle: RePEc:geo:guwopa:gueconwpa~17-17-09

    Download full text from publisher

    File URL: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3062317
    File Function: Full text
    Download Restriction: None

    Citations

    Cited by:

    1. McKenzie, David & Sansone, Dario, 2019. "Predicting entrepreneurial success is hard: Evidence from a business plan competition in Nigeria," Journal of Development Economics, Elsevier, vol. 141(C).
    2. Liyang Tang, 2020. "Application of Nonlinear Autoregressive with Exogenous Input (NARX) neural network in macroeconomic forecasting, national goal setting and global competitiveness assessment," Papers 2005.08735, arXiv.org.
    3. Filmer, Deon P. & Nahata, Vatsal & Sabarwal, Shwetlena, 2021. "Preparation, Practice, and Beliefs: A Machine Learning Approach to Understanding Teacher Effectiveness," Policy Research Working Paper Series 9847, The World Bank.
    4. Maria do Carmo Nicoletti & Osvaldo Luiz de Oliveira, 2020. "A Machine Learning-Based Computational System Proposal Aiming at Higher Education Dropout Prediction," Higher Education Studies, Canadian Center of Science and Education, vol. 10(4), pages 1-12, December.
    5. Delogu, Marco & Lagravinese, Raffaele & Paolini, Dimitri & Resce, Giuliano, 2024. "Predicting dropout from higher education: Evidence from Italy," Economic Modelling, Elsevier, vol. 130(C).
    6. Nuha Alruwais & Mohammed Zakariah, 2023. "Evaluating Student Knowledge Assessment Using Machine Learning Techniques," Sustainability, MDPI, vol. 15(7), pages 1-25, April.
    7. Hazal Colak Oz & Çiçek Güven & Gonzalo Nápoles, 2023. "School dropout prediction and feature importance exploration in Malawi using household panel data: machine learning approach," Journal of Computational Social Science, Springer, vol. 6(1), pages 245-287, April.
    8. Ashesh Rambachan & Amanda Coston & Edward Kennedy, 2022. "Robust Design and Evaluation of Predictive Algorithms under Unobserved Confounding," Papers 2212.09844, arXiv.org, revised Aug 2023.
    9. Bacon, Victoria R. & Kearney, Christopher A., 2020. "School climate and student-based contextual learning factors as predictors of school absenteeism severity at multiple levels via CHAID analysis," Children and Youth Services Review, Elsevier, vol. 118(C).
    10. Isphording, Ingo E. & Raabe, Tobias, 2019. "Early Identification of College Dropouts Using Machine-Learning: Conceptual Considerations and an Empirical Example," IZA Research Reports 89, Institute of Labor Economics (IZA).
    11. Miguel Angel Valles-Coral & Luis Salazar-Ramírez & Richard Injante & Edwin Augusto Hernandez-Torres & Juan Juárez-Díaz & Jorge Raul Navarro-Cabrera & Lloy Pinedo & Pierre Vidaurre-Rojas, 2022. "Density-Based Unsupervised Learning Algorithm to Categorize College Students into Dropout Risk Levels," Data, MDPI, vol. 7(11), pages 1-18, November.

    More about this item

    Keywords

    High School Dropout; Machine Learning; Big Data

    JEL classification:

    • C53 - Mathematical and Quantitative Methods - Econometric Modeling - Forecasting and Prediction Models; Simulation Methods
    • C55 - Mathematical and Quantitative Methods - Econometric Modeling - Large Data Sets: Modeling and Analysis
    • I20 - Health, Education, and Welfare - Education - General
