Printed from https://ideas.repec.org/p/usg/econwp/201908.html

Random Forest Estimation of the Ordered Choice Model

Author

Listed:
  • Lechner, Michael
  • Okasa, Gabriel

Abstract

In econometrics, so-called ordered choice models are popular when the interest lies in estimating the probabilities of particular values of categorical outcome variables with an inherent ordering, conditional on covariates. In this paper we develop a new machine learning estimator for such models, based on the random forest algorithm, that does not impose any distributional assumptions. The proposed Ordered Forest estimator provides a flexible method for estimating the conditional choice probabilities that deals naturally with nonlinearities in the data while taking the ordering information explicitly into account. Going beyond common machine learning estimators, it also enables the estimation of marginal effects and inference on them, and thus provides the same output as classical econometric estimators based on ordered logit or probit models. An extensive simulation study examines the finite-sample properties of the Ordered Forest and reveals its good predictive performance, particularly in settings with multicollinearity among the predictors and nonlinear functional forms. An empirical application further illustrates the estimation of the marginal effects and their standard errors and demonstrates the advantages of flexible estimation over a parametric benchmark model.
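
To make the two outputs named in the abstract concrete (conditional choice probabilities and marginal effects), the following is a minimal sketch of the parametric ordered logit benchmark, not of the Ordered Forest itself. The coefficient vector `beta`, the `thresholds`, the covariate values `x`, and all function names are hypothetical and chosen only for illustration. Class probabilities are obtained as differences of cumulative probabilities, and the marginal effect is approximated by a central finite difference.

```python
import math


def logistic_cdf(z):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(x, beta, thresholds):
    """Return P(Y = m | x) for m = 1..M, given M-1 increasing thresholds."""
    index = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities P(Y <= m | x); the last category closes at 1.
    cumulative = [logistic_cdf(t - index) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        # Each class probability is the difference of adjacent cumulatives.
        probs.append(c - previous)
        previous = c
    return probs


def marginal_effects(x, beta, thresholds, k, h=1e-5):
    """Central finite-difference effect of covariate k on each class probability."""
    up = list(x)
    up[k] += h
    down = list(x)
    down[k] -= h
    p_up = ordered_logit_probs(up, beta, thresholds)
    p_down = ordered_logit_probs(down, beta, thresholds)
    return [(a - b) / (2 * h) for a, b in zip(p_up, p_down)]


# Hypothetical parameter values for a three-category outcome.
beta = [0.8, -0.5]
thresholds = [-1.0, 1.0]
x = [0.3, 1.2]

probs = ordered_logit_probs(x, beta, thresholds)
effects = marginal_effects(x, beta, thresholds, k=0)
print("class probabilities:", probs)
print("marginal effects of covariate 0:", effects)
```

The class probabilities sum to one by construction, so the marginal effects across categories sum to zero. A flexible estimator would replace the logistic link and linear index with a nonparametric estimate of the conditional probabilities, while the quantities reported stay the same.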

Suggested Citation

  • Lechner, Michael & Okasa, Gabriel, 2019. "Random Forest Estimation of the Ordered Choice Model," Economics Working Paper Series 1908, University of St. Gallen, School of Economics and Political Science.
  • Handle: RePEc:usg:econwp:2019:08

    Download full text from publisher

    File URL: http://ux-tauri.unisg.ch/RePEc/usg/econwp/EWP-1908.pdf
    Download Restriction: no

    Citations


    Cited by:

    1. Paul Clarke & Annalivia Polselli, 2023. "Double Machine Learning for Static Panel Models with Fixed Effects," Papers 2312.08174, arXiv.org, revised Dec 2023.
    2. Qinglong Shao, 2022. "Does less working time improve life satisfaction? Evidence from European Social Survey," Health Economics Review, Springer, vol. 12(1), pages 1-18, December.
    3. Franziska Braschke & Patrick A. Puhani, 2023. "Population Adjustment to Asymmetric Labour Market Shocks in India: A Comparison to Europe and the United States at Two Different Regional Levels," The Indian Journal of Labour Economics, Springer;The Indian Society of Labour Economics (ISLE), vol. 66(1), pages 7-35, March.
    4. Wang, Shixuan & Syntetos, Aris A. & Liu, Ying & Di Cairano-Gilfedder, Carla & Naim, Mohamed M., 2023. "Improving automotive garage operations by categorical forecasts using a large number of variables," European Journal of Operational Research, Elsevier, vol. 306(2), pages 893-908.
    5. Riccardo Di Francesco, 2023. "Ordered Correlation Forest," Papers 2309.08755, arXiv.org.
    6. Goller, Daniel & Heiniger, Sandro, 2022. "A general framework to quantify the event importance in multi-event contests," Economics Working Paper Series 2204, University of St. Gallen, School of Economics and Political Science.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sayed Alim Samim & Zhiquan Hu & Sebastian Stepien & Sayed Younus Amini & Ramin Rayee & Kunyu Niu & George Mgendi, 2021. "Food Insecurity and Related Factors among Farming Families in Takhar Region, Afghanistan," Sustainability, MDPI, vol. 13(18), pages 1-17, September.
    2. Xi Wang & Songnian Chen, 2022. "Partial Identification and Estimation of Semiparametric Ordered Response Models with Interval Regressor Data," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 84(4), pages 830-849, August.
    3. Qi Li & Jeffrey Scott Racine, 2006. "Nonparametric Econometrics: Theory and Practice," Economics Books, Princeton University Press, edition 1, volume 1, number 8355.
    4. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692, arXiv.org.
    5. Stefan Boes, 2013. "Nonparametric analysis of treatment effects in ordered response models," Empirical Economics, Springer, vol. 44(1), pages 81-109, February.
    6. William H. Greene & Mark N. Harris & Rachel J. Knott & Nigel Rice, 2021. "Specification and testing of hierarchical ordered response models with anchoring vignettes," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 31-64, January.
    7. Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.
    8. Yingying Dong & Arthur Lewbel, 2015. "A Simple Estimator for Binary Choice Models with Endogenous Regressors," Econometric Reviews, Taylor & Francis Journals, vol. 34(1-2), pages 82-105, February.
    9. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves," Papers 2010.04855, arXiv.org, revised Oct 2022.
    10. Hoderlein, Stefan & Sherman, Robert, 2015. "Identification and estimation in a correlated random coefficients binary response model," Journal of Econometrics, Elsevier, vol. 188(1), pages 135-149.
    11. Giuseppe De Luca & Valeria Perotti, 2011. "Estimation of ordered response models with sample selection," Stata Journal, StataCorp LP, vol. 11(2), pages 213-239, June.
    12. Rosati, Nicoletta & Bellia, Mario & Matos, Pedro Verga & Oliveira, Vasco, 2020. "Ratings matter: Announcements in times of crisis and the dynamics of stock markets," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 64(C).
    13. Jonathan A. Cook & Saad Siddiqui, 2020. "Random forests and selected samples," Bulletin of Economic Research, Wiley Blackwell, vol. 72(3), pages 272-287, July.
    14. Taisuke Otsu & Mengshan Xu, 2022. "Isotonic propensity score matching," STICERD - Econometrics Paper Series 623, Suntory and Toyota International Centres for Economics and Related Disciplines, LSE.
    15. Yan, Jin & Yoo, Hong Il, 2019. "Semiparametric estimation of the random utility model with rank-ordered choice data," Journal of Econometrics, Elsevier, vol. 211(2), pages 414-438.
    16. William H. Greene & David A. Hensher, 2008. "Modeling Ordered Choices: A Primer and Recent Developments," Working Papers 08-26, New York University, Leonard N. Stern School of Business, Department of Economics.
    17. Yixiao Jiang, 2021. "Semiparametric Estimation of a Corporate Bond Rating Model," Econometrics, MDPI, vol. 9(2), pages 1-20, May.
    18. Melcarne, Alessandro & Monnery, Benjamin & Wolff, François-Charles, 2022. "Prosecutors, judges and sentencing disparities: Evidence from traffic offenses in France," International Review of Law and Economics, Elsevier, vol. 71(C).
    19. Tennant, David F. & Tracey, Marlon R. & King, Damien W., 2020. "Sovereign credit rating: Evidence of bias against poor countries," The North American Journal of Economics and Finance, Elsevier, vol. 51(C).
    20. Zhang, Zhaohui & Paudel, Krishna P., 2019. "Policy improvements and farmers' willingness to participate: Insights from the new round of China's Sloping Land Conversion Program," Ecological Economics, Elsevier, vol. 162(C), pages 121-132.

    More about this item

    Keywords

    Ordered choice models; random forests; probabilities; marginal effects; machine learning;

    JEL classification:

    • C14: Mathematical and Quantitative Methods > Econometric and Statistical Methods and Methodology: General > Semiparametric and Nonparametric Methods: General
    • C25: Mathematical and Quantitative Methods > Single Equation Models; Single Variables > Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions; Probabilities
    • C40: Mathematical and Quantitative Methods > Econometric and Statistical Methods: Special Topics > General
