Printed from https://ideas.repec.org/a/eee/csdana/v96y2016icp57-73.html

Random forest for ordinal responses: Prediction and variable selection

Author

Listed:
  • Janitza, Silke
  • Tutz, Gerhard
  • Boulesteix, Anne-Laure

Abstract

The random forest method is a commonly used tool for classification with high-dimensional data that is able to rank candidate predictors through its inbuilt variable importance measures. It can be applied to various kinds of regression problems, including those with nominal, metric and survival response variables. While classification and regression problems using random forest methodology have been extensively investigated in the past, there is no standard procedure for the case of an ordinal response. Extensive studies using random forests based on conditional inference trees are conducted to explore whether incorporating the ordering information yields any improvement in either prediction performance or variable selection. Two novel permutation variable importance measures are presented that are reasonable alternatives to the currently implemented importance measure, which was developed for nominal responses and makes no use of the ordering in the levels of an ordinal response variable. Results based on simulated and real data suggest that predictor rankings can be improved in some settings by using new permutation importance measures that explicitly use the ordering in the response levels in combination with ordinal regression trees. With respect to prediction accuracy, the performance of ordinal regression trees was similar to, and in most settings even slightly better than, that of classification trees.
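
To make the idea of an ordering-aware permutation importance concrete, the following is a minimal sketch and not the authors' implementation. It assumes integer-coded ordinal levels (1, 2, 3) and a fixed toy prediction rule in place of a forest, and it permutes a predictor over the full data set rather than over out-of-bag observations. All names (`toy_predict`, `absolute_rank_error`, `permutation_importance`) are hypothetical. It contrasts a 0/1 misclassification loss, which ignores the ordering, with a mean absolute distance between levels, which uses it; the abstract does not spell out the two measures proposed in the paper, so the distance-based loss here is only one illustrative choice.

```python
import random

def misclassification(y_true, y_pred):
    """0/1 loss: every wrong level counts the same, ordering is ignored."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def absolute_rank_error(y_true, y_pred):
    """Mean absolute distance between integer-coded ordinal levels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, column, loss, n_repeats=20, seed=0):
    """Average increase in loss after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = loss(y, [predict(row) for row in X])
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in the chosen column.
        X_perm = [row[:column] + [v] + row[column + 1:]
                  for row, v in zip(X, shuffled)]
        increases.append(loss(y, [predict(r) for r in X_perm]) - baseline)
    return sum(increases) / n_repeats

def toy_predict(row):
    """Hypothetical stand-in for a fitted model: uses only column 0."""
    return 1 + min(2, int(row[0] * 3))  # levels 1, 2, 3

# Toy data: two uniform predictors; the response depends on column 0 only.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_predict(row) for row in X]

losses = {
    "misclassification": misclassification,
    "absolute rank error": absolute_rank_error,
}
results = {
    name: [permutation_importance(toy_predict, X, y, j, loss) for j in (0, 1)]
    for name, loss in losses.items()
}
for name, scores in results.items():
    print(name, scores)
```

In this toy setting the irrelevant second predictor receives an importance of zero under both losses, while the distance-based loss additionally penalises predictions that land two levels away from the truth more than those that land one level away.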

Suggested Citation

  • Janitza, Silke & Tutz, Gerhard & Boulesteix, Anne-Laure, 2016. "Random forest for ordinal responses: Prediction and variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 96(C), pages 57-73.
  • Handle: RePEc:eee:csdana:v:96:y:2016:i:c:p:57-73
    DOI: 10.1016/j.csda.2015.10.005

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947315002601
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2015.10.005?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Raffaella Piccarreta, 2001. "A new measure of nominal-ordinal association," Journal of Applied Statistics, Taylor & Francis Journals, vol. 28(1), pages 107-120.
    2. Hothorn, Torsten & Hornik, Kurt & van de Wiel, Mark A. & Zeileis, Achim, 2006. "A Lego System for Conditional Inference," The American Statistician, American Statistical Association, vol. 60, pages 257-263, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chavez, Alex K. & Bicchieri, Cristina, 2013. "Third-party sanctioning and compensation behavior: Findings from the ultimatum game," Journal of Economic Psychology, Elsevier, vol. 39(C), pages 268-277.
    2. Raphael Knevels & Alexander Brenning & Simone Gingrich & Gerhard Heiss & Theresia Lechner & Philip Leopold & Christoph Plutzar & Herwig Proske & Helene Petschko, 2021. "Towards the Use of Land Use Legacies in Landslide Modeling: Current Challenges and Future Perspectives in an Austrian Case Study," Land, MDPI, vol. 10(9), pages 1-29, September.
    3. Hongjun Bai & Eric Lewitus & Yifan Li & Paul V. Thomas & Michelle Zemil & Mélanie Merbah & Caroline E. Peterson & Thujitha Thuraisamy & Phyllis A. Rees & Agnes Hajduczki & Vincent Dussupt & Bonnie Sli, 2024. "Contemporary HIV-1 consensus Env with AI-assisted redesigned hypervariable loops promote antibody binding," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    4. repec:jss:jstsof:34:i01 is not listed on IDEAS
    5. Tomáš Želinský, 2015. "Nekonzistentnosť časových preferencií ľudí z marginalizovaných rómskych komunít [On inconsistency of time preferences of people from the marginalised Roma communities]," Politická ekonomie, Prague University of Economics and Business, vol. 2015(2), pages 204-222.
    6. Bryan Keller, 2012. "Detecting Treatment Effects with Small Samples: The Power of Some Tests Under the Randomization Model," Psychometrika, Springer;The Psychometric Society, vol. 77(2), pages 324-338, April.
    7. repec:osf:osfxxx:ha4cw_v1 is not listed on IDEAS
    8. repec:jss:jstsof:36:i02 is not listed on IDEAS
    9. repec:plo:pone00:0186285 is not listed on IDEAS
    10. Georgina Milne & Andrew William Byrne & Emma Campbell & Jordon Graham & John McGrath & Raymond Kirke & Wilma McMaster & Jesko Zimmermann & Adewale Henry Adenuga, 2022. "Quantifying Land Fragmentation in Northern Irish Cattle Enterprises," Land, MDPI, vol. 11(3), pages 1-16, March.
    11. repec:plo:pone00:0199758 is not listed on IDEAS
    12. Santiago Carbo-Valverde & Pedro Cuadros-Solas & Francisco Rodríguez-Fernández, 2020. "A machine learning approach to the digitalization of bank customers: Evidence from random and causal forests," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-39, October.
    13. Payton J. Jones & Patrick Mair & Thorsten Simon & Achim Zeileis, 2020. "Network Trees: A Method for Recursively Partitioning Covariance Structures," Psychometrika, Springer;The Psychometric Society, vol. 85(4), pages 926-945, December.
    14. Carbó-Valverde, Santiago & Cuadros-Solas, Pedro J. & Rodríguez-Fernández, Francisco, 2025. "Cryptocurrency ownership and cognitive biases in perceived financial literacy," Journal of Behavioral and Experimental Finance, Elsevier, vol. 45(C).
    15. Ribas, Giovana Ghisleni & Zanon, Alencar Junior & Streck, Nereu Augusto & Pilecco, Isabela Bulegon & de Souza, Pablo Mazzuco & Heinemann, Alexandre Bryan & Grassini, Patricio, 2021. "Assessing yield and economic impact of introducing soybean to the lowland rice system in southern Brazil," Agricultural Systems, Elsevier, vol. 188(C).
    16. Rosaria Lombardo & Eric Beh & Antonello D'Ambra, 2011. "Studying the dependence between ordinal-nominal categorical variables via orthogonal polynomials," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(10), pages 2119-2132.
    17. Seibold Heidi & Zeileis Achim & Hothorn Torsten, 2016. "Model-Based Recursive Partitioning for Subgroup Analyses," The International Journal of Biostatistics, De Gruyter, vol. 12(1), pages 45-63, May.
    18. Giuseppe Bove & Pier Luigi Conti & Daniela Marella, 2021. "A measure of interrater absolute agreement for ordinal categorical data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(3), pages 927-945, September.
    19. M. Perakis & P. Maravelakis & S. Psarakis & E. Xekalaki & J. Panaretos, 2005. "On Certain Indices for Ordinal Data with Unequally Weighted Classes," Quality & Quantity: International Journal of Methodology, Springer, vol. 39(5), pages 515-536, October.
    20. Millo, Giovanni, 2014. "Robust standard error estimators for panel models: a unifying approach," MPRA Paper 54954, University Library of Munich, Germany.
    21. Stefano Romano & Jakob Wirbel & Rebecca Ansorge & Christian Schudoma & Quinten Raymond Ducarmon & Arjan Narbad & Georg Zeller, 2025. "Machine learning-based meta-analysis reveals gut microbiome alterations associated with Parkinson’s disease," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
    22. McGinlay, James & Parsons, David J. & Morris, Joe & Hubatova, Marie & Graves, Anil & Bradbury, Richard B. & Bullock, James M., 2017. "Do charismatic species groups generate more cultural ecosystem service benefits?," Ecosystem Services, Elsevier, vol. 27(PA), pages 15-24.
    23. Tlacaelel Rivera-Núñez & Luis García-Barrios & Mariana Benítez & Julieta A. Rosell & Rodrigo García-Herrera & Erin Estrada-Lugo, 2022. "Unravelling the Paradoxical Seasonal Food Scarcity in a Peasant Microregion of Mexico," Sustainability, MDPI, vol. 14(11), pages 1-23, May.
    24. Marc Ditzhaus & Arnold Janssen, 2020. "Bootstrap and permutation rank tests for proportional hazards under right censoring," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 26(3), pages 493-517, July.
    25. Elsäßer Amelie & Victor Anja & Hommel Gerhard, 2011. "Multiple Testing in Candidate Gene Situations: A Comparison of Classical, Discrete, and Resampling-Based Procedures," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-21, November.

