Printed from https://ideas.repec.org/a/eee/csdana/v55y2011i5p1933-1941.html

Ensemble classification of paired data

Author

Listed:
  • Adler, Werner
  • Brenning, Alexander
  • Potapov, Sergej
  • Schmid, Matthias
  • Lausen, Berthold

Abstract

In many medical applications, data are taken from paired organs or from repeated measurements of the same organ or subject. Subject-based, as opposed to observation-based, evaluation of these data increases the efficiency of the estimation of the misclassification rate. A subject-based approach to classification, applied in the generation of bootstrap samples for bagging and bundling methods, is analyzed. A simulation model is used to compare the performance of different strategies for creating the bootstrap samples that are used to grow individual trees. The proposed approach is compared to linear discriminant analysis, logistic regression, random forests, and gradient boosting. Finally, the simulation results are applied to glaucoma diagnosis using both eyes of glaucoma patients and healthy controls. It is demonstrated that the proposed subject-based resampling reduces the misclassification rate.
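
The abstract does not spell out the resampling strategies that were compared, so the following is only a minimal sketch of one plausible reading of subject-based bootstrap sampling: whole subjects are drawn with replacement, and all observations of a drawn subject (for example, both eyes) enter the sample together. The function name `subject_bootstrap`, the toy subject labels, and the seed are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def subject_bootstrap(subject_ids, rng):
    """Draw one bootstrap sample by resampling subjects, not observations.

    Returns (in_bag, out_of_bag) lists of observation indices. All
    observations of a drawn subject enter the sample together, so the
    two eyes of one patient are never split between in-bag and
    out-of-bag data. This is an illustrative assumption, not the
    authors' documented procedure.
    """
    # Group observation indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    subjects = sorted(by_subject)
    # Sample as many subjects as there are, with replacement.
    drawn = [rng.choice(subjects) for _ in subjects]

    in_bag = [idx for sid in drawn for idx in by_subject[sid]]
    drawn_set = set(drawn)
    out_of_bag = [idx for sid in subjects if sid not in drawn_set
                  for idx in by_subject[sid]]
    return in_bag, out_of_bag


# Hypothetical toy data: four subjects with two observations (eyes) each.
subject_ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"]
in_bag, out_of_bag = subject_bootstrap(subject_ids, random.Random(0))
```

In a bagging setting, each tree would be grown on the `in_bag` indices and the `out_of_bag` indices would serve for error estimation. Because no subject contributes to both sets, the out-of-bag misclassification estimate is not flattered by the correlation between observations of the same subject.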

Suggested Citation

  • Adler, Werner & Brenning, Alexander & Potapov, Sergej & Schmid, Matthias & Lausen, Berthold, 2011. "Ensemble classification of paired data," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1933-1941, May.
  • Handle: RePEc:eee:csdana:v:55:y:2011:i:5:p:1933-1941

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(10)00445-7
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Iranpanah, N. & Mohammadzadeh, M. & Taylor, C.C., 2011. "A comparison of block and semi-parametric bootstrap methods for variance estimation in spatial statistics," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 578-587, January.
    2. De Bock, Koen W. & Coussement, Kristof & Van den Poel, Dirk, 2010. "Ensemble classification based on generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1535-1546, June.
    3. Hothorn, Torsten & Lausen, Berthold, 2005. "Bundling classifiers by bagging trees," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 1068-1078, June.
    4. Rokach, Lior, 2009. "Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4046-4072, October.
    5. Yuliya V Karpievitch & Elizabeth G Hill & Anthony P Leclerc & Alan R Dabney & Jonas S Almeida, 2009. "An Introspective Comparison of Random Forest-Based Classifiers for the Analysis of Cluster-Correlated Data by Way of RF++," PLOS ONE, Public Library of Science, vol. 4(9), pages 1-10, September.
    6. Adler, Werner & Lausen, Berthold, 2009. "Bootstrap estimated true and false positive rates and ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 718-729, January.
    7. Zhang, Chun-Xia & Zhang, Jiang-She & Zhang, Gai-Ying, 2009. "Using Boosting to prune Double-Bagging ensembles," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1218-1231, February.

    Citations

    Cited by:

    1. Werner Adler & Sergej Potapov & Berthold Lausen, 2011. "Classification of repeated measurements data using tree-based ensemble methods," Computational Statistics, Springer, vol. 26(2), pages 355-369, June.
    2. Narayanaswamy Balakrishnan & Majid Mojirsheibani, 2015. "A simple method for combining estimates to improve the overall error rates in classification," Computational Statistics, Springer, vol. 30(4), pages 1033-1049, December.
    3. Mojirsheibani, Majid & Kong, Jiajie, 2016. "An asymptotically optimal kernel combined classifier," Statistics & Probability Letters, Elsevier, vol. 119(C), pages 91-100.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mojirsheibani, Majid & Kong, Jiajie, 2016. "An asymptotically optimal kernel combined classifier," Statistics & Probability Letters, Elsevier, vol. 119(C), pages 91-100.
    2. Chung, Dongjun & Kim, Hyunjoong, 2015. "Accurate ensemble pruning with PL-bagging," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 1-13.
    3. Chen, Zhelun & O’Neill, Zheng & Wen, Jin & Pradhan, Ojas & Yang, Tao & Lu, Xing & Lin, Guanjing & Miyata, Shohei & Lee, Seungjae & Shen, Chou & Chiosa, Roberto & Piscitelli, Marco Savino & Capozzoli, , 2023. "A review of data-driven fault detection and diagnostics for building HVAC systems," Applied Energy, Elsevier, vol. 339(C).
    4. Hyunju Son & Youyi Fong, 2021. "Fast grid search and bootstrap‐based inference for continuous two‐phase polynomial regression models," Environmetrics, John Wiley & Sons, Ltd., vol. 32(3), May.
    5. Asma Gul & Aris Perperoglou & Zardad Khan & Osama Mahmoud & Miftahuddin Miftahuddin & Werner Adler & Berthold Lausen, 2018. "Ensemble of a subset of kNN classifiers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(4), pages 827-840, December.
    6. Koen W. de Bock & Arno de Caigny, 2021. "Spline-rule ensemble classifiers with structured sparsity regularization for interpretable customer churn modeling," Post-Print hal-03391564, HAL.
    7. Chen, Xiangmeng & Shafizadeh, Alireza & Shahbeik, Hossein & Nadian, Mohammad Hossein & Golvirdizadeh, Milad & Peng, Wanxi & Lam, Su Shiung & Tabatabaei, Meisam & Aghbashlo, Mortaza, 2025. "Enhanced bio-oil production from biomass catalytic pyrolysis using machine learning," Renewable and Sustainable Energy Reviews, Elsevier, vol. 209(C).
    8. Kraus, Mathias & Tschernutter, Daniel & Weinzierl, Sven & Zschech, Patrick, 2024. "Interpretable generalized additive neural networks," European Journal of Operational Research, Elsevier, vol. 317(2), pages 303-316.
    9. Coolen-Maturi, Tahani & Elkhafifi, Faiza F. & Coolen, Frank P.A., 2014. "Three-group ROC analysis: A nonparametric predictive approach," Computational Statistics & Data Analysis, Elsevier, vol. 78(C), pages 69-81.
    10. Diogo Menezes & Mateus Mendes & Jorge Alexandre Almeida & Torres Farinha, 2020. "Wind Farm and Resource Datasets: A Comprehensive Survey and Overview," Energies, MDPI, vol. 13(18), pages 1-24, September.
    11. Petersen, Maya L. & Molinaro, Annette M. & Sinisi, Sandra E. & van der Laan, Mark J., 2007. "Cross-validated bagged learning," Journal of Multivariate Analysis, Elsevier, vol. 98(9), pages 1693-1704, October.
    12. Chun-Xia Zhang & Jiang-She Zhang & Sang-Woon Kim, 2016. "PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection," Computational Statistics, Springer, vol. 31(4), pages 1237-1262, December.
    13. Adler, Werner & Lausen, Berthold, 2009. "Bootstrap estimated true and false positive rates and ROC curve," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 718-729, January.
    14. Zhang, Chun-Xia & Zhang, Jiang-She & Zhang, Gai-Ying, 2009. "Using Boosting to prune Double-Bagging ensembles," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1218-1231, February.
    15. Castillo-Páez, Sergio & Fernández-Casal, Rubén & García-Soidán, Pilar, 2019. "A nonparametric bootstrap method for spatial data," Computational Statistics & Data Analysis, Elsevier, vol. 137(C), pages 1-15.
    16. Coussement, Kristof & De Bock, Koen W., 2013. "Customer churn prediction in the online gambling industry: The beneficial effect of ensemble learning," Journal of Business Research, Elsevier, vol. 66(9), pages 1629-1636.
    17. Wei-Yin Loh, 2014. "Fifty Years of Classification and Regression Trees," International Statistical Review, International Statistical Institute, vol. 82(3), pages 329-348, December.
    18. De Bock, Koen W. & Coussement, Kristof & Van den Poel, Dirk, 2010. "Ensemble classification based on generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1535-1546, June.
    19. Oliver Hümbelin & Lukas Hobi & Robert Fluder, 2021. "Rich Cities, Poor Countryside? Social Structure of the Poor and Poverty Risks in Urban and Rural Places in an Affluent Country. An Administrative Data based Analysis using Random Forest," University of Bern Social Sciences Working Papers 40, University of Bern, Department of Social Sciences, revised 10 Nov 2021.
    20. John Martin & Sona Taheri & Mali Abdollahian, 2024. "Optimizing Ensemble Learning to Reduce Misclassification Costs in Credit Risk Scorecards," Mathematics, MDPI, vol. 12(6), pages 1-15, March.

