Visualizing the decision rules behind the ROC curves: understanding the classification process

Author

Listed:
  • Sonia Pérez-Fernández

    (University of Oviedo)

  • Pablo Martínez-Camblor

    (Geisel School of Medicine at Dartmouth)

  • Peter Filzmoser

    (Vienna University of Technology)

  • Norberto Corral

    (University of Oviedo)

Abstract

The receiver operating characteristic (ROC) curve is a graphical method commonly used to study the capacity of continuous variables (markers) to classify subjects correctly into one of two groups. The decision made is ultimately determined by a classification subset of the space where the marker is defined. In this paper, we study graphical representations and propose visual forms to reflect the classification rules that give rise to the ROC curve. On the one hand, we use static pictures to display the classification regions for univariate markers, which are especially convenient when there is no monotone relationship between the marker and the likelihood of belonging to one group. In those cases, there are two options to improve the classification accuracy: allowing more flexibility in the classification rules (for example, considering two cutoff points instead of one) or transforming the marker with a function whose resulting ROC curve is optimal. On the other hand, we propose building videos to visualize the collection of subsets when several markers are considered simultaneously. A compilation of techniques for finding a rule that maximizes the area under the ROC curve is included, with a focus on linear combinations. We present a tool for the R software that generates these graphics, and we apply it to one real dataset. The R code is provided as Supplementary Material.
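
The point about non-monotone markers can be illustrated with a small sketch. The Python below is not the authors' R tool and does not reproduce their method; the toy data, the function names `auc` and `rates`, and the choice of |x| as the marker transformation are all assumptions made for illustration. It compares the empirical area under the ROC curve for one-cutoff rules on the raw marker with the area obtained after transforming the marker, and evaluates a single one-cutoff rule against a two-cutoff rule.

```python
from itertools import product


def auc(neg, pos):
    """Empirical AUC: share of (negative, positive) pairs in which the
    positive subject has the larger marker value; ties count one half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for n, p in product(neg, pos))
    return wins / (len(neg) * len(pos))


def rates(neg, pos, region):
    """(FPR, TPR) of the rule 'classify as positive when region(x) is True'."""
    fpr = sum(region(x) for x in neg) / len(neg)
    tpr = sum(region(x) for x in pos) / len(pos)
    return fpr, tpr


# Toy data (made up): positives sit at both extremes of the marker,
# so the marker is not monotonically related to group membership.
neg = [-1.0, -0.5, 0.0, 0.5, 1.0]
pos = [-3.0, -2.0, 2.0, 2.5, 3.0]

# AUC for the family of one-cutoff rules "x > c" on the raw marker.
auc_raw = auc(neg, pos)

# AUC after the (assumed) transformation x -> |x|, which corresponds to
# two-cutoff rules "x < -c or x > c" that are symmetric around zero.
auc_abs = auc([abs(x) for x in neg], [abs(x) for x in pos])

# One specific rule from each family, with cutoffs at 1.5 and +/-1.5.
one_sided = rates(neg, pos, lambda x: x > 1.5)
two_sided = rates(neg, pos, lambda x: x < -1.5 or x > 1.5)

print(auc_raw, auc_abs, one_sided, two_sided)
```

On this toy data the raw marker yields an empirical area of 0.6, while the transformed marker separates the groups completely (area 1.0); the one-cutoff rule detects three of the five positives, whereas the two-cutoff rule detects all five with no false positives.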

Suggested Citation

  • Sonia Pérez-Fernández & Pablo Martínez-Camblor & Peter Filzmoser & Norberto Corral, 2021. "Visualizing the decision rules behind the ROC curves: understanding the classification process," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 105(1), pages 135-161, March.
  • Handle: RePEc:spr:alstar:v:105:y:2021:i:1:d:10.1007_s10182-020-00385-2
    DOI: 10.1007/s10182-020-00385-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10182-020-00385-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10182-020-00385-2?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Heikki Kauppi, 2016. "The Generalized Receiver Operating Characteristic Curve," Discussion Papers 114, Aboa Centre for Economics.
    2. Nielsen, Jens D. & Rumí, Rafael & Salmerón, Antonio, 2009. "Supervised classification using probabilistic decision graphs," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1299-1311, February.
    3. Donna Katzman McClish & Stephen H. Powell, 1989. "How Well Can Physicians Estimate Mortality in a Medical Intensive Care Unit?," Medical Decision Making, vol. 9(2), pages 125-132, June.
    4. Margaret Sullivan Pepe & Tianxi Cai & Gary Longton, 2006. "Combining Predictors for Classification Using the Area under the Receiver Operating Characteristic Curve," Biometrics, The International Biometric Society, vol. 62(1), pages 221-229, March.
    5. Martin W. McIntosh & Margaret Sullivan Pepe, 2002. "Combining Several Screening Tests: Optimality of the Risk Score," Biometrics, The International Biometric Society, vol. 58(3), pages 657-664, September.
    6. Baojiang Chen & Pengfei Li & Jing Qin & Tao Yu, 2016. "Using a Monotonic Density Ratio Model to Find the Asymptotically Optimal Combination of Multiple Diagnostic Tests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 861-874, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kajal Lahiri & Liu Yang, 2023. "Predicting binary outcomes based on the pair-copula construction," Empirical Economics, Springer, vol. 64(6), pages 3089-3119, June.
    2. Chen, Xiwei & Vexler, Albert & Markatou, Marianthi, 2015. "Empirical likelihood ratio confidence interval estimation of best linear combinations of biomarkers," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 186-198.
    3. Osamu Komori, 2011. "A boosting method for maximization of the area under the ROC curve," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 63(5), pages 961-979, October.
    4. Pablo Martínez-Camblor & Sonia Pérez-Fernández & Susana Díaz-Coto, 2021. "Optimal classification scores based on multivariate marker transformations," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 105(4), pages 581-599, December.
    5. Qing Lu & Nancy Obuchowski & Sungho Won & Xiaofeng Zhu & Robert C. Elston, 2010. "Using the Optimal Robust Receiver Operating Characteristic (ROC) Curve for Predictive Genetic Tests," Biometrics, The International Biometric Society, vol. 66(2), pages 586-593, June.
    6. Weining Shen & Jing Ning & Ying Yuan & Anna S. Lok & Ziding Feng, 2018. "Model‐free scoring system for risk prediction with application to hepatocellular carcinoma study," Biometrics, The International Biometric Society, vol. 74(1), pages 239-248, March.
    7. Yuxin Zhu & Mei‐Cheng Wang, 2022. "Obtaining optimal cutoff values for tree classifiers using multiple biomarkers," Biometrics, The International Biometric Society, vol. 78(1), pages 128-140, March.
    8. Chiang, Chin-Tsang & Chiu, Chih-Heng, 2012. "Nonparametric and semiparametric optimal transformations of markers," Journal of Multivariate Analysis, Elsevier, vol. 103(1), pages 124-141, January.
    9. Zhang Zhiwei & Ma Shujie & Nie Lei & Soon Guoxing, 2017. "A Quantitative Concordance Measure for Comparing and Combining Treatment Selection Markers," The International Journal of Biostatistics, De Gruyter, vol. 13(1), pages 1-24, May.
    10. Yanqing Wang & Ying‐Qi Zhao & Yingye Zheng, 2020. "Learning‐based biomarker‐assisted rules for optimized clinical benefit under a risk constraint," Biometrics, The International Biometric Society, vol. 76(3), pages 853-862, September.
    11. Daniel J. Luckett & Eric B. Laber & Samer S. El‐Kamary & Cheng Fan & Ravi Jhaveri & Charles M. Perou & Fatma M. Shebl & Michael R. Kosorok, 2021. "Receiver operating characteristic curves and confidence bands for support vector machines," Biometrics, The International Biometric Society, vol. 77(4), pages 1422-1430, December.
    12. Ming-Yueh Huang & Chin-Tsang Chiang, 2017. "Estimation and Inference Procedures for Semiparametric Distribution Models with Varying Linear-Index," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 44(2), pages 396-424, June.
    13. Chin-Tsang Chiang & Shr-Yan Huang, 2009. "Estimation for the Optimal Combination of Markers without Modeling the Censoring Distribution," Biometrics, The International Biometric Society, vol. 65(1), pages 152-158, March.
    14. Jin, Hua & Lu, Ying, 2009. "Permutation test for non-inferiority of the linear to the optimal combination of multiple tests," Statistics & Probability Letters, Elsevier, vol. 79(5), pages 664-669, March.
    15. Margaret Sullivan Pepe & Tianxi Cai & Gary Longton, 2006. "Combining Predictors for Classification Using the Area under the Receiver Operating Characteristic Curve," Biometrics, The International Biometric Society, vol. 62(1), pages 221-229, March.
    16. Debashis Ghosh, 2004. "Semiparametric methods for the binormal model with multiple biomarkers," The University of Michigan Department of Biostatistics Working Paper Series 1046, Berkeley Electronic Press.
    17. Holly Janes & Margaret S. Pepe, 2008. "Matching in Studies of Classification Accuracy: Implications for Analysis, Efficiency, and Assessment of Incremental Value," Biometrics, The International Biometric Society, vol. 64(1), pages 1-9, March.
    18. Xin Huang & Gengsheng Qin & Yixin Fang, 2011. "Optimal Combinations of Diagnostic Tests Based on AUC," Biometrics, The International Biometric Society, vol. 67(2), pages 568-576, June.
    19. Dat Huynh & Oliver Laeyendecker & Ron Brookmeyer, 2014. "A serial risk score approach to disease classification that accounts for accuracy and cost," Biometrics, The International Biometric Society, vol. 70(4), pages 1042-1051, December.
    20. Carol Y. Lin & Lance A. Waller & Robert H. Lyles, 2012. "The likelihood approach for the comparison of medical diagnostic system with multiple binary tests," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(7), pages 1437-1454, December.
